
Skill Navigator (Quick Reference)

This file is the universal entry point for any AI coding assistant — Cursor, Claude Code, Windsurf, Copilot, Codex, or any agent that reads AGENTS.md.

Project Layout

This module (data_product_accelerator/) contains the framework — skills, documentation, and customer context. Generated artifacts are created at the repository root, not inside data_product_accelerator/.

| Direction | Location | Examples |
| --- | --- | --- |
| Read from (framework) | data_product_accelerator/ | data_product_accelerator/skills/..., data_product_accelerator/context/Schema.csv, data_product_accelerator/docs/... |
| Write to (artifacts) | Repository root | gold_layer_design/, src/, plans/, resources/, databricks.yml |
repo-root/                           <-- workspace root / agent CWD
├── data_product_accelerator/        <-- framework (skills, docs, context)
│   ├── AGENTS.md
│   ├── skills/                      <-- 77 agent skills (read-only)
│   ├── context/                     <-- customer schema CSV (input)
│   └── docs/                        <-- framework documentation
│
├── gold_layer_design/               <-- GENERATED by Gold Design skill
├── src/                             <-- GENERATED by implementation skills
├── plans/                           <-- GENERATED by Planning skill
├── resources/                       <-- GENERATED by Asset Bundle skills
└── databricks.yml                   <-- GENERATED by Asset Bundle skills

Rule: All artifact paths in skills (gold_layer_design/, src/, plans/, resources/, databricks.yml) are relative to the repository root. When creating files, always resolve these paths from the repo root — never inside data_product_accelerator/.
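The path rule above can be pictured as a small resolution check. This is an illustrative sketch, not part of the framework — the `resolve_artifact` helper and the `/workspace/repo-root` path are hypothetical:

```python
from pathlib import Path

# Hypothetical helper: artifact paths are always resolved from the repo
# root, never from inside the framework module.
def resolve_artifact(repo_root: Path, artifact: str) -> Path:
    resolved = (repo_root / artifact).resolve()
    framework = (repo_root / "data_product_accelerator").resolve()
    # Guard: refuse to place generated artifacts inside the framework module.
    if resolved == framework or framework in resolved.parents:
        raise ValueError(
            f"Artifact {artifact!r} must live at the repo root, "
            "not inside data_product_accelerator/"
        )
    return resolved

repo_root = Path("/workspace/repo-root")
print(resolve_artifact(repo_root, "gold_layer_design"))
# An artifact path like data_product_accelerator/gold_layer_design is rejected.
```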


MANDATORY: Before starting any Databricks implementation task, consult this routing table. For ambiguous tasks, read the full skill navigator: data_product_accelerator/skills/skill-navigator/SKILL.md

Visual learner? See the Interactive Skill Navigation Guide — an animated walkthrough showing how the agent navigates from AGENTS.md through all 9 pipeline stages. Open the HTML file in any browser.

Design-First Pipeline

data_product_accelerator/context/*.csv → Gold Design (1) → Bronze (2) → Silver (3) → Gold Impl (4) → Planning (5) → Semantic (6) → Observability (7) → ML (8) → GenAI (9)

New project? Start at stage 1: place schema CSV in data_product_accelerator/context/, then read data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md.

Orchestrator Routing

| Keywords | Stage | Read This Skill |
| --- | --- | --- |
| "new project", "schema CSV", "bootstrap", "build data platform" | 1 | data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md |
| "design Gold", "dimensional model", "ERD", "YAML schema" | 1 | data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md |
| "Bronze", "test data", "Faker", "demo data" | 2 | data_product_accelerator/skills/bronze/00-bronze-layer-setup/SKILL.md |
| "Silver", "DLT", "expectations", "data quality" | 3 | data_product_accelerator/skills/silver/00-silver-layer-setup/SKILL.md |
| "Gold tables", "merge scripts", "Gold setup" | 4 | data_product_accelerator/skills/gold/01-gold-layer-setup/SKILL.md |
| "project plan", "architecture plan", "planning", "planning_mode: workshop" | 5 | data_product_accelerator/skills/planning/00-project-planning/SKILL.md |
| "metric view", "TVF", "Genie Space", "semantic layer", "semantic layer deployment", "deploy TVFs", "deploy metric views", "deploy genie", "deploy semantic", "data intelligence assets" | 6 | data_product_accelerator/skills/semantic-layer/00-semantic-layer-setup/SKILL.md |
| "monitoring", "dashboard", "alert", "observability" | 7 | data_product_accelerator/skills/monitoring/00-observability-setup/SKILL.md |
| "MLflow", "ML model", "training", "inference" | 8 | data_product_accelerator/skills/ml/00-ml-pipeline-setup/SKILL.md |
| "GenAI agent", "ResponsesAgent", "AI agent" | 9 | genai-agents/00-course-orchestrator/SKILL.md |
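The routing table can be thought of as a keyword-to-skill lookup. The toy sketch below is not part of the framework — the `ROUTES` dict reproduces only a few rows, and real agents match intent, not raw substrings:

```python
# Toy router: maps trigger keywords from the table above to skill paths.
ROUTES = {
    "schema CSV": "data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md",
    "dimensional model": "data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md",
    "Bronze": "data_product_accelerator/skills/bronze/00-bronze-layer-setup/SKILL.md",
    "DLT": "data_product_accelerator/skills/silver/00-silver-layer-setup/SKILL.md",
    "metric view": "data_product_accelerator/skills/semantic-layer/00-semantic-layer-setup/SKILL.md",
}

def route(task):
    """Return the first skill whose keyword appears in the task text, else None."""
    lowered = task.lower()
    for keyword, skill in ROUTES.items():
        if keyword.lower() in lowered:
            return skill
    return None  # ambiguous: fall back to the full skill navigator

print(route("Please design a dimensional model for bookings"))
```

For tasks where no keyword matches, the fallback is the full skill navigator, exactly as the MANDATORY note above prescribes.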

Worker Routing (specific tasks)

| Keywords | Read This Skill |
| --- | --- |
| "job failed", "troubleshoot", "deploy failed", "self-heal" | data_product_accelerator/skills/common/databricks-autonomous-operations/SKILL.md |
| "naming", "COMMENT", "tag", "PII", "snake_case", "budget policy" | data_product_accelerator/skills/common/naming-tagging-standards/SKILL.md |
| "Asset Bundle", "DAB", "deploy", "job YAML" | data_product_accelerator/skills/common/databricks-asset-bundles/SKILL.md |
| "import", "sys.path", "restartPython", "notebook module" | data_product_accelerator/skills/common/databricks-python-imports/SKILL.md |
| "TBLPROPERTIES", "CDF", "auto-optimize", "table properties" | data_product_accelerator/skills/common/databricks-table-properties/SKILL.md |
| "CREATE SCHEMA", "schema setup", "predictive optimization" | data_product_accelerator/skills/common/schema-management-patterns/SKILL.md |
| "PRIMARY KEY", "FOREIGN KEY", "constraint", "PK/FK" | data_product_accelerator/skills/common/unity-catalog-constraints/SKILL.md |
| "audit skills", "check freshness", "stale skills", "verify skills" | data_product_accelerator/skills/admin/skill-freshness-audit/SKILL.md |

Orchestrator Deep Dives (optional learning resources)

For detailed walkthroughs of how orchestrators manage context and load workers:

| Orchestrator | Walkthrough Document |
| --- | --- |
| Gold Design (stage 1) | docs/framework-design/13-gold-design-orchestrator-walkthrough.md |
| Silver Setup (stage 3) | docs/framework-design/14-silver-orchestrator-walkthrough.md |
| Gold Implementation (stage 4) | docs/framework-design/15-gold-pipeline-orchestrator-walkthrough.md |
| Semantic Layer (stage 6) | docs/framework-design/12-semantic-layer-orchestrator-walkthrough.md |

These walkthroughs show progressive disclosure patterns, context management, and worker skill loading strategies. Read them to understand how orchestrators minimize token usage while maximizing context relevance.

Key Rule

Read the orchestrator skill FIRST. It will tell you which worker skills and common skills to read for each phase.


Common Skills (Read When Needed)

These 8 shared skills apply across all pipeline stages. Read the full SKILL.md whenever a task matches the triggers in the "Read When" column.

| Skill | Path | Read When |
| --- | --- | --- |
| databricks-expert-agent | data_product_accelerator/skills/common/databricks-expert-agent/SKILL.md | Every task (core SA agent behavior, "Extract, Don't Generate" principle) |
| databricks-asset-bundles | data_product_accelerator/skills/common/databricks-asset-bundles/SKILL.md | Creating jobs, pipelines, dashboards, alerts |
| databricks-autonomous-operations | data_product_accelerator/skills/common/databricks-autonomous-operations/SKILL.md | Deploy/poll/diagnose/fix loop when jobs fail |
| naming-tagging-standards | data_product_accelerator/skills/common/naming-tagging-standards/SKILL.md | Creating ANY DDL, COMMENTs, tags, workflows |
| databricks-python-imports | data_product_accelerator/skills/common/databricks-python-imports/SKILL.md | Sharing code between notebooks; sys.path setup |
| databricks-table-properties | data_product_accelerator/skills/common/databricks-table-properties/SKILL.md | Creating tables (any layer); TBLPROPERTIES, CDF |
| schema-management-patterns | data_product_accelerator/skills/common/schema-management-patterns/SKILL.md | Creating schemas; CREATE SCHEMA IF NOT EXISTS |
| unity-catalog-constraints | data_product_accelerator/skills/common/unity-catalog-constraints/SKILL.md | Applying PK/FK constraints; surrogate key patterns |

At minimum, always read:

  1. databricks-expert-agent — Core behavior and "Extract, Don't Generate" principle
  2. naming-tagging-standards — Enterprise naming, comments, and tags

Input Convention

Customer schema CSVs go in the data_product_accelerator/context/ directory (e.g., data_product_accelerator/context/Wanderbricks_Schema.csv). This is the starting input for the Design-First pipeline. Generated outputs (gold_layer_design/, src/, plans/, resources/) are created at the repository root — see Project Layout above.
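The input convention amounts to a glob over the context directory. A self-contained sketch, using a temporary directory in place of a real repo (the file name and contents are invented for illustration):

```python
import tempfile
from pathlib import Path

def find_schema_csvs(repo_root: Path) -> list[Path]:
    """Customer schema CSVs live under data_product_accelerator/context/."""
    return sorted((repo_root / "data_product_accelerator" / "context").glob("*.csv"))

# Stand-in repo layout for the sketch.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    context = root / "data_product_accelerator" / "context"
    context.mkdir(parents=True)
    (context / "Wanderbricks_Schema.csv").write_text("table,column,type\n")
    for csv in find_schema_csvs(root):
        print(csv.relative_to(root))
```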


Reference

Role

You are a Senior Databricks Solutions Architect Agent. Your mission is to design, implement, and review production-grade Databricks Lakehouse solutions using the Agent Skills in this repository.

For full agent behavior, read: data_product_accelerator/skills/common/databricks-expert-agent/SKILL.md

Skills Location

All 77 Agent Skills are in data_product_accelerator/skills/ using the open SKILL.md format. Each skill directory contains:

skill-name/
├── SKILL.md          # Overview, critical rules, mandatory dependencies (~1-2K tokens)
├── references/       # Detailed patterns loaded on demand
├── scripts/          # Executable utilities
└── assets/templates/ # Starter files (YAML, SQL, JSON)

Skills follow an orchestrator/worker pattern:

  • 00- prefix = Orchestrator (manages end-to-end workflows for a pipeline stage)
  • 01- prefix or named directories = Workers (specific patterns, called by orchestrators or used standalone)
  • Gold workers are organized into design-workers/ and pipeline-workers/ subdirectories for clear separation
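The prefix convention above lends itself to a simple classifier. A hedged sketch — the directory names are examples from this file, but the function itself is not part of the framework:

```python
def classify_skill(dir_name: str) -> str:
    """Classify a skill directory by the orchestrator/worker prefix convention."""
    if dir_name.startswith("00-"):
        return "orchestrator"  # manages an end-to-end pipeline stage
    return "worker"            # specific pattern; called by orchestrators or standalone

for name in ["00-gold-layer-design", "01-gold-layer-setup", "naming-tagging-standards"]:
    print(f"{name}: {classify_skill(name)}")
```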

Authoritative upstream skill registries

For canonical Databricks-platform skills (Apps, Lakebase, Model Serving, DABs, Pipelines, Core), see databricks/databricks-agent-skills. The accelerator skills here extend or specialize those for the design-first pipeline. Local skills that genuinely derive from an upstream skill record a structured upstream_sources entry; the freshness audit (skills/admin/skill-freshness-audit/) tracks drift against that registry alongside the existing databricks-solutions/ai-dev-kit mappings.

IDE Compatibility

This framework is built on the open Agent Skills (SKILL.md) format and works with any AI coding assistant that can read files.

| IDE / Agent | How It Discovers This File | File Reference Syntax |
| --- | --- | --- |
| Cursor | Auto-loads AGENTS.md | @path/to/SKILL.md |
| Claude Code | Reads AGENTS.md (or CLAUDE.md) at repo root | Reference files by path in conversation |
| Windsurf | Reads AGENTS.md or .windsurfrules at repo root | @path/to/SKILL.md |
| Copilot | Reads AGENTS.md or .github/copilot-instructions.md | #file:path/to/SKILL.md |
| Codex | Reads AGENTS.md at repo root | Reference files by path |
| Other | Point the agent to this file manually | Paste file contents or path |

Prompting Pattern (all IDEs)

To invoke a skill, reference its SKILL.md file in your prompt. Most IDEs support @ for file references:

I have a customer schema at @data_product_accelerator/context/Wanderbricks_Schema.csv.
Please design the Gold layer using @data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md

If your IDE doesn't support @ references, paste the file path or ask the agent to read it:

Read the file data_product_accelerator/skills/gold/00-gold-layer-design/SKILL.md and follow its instructions.
I have a customer schema at data_product_accelerator/context/Wanderbricks_Schema.csv.

Efficiency Tip

To reduce total pipeline time by 30-40%, see the Parallel Execution Guide for strategies to run independent stages concurrently (e.g., Gold Design + Bronze Setup can run in parallel).
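The idea of running independent stages concurrently can be sketched with a thread pool. The stage functions below are placeholders — in practice each stage is an agent-driven skill run, not a Python function:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder stage functions; real stages are agent-driven skill runs.
def gold_design() -> str:
    return "gold_layer_design/ written"

def bronze_setup() -> str:
    return "src/bronze/ written"

# Gold Design and Bronze Setup have no dependency on each other,
# so they can run concurrently before the dependent stages begin.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(gold_design), pool.submit(bronze_setup)]
    results = [f.result() for f in futures]

print(results)
```

Dependent stages (e.g., Silver, which consumes Bronze output) would still wait for their upstream results before starting.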