Documentation Index
Fetch the complete documentation index at: https://docs.getnao.io/llms.txt
Use this file to discover all available pages before exploring further.
What these skills are
nao publishes five context-engineering skills – standard `SKILL.md` files (YAML frontmatter + markdown) that any agentic CLI (Claude Code, Codex, Cursor, the Claude Agent SDK, …) can load and invoke.
Each skill is a self-contained workflow that automates one stage of the Context Engineering lifecycle: scoping a project, writing rules, building a test suite, auditing what's there, and adding a semantic layer once tests show metric gaps.
Install
The published source of truth lives at github.com/getnao/nao/tree/main/skills. Install all five into the current project's `.claude/skills/` with `nao skills`.
`nao skills` is a thin wrapper around the open-source skills CLI from Vercel Labs, so the equivalent direct call also works.
Run it from the directory where `nao_config.yaml` lives. Re-run any time to pick up updates; pass through `--force` to overwrite local edits.
The five skills
setup-context
Takes the user from `pip install nao-core` to a synced project with a starter `RULES.md`. First-time install only – for editing rules, generating tests, or reviewing an existing context, use the other skills below.
Steps
- Ask everything in one round – warehouse + auth, scope (which tables; ≤100, with 20 as the target), extra context (dbt / ETL / BI repos, Notion, internal docs), LLM provider.
- Look up the warehouse-specific config from docs.getnao.io/nao-agent/context-builder/databases, write `nao_config.yaml`, run `nao init`, then print a summary for the user to confirm before continuing.
- Run `nao sync` – populate `databases/`, `repos/`, `docs/`, `semantics/`. Don't move on until sync is clean.
- Generate `RULES.md` by handing off to `write-context-rules`.
- Wire up the LLM key via `${ANTHROPIC_API_KEY}` (or equivalent) – never paste keys into chat.
- Recommend next steps – smoke test with `nao chat`, review `RULES.md`, then `create-context-tests`.
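A hypothetical sketch of the shape `nao_config.yaml` might take after this step. Every key name below is an assumption for illustration – the real, warehouse-specific keys come from docs.getnao.io/nao-agent/context-builder/databases:

```yaml
# Hypothetical shape only – consult docs.getnao.io/nao-agent/context-builder/databases
# for the real warehouse-specific keys before writing this file.
warehouse:
  type: snowflake                    # assumed key; one connection block per warehouse
  account: ${SNOWFLAKE_ACCOUNT}
  user: ${SNOWFLAKE_USER}
  password: ${SNOWFLAKE_PASSWORD}    # env-var reference – never a literal secret
scope:
  tables:                            # keep this small: 100 hard ceiling, ~20 ideal
    - analytics.fct_subscriptions
    - analytics.dim_customers
llm:
  provider: anthropic
  api_key: ${ANTHROPIC_API_KEY}
```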
write-context-rules
Owns `RULES.md`. Generates the six standard sections, section by section, showing the user each block before moving on. If `RULES.md` already has content, runs an audit-and-fill flow that fills only what's missing.
Steps
- `## Business overview` – Product + Business model (sourced from web search + `databases/` + dbt repo).
- `## Data architecture` – Warehouse, data stack, layers, sources.
- `## Core data models` – `### Most Used Tables` (one-line pointers) + `### Tables detail` (Purpose, Granularity, Key Columns ≤10, Use For).
- `## Key Metrics Reference` – grouped by category; **metric** → table, column, formula.
- `## Date filtering` – three example formulas (last X weeks / last X days / current month) keyed off the user's week-boundary and current-period-inclusion conventions.
- `## Analysis Process` – five subsections: Understand → Select Table → Write Query → Validate → Context.
- Validate metrics with the user – confirm every source-of-truth pointer in the metrics reference.
- Pick date-filtering conventions with the user – week boundary (Sunday vs Monday) and current-period inclusion.
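As an illustration of the kind of formula `## Date filtering` documents, a hedged sketch assuming a Monday week boundary, current period included, and Postgres-style SQL – the table and column names are hypothetical:

```sql
-- "Last 4 weeks": the 4 complete weeks before today plus the current week.
-- Monday week boundary, current period included; fct_orders / order_date
-- are hypothetical names standing in for the user's real table.
SELECT *
FROM fct_orders
WHERE order_date >= DATE_TRUNC('week', CURRENT_DATE) - INTERVAL '4 weeks';
```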
The six-section scaffold ships in the repo as `templates/RULES.md`.
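A sketch of that scaffold's outline, reconstructed from the six section names above (the repo's `templates/RULES.md` is the source of truth for the full template):

```markdown
## Business overview
## Data architecture
## Core data models
### Most Used Tables
### Tables detail
## Key Metrics Reference
## Date filtering
## Analysis Process
```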
create-context-tests
Generates a test suite of natural-language → SQL pairs that becomes the reliability benchmark. `nao test` runs each prompt through the agent, executes both the agent's SQL and the test's expected SQL, and diffs the result data row-by-row. See Evaluation for the scoring model.
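The row-by-row diff can be sketched roughly like this – a minimal illustration, not nao's actual implementation; treating results as order-insensitive multisets of row tuples is an assumption:

```python
from collections import Counter

def diff_rows(expected_rows, actual_rows):
    """Compare two query results as multisets of row tuples.

    Returns (missing, unexpected): rows the agent's SQL failed to
    produce, and rows it produced that the expected SQL did not.
    """
    expected = Counter(tuple(r) for r in expected_rows)
    actual = Counter(tuple(r) for r in actual_rows)
    missing = list((expected - actual).elements())
    unexpected = list((actual - expected).elements())
    return missing, unexpected

# A test passes only when both diffs are empty.
```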
Two authoring rules
- Prompts read like real chat – short, vague, no table / column / method hints. `"How's churn looking this quarter?"`, not `"What was the churn rate from fct_subscriptions in Q1?"`.
- Output column names encode format / unit, not source. `churn_rate_float_0_1`, not `churn_rate_from_fct_subscriptions`.
- Ask once – does the user have trusted source-of-truth queries (Looker, dashboards, prior benchmarks)? Transform each into a test; for metrics without a trusted query, draft new ones.
- Save flat under `tests/` (no subfolders), one YAML file per test.
- Have the user validate – prompts match their team's phrasing, SQL matches their definition of truth.
- Run `nao test -m <model_id> -t 10` – recap pass rate, token cost, and wall-clock time as the baseline.
- Diagnose failures – read `tests/outputs/`, identify the rule gap, route to `write-context-rules` for the smallest fix. Re-run between fixes so impact is attributable.
The per-test file format ships in the repo as `templates/test.yaml`.
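A hypothetical example of such a test file – the field names are assumptions (the repo's `templates/test.yaml` is authoritative), but the prompt style and output column naming follow the two authoring rules above:

```yaml
# Hypothetical field names – check templates/test.yaml in the repo for the real schema.
prompt: "How's churn looking this quarter?"    # reads like real chat, no table hints
expected_sql: |
  SELECT
    COUNT(CASE WHEN churned THEN 1 END) * 1.0 / COUNT(*) AS churn_rate_float_0_1
  FROM fct_subscriptions                        -- hypothetical table
  WHERE started_at >= DATE_TRUNC('quarter', CURRENT_DATE)
```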
audit-context
Diagnoses a nao context. Finds gaps, MECE violations, failure root causes, and bloat. Output is a short in-conversation report ending in a prioritized plan. Diagnose only – never fixes. Routes fixes to `write-context-rules` / `add-semantic-layer` / `create-context-tests`. Run any time: right after `setup-context`, mid-build, before a release, or when behavior gets surprising.
Steps – six checks in order
- Synced context – what's wired in (warehouse, repos, Notion, semantic layer, MCPs) vs missing. Has `nao sync` run? Scope check: ≤100 tables hard ceiling, ≤20 ideal. Oversized scope is the biggest predictor of reliability failure – flag it explicitly.
- `RULES.md` vs target structure – the six sections from `write-context-rules`. Per section, mark present / missing / thin. Flag placeholders, `TODO:` markers, and metric entries with no source-of-truth pointer.
- Per-table coverage – for every table in `databases/`: is it in `## Most Used Tables`? Has a `## Tables detail` block? dbt context (`schema.yml`)? Per-table gaps: undocumented columns, calculated fields with no explanation, foreign keys with no relation.
- Data model consistency (MECE) – two tables computing the same metric differently? Asked metrics no in-scope table can answer? Duplicated columns under different names? Ambiguous columns (`amount` without unit, `status` without enum values)?
- Test coverage – if `tests/` is empty, recommend `create-context-tests`. Otherwise read `tests/outputs/` and categorize each failure (data model / date selection / test issue / interpretation / metric definition) with the smallest rule change per failure.
- Token optimization – files >40KB, `## Tables detail` blocks past the 10-column cap, duplication between `RULES.md` and `databases/<table>.md`, in-scope tables with no mention in any test.
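The mechanical part of the token-optimization check – flagging context files over the 40KB cap – can be sketched as follows. A minimal illustration under the stated 40KB threshold; nao's own audit logic is not published in this doc:

```python
import os

SIZE_CAP_KB = 40  # the audit flags context files larger than this

def oversized_files(root, cap_kb=SIZE_CAP_KB):
    """Return paths under root whose on-disk size exceeds cap_kb kilobytes."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > cap_kb * 1024:
                flagged.append(path)
    return sorted(flagged)
```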
Report format: open with a one-line scorecard (sync state | scope width | rules quality, N/6 sections substantive | test coverage), deep-dive only the sections with findings, and end with a prioritized plan that names the skill that does each fix.
add-semantic-layer
Wires a semantic layer into the agent so metric queries go through a single canonical definition. Only after `nao test` shows metric-reliability failures – not before. A semantic layer reduces the scope of answerable questions; the trade-off only pays off when reliability is the bottleneck. Schema gaps or date-logic failures are rule fixes, not semantic-layer fixes.
Steps
- Pick the tool:

| Option | Type | When |
|---|---|---|
| dbt MetricFlow | Metric store | Already running dbt Cloud with the Semantic Layer enabled. |
| Snowflake views / semantic views | Semantic layer | Snowflake; using curated views or native semantic views. |
| nao semantic files | Semantic layer | No existing layer. Want a lightweight in-repo YAML. |
| Other (Looker, Cube, …) | Varies | Search the MCP registry; otherwise fall back to nao YAML. |

- Install the matching MCP under `.claude/mcp.json` (dbt-mcp / mcp-server-snowflake / Cortex MCP, etc.). Credentials via `${ENV_VAR}` only – never inline.
- Hand off to `write-context-rules` to route every metric in `## Key Metrics Reference` through the new layer (e.g. `MRR → query via dbt MCP query_metric (semantic layer)`).
- Validate – `nao chat` one of the user's top questions, confirm the agent uses the semantic layer, then `nao test` and compare to the pre-semantic-layer baseline pass rate. Reliability is the only reason to do this – measure it.
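A hedged sketch of what a `.claude/mcp.json` entry for the dbt MCP might look like – the server name, command, args, and environment-variable names below are assumptions; follow the MCP server's own docs when wiring it up. The one pattern taken from this doc is credentials via `${ENV_VAR}`, never inline:

```json
{
  "mcpServers": {
    "dbt": {
      "command": "uvx",
      "args": ["dbt-mcp"],
      "env": {
        "DBT_TOKEN": "${DBT_TOKEN}"
      }
    }
  }
}
```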
The nao semantic-file format ships in the repo as `templates/semantic.yaml`; if the user already has a semantic layer, point `RULES.md` to it instead of writing local YAML.
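A hypothetical sketch of a nao semantic file – every field name below is an assumption (the repo's `templates/semantic.yaml` is authoritative); the point it illustrates is one canonical definition per metric:

```yaml
# Hypothetical field names – check templates/semantic.yaml in the repo for the real schema.
metrics:
  - name: mrr
    label: MRR
    description: Monthly recurring revenue, summed over active subscriptions.
    table: fct_subscriptions           # hypothetical table
    expression: SUM(mrr_amount_usd)    # the single canonical formula
    filters:
      - status = 'active'
```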
When to use which
| If you want to… | Use |
|---|---|
| Set up nao on a brand-new project | setup-context |
| Generate or rewrite RULES.md | write-context-rules |
| Add tests for a new metric, or build the first benchmark | create-context-tests |
| Find out what's missing, broken, or bloated | audit-context |
| Make metric calculations consistent across questions | add-semantic-layer (only after tests show the gap) |
| Refine RULES.md because the agent keeps making the same miss | write-context-rules (preceded by audit-context if unsure) |
Source material
The skills are distilled from the public Context Engineering content. Read each skill's source `SKILL.md` in the repo: github.com/getnao/nao/tree/main/skills.