Skills

What these skills are

nao publishes five context-engineering skills — standard SKILL.md files (YAML frontmatter + markdown) that any agentic CLI (Claude Code, Codex, Cursor, the Claude Agent SDK, …) can load and invoke. Each skill is a self-contained workflow that automates one stage of the Context Engineering lifecycle: scoping a project, writing rules, building a test suite, auditing what’s there, and adding a semantic layer once tests show metric gaps.

These skills are for the human/agent driving nao (in their IDE or terminal). They are different from the runtime skills the chat agent calls at query time — see Tools, MCPs, Skills for those.
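
As a rough illustration of the SKILL.md layout described above (YAML frontmatter followed by markdown), the sketch below splits such a file into its two parts using only the standard library. The sample content, the `name` and `description` keys, and the `split_frontmatter` helper are assumptions made for illustration; real skills may use richer YAML that needs a proper parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md-style document into (frontmatter, body).

    Only flat `key: value` pairs are handled; this is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block
    try:
        end = lines.index("---", 1)
    except ValueError:
        return {}, text  # unterminated frontmatter: treat as plain markdown
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Hypothetical sample, not copied from the nao repo.
SAMPLE = """---
name: setup-context
description: First-time nao project setup.
---
# setup-context

Ask everything in one round.
"""

meta, body = split_frontmatter(SAMPLE)
print(meta["name"])
print(body.splitlines()[0])
```

An agentic CLI would typically use the frontmatter description to decide when to route to a skill, and load the markdown body only once the skill is invoked.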

Install

The published source of truth lives at github.com/getnao/nao/tree/main/skills. Install all five into the current project’s .claude/skills/ with:

nao skills add getnao/nao

nao skills is a thin wrapper around the open-source skills CLI from Vercel Labs, so the equivalent direct call also works:

npx skills add getnao/nao

Run the command from the root of the project, where nao_config.yaml lives. Re-run it at any time to pick up updates; pass --force through to overwrite local edits.

Once installed, the skills are auto-discovered by Claude Code, Codex, and other agentic CLIs that load .claude/skills/. Trigger one by name (e.g. “use the setup-context skill”) or let the agent route on the skill description.
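
To confirm that the install landed where the agent will look, a quick check along these lines can help. It is a sketch that assumes each skill is stored as `.claude/skills/<skill-name>/SKILL.md`; that per-skill directory layout and the `missing_skills` helper are assumptions, not something this page specifies.

```python
from pathlib import Path

SKILL_NAMES = [
    "setup-context",
    "write-context-rules",
    "create-context-tests",
    "audit-context",
    "add-semantic-layer",
]


def missing_skills(project_root: Path) -> list[str]:
    """Return the skill names with no SKILL.md under .claude/skills/<name>/."""
    skills_dir = project_root / ".claude" / "skills"
    return [
        name for name in SKILL_NAMES
        if not (skills_dir / name / "SKILL.md").is_file()
    ]


if __name__ == "__main__":
    missing = missing_skills(Path.cwd())
    print("all five skills found" if not missing else f"missing: {missing}")
```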

The five skills

setup-context

Takes the user from pip install nao-core to a synced project with a starter RULES.md. First-time install only: for editing rules, generating tests, or reviewing an existing context, use the other skills below.

Steps

1. Ask everything in one round: warehouse + auth, scope (which tables, ≤100 with 20 as the target), extra context (dbt / ETL / BI repos, Notion, internal docs), LLM provider.
2. Look up the warehouse-specific config from docs.getnao.io/nao-agent/context-builder/databases, write nao_config.yaml, run nao init, then print a summary for the user to confirm before continuing.
3. Run nao sync to populate databases/, repos/, docs/, semantics/. Don’t move on until sync is clean.
4. Generate RULES.md by handing off to write-context-rules.
5. Wire up the LLM key via ${ANTHROPIC_API_KEY} (or equivalent); never paste keys into chat.
6. Recommend next steps: smoke test with nao chat, review RULES.md, then create-context-tests.
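
The `${ANTHROPIC_API_KEY}` convention keeps the secret in the environment and only a placeholder in the file. How nao resolves the placeholder internally is not described here; the sketch below only illustrates the general idea with Python's `string.Template`. The `expand_env` helper and the `api_key` field name are hypothetical.

```python
import os
from string import Template
from typing import Mapping, Optional


def expand_env(config_text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} placeholders with values from the environment.

    Raises KeyError if a referenced variable is not set, so a missing key
    fails loudly instead of being written out as an empty string.
    """
    values = os.environ if env is None else env
    return Template(config_text).substitute(values)


# Hypothetical config fragment; the real nao_config.yaml keys may differ.
fragment = "api_key: ${ANTHROPIC_API_KEY}"
print(expand_env(fragment, {"ANTHROPIC_API_KEY": "sk-example"}))
```

Failing on a missing variable is a deliberate choice in this sketch: a silent empty key tends to surface later as a confusing authentication error.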

write-context-rules

Owns RULES.md. Generates the six standard sections, section by section, showing the user each block before moving on. If RULES.md already has content, runs an audit-and-fill flow that fills only what’s missing.

Steps

1. ## Business overview: Product + Business model (sourced from web search + databases/ + dbt repo).
2. ## Data architecture: Warehouse, data stack, layers, sources.
3. ## Core data models: ### Most Used Tables (one-line pointers) + ### Tables detail (Purpose, Granularity, Key Columns ≤10, Use For).
4. ## Key Metrics Reference: grouped by category; **metric** → table, column, formula.
5. ## Date filtering: three example formulas (last X weeks / last X days / current month) keyed off the user’s week-boundary and current-period-inclusion conventions.
6. ## Analysis Process: five subsections (Understand → Select Table → Write Query → Validate → Context).
7. Validate metrics with the user: confirm every source-of-truth pointer in the metrics reference.
8. Confirm date filtering with the user: pick the week boundary (Sunday vs Monday) and current-period inclusion.
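
The two date conventions (week boundary and current-period inclusion) change which rows a "last X weeks" question covers. The formulas in RULES.md are written in SQL for the user's warehouse; the Python sketch below is only meant to make the two choices concrete. The `last_x_weeks` function and its parameter names are illustrative assumptions.

```python
from datetime import date, timedelta


def last_x_weeks(today: date, weeks: int, week_starts_monday: bool = True,
                 include_current: bool = False) -> tuple[date, date]:
    """Return the half-open window [start, end) for the last `weeks` weeks.

    With include_current=False the window ends at the start of the current
    (incomplete) week. With include_current=True the current week counts as
    one of the `weeks` and the window ends tomorrow.
    """
    # date.weekday(): Monday=0 ... Sunday=6.
    if week_starts_monday:
        days_into_week = today.weekday()
    else:
        days_into_week = (today.weekday() + 1) % 7
    current_week_start = today - timedelta(days=days_into_week)
    if include_current:
        start = current_week_start - timedelta(weeks=weeks - 1)
        end = today + timedelta(days=1)
    else:
        start = current_week_start - timedelta(weeks=weeks)
        end = current_week_start
    return start, end


# Wednesday 2024-05-15, Monday-start weeks, current week excluded.
print(last_x_weeks(date(2024, 5, 15), 2))
```

With the same date, switching to Sunday-start weeks moves both bounds back by one day, which is why the convention is worth confirming with the user before writing the SQL.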

Template — templates/RULES.md, the six-section scaffold:

# RULES.md

> Included with every message sent to the nao agent. Keep it lean.
> Per-table detail belongs in `databases/<table>.md`, not here.

## Business overview
**Product**: …
**Business model**: …

## Data architecture
**Warehouse:** …
**Data stack:** …
**Data layers:** …
**Data sources:** …

## Core data models
### Most Used Tables
- `<table>` — one-line purpose. See `databases/.../table=<table>/` folder.

### Tables detail
#### `<table>`
**Purpose**: …
**Granularity**: One row per …
**Key Columns**: (≤10)
**Use For**: …

## Key Metrics Reference
### <Category>
- **<metric>** → `<table>.<column>`, `<formula>`

## Date filtering
> Convention: e.g. "Week starts Monday; 'last X weeks' excludes the current incomplete week."
### Last X weeks
```sql
…
```
### Last X days
### Current month

## Analysis Process
### 1. Understand the Question
### 2. Select the Right Table(s)
### 3. Write Efficient Queries
### 4. Validate Results
### 5. Provide Context
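
An audit-and-fill pass over an existing RULES.md starts by working out which of the six sections are already there. A minimal sketch of that check follows, assuming sections are identified by their level-2 headings exactly as in the scaffold; the `missing_sections` helper is hypothetical and ignores edge cases such as headings inside fenced code blocks.

```python
import re

EXPECTED_SECTIONS = [
    "Business overview",
    "Data architecture",
    "Core data models",
    "Key Metrics Reference",
    "Date filtering",
    "Analysis Process",
]


def missing_sections(rules_md: str) -> list[str]:
    """Return expected level-2 headings that do not appear in the text."""
    found = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^## (.+)$", rules_md, flags=re.MULTILINE)
    }
    return [s for s in EXPECTED_SECTIONS if s.lower() not in found]


# Hypothetical partial draft with two of the six sections.
draft = "# RULES.md\n\n## Business overview\n**Product**: …\n\n## Date filtering\n"
print(missing_sections(draft))
```

Presence is only the first half of the check: a section can exist and still be thin, which needs a human or agent read rather than a regex.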

create-context-tests

Generates a test suite of natural-language → SQL pairs that becomes the reliability benchmark. nao test runs each prompt through the agent, executes both the agent’s SQL and the test’s expected SQL, and diffs the result data row by row. See Evaluation for the scoring model.

Two authoring rules

Prompts read like real chat. Short, vague, no table / column / method hints. "How's churn looking this quarter?", not "What was the churn rate from fct_subscriptions in Q1?".
Output column names encode format / unit, not source. churn_rate_float_0_1, not churn_rate_from_fct_subscriptions.
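
Both authoring rules can be spot-checked mechanically before a human review. The heuristics below are assumptions chosen for illustration (snake_case tokens as a sign of table or column hints, `_from_` as a sign of a source-encoding column name); they are not rules enforced by nao.

```python
import re

# Assumption: snake_case tokens (e.g. fct_subscriptions) signal a table/column hint.
SNAKE_CASE = re.compile(r"\b[a-z0-9]+(?:_[a-z0-9]+)+\b")


def prompt_hints(prompt: str) -> list[str]:
    """Return snake_case identifiers found in a test prompt."""
    return SNAKE_CASE.findall(prompt)


def source_named_columns(columns: list[str]) -> list[str]:
    """Return output column names that mention their source via '_from_'."""
    return [c for c in columns if "_from_" in c]


print(prompt_hints("How's churn looking this quarter?"))
print(prompt_hints("What was the churn rate from fct_subscriptions in Q1?"))
print(source_named_columns(["churn_rate_float_0_1", "churn_rate_from_fct_subscriptions"]))
```

A clean result does not prove a prompt reads like real chat; it only catches the most obvious leaks.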

Steps

1. Ask once: does the user have trusted source-of-truth queries (Looker, dashboards, prior benchmarks)? Transform each into a test; for metrics without a trusted query, draft new ones.
2. Save flat under tests/ (no subfolders), one YAML file per test.
3. Have the user validate: prompts match their team’s phrasing, SQL matches their definition of truth.
4. Run nao test -m <model_id> -t 10, then recap pass rate, token cost, and wall-clock time as the baseline.
5. Diagnose failures: read tests/outputs/, identify the rule gap, route to write-context-rules for the smallest fix. Re-run between fixes so impact is attributable.
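
When reading a failure, it helps to have a mental model of what a row-by-row comparison reports. The sketch below is one simple positional interpretation, not nao's actual comparison logic (the Evaluation page covers the scoring model); it ignores row ordering, float tolerance, and column names. `diff_rows` and the sample rows are hypothetical.

```python
from itertools import zip_longest
from typing import Optional

Row = tuple  # one result row, columns in query order
Mismatch = tuple[int, Optional[Row], Optional[Row]]


def diff_rows(expected: list[Row], actual: list[Row]) -> list[Mismatch]:
    """Compare two result sets position by position.

    Returns (row_index, expected_row, actual_row) for every mismatch; a row
    present on only one side is paired with None.
    """
    mismatches = []
    for i, (exp, act) in enumerate(zip_longest(expected, actual)):
        if exp != act:
            mismatches.append((i, exp, act))
    return mismatches


# Made-up result sets: one differing value and one extra row on the agent side.
expected = [("pro", 0.12), ("free", 0.30)]
actual = [("pro", 0.12), ("free", 0.31), ("enterprise", 0.05)]
print(diff_rows(expected, actual))
```

A differing value usually points at a metric-definition or date-selection gap, while an extra or missing row more often points at a filter or grain mismatch.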

Template — templates/test.yaml:

name: churn_rate_last_quarter
prompt: How's churn looking this quarter?
sql: |
    SELECT
      SAFE_DIVIDE(churned, total) AS churn_rate_float_0_1
    FROM (
      SELECT
        COUNTIF(churned_at IS NOT NULL) AS churned,
        COUNT(*) AS total
      FROM <project>.<schema>.fct_subscriptions
      WHERE started_at < DATE_TRUNC(CURRENT_DATE, QUARTER)
    );

# Optional:
# category: revenue | activity | conversion | churn | retention | …
# difficulty: easy | medium | hard
# notes: why this test matters / what failure mode it catches

audit-context

Diagnoses a nao context. Finds gaps, MECE (mutually exclusive, collectively exhaustive) violations, failure root causes, and bloat. Output is a short in-conversation report ending in a prioritized plan. Diagnose only; it never fixes. Routes fixes to write-context-rules / add-semantic-layer / create-context-tests. Run any time: right after setup-context, mid-build, before a release, or when behavior gets surprising.

Steps (six checks, in order)

1. Synced context: what’s wired in (warehouse, repos, Notion, semantic layer, MCPs) vs missing. Has nao sync run? Scope check: ≤100 tables hard ceiling, ≤20 ideal. Oversized scope is the biggest predictor of reliability failure, so flag it explicitly.
2. RULES.md vs target structure: the six sections from write-context-rules. Per section, mark present / missing / thin. Flag placeholders, TODO: markers, and metric entries with no source-of-truth pointer.
3. Per-table coverage: for every table in databases/, is it in ### Most Used Tables? Does it have a ### Tables detail block? dbt context (schema.yml)? Per-table gaps: undocumented columns, calculated fields with no explanation, foreign keys with no relation.
4. Data model consistency (MECE): two tables computing the same metric differently? Requested metrics that no in-scope table can answer? Duplicated columns under different names? Ambiguous columns (amount without unit, status without enum values)?
5. Test coverage: if tests/ is empty, recommend create-context-tests. Otherwise read tests/outputs/ and categorize each failure (data model / date selection / test issue / interpretation / metric definition) with the smallest rule change per failure.
6. Token optimization: files >40KB, ### Tables detail blocks past the 10-column cap, duplication between RULES.md and databases/<table>.md, in-scope tables with no mention in any test.
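
Two of the checks above are purely numeric and easy to automate: the table-count thresholds and the 40KB file-size limit. The sketch below assumes "40KB" means 40 × 1024 bytes and that table counts are supplied by the caller; the helper names are hypothetical.

```python
from pathlib import Path

MAX_TABLES = 100            # hard ceiling from the scope check
IDEAL_TABLES = 20           # ideal size from the scope check
MAX_FILE_BYTES = 40 * 1024  # assumption: "40KB" read as 40 * 1024 bytes


def scope_status(table_count: int) -> str:
    """Classify the number of in-scope tables against the audit thresholds."""
    if table_count <= IDEAL_TABLES:
        return "ideal"
    if table_count <= MAX_TABLES:
        return "acceptable"
    return "oversized"


def oversized_files(root: Path) -> list[Path]:
    """Return files under `root` larger than the 40KB threshold."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_size > MAX_FILE_BYTES
    )


print(scope_status(18), scope_status(64), scope_status(140))
```

The remaining checks (thin sections, MECE violations, failure categorization) need judgment and are better left to the agent running the skill.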

Output

Lead with a one-paragraph summary (sync state | scope width | rules quality (N/6 sections substantive) | test coverage), deep-dive only into sections with findings, and end with a prioritized plan that names the skill that does each fix:

## Plan
(easy / 5 min) …      → write-context-rules
(small / 30 min) …    → create-context-tests
(medium / 1-2 hr) …   → audit-context (rerun after)
(large / multi-session) … → add-semantic-layer

add-semantic-layer

Wires a semantic layer into the agent so metric queries go through a single canonical definition. Add it only after nao test shows metric-reliability failures, not before. A semantic layer reduces the scope of answerable questions; the trade-off only pays off when reliability is the bottleneck. Schema gaps or date logic failures are rule-fixes, not semantic-layer fixes.

Steps

1. Pick the tool:

Option	Type	When
dbt MetricFlow	Metric store	Already running dbt Cloud with the Semantic Layer enabled.
Snowflake views / semantic	Semantic layer	Snowflake; using curated views or native semantic views.
nao semantic files	Semantic layer	No existing layer. Want a lightweight in-repo YAML.
Other (Looker, Cube, …)	Varies	Search the MCP registry; otherwise fall back to nao YAML.

2. Install the matching MCP under .claude/mcp.json (dbt-mcp / mcp-server-snowflake / Cortex MCP, etc.). Credentials via ${ENV_VAR} only, never inline.
3. Hand off to write-context-rules to route every metric in ## Key Metrics Reference through the new layer (e.g. MRR → query via dbt MCP query_metric (semantic layer)).
4. Validate: run nao chat on one of the user’s top questions, confirm the agent uses the semantic layer, then run nao test and compare to the pre-semantic-layer baseline pass rate. Reliability is the only reason to do this, so measure it.
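
The before/after comparison comes down to simple arithmetic on pass counts. A minimal sketch, with made-up numbers and hypothetical helper names; it does not read nao test output, whose format is not described here:

```python
def pass_rate(passed: int, total: int) -> float:
    """Fraction of tests passed; 0.0 when there are no tests."""
    return passed / total if total else 0.0


def reliability_delta(baseline: tuple[int, int], current: tuple[int, int]) -> float:
    """Change in pass rate (current minus baseline), in percentage points."""
    return round(100 * (pass_rate(*current) - pass_rate(*baseline)), 1)


# Hypothetical numbers: 14/20 before the semantic layer, 18/20 after.
print(reliability_delta((14, 20), (18, 20)))
```

A delta near zero or below suggests the failures were rule gaps rather than metric-definition gaps, in which case the semantic layer is narrowing scope without buying reliability.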

Template — nao YAML option (templates/semantic.yaml):

dimensions:
    - name: date
      type: date
      description: Calendar date. Use this for any time-based slicing.

    - name: plan
      type: categorical
      description: Subscription plan tier.
      values: [free, pro, enterprise]

metrics:
    - name: mrr
      definition: Monthly Recurring Revenue from active paying subscriptions, in USD.
      source:
          table: fct_stripe_mrr
          column: mrr_amount
          aggregation: SUM     # SUM | COUNT | COUNT_DISTINCT | AVG | MIN | MAX
      grain: month             # day | week | month | quarter | year
      dimensions: [date, plan, country]
      filters:
          - "status = 'active'"

For the dbt MetricFlow / Snowflake / Cortex paths, the metric definitions live upstream — this skill installs the MCP and routes RULES.md to it instead of writing local YAML.
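
For the nao YAML option, a small consistency check can catch typos before the agent relies on the file. The sketch below mirrors the template as plain Python data so it needs no YAML parser. The allowed aggregation and grain values come from the template comments, but the rule that every metric dimension must be declared is an assumption of this sketch, not documented nao behavior.

```python
# Mirrors the semantic.yaml template as plain Python data (no YAML parser needed).
SEMANTIC = {
    "dimensions": [
        {"name": "date", "type": "date"},
        {"name": "plan", "type": "categorical", "values": ["free", "pro", "enterprise"]},
    ],
    "metrics": [
        {
            "name": "mrr",
            "source": {"table": "fct_stripe_mrr", "column": "mrr_amount", "aggregation": "SUM"},
            "grain": "month",
            "dimensions": ["date", "plan", "country"],
            "filters": ["status = 'active'"],
        }
    ],
}

AGGREGATIONS = {"SUM", "COUNT", "COUNT_DISTINCT", "AVG", "MIN", "MAX"}
GRAINS = {"day", "week", "month", "quarter", "year"}


def check_semantic(doc: dict) -> list[str]:
    """Return human-readable problems found in a semantic definition."""
    declared = {d["name"] for d in doc.get("dimensions", [])}
    problems = []
    for metric in doc.get("metrics", []):
        name = metric["name"]
        if metric["source"]["aggregation"] not in AGGREGATIONS:
            problems.append(f"{name}: unknown aggregation")
        if metric["grain"] not in GRAINS:
            problems.append(f"{name}: unknown grain")
        for dim in metric.get("dimensions", []):
            if dim not in declared:
                problems.append(f"{name}: dimension '{dim}' is not declared")
    return problems


print(check_semantic(SEMANTIC))
```

Run against the template as shown, this flags `country`, which the mrr metric lists but the dimensions block does not declare. The template may simply omit it for brevity, so treat the finding as a prompt to confirm rather than an error.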

When to use which

setup-context         →  write-context-rules   →  create-context-tests  →  audit-context (anytime)
(first time only)        (any rules change)       (benchmark + extend)     (diagnose, never fix)
                                                          │
                                                          ▼
                                                   tests reveal metric
                                                   reliability gaps?
                                                          │
                                                          ▼
                                                  add-semantic-layer
                                                  (then back to write-context-rules)

If you want to…	Use
Set up nao on a brand-new project	`setup-context`
Generate or rewrite `RULES.md`	`write-context-rules`
Add tests for a new metric, or build the first benchmark	`create-context-tests`
Find out what’s missing, broken, or bloated	`audit-context`
Make metric calculations consistent across questions	`add-semantic-layer` (only after tests show the gap)
Refine `RULES.md` because the agent keeps making the same miss	`write-context-rules` (preceded by `audit-context` if unsure)

Source material

The skills are distilled from the public Context Engineering content.

Read each skill’s source SKILL.md in the repo: github.com/getnao/nao/tree/main/skills.
