type: One of athena, bigquery, clickhouse, databricks, duckdb, fabric, mssql, mysql, postgres, redshift, snowflake, starrocks, trino
include: Optional glob patterns for schema.table values to include
exclude: Optional glob patterns for schema.table values to exclude
templates: Optional list of rendered context files
The templates field used to be called accessors. The old key still works (nao will read it and migrate automatically), but new configs should use templates.
These are the built-in templates nao can render per table:
columns
description
how_to_use (default, AI-friendly per-table usage context built from query history)
preview
profiling
indexes (optional, currently used for ClickHouse table/index metadata)
ai_summary (optional, AI-generated table summary)
If you omit templates, nao renders columns, how_to_use, preview, and profiling by default. how_to_use replaces description as the default narrative file because it carries more usable signal for the agent (metadata, partitioning, common joins, frequent queries).
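As a rough illustration of the fields above, a database entry that narrows the synced tables and lists its templates explicitly might look like the sketch below. The database name, the glob patterns, and the particular template selection are hypothetical. Only the field names and template names come from this page, and connection settings are left out.

```yaml
databases:
  - name: analytics_pg            # hypothetical database name
    type: postgres
    # connection settings omitted
    include:
      - "analytics.*"             # hypothetical glob: every table in the analytics schema
    exclude:
      - "analytics.tmp_*"         # hypothetical glob: skip scratch tables
    templates:                    # assumed to be a plain list of template names
      - columns
      - how_to_use
      - preview
```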
Query history is supported on BigQuery, Snowflake, Databricks, Postgres, and Redshift. On warehouses without history support, how_to_use still renders the metadata + description sections.
Two optional fields let you control which queries feed into the how_to_use analysis:
query_history_sql overrides the built-in history query for a database. The SQL must return a query_text column. Use the {days} placeholder to inject the configured query_history_days value:
    databases:
      - name: warehouse_prod
        type: snowflake
        query_history_days: 30
        query_history_sql: |
          SELECT regexp_replace(query_text, '-- Looker.*$', '', 1, 0, 'm') AS query_text
          FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
          WHERE start_time >= DATEADD(day, -{days}, CURRENT_TIMESTAMP())
            AND execution_status = 'SUCCESS'
            AND query_type = 'SELECT'
          LIMIT 10000
This is useful when you need to strip BI tool comments (e.g. Looker slugs) before grouping, or to query a custom history table.
query_history_exclude_patterns filters out noise after fetching. Each entry is a case-insensitive regex. Any query whose text matches at least one pattern is dropped before analysis.
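A minimal sketch of query_history_exclude_patterns, assuming it sits on the database entry next to query_history_days and takes a list of regex strings. The three patterns are hypothetical examples of noise you might want to drop, not defaults shipped with nao.

```yaml
databases:
  - name: warehouse_prod
    type: snowflake
    query_history_days: 30
    query_history_exclude_patterns:   # assumed placement: same level as query_history_days
      - '^\s*show\s'                  # hypothetical: metadata commands such as SHOW TABLES
      - 'information_schema'          # hypothetical: catalog lookups issued by tools
      - '-- Looker'                   # hypothetical: anything carrying a Looker comment
```

The patterns are single-quoted so YAML passes the backslashes through to the regex unchanged.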
Profiling works across all supported warehouses for primitive columns and for complex column types (array, struct, map, json, row, tuple, variant, object, super). For array columns, nao unpacks the values before computing distinct counts and top values; other complex types are stringified before profiling.
ai_summary is opt-in. To use it, add ai_summary to templates and configure llm.annotation_model in nao_config.yaml. When enabled, nao renders databases/ai_summary.md.j2 and calls prompt("...") inside that template to generate LLM-based summaries during nao sync.
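The two steps could be sketched in nao_config.yaml as follows. This assumes the dotted name llm.annotation_model maps to a nested annotation_model key under a top-level llm section. The model value is a placeholder, and any other LLM settings your setup needs (provider, credentials) are not shown.

```yaml
llm:
  annotation_model: your-model-id   # placeholder, replace with the model you use

databases:
  - name: warehouse_prod
    type: snowflake
    templates:
      - columns
      - how_to_use
      - preview
      - profiling
      - ai_summary                  # opt-in: rendered only when listed here
```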
Use max_query_size (in GB) to cap how much data a single query can scan. When set, nao runs a BigQuery dry run before every SQL execution and rejects the query if the estimated bytes processed exceed the limit.
The check applies to every query the agent runs - chat, stories, evaluations, anything going through nao chat or nao test.
The limit is enforced before BigQuery scans the data, so blocked queries cost nothing.
Errors include the estimated bytes and the configured limit so the agent can suggest a tighter filter and retry.
Leave the field unset (or set it to 0) to disable the check.
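For illustration, a BigQuery entry with a scan cap might look like the sketch below. It assumes max_query_size is set on the database entry itself. The database name and the 5 GB value are hypothetical, and connection settings are omitted.

```yaml
databases:
  - name: bq_analytics      # hypothetical database name
    type: bigquery
    # connection settings omitted
    max_query_size: 5       # in GB; queries whose dry-run estimate exceeds this are rejected
```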
When you create a BigQuery connection through nao init, the CLI prompts for a maximum query size as part of the interactive setup. The same field is available in the IDE under Settings -> Warehouse Connections for the cloud and IDE flows.
For BigQuery tables, nao sync exposes partition columns and clustering columns as separate sections in each table's context. The agent uses partition columns to enforce WHERE filters that prune scanned bytes, and clustering columns to recommend the right join keys and predicate ordering for performance. Both are detected automatically - no config required.
When the agent runs nao sync against Snowflake, it also imports any semantic views declared in your account. Each view (metrics, dimensions, relationships pulled from INFORMATION_SCHEMA.SEMANTIC_VIEWS) is written to:
The agent reads these alongside table metadata, so business definitions you maintain in Snowflake flow into the context layer with no extra config. If your account has no semantic views (or the feature isn't available on your edition), nao sync skips the step silently. No new fields are needed in nao_config.yaml.
When syncing Trino tables, nao automatically imports table-level comments from system.metadata.table_comments and column-level comments from DESCRIBE. These comments appear in the generated context files alongside schema metadata, giving the agent richer descriptions without any extra configuration.
When the agent generates SQL, nao auto-detects the warehouse dialect from the target database's type and injects extra rules into the system prompt so the query uses the right syntax:
T-SQL (MSSQL, Fabric): use TOP N instead of LIMIT.
BigQuery: quote identifiers with backticks and use SAFE_DIVIDE(a, b) instead of a / b to avoid divide-by-zero errors.
MySQL: quote identifiers with backticks and use IFNULL instead of COALESCE for null handling.
PostgreSQL, Snowflake, Redshift, Databricks, and other standard SQL warehouses do not get extra dialect rules - the agent falls back to standard SQL. If a chat uses several databases of different types, the rules for each are scoped to queries targeting that database.