Meta Context Schema

A practical framework for encoding business knowledge directly in dbt semantic layer YAML. Five layers, 36 fields, three tiers. Your semantic layer tells the LLM how to query. Meta context tells it how to think about what comes back.

The empty row in the data stack

There’s no shortage of articles about why we need a “context layer.” The trillion-dollar platform opportunity. Context graphs. AI needs business context. Everyone agrees the layer is missing. Nobody has published a practical framework for filling it in.

Layer         System of Record           Solved By
Storage       The warehouse              Snowflake, BigQuery, Redshift
Logic         Transformations + metrics  dbt, MetricFlow
Intelligence  ???                        ???

The intelligence — what analysts actually get paid to produce — has no system of record. It lives in Slack threads, Confluence pages, onboarding sessions, and the heads of people who might leave next quarter. LLMs are the first consumer that can use structured interpretive knowledge programmatically. This schema gives them what they need.

Your semantic layer tells the LLM how to query. Meta context tells it how to think about what comes back.

Five Layers, One Metric
Each layer prevents a specific LLM failure mode. Without Layer 2, the agent can’t assess severity. Without Layer 5 alongside Layer 2, it fabricates SLA obligations. The ordering is not arbitrary.
Layer 1 — Context: "Who cares and why does this exist?" Prevents: misinterpretation

Field                    Type        Tier  Description
purpose                  string      Core  What this metric measures and why
business_question        string      Core  The decision question this metric answers
owner                    string      Core  Primary team or role responsible
stakeholders             list        Rec   Other teams who consume or are affected
definition               string      Rec   Precise business definition distinguishing from similar metrics
aliases                  list        Opt   Other names this metric goes by in the org
data_domain              string      Opt   Business domain (e.g., "fulfillment", "finance")
granularity              string      Opt   Grain the metric is computed at

Layer 2 — Expectations: "What does good look like?" Prevents: bad calibration

Field                    Type        Tier  Description
healthy_range            [num, num]  Core  P5/P95 operating range from trailing 12 months
warning_threshold        number      Core  Value warranting attention (not yet critical)
critical_threshold       number      Core  Emergency value requiring immediate action
seasonality              string      Core  When, how much, and why. Include magnitude.
trend                    string      Rec   Current direction and cause
target                   number      Rec   Aspirational goal (distinct from healthy_range)
segment_expectations     list        Opt   Different thresholds per segment
volatility               string      Opt   Normal day-to-day variance
baseline_date            date        Opt   When thresholds were last calibrated

Layer 3 — Investigation: "When it breaks, where do I look first?" Prevents: wrong decomposition

Field                    Type        Tier  Description
causal_dimensions        list[obj]   Core  Dimensions to slice by, each with {name, why, priority}
investigation_path       string      Core  Conditional decision tree (not a flat list)
common_false_positives   list        Rec   Scenarios that look like problems but aren't
known_root_causes        list[obj]   Rec   Historical incidents with date, description, resolution
data_quality_gotchas     list        Opt   Upstream data issues that mimic real drops

Layer 4 — Relationships: "What else moves when this moves?" Prevents: isolated reasoning

Field                    Type        Tier  Description
correlates_with          list[obj]   Core  Each: {metric, relationship}. Typed: "inverse", "leading indicator", etc.
affected_by              list[obj]   Core  External events with magnitude of impact
leads_to                 list[obj]   Rec   Downstream metrics this feeds (directional)
decomposes_into          list[obj]   Opt   Sub-metrics that compose to this one
shared_dimensions        list        Opt   Dimensions shared with correlated metrics

Layer 5 — Decisions: "What do I do about it?" Prevents: false confidence

Field                    Type        Tier  Description
when_this_drops          list[obj]   Core  Action protocols: {threshold, action}
business_rules           list        Core  SLAs, policies. Without this, Layers 2–4 create false confidence.
when_this_spikes         list[obj]   Rec   Action protocols for upward anomalies
escalation_path          list[obj]   Rec   Who to escalate to: {severity, contact}
notification_channels    list[obj]   Opt   Where alerts go: {severity, channel}
review_cadence           string      Opt   How often formally reviewed
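The tier split above is mechanical enough to encode in code. A minimal sketch, with the core field sets transcribed from the tables (the CORE_FIELDS constant and missing_core_fields helper are my own naming, not part of any schema spec):

```python
# Core (required) fields per layer, transcribed from the tables above.
CORE_FIELDS = {
    "context": {"purpose", "business_question", "owner"},
    "expectations": {"healthy_range", "warning_threshold",
                     "critical_threshold", "seasonality"},
    "investigation": {"causal_dimensions", "investigation_path"},
    "relationships": {"correlates_with", "affected_by"},
    "decisions": {"when_this_drops", "business_rules"},
}

def missing_core_fields(meta: dict) -> list[str]:
    """Return 'layer.field' paths for every core field absent from a meta block."""
    missing = []
    for layer, fields in CORE_FIELDS.items():
        present = set(meta.get(layer, {}))
        missing += [f"{layer}.{f}" for f in sorted(fields - present)]
    return missing

# Example: a meta block with only Layer 1 filled in.
meta = {"context": {"purpose": "...", "business_question": "...", "owner": "..."}}
print(missing_core_fields(meta))
```

Running this against a partially filled block lists exactly which of the 13 core fields still need attention, which makes "start small, grow the context" measurable.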
[Figure: the five layers as a stacked pyramid. Layer 1 Context (interpretation) at the base, then Layer 2 Expectations (calibration), Layer 3 Investigation (framing), Layer 4 Relationships (reasoning), and Layer 5 Decisions (action) at the top. Layer 2 without Layer 5 is marked as the false confidence zone. Field counts by tier: 13 core, 11 recommended, 12 optional.]
meta:
  # Layer 1: Context
  context:
    purpose: ""
    business_question: ""
    owner: ""

  # Layer 2: Expectations
  expectations:
    healthy_range: [min, max]
    warning_threshold: null
    critical_threshold: null
    seasonality: ""

  # Layer 3: Investigation
  investigation:
    causal_dimensions:
      - name: ""
        why: ""
        priority: 1
    investigation_path: ""

  # Layer 4: Relationships
  relationships:
    correlates_with:
      - metric: ""
        relationship: ""
    affected_by:
      - event: ""
        impact: ""

  # Layer 5: Decisions
  decisions:
    when_this_drops:
      - threshold: null
        action: ""
    business_rules:
      - ""

  last_validated: ""
  schema_version: "1.0"
It Survives dbt parse
Meta context rides the existing meta: property — dbt preserves it all the way through to the compiled manifest. Zero new tooling. Zero new dependencies.
"meta": {
  "context": {
    "purpose": "Measures the share of OCT disbursement transactions
      that are declined by the receiving bank or processor.",
    "business_question": "Are OCT disbursements being rejected at an
      abnormal rate, and if so, by which failure reason?",
    "owner": "Analytics Team"
  },
  "expectations": {
    "healthy_range": [0.02, 0.08],
    "warning_threshold": 0.10,
    "seasonality": "No strong seasonal pattern. Decline rates are
      processor-dependent, not calendar-driven."
  },
  "investigation": {
    "causal_dimensions": [
      {
        "name": "program_code",
        "why": "Different BaaS programs have different decline profiles.
          A spike in one program does not indicate system-wide failure.",
        "priority": 1
      },
      {
        "name": "global_fund_transfer_status_reason",
        "why": "Status reason codes distinguish BIN ineligibility from
          limit exceeded from issuer decline — different root causes.",
        "priority": 2
      }
    ]
  }
}

From manifest.json after dbt parse — OCT decline rate metric, disbursements pipeline.
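Because the block survives compilation, downstream tools can read it back with no dbt tooling at all. A minimal sketch of pulling one metric's meta out of a loaded manifest dict (the helper name is mine; note the hedge in the lookup, since depending on dbt version, meta can land at the node top level or under config.meta):

```python
import json

def metric_meta(manifest: dict, metric_name: str) -> dict:
    """Pull the meta block for one metric out of a compiled dbt manifest dict."""
    # Metric nodes are keyed like "metric.<project>.<name>".
    for node in manifest.get("metrics", {}).values():
        if node.get("name") == metric_name:
            # Depending on dbt version, meta may sit at the node top level
            # or under config.meta; check both.
            return node.get("meta") or node.get("config", {}).get("meta", {}) or {}
    raise KeyError(f"metric {metric_name!r} not found in manifest")

# Usage (target/manifest.json is dbt's default artifact location):
# manifest = json.load(open("target/manifest.json"))
# print(metric_meta(manifest, "oct_decline_rate")["expectations"]["healthy_range"])
```

This is the whole integration surface: anything that can read JSON can consume the context layer.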

Three Steps to Your First Meta Context
You don’t need 36 fields. Start with 3 on your most-queried metric.
Step 1
Pick one metric and gather your docs
Choose the metric your team asks about most. Gather whatever business context exists — pipeline guide, data dictionary, Confluence page, Slack threads, analysis decks. It doesn’t need to be organized.
Step 2
Hand an LLM your docs + this extraction prompt
Give any frontier model your documents and the prompt below. It extracts structured context into YAML you can paste directly into your meta: block.
I'm enriching the dbt semantic layer metric [metric_name] with structured
business context. Extract the following from the attached documents into YAML:

Layer 1 — Context:
  purpose: What does this metric measure? Scope-bounded.
  business_question: What decision does this inform?
  owner: Who's responsible?

Layer 2 — Expectations:
  healthy_range: Normal operating range (empirical, not guessed).
  warning_threshold / critical_threshold: Attention vs escalation values.
  seasonality: Time patterns with magnitude ("Q4 +20-30%", not "higher in Q4").

Layer 3 — Investigation:
  causal_dimensions: Which dimensions explain variance? {name, why, priority}
  investigation_path: Conditional tree — "IF system-wide, check X. IF single segment, check Y."

Layer 4 — Relationships:
  correlates_with: What moves with this? Type the relationship (inverse, leading indicator).
  affected_by: External events with impact magnitude.

Layer 5 — Decisions:
  when_this_drops: Action at each threshold.
  business_rules: SLAs/policies. If none, state "No formal SLA documented."

Output valid YAML starting at the "meta:" key. Only include fields where
the docs provide evidence — leave others out rather than guessing.
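Before pasting the model's output into your schema file, a mechanical check that it really is valid YAML rooted at a single meta: key catches most copy-paste failures. A sketch using PyYAML (the check_llm_output helper is my own naming; pip install pyyaml if needed):

```python
import yaml  # PyYAML

def check_llm_output(text: str) -> dict:
    """Parse LLM-emitted YAML and confirm it is a mapping rooted at 'meta'."""
    doc = yaml.safe_load(text)
    if not isinstance(doc, dict) or set(doc) != {"meta"}:
        raise ValueError("output must be a YAML mapping with one top-level 'meta:' key")
    return doc["meta"]

sample = """
meta:
  context:
    purpose: "Share of OCT disbursements declined by the receiving bank."
    owner: "Analytics Team"
"""
meta = check_llm_output(sample)
print(sorted(meta["context"]))  # ['owner', 'purpose']
```

It will not catch a hallucinated threshold, but it guarantees the block you paste will at least survive parsing.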
Step 3
Paste into your schema YAML and run dbt parse
Add the meta: block under your metric definition. The placement is simple — it’s a sibling of type:, label:, etc. dbt preserves it as-is without parsing the contents.
# Placement — meta: is a sibling of type/label on the metric:

metrics:
  - name: my_metric
    label: My Metric
    type: simple
    type_params:
      measure: my_measure
    meta:                          # <-- add here
      context:
        purpose: "..."
        business_question: "..."
        owner: "..."
      expectations:
        healthy_range: [0.90, 0.99]
        warning_threshold: 0.88
        seasonality: "..."
      investigation:
        causal_dimensions:
          - name: segment
            why: "..."
            priority: 1
      decisions:
        when_this_drops:
          - threshold: 0.88
            action: "..."
        business_rules:
          - "..."

That’s it. Run dbt parse to verify YAML validity. The nested structure is preserved all the way through compilation into the manifest and is accessible via the Semantic Layer API.
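Once context is deployed, the last_validated field from the skeleton earns its keep: a scheduled job can flag metrics whose context has gone stale. A sketch under assumed conventions (ISO yyyy-mm-dd dates, and a 90-day window that is an arbitrary choice of mine):

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=90)  # assumed review window; tune to taste

def stale_metrics(manifest: dict, today: date) -> list[str]:
    """List metrics whose meta.last_validated is missing or older than the window."""
    stale = []
    for node in manifest.get("metrics", {}).values():
        meta = node.get("meta") or {}
        try:
            ok = today - date.fromisoformat(meta.get("last_validated", "")) <= STALE_AFTER
        except ValueError:
            ok = False  # missing or malformed date counts as stale
        if not ok:
            stale.append(node.get("name", "?"))
    return stale

manifest = {"metrics": {
    "metric.p.fresh": {"name": "fresh", "meta": {"last_validated": "2025-01-10"}},
    "metric.p.old":   {"name": "old",   "meta": {"last_validated": "2023-01-01"}},
}}
print(stale_metrics(manifest, today=date(2025, 2, 1)))  # ['old']
```

Run against target/manifest.json in CI and the context layer stops rotting silently the way Confluence pages do.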