
A Knowledge Engineering Presentation

From Metrics
to Knowledge

There's been plenty written about knowledge layers and context engineering. Less about what it actually looks like to implement one. This is a practical proposal for encoding business context into dbt's semantic layer using MetricFlow's existing meta property.

Patterns from a production pipeline · Order Fulfillment example

Keith Binkly · data-centered.com

01 The Problem

The semantic layer answers "what happened?"

Everything else — why it happened, whether it's normal, what to investigate, or what to do about it — lives in people's heads, Slack threads, and runbooks nobody reads.

dbt Semantic Layer

MetricFlow YAML
Dimensions, measures, metrics
Computed answers

?

Business Context

Thresholds, investigation paths
Correlations, decision rules
Tribal knowledge

The interface between these two worlds is undefined. This presentation defines it.

THE CONVERSATION

Tristan Handy: "dbt's biggest role in the AI revolution will be as the source of truth for metadata." — This deck is a concrete answer to what that metadata looks like.

Eric Simon: The data stack has read-path context (schemas, lineage, quality metrics) but is missing write-path context — why decisions were made, what edge cases matter, what judgment calls shaped the logic. That's the gap.

Veronika Heimsbakk: "Every JOIN is a confession that your data model threw away a relationship that you now need back." She argues for richer data models. This argues for richer metadata models. Same impulse.

02 Layer 0 — Starting Point

Plain MetricFlow

sem_order_fulfillment.yml

semantic_models:
  - name: order_fulfillment
    defaults:
      agg_time_dimension: order_date
    model: ref('fct_orders')
    entities:
      - name: order
        type: primary
        expr: order_id
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
      - name: fulfillment_channel
        type: categorical
      - name: payment_method
        type: categorical
    measures:
      - name: total_orders
        agg: count
        expr: order_id
      - name: successful_orders
        agg: count
        expr: "case when order_status = 'delivered' then order_id end"

metrics:
  - name: order_success_rate
    type: derived
    type_params:
      expr: successful_orders / total_orders
      metrics:
        - name: successful_orders
        - name: total_orders

The semantic model most teams ship today. Everything the system needs to compute an answer.

Entities

Joinable business objects — orders, customers

Dimensions

How you slice the data — date, channel, payment method

Measures

Raw aggregations — counts, sums

Metrics

Business KPIs — derived from measures

This is what most teams have. It computes correctly. It explains nothing.


03 What's Missing

Five questions the semantic layer can't answer

01

Why did success rate drop 8% this week?
The metric reports the number. It doesn't know where to look.

02

Is 92% good or bad?
No thresholds, no baselines, no seasonal context.

03

Which dimensions should I investigate first?
An analyst knows. The YAML doesn't.

04

Has this happened before? What caused it?
No correlation data. No event context.

05

Who owns this and what do they do about it?
No ownership. No decision rules. No runbook.

Every one of these answers exists somewhere — Slack, docs, people's heads. The question is: can we encode them in the semantic layer itself?

04 Layer 1 — Purpose & Business Questions

meta.context

What it measures and who owns it

NEW ADDITIONS

metrics:
  - name: order_success_rate
    type: derived
    type_params:
      expr: successful_orders / total_orders
      metrics:
        - name: successful_orders
        - name: total_orders
    meta:
      context:
        purpose: |
          Measures end-to-end order completion from payment
          confirmation through successful delivery.
        business_question: |
          "Are customers receiving what they ordered,
          within the timeframe we promised?"
        owner: fulfillment-ops
        stakeholders: [logistics, customer-success, finance]
What this enables

Humans reading the YAML understand what the metric means in business terms. AI agents can explain what they're looking at. New team members onboard in minutes, not weeks. The business_question becomes the agent's opening line when presenting results.
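As a sketch of how an agent might consume Layer 1 (the dict shape mirrors the YAML above; the helper name `opening_line` is an illustrative assumption, not a dbt or MetricFlow API):

```python
# Hypothetical agent helper: frame a result with the metric's business question.
meta = {
    "context": {
        "purpose": "Measures end-to-end order completion from payment "
                   "confirmation through successful delivery.",
        "business_question": "Are customers receiving what they ordered, "
                             "within the timeframe we promised?",
        "owner": "fulfillment-ops",
        "stakeholders": ["logistics", "customer-success", "finance"],
    }
}

def opening_line(meta: dict, value: float) -> str:
    """Use business_question as the agent's framing when presenting a number."""
    ctx = meta["context"]
    return f'{ctx["business_question"]} Current answer: {value:.1%}.'

print(opening_line(meta, 0.893))
```

The point is not the helper itself but that the framing lives in version-controlled YAML rather than in an analyst's head.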

05 Layer 2 — Thresholds & Expectations

meta.expectations

Define “normal” so the system knows when to worry

NEW ADDITIONS

meta:
  context:
    # ... purpose, business_question, owner (layer 1) ...
  expectations:
    healthy_range: [0.94, 0.99]
    warning_threshold: 0.92
    critical_threshold: 0.88
    seasonality: |
      Drops 3-5% during Nov-Dec peak season.
      Post-holiday returns inflate failure count in Jan.
    trend: "Improving ~0.5%/quarter since warehouse automation (Q3 2025)"
What this enables

Automated alerting that understands "normal." An agent seeing 91% in December says "within seasonal range" instead of firing an alert. The same agent seeing 91% in March says "investigate immediately — 3 points below warning threshold, outside seasonal pattern."
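A minimal sketch of that classification logic, assuming the seasonality prose is also encoded numerically (the `seasonal_tolerance` key, `PEAK_MONTHS` set, and `classify` function are illustrative assumptions, not part of the schema above):

```python
# Hypothetical alert logic driven by Layer 2 expectations.
expectations = {
    "healthy_range": [0.94, 0.99],
    "warning_threshold": 0.92,
    "critical_threshold": 0.88,
    "seasonal_tolerance": 0.05,  # assumed encoding of "drops 3-5% in Nov-Dec"
}
PEAK_MONTHS = {11, 12}  # assumed machine-readable form of the seasonality note

def classify(value: float, month: int, exp: dict) -> str:
    low = exp["healthy_range"][0]
    # During peak season, widen the acceptable band by the seasonal tolerance.
    if month in PEAK_MONTHS and value >= low - exp["seasonal_tolerance"]:
        return "within seasonal range"
    if value < exp["critical_threshold"]:
        return "critical"
    if value < exp["warning_threshold"]:
        return "investigate"
    return "healthy"

print(classify(0.91, 12, expectations))  # December: within seasonal range
print(classify(0.91, 3, expectations))   # March: investigate
```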

06 Layer 3 — Causal Dimensions

meta.investigation

Encode the investigation path experienced analysts follow

NEW ADDITIONS

meta:
  # ... layers 1-2 ...
  investigation:
    causal_dimensions:
      - name: fulfillment_channel
        why: "Channel determines SLA and failure mode"
        priority: 1
      - name: shipping_carrier
        why: "#1 root cause of delivery failures"
        priority: 2
      - name: warehouse_region
        why: "Regional weather/labor issues cause localized drops"
        priority: 3
      - name: payment_method
        why: "Payment failures look like fulfillment failures here"
        priority: 4
    investigation_path: |
      1. Check by fulfillment_channel (marketplace vs direct)
      2. If direct: check by shipping_carrier
      3. If carrier-specific: check by warehouse_region
      4. If across carriers: check payment_method for upstream cause
What this enables

An AI agent doesn't slice by every dimension randomly — it follows the investigation path that experienced analysts use. Priority ordering means the agent checks the most likely cause first. The why field explains each step so the agent can narrate its reasoning. This is tribal knowledge made queryable.
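The ordering step is trivial once the priorities exist in the metadata; a sketch (the `investigation_order` helper is an assumption, mirroring the `causal_dimensions` structure above):

```python
# Hypothetical agent step: order dimension checks by encoded priority.
causal_dimensions = [
    {"name": "shipping_carrier", "why": "#1 root cause of delivery failures", "priority": 2},
    {"name": "fulfillment_channel", "why": "Channel determines SLA and failure mode", "priority": 1},
    {"name": "payment_method", "why": "Payment failures look like fulfillment failures here", "priority": 4},
    {"name": "warehouse_region", "why": "Regional weather/labor issues cause localized drops", "priority": 3},
]

def investigation_order(dims: list) -> list:
    """Return dimension names in the order an analyst would check them."""
    return [d["name"] for d in sorted(dims, key=lambda d: d["priority"])]

print(investigation_order(causal_dimensions))
```

The `why` strings ride along so the agent can narrate each step as it takes it.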

07 Layer 4 — What Else Moves

meta.relationships

Map what else moves when this metric moves

NEW ADDITIONS

meta:
  # ... layers 1-3 ...
  relationships:
    correlates_with:
      - metric: return_rate
        relationship: "inverse — high returns lag low success by 5-7 days"
      - metric: carrier_on_time_rate
        relationship: "leading indicator — carrier delays precede delivery failures"
      - metric: payment_decline_rate
        relationship: "upstream cause — payment failures reduce the denominator"
    affected_by:
      - event: warehouse_capacity_change
        impact: "New warehouse online improves regional success by 2-3%"
      - event: carrier_contract_update
        impact: "Carrier SLA changes directly affect delivery window compliance"
      - event: holiday_peak_season
        impact: "Volume spike + temp staff = 3-5% success rate decline"
What this enables

The metric doesn't exist in isolation. An agent investigating a drop can automatically check correlated metrics and known external events: "Success rate dropped 4% — and carrier_on_time_rate dropped 6% yesterday. Likely carrier issue, not a fulfillment problem."
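A sketch of the correlation check, assuming recent metric deltas are available from your metrics store (the `moved_correlates` helper, the delta dict, and the 2% threshold are all illustrative assumptions):

```python
# Hypothetical step: when the primary metric drops, scan its known correlates
# and surface the ones that also moved recently.
correlates_with = [
    {"metric": "return_rate", "relationship": "inverse, lags by 5-7 days"},
    {"metric": "carrier_on_time_rate", "relationship": "leading indicator"},
    {"metric": "payment_decline_rate", "relationship": "upstream cause"},
]

def moved_correlates(correlations, recent_deltas, threshold=0.02):
    """Return (metric, delta, relationship) for correlates that moved notably."""
    return [
        (c["metric"], recent_deltas[c["metric"]], c["relationship"])
        for c in correlations
        if abs(recent_deltas.get(c["metric"], 0.0)) >= threshold
    ]

deltas = {"return_rate": 0.003, "carrier_on_time_rate": -0.06, "payment_decline_rate": 0.001}
print(moved_correlates(correlates_with, deltas))
```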

08 Layer 5 — Decision Context

meta.decisions

Tell the system what to do when thresholds are crossed

NEW ADDITIONS

meta:
  # ... layers 1-4 ...
  decisions:
    when_this_drops:
      - threshold: "< 0.92"
        action: |
          Check carrier dashboard for service disruptions.
          If carrier-specific: escalate to logistics-ops.
          If cross-carrier: investigate warehouse operations.
      - threshold: "< 0.88"
        action: |
          CRITICAL: Page fulfillment-ops on-call.
          Check for payment processor outage (upstream).
          Prepare customer communication if regional.
    business_rules:
      - "SLA: 97% success rate guaranteed to enterprise customers"
      - "Below 94% triggers automatic carrier performance review"
      - "Below 90% for 3 consecutive days = executive escalation"
What this enables

An agent doesn't just detect a problem — it knows what to do. The decision context turns a reporting tool into a reasoning system. Business rules become enforceable, not just documented.
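Resolving a value against the rules is a few lines; a sketch that parses the `"< 0.92"` threshold convention from the YAML above (the `matching_actions` helper is an assumption, and a real parser would handle more operators than `<`):

```python
# Hypothetical rule resolver for Layer 5 decision context.
when_this_drops = [
    {"threshold": "< 0.92", "action": "Check carrier dashboard; escalate to logistics-ops."},
    {"threshold": "< 0.88", "action": "CRITICAL: page fulfillment-ops on-call."},
]

def matching_actions(value: float, rules: list) -> list:
    """Return actions for every crossed threshold, in the order defined."""
    actions = []
    for rule in rules:
        op, bound = rule["threshold"].split()  # e.g. "<", "0.92"
        if op == "<" and value < float(bound):
            actions.append(rule["action"])
    return actions

print(matching_actions(0.893, when_this_drops))  # crosses warning, not critical
```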

09 The Complete Picture

Before & After

Layer 0 — 27 lines
semantic_models:
  - name: order_fulfillment
    defaults:
      agg_time_dimension: order_date
    model: ref('fct_orders')
    entities:
      - name: order
        type: primary
        expr: order_id
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: order_date
        type: time
      - name: fulfillment_channel
        type: categorical
      - name: payment_method
        type: categorical
    measures:
      - name: total_orders
        agg: count
      - name: successful_orders
        agg: count

metrics:
  - name: order_success_rate
    type: derived
All 5 Layers — 27 + ~55 lines in meta
# ... same semantic_models definition ...

metrics:
  - name: order_success_rate
    type: derived
    type_params:
      # ...unchanged...
    meta:
      context:                # Layer 1
        purpose: ...
        business_question: ...
        owner: fulfillment-ops
        stakeholders: [...]
      expectations:           # Layer 2
        healthy_range: [0.94, 0.99]
        warning_threshold: 0.92
        critical_threshold: 0.88
        seasonality: ...
      investigation:          # Layer 3
        causal_dimensions:
          - name: fulfillment_channel
            priority: 1
          - ...
        investigation_path: ...
      relationships:          # Layer 4
        correlates_with:
          - metric: return_rate
          - ...
        affected_by:
          - event: holiday_peak_season
          - ...
      decisions:              # Layer 5
        when_this_drops:
          - threshold: "< 0.92"
            action: ...
        business_rules:
          - "SLA: 97% guaranteed"

The MetricFlow definition didn't change. Everything new is in meta — backwards compatible, progressive, optional. dbt ignores what it doesn't recognize.

10 Why This Works

Metadata is already there. We're just using more of it.

dbt's meta property is an official, first-party feature — a freeform dictionary on every resource. Teams already use it for ownership tags and PII flags. This takes it further: from simple labels to the full business knowledge an agent needs to reason about a metric.

Official & universal

meta is built into dbt core. Available on metrics, dimensions, entities, measures, models, sources — everywhere the semantic layer touches.

Already wired up

Compiles into manifest.json. Exposed through the Semantic Layer API. Every downstream tool — including AI agents — can read it today.
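A sketch of that read path. The exact location of `meta` inside the manifest has shifted across dbt versions (top-level `meta` vs. `config.meta`), so this helper checks both; verify against the manifest your dbt version emits:

```python
# Hypothetical reader: pull a metric's meta out of a compiled manifest dict.
def metric_meta(manifest: dict, metric_name: str) -> dict:
    """Find a metric node by name and return its meta dict, if any."""
    for node in manifest.get("metrics", {}).values():
        if node.get("name") == metric_name:
            # Key path varies by dbt version; check both known locations.
            return node.get("meta") or node.get("config", {}).get("meta", {})
    return {}

# Toy manifest fragment standing in for json.load(open("target/manifest.json")).
manifest = {
    "metrics": {
        "metric.shop.order_success_rate": {
            "name": "order_success_rate",
            "meta": {"context": {"owner": "fulfillment-ops"}},
        }
    }
}
print(metric_meta(manifest, "order_success_rate")["context"]["owner"])
```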

Backwards compatible

dbt ignores meta it doesn't understand. Adding context enrichment breaks nothing. Remove it anytime. Zero infrastructure changes required.

Open schema proposal

Several teams are exploring richer context layers — but we haven't seen a standardized, open schema for encoding investigation paths, causal dimensions, and decision rules. Here's a proposal.

The theory is established. The infrastructure exists. The missing piece was an opinionated schema — and the conviction to put business knowledge inside the semantic layer definition, not beside it.

11 The Progression

Each layer answers the question the previous layer raises

Layer 0 Compute → "What happened?" order_success_rate = 89.3%
Layer 1 Purpose → "What does this mean?" Measures end-to-end delivery completion
Layer 2 Expectations → "Is this normal?" Below critical threshold of 88% — not normal
Layer 3 Investigation → "Where should I look?" Channel first, then carrier, then region
Layer 4 What Else Moves → "What else is affected?" carrier_on_time_rate dropped — correlated
Layer 5 Decisions → "What should I do?" Escalate to logistics-ops, page on-call

12 File Architecture

Where the context lives

models/
  semantic/
    sem_order_fulfillment.yml    ← MetricFlow definitions + meta (layers 0-5)
    sem_payments.yml             ← Payment metrics with their own meta
    sem_shipping.yml             ← Carrier metrics with their own meta

docs/
  business_context/
    order_fulfillment_guide.md   ← Deep narrative (for humans)
    carrier_sla_reference.md     ← Source for expectations thresholds
  investigation_playbooks/
    fulfillment_drop.md          ← Source for investigation paths
    payment_failure_triage.md    ← Upstream investigation playbook
  event_registry/
    warehouse_changes.md         ← Source for affected_by events
    carrier_contracts.md         ← SLA change history
The boundary

The YAML meta is the machine-readable interface — structured, queryable, what agents consume. The docs folder is the human-readable source — narrative, detailed, what analysts read. They reference each other. The YAML points to the docs; the docs explain why the YAML says what it says.

13 The Agent Experience

Same question, different answers

Agent with Layer 0 only
Why did order success rate drop?
Order success rate is 89.3%, down from 94.1% last week.

...that's all I can tell you.
Agent with all 5 layers
Why did order success rate drop?
Order success rate dropped to 89.3% — that's below the warning threshold of 92%.

Following investigation path:
By channel: marketplace orders look normal (95.2%), but direct orders are at 84.1%
By carrier for direct: FastShip at 71.3% — well below their normal ~96%
Checking correlations: carrier_on_time_rate for FastShip dropped to 62% two days ago, which lines up
This looks like a FastShip service disruption affecting direct fulfillment. I'd recommend confirming via the carrier dashboard before escalating — could also be a data lag issue.
Note: if this holds for 3 days, business rules call for executive escalation (day 1 of 3).

14 Getting Started

You don't need all 5 layers to start

1
Add purpose and business_question
to your top 5 metrics. You already know these answers.
~1 hour
2
Add expectations
from your existing alert thresholds and dashboards.
~2 hours
3
Interview one analyst per metric
to capture their investigation_path — the steps they actually follow.
~30 min each
4
Document correlates_with
from your dashboards — which metrics move together?
~1 hour
5
Capture decisions
from your runbooks and on-call playbooks. This one evolves continuously.
ongoing

THE HARD PART

The hardest part isn't writing the initial context. It's keeping it current. The bet is that YAML in your dbt repo — versioned in git, reviewed in PRs, tested in CI — is a better home than Slack threads and runbooks that nobody updates either.
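One concrete lever for "tested in CI": a check that fails the build when a metric's meta is missing required Layer 1 keys. In a real pipeline you would load each YAML file (e.g. with PyYAML) and run this per metric; the required-key list and the `missing_context_keys` helper are assumptions to tailor to your repo:

```python
# Hypothetical CI check: required Layer 1 context keys per metric.
REQUIRED = ["purpose", "business_question", "owner"]

def missing_context_keys(metric: dict) -> list:
    """Return required context keys that are absent or empty for a metric."""
    ctx = metric.get("meta", {}).get("context", {})
    return [k for k in REQUIRED if not ctx.get(k)]

# A metric with a half-filled context block fails the check.
metric = {
    "name": "order_success_rate",
    "meta": {"context": {"purpose": "Measures delivery completion.",
                         "owner": "fulfillment-ops"}},
}
print(missing_context_keys(metric))  # ['business_question']
```

Wiring this into the same PR gate as your dbt tests is what makes the context a maintained artifact rather than documentation that drifts.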

15

The semantic layer made your metrics reliable, consistent, and governed. The knowledge layer makes them meaningful, investigable, and actionable.

Start with your top 5 metrics. Add purpose and business_question this week.
See what changes.

All additions use dbt's existing meta property — backwards compatible, progressive, optional.
No new tools. No new infrastructure. Just knowledge, where it belongs.