The Palantir Ontology: What They Built That We Haven't

A deep dive into Palantir's approach to enterprise semantics, and what dbt-agent might learn from it.

Deep Read Series · February 4, 2026

The Reputation Problem

Let's acknowledge it: Palantir has a checkered reputation. Defense contracts, surveillance implications, the general vibe of a company that's comfortable with power asymmetries. Peter Thiel. All of it.

But they also make a crap load of money. And their recent growth -- particularly around AI deployment -- suggests they've built something enterprises desperately want.

The question isn't whether we like them. It's whether they've solved problems we're still fighting.

What Palantir Actually Built

Palantir's Foundry platform centers on what they call "the Ontology." This isn't a semantic layer in the dbt/MetricFlow sense. It's more ambitious -- and more opinionated.

The Core Philosophy

From their docs: "The Ontology provides a well-defined system into which new information is modeled into a common language for the organization."

Traditional data lakes accumulate datasets and dashboards. Each project reinvents data integration. Context gets scattered across Confluence, Jira, Slack, and tribal knowledge.

Palantir's bet: invest upfront in modeling real-world entities and relationships, and everything else -- applications, AI, governance -- builds on that foundation.

The Building Blocks

Object Types: Schema definitions representing real-world entities. Not tables -- entities. A "Well" in an energy company. A "Customer" in retail. A "Flight" in logistics. Individual instances are objects; collections are object sets.

Properties: Characteristics that define what information can be stored. Like columns, but attached to business concepts rather than technical tables.

Link Types: Relationships between object types. Not just foreign keys -- semantic connections. A "Customer" places an "Order." An "Employee" manages a "Team." The link carries meaning.

Action Types: This is where it gets interesting. Actions define "a set of changes or edits to objects, property values, and links that a user can take at once." Side effects included -- notifications, webhooks, downstream triggers.

Functions: Code-based logic natively integrated with the ontology. Functions accept objects as inputs, read their properties, and can supply the logic behind action types. The computation layer is bound to business concepts.

Interfaces: Abstraction patterns enabling polymorphism. Different object types sharing common structures can be modeled and interacted with consistently.
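To make the building blocks concrete, here is a minimal Python sketch of object types, properties, and link types as plain data structures. This is not Palantir's API; the class names (ObjectType, LinkType, Ontology) and their fields are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LinkType:
    name: str         # semantic verb, e.g. "places" (hypothetical field names)
    source: str       # object type the link starts from
    target: str       # object type the link points to
    cardinality: str  # e.g. "one_to_many"


@dataclass
class ObjectType:
    name: str
    properties: dict[str, str]  # property name -> declared type
    actions: list[str] = field(default_factory=list)


@dataclass
class Ontology:
    object_types: dict[str, ObjectType] = field(default_factory=dict)
    link_types: list[LinkType] = field(default_factory=list)

    def add_object_type(self, object_type: ObjectType) -> None:
        self.object_types[object_type.name] = object_type

    def add_link_type(self, link: LinkType) -> None:
        # A link is only accepted if both endpoints are declared object types.
        for endpoint in (link.source, link.target):
            if endpoint not in self.object_types:
                raise KeyError(f"unknown object type: {endpoint}")
        self.link_types.append(link)

    def links_from(self, type_name: str) -> list[LinkType]:
        return [link for link in self.link_types if link.source == type_name]


ontology = Ontology()
ontology.add_object_type(
    ObjectType("Customer", {"name": "string", "segment": "string"},
               actions=["flag_churn_risk"])
)
ontology.add_object_type(ObjectType("Order", {"order_id": "string"}))
ontology.add_link_type(LinkType("places", "Customer", "Order", "one_to_many"))

print([link.name for link in ontology.links_from("Customer")])
```

The point of the sketch is the shape: entities, their properties, and their named relationships live in one registry that applications and agents can inspect, rather than being implied by table layouts.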

Why This Matters

Most semantic layers describe data. Palantir's ontology describes the business.

The difference is subtle but critical. A semantic layer might define "Customer Lifetime Value" as a metric calculated from revenue tables. Palantir's ontology would define "Customer" as an entity with properties (name, segment, status), links to other entities (Orders, Support Tickets, Account Manager), and actions that can be performed (Flag for Churn Risk, Assign to Campaign, Archive).

The metric exists, but it's embedded in a richer context.

The Four Problems They Solve

1. Connectivity at Scale

From the docs: the ontology functions as "a shared source of truth for decision-making and decision capture across a large organization."

This isn't just dashboards. Users can:

That last part matters. Traditional BI is read-only. Palantir's approach is read-write. Actions taken in applications update the ontology, creating feedback loops.

2. Interpretability for Non-Technical Users

The platform abstracts technical concepts (datasets, joins, code) into domain-specific language. "Decision makers are not technical users comfortable with code or IT concepts." They interact using "standard terms they use every day."

This is the "name vs. know" problem Feynman identified. Most data infrastructure speaks in technical names (tables, columns, models). Palantir speaks in business names (Customers, Orders, Campaigns).

3. Economies of Scale

Rather than duplicating integration work for each project, "entire applications and use cases" build on shared data assets. Developers focus on business problems rather than "data wrangling."

The reuse compounds. A "Customer" object type defined for one application is available for all applications. The ontology is infrastructure, not project-specific.

4. Decision Capture

Through configurable action types and writebacks, the ontology captures organizational decisions as data. "Organizations can learn from and improve their decision-making" while "insights captured by one user" inform others.

This addresses Jin's critique -- that decision rationale doesn't persist. In Palantir's system, decisions are first-class citizens, not comments in a Jira ticket.

The a16z Analysis: Palantirization

The Andreessen Horowitz piece on "Palantirization" offers a sobering counterpoint.

What Makes Palantir Work

Palantir achieves "category of one" status through simultaneous mastery of three things most companies can't combine:

  1. Integrated product platform: Built on reusable platform components (Gotham, Foundry, AIP, Ontology) rather than purely custom solutions
  2. Elite embedded engineering: Forward-deployed engineers (FDEs) comfortable with both production code and complex organizational navigation
  3. Mission-critical domains: Defense, intelligence, regulated sectors where stakes justify extensive customization

The FDE model is key. These engineers spend months embedded in customer organizations, building on the platform while navigating political terrain. It's not SaaS. It's not consulting. It's something in between.

Why Copycats Fail

a16z identifies structural constraints:

Problem severity misalignment: Palantir solves billion-dollar problems (counterterrorism, national security). Most vertical SaaS addresses 10-20% efficiency gains -- insufficient ROI for months of embedded engineering.

Customer concentration requirements: The model works with concentrated, high-ACV bases. Fragmented customer bases create unsustainable customization burdens.

Services trap risk: Without strong platform underpinnings, companies devolve into "Accenture for X with a nicer front-end."

What We Can Extract

So here's the question: can we learn from Palantir's ontology without needing their resources?

Pattern 1: Entities Over Tables

The shift from "what tables exist" to "what entities exist" changes how applications interact with data. Instead of JOIN logic scattered across queries, relationships are declared once and reused.

dbt-agent parallel: Canonical models registry. We already define core entities (Customer, Transaction, Product) with standardized structures. The gap is making those entities queryable as objects, not just tables.
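One way to picture "declared once and reused": a small Python sketch in which a link declaration carries the join keys, and the JOIN clause is generated from it. The table names (dim_customer, fct_order), the key columns, and the join_clause helper are all hypothetical; nothing here reflects dbt-agent's actual registry format.

```python
# Hypothetical mapping from entity names to physical tables.
TABLES = {"Customer": "dim_customer", "Order": "fct_order"}

# Each link is declared once: (source entity, link name) -> target and keys.
LINKS = {
    ("Customer", "places"): {
        "target": "Order",
        "source_key": "customer_id",
        "target_key": "customer_id",
    },
}


def join_clause(source: str, link_name: str) -> str:
    """Build a JOIN clause from a declared link instead of hand-written SQL."""
    link = LINKS[(source, link_name)]
    source_table = TABLES[source]
    target_table = TABLES[link["target"]]
    return (
        f"JOIN {target_table} "
        f"ON {source_table}.{link['source_key']} = {target_table}.{link['target_key']}"
    )


print(join_clause("Customer", "places"))
```

Every query that needs Customer-to-Order data would go through the same declaration, so a key change is made in one place.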

Pattern 2: Actions as First-Class Citizens

Palantir's action types capture business operations -- the doing, not just the describing. When a user performs an action, it's logged, governed, and feeds back into the system.

dbt-agent parallel: Decision traces. We already capture resolutions of QA issues with problem/resolution/triage. The gap is binding those decisions to specific entities and making them actionable.
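A rough sketch of what binding a decision to an entity could look like, in Python. The ActionLog class, the ActionRecord fields, and the flag_churn_risk method are assumptions made for illustration; they are not Palantir's action types or dbt-agent's decision-trace schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Customer:
    customer_id: str
    churn_risk: bool = False


@dataclass(frozen=True)
class ActionRecord:
    action: str
    object_type: str
    object_id: str
    actor: str
    rationale: str
    at: datetime


@dataclass
class ActionLog:
    records: list[ActionRecord] = field(default_factory=list)

    def flag_churn_risk(self, customer: Customer, actor: str, rationale: str) -> None:
        # The write: the entity itself changes.
        customer.churn_risk = True
        # The capture: who did it, to which entity, and why.
        self.records.append(
            ActionRecord(
                action="flag_churn_risk",
                object_type="Customer",
                object_id=customer.customer_id,
                actor=actor,
                rationale=rationale,
                at=datetime.now(timezone.utc),
            )
        )

    def history_for(self, object_type: str, object_id: str) -> list[ActionRecord]:
        return [
            r for r in self.records
            if r.object_type == object_type and r.object_id == object_id
        ]


log = ActionLog()
customer = Customer("c-001")
log.flag_churn_risk(customer, actor="analyst", rationale="No orders in two quarters")
print(log.history_for("Customer", "c-001")[0].rationale)
```

The rationale travels with the entity, so a later reader (human or agent) can ask "what was decided about this customer, and why" without digging through tickets.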

Pattern 3: Functions Bound to Concepts

Computation tied to business objects, not generic tables. A function that calculates churn risk operates on Customers, not on dim_customer_id columns.

dbt-agent parallel: Skills. A skill that generates a metric query operates on semantic layer concepts. The gap is tighter binding between skills and the ontology.
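As an illustration of computation bound to a concept, here is a Python sketch where a churn check takes a Customer object and walks its linked Orders. The 90-day threshold, the class fields, and the function names are assumptions for the example, not a real churn model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Order:
    order_id: str
    placed_on: date


@dataclass
class Customer:
    customer_id: str
    orders: list[Order] = field(default_factory=list)  # the "places" link


def days_since_last_order(customer: Customer, today: date) -> Optional[int]:
    """Return days since the most recent order, or None if there are no orders."""
    if not customer.orders:
        return None
    last = max(order.placed_on for order in customer.orders)
    return (today - last).days


def is_churn_risk(customer: Customer, today: date, threshold_days: int = 90) -> bool:
    # Assumption: a customer with no orders has nothing to churn from.
    days = days_since_last_order(customer, today)
    return days is not None and days > threshold_days


today = date(2026, 2, 4)
recent = Customer("c-001", [Order("o-1", date(2026, 1, 1))])
lapsed = Customer("c-002", [Order("o-2", date(2025, 10, 1))])
print(is_churn_risk(recent, today), is_churn_risk(lapsed, today))
```

The function signature mentions Customer and Order, never a table or column name. That is the binding the pattern describes.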

Pattern 4: Temporal Context

Palantir's system supports version history -- what rules existed when a trace was logged, what the ontology looked like at a point in time.

dbt-agent parallel: PKO Phase 5 (Temporal Reasoning). We built this. reconstruct_knowledge_state(timestamp) returns active rules and procedure versions for any point in time.
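The actual PKO implementation isn't shown in this post. As a sketch of the general idea, assuming each rule version carries a validity interval, a point-in-time reconstruction could look like the following. The RuleVersion fields and the two-argument signature are assumptions, not the real function.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RuleVersion:
    rule_id: str
    version: int
    valid_from: datetime
    valid_to: Optional[datetime]  # None means the version is still active


def reconstruct_knowledge_state(
    versions: list[RuleVersion], timestamp: datetime
) -> dict[str, RuleVersion]:
    """Return the rule version that was active for each rule at `timestamp`.

    Illustrative only: intervals are treated as half-open [valid_from, valid_to).
    """
    state: dict[str, RuleVersion] = {}
    for v in versions:
        started = v.valid_from <= timestamp
        not_ended = v.valid_to is None or timestamp < v.valid_to
        if started and not_ended:
            state[v.rule_id] = v
    return state


versions = [
    RuleVersion("naming", 1, datetime(2025, 1, 1), datetime(2025, 6, 1)),
    RuleVersion("naming", 2, datetime(2025, 6, 1), None),
    RuleVersion("testing", 1, datetime(2025, 9, 1), None),
]

state = reconstruct_knowledge_state(versions, datetime(2025, 3, 15))
print({rule_id: v.version for rule_id, v in state.items()})
```

Asking the same question for a later timestamp returns version 2 of the naming rule plus the testing rule, which is what lets a logged trace be read against the rules that applied when it was written.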

The Non-Sinister Palantir?

Keith's half-joke: if we productized dbt-agent, could we position it as a non-sinister Palantir?

The serious version: Palantir's moat isn't just the technology. It's the combination of:

What we could offer:

The positioning isn't "Palantir for analytics." It's "what Palantir figured out about enterprise semantics, made accessible."

Specific Takeaways for dbt-agent

1. Formalize Object Types

Move beyond "canonical models" to explicit object type definitions:

object_types:
  Customer:
    properties:
      - customer_id: primary_key
      - segment: string
      - lifetime_value: currency_amount
      - acquisition_channel: string
      - status: enum[active, churned, prospect]
    links:
      - places: Order (one_to_many)
      - has: SupportTicket (one_to_many)
      - managed_by: Employee (many_to_one)
    actions:
      - flag_churn_risk
      - assign_to_campaign

This isn't just documentation -- it's machine-readable context for agents.
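"Machine-readable" can be taken literally. The Python sketch below mirrors only the links section of the Customer definition as a dict, parses each "Target (cardinality)" string, and reports link targets that have no object type defined yet. The parse_link and undefined_targets helpers are hypothetical, and a real loader would read the YAML file rather than a hard-coded dict.

```python
import re

# The links section of the Customer definition, mirrored as a Python dict.
CUSTOMER_LINKS = {
    "places": "Order (one_to_many)",
    "has": "SupportTicket (one_to_many)",
    "managed_by": "Employee (many_to_one)",
}

# Matches strings such as "Order (one_to_many)".
LINK_SPEC = re.compile(r"^(\w+) \((\w+)\)$")


def parse_link(spec: str) -> tuple[str, str]:
    """Split 'Target (cardinality)' into (target, cardinality)."""
    match = LINK_SPEC.match(spec)
    if match is None:
        raise ValueError(f"malformed link spec: {spec!r}")
    return match.group(1), match.group(2)


def undefined_targets(links: dict[str, str], known_types: set[str]) -> list[str]:
    """List link targets that are not declared as object types, sorted by name."""
    targets = {parse_link(spec)[0] for spec in links.values()}
    return sorted(targets - known_types)


print(parse_link(CUSTOMER_LINKS["places"]))
print(undefined_targets(CUSTOMER_LINKS, {"Customer", "Order"}))
```

A check like this is the kind of thing documentation can't do: it fails loudly when the ontology refers to an entity nobody has defined.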

2. Make Links Explicit

Current approach: relationships are implicit in JOIN conditions.

Palantir approach: relationships are declared, named, and carry semantic meaning.

link_types:
  places:
    from: Customer
    to: Order
    cardinality: one_to_many
    meaning: "Customer places Order"

  contains:
    from: Order
    to: LineItem
    cardinality: one_to_many
    meaning: "Order contains LineItem"

Agents querying the ontology can navigate relationships by semantic meaning rather than foreign key guessing.
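To sketch what navigating by meaning might look like, the Python example below loads the two link types above into a dict and uses a breadth-first search to find the chain of named links between two object types. The find_path function and the dict layout are assumptions for illustration, not an existing dbt-agent feature.

```python
from collections import deque
from typing import Optional

# The two link types declared above, mirrored as a Python dict.
LINK_TYPES = {
    "places": {"from": "Customer", "to": "Order",
               "cardinality": "one_to_many", "meaning": "Customer places Order"},
    "contains": {"from": "Order", "to": "LineItem",
                 "cardinality": "one_to_many", "meaning": "Order contains LineItem"},
}


def find_path(start: str, goal: str) -> Optional[list[str]]:
    """Breadth-first search over declared links; returns link names or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for name, link in LINK_TYPES.items():
            if link["from"] == node and link["to"] not in seen:
                seen.add(link["to"])
                queue.append((link["to"], path + [name]))
    return None  # no chain of declared links connects the two types


path = find_path("Customer", "LineItem")
print(path)
print([LINK_TYPES[name]["meaning"] for name in path])
```

The search follows links only in their declared direction, so asking for a path from LineItem back to Customer returns None. Whether links should be traversable in reverse is a design choice the ontology would need to state explicitly.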

3. Capture Actions, Not Just Reads

What if dbt-agent tracked:

These become training data for improving agent decisions.

Gap Analysis

Palantir Capability | dbt-agent Status | Effort to Close
Object Types | Partial (canonical registry) | Medium
Properties | Partial (column standards) | Low
Link Types | Weak (implicit in SQL) | Medium
Action Types | Partial (decision traces) | Medium
Functions | Strong (36 skills) | Low
Interfaces | None | High
Temporal Versioning | Strong (PKO Phase 5) | Done
Object Views | Weak | Medium

Final Thoughts

Palantir figured out something important: enterprises need ontologies, not just semantic layers. The difference is scope -- describing the business vs. describing the data.

Their execution requires resources most organizations don't have. But the patterns are extractable. Object types, link types, action capture, temporal context -- these can be implemented without forward-deployed engineers.

dbt-agent is already partway there. The PKO work, the decision traces, the canonical registry -- these are pieces of an ontology. The gap is making them explicit, connected, and agent-queryable.

The positioning opportunity isn't competing with Palantir. It's offering what Palantir offers -- genuine semantic infrastructure -- at a scale and cost that doesn't require defense contracts to justify.