The Ontology Engine — entities, types, schema, views, edges

The ontology engine (repo ontology-core-v2, package src/app/) is Atelier's data plane: the durable answer to "what exists." It is a runtime schema-evolution engine on Postgres/PostGIS/TimescaleDB where entity types, fields, relationships, declarative links, and analytical views are all defined at request time over REST (mirrored on gRPC) — no application code, no migrations, no redeploy. Every type owns a dedicated physical table; every read flows through one cursor-paginated query surface that dispatches uniformly across types, views, and stored queries. It matters because it is the foundation the whole compound-software thesis stands on: any entity declared here can later be rendered in any widget (see The Surface Engine), driven by any action (see The Action Engine), and forked into any tenant by the BFF importer — all without touching the engine. The engine is deliberately plane-agnostic: it knows nothing of Atelier's vocabulary/template/tenant planes; those are a convention the BFF imposes on top of ordinary tenant-scoped rows.

Data model

| Name | Purpose | Plane | Key fields |
|---|---|---|---|
| entity_type | Declared TYPE; schema header owning a physical table. Single inheritance via parent_code; versioned; tenant_scoped immutable once set. | runtime | id, code (unique), display_name, parent_code, table_name (entities_{code}_{uuid8}), metadata jsonb, tenant_scoped, version, deprecated_at |
| entities_{code}_{uuid8} (row table) | Physical storage of one type's rows; geometry fields get separate PostGIS columns; rel members mirrored to shadow columns for unique constraints. | runtime | id, created_at, updated_at, tenant_id (if scoped), one column per field, shadow columns per rel |
| field_definition | A typed column with required/indexed flags, display_name, immutable tagged default_value. Versioned per type; inherited down the ancestor chain. | runtime | field_key, field_type (json), column_name, required, indexed, introduced_at, deprecated_at, default_value |
| relationship_definition | A real edge between two types with mandatory reverse_code and cardinality. | runtime | rel_code, target_type_code, reverse_code, cardinality, version |
| entity_relationships | The single shared edge table holding all relationship instances; traversed via nested EXISTS. | runtime | source_id, source_type_code, rel_code, target_id, target_type_code |
| link_type_definition | An FK-less link joining source_field_key to target_field_key; a virtual relationship for __ traversal. | runtime | link_code, source_field_key, target_type_code, target_field_key, reverse_code, cardinality |
| view | Stored analytical query over one source type; ACL creator-baked into a frozen compiled_sql; read via GET /api/v1/{code}. | runtime | code, source_type_code, kind (plain only), definition jsonb, compiled_sql, creator_principal_*, is_deprecated |
| stored_query | Same definition + read shape as a view but hard-delete, no kind/deprecation. | runtime | code, source_type_code, definition, creator_principal_*, last_predicate_rebuild_at |
| tenant | Isolation unit; a SystemType. code slug is carried in X-Tenant-Id, the ACL tenant_id, and every scoped-table FK target. | vocabulary | id, code (unique slug), name |
| permission_type / entity_type_actions | Grantable ACL targets in UMS; per-type non-CRUD extra actions (publish/approve). CRUD actions cannot be deprecated. | vocabulary / runtime | code, actions[]; action_key, version, is_deprecated |

At read time a per-(code,version) ResolvedSchema is assembled and cached (src/app/entity/resolved_schema.py, for_type): it merges inherited fields, forward+reverse relationships and links resolved to real columns, time-series config, unique constraints, and non-CRUD actions, exposing field_map/rel_map/link_map/hop_map (= rel_map ⊕ link_map). It carries no caller-specific data, so it is safe to cache (ADR-016). The field type system (src/app/core/field_types.py) is a closed set: string→TEXT, integer, number→DOUBLE, boolean, datetime→TIMESTAMPTZ, date→DATE (rejects datetime-shaped input), uuid, slug, json→JSONB, enum→TEXT with append-only choices, geometry (requires geometry_type+srid, default 4326), file→JSONB. Each class declares sql_type, supported_operators, a ums_field_type mapping (so UMS knows the field shape for ACL templating), index DDL, and a linkable flag.
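
The closed field-type set can be pictured with a minimal sketch. It is illustrative only: the layout below is hypothetical and is not the contents of field_types.py. Only the SQL mappings stated above are listed, and the append-only check assumes "append-only" means every existing choice must survive an update.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# Only the mappings spelled out in the text; other types are omitted
# because their SQL types are not stated. Names are as written in the text.
SQL_TYPES = {
    "string": "TEXT",
    "number": "DOUBLE",
    "datetime": "TIMESTAMPTZ",
    "date": "DATE",
    "json": "JSONB",
    "enum": "TEXT",
    "file": "JSONB",
}

_DATE_ONLY = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_date(value: str) -> date:
    """Accept YYYY-MM-DD only, so datetime-shaped input is rejected."""
    if not _DATE_ONLY.fullmatch(value):
        raise ValueError(f"not a plain date: {value!r}")
    return date.fromisoformat(value)


@dataclass(frozen=True)
class EnumChoices:
    """Hypothetical holder for an enum field's choices."""
    choices: Tuple[str, ...]

    def evolve(self, new_choices: Tuple[str, ...]) -> "EnumChoices":
        # Assumed reading of append-only: nothing already declared may vanish.
        missing = [c for c in self.choices if c not in new_choices]
        if missing:
            raise ValueError(f"enum choices are append-only, missing {missing}")
        return EnumChoices(tuple(new_choices))
```

For example, parse_date("2024-05-01") returns a date, while parse_date("2024-05-01T10:00:00") raises ValueError.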

How it's declared

Everything is authored over REST under /api/v1: types via POST /api/v1/schema/types (fields, relationships, links, unique constraints, extra_actions, time_series_config, metadata) and evolved via POST /api/v1/schema/types/{code}/versions (an async job polled at .../versions/jobs/{job_id}); tenants via POST /api/v1/tenants; permission types via POST /api/v1/schema/permission-types; views via POST /api/v1/views; stored queries via POST /api/v1/stored-queries. There is no YAML authoring inside the engine: verticals.yml/ontology.yml at the ONTO root are dead example schemas (confirmed: no .py/.yml/.toml/Makefile references them). The live authoring surface is the BFF's provisioning sheets (admin-api/provisioning/sheets/), which import_sheets.py compiles and POSTs to exactly these endpoints (see The Provisioning & Fork System). The read/consume surface is one GET /api/v1/{code} for types, views, and stored queries alike, with field__op= filtering, order_by, expand, spatial (within/near), time-bucketing, and cursor ?after=.
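
As a rough illustration of the declare-then-read flow, the sketch below builds (but does not send) a type-creation request and a read URL. The endpoint paths, the field__op filter form, order_by and after come from the text. The host, the bearer-token header, the example type "parcel", the gte operator name and the exact payload shape (for instance {"type": "string"} for field_type) are assumptions made for illustration.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host


def build_create_type_request(token: str) -> Request:
    """Build a POST /api/v1/schema/types request; payload shape is assumed."""
    payload = {
        "code": "parcel",
        "display_name": "Parcel",
        "fields": [
            {"field_key": "name", "field_type": {"type": "string"}, "required": True},
            {"field_key": "area_m2", "field_type": {"type": "number"}, "indexed": True},
        ],
        "relationships": [],
        "links": [],
        "extra_actions": ["publish"],
        "metadata": {},
    }
    return Request(
        f"{BASE_URL}/api/v1/schema/types",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth header
        },
        method="POST",
    )


def build_read_url(code: str, filters: dict,
                   order_by: Optional[str] = None,
                   after: Optional[str] = None) -> str:
    """Build GET /api/v1/{code} with field__op filters and a keyset cursor."""
    params = dict(filters)
    if order_by:
        params["order_by"] = order_by
    if after:
        params["after"] = after
    query = urlencode(params)
    return f"{BASE_URL}/api/v1/{code}" + (f"?{query}" if query else "")
```

The same build_read_url call shape would apply to a view or stored-query code, since all three share the one read surface.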

How it's provisioned

The engine never seeds schema. src/app/bootstrap.py (build_acl_client/register_with_ums) only wires the DB pool, JWT verifier, and UMS client, and registers the ontology's native metadata with UMS. All vertical schema is created by the BFF's import_sheets.py POSTing types/rels/links/views/tenants to /api/v1. A tenant gets its data plane populated by the BFF's full-catalog fork, which copies rows under a new tenant code and re-POSTs/rewires FKs — to the engine these are ordinary tenant-scoped writes. Because scoped tables FK to tenants(code), a tenant must exist (POST /api/v1/tenants) before any scoped row references it.
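
The tenant-before-row ordering can be checked on the client side before a write is attempted. The helper below is a hypothetical preflight, not part of the engine or the BFF. It assumes the caller tracks which tenant codes it has already created, and it also applies the header and body rules listed under Invariants.

```python
from typing import List, Optional, Set


def preflight_scoped_create(known_tenants: Set[str],
                            tenant_header: Optional[str],
                            body: dict) -> List[str]:
    """Return the reasons a scoped create would be expected to fail."""
    problems = []
    if tenant_header is None:
        problems.append("missing X-Tenant-Id header")
    elif tenant_header not in known_tenants:
        # Scoped tables FK to tenants(code), so the tenant must exist first.
        problems.append(f"tenant {tenant_header!r} not created yet")
    if "tenant_id" in body:
        # The engine takes the tenant from the header, never from the body.
        problems.append("tenant_id must not appear in the body")
    return problems
```

An empty list means the write satisfies these ordering rules; it says nothing about field validation or ACL checks.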

Extension points

  • New type: POST /api/v1/schema/types — a physical entities_{code}_{uuid8} table is created at request time (src/app/registry/entity_type_repo.py, create).
  • Evolve without breaking consumers: POST .../versions adds/deprecates fields/rels/links/actions; old versions stay queryable via ?version=N.
  • New analytics/KPI: POST /api/v1/views (aggregate/group_by/filter/join) — the only sanctioned aggregate primitive; expose to citizens by declaring a public_entity_surface for the VIEW code (BFF-side strict reader, see The Surface Engine).
  • New edge: a real relationship (entity_relationships) or an FK-less declarative link; both become __-traversable hops and join: targets in views.
  • New field type: subclass FieldType in field_types.py declaring sql_type, supported_operators, ums_field_type, index DDL — the closed set is itself the extension point.
  • New per-type action / new tenant: extra_actions (grantable in UMS) / POST /api/v1/tenants.
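
Since schema evolution through POST .../versions runs as an async job, a caller needs some form of polling loop. The sketch below is one way to write it. The "status" key and its "succeeded"/"failed" values are assumptions, because the job payload is not described here, and fetch_job stands in for whatever performs GET .../versions/jobs/{job_id}.

```python
import time
from typing import Callable

TERMINAL_STATES = {"succeeded", "failed"}  # hypothetical status values


def wait_for_version_job(fetch_job: Callable[[str], dict],
                         job_id: str,
                         interval_s: float = 1.0,
                         max_polls: int = 60,
                         sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a version job until it reaches a terminal state or polls run out."""
    for _ in range(max_polls):
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

Injecting fetch_job and sleep keeps the loop free of HTTP details and lets it be exercised without a running engine.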

Invariants

  • One physical table per type; geometry fields get separate PostGIS columns (SRID 4326).
  • tenant_scoped is immutable once set; scoped creates require X-Tenant-Id and reject tenant_id in the body (422). tenant_id is implicitly prepended to unique constraints, so two tenants can share a natural key.
  • Pagination is keyset/cursor only: ?after=<cursor>, never ?offset= (422); responses are {items, next_cursor, has_next, count} (src/app/query/pagination.py, KeysetPredicate).
  • A view's data scope is creator-baked at create time — the creator's read-grants plus every JOINed (and intermediate-hop) type's ACL compiled into compiled_sql (src/app/views/service.py); X-Tenant-Id never narrows a view; view/stored-query mutations are 405.
  • default_value is immutable and fires only on create-with-omission; enum choices are append-only; reverse_code is mandatory (pass null to opt out, omitting is 400).
  • No existence leakage: an inaccessible/unknown code returns 404 / empty set, identical to a true miss; system types 404 under the generic entity path (resolve_queryable, src/app/entity/code_resolver.py).
  • Within the engine ums_enabled=false grants full access (dev escape hatch); UMS-unreachable fail-closed is the BFF's concern.
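
The keyset-only pagination rule translates into a simple consumer loop. This sketch relies only on the documented response shape {items, next_cursor, has_next, count}. The fetch_page callable is a hypothetical stand-in for a GET /api/v1/{code}?after=<cursor> call.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item by following next_cursor until has_next is false."""
    cursor = None  # the first page is requested without ?after=
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        if not page["has_next"]:
            return
        cursor = page["next_cursor"]
```

Because the cursor is opaque, the loop never computes an offset; it only passes back what the previous response returned.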

Gaps & open edges

  • tenant_scoped default divergence (confirmed at the repo layer): EntityTypeRepo.create defaults tenant_scoped=False (entity_type_repo.py:28), while docs/API_REFERENCE.md:52 states "default true at create." Direct repo callers (tests, scripts) silently get a global type. Whether the API request schema re-defaults to true was not traced, so the behavior of the REST path is unconfirmed.
  • Stated-but-unbuilt: view kinds materialized and continuous_aggregation are documented enums but only plain is supported; both forbid join.
  • No plane semantics in the engine: vocabulary/template/tenant planes and public_entity_surface live entirely in the BFF — no public_reader/strict_public symbols exist in src/app/. Any claim that the engine enforces plane or public-surface semantics is wrong.
  • Creator-baked ACL footgun: every reader sees the creator's scope regardless of their own grants — correct for tenant-wide KPIs, dangerous if an over-privileged principal authors a view later read by a narrower one (no per-reader re-scoping).
  • Perf: reverse rels/links targeting a SystemType bypass the schema cache and do two resolve_reverse round-trips per call (resolved_schema.py:176-183) — an accepted v1 cost.
  • Surprising aggregate semantics: first/last top-level aggregates are unsupported over sub-aggregate outputs, and outer aggregates over LATERAL outputs treat no-match source rows as SQL NULL (silently ignored) — count-like KPIs can under-count.
  • Dead files: verticals.yml/ontology.yml read like an authoring surface but are unreferenced legacy artifacts.
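
The under-count described in the aggregate-semantics bullet can be reproduced in miniature. The sketch below uses SQLite with a correlated subquery as a stand-in for a Postgres LATERAL sub-aggregate. The parent/child tables are hypothetical, and the engine's compiled SQL may look quite different.

```python
import sqlite3


def null_undercount_demo():
    """Show an outer aggregate skipping source rows whose sub-aggregate is NULL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY);
        CREATE TABLE child (parent_id INTEGER, amount INTEGER);
        INSERT INTO parent (id) VALUES (1), (2), (3);
        INSERT INTO child (parent_id, amount) VALUES (1, 10), (1, 5), (2, 7);
    """)
    # Parent 3 has no children, so its per-row SUM is NULL rather than 0.
    row = conn.execute("""
        SELECT COUNT(s), COUNT(*), AVG(s), AVG(COALESCE(s, 0))
        FROM (
            SELECT (SELECT SUM(c.amount)
                    FROM child c
                    WHERE c.parent_id = p.id) AS s
            FROM parent p
        )
    """).fetchone()
    conn.close()
    return row
```

Here COUNT(s) sees two rows while COUNT(*) sees three, and AVG(s) is 11.0 because the childless parent is ignored rather than counted as zero. Wrapping the sub-aggregate in COALESCE is one way to make the no-match rows count, if that is the intended KPI.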

Atelier — Platform Specification. Internal canonical reference.