The cascade. Tier by tier.
Pick a tier in the rail. See what it does, the joule range it spans, the model it runs, and the receipt it signs. The whole cascade ships behind one API.
Query intake + canonicalization
Receives the query, normalizes whitespace, applies the user's policy bundle, attaches the caller's joule budget. Doesn't think — just shapes the work.
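The intake step can be sketched in a few lines. This is a minimal illustration, not the shipped implementation: `CanonicalQuery`, `canonicalize`, and the field names are hypothetical, and the policy bundle is modeled as a plain tuple of labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalQuery:
    """Hypothetical shape of a query after intake (names are assumptions)."""
    text: str
    policy: tuple[str, ...]
    joule_budget: float


def canonicalize(raw: str, policy: list[str], joule_budget: float) -> CanonicalQuery:
    # Collapse every run of whitespace (spaces, tabs, newlines) to one space
    # and trim the ends, so equivalent queries produce identical text.
    text = " ".join(raw.split())
    # Attach the policy bundle and the caller's budget; no reasoning happens here.
    return CanonicalQuery(text=text, policy=tuple(policy), joule_budget=joule_budget)


query = canonicalize("  homes   within\t30 minutes\nof the lab ", ["no-pii"], 5.0)
print(query.text)
```

Normalizing before anything else matters for the next tier: two queries that differ only in spacing should produce the same fingerprint.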
Anatomy — operational specs
Cache lookup by query fingerprint
Hashes the canonical query with BLAKE3, then checks a vector-indexed cache. Hit → returns the prior answer + provenance, skips upstream tiers. Miss → forwards. Hit rate above 35% in production for repeated query classes.
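A rough sketch of fingerprint lookup, under stated assumptions: the standard library has no BLAKE3, so `hashlib.blake2b` stands in for it here, and a plain dictionary replaces the vector index. `FingerprintCache` and its methods are hypothetical names.

```python
import hashlib


def fingerprint(canonical_query: str) -> str:
    # Stand-in hash: BLAKE2b from the standard library. The tier described
    # above uses BLAKE3, which would need a third-party package.
    return hashlib.blake2b(canonical_query.encode("utf-8"), digest_size=32).hexdigest()


class FingerprintCache:
    """Exact-match cache keyed by query fingerprint (vector index not modeled)."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, canonical_query: str):
        entry = self._store.get(fingerprint(canonical_query))
        if entry is None:
            self.misses += 1  # miss: the caller forwards the query onward
        else:
            self.hits += 1    # hit: prior answer and provenance are reused
        return entry

    def store(self, canonical_query: str, answer, provenance) -> None:
        self._store[fingerprint(canonical_query)] = {
            "answer": answer,
            "provenance": provenance,
        }


cache = FingerprintCache()
first = cache.lookup("homes within 30 minutes of the lab")
cache.store("homes within 30 minutes of the lab", ["listing-a"], {"tier": "deterministic"})
second = cache.lookup("homes within 30 minutes of the lab")
```

The hit and miss counters are the kind of figures a hit-rate statistic would be computed from.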
Perception — spatial / visual / audio grounding
Custom non-transformer architecture for perception tasks. Grounds spatial intent ("30 minutes to the lab"), parses visual queries, embeds audio. The most expensive tier when called — typically skipped on cache hits.
Language — entity linking, constraint expansion
Parses the constraint set, runs entity linking against the InformationOS graph, expands implicit constraints (e.g., "solar-ready" → roof orientation + HOA tier + interconnect status). Output: a fully-explicit constraint vector.
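Constraint expansion can be pictured as a lookup that replaces shorthand with the explicit checks it implies. A minimal sketch: the table `IMPLICIT_EXPANSIONS`, the function name, and the snake_case field names are assumptions built from the "solar-ready" example above; entity linking against the graph is not modeled.

```python
# Hypothetical expansion table; only the "solar-ready" entry comes from the text.
IMPLICIT_EXPANSIONS = {
    "solar-ready": ("roof_orientation", "hoa_tier", "interconnect_status"),
}


def expand_constraints(constraints: list[str]) -> list[str]:
    """Replace each implicit constraint with its explicit parts, keeping order."""
    explicit: list[str] = []
    for constraint in constraints:
        # Constraints with no table entry are already explicit and pass through.
        for name in IMPLICIT_EXPANSIONS.get(constraint, (constraint,)):
            if name not in explicit:  # avoid duplicates when expansions overlap
                explicit.append(name)
    return explicit


vector = expand_constraints(["solar-ready", "max_price"])
print(vector)
```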
Deterministic constraint solver
Pure-function reasoning. Takes the explicit constraint vector, queries the current state, returns the solution set. No randomness, no temperature. Cheap by design — and always re-run because cached deterministic answers expire as the world changes.
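One way to read "pure-function reasoning" is a filter with no hidden state: the same constraints and the same world state always give the same solution set. The sketch below assumes constraints are predicates over records; `solve`, the record fields, and the sample data are all hypothetical.

```python
from typing import Callable, Mapping, Sequence

Predicate = Callable[[Mapping[str, object]], bool]


def solve(constraints: Sequence[Predicate],
          state: Sequence[Mapping[str, object]]) -> list[Mapping[str, object]]:
    """Return every record that satisfies all constraints.

    No randomness and no temperature: the output depends only on the inputs.
    """
    return [record for record in state if all(check(record) for check in constraints)]


# Illustrative world state; re-running against fresh state is what keeps
# answers current as the world changes.
state = [
    {"id": "a", "commute_min": 25, "roof_orientation": "south"},
    {"id": "b", "commute_min": 40, "roof_orientation": "south"},
    {"id": "c", "commute_min": 20, "roof_orientation": "north"},
]
constraints = [
    lambda r: r["commute_min"] <= 30,
    lambda r: r["roof_orientation"] == "south",
]
solution = solve(constraints, state)
```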
Merge tier outputs into ranked result
Joins memoized perception + memoized language + fresh deterministic results. Retains provenance: each output field points back at the tier that produced it. The result is a typed object, not free text.
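Per-field provenance can be sketched as a merge that wraps every value with the name of the tier that produced it. `TracedField` and `merge` are hypothetical names, ranking is left out, and raising on conflicting fields is an assumption about how collisions would be handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedField:
    """A value plus the tier that produced it."""
    value: object
    tier: str


def merge(tier_outputs: dict[str, dict[str, object]]) -> dict[str, TracedField]:
    merged: dict[str, TracedField] = {}
    for tier, fields in tier_outputs.items():
        for name, value in fields.items():
            if name in merged:
                # Assumption: two tiers writing the same field is an error.
                raise ValueError(f"field {name!r} produced by both "
                                 f"{merged[name].tier} and {tier}")
            merged[name] = TracedField(value=value, tier=tier)
    return merged


result = merge({
    "perception": {"travel_radius_min": 30},
    "language": {"constraints": ["roof_orientation", "hoa_tier"]},
    "deterministic": {"matches": ["listing-a"]},
})
print(result["matches"].tier)
```

Because the result is a typed object rather than free text, a consumer can ask which tier any field came from without parsing prose.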
Sign + emit the JWP ReceiptPayload
Ed25519 signature over the canonical JSON: total joules per tier, memoization stats, datacenter, model versions, and the result hash. The signed receipt IS the audit trail. Compliance, billing, and Insights consume it directly.
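The signing step depends on every party serializing the payload to the same bytes. A sketch under loud assumptions: the canonicalization rule here (sorted keys, compact separators, UTF-8) is a guess, the payload field names are hypothetical, and `demo_sign` is an HMAC-SHA256 stand-in used only so the example runs on the standard library. It is not Ed25519.

```python
import hashlib
import hmac
import json
from typing import Callable


def canonical_json(payload: dict) -> bytes:
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def build_receipt(payload: dict, sign: Callable[[bytes], bytes]) -> dict:
    """Sign the canonical bytes with whatever signer the caller supplies."""
    return {"payload": payload, "signature": sign(canonical_json(payload)).hex()}


def demo_sign(message: bytes) -> bytes:
    # STAND-IN ONLY: HMAC-SHA256 with a fixed demo key, not an Ed25519 signature.
    return hmac.new(b"demo-key", message, hashlib.sha256).digest()


payload = {  # hypothetical field names and values
    "joules_per_tier": {"cache": 0.25, "deterministic": 0.5},
    "memoize": {"hits": 1, "misses": 1},
    "datacenter": "example-dc",
    "model_versions": {"language": "v0"},
    "result_hash": hashlib.sha256(b"result").hexdigest(),
}
receipt = build_receipt(payload, demo_sign)
```

With the third-party `cryptography` package installed, the `sign` method of an `Ed25519PrivateKey` could be passed as the signer in place of `demo_sign`.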
Why a cascade and not a model
One model is one shape. A cascade is a budget.
The cheapest tier that solves the query wins. Memoization eats repeat work. Math-Ground does the deterministic reasoning no model can be trusted with. The bill is dominated by deterministic compute, not generation — and the receipt proves it.
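"The cheapest tier that solves the query wins" can be sketched as a loop that tries tiers in order of cost and stops at the first answer or when the budget runs out. Everything here is illustrative: `run_cascade`, the tier names, and the joule costs are assumptions, and sorting by cost simplifies what is described above as a fixed tier order.

```python
from typing import Callable, Optional

# (name, joule cost per attempt, attempt function) -- all values hypothetical
Tier = tuple[str, float, Callable[[str], Optional[str]]]


def run_cascade(query: str, tiers: list[Tier], joule_budget: float) -> dict:
    spent = 0.0
    for name, cost, attempt in sorted(tiers, key=lambda tier: tier[1]):
        if spent + cost > joule_budget:
            break  # the next tier would exceed the caller's budget
        spent += cost
        answer = attempt(query)
        if answer is not None:
            # First tier to produce an answer wins; pricier tiers never run.
            return {"answer": answer, "tier": name, "joules": spent}
    return {"answer": None, "tier": None, "joules": spent}


tiers: list[Tier] = [
    ("model", 8.0, lambda q: "generated"),
    ("cache", 0.25, lambda q: None),
    ("solver", 0.5, lambda q: "solved" if "commute" in q else None),
]
cheap = run_cascade("commute under 30", tiers, joule_budget=10.0)
capped = run_cascade("hello", tiers, joule_budget=1.0)
```

In the first call the solver answers after the cache misses, so the expensive model tier is never billed. In the second, the budget stops the cascade before the model tier runs.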