Agent OS — where we are and where we're going.
Three architectures, drawn at the same fidelity. Today shows the live system from the #0 demo. Architecture A fills the capability gaps at the same scale tier. Architecture B is the cloud-native, multi-tenant target. Same shape across all three; only the adapters swap.
Today's architecture — structured pipes, the rest is people.
The system that ships today: 1.5 of 8 agents live, the Memory Layer in production, the Domain Event Store deploying this week. Unstructured input still becomes a human problem.
What's missing today — and what fills each gap.
Unstructured input
Email · PDF · Slack · EDI · scanned docs — today these become a human problem. CSRs retype them into forms.
Fix: Doc Understanding (Bedrock multimodal) + Intake Agent.
Memory scattered
What "we already know" about a customer lives in spreadsheets, Slack threads, and ticket comments. No agent can use it.
Fix: Memory Hub — KG · Context Graph · Vector · Episodic · Domain Event Store.
Agents partial
1.5 of 8 specialists shipped. Cash App + HITL run end-to-end; Order, Usage, RevRec, Reconciliation, Copilot, and Intake are spec only.
Fix: Build the rest. Architecture A delivers all 8.
Audit thin
Application logs in Splunk; no cryptographic proof chain. Regulators want stronger.
Fix: Merkle-on-Postgres in A; QLDB + WORM in B.
Architecture A — fill the capability gaps, same scale tier.
All 8 specialists live. Java Tools Layer ships as 11 sidecars. Connectors + Integration Engine reduce onboarding from 6 months to 6 weeks. Same Postgres, same pgvector — no scale-tier swaps.
What A solves
- Full agent fleet — Intake · Order · Remittance · Usage · RevRec · Reconciliation · HITL · Copilot
- Java Tools sidecars: 11 deployments, MCP + REST per RecVue microservice
- Connectors + Integration Engine: customer onboarding ~6 weeks
- Domain Event Store productionized: translators · projections · triggers · audit · analytics
- Audit chain: Merkle-tree-on-Postgres tier (regulator-acceptable)
What A doesn't solve
- Multi-tenant runtime — still per-tenant deployment
- Geographic scale — single-region only
- Compliance ceiling beyond Postgres + Merkle audit
- RecVue's own platform remediation (their problem, but limits A's safe integration surface)
How A becomes B — port-by-port adapter swaps.
Each module owns one or more ports (FCIS interfaces). Architectures A and B differ only in which adapter is plugged into each port. Ports are invariant; adapters swap. No big-bang.
| Port | Architecture A adapter | Architecture B adapter |
|---|---|---|
| GraphStore · KG | DrizzlePostgresGraphStore | Neo4jGraphStore |
| EmbeddingProvider · Vector | BedrockTitan + pgvector | BedrockTitan + Weaviate |
| EventStream · Mediation | PgBossEventStream | KafkaConfluentEventStream |
| EventStore · Domain Events | PostgresEventStore | EventStoreDB / partitioned |
| Cache · Memory accel | (none today) | RedisAdapter |
| AuditLedger · Trust | MerkleTreePostgresLedger | QLDBLedger |
| MultiTenancy | TenantHeader + AsyncLocalStorage | OktaJwtClaim + RLS |
| LakehouseTier · Analytics | (warm-tier in Postgres) | IcebergLakehouseAdapter |
A team that completes Architecture A is a team that has built every Architecture B port; it just hasn't swapped the adapter yet.
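Below is a minimal TypeScript sketch of one port and its two adapters, using the GraphStore row from the table; the method signatures and class internals are assumptions for illustration, not the actual FCIS contracts.

```ts
// Illustrative only: the port name follows the table above; the method
// signatures and class internals are assumptions, not the real FCIS contracts.
type Entity = { id: string; type: string; props: Record<string, unknown> };

// A port is a plain interface the functional core depends on.
interface GraphStore {
  upsertEntity(tenantId: string, entity: Entity): Promise<void>;
  neighbors(tenantId: string, entityId: string, depth: number): Promise<string[]>;
}

// Architecture A adapter: Knowledge Graph tables in Postgres via Drizzle.
class DrizzlePostgresGraphStore implements GraphStore {
  async upsertEntity(tenantId: string, entity: Entity): Promise<void> {
    // INSERT ... ON CONFLICT into node/edge tables (omitted)
  }
  async neighbors(tenantId: string, entityId: string, depth: number): Promise<string[]> {
    return []; // recursive CTE over the edge table (omitted)
  }
}

// Architecture B adapter: same port, Neo4j behind it.
class Neo4jGraphStore implements GraphStore {
  async upsertEntity(tenantId: string, entity: Entity): Promise<void> {
    // MERGE (n {id: $id}) ... (omitted)
  }
  async neighbors(tenantId: string, entityId: string, depth: number): Promise<string[]> {
    return []; // MATCH (n)-[*1..d]->(m) ... (omitted)
  }
}

// A -> B is a wiring change at the composition root; the core never notices.
const graphStore: GraphStore =
  process.env.ARCH === "B" ? new Neo4jGraphStore() : new DrizzlePostgresGraphStore();
```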
Architecture B — composable platform, streaming core.
Same modules. Different adapters. K8s domain services. Confluent Kafka backbone. Neo4j for graphs. Weaviate for vectors. QLDB for audit. One platform serves every customer.
What B unlocks
- Multi-tenant runtime: one platform serves all customers; RLS + JWT claims for isolation
- Geographic scale: multi-region active-active for North America + EU
- Compliance: QLDB-backed cryptographic proof chain meets SEC 17a-4
- Cost elasticity: K8s autoscale + managed services scale per-tenant
- Independent module velocity: domain services on K8s ship independently
What B costs
- Vendor lock-in: Weaviate · Neo4j · Confluent · QLDB are commercial managed services
- Operational complexity: 10+ managed services to monitor
- Migration cost: every adapter rewritten (FCIS core stays put)
- Requires RecVue's own v4 program for multi-tenant integration to be meaningful
Memory Hub — five stores, one mental model.
What the system remembers. Five stores, each with a distinct role; together they form the substrate every agent reasons over.
Knowledge Graph
Domain ontology + entities. The "structured truth" agents reason over.
Writes: KG editor · ingest pipelines
Reads: retrieve_kg_context tool
Context Graph
In-flight workflow state via LangGraph checkpoints. Powers HITL resume.
Writes: StreamAdapter · checkpointer
Reads: HITL Cockpit · Reasoning Graph
Vector & Semantic
Embeddings + similarity search. Backs slice retrieval and copilot grounding.
Writes: Embedder cron · ingest
Reads: agents · CS Copilot
Episodic Memory
Last-N cases per customer/agent. Learned patterns. CSR override history.
Writes: HITL overrides · agent runs
Reads: agent confidence calibration
Domain Event Store
Translators + projections + triggers + audit log. Backbone of "everything that happened."
Writes: StreamAdapter translators
Reads: projections · triggers · audit
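As one mental model, the five stores can sit behind a single seam. The sketch below assumes TypeScript ports with hypothetical method names; only the five store roles and the example event types come from this page.

```ts
// Hypothetical shape of the Memory Hub seam: five ports, one facade.
// Method names are illustrative; only the five store roles come from the text.
interface DomainEvent {
  type: string;            // e.g. "payment.received"
  tenantId: string;
  correlationId: string;
  causationId?: string;
  payload: unknown;
  occurredAt: string;
}

interface KnowledgeGraph   { retrieveKgContext(tenantId: string, entityIds: string[]): Promise<unknown>; }
interface ContextGraph     { loadCheckpoint(runId: string): Promise<unknown>; saveCheckpoint(runId: string, state: unknown): Promise<void>; }
interface VectorStore      { similar(tenantId: string, query: string, k: number): Promise<string[]>; }
interface EpisodicMemory   { lastN(tenantId: string, agentId: string, n: number): Promise<unknown[]>; recordOverride(caseId: string, correction: unknown): Promise<void>; }
interface DomainEventStore { append(event: DomainEvent): Promise<void>; byCorrelation(correlationId: string): Promise<DomainEvent[]>; }

// Agents reason over the hub, never over a concrete store.
interface MemoryHub {
  kg: KnowledgeGraph;
  context: ContextGraph;
  vector: VectorStore;
  episodic: EpisodicMemory;
  events: DomainEventStore;
}
```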
Event-driven backbone — every meaningful occurrence is an event.
The Domain Event Store sits between Mediation and Memory. It translates raw stream events into typed domain events, persists them, projects into read models, fires triggers — and gives every action a place to land.
Raw stream
StreamAdapter triple-output: DB · EventEmitter · OTel
DB + EventEmitter + OTel
Translator
Per-source translators normalize raw events into domain events
orderTranslator · paymentTranslator · usageTranslator
Domain event
Persisted to event store; ordered by tenant + correlation
payment.received · order.placed · usage.metered
Projection / Trigger
Read models update; agent triggers fire; audit chain extends
PgBoss workers · trigger registry
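A small sketch of the translator and projection hops, assuming a payments source; the raw payload shape, SQL, and read-model table are invented for illustration, while the translator and event names follow the pipeline above.

```ts
// Illustrative translator: raw stream payload -> typed domain event.
// The raw payload shape and the projection table are assumptions.
type DomainEvent = {
  type: "payment.received" | "order.placed" | "usage.metered";
  tenantId: string;
  correlationId: string;
  occurredAt: string;
  payload: Record<string, unknown>;
};

// Per-source translator (the paymentTranslator in the pipeline above).
function paymentTranslator(raw: { tenant: string; corr: string; ts: string; amount: number; ref: string }): DomainEvent {
  return {
    type: "payment.received",
    tenantId: raw.tenant,
    correlationId: raw.corr,
    occurredAt: raw.ts,
    payload: { amount: raw.amount, remittanceRef: raw.ref },
  };
}

// Projection: update a read model; a trigger registry could fire agents here.
async function projectPayment(event: DomainEvent, db: { execute(sql: string, params: unknown[]): Promise<void> }) {
  if (event.type !== "payment.received") return;
  await db.execute(
    "insert into open_payments (tenant_id, correlation_id, amount) values ($1, $2, $3)",
    [event.tenantId, event.correlationId, event.payload.amount],
  );
}
```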
Agent Runtime — orchestrator + 8 specialists + tools registry.
The Orchestrator plans every workflow, routes to the right specialist, enforces policy and confidence floors. Specialists are narrow, typed-tool-only. The Tools Registry is the swap-seam for API/function/MCP tools.
AGT-01 · Orchestrator
Plan · route · policy · dispatch
Event Shell + LangGraph
| Agent | Role | Status |
|---|---|---|
| Intake | triage · classify | designed |
| Order | change-order processing | designed |
| Cash App | remittance + match | live · #0 |
| Usage | ingestion · metering | designed |
| RevRec | ASC 606 | designed |
| Reconciliation | discrepancy detect | designed |
| HITL | escalate · low-confidence | live · #0 |
| CS Copilot | graph-grounded chat | designed |
Tools Registry
Polymorphic registry: api / function / mcp tool types. Immutable snapshots per agent run. Credential vault. MCP server registry.
Schema designed · tables ready
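One way to model the polymorphic registry is a discriminated union plus an immutable per-run snapshot. The sketch below assumes TypeScript and invents the field names; only the three tool kinds and the snapshot behavior come from the description above.

```ts
// Hypothetical tool record: the three tool kinds come from the text,
// the field names are assumptions.
type ToolRecord =
  | { kind: "api";      name: string; baseUrl: string; credentialRef: string }   // REST call, creds from the vault
  | { kind: "function"; name: string; handler: string }                          // in-process function by registered name
  | { kind: "mcp";      name: string; serverId: string; toolName: string };      // tool exposed by a registered MCP server

// Immutable snapshot pinned to one agent run: the run always replays against
// the exact tool set it saw, even if the registry changes later.
interface ToolSnapshot {
  runId: string;
  takenAt: string;
  tools: readonly ToolRecord[];
}

function snapshotTools(runId: string, registry: ToolRecord[]): ToolSnapshot {
  return { runId, takenAt: new Date().toISOString(), tools: Object.freeze([...registry]) };
}
```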
Observability & Trust — OTel · audit · self-learning.
Three things together: every workflow run emits OTel spans with correlation + causation IDs; every agent action lands in an immutable ledger; CSR overrides flow back as policy refinements that the eval harness validates against history.
OTel spans
Every node = one span; correlation_id ties them together; causation_id chains parent→child.
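A sketch of what one node-level span could look like with the OpenTelemetry JS API; the attribute keys correlation_id and causation_id come from the text, while the tracer name and context shape are assumptions.

```ts
import { trace } from "@opentelemetry/api";

// Sketch: one workflow node = one span, carrying the IDs named above.
// Attribute keys follow the text; everything else is illustrative.
const tracer = trace.getTracer("agent-os");

async function runNode(node: { name: string }, ctx: { correlationId: string; causationId: string }) {
  return tracer.startActiveSpan(node.name, async (span) => {
    span.setAttribute("correlation_id", ctx.correlationId); // ties all spans of one workflow run together
    span.setAttribute("causation_id", ctx.causationId);     // links this span to the event/node that caused it
    try {
      // node work happens here
    } finally {
      span.end();
    }
  });
}
```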
Audit ledger
Cryptographic proof chain. Each entry commits to the hash of the previous entry. Regulators verify by replay.
Workflow run viewer · KG replay
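A minimal hash-chain sketch of the ledger idea; the hashing scheme, field names, and genesis marker are assumptions, but it shows why replaying the chain detects tampering anywhere in the history.

```ts
import { createHash } from "node:crypto";

// Minimal hash chain: each entry commits to the previous entry's hash,
// so changing any entry breaks every later hash on replay. Field names are illustrative.
interface LedgerEntry {
  seq: number;
  prevHash: string;
  payload: string;   // canonicalized agent action
  hash: string;
}

function appendEntry(prev: LedgerEntry | null, payload: string): LedgerEntry {
  const seq = prev ? prev.seq + 1 : 0;
  const prevHash = prev ? prev.hash : "GENESIS";
  const hash = createHash("sha256").update(`${seq}|${prevHash}|${payload}`).digest("hex");
  return { seq, prevHash, payload, hash };
}

// Regulator-style verification: replay the chain and recompute every hash.
function verifyChain(entries: LedgerEntry[]): boolean {
  return entries.every((e, i) => {
    const prevHash = i === 0 ? "GENESIS" : entries[i - 1].hash;
    const expected = createHash("sha256").update(`${e.seq}|${prevHash}|${e.payload}`).digest("hex");
    return e.prevHash === prevHash && e.hash === expected;
  });
}
```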
Self-learning loop
CSR corrects → system internalizes the correction → confidence threshold drifts up per-tenant.
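One plausible shape of the per-tenant drift, assuming the floor moves with recent override rates; the update rule and numbers below are illustrative, not the shipped policy.

```ts
// Illustrative drift rule, not the shipped policy: the per-tenant floor
// tightens while CSRs keep overriding, and relaxes slowly once they stop.
interface TenantCalibration {
  tenantId: string;
  confidenceFloor: number;   // below this, the run forks to HITL
  recentRuns: number;
  recentOverrides: number;
}

function driftFloor(c: TenantCalibration): TenantCalibration {
  const overrideRate = c.recentRuns === 0 ? 0 : c.recentOverrides / c.recentRuns;
  const delta = overrideRate > 0.1 ? +0.02 : -0.005;   // assumed magnitudes
  const confidenceFloor = Math.min(0.99, Math.max(0.5, c.confidenceFloor + delta));
  return { ...c, confidenceFloor, recentRuns: 0, recentOverrides: 0 };
}
```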
Cash-app remittance from email — end to end, every hop typed and traced.
One inbound email. Ten steps. Doc Understanding → Domain Event → KG slice → Orchestrator → AGT-04 → Java Tools sidecar → MS_Billing → Audit + episodic. HITL fork if confidence dips below the per-tenant floor.
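A sketch of the HITL fork at the confidence gate, assuming the orchestrator compares the specialist's score against the per-tenant floor; the field names and decision shape are assumptions, while AGT-04, MS_Billing, and the HITL Cockpit come from the flow above.

```ts
// Illustrative confidence gate for the cash-app flow above (AGT-04).
// Decision shape and field names are assumptions.
interface SpecialistResult { matchProposal: unknown; confidence: number; }

type GateDecision =
  | { route: "auto"; proposal: unknown }                  // post straight to MS_Billing via the Java Tools sidecar
  | { route: "hitl"; proposal: unknown; reason: string }; // park in the HITL Cockpit for a CSR

function gate(result: SpecialistResult, tenantFloor: number): GateDecision {
  if (result.confidence >= tenantFloor) {
    return { route: "auto", proposal: result.matchProposal };
  }
  return {
    route: "hitl",
    proposal: result.matchProposal,
    reason: `confidence ${result.confidence.toFixed(2)} below tenant floor ${tenantFloor.toFixed(2)}`,
  };
}
```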
The full architecture document.
The complete RecVue agentic architecture in one PDF — read inline below or open in a new tab for full-screen.
efsora.com · architecture@efsora.com