How Curia compares to a typical agent framework
| | Typical agent framework | Curia |
|---|---|---|
| Security model | Trust the agent | Hard-enforced layer separation — channel adapters cannot invoke tools |
| Audit trail | Console logs | Append-only Postgres with causal tracing across every event |
| Memory | Conversation history (lost on restart) | Knowledge graph with temporal awareness — survives restarts |
| Error handling | Retry and hope | Error budgets, state continuity, pattern detection |
| Agent communication | Agents work in isolation | Structured, auditable, threaded inter-agent discussions |
| Multi-channel | Single chat interface | Email, Signal, CLI, HTTP API — same security model everywhere |
Security pillars
Hard layer separation
Every component declares its layer at startup. The message bus enforces which event types each layer can publish or subscribe to — at registration time, not at runtime. A compromised email adapter can flood inbound messages, but it cannot invoke a skill, write to memory, or execute code. The boundary is architectural: attempting to publish an unauthorized event throws an error, not a warning.
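The enforcement described above can be sketched as a bus that hands each component a handle bound to its declared layer. This is a minimal illustration, not Curia's implementation: the layer names come from this page, while the event type names, the permission matrix, and the `LayeredBus` class are assumptions made for the example.

```typescript
// Layer names follow this page; event types and the matrix below are hypothetical.
type Layer = "channel" | "dispatch" | "agent" | "execution" | "system";
type EventType =
  | "inbound.message" | "agent.task" | "agent.response"
  | "skill.invoke" | "skill.result" | "outbound.message";
type Handler = (payload: unknown) => void;

interface Permissions {
  publish: readonly EventType[];
  subscribe: readonly EventType[];
}

const ALL_EVENTS: readonly EventType[] = [
  "inbound.message", "agent.task", "agent.response",
  "skill.invoke", "skill.result", "outbound.message",
];

// Assumed permission matrix, for illustration only.
const PERMISSIONS: Record<Layer, Permissions> = {
  channel: { publish: ["inbound.message"], subscribe: ["outbound.message"] },
  dispatch: { publish: ["agent.task", "outbound.message"], subscribe: ["inbound.message", "agent.response"] },
  agent: { publish: ["skill.invoke", "agent.response"], subscribe: ["agent.task", "skill.result"] },
  execution: { publish: ["skill.result"], subscribe: ["skill.invoke"] },
  system: { publish: ALL_EVENTS, subscribe: ALL_EVENTS },
};

class LayeredBus {
  private handlers = new Map<EventType, Handler[]>();

  // The layer is declared once, at registration. The returned handle is bound
  // to that layer's permissions, so a component cannot later claim another layer.
  register(layer: Layer) {
    const allowed = PERMISSIONS[layer];
    const handlers = this.handlers;
    return {
      subscribe(type: EventType, handler: Handler): void {
        if (allowed.subscribe.indexOf(type) === -1) {
          throw new Error(`${layer} layer may not subscribe to ${type}`);
        }
        const list = handlers.get(type) ?? [];
        list.push(handler);
        handlers.set(type, list);
      },
      publish(type: EventType, payload: unknown): void {
        if (allowed.publish.indexOf(type) === -1) {
          throw new Error(`${layer} layer may not publish ${type}`);
        }
        for (const handler of handlers.get(type) ?? []) handler(payload);
      },
    };
  }
}

const demoBus = new LayeredBus();
const emailAdapter = demoBus.register("channel");
emailAdapter.publish("inbound.message", { from: "someone@example.com" }); // allowed
// emailAdapter.publish("skill.invoke", {}) would throw an Error instead of logging a warning.
```

In this sketch a compromised channel adapter can still publish as many inbound messages as it likes, but any attempt to publish a skill invocation fails with an exception.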
Append-only audit trail
Every event that flows through the bus is written to Postgres before it is delivered to any subscriber. No UPDATE, no DELETE, ever. If the process crashes mid-delivery, the event is still in the log. Every event carries a parent_event_id so you can trace the full causal chain from any action back to the message that triggered it.
Secrets never reach the LLM
Credentials live in an encrypted vault (AES-256-GCM in Postgres), not in plaintext .env files; only the master key and database credentials remain on disk. Agents never see passwords, API keys, or tokens. Skills access secrets through a scoped ctx.secret() interface, validated against each skill’s declared manifest. The LLM sees “email-parser connected to inbox”, never the IMAP password. Every secret access is audit-logged: which secret, which skill, which agent, which task, and whether it resolved from the vault.
Tool output sanitization
All skill results are sanitized before being fed back into LLM context. XML and HTML tags are stripped, outputs are truncated to a configurable limit, secret-like patterns are redacted, and error messages are wrapped in structured tags to prevent prompt injection. Nothing from the outside world reaches the LLM unfiltered.
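The four sanitization steps above could look roughly like the following. This is a sketch under stated assumptions: the function names, the `[REDACTED]` and `[truncated]` markers, the `<skill_error>` wrapper tag, and the two secret-like patterns are all hypothetical, and the tag-stripping regex is deliberately naive.

```typescript
interface SanitizeOptions {
  maxLength: number; // the configurable output limit
}

// Assumed secret-like patterns; a real deployment would maintain its own list.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9]{16,}\b/g,                        // API-key-like tokens
  /\b(password|token|api[_-]?key)\s*[:=]\s*\S+/gi,   // key=value credentials
];

function sanitizeSkillOutput(raw: string, opts: SanitizeOptions): string {
  // 1. Strip anything that looks like an XML or HTML tag (naive on purpose).
  let text = raw.replace(/<[^>]*>/g, "");
  // 2. Redact before truncating, so a secret is never left half-visible at the cut.
  for (const pattern of SECRET_PATTERNS) {
    text = text.replace(pattern, "[REDACTED]");
  }
  // 3. Enforce the length limit.
  if (text.length > opts.maxLength) {
    text = text.slice(0, opts.maxLength) + " [truncated]";
  }
  return text;
}

// Error text is sanitized first, so it cannot close the wrapper tag itself.
function wrapSkillError(message: string, opts: SanitizeOptions): string {
  return `<skill_error>${sanitizeSkillOutput(message, opts)}</skill_error>`;
}
```

Sanitizing the error message before wrapping it matters here: because tags are stripped first, a hostile error string cannot inject its own closing tag and break out of the structured wrapper.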
Intent drift detection
Long-running tasks store an intent anchor — the original task description — at creation time. On each execution burst, the system compares current progress against that anchor. If the agent has drifted from its goal, the task is paused, not merely flagged. In unattended mode, drift detection blocks execution; it does not advise.
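As a rough sketch of the pause-on-drift behavior, the check below runs before each execution burst and pauses the task when a drift score crosses a threshold. How Curia actually compares progress against the anchor is not described on this page; the `DriftScorer` type, the 0.5 threshold, and the word-overlap scorer are stand-ins invented for the example (a real scorer would more plausibly use an LLM or embedding comparison).

```typescript
interface LongRunningTask {
  id: string;
  intentAnchor: string;            // original task description, stored at creation
  status: "running" | "paused";
}

// Hypothetical scorer: 0 means on goal, 1 means unrelated to the anchor.
type DriftScorer = (anchor: string, progressSummary: string) => number;

// Toy scorer for the demo: fraction of anchor words missing from the progress summary.
const wordOverlapDrift: DriftScorer = (anchor, progressSummary) => {
  const words = (s: string) => s.toLowerCase().split(/\W+/).filter((w) => w.length > 3);
  const anchorWords = words(anchor);
  if (anchorWords.length === 0) return 0;
  const seen = new Set(words(progressSummary));
  const missing = anchorWords.filter((w) => !seen.has(w)).length;
  return missing / anchorWords.length;
};

// Returns true if the burst may run. On drift the task is paused, not just flagged.
function checkDriftBeforeBurst(
  task: LongRunningTask,
  progressSummary: string,
  scoreDrift: DriftScorer,
  threshold: number = 0.5, // assumed value
): boolean {
  const drift = scoreDrift(task.intentAnchor, progressSummary);
  if (drift > threshold) {
    task.status = "paused";
    return false;
  }
  return true;
}
```

The important property is that the check returns a blocking decision and mutates the task state, so the caller cannot continue executing by ignoring a warning.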
Error budgets
Every agent task runs under hard caps: maximum LLM round-trips, maximum dollar spend, and maximum consecutive errors. When a budget is exceeded, the task stops. No infinite loops, no surprise bills, no runaway agents. Budgets are defined per-agent in YAML and cannot be overridden at runtime.
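The three caps can be illustrated with a small tracker that throws as soon as any budget is exceeded. The field and class names here are hypothetical; in Curia the values come from per-agent YAML, which this sketch represents as a plain object that is frozen at construction.

```typescript
interface ErrorBudget {
  maxLlmRoundTrips: number;
  maxSpendUsd: number;
  maxConsecutiveErrors: number;
}

class BudgetTracker {
  private readonly budget: Readonly<ErrorBudget>;
  private roundTrips = 0;
  private spendUsd = 0;
  private consecutiveErrors = 0;

  constructor(budget: ErrorBudget) {
    // Copy and freeze so the caps cannot be changed once the task is running.
    this.budget = Object.freeze({ ...budget });
  }

  // Call once per LLM round-trip with that call's cost.
  recordLlmCall(costUsd: number): void {
    this.roundTrips += 1;
    this.spendUsd += costUsd;
    if (this.roundTrips > this.budget.maxLlmRoundTrips) {
      throw new Error("budget exceeded: LLM round-trips");
    }
    if (this.spendUsd > this.budget.maxSpendUsd) {
      throw new Error("budget exceeded: spend");
    }
  }

  // Call after each step; a success resets the consecutive-error counter.
  recordOutcome(succeeded: boolean): void {
    this.consecutiveErrors = succeeded ? 0 : this.consecutiveErrors + 1;
    if (this.consecutiveErrors > this.budget.maxConsecutiveErrors) {
      throw new Error("budget exceeded: consecutive errors");
    }
  }
}
```

Throwing rather than returning a status makes the stop unconditional: the task loop ends unless the caller explicitly catches the error, which is the opposite of "retry and hope".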
Prompt injection defense
Inbound message content is treated as data, not instructions. The dispatch layer strips instruction-like patterns and tags high-risk messages with a risk_score before the LLM sees them. The coordinator’s system prompt explicitly instructs it not to follow instructions embedded in user messages. Even if an injection succeeds at the coordinator level, architectural containment limits what a tricked agent can actually do.
Outbound safety
All outbound messages — regardless of which skill or agent produced them — pass through a single outbound gateway before being sent. A deterministic content filter checks for system prompt fragments, internal field names, secret patterns, and contact data leakage. A second-stage LLM-as-judge then reviews mixed-audience messages for principal-private or internal content leaking to external recipients, blocking the send and notifying you with a principal-safe reason. Display names in inbound emails are validated against the verified sender address to prevent spoofing.
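A two-stage gateway of this kind might be structured as below. Everything here is an assumption made for illustration: the message shape, the `internal` recipient flag used to detect mixed audiences, the leak patterns, and the synchronous `Judge` callback standing in for the LLM-as-judge stage.

```typescript
interface Recipient {
  address: string;
  internal: boolean; // hypothetical flag: true for the principal or internal parties
}

interface OutboundMessage {
  body: string;
  recipients: Recipient[];
}

type Verdict = { allowed: true } | { allowed: false; reason: string };

// Stand-in for the second-stage LLM-as-judge; synchronous to keep the sketch short.
type Judge = (msg: OutboundMessage) => Verdict;

// Assumed leak markers: a system prompt fragment and internal field names.
const LEAK_PATTERNS: RegExp[] = [/You are Curia/i, /parent_event_id/, /risk_score/];

function outboundGateway(msg: OutboundMessage, judge: Judge): Verdict {
  // Stage 1: deterministic content filter, applied to every message.
  for (const pattern of LEAK_PATTERNS) {
    if (pattern.test(msg.body)) {
      return { allowed: false, reason: "content filter matched internal data" };
    }
  }
  // Stage 2: only mixed-audience messages go to the judge.
  const hasInternal = msg.recipients.some((r) => r.internal);
  const hasExternal = msg.recipients.some((r) => !r.internal);
  if (hasInternal && hasExternal) {
    return judge(msg);
  }
  return { allowed: true };
}
```

Running the cheap deterministic filter first means the slower judge is only consulted for the subset of messages where audience mixing makes a leak plausible.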
Skill caller restrictions
Skills can declare which agents are allowed to invoke them via allowed_callers in the skill manifest. The execution layer enforces this before any other gate: a structurally forbidden caller is rejected immediately, not routed through autonomy or approval workflows. Agent names are validated at startup; unknown names prevent boot.
Container and supply chain
The production Docker image runs as a non-root curia user, and its base images are pinned by SHA-256 digest. Automated security scanning runs on every pull request: Trivy (npm deps, Docker image, secrets), Semgrep (pattern-based SAST), CodeQL (semantic analysis), and Gitleaks (secret detection), plus a weekly OpenSSF Scorecard run published to the GitHub Security tab. Every GitHub Actions reference is pinned to a commit SHA, workflow GITHUB_TOKEN grants are scoped to least privilege per job, and Dependabot tracks npm, Actions, and Docker updates. Branch protection on main requires PR review and passing status checks before merge.
Layer boundaries at a glance
The message bus enforces these boundaries at registration time. No layer can escape its permissions, whether through configuration or at runtime.
| Layer | What it can do |
|---|---|
| Channel | Receive inbound messages from external channels; deliver outbound messages back to those channels. Cannot invoke skills or read memory. |
| Dispatch | Route messages to agents; send replies back to channels. Cannot invoke skills directly. |
| Agent | Call skills, produce responses, coordinate with other agents. Cannot directly access channels or external APIs. |
| Execution | Receive skill invocations and return results. Cannot initiate messages or access channels. |
| System | Full access across all layers — reserved for the audit logger, scheduler, and memory engine only. |
No user-defined agent or skill ever runs at the system layer. That layer is reserved exclusively for Curia’s trusted infrastructure components.
What this means in practice
If an attacker sends a malicious email crafted to trick Curia into exfiltrating data, the attack must survive all of the following:
- Display name sanitization: the email’s From header is checked against the verified sender address
- Injection pattern detection: the dispatch layer scans for instruction-like patterns in the message body and adjusts the trust score accordingly
- Trust score gating: an email from an unknown sender scores around 0.12, well below the 0.8 threshold required for data export actions
- Coordinator’s prompt injection defense: explicit directives instruct the coordinator to treat message content as data, not instructions
- Architectural containment: even a tricked coordinator can only delegate to specialist agents and invoke skills; it cannot directly access the filesystem, database, or external APIs
- Skill permission validation: the execution layer validates that the invoked skill has declared the permissions needed for the requested action
- Outbound content filter: any outbound message is checked for leaked system context or internal data before being sent
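To make the layering concrete, the sketch below models two of the checks in the list above as ordered gates and reports the first one that blocks a request. The 0.8 data export threshold and the roughly 0.12 unknown-sender score come from this page; the request shape, the gate structure, and the 0.3 injection penalty are assumptions made for the example.

```typescript
interface InboundRequest {
  senderTrust: number;                // 0..1, about 0.12 for an unknown sender
  displayNameMatchesSender: boolean;
  injectionSuspected: boolean;
  action: "reply" | "data_export";
}

const DATA_EXPORT_THRESHOLD = 0.8;    // threshold stated on this page
const INJECTION_PENALTY = 0.3;        // assumed value, for illustration only

interface Gate {
  name: string;
  passes: (req: InboundRequest) => boolean;
}

// Suspected injection lowers the trust score instead of blocking outright.
function effectiveTrust(req: InboundRequest): number {
  const penalty = req.injectionSuspected ? INJECTION_PENALTY : 0;
  return Math.max(0, req.senderTrust - penalty);
}

const GATES: Gate[] = [
  { name: "display name sanitization", passes: (r) => r.displayNameMatchesSender },
  {
    name: "trust score gating",
    passes: (r) => r.action !== "data_export" || effectiveTrust(r) >= DATA_EXPORT_THRESHOLD,
  },
];

// Returns the name of the first gate that blocks the request, or null if all pass.
function firstBlockingGate(req: InboundRequest): string | null {
  for (const gate of GATES) {
    if (!gate.passes(req)) return gate.name;
  }
  return null;
}
```

With these numbers, a data export request from an unknown sender at 0.12 stops at trust score gating, and even a sender at 0.9 falls to 0.6 and is blocked once injection is suspected.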
Related pages
Audit log
How the append-only audit trail works and how to use it for compliance review.
Contact trust
How Curia verifies senders and controls what they can request.
Autonomy engine
How the autonomy score controls how independently Curia acts.