Architectural security for AI agents
Every tool call — file read, shell command, network request — passes through a multi-filter scoring proxy before execution. Not a wrapper. Not a prompt. An enforcement architecture.
Six design principles
Security decisions baked into the architecture, not bolted on after.
Every call evaluated
No operation executes without passing through the security proxy. Every file access, network connection, and process spawn is intercepted, scored, and gated.
Defence in depth
Multiple independent security layers ensure no single failure compromises the system. 17 filters across 3 phases.
Fail closed
Any proxy error, timeout, or unexpected condition results in DENY, not ALLOW. Security degrades safely.
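A minimal sketch of the fail-closed idea, not grith's actual implementation: the helper name `gate`, the `Decision` enum, and the 0.5 s timeout are all hypothetical. It shows one way an evaluator's error, timeout, or unexpected return value can collapse to DENY.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


def gate(evaluate, call, timeout_s=0.5):
    """Return ALLOW only on an explicit ALLOW; everything else is DENY.

    `evaluate` stands in for the proxy's scoring logic (hypothetical).
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        result = pool.submit(evaluate, call).result(timeout=timeout_s)
        # Unexpected return values are not trusted: only ALLOW passes.
        return result if result is Decision.ALLOW else Decision.DENY
    except Exception:
        # Evaluator errors and timeouts both land here and fail closed.
        return Decision.DENY
    finally:
        # Do not block the caller on a slow evaluator.
        pool.shutdown(wait=False)
```

The design choice is that ALLOW is the only value that has to be earned; the default branch of every code path is DENY.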
Minimal privilege
Each supervised tool receives only the permissions defined by its profile. Routine operations auto-allow; everything else is scored.
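As a rough illustration of profile-scoped permissions, the sketch below routes an operation to DENY, ALLOW, or SCORE. The `Profile` fields, the glob-based matching, and the operation names are assumptions made for this example, not grith's profile format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Profile:
    # Operation types the supervised tool may attempt at all (hypothetical).
    permitted: set = field(default_factory=set)
    # (operation, glob) pairs treated as routine and auto-allowed (hypothetical).
    auto_allow: list = field(default_factory=list)


def route(profile, op, target):
    """Decide how a single operation is handled under a profile."""
    if op not in profile.permitted:
        return "DENY"    # outside the profile entirely
    if any(op == o and fnmatchcase(target, g) for o, g in profile.auto_allow):
        return "ALLOW"   # routine operation
    return "SCORE"       # everything else goes to the filter pipeline
```

For example, a profile that permits `file_read` and auto-allows `src/*` would pass a read of `src/main.py` directly, score a read of a path outside `src/`, and deny any network operation.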
Auditable
Every tool call, filter evaluation, score, and decision is logged in structured JSON. Full audit trail.
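To make the shape of such a log entry concrete, here is a hedged sketch that builds one JSON line per tool call. The field names and the use of a plain sum for the composite score are assumptions for illustration; the real schema may differ.

```python
import json
from datetime import datetime, timezone


def audit_record(tool, args, filter_scores, decision):
    """Build one JSON log line for a tool call (illustrative field names)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "args": args,
        # One entry per filter evaluation, so each score is traceable.
        "filters": [{"name": n, "score": s} for n, s in filter_scores.items()],
        # Assumption: the composite is a simple sum of filter scores.
        "composite_score": sum(filter_scores.values()),
        "decision": decision,
    }
    return json.dumps(record, sort_keys=True)
```

One record per line keeps the log easy to stream, filter, and hand to whatever evidence tooling an auditor expects.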
Enforcement convergence
Both execution paths route through the same proxy, filters, scoring engine, audit log, and digest system.
Two modes, one pipeline
Whether you run grith's built-in agent or wrap an external CLI tool, every operation flows through the same security proxy.
Built-in Agent
Grith's own LLM agent runs tool calls through the security proxy before execution. Every file read, shell command, and HTTP request is scored and gated.
- Every tool call proxy-evaluated before execution
- Profile-based allowlists scope permitted operations
- Full audit trail with per-call scoring
$ grith run "fix the tests"
CLI Supervisor
Wrap any external tool — Claude Code, Codex, Aider — with grith exec. OS-level syscall interception routes every operation through the security proxy.
- Linux: available now, with full interception via ptrace + seccomp
- macOS: support coming very soon
- Windows: support coming very soon
$ grith exec -- claude-code "fix the bug"
Multi-Filter Security Proxy
Same filters, same scoring thresholds, same audit log, same digest system. Security policy is defined once and applies everywhere.
17 filters, 3 phases, one composite score
Tool calls flow through independent filter phases. Threats are caught early — most never reach the final scoring stage.
Cold-start (first 200 calls): escalation zone widens to 2.0-10.0
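The cold-start rule can be sketched as a threshold lookup. The 200-call window and the 2.0-10.0 zone come from the line above; the steady-state zone of 3.0-8.0, the inclusive upper bound, and the ESCALATE label are placeholders assumed for this example, not documented values.

```python
COLD_START_CALLS = 200          # from the text above
COLD_START_ZONE = (2.0, 10.0)   # from the text above
STEADY_ZONE = (3.0, 8.0)        # hypothetical placeholder, not a documented value


def decide(score, calls_seen):
    """Map a composite score to a decision, widening the zone early on.

    `calls_seen` is the number of calls already evaluated in this context.
    """
    low, high = COLD_START_ZONE if calls_seen < COLD_START_CALLS else STEADY_ZONE
    if score < low:
        return "ALLOW"
    if score <= high:
        return "ESCALATE"   # inside the escalation zone
    return "DENY"
```

Under these assumed numbers, a score of 2.5 escalates during cold start but passes once the baseline is established, and a score of 9.0 escalates during cold start but is denied afterwards.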
Every filter, explained
17 independent classifiers across three pipeline phases. Within each phase, all filters execute in parallel.
| # | Filter | Description | Phase | Latency | Score |
|---|---|---|---|---|---|
| 1 | Operation risk classification | Classifies tool call type and assigns base risk score | Static | <0.1ms | +1 to +4 |
| 2 | Static path matching | Aho-Corasick pattern matching on file paths and arguments | Static | <0.1ms | +2 to +5 |
| 3 | Sensitive path heuristic | Heuristic detection of sensitive directories (.ssh, .env, credentials) | Static | <0.1ms | +1 to +4 |
| 4 | Allowlist / denylist | Exact match against user-configured allow and deny lists | Static | <0.1ms | -1 to +3 |
| 5 | Argument length / structure | Detects anomalous argument sizes and malformed structures | Static | <0.1ms | 0 to +2 |
| 6 | Capability enforcement | Validates operation against supervisor profile permissions | Static | <0.1ms | 0 or DENY |
| 7 | Secret / credential scanning | Regex + entropy detection for 1600+ patterns (API keys, tokens, passwords) | Pattern | 1-3ms | +3 to +5 |
| 8 | Command structure analysis | Shell command parsing for pipe chains, redirections, evals | Pattern | 0.5-1ms | +2 to +4 |
| 9 | Egress policy enforcement | Controls outbound network destinations against trusted domain lists | Pattern | <0.1ms | +2 to +5 |
| 10 | DLP gate | Data loss prevention: detects and redacts sensitive data in outbound content | Pattern | 0.5-2ms | +3 to +5 |
| 11 | Canary token detection | Detects access to planted canary files and honeytokens | Pattern | <0.1ms | +4 to +5 |
| 12 | Destination reputation | IP/domain reputation lookup for network requests | Context | 0.5-2ms | -1 to +4 |
| 13 | Behavioural profiling | Deviation from learned baselines for this user/project | Context | 1-3ms | +1 to +3 |
| 14 | Information flow taint tracking | Tracks data provenance and blocks tainted data exfiltration | Context | 0.1-0.5ms | +3 to +5 |
| 15 | Session containment | Detects read-then-exfiltrate patterns within a session window | Context | 0.1-0.5ms | +2 to +4 |
| 16 | Rate limiting / anomaly detection | Burst detection, frequency caps, statistical anomalies | Context | <0.1ms | +1 to +3 |
| 17 | Egress rate monitoring | Tracks outbound request volume, unique destinations, and port scanning | Context | <0.1ms | +1 to +3 |
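The phase structure in the table can be sketched as follows. This is an illustration under assumptions, not grith's engine: the function name `run_pipeline`, the thread pool, the summed score, and the `DENY` sentinel (modelled on filter 6's "0 or DENY") are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

DENY = "DENY"  # sentinel a filter returns for a hard block (cf. filter 6)


def run_pipeline(call, phases):
    """Run phases in order; filters inside a phase run concurrently.

    `phases` is a list of (phase_name, [filter_fn, ...]). Each filter_fn
    takes the call and returns a numeric score or DENY. Returns
    (total_score, stopped_at), where stopped_at names the phase that
    issued a hard DENY, or is None if every phase ran.
    """
    total = 0.0
    with ThreadPoolExecutor() as pool:
        for name, filters in phases:
            # All filters of one phase are evaluated in parallel.
            results = list(pool.map(lambda f: f(call), filters))
            if DENY in results:
                # Stop early: later phases never see this call.
                return total, name
            total += sum(results)
    return total, None
```

Ordering the cheap static filters first means a hard block costs well under a millisecond of filter time, and the slower context filters only run for calls that are still undecided.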
Filter latency ranges
AI agent kill chain detection
Grith maps agent behaviour to a 7-phase kill chain adapted from MITRE ATT&CK. Most threats are intercepted in the earliest phases.
Expected score distribution
1,000 simulated tool calls across a typical development session. The vast majority of operations are routine and pass automatically.
Designed to support your compliance workflows
Every tool call produces structured audit evidence. Map it to the framework your auditors require. Grith provides the data — not the certification.
SOC 2 Type II
Audit evidence for trust services criteria
NIST AI RMF
Structured data for all 4 functions
EU AI Act
Articles 9, 12, 13, 14, 15, 72
Secure your AI agents
Independent syscall-level enforcement for every AI coding tool.
Responsible Disclosure
We take security vulnerabilities seriously. If you discover a security issue in grith, we ask that you disclose it responsibly.
- Email security@grith.ai with details of the vulnerability.
- Include steps to reproduce, affected versions, and potential impact.
- Allow up to 90 days for us to investigate and patch before public disclosure.
- We will acknowledge your report within 48 hours and provide regular updates on remediation progress.
Security Contact
For security-related communications, contact us at security@grith.ai. For PGP-encrypted communications, our public key is available upon request.
Security Advisories
Security advisories will be published on this page and in the project's GitHub repository. Subscribe to the repository releases to receive notifications.
For detailed architecture documentation, see the documentation site.