Architectural security for AI agents

Every tool call — file read, shell command, network request — passes through a multi-filter scoring proxy before execution. Not a wrapper. Not a prompt. An enforcement architecture.

17 filters

Independent classifiers

<0ms

Evaluation latency

3 outcomes

Allow / Queue / Deny

Six design principles

Security decisions baked into the architecture, not bolted on after.

Every call evaluated

No operation executes without passing through the security proxy. Every file access, network connection, and process spawn is intercepted, scored, and gated.

Defence in depth

Multiple independent security layers ensure no single failure compromises the system. 17 filters across 3 phases.

Fail closed

Any proxy error, timeout, or unexpected condition results in DENY, not ALLOW. Security degrades safely.
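A minimal sketch of the fail-closed rule, assuming the proxy exposes an evaluation callable. The names `evaluate_fail_closed`, `evaluate`, `tool_call`, and the 50 ms timeout are hypothetical and are not grith's actual API or configuration. Any error, timeout, or unexpected return value collapses to DENY.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    QUEUE = "queue"
    DENY = "deny"


def evaluate_fail_closed(evaluate, tool_call, timeout_s=0.05):
    """Run evaluate(tool_call) and return DENY on any failure.

    `evaluate`, `tool_call`, and the default timeout are illustrative
    assumptions, not part of grith.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        result = pool.submit(evaluate, tool_call).result(timeout=timeout_s)
    except FutureTimeout:
        return Decision.DENY  # evaluation took too long
    except Exception:
        return Decision.DENY  # evaluation raised an error
    finally:
        pool.shutdown(wait=False)  # do not block on a stuck evaluation
    # Anything that is not a recognised decision is treated as a failure.
    return result if isinstance(result, Decision) else Decision.DENY
```

The point of the design is that ALLOW is only reachable through one explicit, successful path; every other path ends in DENY.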

Minimal privilege

Each supervised tool receives only the permissions defined by its profile. Routine operations auto-allow; everything else is scored.
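As an illustration only, profile-scoped triage might look like the sketch below. The `Profile` shape, the operation names, and the glob syntax are assumptions made for this example; grith's real profile format is not shown on this page. Paths are normalised before matching so that `..` segments cannot escape an allowed directory.

```python
import posixpath
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile: (operation, path glob) pairs treated as routine."""
    routine: tuple


def triage(profile, operation, path):
    """Return "auto-allow" for routine operations, otherwise "score"."""
    clean = posixpath.normpath(path)  # collapse "a/../b" before matching
    for allowed_op, pattern in profile.routine:
        if operation == allowed_op and fnmatchcase(clean, pattern):
            return "auto-allow"
    return "score"  # everything else goes to the filters
```

For example, a profile with `routine=(("file_read", "src/*"),)` auto-allows a read of `src/app/main.py` but sends a write to the same file, or a read of `src/../.ssh/id_rsa`, on to scoring.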

Auditable

Every tool call, filter evaluation, score, and decision is logged in structured JSON. Full audit trail.
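A sketch of what one structured audit entry could look like, written as a single JSON line. Every field name here is hypothetical, and treating the composite score as the sum of the per-filter scores is an assumption; the real log schema may differ.

```python
import json
from datetime import datetime, timezone


def audit_record(tool_call, filter_scores, decision):
    """Serialise one decision as a single JSON line.

    Field names are illustrative assumptions, not grith's schema.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool_call": tool_call,
        "filters": [
            {"name": name, "score": score}
            for name, score in filter_scores.items()
        ],
        # Assumption: the composite score is a plain sum.
        "composite_score": sum(filter_scores.values()),
        "decision": decision,
    }
    return json.dumps(record, sort_keys=True)
```

One line per decision keeps the log easy to stream, grep, and load into whatever evidence store an auditor asks for.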

Enforcement convergence

Both execution paths route through the same proxy, filters, scoring engine, audit log, and digest system.

Two modes, one pipeline

Whether you run grith's built-in agent or wrap an external CLI tool, every operation flows through the same security proxy.

Mode 1

Built-in Agent

Grith's own LLM agent runs tool calls through the security proxy before execution. Every file read, shell command, and HTTP request is scored and gated.

* Every tool call proxy-evaluated before execution
* Profile-based allowlists scope permitted operations
* Full audit trail with per-call scoring

Terminal

$ grith run "fix the tests"

Mode 2

CLI Supervisor

Wrap any external tool — Claude Code, Codex, Aider — with grith exec. OS-level syscall interception routes every operation through the security proxy.

* Linux x86_64 available now with ptrace + seccomp full interception
* macOS support tracked for v2.0 (Endpoint Security port)
* Windows support tracked for v2.0 (ETW + supervisor port)
* Linux aarch64 tracked for v2.0 (supervisor register-backend port)

Terminal

$ grith exec -- claude-code "fix the bug"

Multi-Filter Security Proxy

Same filters, same scoring thresholds, same audit log, same digest system. Security policy is defined once and applies everywhere.

ALLOW: score < 3.0
QUEUE: score 3.0-8.0
DENY: score > 8.0

17 filters, 3 phases, one composite score

Tool calls flow through independent filter phases. Threats are caught early — most never reach the final scoring stage.

Allow: score < 3.0
Queue for review: score 3.0-8.0
Deny: score > 8.0

Cold-start (first 200 calls): escalation zone widens to 2.0-10.0
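The thresholds above reduce to a small decision function. This sketch assumes that the "escalation zone" is the Queue band and that the boundary values themselves (3.0 and 8.0, or 2.0 and 10.0 during cold start) fall inside it; `decide` and `calls_seen` are hypothetical names.

```python
def decide(score, calls_seen=0):
    """Map a composite score to ALLOW, QUEUE, or DENY.

    `calls_seen` is the number of calls already evaluated, so the first
    200 calls are those with calls_seen in 0..199.
    """
    cold_start = calls_seen < 200
    low, high = (2.0, 10.0) if cold_start else (3.0, 8.0)
    if score < low:
        return "ALLOW"
    if score > high:
        return "DENY"
    return "QUEUE"  # inside the escalation zone, boundaries included
```

Under these assumptions, a score of 2.5 is allowed once the baseline is established but queued during cold start, and a score of 9.0 is denied normally but queued during cold start.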

Every filter, explained

17 independent classifiers across three pipeline phases. Within each phase, all filters execute in parallel.
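One way to picture "phases in sequence, filters in parallel" is the sketch below. It assumes that filter scores are summed into the composite and that the pipeline stops as soon as the running total passes the deny threshold; neither detail is confirmed here, and `run_pipeline` and `HARD_DENY` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sentinel for a filter that returns DENY outright.
HARD_DENY = float("inf")


def run_pipeline(phases, tool_call, deny_above=8.0):
    """Run phases in order and the filters inside each phase in parallel.

    `phases` is a list of lists of callables; each callable takes the tool
    call and returns a numeric score. Returns (total, phases_run).
    """
    total = 0.0
    phases_run = 0
    with ThreadPoolExecutor() as pool:
        for filters in phases:
            phases_run += 1
            # All filters of this phase are submitted together.
            scores = list(pool.map(lambda f: f(tool_call), filters))
            total += sum(scores)
            if total > deny_above:
                break  # already past the deny threshold, skip later phases
    return total, phases_run
```

With this shape, a call that accumulates more than 8.0 in the first two phases never reaches the third.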

#	Filter	Description	Phase	Latency	Score
1	Operation risk classification	Classifies tool call type and assigns base risk score	Static	<0.1ms	+1 to +4
2	Static path matching	Aho-Corasick pattern matching on file paths and arguments	Static	<0.1ms	+2 to +5
3	Sensitive path heuristic	Heuristic detection of sensitive directories (.ssh, .env, credentials)	Static	<0.1ms	+1 to +4
4	Allowlist / denylist	Exact match against user-configured allow and deny lists	Static	<0.1ms	-1 to +3
5	Argument length / structure	Detects anomalous argument sizes and malformed structures	Static	<0.1ms	0 to +2
6	Capability enforcement	Validates operation against supervisor profile permissions	Static	<0.1ms	0 or DENY
7	Secret / credential scanning	Regex + entropy detection for 1600+ patterns (API keys, tokens, passwords)	Pattern	1-3ms	+3 to +5
8	Command structure analysis	Shell command parsing for pipe chains, redirections, evals	Pattern	0.5-1ms	+2 to +4
9	Egress policy enforcement	Controls outbound network destinations against trusted domain lists	Pattern	<0.1ms	+2 to +5
10	DLP gate	Data loss prevention: detects and redacts sensitive data in outbound content	Pattern	0.5-2ms	+3 to +5
11	Canary token detection	Detects access to planted canary files and honeytokens	Pattern	<0.1ms	+4 to +5
12	Destination reputation	IP/domain reputation lookup for network requests	Context	0.5-2ms	-1 to +4
13	Behavioural profiling	Deviation from learned baselines for this user/project	Context	1-3ms	+1 to +3
14	Information flow taint tracking	Tracks data provenance and blocks tainted data exfiltration	Context	0.1-0.5ms	+3 to +5
15	Session containment	Detects read-then-exfiltrate patterns within a session window	Context	0.1-0.5ms	+2 to +4
16	Rate limiting / anomaly detection	Burst detection, frequency caps, statistical anomalies	Context	<0.1ms	+1 to +3
17	Egress rate monitoring	Tracks outbound request volume, unique destinations, and port scanning	Context	<0.1ms	+1 to +3
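To make the "regex + entropy" idea behind filter 7 concrete, here is a small sketch. It uses one illustrative pattern (the widely documented AWS access key ID format) as a stand-in for the 1600+ pattern set, and the length and entropy cut-offs are assumed values, not grith's tuned thresholds.

```python
import math
import re
from collections import Counter

# Illustrative pattern only: "AKIA" followed by 16 upper-case letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_secret(token, min_length=20, min_entropy=4.0):
    """Flag a token if it matches a known pattern or looks random enough.

    `min_length` and `min_entropy` are assumed values for illustration.
    """
    if AWS_KEY_ID.search(token):
        return True
    return len(token) >= min_length and shannon_entropy(token) >= min_entropy
```

The regex catches secrets with a known shape; the entropy check is a fallback for long, random-looking strings that match no pattern, at the cost of occasional false positives on hashes and similar data.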

Filter latency ranges

AI agent kill chain detection

Grith maps agent behaviour to a 7-phase kill chain adapted from MITRE ATT&CK. Most threats are intercepted in the earliest phases.

Chart legend: intercepted early, intercepted mid-chain, intercepted late.

Expected score distribution

1,000 simulated tool calls across a typical development session. The vast majority of operations are routine and pass automatically.

91% auto-allowed. The remainder is either queued for review or auto-denied.

Designed to support your compliance workflows

Every tool call produces structured audit evidence. Map it to the framework your auditors require. Grith provides the data — not the certification.

SOC 2 Type II

Audit evidence for trust services criteria


NIST AI RMF

Structured data for all 4 functions


EU AI Act

Articles 9, 12, 13, 14, 15, 72


ISO: AI management system
OWASP: 6/10 risks covered
FedRAMP: ConMon-compatible data export (planned)
HIPAA: PHI access logging
PCI-DSS: Audit trail compliance
ITAR: 5-year retention

Secure your AI agents

Independent syscall-level enforcement for every AI coding tool.

Get early access

Responsible Disclosure

We take security vulnerabilities seriously. If you discover a security issue in grith, we ask that you disclose it responsibly.

Email security@grith.ai with details of the vulnerability.
Include steps to reproduce, affected versions, and potential impact.
Allow up to 90 days for us to investigate and patch before public disclosure.
We will acknowledge your report within 48 hours and provide regular updates on remediation progress.

Security Contact

For security-related communications, contact us at security@grith.ai. For PGP-encrypted communications, our public key is available upon request.

Security Advisories

Security advisories will be published on this page and in the project's GitHub repository. Subscribe to the repository releases to receive notifications.

For detailed architecture documentation, see the documentation site.