Architectural security for AI agents

Every tool call — file read, shell command, network request — passes through a multi-filter scoring proxy before execution. Not a wrapper. Not a prompt. An enforcement architecture.

17 filters
Independent classifiers
<0ms
Evaluation latency
3 outcomes
Allow / Queue / Deny
Tool Call → 17 FILTERS x 3 PHASES (STATIC, PATTERN, CONTEXT) → SCORE → ALLOW / QUEUE / DENY

Six design principles

Security decisions baked into the architecture, not bolted on after.

Every call evaluated

No operation executes without passing through the security proxy. Every file access, network connection, and process spawn is intercepted, scored, and gated.

Defence in depth

Multiple independent security layers ensure no single failure compromises the system. 17 filters across 3 phases.

Fail closed

Any proxy error, timeout, or unexpected condition results in DENY, not ALLOW. Security degrades safely.
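
A minimal sketch of the fail-closed rule. The `gate` helper, the `Decision` enum, and the half-second default timeout are hypothetical names and values chosen for illustration, not grith's API. The only property it demonstrates is that an error, a timeout, or an unexpected return value all collapse to DENY.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    QUEUE = "queue"
    DENY = "deny"


def gate(evaluate, call, timeout_s=0.5):
    """Return the evaluator's decision, or DENY on any failure (hypothetical helper)."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        decision = pool.submit(evaluate, call).result(timeout=timeout_s)
        # An unexpected return value is treated the same as an error.
        return decision if isinstance(decision, Decision) else Decision.DENY
    except Exception:
        # Covers both a crashed evaluator and a timed-out one.
        return Decision.DENY
    finally:
        pool.shutdown(wait=False)
```

The design choice is that ALLOW is only reachable through one explicit, type-checked return path; every other path ends in DENY.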

Minimal privilege

Each supervised tool receives only the permissions defined by its profile. Routine operations auto-allow; everything else is scored.

Auditable

Every tool call, filter evaluation, score, and decision is logged in structured JSON. Full audit trail.
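
The page does not show the log schema, so the following is only a sketch of what one structured JSON audit line could look like. Every field name is an assumption, and so is the idea that the composite score is a plain sum of the per-filter contributions.

```python
import json
from datetime import datetime, timezone


def audit_record(tool, args, filter_scores, decision):
    """Build one JSON log line; all field names here are assumptions."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "args": args,
        "filters": filter_scores,  # per-filter score contributions
        "score": round(sum(filter_scores.values()), 2),  # assumed: simple sum
        "decision": decision,
    }
    return json.dumps(record, sort_keys=True)


line = audit_record(
    "file_read",
    {"path": "~/.ssh/id_rsa"},
    {"operation_risk": 2.0, "sensitive_path": 4.0, "static_path": 2.5},
    "DENY",
)
```

One record per tool call keeps the log greppable and lets an auditor recompute the decision from the recorded filter scores.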

Enforcement convergence

Both execution paths route through the same proxy, filters, scoring engine, audit log, and digest system.

Two modes, one pipeline

Whether you run grith's built-in agent or wrap an external CLI tool, every operation flows through the same security proxy.

Mode 1

Built-in Agent

Grith's own LLM agent runs tool calls through the security proxy before execution. Every file read, shell command, and HTTP request is scored and gated.

  • Every tool call proxy-evaluated before execution
  • Profile-based allowlists scope permitted operations
  • Full audit trail with per-call scoring
Terminal
$ grith run "fix the tests"
Mode 2

CLI Supervisor

Wrap any external tool — Claude Code, Codex, Aider — with grith exec. OS-level syscall interception routes every operation through the security proxy.

  • Linux: available now, with full interception via ptrace + seccomp
  • macOS: support coming very soon
  • Windows: support coming very soon
Terminal
$ grith exec -- claude-code "fix the bug"

Multi-Filter Security Proxy

Same filters, same scoring thresholds, same audit log, same digest system. Security policy is defined once and applies everywhere.

ALLOW: score < 3.0
QUEUE: score 3.0-8.0
DENY: score > 8.0

17 filters, 3 phases, one composite score

Tool calls flow through independent filter phases. Threats are caught early — most never reach the final scoring stage.

Inputs: file_read, file_write, shell_exec, net_request
STATIC: Op risk classification, Static path matching, Sensitive path heuristic, Allowlist/denylist, Arg length/structure, Capability enforcement
PATTERN: Secret/credential scan, Command structure, Egress policy, DLP gate, Canary token detection
CONTEXT: Dest reputation, Behavioural profiling, Taint tracking, Session containment, Rate limiting/anomaly, Egress rate monitoring
Outcomes: ALLOW 910, QUEUE 70, DENY 20
Allow: score < 3.0
Queue for review: score 3.0 - 8.0
Deny: score > 8.0

Cold-start (first 200 calls): escalation zone widens to 2.0-10.0
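
The thresholds above, including the cold-start widening, can be sketched as a small decision function. The function name and parameters are hypothetical. Scores that land exactly on a boundary are sent to QUEUE here, which is one reading of the "3.0-8.0" band rather than documented behaviour.

```python
def decide(score, calls_seen=0, cold_start_calls=200):
    """Map a composite score to ALLOW / QUEUE / DENY (illustrative only)."""
    # During the first 200 calls the review band widens from 3.0-8.0 to 2.0-10.0.
    low, high = (2.0, 10.0) if calls_seen < cold_start_calls else (3.0, 8.0)
    if score < low:
        return "ALLOW"
    if score > high:
        return "DENY"
    return "QUEUE"  # boundary values fall into the review band
```

For example, a score of 8.5 is denied in steady state but only queued during cold start, because the widened band defers more decisions to a human while baselines are still being learned.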

Every filter, explained

17 independent classifiers across three pipeline phases. Within each phase, all filters execute in parallel.
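
As a rough illustration of "parallel within a phase, sequential across phases", the sketch below runs each phase's filters on a thread pool and adds up their scores. The source does not say how the composite is computed, so the plain sum is an assumption, as are the three toy filters, the `HARD_DENY` sentinel, and the shape of the `call` dictionary.

```python
from concurrent.futures import ThreadPoolExecutor

HARD_DENY = float("inf")  # stand-in for a filter that returns DENY outright


def run_pipeline(call, phases):
    """Run each phase's filters concurrently and return the composite score.

    Assumption: the composite is a plain sum of the filter scores.
    """
    total = 0.0
    with ThreadPoolExecutor() as pool:
        for filters in phases:
            total += sum(pool.map(lambda f: f(call), filters))
            if total == HARD_DENY:
                break  # stop early; later phases never see the call
    return total


# Hypothetical filters standing in for the real classifiers.
def op_risk(call):
    return 3.0 if call["tool"] == "shell_exec" else 1.0


def capability(call):
    return 0.0 if call["tool"] in call["profile"] else HARD_DENY


def pipe_to_shell(call):
    return 4.0 if "| sh" in call.get("cmd", "") else 0.0


phases = [[op_risk, capability], [pipe_to_shell]]
```

Stopping after a hard deny mirrors the claim that most threats never reach the final scoring stage.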

# | Filter | Description | Phase | Latency | Score
1 | Operation risk classification | Classifies tool call type and assigns base risk score | Static | <0.1ms | +1 to +4
2 | Static path matching | Aho-Corasick pattern matching on file paths and arguments | Static | <0.1ms | +2 to +5
3 | Sensitive path heuristic | Heuristic detection of sensitive directories (.ssh, .env, credentials) | Static | <0.1ms | +1 to +4
4 | Allowlist / denylist | Exact match against user-configured allow and deny lists | Static | <0.1ms | -1 to +3
5 | Argument length / structure | Detects anomalous argument sizes and malformed structures | Static | <0.1ms | 0 to +2
6 | Capability enforcement | Validates operation against supervisor profile permissions | Static | <0.1ms | 0 or DENY
7 | Secret / credential scanning | Regex + entropy detection for 1600+ patterns (API keys, tokens, passwords) | Pattern | 1-3ms | +3 to +5
8 | Command structure analysis | Shell command parsing for pipe chains, redirections, evals | Pattern | 0.5-1ms | +2 to +4
9 | Egress policy enforcement | Controls outbound network destinations against trusted domain lists | Pattern | <0.1ms | +2 to +5
10 | DLP gate | Data loss prevention: detects and redacts sensitive data in outbound content | Pattern | 0.5-2ms | +3 to +5
11 | Canary token detection | Detects access to planted canary files and honeytokens | Pattern | <0.1ms | +4 to +5
12 | Destination reputation | IP/domain reputation lookup for network requests | Context | 0.5-2ms | -1 to +4
13 | Behavioural profiling | Deviation from learned baselines for this user/project | Context | 1-3ms | +1 to +3
14 | Information flow taint tracking | Tracks data provenance and blocks exfiltration of tainted data | Context | 0.1-0.5ms | +3 to +5
15 | Session containment | Detects read-then-exfiltrate patterns within a session window | Context | 0.1-0.5ms | +2 to +4
16 | Rate limiting / anomaly detection | Burst detection, frequency caps, statistical anomalies | Context | <0.1ms | +1 to +3
17 | Egress rate monitoring | Tracks outbound request volume, unique destinations, and port scanning | Context | <0.1ms | +1 to +3
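
To make the "regex + entropy" idea behind filter 7 concrete, here is a small sketch. It uses one well-known pattern (the `AKIA` prefix of AWS access key IDs) plus a Shannon-entropy check on long tokens. The 4.0 bits-per-character threshold, the 20-character minimum, and all function names are assumptions for illustration; the actual rule set is not shown in the source.

```python
import math
import re
from collections import Counter

# One illustrative pattern; the page says the real scanner has 1600+.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
TOKEN = re.compile(r"[A-Za-z0-9+/=_-]{20,}")  # long runs that could be secrets


def shannon_entropy(s):
    """Bits per character of the string's empirical character distribution."""
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())


def looks_like_secret(text, min_entropy=4.0):
    """True on a known key pattern or on any long, high-entropy token."""
    if AWS_KEY_ID.search(text):
        return True
    return any(shannon_entropy(t) >= min_entropy for t in TOKEN.findall(text))
```

The regex catches secrets with a recognisable shape, while the entropy check is a fallback for random-looking strings that match no known pattern.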

Filter latency ranges

Chart: latency range per filter, for filters 1 to 17 (x-axis: 0ms to 12ms)

AI agent kill chain detection

Grith maps agent behaviour to a 7-phase kill chain adapted from MITRE ATT&CK. Most threats are intercepted in the earliest phases.

1 RECON: Directory enumeration, Config file probing, Env var access
2 WEAPONIZE: Dependency injection, Script generation, Payload crafting
3 DELIVERY: curl | sh patterns, pip install from URL, Download + execute
4 EXPLOIT: Privilege escalation, Interception evasion, Permission creep
5 INSTALL: Cron job creation, Service registration, File persistence
6 C2: Outbound to unknown IPs, DNS tunneling, Encoded payloads
7 EXFIL: Sensitive file reads, Bulk data transfer, Secret exposure
Intercepted early
Intercepted mid-chain
Intercepted late
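
As one concrete example from the DELIVERY phase, the sketch below flags a `curl | sh` style command by tokenising it with Python's standard `shlex` module and checking whether a downloader is piped straight into a shell. The function name and the two word lists are hypothetical, and it is not a full shell parser: it ignores `sudo`, subshells, and process substitution.

```python
import os
import shlex

DOWNLOADERS = {"curl", "wget"}
SHELLS = {"sh", "bash", "zsh"}
CONTROL = {"|", "||", "&&", ";", "&"}  # tokens that start a new command


def is_download_and_execute(command):
    """True if a downloader's output is piped directly into a shell."""
    try:
        tokens = list(shlex.shlex(command, posix=True, punctuation_chars=True))
    except ValueError:
        return True  # unparseable (e.g. unbalanced quotes): treat as suspicious
    prev_head, via_pipe, expect_head = None, False, True
    for tok in tokens:
        if tok in CONTROL:
            via_pipe = tok == "|"  # only a single pipe links two commands
            expect_head = True
        elif expect_head:
            head = os.path.basename(tok)  # so /bin/sh is recognised as sh
            if via_pipe and prev_head in DOWNLOADERS and head in SHELLS:
                return True
            prev_head, expect_head = head, False
    return False
```

Parsing the command structure instead of substring matching avoids false positives such as `cat notes.txt | grep sh`.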

Expected score distribution

1,000 simulated tool calls across a typical development session. The vast majority of operations are routine and pass automatically.

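Histogram: number of calls (y-axis, 0 to 200) by composite score (x-axis, 0 to 15), divided into ALLOW, QUEUE, and DENY regions. Annotated examples: SSH key access: 8.5; curl | sh: 12.
91% Auto-allowed
7% Queued for review
2% Auto-denied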

Designed to support your compliance workflows

Every tool call produces structured audit evidence. Map it to the framework your auditors require. Grith provides the data — not the certification.

Coverage chart (25% to 100%): Logging, Access Control, Monitoring, Incident Response, Data Protection, Transparency

SOC 2 Type II

Audit evidence for trust services criteria

NIST AI RMF

Structured data for all 4 functions

EU AI Act

Articles 9, 12, 13, 14, 15, 72

ISO: AI management system
OWASP: 6/10 risks covered
FedRAMP: ConMon-compatible data export (planned)
HIPAA: PHI access logging
PCI-DSS: Audit trail compliance
ITAR: 5-year retention

Secure your AI agents

Independent syscall-level enforcement for every AI coding tool.

Responsible Disclosure

We take security vulnerabilities seriously. If you discover a security issue in grith, we ask that you disclose it responsibly.

  • Email security@grith.ai with details of the vulnerability.
  • Include steps to reproduce, affected versions, and potential impact.
  • Allow up to 90 days for us to investigate and patch before public disclosure.
  • We will acknowledge your report within 48 hours and provide regular updates on remediation progress.

Security Contact

For security-related communications, contact us at security@grith.ai. For PGP-encrypted communications, our public key is available upon request.

Security Advisories

Security advisories will be published on this page and in the project's GitHub repository. Subscribe to the repository releases to receive notifications.

For detailed architecture documentation, see the documentation site.

© 2026 grith. All rights reserved.

Product names and logos are trademarks of their respective owners. Their use indicates compatibility, not endorsement.