Skip to content

Policy Engine & Hooks

The policy engine is the enforcement layer that sits between the LLM and the outside world. It intercepts every action at critical points in the data flow and makes deterministic ALLOW, BLOCK, or REDACT decisions. The LLM cannot bypass, modify, or influence these decisions.

Core Principle: Enforcement Below the LLM

+------------------------------------------------------------------+
|  LLM / Agent Reasoning Layer                                     |
|  - Can be manipulated via prompt injection                       |
|  - Does NOT make security decisions                              |
|  - Requests actions, does not execute them                       |
|  - Has ZERO authority                                            |
+------------------------------------------------------------------+
                            |
                            v  action request (structured)
+------------------------------------------------------------------+
|  POLICY ENFORCEMENT LAYER                                        |
|                                                                  |
|  - Pure code, deterministic                                      |
|  - Tracks taint labels on all data in context                    |
|  - Computes effective classification of any output               |
|  - Compares against channel/recipient classification             |
|  - BLOCKS or ALLOWS -- no LLM discretion                         |
|  - Logs all decisions for audit                                  |
|  - Cannot be prompt-injected                                     |
+------------------------------------------------------------------+
                            |
                            v  allowed actions only
+------------------------------------------------------------------+
|  Execution Layer (plugins, tool calls, messages)                 |
+------------------------------------------------------------------+

SECURITY

The LLM sits above the policy layer. It can be prompt-injected, jailbroken, or manipulated -- and it does not matter. The policy layer is pure code that runs below the LLM, examining structured action requests and making binary decisions based on classification rules. There is no pathway from LLM output to hook bypass.

Hook Types

Eight enforcement hooks intercept actions at every critical point in the data flow.

Hook Architecture

  External Input
        |
        v
  +--------------------------------------------+
  | HOOK: PRE_CONTEXT_INJECTION                |
  | - Validate sender identity                 |
  | - Assign classification to input           |
  | - Scan for injection patterns              |
  | - Create lineage record                    |
  +--------------------------------------------+
        |
        v
  LLM Context (agent processes input)
        |
        v
  +--------------------------------------------+
  | HOOK: PRE_TOOL_CALL                        |
  | - Validate tool is permitted               |
  | - Check user has permission for action     |
  | - Verify parameters within policy          |
  | - Rate limit check                         |
  +--------------------------------------------+
        |
        v
  Tool Execution
        |
        v
  +--------------------------------------------+
  | HOOK: POST_TOOL_RESPONSE                   |
  | - Classify returned data                   |
  | - Update session taint                     |
  | - Create/update lineage records            |
  | - Scan for sensitive patterns              |
  +--------------------------------------------+
        |
        v
  LLM Response Generation
        |
        v
  +--------------------------------------------+
  | HOOK: PRE_OUTPUT                           |
  | - Verify output classification <= target   |
  | - Final PII/sensitive data scan            |
  | - Record lineage destination               |
  | - BLOCK or ALLOW (deterministic)           |
  +--------------------------------------------+
        |
        v
  Output Channel

All Hook Types

HookTriggerKey ActionsFailure Mode
PRE_CONTEXT_INJECTIONExternal input enters contextClassify input, assign taint, create lineage, scan for injectionReject input
PRE_TOOL_CALLLLM requests tool executionPermission check, rate limit, parameter validationBlock tool call
POST_TOOL_RESPONSETool returns dataClassify response, update session taint, create/update lineageRedact or block
PRE_OUTPUTResponse about to leave systemFinal classification check against target, PII scanBlock output
SECRET_ACCESSPlugin requests a credentialLog access, verify permission against declared scopeDeny credential
SESSION_RESETUser requests taint resetArchive lineage, clear context, verify confirmationRequire confirmation
AGENT_INVOCATIONAgent calls another agentVerify delegation chain, enforce taint ceilingBlock invocation
MCP_TOOL_CALLMCP server tool invokedGateway policy check (server status, tool permissions, schema)Block MCP call

Hook Interface

Every hook receives a context and returns a result. The handler is a synchronous, pure function.

typescript
interface HookContext {
  readonly sessionId: SessionId;
  readonly hookType: HookType;
  readonly timestamp: Date;
  // Hook-specific payload varies by type
}

interface HookResult {
  readonly decision: "ALLOW" | "BLOCK" | "REDACT";
  readonly reason: string;
  readonly metadata?: Record<string, unknown>;
}

type HookHandler = (context: HookContext) => HookResult;

INFO

HookHandler is synchronous and returns HookResult directly -- not a Promise. This is by design. Hooks must complete before the action proceeds, and making them synchronous eliminates any possibility of async bypass. If a hook times out, the action is rejected.

Hook Guarantees

Every hook execution carries four invariants:

GuaranteeWhat it means
DeterministicSame input always produces the same decision. No randomness. No LLM calls within hooks. No external API calls that affect decisions.
SynchronousHooks complete before the action proceeds. No async bypass is possible. Timeout equals rejection.
LoggedEvery hook execution is recorded: input parameters, decision made, timestamp, and policy rules evaluated.
UnforgeableLLM output cannot contain hook bypass instructions. The hook layer has no "parse LLM output for commands" logic.

Policy Rules Hierarchy

Policy rules are organized into three tiers. Higher tiers cannot override lower tiers.

Fixed Rules (always enforced, NOT configurable)

These rules are hardcoded and cannot be disabled by any admin, user, or configuration:

  • No write-down: Classification flow is one-directional. Data cannot flow to a lower level.
  • UNTRUSTED channels: No data in or out. Period.
  • Session taint: Once elevated, stays elevated for the session lifetime.
  • Audit logging: All actions logged. No exceptions. No way to disable.

Configurable Rules (admin-tunable)

Administrators can adjust these through the UI or configuration files:

  • Integration default classifications (e.g., Salesforce defaults to CONFIDENTIAL)
  • Channel classifications
  • Action allow/deny lists per integration
  • Domain allowlists for external communications
  • Rate limits per tool, per user, or per session

Declarative Escape Hatch (enterprise)

Enterprise deployments can define custom policy rules in structured YAML for advanced scenarios:

yaml
# Block any Salesforce query containing SSN patterns
hook: POST_TOOL_RESPONSE
conditions:
  - tool_name: salesforce.*
  - content_matches: '\b\d{3}-\d{2}-\d{4}\b'
action: REDACT
redaction_pattern: '[SSN REDACTED]'
log_level: ALERT
notify: security-team@company.com
yaml
# Require approval for high-value transactions
hook: PRE_TOOL_CALL
conditions:
  - tool_name: stripe.create_charge
  - parameter.amount: '>10000'
action: REQUIRE_APPROVAL
approvers:
  - role: finance-admin
timeout: 1h
timeout_action: DENY
yaml
# Time-based restriction: no external sends after hours
hook: PRE_OUTPUT
conditions:
  - recipient_type: EXTERNAL
  - time_of_day: "18:00-08:00"
  - day_of_week: "Mon-Fri"
action: BLOCK
reason: "External communications restricted outside business hours"

TIP

Custom YAML rules must pass validation before activation. Invalid rules are rejected at configuration time, not at runtime. This prevents misconfiguration from creating security gaps.

Denial User Experience

When the policy engine blocks an action, the user sees a clear explanation -- not a generic error.

Default (specific):

I can't send confidential data to a public channel.

  -> Reset session and send message
  -> Cancel

Opt-in (educational):

I can't send confidential data to a public channel.

Why: This session accessed Salesforce (CONFIDENTIAL).
WhatsApp personal is classified as PUBLIC.
Data can only flow to equal or higher classification.

Options:
  -> Reset session and send message
  -> Ask your admin to reclassify the WhatsApp channel
  -> Learn more: [docs link]

The educational mode is opt-in and helps users understand why an action was blocked, including which data source caused the taint escalation and what the classification mismatch is. Both modes offer actionable next steps rather than dead-end errors.

How Hooks Chain Together

In a typical request/response cycle, multiple hooks fire in sequence. Each hook has full visibility into the decisions made by earlier hooks in the chain.

User sends: "Check my Salesforce pipeline and message my wife"

1. PRE_CONTEXT_INJECTION
   - Input from owner, classified as PUBLIC
   - Session taint: PUBLIC

2. PRE_TOOL_CALL (salesforce.query_opportunities)
   - Tool permitted? YES
   - User has Salesforce connection? YES
   - Rate limit? OK
   - Decision: ALLOW

3. POST_TOOL_RESPONSE (salesforce results)
   - Data classified: CONFIDENTIAL
   - Session taint escalates: PUBLIC -> CONFIDENTIAL
   - Lineage record created

4. PRE_TOOL_CALL (whatsapp.send_message)
   - Tool permitted? YES
   - Decision: ALLOW (tool-level check passes)

5. PRE_OUTPUT (message to wife via WhatsApp)
   - Session taint: CONFIDENTIAL
   - Target effective classification: PUBLIC (external recipient)
   - CONFIDENTIAL -> PUBLIC: BLOCKED
   - Decision: BLOCK
   - Reason: "classification_violation"

6. Agent presents reset option to user

Released under the MIT License.