security

package
v0.1.0
Published: Feb 24, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

Package security provides prompt injection defense through trust-tagged content blocks, tiered verification, and cryptographic audit trails.

Constants

const EntropyThreshold = 4.8

EntropyThreshold is the threshold, in bits per byte, above which content is considered potentially encoded. Base64 has a ceiling of 6 bits/byte (a 64-symbol alphabet), and short segments typically measure around 5.0-5.5, so the threshold is set below that range to catch them.

Variables

var HighRiskTools = map[string]bool{
	"bash":        true,
	"write":       true,
	"web_fetch":   true,
	"spawn_agent": true,
}

HighRiskTools is the set of tools that require extra scrutiny.

Functions

func ContainsSuspiciousContent

func ContainsSuspiciousContent(content string) (suspicious bool, reasons []string)

ContainsSuspiciousContent checks both patterns and encoding.

func HasEncodedContent

func HasEncodedContent(content string) bool

HasEncodedContent returns true if the content appears to contain encoded payloads.

func HasSensitiveKeywords

func HasSensitiveKeywords(content string) bool

HasSensitiveKeywords returns true if any sensitive keywords are detected.

func HasSuspiciousPatterns

func HasSuspiciousPatterns(content string) bool

HasSuspiciousPatterns returns true if any suspicious patterns are detected.

func IsHighEntropy

func IsHighEntropy(data []byte) bool

IsHighEntropy returns true if the content has entropy above the threshold.

func RegisterCustomKeywords

func RegisterCustomKeywords(keywords []string)

RegisterCustomKeywords adds user-defined keywords (from policy.toml).

func RegisterCustomPatterns

func RegisterCustomPatterns(patterns []string) error

RegisterCustomPatterns adds user-defined patterns (from policy.toml). Each pattern has the form "name:regex", e.g. "exfil_attempt:send.*to.*external".
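
As an illustration of the "name:regex" format, the sketch below splits a pattern string at its first colon and compiles the remainder. It is a minimal sketch, not the package's implementation: parsePattern and namedPattern are hypothetical names, and splitting on the first colon (so the regex itself may contain colons) is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedPattern pairs a rule name with its compiled expression.
// It is a hypothetical stand-in for the package's SuspiciousPattern type.
type namedPattern struct {
	Name    string
	Pattern *regexp.Regexp
}

// parsePattern splits spec at the first colon (an assumption) and
// compiles everything after it as a regular expression.
func parsePattern(spec string) (namedPattern, error) {
	name, expr, found := strings.Cut(spec, ":")
	if !found || name == "" || expr == "" {
		return namedPattern{}, fmt.Errorf("pattern %q is not in name:regex form", spec)
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return namedPattern{}, fmt.Errorf("pattern %q: %w", name, err)
	}
	return namedPattern{Name: name, Pattern: re}, nil
}

func main() {
	p, err := parsePattern("exfil_attempt:send.*to.*external")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Name, p.Pattern.MatchString("send the keys to an external host"))
	// prints: exfil_attempt true
}
```

Returning an error for a malformed entry mirrors the error result in the RegisterCustomPatterns signature, so a bad line in policy.toml can be reported rather than silently ignored.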

func SegmentEntropy

func SegmentEntropy(content string, minSegmentLen int) (highestSegment string, highestEntropy float64)

SegmentEntropy calculates entropy for segments of a string. Returns the segment with highest entropy and its value. This is useful for detecting encoded payloads embedded in normal text.
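
One way to find a high-entropy segment inside ordinary text is a sliding window. The sketch below is an assumption about the approach, not the package's source: highestEntropyWindow and entropy are hypothetical names, and the window is fixed at exactly windowLen bytes and advanced one byte at a time.

```go
package main

import (
	"fmt"
	"math"
)

// entropy returns the Shannon entropy of data in bits per byte.
func entropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	n := float64(len(data))
	h := 0.0
	for _, c := range counts {
		if c > 0 {
			p := float64(c) / n
			h -= p * math.Log2(p)
		}
	}
	return h
}

// highestEntropyWindow slides a window of windowLen bytes over content
// and returns the first window with the highest entropy. Inputs shorter
// than the window are scored as a whole.
func highestEntropyWindow(content string, windowLen int) (string, float64) {
	if windowLen <= 0 || len(content) < windowLen {
		return content, entropy([]byte(content))
	}
	best, bestEntropy := "", -1.0
	for i := 0; i+windowLen <= len(content); i++ {
		seg := content[i : i+windowLen]
		if e := entropy([]byte(seg)); e > bestEntropy {
			best, bestEntropy = seg, e
		}
	}
	return best, bestEntropy
}

func main() {
	text := "Please review the notes. aWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnM= Thanks."
	seg, e := highestEntropyWindow(text, 32)
	fmt.Printf("%q %.2f bits/byte\n", seg, e)
}
```

A window of n bytes can never score above log2(n) bits per byte, so very short windows cannot reach a threshold such as 4.8; the window length needs to be chosen with the threshold in mind.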

func ShannonEntropy

func ShannonEntropy(data []byte) float64

ShannonEntropy calculates the Shannon entropy of a byte slice in bits per byte. Higher entropy indicates more randomness/compression/encoding.

Typical values:

  • English text: 3.0 - 4.5 bits/byte
  • Source code: 4.0 - 5.0 bits/byte
  • Base64 encoded: 5.5 - 6.0 bits/byte
  • Compressed/random: 7.5 - 8.0 bits/byte
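
The calculation described above can be sketched as follows. This is a minimal illustration of the standard Shannon formula over byte frequencies, not the package's source; shannonEntropy is a name local to the example.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the entropy of data in bits per byte:
// the sum over all byte values of -p*log2(p), where p is the
// relative frequency of that byte value in data.
func shannonEntropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	n := float64(len(data))
	entropy := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		entropy -= p * math.Log2(p)
	}
	return entropy
}

func main() {
	fmt.Printf("%.2f\n", shannonEntropy([]byte("aaaaaaaa"))) // 0.00
	fmt.Printf("%.2f\n", shannonEntropy([]byte("abababab"))) // 1.00
	fmt.Printf("%.2f\n", shannonEntropy([]byte("the quick brown fox jumps over the lazy dog")))
}
```

A single repeated byte carries no information (0 bits/byte), and two equally likely bytes carry exactly 1 bit/byte; the upper bound of 8 is reached only when all 256 byte values are equally frequent.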

func VerifyRecord

func VerifyRecord(record *SecurityRecord, publicKeyBase64 string) (bool, error)

VerifyRecord verifies a security record's signature.
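
The sketch below shows what verification against a base64-encoded Ed25519 public key can look like, using only the standard library. Several details are assumptions rather than documented behavior: the record type is a trimmed, hypothetical stand-in for SecurityRecord, the signed payload is taken to be the JSON encoding of the record with Signature cleared, and the signature is taken to be standard base64.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// record is a trimmed, hypothetical stand-in for SecurityRecord.
type record struct {
	BlockID     string `json:"block_id"`
	ContentHash string `json:"content_hash"`
	Tier1Result string `json:"tier1_result"`
	Signature   string `json:"signature,omitempty"`
}

// payload returns the bytes covered by the signature. Signing the JSON
// form with Signature cleared is an assumption made for this sketch.
func payload(r record) ([]byte, error) {
	r.Signature = ""
	return json.Marshal(r)
}

// newKeypair returns a base64 public key and the matching private key.
func newKeypair() (string, ed25519.PrivateKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return "", nil, err
	}
	return base64.StdEncoding.EncodeToString(pub), priv, nil
}

// sign stores a base64 signature over the record's payload.
func sign(r *record, priv ed25519.PrivateKey) error {
	msg, err := payload(*r)
	if err != nil {
		return err
	}
	r.Signature = base64.StdEncoding.EncodeToString(ed25519.Sign(priv, msg))
	return nil
}

// verify checks the record's signature against a base64 public key.
func verify(r record, publicKeyBase64 string) (bool, error) {
	pub, err := base64.StdEncoding.DecodeString(publicKeyBase64)
	if err != nil {
		return false, fmt.Errorf("decode public key: %w", err)
	}
	if len(pub) != ed25519.PublicKeySize {
		return false, fmt.Errorf("public key has %d bytes, want %d", len(pub), ed25519.PublicKeySize)
	}
	sig, err := base64.StdEncoding.DecodeString(r.Signature)
	if err != nil {
		return false, fmt.Errorf("decode signature: %w", err)
	}
	msg, err := payload(r)
	if err != nil {
		return false, err
	}
	return ed25519.Verify(ed25519.PublicKey(pub), msg, sig), nil
}

func main() {
	pub, priv, err := newKeypair()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	r := record{BlockID: "b1", ContentHash: "abc", Tier1Result: "pass"}
	if err := sign(&r, priv); err != nil {
		fmt.Println("error:", err)
		return
	}
	ok, err := verify(r, pub)
	fmt.Println(ok, err) // prints: true <nil>
}
```

The explicit length check matters because ed25519.Verify panics when given a public key of the wrong size.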

Types

type AuditTrail

type AuditTrail struct {
	// contains filtered or unexported fields
}

AuditTrail manages cryptographic signing of security decisions.

func NewAuditTrail

func NewAuditTrail(sessionID string) (*AuditTrail, error)

NewAuditTrail creates a new audit trail with a fresh Ed25519 keypair.

func (*AuditTrail) Destroy

func (a *AuditTrail) Destroy()

Destroy zeros out the private key from memory. Call this when the session ends.
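
Zeroing a key in Go can be sketched as overwriting the backing bytes in place. The signer type and its methods below are hypothetical and only illustrate the idea; the package's own fields are unexported and may differ.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// signer is a hypothetical holder for a session's private key.
type signer struct {
	priv ed25519.PrivateKey
}

// newSigner generates a fresh Ed25519 key for one session.
func newSigner() (*signer, error) {
	_, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &signer{priv: priv}, nil
}

// destroy overwrites the key bytes in place, then drops the reference.
func (s *signer) destroy() {
	for i := range s.priv {
		s.priv[i] = 0
	}
	s.priv = nil
}

// allZero reports whether every byte of b is zero.
func allZero(b []byte) bool {
	for _, v := range b {
		if v != 0 {
			return false
		}
	}
	return true
}

func main() {
	s, err := newSigner()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	key := s.priv // alias to the same backing array, kept to observe the wipe
	s.destroy()
	fmt.Println(allZero(key)) // prints: true
}
```

This is best effort: it clears the slice the signer holds, but it cannot reach copies of the key made elsewhere in the program.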

func (*AuditTrail) ExportLog

func (a *AuditTrail) ExportLog(securityMode string) *SessionLog

ExportLog exports the audit trail as a session log.

func (*AuditTrail) PublicKey

func (a *AuditTrail) PublicKey() string

PublicKey returns the base64-encoded public key for verification.

func (*AuditTrail) RecordDecision

func (a *AuditTrail) RecordDecision(block *Block, tier1, tier2, tier3 string) *SecurityRecord

RecordDecision creates and signs a security decision record.

func (*AuditTrail) Records

func (a *AuditTrail) Records() []*SecurityRecord

Records returns all recorded security decisions.

func (*AuditTrail) SessionID

func (a *AuditTrail) SessionID() string

SessionID returns the session identifier.

type Block

type Block struct {
	// ID is a unique identifier for taint tracking.
	ID string `json:"id"`

	// Trust indicates who created the content.
	Trust TrustLevel `json:"trust"`

	// Type indicates how to interpret the content.
	Type BlockType `json:"type"`

	// Mutable indicates whether later content can override this block.
	// Immutable blocks have precedence immunity.
	Mutable bool `json:"mutable"`

	// Content is the actual text content.
	Content string `json:"content"`

	// ContentHash is the SHA-256 hash of the content, used for de-duplication.
	ContentHash string `json:"content_hash,omitempty"`

	// Source describes where the content came from (for debugging).
	Source string `json:"source,omitempty"`

	// AgentContext identifies which agent/sub-agent created this block.
	// Used to filter blocks during security checks in multi-agent scenarios.
	AgentContext string `json:"agent_context,omitempty"`

	// TaintedBy lists IDs of blocks that influenced this block.
	TaintedBy []string `json:"tainted_by,omitempty"`

	// CreatedAtSeq is the session event sequence when this block was created.
	// Used to correlate blocks with session events in forensic analysis.
	CreatedAtSeq uint64 `json:"created_at_seq,omitempty"`

	// DedupeHit indicates that this block was reused from an earlier block with identical content.
	DedupeHit bool `json:"dedupe_hit,omitempty"`
}

Block represents a piece of content with security metadata.

func NewBlock

func NewBlock(id string, trust TrustLevel, typ BlockType, mutable bool, content, source string) *Block

NewBlock creates a new block with the given properties. It enforces security invariants:

  • Untrusted content is always type=data
  • Untrusted content is always mutable=true

func (*Block) CanOverride

func (b *Block) CanOverride(other *Block) bool

CanOverride returns true if this block can override the other block. Higher trust + immutable beats lower trust + mutable.

func (*Block) IsData

func (b *Block) IsData() bool

IsData returns true if this block contains data only.

func (*Block) IsImmutable

func (b *Block) IsImmutable() bool

IsImmutable returns true if this block has precedence immunity.

func (*Block) IsInstruction

func (b *Block) IsInstruction() bool

IsInstruction returns true if this block contains executable instructions.

type BlockType

type BlockType string

BlockType represents how content should be interpreted.

const (
	// TypeInstruction means content contains executable instructions.
	TypeInstruction BlockType = "instruction"
	// TypeData means content is data only, never to be interpreted as instructions.
	TypeData BlockType = "data"
)

type Config

type Config struct {
	// Mode is the security mode (default, paranoid, or research).
	Mode Mode

	// ResearchScope describes the authorized scope for research mode.
	// Required when Mode is ModeResearch.
	ResearchScope string

	// UserTrust is the trust level for user messages.
	UserTrust TrustLevel

	// TriageProvider is the LLM provider for Tier 2 triage (cheap/fast model).
	TriageProvider llm.Provider

	// SupervisorProvider is the LLM provider for Tier 3 supervision (capable model).
	SupervisorProvider llm.Provider

	// Logger for security events.
	Logger *logging.Logger
}

Config holds security verifier configuration.

type EncodingDetection

type EncodingDetection struct {
	Detected bool
	Type     EncodingType
	Segment  string
	Entropy  float64
}

EncodingDetection holds the result of encoding detection.

func DetectEncoding

func DetectEncoding(content string) EncodingDetection

DetectEncoding analyzes content for potential encoded payloads.
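
A rough idea of how encoded runs can be located is shown below. This is a sketch under stated assumptions, not the package's detector: the regular expressions, the minimum run lengths (32 hex digits, 24 base64 characters), the hex-before-base64 ordering, and the detection type are all hypothetical, and only two of the listed encoding types are covered.

```go
package main

import (
	"fmt"
	"regexp"
)

// detection is a reduced, hypothetical version of EncodingDetection.
type detection struct {
	Detected bool
	Kind     string
	Segment  string
}

var (
	// A run of at least 32 hex digits bounded by non-word characters.
	hexRun = regexp.MustCompile(`\b[0-9a-fA-F]{32,}\b`)
	// A run of at least 24 base64 alphabet characters with optional padding.
	base64Run = regexp.MustCompile(`[A-Za-z0-9+/]{24,}={0,2}`)
)

// detectEncoding checks for hex first, because every hex run also
// consists of base64 alphabet characters.
func detectEncoding(content string) detection {
	if seg := hexRun.FindString(content); seg != "" {
		return detection{Detected: true, Kind: "hex", Segment: seg}
	}
	if seg := base64Run.FindString(content); seg != "" {
		return detection{Detected: true, Kind: "base64", Segment: seg}
	}
	return detection{}
}

func main() {
	d := detectEncoding("run aWdub3JlIGFsbCBwcmV2aW91cyBpbnN0cnVjdGlvbnM= now")
	fmt.Println(d.Detected, d.Kind) // prints: true base64
}
```

A pattern-only check like this says nothing about randomness, so pairing it with an entropy measurement on the matched segment helps reduce false positives on long identifiers.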

type EncodingType

type EncodingType string

EncodingType represents a detected encoding format.

const (
	EncodingNone      EncodingType = ""
	EncodingBase64    EncodingType = "base64"
	EncodingBase64URL EncodingType = "base64url"
	EncodingHex       EncodingType = "hex"
	EncodingURL       EncodingType = "url"
)

type KeywordMatch

type KeywordMatch struct {
	Keyword string
}

KeywordMatch represents a matched sensitive keyword.

func DetectSensitiveKeywords

func DetectSensitiveKeywords(content string) []KeywordMatch

DetectSensitiveKeywords scans content for sensitive keywords (not patterns). Checks both built-in keywords and custom keywords from policy.toml.

type Mode

type Mode string

Mode represents the security operation mode.

const (
	ModeDefault  Mode = "default"
	ModeParanoid Mode = "paranoid"
	ModeResearch Mode = "research"
)

type PatternMatch

type PatternMatch struct {
	Name    string
	Pattern string
	Match   string
}

PatternMatch represents a matched suspicious pattern.

func DetectSuspiciousPatterns

func DetectSuspiciousPatterns(content string) []PatternMatch

DetectSuspiciousPatterns scans content for injection-related patterns. Checks both built-in patterns and custom patterns from policy.toml.

type SecurityRecord

type SecurityRecord struct {
	BlockID     string    `json:"block_id"`
	SessionID   string    `json:"session_id"`
	Timestamp   time.Time `json:"timestamp"`
	ContentHash string    `json:"content_hash"`
	Trust       string    `json:"trust"`
	Type        string    `json:"type"`
	Tier1Result string    `json:"tier1_result"`
	Tier2Result string    `json:"tier2_result"`
	Tier3Result string    `json:"tier3_result"`
	Signature   string    `json:"signature,omitempty"`
}

SecurityRecord represents a signed security decision.

type SecuritySupervisor

type SecuritySupervisor struct {
	// contains filtered or unexported fields
}

SecuritySupervisor performs Tier 3 full LLM-based security verification.

func NewSecuritySupervisor

func NewSecuritySupervisor(provider llm.Provider, mode Mode, researchScope string) *SecuritySupervisor

NewSecuritySupervisor creates a new security supervisor.

func (*SecuritySupervisor) Evaluate

Evaluate performs full security supervision on a tool call.

type SessionLog

type SessionLog struct {
	SessionID       string            `json:"session_id"`
	StartedAt       time.Time         `json:"started_at"`
	PublicKey       string            `json:"public_key"`
	SecurityMode    string            `json:"security_mode"`
	SecurityRecords []*SecurityRecord `json:"security_records"`
}

SessionLog represents a complete session security log.

type SupervisionRequest

type SupervisionRequest struct {
	ToolName        string
	ToolArgs        map[string]interface{}
	UntrustedBlocks []*Block
	OriginalGoal    string
	Tier1Flags      []string
	Tier2Reason     string
}

SupervisionRequest contains the information for security supervision.

type SupervisionResult

type SupervisionResult struct {
	Verdict      Verdict
	Reason       string
	Correction   string
	LatencyMs    int64 // Time taken for supervisor LLM call
	InputTokens  int   // Input tokens used
	OutputTokens int   // Output tokens used
}

SupervisionResult contains the supervision verdict.

type SuspiciousPattern

type SuspiciousPattern struct {
	Name    string
	Pattern *regexp.Regexp
}

SuspiciousPattern represents a pattern that may indicate prompt injection.

type TaintLineageNode

type TaintLineageNode struct {
	BlockID   string              `json:"block_id"`
	Trust     TrustLevel          `json:"trust"`
	Source    string              `json:"source"`
	EventSeq  uint64              `json:"event_seq,omitempty"`
	Depth     int                 `json:"depth,omitempty"`
	TaintedBy []*TaintLineageNode `json:"tainted_by,omitempty"`
}

TaintLineageNode represents a node in the taint dependency tree. Exported for use by session package.

type Tier1Result

type Tier1Result struct {
	Pass          bool
	Reasons       []string
	SkipReason    string   // Why escalation was skipped (for forensic clarity)
	Block         *Block   // The primary untrusted block that triggered escalation
	RelatedBlocks []*Block // All blocks whose content is used in this tool call
}

Tier1Result holds the result of deterministic checks.

type Triage

type Triage struct {
	// contains filtered or unexported fields
}

Triage performs Tier 2 cheap model triage on a potentially suspicious action.

func NewTriage

func NewTriage(provider llm.Provider) *Triage

NewTriage creates a new triage instance with the given LLM provider.

func (*Triage) Evaluate

func (t *Triage) Evaluate(ctx context.Context, req TriageRequest) (*TriageResult, error)

Evaluate asks the cheap model whether the tool call appears influenced by injection.

func (*Triage) SetResearchScope

func (t *Triage) SetResearchScope(scope string)

SetResearchScope sets the security research scope for exception handling.

type TriageRequest

type TriageRequest struct {
	ToolName       string
	ToolArgs       map[string]interface{}
	UntrustedBlock *Block
}

TriageRequest contains the information needed for triage.

type TriageResult

type TriageResult struct {
	Suspicious   bool
	Reason       string
	LatencyMs    int64 // Time taken for triage LLM call
	InputTokens  int   // Input tokens used
	OutputTokens int   // Output tokens used
}

TriageResult contains the triage decision.

type TrustLevel

type TrustLevel string

TrustLevel represents the origin-based authenticity of content.

const (
	// TrustTrusted is for framework-generated content (system prompt, supervisor messages).
	TrustTrusted TrustLevel = "trusted"
	// TrustVetted is for human-authored content (Agentfile goals, signed packages).
	TrustVetted TrustLevel = "vetted"
	// TrustUntrusted is for external content (tool results, file reads, web fetches).
	TrustUntrusted TrustLevel = "untrusted"
)

func PropagatedTrust

func PropagatedTrust(a, b TrustLevel) TrustLevel

PropagatedTrust returns the trust level that results from combining content at trust levels a and b. The result is the lower (less trusted) of the two.
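
Taking the lower of two trust levels can be sketched with a numeric rank. The ordering trusted > vetted > untrusted is inferred from the constant descriptions and is an assumption, as is treating unknown values as untrusted; rank and propagatedTrust are local names.

```go
package main

import "fmt"

type trustLevel string

const (
	trustTrusted   trustLevel = "trusted"
	trustVetted    trustLevel = "vetted"
	trustUntrusted trustLevel = "untrusted"
)

// rank maps a trust level to a number; higher means more trusted.
// Unknown values share the lowest rank with untrusted.
func rank(t trustLevel) int {
	switch t {
	case trustTrusted:
		return 2
	case trustVetted:
		return 1
	default:
		return 0
	}
}

// propagatedTrust returns the less trusted of a and b, failing
// closed to untrusted when the lower value is unrecognized.
func propagatedTrust(a, b trustLevel) trustLevel {
	if rank(b) < rank(a) {
		a = b
	}
	if rank(a) == 0 {
		return trustUntrusted
	}
	return a
}

func main() {
	fmt.Println(propagatedTrust(trustTrusted, trustUntrusted)) // prints: untrusted
	fmt.Println(propagatedTrust(trustTrusted, trustVetted))    // prints: vetted
}
```

Taking the minimum means a single untrusted input is enough to mark any derived content as untrusted, which is the conservative direction for taint tracking.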

type Verdict

type Verdict string

Verdict is the security supervisor's decision.

const (
	VerdictAllow  Verdict = "ALLOW"
	VerdictDeny   Verdict = "DENY"
	VerdictModify Verdict = "MODIFY"
)

type VerificationResult

type VerificationResult struct {
	Allowed      bool
	ToolName     string
	DenyReason   string
	Modification string
	Tier1        *Tier1Result
	Tier2        *TriageResult
	Tier3        *SupervisionResult
	TaintLineage []*TaintLineageNode // Taint dependency tree for related blocks
}

VerificationResult holds the complete verification result.

type Verifier

type Verifier struct {
	// contains filtered or unexported fields
}

Verifier implements the tiered security verification pipeline.

func NewVerifier

func NewVerifier(cfg Config, sessionID string) (*Verifier, error)

NewVerifier creates a new security verifier.

func (*Verifier) AddBlock

func (v *Verifier) AddBlock(trust TrustLevel, typ BlockType, mutable bool, content, source string) *Block

AddBlock adds a content block to the context.

func (*Verifier) AddBlockWithContext

func (v *Verifier) AddBlockWithContext(trust TrustLevel, typ BlockType, mutable bool, content, source, agentContext string) *Block

AddBlockWithContext adds a content block with an agent context identifier.

func (*Verifier) AddBlockWithTaint

func (v *Verifier) AddBlockWithTaint(trust TrustLevel, typ BlockType, mutable bool, content, source, agentContext string, eventSeq uint64, taintedBy []string) *Block

AddBlockWithTaint adds a content block with explicit taint lineage. eventSeq is the session event sequence when this block was created. taintedBy lists IDs of blocks that influenced this block.

func (*Verifier) AuditTrail

func (v *Verifier) AuditTrail() *AuditTrail

AuditTrail returns the audit trail for export.

func (*Verifier) ClearContext

func (v *Verifier) ClearContext()

ClearContext removes all blocks from the context.

func (*Verifier) Destroy

func (v *Verifier) Destroy()

Destroy cleans up resources, including zeroing the private key.

func (*Verifier) GetBlock

func (v *Verifier) GetBlock(id string) *Block

GetBlock returns a block by ID (thread-safe).

func (*Verifier) GetCurrentUntrustedBlockIDs

func (v *Verifier) GetCurrentUntrustedBlockIDs() []string

GetCurrentUntrustedBlockIDs returns IDs of all untrusted blocks in context. Used to mark LLM responses as tainted by these blocks.

func (*Verifier) GetTaintLineage

func (v *Verifier) GetTaintLineage(blockID string) *TaintLineageNode

GetTaintLineage builds the full taint dependency tree for a block. Returns nil if the block is not found.
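
Building a dependency tree from TaintedBy IDs can be sketched as a recursive walk over a block index. The code below is illustrative only: the block and lineageNode types are reduced stand-ins, and skipping missing parents and breaking cycles are assumptions about reasonable behavior, not documented guarantees.

```go
package main

import (
	"fmt"
	"strings"
)

// block is a reduced, hypothetical version of Block.
type block struct {
	ID        string
	Source    string
	TaintedBy []string
}

// lineageNode is a reduced, hypothetical version of TaintLineageNode.
type lineageNode struct {
	BlockID   string
	Source    string
	Depth     int
	TaintedBy []*lineageNode
}

// buildLineage returns the taint tree rooted at id, or nil if id is unknown.
func buildLineage(blocks map[string]*block, id string) *lineageNode {
	return walk(blocks, id, 0, map[string]bool{})
}

// walk recurses through TaintedBy. onPath tracks the current ancestor
// chain so that a cycle ends the branch instead of recursing forever.
func walk(blocks map[string]*block, id string, depth int, onPath map[string]bool) *lineageNode {
	b, ok := blocks[id]
	if !ok || onPath[id] {
		return nil
	}
	onPath[id] = true
	defer delete(onPath, id)

	node := &lineageNode{BlockID: b.ID, Source: b.Source, Depth: depth}
	for _, parentID := range b.TaintedBy {
		if parent := walk(blocks, parentID, depth+1, onPath); parent != nil {
			node.TaintedBy = append(node.TaintedBy, parent)
		}
	}
	return node
}

// printTree writes one indented line per node.
func printTree(n *lineageNode) {
	fmt.Printf("%s%s (%s)\n", strings.Repeat("  ", n.Depth), n.BlockID, n.Source)
	for _, p := range n.TaintedBy {
		printTree(p)
	}
}

func main() {
	blocks := map[string]*block{
		"web":  {ID: "web", Source: "web_fetch"},
		"llm":  {ID: "llm", Source: "llm_response", TaintedBy: []string{"web"}},
		"tool": {ID: "tool", Source: "bash_args", TaintedBy: []string{"llm"}},
	}
	if root := buildLineage(blocks, "tool"); root != nil {
		printTree(root)
	}
}
```

The root is the block being inspected and each level of TaintedBy moves one step closer to the original external input.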

func (*Verifier) GetTaintLineageForBlocks

func (v *Verifier) GetTaintLineageForBlocks(blocks []*Block) []*TaintLineageNode

GetTaintLineageForBlocks builds lineage trees for multiple blocks.

func (*Verifier) VerifyToolCall

func (v *Verifier) VerifyToolCall(ctx context.Context, toolName string, args map[string]interface{}, originalGoal, agentContext string) (*VerificationResult, error)

VerifyToolCall runs the tiered verification pipeline for a tool call. agentContext filters blocks to only those from the same agent (empty = all blocks).
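
The general shape of a tiered pipeline can be sketched as escalating checks that stop at the first tier able to decide. The escalation rules below are assumptions made for illustration (tier 1 allows when it raises no flags, tier 2 allows when it finds nothing suspicious, tier 3 has the final say); the pipeline and outcome types and the stub tiers are hypothetical and do not reproduce the package's logic.

```go
package main

import "fmt"

// pipeline holds one function per tier. Tier 1 is deterministic,
// tiers 2 and 3 stand in for the cheap and capable model calls.
type pipeline struct {
	tier1 func(tool string, args map[string]interface{}) []string // returns flags
	tier2 func(tool string, flags []string) bool                  // reports suspicion
	tier3 func(tool string, flags []string) bool                  // reports final allow
}

// outcome records the decision and which tier made it.
type outcome struct {
	Allowed   bool
	DecidedBy string
}

// verify escalates only when the previous tier could not clear the call.
func (p pipeline) verify(tool string, args map[string]interface{}) outcome {
	flags := p.tier1(tool, args)
	if len(flags) == 0 {
		return outcome{Allowed: true, DecidedBy: "tier1"}
	}
	if !p.tier2(tool, flags) {
		return outcome{Allowed: true, DecidedBy: "tier2"}
	}
	return outcome{Allowed: p.tier3(tool, flags), DecidedBy: "tier3"}
}

// demoPipeline wires in arbitrary stub tiers for demonstration only.
func demoPipeline() pipeline {
	highRisk := map[string]bool{"bash": true, "write": true, "web_fetch": true, "spawn_agent": true}
	return pipeline{
		tier1: func(tool string, _ map[string]interface{}) []string {
			if highRisk[tool] {
				return []string{"high_risk_tool"}
			}
			return nil
		},
		tier2: func(_ string, flags []string) bool { return len(flags) > 0 },
		tier3: func(tool string, _ []string) bool { return tool != "bash" },
	}
}

func main() {
	p := demoPipeline()
	for _, tool := range []string{"read", "write", "bash"} {
		o := p.verify(tool, nil)
		fmt.Printf("%s allowed=%v by=%s\n", tool, o.Allowed, o.DecidedBy)
	}
	// prints:
	// read allowed=true by=tier1
	// write allowed=true by=tier3
	// bash allowed=false by=tier3
}
```

Ordering the tiers from cheapest to most expensive keeps the common case, a tool call with no flags, free of any model call.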
