Documentation ¶
Overview ¶
Package security provides prompt injection defense through trust-tagged content blocks, tiered verification, and cryptographic audit trails.
Index ¶
- Constants
- Variables
- func ContainsSuspiciousContent(content string) (suspicious bool, reasons []string)
- func HasEncodedContent(content string) bool
- func HasSensitiveKeywords(content string) bool
- func HasSuspiciousPatterns(content string) bool
- func IsHighEntropy(data []byte) bool
- func RegisterCustomKeywords(keywords []string)
- func RegisterCustomPatterns(patterns []string) error
- func SegmentEntropy(content string, minSegmentLen int) (highestSegment string, highestEntropy float64)
- func ShannonEntropy(data []byte) float64
- func VerifyRecord(record *SecurityRecord, publicKeyBase64 string) (bool, error)
- type AuditTrail
- func (a *AuditTrail) Destroy()
- func (a *AuditTrail) ExportLog(securityMode string) *SessionLog
- func (a *AuditTrail) PublicKey() string
- func (a *AuditTrail) RecordDecision(block *Block, tier1, tier2, tier3 string) *SecurityRecord
- func (a *AuditTrail) Records() []*SecurityRecord
- func (a *AuditTrail) SessionID() string
- type Block
- type BlockType
- type Config
- type EncodingDetection
- type EncodingType
- type KeywordMatch
- type Mode
- type PatternMatch
- type SecurityRecord
- type SecuritySupervisor
- type SessionLog
- type SupervisionRequest
- type SupervisionResult
- type SuspiciousPattern
- type TaintLineageNode
- type Tier1Result
- type Triage
- type TriageRequest
- type TriageResult
- type TrustLevel
- type Verdict
- type VerificationResult
- type Verifier
- func (v *Verifier) AddBlock(trust TrustLevel, typ BlockType, mutable bool, content, source string) *Block
- func (v *Verifier) AddBlockWithContext(trust TrustLevel, typ BlockType, mutable bool, ...) *Block
- func (v *Verifier) AddBlockWithTaint(trust TrustLevel, typ BlockType, mutable bool, ...) *Block
- func (v *Verifier) AuditTrail() *AuditTrail
- func (v *Verifier) ClearContext()
- func (v *Verifier) Destroy()
- func (v *Verifier) GetBlock(id string) *Block
- func (v *Verifier) GetCurrentUntrustedBlockIDs() []string
- func (v *Verifier) GetTaintLineage(blockID string) *TaintLineageNode
- func (v *Verifier) GetTaintLineageForBlocks(blocks []*Block) []*TaintLineageNode
- func (v *Verifier) VerifyToolCall(ctx context.Context, toolName string, args map[string]interface{}, ...) (*VerificationResult, error)
Constants ¶
const EntropyThreshold = 4.8
EntropyThreshold is the entropy, in bits per byte, above which content is considered potentially encoded. Base64 text typically measures above 5.0 bits/byte (its theoretical maximum is 6.0), so the threshold is set slightly lower to catch it.
Variables ¶
var HighRiskTools = map[string]bool{
    "bash":        true,
    "write":       true,
    "web_fetch":   true,
    "spawn_agent": true,
}
HighRiskTools is the set of tools that require extra scrutiny.
Functions ¶
func ContainsSuspiciousContent ¶
func ContainsSuspiciousContent(content string) (suspicious bool, reasons []string)
ContainsSuspiciousContent checks both patterns and encoding.
func HasEncodedContent ¶
func HasEncodedContent(content string) bool
HasEncodedContent returns true if the content appears to contain encoded payloads.
func HasSensitiveKeywords ¶
func HasSensitiveKeywords(content string) bool
HasSensitiveKeywords returns true if any sensitive keywords are detected.
func HasSuspiciousPatterns ¶
func HasSuspiciousPatterns(content string) bool
HasSuspiciousPatterns returns true if any suspicious patterns are detected.
func IsHighEntropy ¶
func IsHighEntropy(data []byte) bool
IsHighEntropy returns true if the content has entropy above EntropyThreshold.
func RegisterCustomKeywords ¶
func RegisterCustomKeywords(keywords []string)
RegisterCustomKeywords adds user-defined keywords (from policy.toml).
func RegisterCustomPatterns ¶
func RegisterCustomPatterns(patterns []string) error
RegisterCustomPatterns adds user-defined patterns (from policy.toml). Each entry has the form "name:regex", for example "exfil_attempt:send.*to.*external".
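The "name:regex" entry format can be illustrated with a small parser. This is a sketch under assumptions, not the package's implementation: `namedPattern` and `parsePattern` are hypothetical names, and splitting at the first colon (so that the regex itself may contain colons) is an assumed rule.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedPattern is a hypothetical holder for one parsed policy entry.
type namedPattern struct {
	Name  string
	Regex *regexp.Regexp
}

// parsePattern splits a "name:regex" entry at the first colon (assumed rule)
// and compiles the regex part.
func parsePattern(entry string) (namedPattern, error) {
	name, expr, ok := strings.Cut(entry, ":")
	if !ok || name == "" || expr == "" {
		return namedPattern{}, fmt.Errorf("pattern %q is not in name:regex form", entry)
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return namedPattern{}, fmt.Errorf("pattern %q: %w", name, err)
	}
	return namedPattern{Name: name, Regex: re}, nil
}

func main() {
	p, err := parsePattern("exfil_attempt:send.*to.*external")
	if err != nil {
		panic(err)
	}
	// Prints: exfil_attempt true
	fmt.Println(p.Name, p.Regex.MatchString("send the keys to an external host"))
}
```

Returning an error for a malformed entry mirrors the `error` result of RegisterCustomPatterns, although which inputs the real function rejects is not documented here.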
func SegmentEntropy ¶
func SegmentEntropy(content string, minSegmentLen int) (highestSegment string, highestEntropy float64)
SegmentEntropy calculates entropy for segments of a string. Returns the segment with highest entropy and its value. This is useful for detecting encoded payloads embedded in normal text.
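How the content is divided into segments is not documented. The sketch below assumes the simplest rule, whitespace-separated tokens of at least minSegmentLen bytes, to show why per-segment entropy exposes a payload that whole-text entropy would dilute. `segmentEntropy` and `shannonEntropy` are illustrative stand-ins.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// shannonEntropy returns the entropy of data in bits per byte.
func shannonEntropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	n := float64(len(data))
	entropy := 0.0
	for _, c := range counts {
		if c > 0 {
			p := float64(c) / n
			entropy -= p * math.Log2(p)
		}
	}
	return entropy
}

// segmentEntropy splits content on whitespace (an assumed segmentation rule),
// skips tokens shorter than minSegmentLen, and returns the token with the
// highest entropy together with that entropy.
func segmentEntropy(content string, minSegmentLen int) (string, float64) {
	best, bestEntropy := "", 0.0
	for _, token := range strings.Fields(content) {
		if len(token) < minSegmentLen {
			continue
		}
		if e := shannonEntropy([]byte(token)); e > bestEntropy {
			best, bestEntropy = token, e
		}
	}
	return best, bestEntropy
}

func main() {
	text := "please run this for me: aGVsbG8gd29ybGQsIHRoaXMgaXMgYSBzZWNyZXQ= thanks"
	seg, e := segmentEntropy(text, 16)
	fmt.Printf("segment=%q entropy=%.2f\n", seg, e)
}
```

A sliding window over the raw bytes would catch payloads that are not whitespace-delimited; that variant is equally plausible for the real function.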
func ShannonEntropy ¶
func ShannonEntropy(data []byte) float64
ShannonEntropy calculates the Shannon entropy of a byte slice in bits per byte. Higher entropy indicates more randomness, compression, or encoding.
Typical values:
- English text: 3.0 - 4.5 bits/byte
- Source code: 4.0 - 5.0 bits/byte
- Base64 encoded: 5.5 - 6.0 bits/byte
- Compressed/random: 7.5 - 8.0 bits/byte
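These ranges follow from the standard definition of Shannon entropy over byte frequencies, H = -Σ p(b) · log2 p(b). A minimal sketch of that calculation, written independently of the package: `shannonEntropy` and `isHighEntropy` are illustrative names, and the 4.8 constant is copied from EntropyThreshold.

```go
package main

import (
	"fmt"
	"math"
)

// entropyThreshold mirrors the documented EntropyThreshold value.
const entropyThreshold = 4.8

// shannonEntropy returns the entropy of data in bits per byte.
// Empty input is defined here as zero entropy.
func shannonEntropy(data []byte) float64 {
	if len(data) == 0 {
		return 0
	}
	var counts [256]int
	for _, b := range data {
		counts[b]++
	}
	n := float64(len(data))
	entropy := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		entropy -= p * math.Log2(p)
	}
	return entropy
}

// isHighEntropy reports whether data exceeds the threshold.
func isHighEntropy(data []byte) bool {
	return shannonEntropy(data) > entropyThreshold
}

func main() {
	fmt.Printf("%.2f\n", shannonEntropy([]byte("aaaaaaaa"))) // 0.00
	fmt.Printf("%.2f\n", shannonEntropy([]byte("abcdabcd"))) // 2.00
	fmt.Println(isHighEntropy([]byte("abcdabcd")))           // false
}
```

Note that a sample of n bytes cannot exceed log2(n) bits/byte, so very short inputs can never cross the threshold regardless of how they were encoded.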
func VerifyRecord ¶
func VerifyRecord(record *SecurityRecord, publicKeyBase64 string) (bool, error)
VerifyRecord verifies a security record's signature.
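The sign-then-verify round trip can be sketched with the standard library. Several details here are assumptions rather than documented behavior: the signed bytes are taken to be the record's JSON with Signature cleared, both key and signature use standard base64, and `record`, `newKeypair`, `sign`, and `verify` are hypothetical names (the `record` type is a trimmed stand-in for SecurityRecord).

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// record is a trimmed, hypothetical stand-in for SecurityRecord.
type record struct {
	BlockID     string `json:"block_id"`
	ContentHash string `json:"content_hash"`
	Tier1Result string `json:"tier1_result"`
	Signature   string `json:"signature,omitempty"`
}

// newKeypair returns a base64 public key and the matching private key.
func newKeypair() (string, ed25519.PrivateKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return "", nil, err
	}
	return base64.StdEncoding.EncodeToString(pub), priv, nil
}

// payload returns the bytes that get signed: the record's JSON with the
// Signature field cleared (an assumed canonical form).
func payload(r record) ([]byte, error) {
	r.Signature = ""
	return json.Marshal(r)
}

// sign stores a base64 Ed25519 signature of the payload in r.Signature.
func sign(r *record, priv ed25519.PrivateKey) error {
	msg, err := payload(*r)
	if err != nil {
		return err
	}
	r.Signature = base64.StdEncoding.EncodeToString(ed25519.Sign(priv, msg))
	return nil
}

// verify checks r.Signature against the base64-encoded public key.
func verify(r record, publicKeyBase64 string) (bool, error) {
	pub, err := base64.StdEncoding.DecodeString(publicKeyBase64)
	if err != nil {
		return false, fmt.Errorf("decode public key: %w", err)
	}
	if len(pub) != ed25519.PublicKeySize {
		return false, fmt.Errorf("public key has %d bytes, want %d", len(pub), ed25519.PublicKeySize)
	}
	sig, err := base64.StdEncoding.DecodeString(r.Signature)
	if err != nil {
		return false, fmt.Errorf("decode signature: %w", err)
	}
	msg, err := payload(r)
	if err != nil {
		return false, err
	}
	return ed25519.Verify(ed25519.PublicKey(pub), msg, sig), nil
}

func main() {
	pubB64, priv, err := newKeypair()
	if err != nil {
		panic(err)
	}
	rec := record{BlockID: "b0001", ContentHash: "abc123", Tier1Result: "pass"}
	if err := sign(&rec, priv); err != nil {
		panic(err)
	}
	ok, _ := verify(rec, pubB64)
	fmt.Println(ok) // true

	rec.Tier1Result = "fail" // tamper with a signed field
	ok, _ = verify(rec, pubB64)
	fmt.Println(ok) // false
}
```

The explicit length check matters because `ed25519.Verify` panics when the public key is not exactly `ed25519.PublicKeySize` bytes; returning an error instead matches the `(bool, error)` shape of VerifyRecord.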
Types ¶
type AuditTrail ¶
type AuditTrail struct {
// contains filtered or unexported fields
}
AuditTrail manages cryptographic signing of security decisions.
func NewAuditTrail ¶
func NewAuditTrail(sessionID string) (*AuditTrail, error)
NewAuditTrail creates a new audit trail with a fresh Ed25519 keypair.
func (*AuditTrail) Destroy ¶
func (a *AuditTrail) Destroy()
Destroy zeros out the private key from memory. Call this when the session ends.
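Zeroing key material is a best-effort measure in Go, since the runtime gives no guarantee that no other copy of the bytes exists. A minimal sketch of the idea, using a hypothetical `signer` type rather than the real AuditTrail:

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// signer is a hypothetical holder of per-session key material.
type signer struct {
	priv ed25519.PrivateKey
}

// newSigner generates a fresh Ed25519 key for the session.
func newSigner() (*signer, error) {
	_, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &signer{priv: priv}, nil
}

// destroy overwrites the private key bytes in place, then drops the
// reference so the signer can no longer sign.
func (s *signer) destroy() {
	for i := range s.priv {
		s.priv[i] = 0
	}
	s.priv = nil
}

func main() {
	s, err := newSigner()
	if err != nil {
		panic(err)
	}
	defer s.destroy() // tie key lifetime to the session scope
	fmt.Println(len(s.priv) == ed25519.PrivateKeySize)
}
```

Deferring the call right after construction keeps the key's lifetime tied to the session, which is the usage the Destroy documentation recommends.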
func (*AuditTrail) ExportLog ¶
func (a *AuditTrail) ExportLog(securityMode string) *SessionLog
ExportLog exports the audit trail as a session log.
func (*AuditTrail) PublicKey ¶
func (a *AuditTrail) PublicKey() string
PublicKey returns the base64-encoded public key for verification.
func (*AuditTrail) RecordDecision ¶
func (a *AuditTrail) RecordDecision(block *Block, tier1, tier2, tier3 string) *SecurityRecord
RecordDecision creates and signs a security decision record.
func (*AuditTrail) Records ¶
func (a *AuditTrail) Records() []*SecurityRecord
Records returns all recorded security decisions.
func (*AuditTrail) SessionID ¶
func (a *AuditTrail) SessionID() string
SessionID returns the session identifier.
type Block ¶
type Block struct {
// ID is a unique identifier for taint tracking.
ID string `json:"id"`
// Trust indicates who created the content.
Trust TrustLevel `json:"trust"`
// Type indicates how to interpret the content.
Type BlockType `json:"type"`
// Mutable indicates whether later content can override this block.
// Immutable blocks have precedence immunity.
Mutable bool `json:"mutable"`
// Content is the actual text content.
Content string `json:"content"`
// ContentHash is the SHA-256 hash of Content, used for de-duplication.
ContentHash string `json:"content_hash,omitempty"`
// Source describes where the content came from (for debugging).
Source string `json:"source,omitempty"`
// AgentContext identifies which agent/sub-agent created this block.
// Used to filter blocks during security checks in multi-agent scenarios.
AgentContext string `json:"agent_context,omitempty"`
// TaintedBy lists IDs of blocks that influenced this block.
TaintedBy []string `json:"tainted_by,omitempty"`
// CreatedAtSeq is the session event sequence when this block was created.
// Used to correlate blocks with session events in forensic analysis.
CreatedAtSeq uint64 `json:"created_at_seq,omitempty"`
// DedupeHit indicates this block was reused because identical content was seen earlier.
DedupeHit bool `json:"dedupe_hit,omitempty"`
}
Block represents a piece of content with security metadata.
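The ContentHash and DedupeHit fields suggest hash-based reuse of identical content. The sketch below shows one way that could work. It is an assumption-laden illustration: the hex encoding, the `b0001`-style IDs, and the `store` type are all hypothetical, and the real Verifier may handle duplicates differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// block is a trimmed, hypothetical stand-in for Block.
type block struct {
	ID          string
	Content     string
	ContentHash string
	DedupeHit   bool
}

// store indexes blocks by content hash.
type store struct {
	byHash map[string]*block
	nextID int
}

func newStore() *store {
	return &store{byHash: map[string]*block{}}
}

// contentHash returns the hex-encoded SHA-256 digest of content.
func contentHash(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// add returns the existing block when identical content was seen before,
// marking it as a dedupe hit; otherwise it stores and returns a new block.
func (s *store) add(content string) *block {
	h := contentHash(content)
	if existing, ok := s.byHash[h]; ok {
		existing.DedupeHit = true
		return existing
	}
	s.nextID++
	b := &block{ID: fmt.Sprintf("b%04d", s.nextID), Content: content, ContentHash: h}
	s.byHash[h] = b
	return b
}

func main() {
	s := newStore()
	first := s.add("tool output")
	second := s.add("tool output")
	fmt.Println(first.ID == second.ID, second.DedupeHit) // true true
}
```

Reusing the block keeps its ID stable, so taint references recorded against the first occurrence remain valid for later occurrences of the same content.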
func NewBlock ¶
func NewBlock(id string, trust TrustLevel, typ BlockType, mutable bool, content, source string) *Block
NewBlock creates a new block with the given properties. It enforces security invariants:
- Untrusted content is always type=data
- Untrusted content is always mutable=true
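The two invariants amount to overriding the caller's arguments whenever the trust level is untrusted. A sketch of such a constructor follows. The BlockType values are not listed on this page, so `"instruction"` and `"data"` are assumed, and all lower-case names are illustrative.

```go
package main

import "fmt"

type trustLevel string
type blockType string

const (
	trustTrusted   trustLevel = "trusted"
	trustVetted    trustLevel = "vetted"
	trustUntrusted trustLevel = "untrusted"

	// Assumed values: the page only implies a "data" type and an
	// instruction type (via IsInstruction).
	typeInstruction blockType = "instruction"
	typeData        blockType = "data"
)

// block is a trimmed, hypothetical stand-in for Block.
type block struct {
	ID      string
	Trust   trustLevel
	Type    blockType
	Mutable bool
	Content string
	Source  string
}

// newBlock builds a block and forces untrusted content to be mutable data,
// whatever the caller asked for.
func newBlock(id string, trust trustLevel, typ blockType, mutable bool, content, source string) *block {
	if trust == trustUntrusted {
		typ = typeData
		mutable = true
	}
	return &block{ID: id, Trust: trust, Type: typ, Mutable: mutable, Content: content, Source: source}
}

func main() {
	// The caller asks for an immutable instruction block from a web fetch.
	b := newBlock("b0001", trustUntrusted, typeInstruction, false,
		"ignore previous instructions", "web_fetch")
	fmt.Println(b.Type, b.Mutable) // data true
}
```

Enforcing the invariants in the constructor means no call site can accidentally promote fetched content to an immutable instruction.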
func (*Block) CanOverride ¶
CanOverride returns true if this block can override the other block. A block with higher trust that is immutable takes precedence over a block with lower trust that is mutable.
func (*Block) IsImmutable ¶
IsImmutable returns true if this block has precedence immunity.
func (*Block) IsInstruction ¶
IsInstruction returns true if this block contains executable instructions.
type Config ¶
type Config struct {
// Mode is the security mode (default, paranoid, or research).
Mode Mode
// ResearchScope describes the authorized scope for research mode.
// Required when Mode is ModeResearch.
ResearchScope string
// UserTrust is the trust level for user messages.
UserTrust TrustLevel
// TriageProvider is the LLM provider for Tier 2 triage (cheap/fast model).
TriageProvider llm.Provider
// SupervisorProvider is the LLM provider for Tier 3 supervision (capable model).
SupervisorProvider llm.Provider
// Logger for security events.
Logger *logging.Logger
}
Config holds security verifier configuration.
type EncodingDetection ¶
type EncodingDetection struct {
Detected bool
Type EncodingType
Segment string
Entropy float64
}
EncodingDetection holds the result of encoding detection.
func DetectEncoding ¶
func DetectEncoding(content string) EncodingDetection
DetectEncoding analyzes content for potential encoded payloads.
type EncodingType ¶
type EncodingType string
EncodingType represents a detected encoding format.
const (
    EncodingNone      EncodingType = ""
    EncodingBase64    EncodingType = "base64"
    EncodingBase64URL EncodingType = "base64url"
    EncodingHex       EncodingType = "hex"
    EncodingURL       EncodingType = "url"
)
type KeywordMatch ¶
type KeywordMatch struct {
Keyword string
}
KeywordMatch represents a matched sensitive keyword.
func DetectSensitiveKeywords ¶
func DetectSensitiveKeywords(content string) []KeywordMatch
DetectSensitiveKeywords scans content for sensitive keywords (not patterns). Checks both built-in keywords and custom keywords from policy.toml.
type PatternMatch ¶
PatternMatch represents a matched suspicious pattern.
func DetectSuspiciousPatterns ¶
func DetectSuspiciousPatterns(content string) []PatternMatch
DetectSuspiciousPatterns scans content for injection-related patterns. Checks both built-in patterns and custom patterns from policy.toml.
type SecurityRecord ¶
type SecurityRecord struct {
BlockID string `json:"block_id"`
SessionID string `json:"session_id"`
Timestamp time.Time `json:"timestamp"`
ContentHash string `json:"content_hash"`
Trust string `json:"trust"`
Type string `json:"type"`
Tier1Result string `json:"tier1_result"`
Tier2Result string `json:"tier2_result"`
Tier3Result string `json:"tier3_result"`
Signature string `json:"signature,omitempty"`
}
SecurityRecord represents a signed security decision.
type SecuritySupervisor ¶
type SecuritySupervisor struct {
// contains filtered or unexported fields
}
SecuritySupervisor performs Tier 3 full LLM-based security verification.
func NewSecuritySupervisor ¶
func NewSecuritySupervisor(provider llm.Provider, mode Mode, researchScope string) *SecuritySupervisor
NewSecuritySupervisor creates a new security supervisor.
func (*SecuritySupervisor) Evaluate ¶
func (s *SecuritySupervisor) Evaluate(ctx context.Context, req SupervisionRequest) (*SupervisionResult, error)
Evaluate performs full security supervision on a tool call.
type SessionLog ¶
type SessionLog struct {
SessionID string `json:"session_id"`
StartedAt time.Time `json:"started_at"`
PublicKey string `json:"public_key"`
SecurityMode string `json:"security_mode"`
SecurityRecords []*SecurityRecord `json:"security_records"`
}
SessionLog represents a complete session security log.
type SupervisionRequest ¶
type SupervisionRequest struct {
ToolName string
ToolArgs map[string]interface{}
UntrustedBlocks []*Block
OriginalGoal string
Tier1Flags []string
Tier2Reason string
}
SupervisionRequest contains the information for security supervision.
type SupervisionResult ¶
type SupervisionResult struct {
Verdict Verdict
Reason string
Correction string
LatencyMs int64 // Time taken for supervisor LLM call
InputTokens int // Input tokens used
OutputTokens int // Output tokens used
}
SupervisionResult contains the supervision verdict.
type SuspiciousPattern ¶
SuspiciousPattern represents a pattern that may indicate prompt injection.
type TaintLineageNode ¶
type TaintLineageNode struct {
BlockID string `json:"block_id"`
Trust TrustLevel `json:"trust"`
Source string `json:"source"`
EventSeq uint64 `json:"event_seq,omitempty"`
Depth int `json:"depth,omitempty"`
TaintedBy []*TaintLineageNode `json:"tainted_by,omitempty"`
}
TaintLineageNode represents a node in the taint dependency tree. It is exported for use by the session package.
type Tier1Result ¶
type Tier1Result struct {
Pass bool
Reasons []string
SkipReason string // Why escalation was skipped (for forensic clarity)
Block *Block // The primary untrusted block that triggered escalation
RelatedBlocks []*Block // All blocks whose content is used in this tool call
}
Tier1Result holds the result of deterministic checks.
type Triage ¶
type Triage struct {
// contains filtered or unexported fields
}
Triage performs Tier 2 cheap model triage on a potentially suspicious action.
func (*Triage) Evaluate ¶
func (t *Triage) Evaluate(ctx context.Context, req TriageRequest) (*TriageResult, error)
Evaluate asks the cheap model whether the tool call appears influenced by injection.
func (*Triage) SetResearchScope ¶
SetResearchScope sets the security research scope for exception handling.
type TriageRequest ¶
TriageRequest contains the information needed for triage.
type TriageResult ¶
type TriageResult struct {
Suspicious bool
Reason string
LatencyMs int64 // Time taken for triage LLM call
InputTokens int // Input tokens used
OutputTokens int // Output tokens used
}
TriageResult contains the triage decision.
type TrustLevel ¶
type TrustLevel string
TrustLevel represents the origin-based authenticity of content.
const (
    // TrustTrusted is for framework-generated content (system prompt, supervisor messages).
    TrustTrusted TrustLevel = "trusted"
    // TrustVetted is for human-authored content (Agentfile goals, signed packages).
    TrustVetted TrustLevel = "vetted"
    // TrustUntrusted is for external content (tool results, file reads, web fetches).
    TrustUntrusted TrustLevel = "untrusted"
)
func PropagatedTrust ¶
func PropagatedTrust(a, b TrustLevel) TrustLevel
PropagatedTrust returns the trust level that results from combining content at trust levels a and b. The result is the lower (less trusted) of the two.
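Taking the lower of two trust levels requires an ordering. The sketch below assumes trusted > vetted > untrusted, which matches the constant descriptions but is not stated explicitly, and it treats any unrecognized level as untrusted. `rank` and `propagatedTrust` are illustrative names.

```go
package main

import "fmt"

type trustLevel string

const (
	trustTrusted   trustLevel = "trusted"
	trustVetted    trustLevel = "vetted"
	trustUntrusted trustLevel = "untrusted"
)

// rank orders trust levels; unknown values rank with untrusted (assumption).
func rank(t trustLevel) int {
	switch t {
	case trustTrusted:
		return 2
	case trustVetted:
		return 1
	default:
		return 0
	}
}

// propagatedTrust returns the less trusted of a and b.
func propagatedTrust(a, b trustLevel) trustLevel {
	lowest := a
	if rank(b) < rank(a) {
		lowest = b
	}
	if rank(lowest) == 0 {
		return trustUntrusted // normalizes unknown levels
	}
	return lowest
}

func main() {
	fmt.Println(propagatedTrust(trustTrusted, trustUntrusted)) // untrusted
	fmt.Println(propagatedTrust(trustVetted, trustTrusted))    // vetted
}
```

Because the operation is a minimum, it is commutative and associative, so trust can be folded across any number of contributing blocks in any order.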
type VerificationResult ¶
type VerificationResult struct {
Allowed bool
ToolName string
DenyReason string
Modification string
Tier1 *Tier1Result
Tier2 *TriageResult
Tier3 *SupervisionResult
TaintLineage []*TaintLineageNode // Taint dependency tree for related blocks
}
VerificationResult holds the complete verification result.
type Verifier ¶
type Verifier struct {
// contains filtered or unexported fields
}
Verifier implements the tiered security verification pipeline.
func NewVerifier ¶
NewVerifier creates a new security verifier.
func (*Verifier) AddBlock ¶
func (v *Verifier) AddBlock(trust TrustLevel, typ BlockType, mutable bool, content, source string) *Block
AddBlock adds a content block to the context.
func (*Verifier) AddBlockWithContext ¶
func (v *Verifier) AddBlockWithContext(trust TrustLevel, typ BlockType, mutable bool, content, source, agentContext string) *Block
AddBlockWithContext adds a content block with an agent context identifier.
func (*Verifier) AddBlockWithTaint ¶
func (v *Verifier) AddBlockWithTaint(trust TrustLevel, typ BlockType, mutable bool, content, source, agentContext string, eventSeq uint64, taintedBy []string) *Block
AddBlockWithTaint adds a content block with explicit taint lineage. eventSeq is the session event sequence when this block was created. taintedBy lists IDs of blocks that influenced this block.
func (*Verifier) AuditTrail ¶
func (v *Verifier) AuditTrail() *AuditTrail
AuditTrail returns the audit trail for export.
func (*Verifier) ClearContext ¶
func (v *Verifier) ClearContext()
ClearContext removes all blocks from the context.
func (*Verifier) Destroy ¶
func (v *Verifier) Destroy()
Destroy cleans up resources, including zeroing the private key.
func (*Verifier) GetCurrentUntrustedBlockIDs ¶
func (v *Verifier) GetCurrentUntrustedBlockIDs() []string
GetCurrentUntrustedBlockIDs returns the IDs of all untrusted blocks in the context. It is used to mark LLM responses as tainted by these blocks.
func (*Verifier) GetTaintLineage ¶
func (v *Verifier) GetTaintLineage(blockID string) *TaintLineageNode
GetTaintLineage builds the full taint dependency tree for a block. Returns nil if the block is not found.
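A lineage tree of this shape can be built by following TaintedBy references recursively. The sketch below is one plausible construction: the cycle guard and the skipping of unknown parent IDs are assumptions, and `block`, `lineageNode`, `lineage`, and `sampleBlocks` are hypothetical names.

```go
package main

import "fmt"

// block is a trimmed, hypothetical stand-in for Block.
type block struct {
	ID        string
	Trust     string
	Source    string
	TaintedBy []string
}

// lineageNode mirrors the shape of TaintLineageNode.
type lineageNode struct {
	BlockID   string
	Trust     string
	Source    string
	Depth     int
	TaintedBy []*lineageNode
}

// lineage returns the taint tree rooted at id, or nil if id is unknown.
func lineage(blocks map[string]*block, id string) *lineageNode {
	return walk(blocks, id, 0, map[string]bool{})
}

// walk descends depth-first. The visiting set holds the IDs on the current
// path so a cyclic reference is skipped instead of recursing forever.
func walk(blocks map[string]*block, id string, depth int, visiting map[string]bool) *lineageNode {
	b, ok := blocks[id]
	if !ok || visiting[id] {
		return nil
	}
	visiting[id] = true
	defer delete(visiting, id)

	node := &lineageNode{BlockID: b.ID, Trust: b.Trust, Source: b.Source, Depth: depth}
	for _, parentID := range b.TaintedBy {
		if parent := walk(blocks, parentID, depth+1, visiting); parent != nil {
			node.TaintedBy = append(node.TaintedBy, parent)
		}
	}
	return node
}

// sampleBlocks builds a small session: a summary derived from an LLM
// response that was itself influenced by a web fetch and a file read.
func sampleBlocks() map[string]*block {
	return map[string]*block{
		"web":     {ID: "web", Trust: "untrusted", Source: "web_fetch"},
		"file":    {ID: "file", Trust: "untrusted", Source: "file_read"},
		"resp":    {ID: "resp", Trust: "untrusted", Source: "llm", TaintedBy: []string{"web", "file"}},
		"summary": {ID: "summary", Trust: "untrusted", Source: "llm", TaintedBy: []string{"resp"}},
	}
}

// printTree indents each node by two spaces per depth level.
func printTree(n *lineageNode) {
	fmt.Printf("%*s%s (%s)\n", n.Depth*2, "", n.BlockID, n.Source)
	for _, p := range n.TaintedBy {
		printTree(p)
	}
}

func main() {
	printTree(lineage(sampleBlocks(), "summary"))
}
```

Tracking only the current path, rather than every visited node, lets a block that taints two siblings appear under both of them, which keeps each branch of the forensic tree complete.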
func (*Verifier) GetTaintLineageForBlocks ¶
func (v *Verifier) GetTaintLineageForBlocks(blocks []*Block) []*TaintLineageNode
GetTaintLineageForBlocks builds lineage trees for multiple blocks.
func (*Verifier) VerifyToolCall ¶
func (v *Verifier) VerifyToolCall(ctx context.Context, toolName string, args map[string]interface{}, originalGoal, agentContext string) (*VerificationResult, error)
VerifyToolCall runs the tiered verification pipeline for a tool call. agentContext filters blocks to only those from the same agent (empty = all blocks).
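The page names three tiers (deterministic checks, cheap-model triage, full supervision) but does not spell out how a call moves between them. The sketch below assumes the simplest escalation rule: each tier runs only if the previous one raised a concern. All names are hypothetical, and the function-typed stubs stand in for the real Tier 1 checks, Triage, and SecuritySupervisor.

```go
package main

import (
	"context"
	"fmt"
)

// verdict is a hypothetical, trimmed stand-in for VerificationResult.
type verdict struct {
	Allowed    bool
	DenyReason string
	TiersRun   int
}

type (
	tier1Func func(tool string, args map[string]interface{}) (pass bool, reasons []string)
	tier2Func func(ctx context.Context, tool string, reasons []string) (suspicious bool, err error)
	tier3Func func(ctx context.Context, tool string, reasons []string) (allow bool, reason string, err error)
)

// verifyToolCall escalates tier by tier (assumed rule): a Tier 1 pass or a
// non-suspicious Tier 2 result allows the call without running later tiers.
func verifyToolCall(ctx context.Context, tool string, args map[string]interface{},
	t1 tier1Func, t2 tier2Func, t3 tier3Func) (verdict, error) {

	pass, reasons := t1(tool, args)
	if pass {
		return verdict{Allowed: true, TiersRun: 1}, nil
	}
	suspicious, err := t2(ctx, tool, reasons)
	if err != nil {
		return verdict{}, fmt.Errorf("tier 2 triage: %w", err)
	}
	if !suspicious {
		return verdict{Allowed: true, TiersRun: 2}, nil
	}
	allow, reason, err := t3(ctx, tool, reasons)
	if err != nil {
		return verdict{}, fmt.Errorf("tier 3 supervision: %w", err)
	}
	if !allow {
		return verdict{Allowed: false, DenyReason: reason, TiersRun: 3}, nil
	}
	return verdict{Allowed: true, TiersRun: 3}, nil
}

// run wires fixed-answer stubs into the pipeline; it exists only for the demo.
func run(t1Pass, t2Suspicious, t3Allow bool) verdict {
	v, _ := verifyToolCall(context.Background(), "bash", map[string]interface{}{"cmd": "ls"},
		func(string, map[string]interface{}) (bool, []string) {
			if t1Pass {
				return true, nil
			}
			return false, []string{"untrusted content in args"}
		},
		func(context.Context, string, []string) (bool, error) { return t2Suspicious, nil },
		func(context.Context, string, []string) (bool, string, error) {
			return t3Allow, "supervisor denied", nil
		},
	)
	return v
}

func main() {
	fmt.Printf("%+v\n", run(true, false, false))  // stops at Tier 1
	fmt.Printf("%+v\n", run(false, false, false)) // cleared by Tier 2
	fmt.Printf("%+v\n", run(false, true, false))  // denied by Tier 3
}
```

Ordering the tiers from cheapest to most expensive means the common case of a clean tool call costs no LLM round trip, which is consistent with the latency and token fields recorded only on the Tier 2 and Tier 3 results.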