PLAYBOOK · P2 · OWASP Agentic AI v1.1

Preventing Memory Poisoning & AI Knowledge Corruption

Keep short- and long-term memory clean of adversarial writes and retrievals.

Goal: Prevent AI from storing, retrieving, or propagating manipulated data that could corrupt decision-making or spread misinformation.

Aligned with Step 2: Does the AI agent rely on stored memory for decision-making? · 2 threats mitigated · 17 mitigations referenced

At a glance

Threats covered: 2 (T1 · T5)
Navigator step: P2 (Step 2: Does the AI agent rely on stored memory for decision-making?)
Mitigations: 17 distinct Helmwart controls referenced across the three phases

Defence-in-depth chain

When a poisoned write targets agent memory or a knowledge store, Proactive controls (memory content validation and permission-aware vector retrieval) block at the write boundary by validating content and enforcing access policy. If a poisoned entry persists, Reactive controls (memory anomaly detection and separation of actor and recorder) detect the anomaly at runtime and trigger a rollback. Detective controls (multi-source verification) cross-check knowledge lineage to flag drift and contamination.

[Figure: defence-in-depth chain. Attack arrives (poisoned write). Proactive (write-time validation, retrieval ACLs, session isolation): blocked, write rejected. If the attack passes, Reactive (memory anomaly detection, actor / recorder split, consensus check): contained, rollback triggered. If the attack passes again, Detective (runtime anomaly scan, output provenance, multi-source check): alert, corruption flagged. Outcome: rollback + audit.]

proactive Step 1: Secure AI memory access & validation

  • Scan every candidate memory insertion for anomalies and reject writes from untrusted sources, applying cryptographic validation for long-term stored entries.

  • Log all memory reads and writes to an immutable audit trail so every access can be traced after the fact.

  • Isolate each session's memory partition so the agent cannot read or carry over knowledge from a different user's session.

    Helmwart controls: Session isolation
  • Enforce access-control lists on vector stores and shared memory so each agent can only retrieve data relevant to its current task.

  • Pin every model artefact to a registry-managed, checksummed version and gate promotion to production on a behavioural regression suite.

    Helmwart controls: Model registry
  • Enforce retention limits keyed to data sensitivity so agents discard historical knowledge before it can be exploited.

    Helmwart controls: Mem validate
  • Record the originating source for every memory update so modifications can be traced back to a trusted or untrusted actor.

    Helmwart controls: Provenance tracking
  • Require multi-agent or external corroboration before committing any memory change that will persist across sessions.

  • Cross-check new knowledge against trusted external sources before writing it to long-term storage, and fail closed when confidence is insufficient.
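
Several of the proactive controls above (source allowlisting, cryptographic validation, session partitioning, and audit logging) can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the Helmwart implementation: `SessionMemory`, `MemoryEntry`, `SIGNING_KEY`, and `TRUSTED_SOURCES` are hypothetical names, the key is hard-coded only for the demo, and the in-memory list merely stands in for an immutable audit trail.

```python
import hashlib
import hmac
import json
import time
from dataclasses import dataclass

# Hypothetical values for illustration only. A real deployment would load the
# key from a secrets manager and the allowlist from policy configuration.
SIGNING_KEY = b"demo-key-do-not-use"
TRUSTED_SOURCES = {"user_input", "internal_kb"}


@dataclass
class MemoryEntry:
    session_id: str
    source: str        # provenance: who or what produced this content
    content: str
    created_at: float
    signature: str = ""


def _sign(entry: MemoryEntry) -> str:
    """HMAC-SHA256 over the fields that must not change after the write."""
    payload = json.dumps(
        [entry.session_id, entry.source, entry.content, entry.created_at]
    ).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


class SessionMemory:
    def __init__(self):
        self._partitions = {}   # session_id -> list of MemoryEntry
        self.audit_log = []     # append-only stand-in for an immutable trail

    def write(self, session_id: str, source: str, content: str) -> bool:
        # Reject writes from sources that are not on the allowlist.
        if source not in TRUSTED_SOURCES:
            self.audit_log.append(("write_rejected", session_id, source))
            return False
        entry = MemoryEntry(session_id, source, content, time.time())
        entry.signature = _sign(entry)
        self._partitions.setdefault(session_id, []).append(entry)
        self.audit_log.append(("write", session_id, source))
        return True

    def read(self, session_id: str) -> list:
        # Only the caller's own partition is visible, and only entries whose
        # signature still verifies are returned.
        self.audit_log.append(("read", session_id, None))
        verified = []
        for entry in self._partitions.get(session_id, []):
            if hmac.compare_digest(entry.signature, _sign(entry)):
                verified.append(entry.content)
            else:
                self.audit_log.append(
                    ("tamper_detected", session_id, entry.source)
                )
        return verified
```

In this sketch a write from an unlisted source is refused before it reaches storage, a read from another session returns nothing, and an entry altered after the write is dropped at read time because its signature no longer matches.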

reactive Step 2: Detect & respond to memory poisoning

  • Monitor memory logs in real time for unexpected updates or unauthorised access and raise an alert on any anomaly.

    Helmwart controls: Mem anomaly
  • Re-run multi-agent or external validation on any suspect memory entry after it has been committed to confirm or rule out poisoning.

  • Periodically re-check existing stored knowledge against trusted sources to detect drift or contamination introduced over time.

    Helmwart controls: Multi-source verify
  • Roll back agent knowledge to the last validated snapshot whenever an anomaly is detected in the memory store.

    Helmwart controls: Mem anomaly
  • Take periodic memory snapshots with actor attribution so a forensic rollback is possible after a poisoning event.

    Helmwart controls: Mem anomaly · Split actor
  • Alert when memory modification frequency for an agent spikes, as a sudden high rewrite rate can be an early indicator of manipulation.

    Helmwart controls: Mem anomaly
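
The reactive steps above (rate-spike alerting, attributed snapshots, and rollback) might be combined roughly as follows. This is a sketch under assumptions rather than a reference design: `MonitoredMemory`, its thresholds, and the sliding-window rule are hypothetical, and timestamps are passed in explicitly so the behaviour is deterministic and easy to test.

```python
import copy
from collections import deque


class MonitoredMemory:
    """Key-value memory with a sliding-window write-rate check (illustrative)."""

    def __init__(self, max_writes=5, window_seconds=60.0):
        self.max_writes = max_writes          # assumed threshold, tune per agent
        self.window_seconds = window_seconds
        self.store = {}
        self.alerts = []
        self._write_times = deque()           # timestamps of recent writes
        self._snapshots = []                  # (timestamp, actor, state) tuples

    def snapshot(self, now, actor):
        # Record who validated the state so a later rollback is attributable.
        self._snapshots.append((now, actor, copy.deepcopy(self.store)))

    def write(self, key, value, now):
        # Drop timestamps that have fallen out of the sliding window.
        cutoff = now - self.window_seconds
        while self._write_times and self._write_times[0] < cutoff:
            self._write_times.popleft()
        self._write_times.append(now)

        # Too many writes inside the window: alert, roll back, refuse the write.
        if len(self._write_times) > self.max_writes:
            self.alerts.append(f"write-rate spike at t={now}")
            self.rollback()
            return False

        self.store[key] = value
        return True

    def rollback(self):
        # Restore the most recent snapshot; with no snapshot the store is
        # left unchanged, which a real system would treat as an incident.
        if self._snapshots:
            _, _, state = self._snapshots[-1]
            self.store = copy.deepcopy(state)
        self._write_times.clear()
```

A short usage example, with a limit of three writes per ten seconds:

```python
m = MonitoredMemory(max_writes=3, window_seconds=10)
m.write("a", "1", now=0.0)
m.snapshot(now=1.0, actor="validator")
m.write("b", "2", now=2.0)
m.write("c", "3", now=3.0)
m.write("d", "4", now=4.0)   # fourth write in the window: alert and rollback
print(m.store)               # {'a': '1'}
```

Rolling back on every spike is a deliberately blunt policy chosen to keep the sketch short; a production system would more likely quarantine the suspect writes and escalate before discarding state.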

detective Step 3: Prevent the spread of false knowledge

  • Cross-check new knowledge against multiple trusted sources before accepting it as established fact within the agent's knowledge base.

    Helmwart controls: Multi-source verify
  • Block knowledge propagation from unverified sources so low-trust inputs cannot influence downstream agent decisions.

  • Maintain a full provenance lineage of how agent knowledge evolved to enable forensic investigation into misinformation spread.

  • At the RAG retrieval boundary, route attacker-influenceable content through a quarantined extractor model before the privileged executor ever sees it.

    Helmwart controls: PI defences+
  • Version-control every knowledge update so corrupted changes can be audited and rolled back to a clean state.

  • Continuously analyse memory access patterns and cross-audit access logs to catch long-term anomalies or policy drift before they compound.

  • Apply embedding-space anomaly detection and adversarial re-ranking at the retrieval layer to reduce the impact of poisoned vectors.

    Helmwart controls: Memory-poison defence
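
Two of the detective checks above lend themselves to a small illustration: multi-source corroboration that fails closed, and embedding-space outlier detection at the retrieval layer. The sketch below uses only the standard library. The function names, the exact-match notion of a "claim", and the z-score rule are all assumptions made for illustration; the source does not prescribe a particular detection method.

```python
import math
from statistics import mean, pstdev


def corroborated(claim, sources, quorum=2):
    """Fail closed: accept a claim only if `quorum` trusted sources contain it.

    `sources` maps a source name to the set of facts it asserts. Exact string
    matching is a simplification; real systems would compare meaning.
    """
    supporting = sum(1 for facts in sources.values() if claim in facts)
    return supporting >= quorum


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_outliers(vectors, z_threshold=2.0):
    """Return ids of vectors unusually dissimilar to the centroid of the set.

    `vectors` maps a document id to its embedding (a list of floats).
    """
    if not vectors:
        return []
    dims = len(next(iter(vectors.values())))
    centroid = [mean(v[i] for v in vectors.values()) for i in range(dims)]
    sims = {doc_id: cosine_similarity(v, centroid)
            for doc_id, v in vectors.items()}
    mu = mean(sims.values())
    sigma = pstdev(sims.values())
    if sigma == 0:
        return []   # all vectors equally similar: nothing stands out
    # Flag only vectors that are *less* similar than the rest of the set.
    return [doc_id for doc_id, s in sims.items()
            if (mu - s) / sigma > z_threshold]
```

One caveat on thresholds: with a population standard deviation, a single point among n cannot score a z-value above sqrt(n - 1), so small retrieval sets need a lower `z_threshold` than large ones. Flagged ids could then be dropped or down-weighted before re-ranking.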

Source

OWASP Agentic AI: Threats and Mitigations v1.1 (Dec 2025), §Mitigation Strategies. Action text is taken verbatim or paraphrased from the canonical document; the Helmwart additions are the per-action mappings onto deployable mitigation entries.