RAG security is the practice of protecting Retrieval-Augmented Generation systems — the pipelines that fetch documents from a knowledge base and inject them into an LLM prompt at query time — from data poisoning, prompt injection, embedding inversion, and access-control leakage. Because a RAG system pulls untrusted content into the model on every request, it expands the attack surface well beyond a plain chatbot, and it now has its own dedicated entry in the OWASP Top 10 for LLM Applications 2025: LLM08, Vector and Embedding Weaknesses.
What Is RAG, and Why Does Security Matter?
Retrieval-Augmented Generation solves a real problem. A base model only knows what it saw during training, so it cannot answer questions about your private documents, your latest pricing, or last night support tickets. RAG closes that gap by embedding your content into vectors, storing those vectors in a database, and — at query time — retrieving the most semantically similar chunks and pasting them into the prompt. The model then answers grounded in your data instead of hallucinating.
That grounding is exactly why RAG matters for security. Every document you index becomes something the model will read and, potentially, act on. Every retrieval crosses a trust boundary. And the vector store itself becomes a new asset holding a searchable, reversible representation of your sensitive text. As we argued in our guide to securing GenAI with defense in depth, the model should be treated as an untrusted component — and in a RAG system, so should the content it retrieves.
How RAG Expands the Attack Surface
A standalone LLM has essentially two untrusted inputs: the user prompt and the model own generations. RAG adds three more: the knowledge base that anyone with write access can contaminate, the embedding pipeline that transforms text into vectors, and the retrieval logic that decides which chunks reach the prompt. Each is a place where an attacker can insert content, exfiltrate data, or bypass authorization. The OWASP Gen AI Security Project groups these under LLM08 precisely because they behave differently from classic web vulnerabilities.
OWASP LLM08:2025 (Vector and Embedding Weaknesses) warns that RAG risks cluster around three failure modes: permission and access-control bypass, sensitive data exposure through the vector store, and cross-context information leaks between users or tenants. These sit alongside LLM01 (Prompt Injection) and LLM02 (Sensitive Information Disclosure) in the 2025 list.
The Top RAG Security Risks
The concrete threats fall into five recurring categories. We see all of them in production reviews, and the research literature has demonstrated each against real systems.
- Indirect prompt injection via retrieved documents: a poisoned chunk carries hidden instructions the model follows when it lands in the prompt, without the user ever typing anything malicious.
- Knowledge-base / data poisoning: attackers plant text crafted to rank highly in similarity search, steering answers across many users at once.
- Embedding inversion and data leakage: stored vectors retain enough semantic information to reconstruct the original text, turning a leaked index into a data breach.
- Broken access control and over-retrieval: retrieval that ignores per-user permissions returns documents from other users, other tenants, or restricted classifications.
- Stale or malicious sources: outdated, unsigned, or unverified feeds silently degrade answer quality or introduce attacker-controlled content over time.
Indirect Prompt Injection Through the Corpus
Indirect prompt injection is the RAG-flavored cousin of the attack we dissected in our deep dive on real prompt-injection attacks against LLM apps. The difference is delivery: instead of typing the payload, the attacker plants it in a document the system will index — a support ticket, a wiki edit, a scraped web page. When that chunk is retrieved and concatenated into the prompt, the model cannot tell your instructions from the attacker instructions. It may leak the system prompt, follow hostile directions, or emit malicious output. Because the payload rides in on trusted-looking data, ordinary input filtering on the user message never sees it.
Knowledge-Base Poisoning
If an attacker can write to your corpus, they can corrupt the source of truth. The PoisonedRAG work, accepted to USENIX Security 2025, formalized this as an optimization problem: craft a passage that both states the attacker target answer as fact and is tuned to match the victim query in embedding space. Injecting as few as five poisoned texts per question achieved attack success rates above 90 percent across multiple LLMs and datasets. Follow-up research such as CorruptRAG showed the attack can work with a single poisoned document, which makes it far more practical against real, high-volume knowledge bases.
Embedding Inversion and Data Leakage
Teams often assume vectors are safe to store loosely because they look like meaningless arrays of numbers. They are not. Research on embedding inversion — notably the Vec2Text line of work — showed that iterative optimization can recover 92 percent of 32-token text inputs exactly from their embeddings, with near-verbatim recovery on common phrasings and structured text across popular models including OpenAI text-embedding-ada-002 and sentence-transformers. In practice this means a leaked or over-permissioned vector database is a leaked corpus of the original sensitive text. That is squarely an LLM02 Sensitive Information Disclosure problem.
The Vec2Text result reframes the vector store as regulated data, not an opaque cache. If the underlying documents contain personal data, health records, or trade secrets, the embeddings inherit the same classification and the same breach-notification obligations. Encrypt them, restrict read access, and log every query.
Broken Access Control and Over-Retrieval
This is the quietest and most common failure. Many RAG systems retrieve against a single shared index with no per-user filtering, so semantic similarity alone decides what surfaces. In a multi-tenant deployment, one tenant query can match and return another tenant documents; inside one organization, a junior employee can pull board-level material simply because it is topically relevant. OWASP LLM08 calls out missing tenant isolation explicitly. The vulnerability is not exotic — it is authorization that was enforced in the application layer but forgotten at the retrieval layer.
Stale or Malicious Sources
A RAG corpus is a living system. Feeds go stale, third-party sources get compromised, and re-indexing jobs pull in whatever the source now contains. Without provenance and integrity checks, a source that was trustworthy at onboarding can quietly turn into an injection or poisoning vector months later. Freshness and source integrity are security properties, not just quality metrics.
How to Secure a RAG Pipeline
No single control neutralizes RAG risk, so we layer defenses across ingestion, storage, retrieval, and output. The following steps map directly to the threats above.
- Vet and sign your sources: allowlist what gets indexed, validate content at ingestion, and attach a signed provenance record to every chunk so you can trace and revoke bad data.
- Enforce per-user authorization at retrieval: apply document-level access control and metadata filtering on every query, and isolate tenants with separate collections or dedicated stores rather than a single shared index.
- Filter inputs and outputs: treat retrieved chunks as untrusted, scan them for injection patterns before they reach the prompt, and validate model output before any downstream system acts on it.
- Protect the vector store as sensitive data: encrypt embeddings at rest, restrict read access tightly, and monitor for bulk or anomalous vector reads that could signal an inversion attempt.
- Track chunk provenance and monitor continuously: log which chunks fed each answer, alert on retrieval-permission anomalies and sudden shifts in retrieved content, and re-verify source integrity on every re-index.
These controls reinforce each other. Source vetting and signing blunt poisoning; per-user authorization stops over-retrieval and cross-tenant leaks; input and output filtering catches indirect prompt injection; encryption and access restriction contain embedding inversion; and provenance plus monitoring give you the detection and forensics to respond when something slips through anyway.
Governance: Making RAG Security Durable
Controls decay without ownership. The durable move is to fold RAG into the same threat-modeling and governance rhythm you use for the rest of the stack. Our walkthrough of STRIDE threat modeling for LLM apps gives a repeatable way to enumerate retrieval, storage, and ingestion threats before you ship, and to revisit them as the corpus grows. Assign an owner for the knowledge base, define what may be indexed and who may read it, set a re-verification cadence for sources, and require access-control tests on the retrieval path in every release.
RAG is not going away — it is the default pattern for grounding models in private data, and it delivers real value. The takeaway is not to avoid it but to build it as a security-sensitive system from day one: untrusted content in, least privilege at retrieval, validated output, and provenance you can audit. Do that, and RAG becomes an asset you can defend rather than a quiet path straight into your data.