Episode 35 · Module 8 · AI Security

Embedding Security and Vector Database Isolation

19 May 2026 · 9:07 · Security for Legal SaaS

In Episode 8, we covered Row-Level Security (RLS) in PostgreSQL, where the database engine automatically filters queries so that Tenant A can never see Tenant B's data. In Episode 32, we hardened the database with network isolation, audit logging, and least-privilege access. Your relational database has decades of security architecture behind it. Your vector database does not.

Today’s Lesson

The Relational vs. Vector Security Gap

Vector databases — Pinecone, Weaviate, Qdrant, Milvus, pgvector — are the storage layer for AI-powered legal tools. They hold the embeddings (numerical representations of documents) that make retrieval-augmented generation (RAG) work, as we discussed in Episode 34. But they were designed for similarity search performance, not security. Most lack the fine-grained access controls that relational databases take for granted. The OWASP Top 10 for LLM Applications 2025 added a new entry — LLM08: Vector and Embedding Weaknesses — specifically because this gap is being exploited.¹

This episode covers three threats: embedding inversion attacks (extracting sensitive text from embeddings), multi-tenant isolation failures (one client's data leaking to another), and access control gaps in the vector layer.

How Embeddings Encode Sensitive Information

An embedding is a vector — a list of numbers (typically 768 to 3072 dimensions) — that represents the meaning of a text passage. When your legal SaaS platform indexes a client's contract, the embedding model converts each passage into a vector and stores it in the vector database. The original text is either stored alongside the vector or in a linked relational database.

The security assumption is that embeddings are opaque — a list of floating-point numbers that cannot be reverse-engineered back into the original text. This assumption is wrong.

Embedding Inversion Attacks

Recent research has demonstrated that embeddings can be inverted to reconstruct significant portions of the original text. A study documented by security researchers achieved over 92% accuracy reconstructing exact token sequences — including full names, health diagnoses, and email addresses — from text embeddings alone, without access to the original embedding model.²

The attack works by training a neural network to map embeddings back to text. Critically, these inversion models are transferable: an attacker can train their inversion model on a different embedding model than the one you use, and it still works with reduced but meaningful accuracy. This means the attacker does not need access to your specific embedding model — they can build a surrogate that approximates yours.³

For legal SaaS, the implication is direct: if an attacker gains read access to your vector database, they can potentially reconstruct privileged client communications, contract terms, case strategy notes, and other sensitive text from the embeddings alone.
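
To make the risk concrete, here is a deliberately simplified Python sketch. It is not the learned inversion model described above; it shows a much weaker candidate-matching attack, in which someone holding a stolen vector scores guessed sentences against it. The `toy_embed` function is a hypothetical stand-in for a real embedding model (a hashed bag of words), and every name and sample sentence is invented for illustration.

```python
import hashlib
import math

DIM = 4096  # toy dimensionality, chosen only to make hash collisions unlikely


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity because both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def best_guess(stolen_vector: list[float], candidates: list[str]) -> str:
    """Return the candidate sentence whose embedding is closest to the stolen vector."""
    return max(candidates, key=lambda c: cosine(stolen_vector, toy_embed(c)))


# The attacker never sees this sentence, only the vector derived from it.
stolen = toy_embed("settlement authority capped at two million dollars")

guesses = [
    "client will reject any settlement offer",
    "settlement authority capped at two million dollars",
    "settlement authority capped at five million dollars",
]
recovered = best_guess(stolen, guesses)
```

Even this crude approach confirms which guess matches the stored vector, which is why read access to embeddings should be treated like read access to the underlying text.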

Multi-Tenant Vector Database Isolation

Legal SaaS platforms are typically multi-tenant — one platform serves multiple law firms or clients. The relational database enforces tenant isolation through RLS. But what about the vector database?

Isolation Approaches Compared

Approach	Isolation Level	Performance	Cost	Risk
Single collection, metadata filtering	Logical (query-time filter)	Best	Lowest	Highest — a bug in the filter = cross-tenant leak
Namespace per tenant (Pinecone)	Logical (server-enforced partitioning)	Good	Low	Moderate — namespaces are logically but not physically isolated⁴
Collection per tenant (Weaviate, Qdrant)	Physical (separate storage, indices, and vectors)	Good	Higher	Lowest — operations on one tenant cannot touch another's shard
Separate database instance per tenant	Complete	Worst	Highest	Lowest — but operationally complex

For legal SaaS handling privileged data, the minimum acceptable approach is namespace-per-tenant with server-enforced partitioning. The recommended approach is collection-per-tenant, where each tenant has physically separate storage. The metadata-filtering approach — storing all tenants in one collection and filtering at query time — is the most dangerous because a single missing filter parameter returns results across all tenants.⁵
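
The difference between query-time filtering and per-tenant partitioning can be sketched in a few lines of Python. This is an in-memory illustration only, not any vendor's API: `SharedCollection` and `TenantScopedStore` are hypothetical classes, and a real deployment would rely on the vector database's own namespaces or collections rather than a Python dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    tenant_id: str
    doc_id: str
    vector: tuple[float, ...]


def dot(a, b):
    """Similarity score for the toy example (plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


class SharedCollection:
    """One flat collection: isolation depends on every caller passing tenant_id."""

    def __init__(self, records):
        self._records = list(records)

    def search(self, query, top_k=3, tenant_id=None):
        # If tenant_id is omitted, the filter silently matches every tenant.
        pool = [r for r in self._records
                if tenant_id is None or r.tenant_id == tenant_id]
        return sorted(pool, key=lambda r: dot(query, r.vector), reverse=True)[:top_k]


class TenantScopedStore:
    """One bucket per tenant: there is no code path that searches across tenants."""

    def __init__(self):
        self._by_tenant = {}

    def upsert(self, record):
        self._by_tenant.setdefault(record.tenant_id, []).append(record)

    def search(self, tenant_id, query, top_k=3):
        if not tenant_id:
            raise ValueError("tenant_id is required")
        pool = self._by_tenant.get(tenant_id, [])
        return sorted(pool, key=lambda r: dot(query, r.vector), reverse=True)[:top_k]


records = [
    Record("firm_a", "a-engagement-letter", (0.9, 0.1)),
    Record("firm_b", "b-settlement-memo", (0.8, 0.2)),
]
shared = SharedCollection(records)
scoped = TenantScopedStore()
for record in records:
    scoped.upsert(record)

query = (1.0, 0.0)
leaked = shared.search(query)          # caller forgot tenant_id: both firms returned
safe = scoped.search("firm_a", query)  # tenant is a required argument
```

The design point is that the tenant identifier is part of the storage lookup itself, so forgetting it produces an error instead of a cross-tenant result.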

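The pgvector exception: If you use pgvector, an open-source vector extension for PostgreSQL, RLS can cover your embeddings too. Because pgvector stores vectors in regular PostgreSQL tables, the same kind of Row-Level Security policies that protect your relational data can protect your embeddings, provided RLS is enabled and a tenant policy is defined on every table that holds them. Superusers, roles with BYPASSRLS, and (unless FORCE ROW LEVEL SECURITY is set) the table owner still bypass those policies, so the application should connect as an ordinary role. This is a significant security advantage over standalone vector databases.⁶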
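
A minimal sketch of what this could look like, written as Python that holds the SQL and runs a tenant-scoped search. The table name `doc_chunks`, its columns, the 768-dimension vector, and the `app.tenant_id` setting are all assumptions made for illustration, and `conn` is assumed to be a psycopg-style DB-API connection that is not in autocommit mode.

```python
# One-time migration (illustrative schema; names are hypothetical).
RLS_SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS doc_chunks (
    id         bigserial PRIMARY KEY,
    tenant_id  uuid NOT NULL,
    chunk_text text NOT NULL,
    embedding  vector(768) NOT NULL
);

ALTER TABLE doc_chunks ENABLE ROW LEVEL SECURITY;
ALTER TABLE doc_chunks FORCE ROW LEVEL SECURITY;  -- apply policies to the table owner too

CREATE POLICY tenant_isolation ON doc_chunks
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
"""

# No WHERE clause on tenant_id: the policy supplies it.
# <=> is pgvector's cosine-distance operator.
SEARCH_SQL = """
SELECT id, chunk_text
FROM doc_chunks
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def search_as_tenant(conn, tenant_id, query_vector, limit=5):
    """Run a similarity search with the tenant bound for the current transaction."""
    vector_literal = "[" + ",".join(str(x) for x in query_vector) + "]"
    with conn.cursor() as cur:
        # Third argument true = setting lasts only for this transaction.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
        cur.execute(SEARCH_SQL, (vector_literal, limit))
        return cur.fetchall()
```

Because the setting is transaction-local, it needs to be set inside the same transaction as the search, and the tenant value should come from the authenticated session rather than from user input.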

Common Misconfiguration: Frontend API Keys

A frequently observed vulnerability: embedding vector database API keys in frontend JavaScript code. Security researchers have documented that developers commonly place Pinecone or Weaviate API keys in client-side code, giving any user — or attacker — full read and write access to the entire knowledge base. API keys for vector databases belong in backend services, managed through the secrets management infrastructure covered in Episode 30.⁷
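
One way to keep the key server-side is sketched below in Python. The environment variable name `VECTOR_DB_API_KEY`, the handler shape, and the `search_fn` callable are hypothetical; the point is only that the browser talks to your backend, and the backend alone holds the key and decides which tenant is being queried.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when the vector database key is not available to the backend."""


def load_vector_db_key(env=os.environ, name="VECTOR_DB_API_KEY"):
    """Read the key from the backend environment (populated by the secrets manager)."""
    key = env.get(name, "").strip()
    if not key:
        # Fail closed: a service without its secret should not start.
        raise MissingSecretError(f"{name} is not set; refusing to start")
    return key


def handle_search(session, body, search_fn):
    """Backend handler: the tenant comes from the authenticated session,
    never from the request body the client controls."""
    tenant_id = session["tenant_id"]
    return search_fn(tenant_id=tenant_id, query=body["query"])
```

Any `tenant_id` a client places in the request body is ignored here, so a modified frontend cannot redirect the query to another firm's data.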

Access Control on Retrieval

Even with proper tenant isolation, the retrieval layer needs its own access controls. The question is not just "which tenant does this embedding belong to?" but also "within this tenant, is the requesting user authorised to see this document?"

Permission-Aware Retrieval

The retrieval pipeline should enforce document-level permissions before returning results to the model:

1. User queries the AI. The query is converted to an embedding.
2. Vector similarity search returns candidate documents. These are the mathematically closest embeddings to the query.
3. Permission filter. Before any candidate is included in the model's context, the system checks: does this user have access to this document? This requires mapping each embedding back to its source document and checking the document's access control list (ACL).
4. Filtered results go to the model. Only documents the user is authorised to see are included in the generation context.

Without step 3, a junior associate asking a question could inadvertently retrieve — and the model could reveal — content from a partner-only matter, a different client's case, or a document under a restricted access protocol.
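
The permission filter step can be sketched as follows. This is a minimal Python illustration that assumes each candidate carries the ID of its source document and that ACLs are available as a simple mapping; `Candidate`, `filter_by_acl`, and `build_context` are hypothetical names, and the user and document IDs are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    doc_id: str   # source document this chunk was embedded from
    score: float  # similarity score from the vector search
    text: str


def filter_by_acl(user_id, candidates, acl):
    """Keep candidates whose source document lists user_id.
    Documents with no ACL entry are denied by default."""
    return [c for c in candidates if user_id in acl.get(c.doc_id, frozenset())]


def build_context(user_id, candidates, acl, max_chunks=4):
    """Filter first, then truncate, so denied chunks never take up context slots."""
    allowed = filter_by_acl(user_id, candidates, acl)
    allowed.sort(key=lambda c: c.score, reverse=True)
    return [c.text for c in allowed[:max_chunks]]


acl = {
    "matter-101-memo": frozenset({"partner.kim", "associate.lee"}),
    "matter-202-strategy": frozenset({"partner.kim"}),
}
candidates = [
    Candidate("matter-202-strategy", 0.93, "Partner-only strategy note"),
    Candidate("matter-101-memo", 0.88, "Research memo"),
    Candidate("unmapped-doc", 0.80, "Chunk with no ACL entry"),
]
context = build_context("associate.lee", candidates, acl)
```

In this example the highest-scoring chunk is dropped because the associate is not on its ACL, and the unmapped chunk is dropped because the sketch denies by default.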

Nearest-Neighbour Probing

An attacker with query access to the vector database can exploit similarity search to probe for information. By sending crafted queries and observing which documents are returned (or how similarity scores change), the attacker can infer the presence and approximate content of documents in the database, even without direct read access to the stored vectors. This is the vector database analogue of the timing attack we discussed for password comparison in Episode 17: in both cases the attacker learns protected information from the system's observable responses rather than by reading it directly.⁸

Mitigations include rate limiting on vector queries, monitoring for anomalous query patterns, and adding noise to similarity scores returned to the client.
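
Two of these mitigations can be sketched in Python: a per-caller sliding-window rate limiter and bounded noise on returned scores. The class and function names, the limits, and the noise scale are illustrative assumptions, not recommended values.

```python
import random
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_queries per caller within window_seconds."""

    def __init__(self, max_queries, window_seconds, clock=time.monotonic):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._clock = clock
        self._history = defaultdict(deque)  # caller_id -> timestamps of recent queries

    def allow(self, caller_id):
        now = self._clock()
        recent = self._history[caller_id]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_queries:
            return False
        recent.append(now)
        return True


def noisy_score(score, rng, scale=0.02):
    """Add uniform noise in [-scale, scale] and clamp to the cosine range [-1, 1]."""
    return max(-1.0, min(1.0, score + rng.uniform(-scale, scale)))


limiter = SlidingWindowLimiter(max_queries=30, window_seconds=60.0)
rng = random.Random()  # a real service would keep one generator per process

if limiter.allow("user-123"):
    reported = noisy_score(0.8731, rng)
```

Fresh noise on every response can be averaged away by repeating the same query many times, which is one reason to pair it with rate limiting and anomaly monitoring rather than rely on it alone.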

Embedding Model Selection: Privacy Implications

The choice of embedding model affects privacy:

Consideration	Implication
Cloud-hosted embedding APIs (OpenAI, Cohere, Google)	Your text is sent to a third-party server for embedding. The provider may log inputs. For privileged legal content, this may violate confidentiality obligations
Self-hosted embedding models (sentence-transformers, local models)	Text never leaves your infrastructure. Full control over logging and retention. Higher operational cost
Model dimensionality	Higher-dimensional embeddings encode more information — potentially making inversion attacks more effective
Fine-tuned models	Models fine-tuned on your domain data may encode domain-specific patterns that are easier to invert

For legal SaaS handling privileged information, self-hosted embedding models are the conservative choice. If using cloud-hosted APIs, ensure the provider offers a data processing agreement that prohibits logging and training on your inputs, and verify that the processing jurisdiction is acceptable under your data protection obligations.⁹
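
That policy can be expressed as a small routing function. The Python sketch below assumes content is already labelled with a sensitivity level and that the local and cloud embedders are supplied as callables; the `Sensitivity` levels and the `pick_embedder` name are hypothetical.

```python
from enum import Enum
from typing import Callable, Sequence

Embedder = Callable[[str], Sequence[float]]


class Sensitivity(Enum):
    PUBLIC = "public"
    CONFIDENTIAL = "confidential"
    PRIVILEGED = "privileged"


def pick_embedder(
    sensitivity: Sensitivity,
    local: Embedder,
    cloud: Embedder,
    cloud_dpa_in_place: bool,
) -> Embedder:
    """Route to the cloud API only for non-sensitive text covered by a DPA.
    Everything else, including anything unlabelled upstream, stays local."""
    if sensitivity is Sensitivity.PUBLIC and cloud_dpa_in_place:
        return cloud
    return local
```

Defaulting to the self-hosted model means a missing label or a lapsed agreement results in higher compute cost rather than privileged text leaving your infrastructure.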

Practical Architecture: Secure Embedding Pipeline

A minimal secure embedding pipeline for multi-tenant legal SaaS:

1. Embed locally. Use self-hosted models for privileged content. Cloud APIs only for non-sensitive content with appropriate DPAs.
2. Isolate per tenant. Separate vector collections or namespaces per tenant. Never store multiple tenants in a single flat collection with metadata filtering as the sole isolation mechanism.
3. Enforce permissions at retrieval. Check document-level ACLs before returning any result to the model.
4. Audit retrieval operations. Log every vector query: who queried, what was returned, which tenant's data was accessed. Ship logs to your SIEM (from Episode 7).
5. Rotate embedding keys. If using API-based embedding services, rotate API keys regularly through your secrets manager (Episode 30).
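
For the audit step, a structured log record per query might look like the Python sketch below. The field names and the `vector_audit` logger name are assumptions, and storing a hash of the query rather than its text is a design choice made here because the query itself may contain privileged content.

```python
import hashlib
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("vector_audit")  # hypothetical logger name


def log_vector_query(user_id, tenant_id, query_text, returned_doc_ids):
    """Emit one JSON audit line per vector query and return the record."""
    record = {
        "event": "vector_query",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,                        # who queried
        "tenant_id": tenant_id,                    # whose data was accessed
        "returned_doc_ids": list(returned_doc_ids),  # what was returned
        # Hash instead of raw text: lets you correlate repeated queries
        # without copying privileged wording into the log pipeline.
        "query_sha256": hashlib.sha256(query_text.encode("utf-8")).hexdigest(),
    }
    audit_logger.info(json.dumps(record, sort_keys=True))
    return record
```

Shipping these lines to the SIEM is then a matter of attaching whatever log handler your existing pipeline already uses.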

What's Next

Episode 36 covers Model Inversion and Membership Inference — can an attacker determine whether a specific privileged document was in your AI's training data? For legal AI, that question has direct privilege implications.

Sources & Further Reading

1. OWASP, LLM08:2025 Vector and Embedding Weaknesses.
2. TianPan.co, The Privacy Architecture of Embeddings: What Your Vector Store Knows About Your Users.
3. Rafter, Vector DBs & Embeddings: The Overlooked Security Risk.
4. Blockchain Council, Securing and Governing Vector Databases in 2026.
5. Indusface, OWASP LLM08: Vector and Embedding Security Risks (2025).
6. pgvector, Open-Source Vector Similarity Search for PostgreSQL.
7. Security Boulevard / FireTail, LLM08: Vector & Embedding Weaknesses.
8. Securityium, A Guide to Mitigating LLM08:2025 Vector and Embedding Weaknesses.
9. Cobalt, Vector and Embedding Weaknesses: Vulnerabilities and Mitigations.
10. Ackuity AI, Mitigating Vector and Embedding Weaknesses.
11. PointGuard AI, Vector Embedding Weaknesses.

Alice: Welcome back to Security for Legal SaaS. I'm Alice.

Dan: And I'm Dan. Episode 35 — embedding security and vector database isolation. Alice, last two episodes covered prompt injection and RAG poisoning. This feels like we're going one layer deeper — into the infrastructure that makes RAG work. What's the vector database, and why does it need its own security episode?

Alice: Because it's the biggest security gap in most AI-powered legal tools. Let me set the scene. Over the last thirty-some episodes, we've hardened your relational database — PostgreSQL with Row-Level Security, network isolation, audit logging, encrypted connections, the works. Your relational database has decades of security engineering behind it. Your vector database has almost none.

Dan: And the vector database is where all the AI knowledge base stuff lives?

Alice: Exactly. When your legal SaaS platform indexes a client's documents for AI search, it converts each passage into an embedding — a list of numbers, typically hundreds or thousands of dimensions, that represents the meaning of the text. These embeddings are stored in a vector database — Pinecone, Weaviate, Qdrant, or pgvector. When a lawyer asks the AI a question, the question gets converted to an embedding, and the vector database finds the most similar stored embeddings. Those similar documents get fed to the model. That's retrieval-augmented generation, which we covered last episode.

Dan: Right. So the vector database is like a specialised filing cabinet for AI searches. What's the security problem?

Alice: Three problems. First, embeddings are not as opaque as people think. Most developers assume that a list of floating-point numbers can't be reversed back into the original text. That assumption is wrong. Recent research achieved over 92 percent accuracy reconstructing exact token sequences — full names, email addresses, medical diagnoses — from embeddings alone. Without access to the original model that created them.

Dan: Hmm. Ninety-two percent? So if someone gets read access to your vector database, they can essentially reconstruct the original documents?

Alice: Large portions of them, yes. The technique is called embedding inversion. An attacker trains a neural network to map embeddings back to text. And the frightening part is that these inversion models are transferable. The attacker doesn't need your specific embedding model — they build a surrogate that approximates yours, and it still works. For a legal SaaS platform, this means that a vector database breach could expose privileged attorney-client communications, case strategy, contract terms — reconstructed from numbers that were supposed to be meaningless.

Dan: Mm. That's the first problem. What's the second?

Alice: <sigh> Multi-tenant isolation — or the lack of it. Your legal SaaS platform serves multiple law firms. In your PostgreSQL database, Row-Level Security ensures Firm A can never see Firm B's data. The database engine enforces it. But most vector databases don't have an equivalent. The most common approach — and the most dangerous — is to store all tenants' embeddings in a single collection and filter by a tenant ID metadata field at query time. If that filter has a bug, or if someone forgets to include it in a query, the similarity search returns results across all tenants. One missing parameter and Client A's privileged documents show up in Client B's AI responses.

Dan: So what's the safe way to do multi-tenant isolation in a vector database?

Alice: There's a spectrum. The minimum acceptable approach is namespace-per-tenant, which platforms like Pinecone support — each tenant gets a logically separated namespace, and queries are scoped to one namespace per call. Better is collection-per-tenant, which Weaviate offers — each tenant gets physically separate storage, vectors, and indices. Operations on one tenant literally cannot touch another tenant's data because they're separate shards. The best option for security, though it costs more to operate, is separate database instances per tenant.

Dan: Mm-hmm. And there's a cheat code, isn't there? If you use PostgreSQL for vectors?

Alice: pgvector. It's PostgreSQL's vector extension — it stores embeddings in regular PostgreSQL tables. Which means all the security infrastructure you've already built — Row-Level Security, audit logging, network isolation, encrypted connections — automatically applies to your embeddings. You don't need a separate isolation strategy for vectors because your relational security already covers them. It's a significant argument for pgvector over standalone vector databases, at least from a security perspective.

Dan: Yeah, that's elegant — using the security you already have. What's the third problem?

Alice: Access control at retrieval. Even with proper tenant isolation, you need document-level permissions within a tenant. When the AI searches for relevant documents, the vector database returns the mathematically closest matches. But "closest match" is not the same as "document this user is authorised to see." A junior associate asking a question could retrieve content from a partner-only matter, a different client's case file, or a document under restricted access — unless the retrieval pipeline checks permissions before returning results to the model.

Dan: So the retrieval layer needs to be permission-aware.

Alice: Exactly. The pipeline runs the similarity search, gets candidate documents, then checks: does this user have access to each candidate document? Only authorised documents make it into the model's context. Without that permission check, your AI becomes an accidental privilege escalation tool — users can ask questions and get answers based on documents they wouldn't normally be able to read.

Dan: Right. And I imagine the choice of embedding model matters too — whether you host it yourself or use a cloud API?

Alice: It matters a lot. If you use a cloud-hosted embedding API — OpenAI, Cohere, Google — you're sending your client's text to a third-party server to be converted to vectors. That server may log the input. For privileged legal content, that may violate confidentiality obligations. The conservative choice for legal SaaS is self-hosted embedding models — sentence-transformers or similar open models running on your own infrastructure. The text never leaves your environment. If you do use cloud APIs, make sure you have a data processing agreement that prohibits logging and training on your inputs.

Dan: And there's an attack where someone just sends lots of queries to probe the vector database?

Alice: Nearest-neighbour probing. An attacker with query access sends carefully crafted queries and observes which documents come back and what similarity scores they get. By systematically probing, they can infer the presence and approximate content of documents — even without direct read access to the embeddings. It's the vector database equivalent of the timing attacks we discussed in Episode 17 for password comparison. The mitigations are similar too — rate limiting on queries, monitoring for anomalous patterns, and adding noise to similarity scores.

Dan: Yeah. Let me try to sum this up. Your relational database has Row-Level Security, network isolation, audit logging — decades of security architecture. Your vector database is the new kid with almost none of that built in. So you need to build those controls yourself or choose infrastructure that gives them to you.

Alice: That's it exactly. The relational versus vector security gap is real, and most legal SaaS platforms haven't closed it yet. The minimum checklist: isolate per tenant using collections or namespaces, never flat metadata filtering. Enforce document-level permissions at retrieval time. Audit every vector query. Consider self-hosted embedding models for privileged content. And if you can, use pgvector — let your existing PostgreSQL security infrastructure do the heavy lifting.

Dan: Next episode — Model Inversion and Membership Inference. Can an attacker tell whether a specific privileged document was used to train your AI? And what does that mean for attorney-client privilege?

Alice: Until then, I'm Alice.

Dan: And I'm Dan.

Alice: Security for Legal SaaS is a series written with AI assistance. Alice and Dan are AI-generated voices — no professional advice here, just education.
