Security for Legal SaaS

Episode 35 · Module 8 · AI Security

Embedding Security and Vector Database Isolation

19 May 2026 · 9:07 · Security for Legal SaaS

Today’s Lesson

The Relational vs. Vector Security Gap

In Episode 8, we covered Row-Level Security (RLS) in PostgreSQL — the database engine automatically filtering queries so that Tenant A can never see Tenant B's data. In Episode 32, we hardened the database with network isolation, audit logging, and least-privilege access. Your relational database has decades of security architecture behind it.

Your vector database does not.

Vector databases (Pinecone, Weaviate, Qdrant, Milvus, pgvector) are the storage layer for AI-powered legal tools. They hold the embeddings (numerical representations of documents) that make retrieval-augmented generation (RAG) work, as we discussed in Episode 34. But they were designed for similarity search performance, not security. Most lack the fine-grained access controls that relational databases take for granted. The OWASP Top 10 for LLM Applications 2025 added a new entry, LLM08: Vector and Embedding Weaknesses, specifically because this gap is being exploited. [1]

This episode covers three threats: embedding inversion attacks (extracting sensitive text from embeddings), multi-tenant isolation failures (one client's data leaking to another), and access control gaps in the vector layer.

How Embeddings Encode Sensitive Information

An embedding is a vector — a list of numbers (typically 768 to 3072 dimensions) — that represents the meaning of a text passage. When your legal SaaS platform indexes a client's contract, the embedding model converts each passage into a vector and stores it in the vector database. The original text is either stored alongside the vector or in a linked relational database.
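To make "similarity search" concrete, here is a minimal sketch of how a vector store ranks passages against a query. The three-dimensional vectors and passage names are invented for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and production databases use approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), pid) for pid, vec in index.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]


# Toy vectors, invented for illustration only.
index = {
    "indemnity-clause": [0.9, 0.1, 0.0],
    "termination-clause": [0.1, 0.9, 0.0],
    "governing-law": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # stands in for the embedded user question

nearest = top_k(query, index, k=2)
print(nearest)
```

The point for security purposes is that the stored numbers are not arbitrary: passages with similar meaning sit close together, which is exactly the structure an attacker can exploit.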

The security assumption is that embeddings are opaque — a list of floating-point numbers that cannot be reverse-engineered back into the original text. This assumption is wrong.

Embedding Inversion Attacks

Recent research has demonstrated that embeddings can be inverted to reconstruct significant portions of the original text. One study reported over 92% accuracy in reconstructing exact token sequences (including full names, health diagnoses, and email addresses) from text embeddings alone, without access to the original embedding model. [2]

The attack works by training a neural network to map embeddings back to text. Critically, these inversion models are transferable: an attacker can train their inversion model on a different embedding model than the one you use, and it still works with reduced but meaningful accuracy. This means the attacker does not need access to your specific embedding model; they can build a surrogate that approximates yours. [3]

For legal SaaS, the implication is direct: if an attacker gains read access to your vector database, they can potentially reconstruct privileged client communications, contract terms, case strategy notes, and other sensitive text from the embeddings alone.

Multi-Tenant Vector Database Isolation

Legal SaaS platforms are typically multi-tenant — one platform serves multiple law firms or clients. The relational database enforces tenant isolation through RLS. But what about the vector database?

Isolation Approaches Compared

| Approach | Isolation Level | Performance | Cost | Risk |
|---|---|---|---|---|
| Single collection, metadata filtering | Logical (query-time filter) | Best | Lowest | Highest: a bug in the filter = cross-tenant leak |
| Namespace per tenant (Pinecone) | Logical (server-enforced partitioning) | Good | Low | Moderate: namespaces are logically but not physically isolated [4] |
| Collection per tenant (Weaviate, Qdrant) | Physical (separate storage, indices, and vectors) | Good | Higher | Lowest: operations on one tenant cannot touch another's shard |
| Separate database instance per tenant | Complete | Worst | Highest | Lowest, but operationally complex |

For legal SaaS handling privileged data, the minimum acceptable approach is namespace-per-tenant with server-enforced partitioning. The recommended approach is collection-per-tenant, where each tenant has physically separate storage. The metadata-filtering approach (storing all tenants in one collection and filtering at query time) is the most dangerous because a single missing filter parameter returns results across all tenants. [5]
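The difference between an optional filter and a mandatory partition is easiest to see side by side. The sketch below uses two hypothetical in-memory classes, `FlatIndex` and `PerTenantIndex`, as stand-ins; they are not the API of Pinecone, Weaviate, or Qdrant, and the tenants and documents are invented.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class FlatIndex:
    """All tenants share one collection; isolation relies on callers passing a filter."""

    def __init__(self):
        self._rows = []  # (tenant_id, doc_id, vector)

    def add(self, tenant_id, doc_id, vector):
        self._rows.append((tenant_id, doc_id, vector))

    def search(self, query, k=3, tenant_filter=None):
        # The filter is optional, so forgetting it silently searches every tenant.
        rows = [r for r in self._rows if tenant_filter is None or r[0] == tenant_filter]
        rows.sort(key=lambda r: cosine(query, r[2]), reverse=True)
        return [(r[0], r[1]) for r in rows[:k]]


class PerTenantIndex:
    """One partition per tenant; a search can only read the partition it names."""

    def __init__(self):
        self._partitions = defaultdict(list)  # tenant_id -> [(doc_id, vector)]

    def add(self, tenant_id, doc_id, vector):
        self._partitions[tenant_id].append((doc_id, vector))

    def search(self, tenant_id, query, k=3):
        # tenant_id is required: there is no code path that spans tenants.
        rows = sorted(self._partitions.get(tenant_id, []),
                      key=lambda r: cosine(query, r[1]), reverse=True)
        return [(tenant_id, doc_id) for doc_id, _ in rows[:k]]


flat, partitioned = FlatIndex(), PerTenantIndex()
for tenant, doc, vec in [
    ("firm-a", "nda-draft", [0.9, 0.1]),
    ("firm-b", "merger-memo", [0.8, 0.2]),
]:
    flat.add(tenant, doc, vec)
    partitioned.add(tenant, doc, vec)

query = [1.0, 0.0]
leaked = flat.search(query)                    # filter forgotten: both firms returned
scoped = partitioned.search("firm-a", query)   # only firm-a's partition is read
print(leaked, scoped)
```

The design choice worth copying is the signature, not the storage: when the tenant is a required argument rather than an optional filter, the unsafe call cannot be written by accident.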

The pgvector exception: If you use pgvector, PostgreSQL's vector extension, RLS is available for your embeddings at no extra cost. Because pgvector stores vectors in regular PostgreSQL tables, the same Row-Level Security mechanism that protects your relational data can also protect your embeddings, provided RLS is enabled and a tenant policy is defined on the table that holds them. This is a significant security advantage over standalone vector databases. [6]
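As a rough sketch of what that looks like in practice, the SQL below enables RLS on an embeddings table and runs a tenant-scoped similarity query. The table, column, policy, and setting names (`doc_embeddings`, `tenant_isolation`, `app.tenant_id`) are hypothetical, the 768 dimension is just an example, and the `%s` placeholders assume a psycopg-style DB-API driver. The SQL is held in Python strings so the tenant setting and the query travel together.

```python
# Hypothetical schema; adapt names and the vector dimension to your own setup.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    """CREATE TABLE doc_embeddings (
           id bigserial PRIMARY KEY,
           tenant_id uuid NOT NULL,
           chunk_text text NOT NULL,
           embedding vector(768) NOT NULL
       )""",
    # RLS does nothing until it is enabled on the table.
    "ALTER TABLE doc_embeddings ENABLE ROW LEVEL SECURITY",
    # Without FORCE, the table owner bypasses the policy. Superusers and
    # BYPASSRLS roles always bypass it, so the app should not connect as one.
    "ALTER TABLE doc_embeddings FORCE ROW LEVEL SECURITY",
    """CREATE POLICY tenant_isolation ON doc_embeddings
           USING (tenant_id = current_setting('app.tenant_id')::uuid)""",
]

# <=> is pgvector's cosine distance operator (smaller means more similar).
SEARCH_SQL = """SELECT id, chunk_text
                FROM doc_embeddings
                ORDER BY embedding <=> %s::vector
                LIMIT %s"""


def to_vector_literal(values):
    """Format a Python sequence as pgvector's text form, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def tenant_search(cursor, tenant_id, query_vector, k=5):
    """Run a similarity search inside the caller's transaction for one tenant."""
    # set_config(..., true) scopes the setting to the current transaction.
    cursor.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
    cursor.execute(SEARCH_SQL, (to_vector_literal(query_vector), k))
    return cursor.fetchall()
```

Note that the search query itself contains no tenant filter. If `app.tenant_id` is never set, `current_setting` raises an error instead of returning every tenant's rows, so the failure mode is a broken query rather than a leak.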

Common Misconfiguration: Frontend API Keys

A frequently observed vulnerability: hard-coding vector database API keys in frontend JavaScript code. Security researchers have documented that developers commonly place Pinecone or Weaviate API keys in client-side code, giving any user (or attacker) full read and write access to the entire knowledge base. API keys for vector databases belong in backend services, managed through the secrets management infrastructure covered in Episode 30. [7]
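The safer shape is a backend endpoint that holds the key and returns only results. The sketch below is framework-agnostic and makes several assumptions: the environment variable name `VECTOR_DB_API_KEY`, the `handle_search_request` function, and the `search_fn` callable (standing in for whatever client your vector database provides) are all hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when the backend starts without its vector database key."""


def load_vector_db_key(env=os.environ):
    """Read the key from the server's environment (populated by a secrets manager)."""
    key = env.get("VECTOR_DB_API_KEY", "")
    if not key:
        raise MissingSecretError("VECTOR_DB_API_KEY is not set")
    return key


def handle_search_request(user_query, search_fn, env=os.environ):
    """Backend handler: the browser sends a query and receives matches, never the key."""
    api_key = load_vector_db_key(env)
    matches = search_fn(user_query, api_key=api_key)
    # Only the results cross the trust boundary back to the client.
    return {"matches": matches}
```

In a real service this handler would also authenticate the caller and resolve their tenant before searching; those steps are omitted here to keep the key-handling boundary visible.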

Access Control on Retrieval

Even with proper tenant isolation, the retrieval layer needs its own access controls. The question is not just "which tenant does this embedding belong to?" but also "within this tenant, is the requesting user authorised to see this document?"

Permission-Aware Retrieval

The retrieval pipeline should enforce document-level permissions before returning results to the model:

  1. User queries the AI. The query is converted to an embedding.
  2. Vector similarity search returns candidate documents. These are the mathematically closest embeddings to the query.
  3. Permission filter. Before any candidate is included in the model's context, the system checks: does this user have access to this document? This requires mapping each embedding back to its source document and checking the document's access control list (ACL).
  4. Filtered results go to the model. Only documents the user is authorised to see are included in the generation context.

Without step 3, a junior associate asking a question could inadvertently retrieve — and the model could reveal — content from a partner-only matter, a different client's case, or a document under a restricted access protocol.
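Step 3 can be sketched as a small filter between the vector search and the prompt builder. The `Candidate` type, the ACL dictionary, and the user and document identifiers below are hypothetical; in a real system the ACL lookup would hit your document store or authorisation service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    chunk_id: str      # id of the embedded passage
    document_id: str   # source document the passage came from
    score: float       # similarity score from the vector search


def authorised_candidates(user_id, candidates, acl):
    """Keep only candidates whose source document lists the user in its ACL.

    acl maps document_id -> set of user ids. A document with no ACL entry
    is treated as denied, so a missing record fails closed.
    """
    return [c for c in candidates if user_id in acl.get(c.document_id, set())]


# Invented example data.
acl = {
    "matter-101-brief": {"partner-kim", "associate-lee"},
    "matter-202-strategy": {"partner-kim"},  # partner-only matter
}
candidates = [
    Candidate("c1", "matter-101-brief", 0.91),
    Candidate("c2", "matter-202-strategy", 0.89),
    Candidate("c3", "unlisted-document", 0.85),
]

context_for_associate = authorised_candidates("associate-lee", candidates, acl)
print([c.chunk_id for c in context_for_associate])
```

Because filtering happens after the similarity search, a user with narrow permissions may end up with fewer results than requested. Over-fetching candidates before the filter is one way to compensate, at the cost of extra ACL checks.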

Nearest-Neighbour Probing

An attacker with query access to the vector database can exploit similarity search to probe for information. By sending crafted queries and observing which documents are returned (or which similarity scores change), the attacker can infer the presence and approximate content of documents in the database, even without direct read access. This is the vector database equivalent of a timing attack, which we discussed in the context of password comparison in Episode 17. [8]

Mitigations include rate limiting on vector queries, monitoring for anomalous query patterns, and adding noise to similarity scores returned to the client.
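Two of these mitigations fit in a few lines. The sketch below shows a per-user sliding-window rate limiter and a function that perturbs similarity scores before they leave the backend. The class and function names, the limits, and the noise scale are all illustrative assumptions, not tuned recommendations.

```python
import random
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_queries per user within any window_seconds span."""

    def __init__(self, max_queries, window_seconds, clock=time.monotonic):
        self._max = max_queries
        self._window = window_seconds
        self._clock = clock
        self._history = defaultdict(deque)  # user_id -> timestamps of allowed queries

    def allow(self, user_id):
        now = self._clock()
        history = self._history[user_id]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= self._window:
            history.popleft()
        if len(history) >= self._max:
            return False
        history.append(now)
        return True


def noisy_score(score, scale=0.05, rng=random):
    """Add uniform noise in [-scale, scale] and clamp to the cosine range [-1, 1]."""
    return max(-1.0, min(1.0, score + rng.uniform(-scale, scale)))


# Demonstration with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_queries=2, window_seconds=60, clock=lambda: fake_now[0])
decisions = [limiter.allow("user-1") for _ in range(3)]  # third query is refused
fake_now[0] = 61.0
allowed_after_window = limiter.allow("user-1")           # window has passed
print(decisions, allowed_after_window)
```

Noise only blunts probing if the attacker cannot average it away, which is why it pairs with the rate limit. Refused queries are also worth recording, since a burst of them is one of the anomalous patterns worth monitoring.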

Embedding Model Selection: Privacy Implications

The choice of embedding model affects privacy:

| Consideration | Implication |
|---|---|
| Cloud-hosted embedding APIs (OpenAI, Cohere, Google) | Your text is sent to a third-party server for embedding. The provider may log inputs. For privileged legal content, this may violate confidentiality obligations |
| Self-hosted embedding models (sentence-transformers, local models) | Text never leaves your infrastructure. Full control over logging and retention. Higher operational cost |
| Model dimensionality | Higher-dimensional embeddings encode more information, potentially making inversion attacks more effective |
| Fine-tuned models | Models fine-tuned on your domain data may encode domain-specific patterns that are easier to invert |

For legal SaaS handling privileged information, self-hosted embedding models are the conservative choice. If using cloud-hosted APIs, ensure the provider offers a data processing agreement that prohibits logging and training on your inputs, and verify that the processing jurisdiction is acceptable under your data protection obligations. [9]

Practical Architecture: Secure Embedding Pipeline

A minimal secure embedding pipeline for multi-tenant legal SaaS:

  1. Embed locally. Use self-hosted models for privileged content. Cloud APIs only for non-sensitive content with appropriate DPAs.
  2. Isolate per tenant. Separate vector collections or namespaces per tenant. Never store multiple tenants in a single flat collection with metadata filtering as the sole isolation mechanism.
  3. Enforce permissions at retrieval. Check document-level ACLs before returning any result to the model.
  4. Audit retrieval operations. Log every vector query: who queried, what was returned, which tenant's data was accessed. Ship logs to your SIEM (from Episode 7).
  5. Rotate embedding keys. If using API-based embedding services, rotate API keys regularly through your secrets manager (Episode 30).
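For step 4, a structured audit record might look like the sketch below. The field names and logger name are hypothetical. Storing a SHA-256 hash of the query instead of its text is an assumption made here so that the audit log does not itself become a store of privileged content; whether you need the raw query is a policy decision.

```python
import hashlib
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("vector.audit")  # route this logger to your SIEM


def build_audit_record(user_id, tenant_id, query_text, returned_doc_ids, now=None):
    """Describe one vector query: who asked, in which tenant, and what came back."""
    now = now or datetime.now(timezone.utc)
    return {
        "event": "vector_query",
        "timestamp": now.isoformat(),
        "user_id": user_id,
        "tenant_id": tenant_id,
        # Hash rather than raw text (assumption: the log should not hold client content).
        "query_sha256": hashlib.sha256(query_text.encode("utf-8")).hexdigest(),
        "returned_doc_ids": list(returned_doc_ids),
        "result_count": len(returned_doc_ids),
    }


def log_retrieval(record):
    """Emit the record as a single JSON line for downstream log shipping."""
    audit_logger.info(json.dumps(record, sort_keys=True))


record = build_audit_record(
    "associate-lee", "firm-a", "indemnity cap in the supply agreement",
    ["matter-101-brief"],
)
log_retrieval(record)
```

Logging after the permission filter records what the model actually saw. Logging the pre-filter candidates as well can help detect probing, at the cost of a noisier log.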

What's Next

Episode 36 covers Model Inversion and Membership Inference — can an attacker determine whether a specific privileged document was in your AI's training data? For legal AI, that question has direct privilege implications.

Sources & Further Reading

  1. OWASP, LLM08:2025 Vector and Embedding Weaknesses.
  2. TianPan.co, The Privacy Architecture of Embeddings: What Your Vector Store Knows About Your Users.
  3. Rafter, Vector DBs & Embeddings: The Overlooked Security Risk.
  4. Blockchain Council, Securing and Governing Vector Databases in 2026.
  5. Indusface, OWASP LLM08: Vector and Embedding Security Risks (2025).
  6. pgvector, Open-Source Vector Similarity Search for PostgreSQL.
  7. Security Boulevard / FireTail, LLM08: Vector & Embedding Weaknesses.
  8. Securityium, A Guide to Mitigating LLM08:2025 Vector and Embedding Weaknesses.
  9. Cobalt, Vector and Embedding Weaknesses: Vulnerabilities and Mitigations.
  10. Ackuity AI, Mitigating Vector and Embedding Weaknesses.
  11. PointGuard AI, Vector Embedding Weaknesses.