Episode 3 · Module 1 · Foundations

Attack Surfaces in Legal Tech

18 May 2026 · 8:58 · Security for Legal SaaS

0:00 8:58

An attack surface is every point where an attacker can get in or pull data out. In this episode, Alice and Dan map the seven major surfaces of a legal AI SaaS platform — from the web application layer through document ingestion, LLM integrations, and internal admin tools. They examine why adversary-supplied documents make legal tech uniquely dangerous, walk through the Proskauer Rose incident, and outline practical reduction strategies.

Today’s Lesson

Every Entry Point Is a Promise to Attackers

An attack surface is the sum total of all points where an unauthorised user can attempt to enter or extract data from a system. OWASP defines it as¹ “all of the different points where an attacker could get into a system, and where they could get data out.” NIST SP 800-53 control SA-15(5)² frames attack surface reduction as “giving attackers less opportunity to exploit weaknesses.”

Key stat: Exploitation of vulnerabilities as an initial access vector increased 180% in the 2024 Verizon DBIR,³ driven primarily by web application flaws.

Anatomy of a Legal Tech Attack Surface

Attack Surface	Entry Points	Primary Threat
Web Application	Login pages, document viewers, admin panels	Credential stuffing, XSS, session hijacking
API Layer	REST/GraphQL endpoints, webhooks, OAuth callbacks	BOLA, injection, broken authentication
Database	Connection strings, query interfaces, backups	SQL injection, credential theft
Object Storage	S3/Blob buckets, pre-signed URLs, CDN origins	Misconfiguration, over-permissive ACLs
LLM Integrations	Prompt endpoints, RAG pipelines, embeddings	Prompt injection, data poisoning
Email Systems	SMTP relay, IMAP ingestion, OAuth tokens	Phishing relay, token theft
Document Ingestion	Upload endpoints, file parsers, format converters	Malicious payloads, parser exploits, XXE

The Document Ingestion Pipeline: Your Most Dangerous Surface

Most SaaS applications receive trusted input from their own users. Legal tech receives adversary-supplied content as a core workflow. Opposing counsel sends contracts. Third parties attach evidence. Clients forward hostile correspondence.

Case study — PDF exploits: CVE-2023-26369 demonstrated⁴ that a crafted PDF could achieve arbitrary code execution through a heap-based buffer overflow. Apache Tika has faced XXE injection through XFA content embedded in PDFs,⁵ allowing attackers to access local files and internal network resources from the parser itself.

What Makes Legal Document Ingestion Uniquely Dangerous

The sender is often adversarial by design. In litigation, opposing counsel has a direct interest in the outcome.
Documents traverse the entire stack. An uploaded contract touches web server, object storage, parser, AI pipeline, database, and notifications.
Content becomes trusted input downstream. Once parsed, document text feeds into search indexes, AI models, and reporting.
Format complexity creates parser attack surface. DOCX files are ZIP archives containing XML. PDFs can embed JavaScript and arbitrary binary streams.

The Distinctive Threat Profile of Legal Tech

ABA Formal Opinion 477R⁶ requires lawyers to make reasonable efforts to prevent unauthorised access to client information. The platform bears professional-conduct-grade obligations. A BOLA vulnerability⁷ — number one on the OWASP API Security Top 10, roughly 40% of API attacks — in a legal platform exposes privileged communications belonging to non-users who never consented to the platform handling their information.

The Proskauer Rose Incident

In April 2023, Proskauer Rose exposed approximately 184,000 files⁸ containing private M&A documents, NDAs, and financial deals on an unsecured Microsoft Azure cloud server for six months. Indexed by GrayHatWarfare and accessible to anyone with the URL. One misconfigured object storage bucket — one row in the attack surface table — causing catastrophic privilege breach across hundreds of matters.

LLM Integration: The Newest Attack Surface

The OWASP Top 10 for LLM Applications (2025)⁹ catalogues risks specific to language model integrations. The fundamental problem: LLMs cannot reliably distinguish data from instructions. When your AI processes a contract from opposing counsel, the contract’s content is data — but the LLM may treat embedded text as instructions.

Internal Attack Surfaces: The Admin Panel Problem

INC Ransomware’s 2024 campaign against law firms¹⁰ exploited vulnerabilities in remote management tools — Citrix, Fortinet, SimpleHelp — to gain initial access. They didn’t attack the main application. They attacked the admin tools.

Attack Surface Reduction: Least Exposure

Action	What It Eliminates
Disable unused API endpoints	Orphaned routes with stale auth
Remove default admin panels from production	Predictable URL attack surface
Restrict object storage to private + pre-signed URLs	Public bucket enumeration
Network-segment AI inference services	Lateral movement from LLM to DMS
Enforce allowlist-only file formats	Parser exploit surface for exotic formats
Require VPN/zero-trust for all internal tools	Network-exposed admin interfaces

Ransomware attacks on law firms increased 30% in Q1 2024,¹³ with average demands exceeding $500,000. The cheapest defence is removing things attackers could target. You can’t exploit a service that isn’t running.

Conclusion

Map every surface. Reduce what you can. Isolate what remains so that breaching one surface doesn’t give access to others. Legal tech has an attack surface unlike any other vertical — your users receive adversary-supplied content by design, your data carries professional-conduct obligations, and your AI integrations create novel exploitation pathways.

Alice: Welcome back to Security for Legal SaaS. I’m Alice.

Dan: And I’m Dan. Episode 3 — Attack Surfaces in Legal Tech. Alice, last time we walked through STRIDE as a framework. Today we’re mapping what you actually apply it to.

Alice: Right. An attack surface is every point where an attacker can try to get in or pull data out. OWASP defines it as all the different points where an attacker could enter a system and where they could get data out. For a typical legal AI SaaS, that surface is enormous compared to most web applications.

Dan: Walk me through it. What does the surface look like for, say, a contract review platform?

Alice: Seven major surfaces. First, the web application itself — login pages, document viewers, admin panels. Second, the API layer — REST endpoints, webhook receivers, OAuth callbacks. Third, the database — connection strings, query interfaces. Fourth, object storage — your S3 buckets or Azure Blob containers where documents live. Fifth, LLM integrations — prompt endpoints, RAG pipelines, embedding services. Sixth, email systems — SMTP relay, IMAP ingestion for receiving documents via email. Seventh, and this is the critical one — the document ingestion pipeline.

Dan: Why is document ingestion the critical one? Every SaaS has file uploads.

Alice: Because in legal tech, the person sending you documents is often your adversary. Literally. In litigation, opposing counsel sends contracts, evidence, filings. In M&A, the counterparty sends due diligence materials. These aren’t trusted users uploading their own files. These are parties who may have a direct financial interest in compromising your system or extracting information from it.

Dan: That’s a completely different threat model from a photo-sharing app where users upload their own pictures.

Alice: Exactly. And the uploaded document doesn’t just sit in one place. It traverses the entire stack. The web server receives it, object storage holds it, the document parser extracts text — and parsers are a massive attack surface because PDFs can embed JavaScript, DOCX files are ZIP archives containing XML, and any parser vulnerability becomes remote code execution from a file upload. Then that parsed text feeds into your AI pipeline, your search index, your database.

Dan: So a malicious payload in a document could potentially touch every component in your architecture.

Alice: In 2023, researchers demonstrated that a crafted PDF could achieve arbitrary code execution through a heap buffer overflow in Adobe’s parser. Apache Tika — which half the Java world uses for document extraction — had an XXE injection flaw through XFA content embedded in PDFs. An attacker could access local files and internal network resources from the document parser itself. And that’s just the parser layer. Once the text is extracted, if it feeds into an LLM, you have prompt injection. The document’s content becomes instructions the AI follows.

Dan: Right — we covered that in episode one. The AI can’t distinguish between data and instructions. But let’s talk about something else that makes legal tech distinctive. The confidentiality angle.

Alice: This is fundamental. Most SaaS handles data belonging to the user. Legal tech handles data belonging to the user’s clients — who are often in conflict with other parties who may also use your platform. Think about an e-discovery platform. Both sides of a lawsuit might be on the same service. A multi-tenancy isolation failure isn’t just a data breach — it’s a privilege waiver. Attorney-client privilege, once breached, may be permanently waived.

Dan: And there are professional conduct obligations here too.

Alice: ABA Formal Opinion 477R requires lawyers to make reasonable efforts to prevent unauthorised access to client information. That means the platform bears professional-conduct-grade obligations. A BOLA vulnerability — Broken Object Level Authorisation, number one on the OWASP API Security Top Ten, roughly 40 percent of all API attacks — in a legal platform doesn’t just expose user data. It exposes privileged communications belonging to non-users who never consented to your platform handling their information.

Dan: Give me a real-world example of this going wrong.

Alice: Proskauer Rose, April 2023. One of the largest law firms in the world. A third-party vendor configured a Microsoft Azure storage container without proper access controls. 184,000 files — M&A documents, NDAs, financial deals, confidential client data — sat publicly accessible for six months. Indexed by GrayHatWarfare, available to anyone who knew where to look. One misconfigured object storage bucket. That’s one row in our attack surface table causing catastrophic damage across hundreds of client matters.

Dan: One misconfiguration. Six months. 184,000 files. That’s terrifying. What about the internal surfaces — the ones people tend to forget about?

Alice: Admin panels, database management UIs, CI/CD pipelines, monitoring dashboards. These are security boundaries because they have elevated privileges. Create users, modify permissions, deploy code, access raw data. But teams often treat them as internal-only and therefore safe. The INC Ransomware group’s campaign against law firms in 2024 specifically targeted remote management tools — Citrix, Fortinet, SimpleHelp. They didn’t attack the main application. They attacked the admin tools.

Dan: Because once you’re inside an admin interface, you typically have the keys to everything.

Alice: Exactly. Every internal tool exposed to the network without its own authentication boundary is attack surface. Compromise one developer’s laptop through phishing and you’re on the internal network with access to anything that trusts the network perimeter as its only security layer.

Dan: OK, so we’ve mapped this enormous surface. How do we reduce it?

Alice: The principle is least exposure. NIST SP 800-53 codifies this as attack surface reduction — implementing least privilege, deprecating unsafe functions, reducing entry points, eliminating vulnerable APIs. Practically? Disable every API endpoint you’re not actively using. Remove default admin panels from production. Make object storage private-only with pre-signed URLs. Network-segment your AI inference services so a compromised LLM can’t reach your document management system. Enforce allowlists for upload file formats — if you only need PDF and DOCX, reject everything else at the boundary. Require VPN or zero-trust verification for every internal tool.

Dan: Every unnecessary open port is liability without benefit.

Alice: That’s the mental model. Ransomware attacks on law firms increased 30 percent in Q1 2024, with average demands exceeding half a million dollars. The cheapest defense is removing things attackers could target. You can’t exploit a service that isn’t running. You can’t access a port that isn’t open. You can’t abuse an endpoint that doesn’t exist.

Dan: Map, reduce, isolate.

Alice: Map every surface — web app, API, database, storage, AI, email, document ingestion, admin tools. Reduce what you can eliminate. Isolate what remains so that breaching one surface doesn’t give you access to others.

Dan: And that isolation idea — that’s exactly what we’ll dig into next time. Episode 4: Defence in Depth. Layering controls so when one boundary falls, the attacker hits another wall.

Alice: Until then, I’m Alice.

Dan: And I’m Dan.

Security for Legal SaaS is a series written with AI assistance. Alice and Dan are AI-generated voices — no professional advice here, just education.

Sources & references

OWASP, “Attack Surface Analysis Cheat Sheet”
NIST SP 800-53 Rev. 5, Control SA-15(5): Attack Surface Reduction
Verizon, 2024 Data Breach Investigations Report — 180% increase in vulnerability exploitation
ThreatLocker, “CVE-2023-26369: One-click PDF exploits”
GBHackers, “Apache Tika Core Flaw” — XXE injection through XFA content
ABA Formal Opinion 477R, “Securing Communication of Protected Client Information”
OWASP, “API1: Broken Object Level Authorization” — ~40% of API attacks
TechCrunch, “Proskauer exposed clients’ confidential M&A data,” April 2023 — 184,000 files
OWASP, “Top 10 for Large Language Model Applications (2025)”
Halcyon, “INC Ransom Group Mounts Rapid Campaign Against Law Firms,” 2024
CrowdStrike, “What is Attack Surface Reduction?”
CISA, “Primary Mitigations to Reduce Cyber Threats”
ProcessBolt, “Why Law Firm Data Breaches Are Skyrocketing in 2024” — 30% increase; $500K average demands