Today’s Lesson
Security for Legal SaaS — Episode 55: Disaster Recovery and Business Continuity
Your Primary Region Is Underwater
On a Monday morning, the cloud region hosting your legal SaaS platform experiences a catastrophic failure. Power outage, network partition, natural disaster — the cause doesn't matter. What matters is that hundreds of lawyers at dozens of firms cannot access their case files, their court deadlines are in hours, and your phone is ringing.
Disaster recovery (DR) and business continuity (BC) planning answer two questions: how long until your clients can work again, and how much data will they lose? For most SaaS platforms, downtime is inconvenient. For legal SaaS, it can be professionally catastrophic. Court filing deadlines don't move because your server crashed. A claim filed after a missed limitation period is time-barred and usually cannot be revived. GDPR Article 32(1)(c) explicitly lists "the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident" among the security measures to implement as appropriate, which makes disaster recovery a data protection compliance requirement as well as an operational concern [1].
RTO and RPO: The Two Numbers That Define Your Recovery
Every disaster recovery plan is built around two metrics:
RTO (Recovery Time Objective): The maximum acceptable time your system can be down. If your RTO is four hours, your DR plan must be capable of restoring service within four hours of a disaster declaration.
RPO (Recovery Point Objective): The maximum acceptable amount of data loss, measured in time. If your RPO is one hour, you must have backups or replication no more than one hour old. Any data created in the gap between the last backup and the disaster is lost.
| Platform Type | Typical RTO | Typical RPO | Justification |
|---|---|---|---|
| Case management SaaS | 1-4 hours | 15 minutes | Court deadlines; active matters need near-continuous access |
| Document review platform | 4-8 hours | 1 hour | Large datasets; some delay tolerable during off-peak |
| E-filing integration | < 1 hour | Near-zero | Filing deadlines are absolute; missed filings have legal consequences |
| Contract repository | 4-12 hours | 1 hour | Reference material; lower urgency than active litigation tools |
The legal-specific risk: A general-purpose SaaS platform might tolerate a 24-hour RTO because users can wait. Legal SaaS cannot. A lawyer who can't access case files before a hearing, can't retrieve a contract before a signing deadline, or can't file a document before a court deadline faces professional liability, potentially including malpractice claims. Your DR plan must account for the legal urgency of your users' work, not just the technical difficulty of restoring service.
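To make the two metrics concrete, here is a minimal sketch of how a post-incident review might check them. The `RecoveryTarget` type, the `TARGETS` mapping and the helper names are hypothetical, and the figures are simply the upper bounds from the table above, not recommendations for your platform.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryTarget:
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable data-loss window


# Illustrative targets: upper bounds from the table above (assumed tier names).
TARGETS = {
    "case_management": RecoveryTarget(rto=timedelta(hours=4), rpo=timedelta(minutes=15)),
    "contract_repository": RecoveryTarget(rto=timedelta(hours=12), rpo=timedelta(hours=1)),
}


def meets_rpo(target: RecoveryTarget, last_recovery_point: datetime, disaster_at: datetime) -> bool:
    """Everything written after the last recovery point is lost in a restore."""
    return disaster_at - last_recovery_point <= target.rpo


def meets_rto(target: RecoveryTarget, declared_at: datetime, restored_at: datetime) -> bool:
    """Downtime is measured from the disaster declaration to restored service."""
    return restored_at - declared_at <= target.rto
```

A backup taken 20 minutes before the outage fails the 15-minute RPO for case management even if service comes back within the hour, which is why the two numbers have to be tracked separately.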
Backup Strategies
Backups are the foundation of any DR plan. But "we have backups" is not a DR plan — it's the beginning of one.
Backup Types
| Strategy | How It Works | RPO Achievable | Restore Speed |
|---|---|---|---|
| Full backups | Complete copy of all data at a point in time | Depends on frequency (daily = 24hr RPO) | Fastest — single restore |
| Incremental backups | Only changes since the last backup | Depends on frequency | Slower — must replay from last full + all increments |
| Continuous Data Protection (CDP) | Every write is replicated in near-real-time | Seconds | Fast — minimal data loss |
| Snapshot-based | Point-in-time copy of storage volume | Minutes to hours | Fast — restore from snapshot |
| Cross-region replication | Data continuously replicated to another geographic region | Seconds to minutes | Fastest failover — region already has current data |
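
The slower restore for incremental backups comes from the restore chain: the most recent full backup plus every increment taken after it. A minimal sketch of selecting that chain, using a hypothetical `Backup` record rather than any real backup tool's catalogue format:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    kind: str  # "full" or "incremental"


def restore_chain(backups, restore_to):
    """Return the backups to apply, oldest first, to restore to `restore_to`."""
    # Only backups taken at or before the requested point are usable.
    candidates = sorted(
        (b for b in backups if b.taken_at <= restore_to),
        key=lambda b: b.taken_at,
    )
    full_positions = [i for i, b in enumerate(candidates) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup at or before the requested restore point")
    # Start at the latest full backup; everything after it is an increment.
    return candidates[full_positions[-1]:]
```

The longer the gap since the last full backup, the longer this chain gets, and a single damaged increment breaks every restore point after it.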
The 3-2-1 Rule
At minimum, maintain:
- 3 copies of your data
- On 2 different types of storage media
- With 1 copy offsite (different geographic region)
For legal SaaS handling privileged communications, add encryption at rest for all backup copies. AWS, Azure, and GCP all provide managed backup services with built-in encryption and cross-region replication [2].
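The rule is simple enough to check automatically against an inventory of your copies. The sketch below assumes a hypothetical `BackupCopy` record that you would populate from your own inventory, and it counts the production copy as one of the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    media: str       # e.g. "block_volume", "object_storage" (assumed labels)
    region: str
    encrypted: bool


def check_3_2_1(copies, primary_region):
    """Return the list of violated rules; an empty list means the check passed."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than 2 storage media types")
    if not any(c.region != primary_region for c in copies):
        problems.append("no copy outside the primary region")
    # Extra rule from the text above: every copy is encrypted at rest.
    if not all(c.encrypted for c in copies):
        problems.append("unencrypted backup copy")
    return problems
```

Running a check like this on a schedule catches the quiet regressions, such as a replication job that was disabled during a migration and never re-enabled.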
The Backup Nobody Tests
An untested backup is not a backup. Regularly test that you can actually restore from your backups to a functioning system. At least quarterly, perform a full restore test: take a backup, restore it to a clean environment, verify the application works and the data is intact. The number of organisations that discover their backups are corrupted or incomplete during an actual disaster is sobering. Testing is the only proof.
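One way to verify that restored data is intact is to compare it against checksums recorded when the backup was taken. This is a minimal sketch that assumes you keep a manifest mapping relative file paths to SHA-256 digests; the manifest format and the function names are illustrative, not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large case files do not load into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(restored_root: Path, manifest: dict) -> dict:
    """Compare restored files against the manifest recorded at backup time."""
    missing, corrupted = [], []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file():
            missing.append(rel_path)
        elif sha256_of(candidate) != expected:
            corrupted.append(rel_path)
    return {"missing": sorted(missing), "corrupted": sorted(corrupted)}
```

A checksum match shows the files survived the round trip. It does not show that the application starts and works against them, so it complements the full restore test rather than replacing it.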
Multi-Region Failover
For RTOs below four hours, single-region deployments with backup restoration are usually too slow. You need multi-region architecture.
Active-Passive
Your application runs in a primary region. A secondary region has infrastructure provisioned but not actively serving traffic. Data is continuously replicated from primary to secondary. When the primary fails, traffic is redirected to the secondary.
Pros: Lower cost than active-active. Simpler to implement.
Cons: Failover takes time (minutes to an hour). The secondary region can lag the primary by up to the RPO window, so the most recent writes may be lost.
Buxton Consulting's DR guide for SaaS recommends active-passive as the pragmatic starting point for most SaaS platforms, with active-active reserved for mission-critical workloads with near-zero RTO requirements [3].
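The redirect in an active-passive setup is usually triggered by health checks rather than by a single failed request. The sketch below shows one possible trigger, failing over after a run of consecutive failed probes. The `FailoverMonitor` class and its threshold of three are assumptions for illustration; in practice this logic typically lives in your DNS or load-balancer health checks.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    failure_threshold: int = 3      # assumed: consecutive failures before failover
    consecutive_failures: int = 0
    active_region: str = "primary"

    def record_probe(self, healthy: bool) -> str:
        """Record one health probe of the primary and return the active region."""
        if self.active_region == "secondary":
            # Failback is treated as a deliberate, manual step in this sketch.
            return self.active_region
        self.consecutive_failures = 0 if healthy else self.consecutive_failures + 1
        if self.consecutive_failures >= self.failure_threshold:
            self.active_region = "secondary"
        return self.active_region
```

Requiring consecutive failures avoids failing over on a single dropped probe, at the cost of a few probe intervals added to your RTO.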
Active-Active
Both regions actively serve traffic simultaneously. Data is replicated bidirectionally. If one region fails, the other absorbs the full load with zero or near-zero downtime.
Pros: Near-zero RTO. No failover delay.
Cons: Significantly more expensive. Bidirectional data replication introduces conflict resolution complexity — what happens when two users edit the same document in different regions simultaneously?
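Before you can resolve a conflict you have to detect it. One common technique (named here as an illustration, not something the sources above prescribe) is a version vector: each document carries a per-region edit counter, and two versions conflict when each has seen an edit the other has not. A minimal sketch with hypothetical region names:

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors: 'equal', 'a_newer', 'b_newer' or 'conflict'."""
    regions = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in regions)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in regions)
    if a_ahead and b_ahead:
        # Each side has an edit the other never saw: concurrent edits.
        return "conflict"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


base = {"eu": 1, "us": 1}
eu_edit = {"eu": 2, "us": 1}   # edited in the EU region
us_edit = {"eu": 1, "us": 2}   # edited concurrently in the US region
```

Here `compare(eu_edit, us_edit)` reports a conflict, while `compare(eu_edit, base)` reports that the EU version is simply newer. What to do with a detected conflict (keep both versions, merge, or ask the user) is a product decision, and for legal documents silently discarding one side is rarely acceptable.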
AWS Well-Architected DR Strategies
The AWS Well-Architected Framework's Reliability Pillar defines four DR strategies mapped to RTO/RPO ranges [4]:
| Strategy | RTO | RPO | Cost |
|---|---|---|---|
| Backup & Restore | Hours | Hours | Lowest |
| Pilot Light | Minutes to hours | Minutes | Low-medium |
| Warm Standby | Minutes | Seconds to minutes | Medium-high |
| Multi-Site Active-Active | Near-zero | Near-zero | Highest |
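
Because cost rises with each step down the table, the usual approach is to pick the cheapest strategy that still meets your targets. The sketch below shows that selection. The numeric bounds are assumptions chosen to stand in for the qualitative ranges in the table; they are not figures published by AWS, and you would replace them with numbers measured in your own drills.

```python
from datetime import timedelta

# (name, assumed worst-case RTO, assumed worst-case RPO), ordered cheapest first.
STRATEGIES = [
    ("Backup & Restore", timedelta(hours=24), timedelta(hours=24)),
    ("Pilot Light", timedelta(hours=4), timedelta(minutes=30)),
    ("Warm Standby", timedelta(minutes=30), timedelta(minutes=5)),
    ("Multi-Site Active-Active", timedelta(minutes=1), timedelta(seconds=5)),
]


def cheapest_strategy(required_rto: timedelta, required_rpo: timedelta):
    """Return the first (cheapest) strategy meeting both targets, or None."""
    for name, rto, rpo in STRATEGIES:
        if rto <= required_rto and rpo <= required_rpo:
            return name
    return None
```

Under these assumed bounds, a case management tier with a 4-hour RTO and a 15-minute RPO lands on Warm Standby: Pilot Light would meet the RTO but not the RPO.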
DR Testing: Proving Your Plan Works
Tabletop Exercises
Gather your team around a table (or a video call). Present a disaster scenario: "It's 9 AM Monday. AWS us-east-1 is completely down. Your monitoring shows all services unreachable. What do you do?" Walk through the response step by step. Who makes the failover decision? Who communicates with clients? How long does each step take?
Tabletop exercises are cheap, fast, and reveal gaps in your plan that aren't visible on paper. Run them quarterly.
Actual Failover Drills
At least annually, execute a real failover. Simulate a primary region failure and switch to your DR region. Verify:
- Application functions correctly in the DR region
- Data is complete and current (within RPO)
- External integrations (court filing systems, email providers, payment processors) reconnect
- Performance is acceptable under full load
- The failback to the primary region works cleanly
Cloud4C's 2026 business resilience guide recommends automated failover drills with synthetic traffic, reducing the risk and effort of manual testing [5].
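The verification list above lends itself to a scripted checklist that records what passed and how long each step took. This is a minimal sketch: `run_drill` is a hypothetical helper, and each check is a callable you would write against your own DR environment.

```python
import time


def run_drill(checks):
    """Run named checks in order and record the outcome and duration of each."""
    results = []
    for name, check in checks:
        started = time.monotonic()
        try:
            passed, detail = bool(check()), ""
        except Exception as exc:
            # A check that crashes counts as a failed check, not a failed drill run.
            passed, detail = False, repr(exc)
        results.append({
            "check": name,
            "passed": passed,
            "seconds": time.monotonic() - started,
            "detail": detail,
        })
    return results
```

Keeping the per-step durations from each drill gives you measured evidence for your RTO instead of an estimate.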
Chaos Engineering
For mature teams, introduce controlled failures in production: terminate a database replica, simulate a network partition, corrupt a cache. This validates not just your DR plan but your system's day-to-day resilience. Netflix's Chaos Monkey pioneered this approach; tools like AWS Fault Injection Service (formerly Fault Injection Simulator) and Gremlin provide managed chaos engineering platforms.
Legal-Specific DR Considerations
Court Filing Deadlines
Court filing systems have hard deadlines that cannot be extended because your platform is down. Your DR plan must ensure that e-filing integrations have independent failover paths — either through redundant integration endpoints or manual filing procedures that lawyers can follow when the platform is unavailable.
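An independent failover path can be as simple as trying each integration endpoint in turn and telling the lawyer clearly when to switch to manual filing. The sketch below assumes each endpoint is a callable that raises `ConnectionError` when unreachable; the function and field names are hypothetical and do not correspond to any real court filing API.

```python
def submit_filing(document_id, endpoints):
    """Try each (name, submit) pair in order; fall back to manual filing."""
    errors = []
    for name, submit in endpoints:
        try:
            receipt = submit(document_id)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
            continue
        return {"status": "filed", "via": name, "receipt": receipt, "errors": errors}
    # Every automated path failed: surface this immediately so the lawyer
    # can follow the documented manual procedure before the deadline.
    return {"status": "manual_filing_required", "via": None, "receipt": None, "errors": errors}
```

The important design choice is the explicit `manual_filing_required` outcome: a filing that silently queues for retry while a deadline passes is the failure mode to avoid.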
Privileged Communication Continuity
If your platform stores attorney-client privileged communications, DR replication must maintain the same access controls and encryption in the DR region as in production. A failover that exposes privileged documents to users who shouldn't see them is worse than downtime.
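One way to catch this before a real failover is to compare access control lists between regions as part of every drill. A minimal sketch, assuming you can export each region's permissions as a mapping from document ID to the set of users allowed to read it (the export format and the `acl_drift` name are illustrative):

```python
def acl_drift(primary: dict, secondary: dict) -> dict:
    """Map each document whose DR-region ACL differs to the users added or lost."""
    drift = {}
    for doc_id in primary.keys() | secondary.keys():
        expected = primary.get(doc_id, set())
        actual = secondary.get(doc_id, set())
        if expected != actual:
            drift[doc_id] = {
                "extra": actual - expected,    # can read in DR but not in production
                "missing": expected - actual,  # would lose access after failover
            }
    return drift
```

Any non-empty `extra` set is a potential privilege exposure and should block the failover path until it is explained.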
Regulatory Requirements
Beyond GDPR's availability requirement, the EU's Digital Operational Resilience Act (DORA), enforceable since January 2025, requires financial entities to maintain ICT continuity policies with defined RTOs and RPOs [5]. While DORA targets financial services, legal SaaS platforms serving regulated industries must meet their clients' compliance requirements, which increasingly include DR certification.
The DR Minimum for Legal SaaS
| Control | Purpose |
|---|---|
| Defined RTO and RPO for each service tier | Sets recovery expectations and drives architecture decisions |
| Automated daily backups with cross-region replication | Ensures data survives regional failures |
| Encrypted backup storage with access controls | Protects client data at rest in backup locations |
| Quarterly backup restore tests | Proves backups actually work |
| Documented failover procedures | Enables fast, reliable region switches |
| Annual failover drill | Validates the entire DR chain under realistic conditions |
| Client communication template for outages | Prepares clear, consistent messaging during incidents |
| Manual filing procedures for e-filing integrations | Provides lawyers a fallback when automation fails |
Disaster recovery isn't about preventing disasters — it's about ensuring your clients can keep working when one happens. For legal SaaS, "keep working" means meeting court deadlines, accessing case files, and preserving privileged communications. Plan for the worst. Test regularly. The drill that feels unnecessary is the one that saves you.
Next episode: security testing in your development process — how to find vulnerabilities before attackers do, from code commit to production.
Sources & references
1. Konfirmity, GDPR Incident Response Plan: A Practical Guide — Article 32 DR Requirements.
2. Microsoft, Business Continuity, High Availability, and Disaster Recovery.
3. Buxton Consulting, Building and Testing Disaster Recovery Plans for SaaS Applications.
4. AWS, Well-Architected Framework: Plan for Disaster Recovery.
5. Cloud4C, Business Resilience in 2026: A Cross-Sector Guide to Rapid Disaster Recovery.
6. ATOZDEBUG, Disaster Recovery for SaaS — A Complete 2025 Strategy Guide.
7. GainHQ, Disaster Recovery SaaS Guide For Business Continuity In 2026.
8. Opsio Cloud, Disaster Recovery & Business Continuity in the Cloud.
9. Exodata, IT Disaster Recovery Plan Template (2026).
10. N2W Software, Best Cloud Recovery Tools for Business Continuity: Top 5 in 2026.