Disaster Recovery and Migration

Disaster recovery

Disaster recovery is about preparing for the disasters that can happen to a data center.

RPO: Recovery point objective. How often you run backups, and so how far back in time you have to go to reach the last good copy of the data. "How much data did you lose just before the disaster?" It is measured in time, and the lower the better: a low RPO means you lose little to no data.

RTO: Recovery time objective. How long it takes you to recover from the disaster, that is, how much downtime follows it. Also measured in time, and again the lower the better.
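To make the two definitions concrete, here is a minimal sketch of the arithmetic. The function names and the example timestamps are made up for illustration and are not part of any AWS tooling.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Data written after the last good backup is lost (compare against RPO)."""
    if disaster < last_backup:
        raise ValueError("disaster time precedes the last backup")
    return disaster - last_backup


def downtime(disaster: datetime, restored: datetime) -> timedelta:
    """Time the service was unavailable (compare against RTO)."""
    if restored < disaster:
        raise ValueError("restore time precedes the disaster")
    return restored - disaster


def meets_objective(measured: timedelta, objective: timedelta) -> bool:
    """An objective is met when the measured value does not exceed it."""
    return measured <= objective


# Hypothetical incident: nightly backup at 02:00, outage at 13:30, back up at 17:30.
last_backup = datetime(2024, 1, 1, 2, 0)
disaster = datetime(2024, 1, 1, 13, 30)
restored = datetime(2024, 1, 1, 17, 30)

print(data_loss_window(last_backup, disaster))  # 11:30:00
print(downtime(disaster, restored))             # 4:00:00
```

With a nightly backup the worst-case data loss approaches 24 hours, so a target RPO of, say, 1 hour would call for more frequent backups or continuous replication.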

Disaster recovery strategies

Each of these strategies has a different RPO, RTO and cost.

Backup and restore

High RPO!

You make regular backups to the cloud. When disaster strikes, you restore from them, for example from an RDS or EBS snapshot.

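This is cheap, since the only cost is storing the backups, but it comes with a high RPO and a high RTO: you lose everything written since the last backup, and you are down for as long as the restore takes.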
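A minimal sketch of the two halves of this strategy, assuming boto3 clients are passed in (for example `boto3.client("ec2")` and `boto3.client("rds")`). The helper names and all identifiers are hypothetical. `create_snapshot` and `restore_db_instance_from_db_snapshot` are real boto3 calls, but a real restore needs more parameters (instance class, subnet group, and so on) than shown here.

```python
def snapshot_volume(ec2_client, volume_id: str, description: str) -> str:
    """Take an EBS snapshot and return its ID (the 'backup' half)."""
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]


def restore_database(rds_client, snapshot_id: str, new_instance_id: str) -> None:
    """Create a new RDS instance from an existing snapshot (the 'restore' half)."""
    rds_client.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_instance_id,
        DBSnapshotIdentifier=snapshot_id,
    )
```

Taking the clients as arguments keeps the helpers easy to exercise with a stub, without touching a real AWS account.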

Pilot light

A small version of the app is always running in the cloud: just the critical core of the application.

Faster than backup and restore, since the critical systems are already up. You just have to spin up the other, non-core systems on the fly.

Lower RPO and RTO than backup and restore, but a little more expensive.

Warm standby

A full replica of the system is running, but scaled down. When disaster strikes, you scale it up.

Much faster RTO and lower RPO than pilot light, but you are spending more now.
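The "scale it up" step could look roughly like this, assuming the standby runs in an EC2 Auto Scaling group and a boto3 client (`boto3.client("autoscaling")`) is passed in. The helper name, group name, and capacities are hypothetical. `update_auto_scaling_group` is a real boto3 call.

```python
def scale_up_standby(autoscaling_client, group_name: str,
                     full_capacity: int, max_size: int) -> None:
    """Raise a scaled-down standby group to production capacity."""
    if full_capacity > max_size:
        raise ValueError("full_capacity cannot exceed max_size")
    autoscaling_client.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=full_capacity,
        MaxSize=max_size,
        DesiredCapacity=full_capacity,
    )
```

For example, `scale_up_standby(client, "standby-web", full_capacity=10, max_size=20)` would take a standby that idles at one or two instances up to ten.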

Multi Site / Hot Site Approach

You have a full replica, not scaled down, running concurrently in the cloud and ready to go.

Very fast RTO and low RPO. This is the most expensive option.

All AWS Multi Region

Deploy everything to the cloud across multiple regions. When one region goes down, just fail over to another.

Recovery tips

Backup: use EBS snapshots and RDS automated backups. Push those snapshots and backups to S3.

High availability: use Route 53 to automatically fail over DNS from region to region.

Use RDS Multi-AZ for the database.
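The Route 53 failover mentioned above can be sketched as a pair of failover routing records. The helper below only builds the `ChangeBatch` dictionary. The function names, domain name, IP addresses (from the documentation ranges), and health check ID are all placeholders. The field names follow the Route 53 `ChangeResourceRecordSets` API.

```python
from typing import Optional


def failover_record(name: str, ip_address: str, role: str, set_id: str,
                    health_check_id: Optional[str] = None, ttl: int = 60) -> dict:
    """Build one UPSERT change for an A record with failover routing."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": ip_address}],
    }
    # Route 53 uses the health check to decide when to stop answering
    # with this record and fall back to the other one.
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


def failover_change_batch(name: str, primary_ip: str, secondary_ip: str,
                          primary_health_check_id: str) -> dict:
    """Primary region answers while healthy; secondary takes over otherwise."""
    return {
        "Comment": "Region-to-region DNS failover",
        "Changes": [
            failover_record(name, primary_ip, "PRIMARY", "primary-region",
                            primary_health_check_id),
            failover_record(name, secondary_ip, "SECONDARY", "secondary-region"),
        ],
    }


batch = failover_change_batch("app.example.com.", "192.0.2.10",
                              "198.51.100.10", "hypothetical-health-check-id")
```

The resulting dictionary would be passed to `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`. A short TTL keeps resolvers from caching the failed region's address for long.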