Setting Up Failover Between Cloud Providers
Failover between cloud providers is the strongest protection against a vendor outage. Every major provider has had significant incidents: AWS us-east-1 (multiple times), GCP (2019, 2020), Cloudflare (2019). If your SLA requires 99.99% availability or better, which leaves under an hour of downtime per year, relying on a single provider is unlikely to be enough.
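To make the availability figures concrete, here is a minimal sketch of the arithmetic. The function names are illustrative, and the combined figure assumes the two providers fail independently and that failover is instant, which real deployments only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1.0 - availability)


def combined_availability(a: float, b: float) -> float:
    """Availability of two providers when either one can serve traffic.

    Assumes independent failures and instant, flawless failover.
    """
    return 1.0 - (1.0 - a) * (1.0 - b)


if __name__ == "__main__":
    for target in (0.999, 0.9999):
        print(f"{target:.2%}: {downtime_minutes_per_year(target):.1f} min/year")
    print(f"two providers at 99.9% each: {combined_availability(0.999, 0.999):.4%}")
```

In practice the time spent detecting the outage and switching traffic counts against the same budget, so the real gain is smaller than the idealized number.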
Prerequisites for cross-cloud failover
Without these conditions, cross-cloud failover makes no sense:
- Cloud-agnostic architecture: the application does not call provider-specific APIs directly
- Containerization: Kubernetes provides a uniform execution environment on both providers
- Data synchronization: a mechanism replicates data between the providers
- Infrastructure as Code: the infrastructure of both providers is described in Terraform
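The cloud-agnostic requirement can be illustrated with a small sketch: application code depends on a neutral interface, and each provider gets its own adapter behind it. All names here (ObjectStore, InMemoryStore, save_report) are hypothetical, and the in-memory backend only stands in for a real provider adapter.

```python
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical provider-neutral storage interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend; a real adapter would wrap one provider's SDK."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: ObjectStore, report_id: str, body: bytes) -> str:
    """Application code sees only the interface, never a provider SDK."""
    key = f"reports/{report_id}"
    store.put(key, body)
    return key
```

With this shape, moving to the second provider means supplying a different adapter, while save_report and the rest of the application stay unchanged.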
DNS as switching point
Cloudflare is a natural choice for managing failover between providers: it works with both and belongs to neither. The function below repoints an A record at the other provider's IP address.
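import os

import CloudFlare

# Assumption: the API token is supplied through the CF_TOKEN environment variable.
CF_TOKEN = os.environ["CF_TOKEN"]
cf = CloudFlare.CloudFlare(token=CF_TOKEN)


def switch_to_provider(zone_id: str, record_name: str, new_ip: str) -> None:
    """Point the A record `record_name` at `new_ip`."""
    records = cf.zones.dns_records.get(zone_id, params={'name': record_name, 'type': 'A'})
    if not records:
        raise LookupError(f"no A record named {record_name!r} in zone {zone_id}")
    record_id = records[0]['id']
    cf.zones.dns_records.put(
        zone_id,
        record_id,
        data={
            'type': 'A',
            'name': record_name,
            'content': new_ip,
            # Short TTL; Cloudflare applies its own automatic TTL while the record is proxied.
            'ttl': 60,
            'proxied': True,
        },
    )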
Cloudflare Load Balancing with health checks automates switching without manual intervention.
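The decision logic behind health-check-driven switching can be sketched as follows. This is not how Cloudflare Load Balancing is implemented; it is a simplified illustration, and the class name, the threshold, and the provider labels are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverMonitor:
    """Tracks consecutive failed checks and decides which provider gets traffic."""

    primary: str
    secondary: str
    check: Callable[[str], bool]   # returns True when the provider is healthy
    threshold: int = 3             # failed checks in a row before switching
    active: str = ""
    failures: int = 0

    def __post_init__(self) -> None:
        self.active = self.active or self.primary

    def tick(self) -> str:
        """Run one health check against the active provider; return who should serve."""
        if self.check(self.active):
            self.failures = 0
            return self.active
        self.failures += 1
        standby = self.secondary if self.active == self.primary else self.primary
        # Switch only if the standby is healthy, to avoid moving traffic to a dead target.
        if self.failures >= self.threshold and self.check(standby):
            self.active, self.failures = standby, 0
        return self.active


if __name__ == "__main__":
    health = {"aws": False, "gcp": True}  # hypothetical provider labels
    monitor = FailoverMonitor("aws", "gcp", check=lambda p: health[p], threshold=2)
    print([monitor.tick() for _ in range(3)])
```

Requiring several consecutive failures avoids flapping on a single lost probe. A caller could invoke switch_to_provider whenever the value returned by tick changes; note that this sketch does not fail back automatically once the original provider recovers.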
Timeline
- Preliminary audit: 2-3 days
- Terraform for second provider: 5-10 days
- Data replication setup: 5-10 days
- Failover automation: 3-5 days
- Testing: 3-5 days







