High Availability

Odoo always available.
Automatic failover < 30 s.

Primary + standby architecture, WAL streaming replication and topology-aware Smart Healthchecks. Running in production today with filestores of 200 GB and more.

RTO: < 30 s
RPO: ~0
Filestore in prod: 200 GB+
Architecture: Multi-DC

The four components of high availability

Each component handles a specific layer of the failover chain

Smart Healthchecks

Topology-aware Python probes that you can edit from the interface. They distinguish a standby failure from a primary failure and need no reconfiguration after a switchover.

  • Topology-aware: role resolved dynamically
  • Switchover-transparent: zero reconfiguration
  • Pushover push alerts to your smartphone
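
As an illustration of what "topology-aware" can mean, here is a minimal sketch of a probe that resolves each node's role at check time instead of reading it from static configuration. It assumes the role is derived from PostgreSQL's pg_is_in_recovery() function. The names NodeStatus, resolve_role and classify are hypothetical and are not Odizy's actual probe API.

```python
from dataclasses import dataclass
from typing import List, Optional

# SQL a real probe could run on each node: true on a standby, false on a primary.
ROLE_QUERY = "SELECT pg_is_in_recovery()"


@dataclass
class NodeStatus:
    name: str
    reachable: bool
    in_recovery: Optional[bool]  # None when the node could not be queried


def resolve_role(status: NodeStatus) -> str:
    """Derive the role from the node's live state, not from configuration."""
    if not status.reachable or status.in_recovery is None:
        return "unknown"
    return "standby" if status.in_recovery else "primary"


def classify(statuses: List[NodeStatus]) -> str:
    """Tell a primary failure apart from a standby failure."""
    roles = [resolve_role(s) for s in statuses]
    if roles.count("primary") > 1:
        return "SPLIT_BRAIN"
    if "primary" not in roles:
        return "PRIMARY_DOWN"
    if "standby" not in roles:
        return "STANDBY_DOWN"
    return "OK"
```

Because the role is read from the node itself, the same probe keeps working when the two nodes swap roles after a switchover.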

Automatic failover < 30 s

The standby is promoted to primary in under 30 seconds, and the recovered node is then re-synchronised automatically.

  • Standby to primary promotion
  • Automatic re-sync after recovery
  • Zero manual intervention

Traefik Load Balancer

Integrated HTTP/HTTPS proxy that redirects connections to the active node after a switchover. Transparent to Odoo.

  • Automatic routing post-switchover
  • TLS certificates natively managed
  • Zero application modification

Multi-provider DNS failover

Automatic DNS failover after promotion via OVH, Scaleway or the integrated Muppy DNS (MBD). The URL remains identical for your users.

  • OVH, Scaleway, integrated Muppy DNS (MBD)
  • Automatic failover post-switchover
  • Application URL unchanged

Failover in 4 steps

From detection to full recovery -- without human intervention

1. Detection: 0 to 5 s

Through its smart healthchecks, the meta-cluster detects that the primary node is unreachable. It also checks replication lag and the state of the standby.

2. Isolation: 5 to 10 s

The meta-cluster marks the failing node out of service, so the standby can enter the promotion phase without risk of split-brain.

3. Promotion: 10 to 20 s

The standby is promoted to primary. Traefik automatically redirects connections. DNS failover switches the URL to the new active node.

4. Re-synchronisation: after recovery

The recovered node rejoins the cluster automatically. Replication and filestore synchronisation resume without intervention.
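
The four steps above can be read as a small state machine. The sketch below is illustrative only: the phase names and the next_phase and within_budget functions are hypothetical, and the time budgets are simply the figures quoted in the steps, not measured values.

```python
from enum import Enum


class Phase(Enum):
    HEALTHY = "healthy"
    DETECTED = "detected"
    ISOLATED = "isolated"
    PROMOTED = "promoted"
    RESYNCED = "resynced"


# Upper bound, in seconds since the failure, quoted for each step above.
DEADLINES = {Phase.DETECTED: 5, Phase.ISOLATED: 10, Phase.PROMOTED: 20}


def next_phase(phase, primary_reachable=True, old_node_recovered=False):
    """Advance one step; promotion is only reachable through isolation."""
    if phase is Phase.HEALTHY:
        return Phase.HEALTHY if primary_reachable else Phase.DETECTED
    if phase is Phase.DETECTED:
        return Phase.ISOLATED  # fence the failing node before promoting
    if phase is Phase.ISOLATED:
        return Phase.PROMOTED  # the standby becomes the new primary
    if phase is Phase.PROMOTED:
        # Re-synchronisation starts only once the failed node is back.
        return Phase.RESYNCED if old_node_recovered else Phase.PROMOTED
    return phase


def within_budget(phase, elapsed_s):
    """Check a phase against the time budget quoted for it, if any."""
    return elapsed_s <= DEADLINES.get(phase, float("inf"))
```

Forcing every path to promotion through the isolation phase is what rules out two primaries at once in this sketch.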

Supported architectures

Odizy adapts to your infrastructure, not the other way round.

  • OVH primary + Scaleway standby
  • Cloud primary + On-premise standby
  • Multi-datacentre with a single provider
  • AWS / Azure / GCP + French sovereign cloud
  • On-premise inter-site instances (subsidiaries)

FAQ

How does PostgreSQL high availability work in Odizy?

Odizy deploys a primary + standby architecture with physical streaming replication of the WAL. The Muppy meta-cluster continuously monitors the clusters via smart healthchecks. On failure, the failing node is isolated, the standby is promoted in under 30 seconds, and traffic is redirected automatically via Traefik and DNS failover.

What is the RTO and RPO of Odizy in the event of a failure?

The RTO (Recovery Time Objective) is under 30 seconds: that is the time automatic failover takes, with no human intervention. The RPO (Recovery Point Objective) is near zero in synchronous mode and a few seconds in asynchronous mode.
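
In asynchronous mode, the RPO depends on how far the standby lags behind the primary. A common way to quantify this is the byte distance between two PostgreSQL log sequence numbers (LSNs), which PostgreSQL prints as two hexadecimal numbers separated by a slash. The sketch below computes that distance in Python. The function names and sample values are illustrative, and how Odizy itself measures lag is not specified here.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, standby_replay_lsn: str) -> int:
    """Bytes of WAL the standby has not replayed yet (0 if caught up)."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(standby_replay_lsn))


# Example: the standby is 96 bytes (0x60) behind the primary.
lag = replication_lag_bytes("0/3000060", "0/3000000")
```

On the server side, PostgreSQL's pg_wal_lsn_diff() performs the same subtraction, typically between pg_current_wal_lsn() on the primary and pg_last_wal_replay_lsn() on the standby.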

How are alerts routed in the event of an incident?

Smart healthchecks send a push notification (Pushover) to your smartphone as soon as a threshold is crossed, without requiring a separate on-call service such as PagerDuty.
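
For readers unfamiliar with Pushover, a notification is a single HTTPS POST to its public messages endpoint. The sketch below builds such a request with the Python standard library. The function names, the token and key placeholders and the alert text are illustrative and do not describe how Odizy sends its alerts internally.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_alert(app_token: str, user_key: str, title: str, message: str) -> Request:
    """Build (but do not send) a Pushover message request."""
    payload = urlencode(
        {"token": app_token, "user": user_key, "title": title, "message": message}
    ).encode("ascii")
    return Request(PUSHOVER_URL, data=payload, method="POST")


def send_alert(request: Request, timeout_s: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(request, timeout=timeout_s) as response:
        return response.status
```

A probe would call send_alert(build_alert(...)) when a threshold is crossed. Keeping the build step separate makes the payload easy to check without any network access.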

Ready to guarantee availability for your Odoo instances?

Get a demo on a real workload. We analyse your current architecture.

Contact us →