About MongoDB: Backup or Replication?

By Facundo Lavallen

MongoDB users sometimes ask whether they need backups if they are already using replication, or vice versa. Doesn’t replication already protect their data sufficiently? How do backup and replication compare? In short, backup means making one or more copies of the data at a point in time, while replication means continuously copying data to other servers, often across a company’s sites. Let's look at this more closely.

Over time, infrastructure suffers predictable failures: disks wear out, and so do power supplies, network cards, and other components. Sometimes a server simply runs out of disk space, at which point you’ll need to take it offline to install a bigger disk. These issues affect cloud infrastructure in different ways, and any of them can make your system unavailable. To insulate our systems from these inevitable events, we build redundancy into them: if a component becomes unavailable, a standby takes over immediately and transparently. The MongoDB feature that enables this redundancy is called replication.

When you organize MongoDB into a replica set, the members of the set are kept in sync. All database writes go to the primary member and are quickly replicated to the secondary members. If the primary fails, the remaining members hold an election and choose a new primary. Replication thus ensures that your application remains available when a node fails. In general, replication can be synchronous, asynchronous, or near-synchronous, and its effectiveness is usually measured against a Recovery Time Objective (RTO, how long it takes to restore service) and a Recovery Point Objective (RPO, how much recent data can be lost). For replication, minimizing the RTO is key.
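The failover behavior described above can be sketched with a toy model. This is only an illustration in plain Python, not MongoDB's actual replication or election protocol (a real replica set requires, among other things, a majority of voting members to elect a primary). The `Member` and `ReplicaSet` classes, the node names, and the "most up-to-date survivor wins" rule are all assumptions made for the example.

```python
# Toy model of a replica set: NOT MongoDB's real replication or election logic.
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    healthy: bool = True
    log: list = field(default_factory=list)  # operations applied so far


class ReplicaSet:
    def __init__(self, names):
        self.members = [Member(n) for n in names]
        self.primary = self.members[0]

    def write(self, op):
        """Apply a write on the primary, then sync it to the secondaries."""
        self.primary.log.append(op)
        self.sync()

    def sync(self):
        """Copy any operations the healthy secondaries are missing."""
        for m in self.members:
            if m.healthy and m is not self.primary:
                m.log.extend(self.primary.log[len(m.log):])

    def fail_primary(self):
        """Mark the primary as down and pick the most up-to-date survivor."""
        self.primary.healthy = False
        candidates = [m for m in self.members if m.healthy]
        if not candidates:
            raise RuntimeError("no healthy members left")
        self.primary = max(candidates, key=lambda m: len(m.log))
        return self.primary


rs = ReplicaSet(["node-a", "node-b", "node-c"])
rs.write({"insert": "order-1"})
rs.write({"insert": "order-2"})

new_primary = rs.fail_primary()   # node-a goes down, a survivor takes over
rs.write({"insert": "order-3"})   # writes keep working after failover
print(new_primary.name, len(new_primary.log))
```

The application keeps writing after the failure because another member already holds a full copy of the data; that is the availability guarantee replication is meant to provide.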

Disaster recovery, in contrast, is about dealing with events that are far less predictable and that, by their nature, defeat the protections offered by redundancy. These events fall into two kinds: human error (bugs, deliberate hacking, accidental deletion, etc.) and catastrophic failure (scenarios where all members of your replica set are destroyed). For these lower-probability events, you want a relatively inexpensive solution that is well isolated from your production system. The MongoDB feature that protects against disasters is backup.

Backup typically relies on snapshots, which are copies of the data set taken at predetermined points in time, and requires somewhere to store them, traditionally a tape library and a place to keep the archived tapes. Restoring from backup is not instantaneous, but because backup covers catastrophes, this is acceptable: if you are restoring from backup, your application is already so broken that the cost of the downtime is less than the cost of continuing to run with bad or missing data.
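To see why a snapshot helps where replication cannot, consider a toy example of an accidental deletion. Plain Python dictionaries stand in for the replica set members and for the backup; the names, the data, and the `replicate` helper are hypothetical and do not correspond to any MongoDB API.

```python
# Toy illustration only: dicts stand in for replica set members and a backup.
import copy

primary = {"users": ["ana", "bruno", "carla"]}
secondaries = [copy.deepcopy(primary), copy.deepcopy(primary)]

# A point-in-time snapshot, stored apart from the production members.
snapshot = copy.deepcopy(primary)


def replicate(primary, secondaries):
    """Faithfully copy the primary's current state to every secondary."""
    for i in range(len(secondaries)):
        secondaries[i] = copy.deepcopy(primary)


# Normal operation continues after the snapshot was taken.
primary["users"].append("diego")
replicate(primary, secondaries)

# Human error: the collection is deleted on the primary...
del primary["users"]
# ...and replication dutifully propagates the mistake to every member.
replicate(primary, secondaries)
lost_everywhere = all("users" not in s for s in secondaries)

# Only the isolated snapshot still holds the data, as of the snapshot time.
restored = copy.deepcopy(snapshot)
print(lost_everywhere, restored["users"])
```

Note that the restored data does not include the write made after the snapshot ("diego"). That gap is the recovery point: the more often snapshots are taken, the less recent data a restore can lose.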

Bottom line: backup and replication have distinct use cases. You want backup to cover the events that you hope never to see. (If they happen often, you need to change your processes.) Replication, on the other hand, provides fault tolerance against events that are fairly common and must be handled without the user noticing.

About Facundo Lavallen

Facundo is a Software Engineer with more than 10 years of experience developing Web applications for some of the most important Fortune 500 companies.

Nowadays Facundo works in the Engineering department of TISA, looking to implement the latest technologies and frameworks in future projects.

Beyond his technical knowledge and passion for technology, Facundo enjoys playing Ping-Pong, Soccer and Paddle.