Traditional Backup and Data Replication for Data Protection
By: Nicole P @ EnduraData
Data backup and data replication are essential parts of virtually every organization’s business continuity and disaster recovery plans. People are often under the impression that backup and data replication are one and the same. While similar, they are not interchangeable technologies, and it is important to understand the differences between them. This article is meant to clarify those differences. I will strive to explain both backup and replication in very simple terms, so at times I will simplify things.
What is Traditional Backup?
Backup creates a copy of information that can be restored if the original data is lost, infected, damaged, or otherwise unusable. Backup uses either tape or disk-based solutions and requires a storage site for the archives. Backups typically include an initial full copy of the data followed by copies of what has changed since. Data backups are taken on a scheduled basis, whether monthly, weekly, or daily. The main objectives are to help businesses recover from a disaster, maintain compliance, and restore lost data.
The following are the main reasons why organizations use backup:
- Data corruption: data damaged by a hardware or software failure, such as one caused by bugs or malware
- Human error: an employee or other user accidentally deletes or damages data
- Hardware failure: a server fails or a hard disk crashes
- Loss of hardware: theft or natural disasters such as fires and floods
- Data hostages: ransomware such as CryptoLocker or WannaCry
- Data malice: deliberate damage by, for example, a disgruntled employee or hackers
Full Backup vs. Incremental Backup
The two best-known types of backup are full and incremental. A full backup offers the greatest protection because it is a complete replica of an organization’s data. The downside is that most businesses can run it only occasionally because it consumes a large amount of time and resources.
An incremental backup copies only the data that has been altered or modified since the preceding backup, also known as the delta or change. It takes much less time to complete than a full backup, but it takes longer to restore because the last full backup and every incremental taken since must be applied in order.
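To make the delta idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular backup product works: the in-memory `Snapshot` model and the function names `incremental` and `restore` are assumptions made for this example.

```python
from typing import Dict, List, Optional

# A snapshot maps a file path to its content. This in-memory model is a
# hypothetical stand-in for real files on disk or tape.
Snapshot = Dict[str, str]
# A delta maps a path to its new content, or to None if the file was deleted.
Delta = Dict[str, Optional[str]]


def incremental(previous: Snapshot, current: Snapshot) -> Delta:
    """Return only what changed between the previous backup and now."""
    delta: Delta = {}
    for path, content in current.items():
        if previous.get(path) != content:
            delta[path] = content  # new or modified file
    for path in previous:
        if path not in current:
            delta[path] = None  # file was deleted
    return delta


def restore(full: Snapshot, deltas: List[Delta]) -> Snapshot:
    """Rebuild the latest state: start from the full backup, then replay
    every incremental in order. More incrementals mean a longer restore."""
    state = dict(full)
    for delta in deltas:
        for path, content in delta.items():
            if content is None:
                state.pop(path, None)
            else:
                state[path] = content
    return state
```

In this sketch, each incremental is small because it holds only the changes, but a restore has to walk through the full backup plus the whole chain of deltas, which is the trade-off described above.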
Backups can be scheduled according to the value of the organization’s information and how frequently it changes. Generally, a full backup (archive) is performed first, and then the organization chooses a schedule that best suits its needs. It is usually recommended that businesses back up daily, using incremental backups in the time between traditional full backups.
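One way to picture such a schedule is the short Python sketch below. The weekly full backup on Sunday is an assumption chosen for illustration, and `backup_plan` is a hypothetical helper, not part of any backup tool.

```python
from datetime import date, timedelta
from typing import List, Tuple


def backup_plan(start: date, days: int, full_weekday: int = 6) -> List[Tuple[date, str]]:
    """Plan one backup per day: a full backup on `full_weekday`
    (0 = Monday ... 6 = Sunday) and an incremental on every other day.
    The very first backup is always full, since incrementals need a base."""
    plan = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        is_full = offset == 0 or day.weekday() == full_weekday
        plan.append((day, "full" if is_full else "incremental"))
    return plan
```

Calling `backup_plan(date(2017, 5, 10), 7)` would plan a full backup on the first day, incrementals on the following days, and another full backup on the Sunday that falls within the week.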
Risks with Traditional Backup
Although traditional backup is a longstanding method for total data protection and business continuity, it carries risks. The following are some of them:
- There is no perfect time to do a backup: even with scheduled backups, there are still windows of vulnerability between runs
- Data is constantly being generated and changed, and there is a limit to the amount of data that can be backed up within the backup window
- Long recovery times: restoration takes a long time and may not meet the recovery time objective (RTO) in an emergency, when every hour of downtime carries the cost of lost services, lost opportunities, idle staff, etc.
- Media failure is the most common reason why backups fail. When backup media goes bad, the backup copy is destroyed. Organizations must adhere to the vendor’s instructions for managing and storing the media, swap out tapes, and clean the drives frequently to prevent failure
- The media has a limited life expectancy, and reuse increases the probability of failure
- Obsolescence of hardware (the backup hardware itself may need a backup)
- Vendor lock-in because of the format and software used to write the data
- The vendor going out of business, leaving data locked in a format that is no longer supported
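The backup window limit mentioned in the list above comes down to simple arithmetic. The Python sketch below is a rough estimate under stated assumptions: a constant sustained throughput and 1 GB counted as 1000 MB. The function names and the sample figures are made up for illustration.

```python
def backup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy `data_gb` gigabytes at a sustained throughput.
    Assumes 1 GB = 1000 MB; real throughput varies with media and network."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (data_gb * 1000) / throughput_mb_per_s
    return seconds / 3600


def fits_in_window(data_gb: float, throughput_mb_per_s: float, window_hours: float) -> bool:
    """True if the whole data set can be copied before the window closes."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours
```

Under these assumptions, 2000 GB at 100 MB/s needs about 5.6 hours and fits an 8-hour overnight window, while 4000 GB at the same rate needs about 11.1 hours and does not. This is one reason incremental backups are used between full backups.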
Conclusion of Part One (Backup)
In this part, we discussed what traditional backup is, the difference between full and incremental backup, backup schedules, and the risks associated with backup. In the next part, we will talk about data replication, the various types of replication, and the risks and defensive techniques available to protect data while using replication.
What is Data Replication?
Data replication is the process of synchronizing data from one site to another. Its chief purpose is to give businesses current replicas of their information in case of corruption or disaster. Replication can be either scheduled or real-time (synchronous or asynchronous). Both approaches generate an exact secondary replica that matches the original copy, but real-time replication happens as the data changes.
Since replication works on live data, any data loss or corruption is instantly (or almost instantly) copied to the secondary duplicate, which makes replication an ineffective backup approach unless it is well designed and well configured. Additionally, data replication software often keeps only a single copy of the information at the offsite location, which means that, unlike backup, it will not contain past versions of the information. Some replication software supports multiple copies, multiple versions, and archives of previous files. For this reason, it is critical to use data replication software that supports versioning as well as multiple remote locations and paths for different snapshots.
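The value of versioning can be shown with a small Python sketch. `VersionedReplica` is a hypothetical in-memory model written for this article, not the interface of any real replication product. It keeps the last few versions of each file so that a corrupted update does not wipe out the last good copy.

```python
from collections import deque
from typing import Deque, Dict, List


class VersionedReplica:
    """Hypothetical replica that retains the last `keep` versions per file."""

    def __init__(self, keep: int = 3) -> None:
        self.keep = keep
        self._history: Dict[str, Deque[str]] = {}

    def replicate(self, path: str, content: str) -> None:
        """Store a new version; the oldest is dropped once `keep` is exceeded."""
        versions = self._history.setdefault(path, deque(maxlen=self.keep))
        versions.append(content)

    def latest(self, path: str) -> str:
        """Most recent version (raises KeyError if never replicated)."""
        return self._history[path][-1]

    def versions(self, path: str) -> List[str]:
        """All retained versions, oldest first."""
        return list(self._history.get(path, ()))
```

If ransomware encrypts a file and the change is replicated, `latest()` returns the encrypted content, but `versions()` still holds the previous good copy. A replica that keeps a single copy would have lost it.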
Scheduled vs. Real-time Replication
Two types of data replication are common: scheduled and real-time. With scheduled replication, data is replicated on a fixed schedule, which can be close to real-time (for example, replicating data to a remote location every 10 minutes). A third option, on-demand replication, happens when the user or system administrator issues the command to start replicating data.
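A single pass of scheduled file replication might look like the Python sketch below. It is a simplified one-way sync that I am assuming for illustration: it compares file size and modification time, copies what differs, and does not propagate deletions. The name `sync_once` is hypothetical.

```python
import os
import shutil
from typing import List


def _needs_copy(src: str, dst: str) -> bool:
    """A file needs copying if it is missing or differs in size or mtime."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def sync_once(source_dir: str, replica_dir: str) -> List[str]:
    """Copy every new or changed file from source_dir to replica_dir.
    Returns the relative paths that were copied, sorted."""
    copied = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(replica_dir, rel)
            if _needs_copy(src, dst):
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)  # copy2 also preserves the mtime
                copied.append(rel)
    return sorted(copied)
```

A scheduler such as cron, or a simple loop that sleeps between runs, would call `sync_once` at the chosen interval, for example every 10 minutes. An on-demand run is the same call issued by hand.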
There are several benefits to using scheduled replication. One advantage is that it is less expensive than real-time replication because it does not need as much bandwidth or the special hardware that some hardware-based real-time replication requires. Furthermore, it functions across extended distances because it is not operating in real time, unlike hardware-based replication. For the same reason, scheduled replication can tolerate some connectivity issues and still be effective. Additionally, the window of vulnerability is significantly smaller than with backup because replication runs more frequently. On the other hand, organizations are in danger of losing some changes to data if the main site did not finish replicating before a catastrophic event. Some vendors have implemented techniques to avoid such cases.
Real-time replication happens when information written to the main storage site is copied instantaneously to the offsite storage site. As a result, the main storage site and the secondary site are always synchronized. Since synchronization happens in real time, it is generally used when long recovery times cannot be tolerated (low recovery time objectives, or RTOs) and instant recovery is needed. The advantage of real-time replication is that it reduces the risk of data loss because it works instantly. There are drawbacks associated with hardware-based replication: it is costlier than software-based real-time replication, and it will not work over distances greater than 185 miles.
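The difference between the synchronous and asynchronous flavors of real-time replication mentioned earlier can be sketched in a few lines of Python. Both classes are hypothetical in-memory models written for illustration. In a real system the replica would sit across a network, which is where distance and latency come into play.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class SynchronousStore:
    """A write returns only after both sites hold the data."""

    def __init__(self) -> None:
        self.primary: Dict[str, str] = {}
        self.replica: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        self.replica[key] = value  # stands in for a network round trip


class AsynchronousStore:
    """A write returns at once; the replica catches up when flush() runs."""

    def __init__(self) -> None:
        self.primary: Dict[str, str] = {}
        self.replica: Dict[str, str] = {}
        self.pending: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> None:
        self.primary[key] = value
        self.pending.append((key, value))  # queued, not yet at the replica

    def flush(self) -> int:
        """Apply queued writes to the replica in order; return how many."""
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value
            applied += 1
        return applied
```

In the synchronous model the two sites never differ, at the price of waiting for the replica on every write. In the asynchronous model writes are fast, but anything still in `pending` when disaster strikes is lost.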
Replication Risks associated with Vendors
Since some replication relies on outside vendors to help store information, organizations must understand the risks associated with them. It is not uncommon for vendors to go out of business or to be bought by bigger companies. When either of those scenarios happens, it is likely that the organization’s information will disappear with the vendor unless the software is licensed to continue to operate.
To combat this risk, it is recommended that businesses do three things. First, they should carefully research potential vendors before selecting one to do business with. Second, they should have a disaster recovery plan in place in case the vendor goes out of business or experiences an outage. Third, they should understand their licensing and contracts.
Finally, for backup, replication, or any data synchronization software, it is critical to know where the software is made to avoid “one byte for me and one byte for them” (data leaks and theft). The news reports about the NSA and Huawei are a great reminder of why we should monitor our networks to make sure no software reaches out to the motherland.
Defensive Techniques to Protect Data Using Scheduled or Real-time File Replication
There are various defensive techniques that can be used to protect data while using data replication:
- Replicate to Linux
- Replicate to multiple locations
- Combine file archival with replication
- Take snapshots
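Two of these techniques, replicating to multiple locations and combining archival with replication, can be sketched together in Python. This is a minimal illustration under my own assumptions: the `archive` subfolder, the `.vN` naming, and the function name `replicate_with_archive` are hypothetical and not taken from any product.

```python
import os
import shutil
from typing import List


def replicate_with_archive(source_file: str, replica_dirs: List[str]) -> List[str]:
    """Copy source_file into each replica directory. If a replica already
    holds a copy, move it into an `archive` subfolder first so that the
    previous version survives. Returns the paths of the new copies."""
    name = os.path.basename(source_file)
    written = []
    for replica in replica_dirs:
        os.makedirs(replica, exist_ok=True)
        target = os.path.join(replica, name)
        if os.path.exists(target):
            archive = os.path.join(replica, "archive")
            os.makedirs(archive, exist_ok=True)
            # Number archived copies in arrival order: .v1, .v2, ...
            version = len(os.listdir(archive)) + 1
            shutil.move(target, os.path.join(archive, f"{name}.v{version}"))
        shutil.copy2(source_file, target)
        written.append(target)
    return written
```

With this approach, if a damaged file is replicated, every location still keeps the earlier version in its archive folder, and losing one location does not lose the data.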
Conclusion of Part Two
The primary difference between backup and replication is that data replication duplicates data in a structure or format that is recognized by your server or database and can be used instantly, without the need for replication or backup software. This guards against vendor lock-in and vendors going out of business. By contrast, traditional backup copies the information to a disk or tape-based solution in a format that only the backup application and vendor understand, which means the information must be restored from the tape or disk before it is accessible. While the two technologies may achieve the same goals and do similar things, they are not identical. Nevertheless, both data replication and traditional backup are necessary components of an organization’s total data protection plan.
In all cases, avoid disasters by protecting data using a combination of replication and remote online backup.
References
- Evans, Chris (June 2014). “Backup vs replication, snapshots, CDP in data protection strategy”. TechTarget. Retrieved 10 May 2017.
- Posey, Brien (October 2016). “Backup technologies keep getting better and better”. TechTarget. Retrieved 11 May 2017.
- Posey, Brien (July 2010). “Data backup types explained: Full, incremental, differential and incremental-forever backup”. TechTarget. Retrieved 11 May 2017.
- “Establishing a Backup Schedule”. TechNet. Retrieved 10 May 2017.
- Gsoedl, Jacob (1 March 2011). “Replication technologies: Asynchronous vs. synchronous replication”. Storage Magazine. Retrieved 11 May 2017.
- Rodrigues, Thoran (23 October 2013). “What happens when your cloud provider goes out of business?”. TechRepublic. Retrieved 11 May 2017.
- Cook, Rick (July 2006). “Data backup failure: Five tips for prevention”. TechTarget. Retrieved 10 May 2017.
- “What is the difference between disk backup and data replication?”. TechTarget. 2006. Retrieved 10 May 2017.
- Apuzzo, Matt (15 November 2016). “Secret Back Door in Some U.S. Phones Sent Data to China, Analysts Say”. The New York Times. Retrieved 17 May 2017.
- Sharwood, Simon (23 March 2014). “US saves self from Huawei spying by spying on Huawei spying”. The Register. Retrieved 17 May 2017.
- Gewirtz, David (21 April 2011). “Welcome to the new Cold War: China vs. the United States”. ZDNet. Retrieved 17 May 2017.