How to Sync files between Linux servers

A. A. El Haddi file mirroring software

Global file replication between countries
Mirror files between Linux Servers

Many businesses and agencies need to sync files between Linux servers located in the same data center or in data centers around the globe. EnduraData EDpCloud Linux replication and synchronization tools help us achieve this.

In this blog, I cover a few configuration scenarios that allow us to synchronize data between Linux servers located in Minneapolis, Chicago, London, and Rome.

Our objective is to mirror and synchronize data from the Linux server located in Minneapolis to the other Linux servers located in data centers in Chicago, London, and Rome. Any file changes that occur on the Linux server in Minneapolis are replicated to the other Linux servers using incremental copies of only the changed portions (deltas) of each file. This kind of replication is also known as delta or block file replication. By default, the data is compressed and encrypted in transit.
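To make the idea of delta or block replication concrete, here is a minimal Python sketch that finds which fixed-size blocks of a file have changed. It is only an illustration of the general technique, not EDpCloud's actual algorithm; the block size and the function names block_digests and changed_blocks are my own assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in new that differ from, or do not exist in, old."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]


if __name__ == "__main__":
    old = b"a" * 4096 + b"b" * 4096 + b"c" * 4096
    # The second block is modified and a short fourth block is appended.
    new = b"a" * 4096 + b"X" * 4096 + b"c" * 4096 + b"d" * 100
    print(changed_blocks(old, new))  # prints [1, 3]
```

Only the blocks whose indexes are returned would need to cross the network, which is why this style of replication saves bandwidth on large files with small changes.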

The following options are available to system administrators for configuring the Linux file synchronization tools:

  • Create one link (replication set) for all the Linux servers, with Minneapolis as the replication source and the others as the Linux data replication targets, also known as destinations or receivers.
  • Create a separate replication set (link) for each of the Linux replication destinations.

1. Using one replication set to sync files between Linux servers

The following is an XML configuration that syncs data from the Linux server in Minneapolis to all the other Linux servers.

1.1 Replicating data using one link only

Figure 1: We store the following content in the eddist.cfg configuration file.

<?xml version="1.0" encoding="UTF-8"?>
<config name="Network_Configuration" rowsize="8192">
    <link name="outgoing" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="chicago"   storepath="/home/incoming/chicago"/>
        <receiver hostname="london"   storepath="/home/incoming/london"/>
        <receiver hostname="rome"   storepath="/home/incoming/rome"/>
    </link>

</config>

The directives and keywords in the configuration above tell the Linux replication software the following:

  • The replication set (link) is named outgoing
  • The sending Linux server in Minneapolis is the host with mpls as its hostname
  • The host mpls replicates data in real time: it monitors changes to the file systems and synchronizes the file changes with the remote Linux servers (see section 1.2 for more)
  • The Linux replication receiver called chicago stores data received from mpls in /home/incoming/chicago
  • The Linux replication receiver called london stores data received from mpls in /home/incoming/london
  • The Linux replication receiver called rome stores replicated data from mpls in /home/incoming/rome
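As a quick sanity check, the mapping described above can be read back out of the XML with a few lines of Python. This is a minimal sketch that uses only the standard library and the element and attribute names visible in Figure 1; the function name describe_links is hypothetical, and the script is not part of EDpCloud.

```python
import xml.etree.ElementTree as ET

# Content of Figure 1, embedded so the sketch is self-contained.
EDDIST_CFG = b"""<?xml version="1.0" encoding="UTF-8"?>
<config name="Network_Configuration" rowsize="8192">
    <link name="outgoing" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="chicago" storepath="/home/incoming/chicago"/>
        <receiver hostname="london" storepath="/home/incoming/london"/>
        <receiver hostname="rome" storepath="/home/incoming/rome"/>
    </link>
</config>
"""


def describe_links(xml_bytes: bytes) -> list[tuple[str, str, str, str]]:
    """Return (link, sender, receiver, storepath) for every receiver found."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for link in root.findall("link"):
        sender = link.find("sender")
        sender_host = sender.get("hostname", "") if sender is not None else ""
        for receiver in link.findall("receiver"):
            rows.append((
                link.get("name", ""),
                sender_host,
                receiver.get("hostname", ""),
                receiver.get("storepath", ""),
            ))
    return rows


if __name__ == "__main__":
    for link, sender, receiver, path in describe_links(EDDIST_CFG):
        print(f"{link}: {sender} -> {receiver} ({path})")
```

Running it against your own eddist.cfg (read the file in binary mode and pass the bytes) gives a one-line summary per receiver, which is handy for spotting a mistyped hostname or storepath before starting replication.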

Creating one replication set to sync files to multiple remote sites works but is not advised for reasons I list at the end of this post.

1.2 Creating a configuration for real-time file system replication monitoring

The following is the content of the configuration file for the real-time file system monitor (edfsmonitor.cfg).

Figure 2: Content of edfsmonitor.cfg (Linux real-time file system monitor)

/home/code/svn
/home/users
/data/outgoing
/nfsserver/finacial/models/run

The edfsmonitor.cfg file lists all the directories monitored in real time. Any changes made to these directories on the mpls Linux server are sent to all the other servers in London, Rome, and Chicago.
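Because a mistyped entry in this list is easy to miss, a small pre-flight check can help. The Python sketch below assumes the one-directory-per-line layout shown in Figure 2 and flags entries that are not absolute paths or that appear twice. Skipping blank lines is my own assumption for this checker, and the function name check_monitor_paths is hypothetical; none of this is an EDpCloud feature.

```python
from pathlib import PurePosixPath


def check_monitor_paths(text: str) -> list[str]:
    """Return a list of human-readable problems found in the path list."""
    problems = []
    seen = set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        entry = raw.strip()
        if not entry:
            continue  # assumption: blank lines are ignored by this checker
        path = PurePosixPath(entry)  # normalizes trailing slashes
        if not path.is_absolute():
            problems.append(f"line {lineno}: {entry} is not an absolute path")
            continue
        if path in seen:
            problems.append(f"line {lineno}: {entry} is listed more than once")
        seen.add(path)
    return problems


if __name__ == "__main__":
    sample = "/home/code/svn\n/home/users\ndata/outgoing\n/home/users/\n"
    for problem in check_monitor_paths(sample):
        print(problem)
```

The check is purely textual: it does not confirm that the directories exist on mpls, so it can run on any machine where the file is being edited.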

2. Using multiple replication sets to sync files between Linux servers

This is the recommended configuration for replicating files between the Linux servers described above.

Figure 3: A better Linux file sync configuration for mirroring files from one site to multiple sites.

<?xml version="1.0" encoding="UTF-8"?>
<config name="Network_Configuration" rowsize="8192">
    <link name="chicago" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="chicago"   storepath="/home/incoming/chicago"/>
    </link>
    <link name="london" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="london"   storepath="/home/incoming/london"/>
    </link>
    <link name="rome" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="rome"   storepath="/home/incoming/rome"/>
    </link>

</config>

In Figure 3, we create one replication set (link) for the Linux server in each city. This type of configuration has several advantages.

You can:

  • pause replication sets independently of each other
  • resume replication sets independently of each other
  • perform initial file sync of all sets independently of each other
  • monitor each Linux replication set separately
  • troubleshoot each replication set alone
  • generate replication history for each link in a different file
  • cancel replication for each link separately
  • use different include or exclude rules to restrict what is sent by the Linux server in Minneapolis, and what data is received by the Linux servers in Chicago, London or Rome
  • take advantage of parallelism for disk I/O, network I/O, and CPU scheduling.
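If you already have a single-link configuration like the one in Figure 1, the per-site layout of Figure 3 can be derived from it mechanically. The Python sketch below does this with the standard library only. Naming each new link after its receiver's hostname mirrors Figure 3; the function name split_link_per_receiver is hypothetical, and copying every link attribute unchanged is my assumption rather than an EDpCloud requirement.

```python
import xml.etree.ElementTree as ET

# Single-link layout from Figure 1, embedded so the sketch is self-contained.
SINGLE_LINK_CFG = b"""<?xml version="1.0" encoding="UTF-8"?>
<config name="Network_Configuration" rowsize="8192">
    <link name="outgoing" isrealtime="1" workers="4" password="foo">
        <sender hostname="mpls" alias="*" />
        <receiver hostname="chicago" storepath="/home/incoming/chicago"/>
        <receiver hostname="london" storepath="/home/incoming/london"/>
        <receiver hostname="rome" storepath="/home/incoming/rome"/>
    </link>
</config>
"""


def split_link_per_receiver(xml_bytes: bytes) -> ET.Element:
    """Build a new config tree with one link per receiver."""
    root = ET.fromstring(xml_bytes)
    new_root = ET.Element(root.tag, dict(root.attrib))
    for link in root.findall("link"):
        sender = link.find("sender")
        for receiver in link.findall("receiver"):
            attrs = dict(link.attrib)
            # Name the new link after the receiver, as in Figure 3.
            attrs["name"] = receiver.get("hostname", attrs.get("name", ""))
            new_link = ET.SubElement(new_root, "link", attrs)
            if sender is not None:
                ET.SubElement(new_link, "sender", dict(sender.attrib))
            ET.SubElement(new_link, "receiver", dict(receiver.attrib))
    return new_root


if __name__ == "__main__":
    new_root = split_link_per_receiver(SINGLE_LINK_CFG)
    ET.indent(new_root, space="    ")  # requires Python 3.9 or later
    print(ET.tostring(new_root, encoding="unicode"))
```

Review the generated XML before using it: the script does not emit the XML declaration line, and you will likely want a different password per link rather than the placeholder shown here.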

Download EDpCloud, and start syncing files between Linux servers, Windows servers, Solaris, Mac, OpenBSD or call 1-952-746-4160.

How to Sync files between Linux servers was last modified: October 8th, 2019 by A. A. El Haddi
