Synchronous vs. asynchronous replication

Every highly available system must keep application data up to date on multiple nodes at the same time. As long as our application is stateless (static web pages, for instance), we can copy data between nodes manually. If our application writes data (that is, it is stateful), we have to make sure that data is replicated between nodes. We can do this either synchronously or asynchronously.

With synchronous replication, a write operation completes only after it has completed on all nodes. With asynchronous replication, a write completes as soon as it has completed on the node we are writing to; the other nodes receive it later.

To make this clearer, let’s put it into a simple script.

We simulate two nodes by creating two directories.

mkdir {node1,node2}

We implement synchronous replication with the following script, save it as sync.sh, and make it executable (chmod +x sync.sh).

#!/bin/bash
# synchronous replication
START="$(date +%T)"

# read data from stdin and write it to file
timeout 1 /usr/bin/cat > node1/file

# replicate that file
/usr/bin/rsync node1/file node2/file
END="$(date +%T)"

echo -e "DONE\nStart: ${START}\nEnd: ${END}"
exit 0

We use netcat to simulate a server that runs our replication script.

ncat -e "/bin/bash -c ./sync.sh" -l 8080

Now we submit some data.

echo "test data" | ncat localhost 8080

Now let us make a small modification to our script and save it as async.sh (made executable with chmod +x async.sh).

#!/bin/bash
# asynchronous replication
START="$(date +%T)"

# read data from stdin and write it to file
timeout 1 /usr/bin/cat > node1/file

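# replicate that file in the background; discard its output so the
# background job does not hold on to the client connection
nohup /usr/bin/rsync node1/file node2/file > /dev/null 2>&1 &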
END="$(date +%T)"

echo -e "DONE\nStart: ${START}\nEnd: ${END}"
exit 0

The only difference is that we now run our replication as a background process. Again we run it with netcat and submit some test data.

ncat -e "/bin/bash -c ./async.sh" -l 8080
echo "test data" | ncat localhost 8080

In both cases our write operation completes within approximately 1 second, which is essentially the 1-second timeout we put on cat. To understand the difference between sync and async, let’s simulate some network latency between node1 and node2.

sync.sh

#!/bin/bash
# synchronous replication
START="$(date +%T)"

# read data from stdin and write it to file
timeout 1 /usr/bin/cat > node1/file

# replicate that file
sleep 9 && /usr/bin/rsync node1/file node2/file
END="$(date +%T)"

echo -e "DONE\nStart: ${START}\nEnd: ${END}"
exit 0

async.sh

#!/bin/bash
# asynchronous replication 
START="$(date +%T)"

# read data from stdin and write it to file
timeout 1 /usr/bin/cat > node1/file

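# replicate that file in the background; wrap the delay and the copy in one
# shell so that nohup covers both commands, not just sleep
nohup /bin/bash -c 'sleep 9 && /usr/bin/rsync node1/file node2/file' > /dev/null 2>&1 &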
END="$(date +%T)"

echo -e "DONE\nStart: ${START}\nEnd: ${END}"
exit 0

If we now resubmit our test data, the asynchronous version responds within 1 second, while the synchronous version takes 10 seconds to complete.
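The same effect can be measured without netcat. The following is a minimal sketch, not part of the scripts above: DELAY, replicate and the two ELAPSED variables are illustrative names, cp stands in for rsync so that no extra tools are needed, and the delay is shortened to 2 seconds to keep the run short.

```shell
#!/bin/bash
# Sketch: time one synchronous and one asynchronous write.
# Assumptions: DELAY and replicate() are illustrative; cp replaces rsync.
DELAY=2
mkdir -p node1 node2

replicate() {
    sleep "${DELAY}"              # simulated network latency
    cp node1/file node2/file      # copy the file to the second node
}

# synchronous: the write returns only after the replica is written
START="$(date +%s)"
echo "test data" > node1/file
replicate
SYNC_ELAPSED=$(( $(date +%s) - START ))

# asynchronous: the write returns as soon as the local file is written
START="$(date +%s)"
echo "test data" > node1/file
replicate &
ASYNC_ELAPSED=$(( $(date +%s) - START ))

echo "sync:  ${SYNC_ELAPSED}s"
echo "async: ${ASYNC_ELAPSED}s"

# let the background copy finish before the script exits
wait
```

The synchronous write should report at least the configured delay, while the asynchronous write returns almost immediately and the copy to node2 finishes later in the background.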

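So why does synchronous replication exist at all? Think about what happens if node1 has a hardware failure right after we receive the response that our data was written successfully. With asynchronous replication, the data may not have reached node2 yet, so a write that was already acknowledged is lost.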
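This failure scenario can be played through with a few lines of shell. The sketch below is an illustration under assumptions: it works in a scratch directory created with mktemp so that it does not touch the node1 and node2 directories from earlier, DELAY, replicate and RESULT are hypothetical names, cp stands in for rsync, and the hardware failure is simulated by stopping the background job and deleting node1.

```shell
#!/bin/bash
# Sketch: node1 fails right after an asynchronous write was acknowledged.
# Assumptions: DELAY, replicate() and RESULT are illustrative; cp replaces rsync.
DELAY=2
cd "$(mktemp -d)" || exit 1       # scratch directory for this experiment
mkdir -p node1 node2

replicate() {
    sleep "${DELAY}"              # simulated network latency
    cp node1/file node2/file
}

# asynchronous write: acknowledge before the replica exists
echo "test data" > node1/file
replicate &
echo "DONE"

# simulated hardware failure on node1 before replication has finished
kill $! 2>/dev/null
rm -rf node1

if [ -f node2/file ]; then
    RESULT="data survived on node2"
else
    RESULT="data lost"
fi
echo "${RESULT}"
```

With a synchronous write, DONE would be sent only after the copy to node2 had finished, so the same failure would still leave a copy of the data on node2.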

Choosing between synchronous and asynchronous replication is a decision between strong consistency and maximum performance. We cannot have both at the same time.
