“slave_net_timeout
“There is no heartbeat mechanism that helps slaves to know that
their connection to a master hasn’t vanished. What can happen
is a network connection gets broken between the two in a way that
neither detects. This is typically a firewall/router issue or
something that neither host initiates or sees, so neither one is
able to send (or receive) a TCP packet that would normally begin to
shut down the connection. This is especially true of replication
topologies that involve crossing significant distances where
multiple networks and providers may be involved.“MySQL uses a simple timeout mechanism to detect this hopefully
rare occurrence. If the slave I/O thread has not seen anything from
the master in slave_net_timeout seconds, it will connect and then
attempt to reconnect and continue replicating. That mechanism works
very well and allows slaves to deal with the occasional network
glitch.“Unfortunately the default value for this variable is 3600.
That’s a full hour of time that passes before the slave
decides to give up and try starting with a new connection. So not
only do you run the risk of a slave being nearly an hour behind on
replication, you may find that this is trickier to detect than you
might think!”