Berkeley DB Reference Guide:
Berkeley DB Replication


Building the communications infrastructure

The replication support in an application is typically written with one or more threads of control looping on one or more communication channels, receiving and sending messages. These threads accept messages from remote environments for the local database environment, and accept messages from the local environment for remote environments. Messages from remote environments are passed to the local database environment using the DB_ENV->rep_process_message method. Messages from the local environment are passed to the application for transmission using the callback interface specified to the DB_ENV->set_rep_transport method.

Processes establish communication channels by calling the DB_ENV->set_rep_transport method, regardless of whether they are running in client or master environments. This method specifies the send interface, a callback interface used by Berkeley DB for sending messages to other database environments in the replication group. The send interface takes an environment ID and two opaque data objects. It is the responsibility of the send interface to transmit the information in the two data objects to the database environment corresponding to the ID, with the receiving application then calling the DB_ENV->rep_process_message method to process the message.
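Because the send interface is handed an environment ID rather than a network address, the application needs some mapping from IDs to open channels. The following is a minimal sketch of such a mapping, not part of the Berkeley DB API: the names site, site_table, site_add, site_lookup and MAX_SITES are hypothetical, and the channel is assumed to be a socket descriptor.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SITES 16    /* arbitrary limit chosen for this sketch */

/* One entry per remote environment the application can reach. */
struct site {
    int envid;          /* application-assigned environment ID */
    int fd;             /* assumed: a connected socket descriptor */
};

struct site_table {
    struct site sites[MAX_SITES];
    size_t      count;
};

/*
 * Register the channel for envid, replacing any existing entry.
 * Returns 0 on success, or -1 if the table is full.
 */
static int site_add(struct site_table *t, int envid, int fd)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->count; i++)
        if (t->sites[i].envid == envid) {
            t->sites[i].fd = fd;    /* site reconnected: swap channel */
            return 0;
        }
    if (t->count == MAX_SITES)
        return -1;
    t->sites[t->count].envid = envid;
    t->sites[t->count].fd = fd;
    t->count++;
    return 0;
}

/* Return the channel for envid, or -1 if the site is unknown. */
static int site_lookup(const struct site_table *t, int envid)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->count; i++)
        if (t->sites[i].envid == envid)
            return t->sites[i].fd;
    return -1;
}
```

A send interface written by the application could call site_lookup with the ID it is given and write the two data objects to the returned descriptor. In a multithreaded application the table would need to be protected by a lock, which is omitted here.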

The details of the transport mechanism are left entirely to the application; the only requirement is that the data buffer and size of each of the control and rec DBTs passed to the send function on the sending site be faithfully copied and delivered to the receiving site by means of a call to DB_ENV->rep_process_message with corresponding arguments. The DB_ENV->rep_process_message method is free-threaded; it is safe to deliver any number of messages simultaneously, and from any arbitrary thread or process in the Berkeley DB environment.
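One way to meet the copy-and-deliver requirement over a byte stream is to prefix the two buffers with their sizes. The sketch below illustrates that idea under stated assumptions: struct blob is a hypothetical stand-in for the data and size fields of a DBT, and the frame layout and the functions frame_pack and frame_unpack are inventions of this example, not anything defined by Berkeley DB.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the data and size fields of a DBT. */
struct blob {
    void    *data;
    uint32_t size;
};

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Load a 32-bit value stored in network byte order. */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Pack control and rec into one malloc'd frame laid out as
 *   [control size][rec size][control bytes][rec bytes]
 * Returns the frame (the caller frees it) and stores its length in
 * *lenp, or returns NULL if the allocation fails.
 */
static unsigned char *frame_pack(const struct blob *control,
    const struct blob *rec, size_t *lenp)
{
    size_t len;
    unsigned char *frame;

    assert(control != NULL && rec != NULL && lenp != NULL);
    len = 8 + (size_t)control->size + (size_t)rec->size;
    if ((frame = malloc(len)) == NULL)
        return NULL;
    put_u32(frame, control->size);
    put_u32(frame + 4, rec->size);
    if (control->size != 0)
        memcpy(frame + 8, control->data, control->size);
    if (rec->size != 0)
        memcpy(frame + 8 + control->size, rec->data, rec->size);
    *lenp = len;
    return frame;
}

/*
 * Unpack a frame on the receiving site. The blobs point into the
 * frame, so the frame must outlive them. Returns 0 on success, or -1
 * if the frame is truncated or its sizes do not match its length.
 */
static int frame_unpack(unsigned char *frame, size_t len,
    struct blob *control, struct blob *rec)
{
    uint32_t csize, rsize;

    if (len < 8)
        return -1;
    csize = get_u32(frame);
    rsize = get_u32(frame + 4);
    if (csize > len - 8 || rsize != len - 8 - csize)
        return -1;
    control->data = frame + 8;
    control->size = csize;
    rec->data = frame + 8 + csize;
    rec->size = rsize;
    return 0;
}
```

On the receiving site, the application would copy the unpacked data pointers and sizes into two DBTs and pass them to DB_ENV->rep_process_message. Sending the sizes explicitly lets either buffer be empty without ambiguity.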

The DB_ENV->rep_process_message method has several informational return values, each of which calls for action by the application:

When DB_ENV->rep_process_message returns DB_REP_DUPMASTER, it means that another database environment in the replication group also believes itself to be the master. The application should complete all active transactions, close all open database handles, reconfigure itself as a client using the DB_ENV->rep_start method, and then call for an election by calling the DB_ENV->rep_elect method.

When DB_ENV->rep_process_message returns DB_REP_HOLDELECTION, it means that another database environment in the replication group has called for an election. The application should call the DB_ENV->rep_elect method.

When DB_ENV->rep_process_message returns DB_REP_NEWMASTER, it means that a new master has been elected. The call will also return the local environment's ID for that master. If the ID of the master has changed, the application may need to reconfigure itself (for example, to redirect update queries to the new master rather than the old one). If the new master is the local environment, then the application must call the DB_ENV->rep_start method and reconfigure the supporting Berkeley DB library as a replication master.

When DB_ENV->rep_process_message returns DB_REP_NEWSITE, it means that a message from a previously unknown member of the replication group has been received. The application should reconfigure itself as necessary so it is able to send messages to this site.

When DB_ENV->rep_process_message returns DB_REP_OUTDATED, it means that the environment has been partitioned from the master for too long a time, and the master no longer has the necessary log files to update the local client. The application should shut down, and the client should be reinitialized (see "Initializing a new site" for more information).
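The responses to the informational returns described above can be summarized as a small decision function. This is a sketch only: the REP_* codes are placeholders standing in for the DB_REP_* constants (a real application would include db.h and switch on those constants directly), and the ACT_* flags and the plan_actions function are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder codes standing in for the DB_REP_* return values. */
enum rep_ret {
    REP_OK = 0,
    REP_DUPMASTER,
    REP_HOLDELECTION,
    REP_NEWMASTER,
    REP_NEWSITE,
    REP_OUTDATED
};

/* Hypothetical application actions, expressed as bit flags. */
enum rep_action {
    ACT_NONE          = 0,
    ACT_CLOSE_HANDLES = 1 << 0, /* finish transactions, close handles */
    ACT_START_CLIENT  = 1 << 1, /* DB_ENV->rep_start as a client */
    ACT_START_MASTER  = 1 << 2, /* DB_ENV->rep_start as a master */
    ACT_ELECT         = 1 << 3, /* DB_ENV->rep_elect */
    ACT_REDIRECT      = 1 << 4, /* send update queries to new master */
    ACT_ADD_SITE      = 1 << 5, /* open a channel to the new site */
    ACT_REINIT        = 1 << 6  /* shut down, reinitialize the client */
};

/*
 * Map a return code to the set of actions the application should take.
 * self_id is the local environment's ID, old_master the master known
 * before the call, and new_master the ID returned with REP_NEWMASTER.
 */
static unsigned plan_actions(enum rep_ret ret,
    int self_id, int old_master, int new_master)
{
    switch (ret) {
    case REP_DUPMASTER:
        return ACT_CLOSE_HANDLES | ACT_START_CLIENT | ACT_ELECT;
    case REP_HOLDELECTION:
        return ACT_ELECT;
    case REP_NEWMASTER:
        if (new_master == self_id)
            return ACT_START_MASTER;
        return new_master != old_master ? ACT_REDIRECT : ACT_NONE;
    case REP_NEWSITE:
        return ACT_ADD_SITE;
    case REP_OUTDATED:
        return ACT_REINIT;
    case REP_OK:
    default:
        return ACT_NONE;
    }
}
```

Keeping the decision separate from the code that carries it out makes the policy easy to test without a running replication group; the thread that receives messages would call such a function after each DB_ENV->rep_process_message call and then perform the indicated steps.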


Copyright Sleepycat Software