OpenMulticast: Reliable Multicast Communication Protocol


Welcome
The OpenMulticast project implements a reliable IP Multicast communication protocol for the distribution of real-time information.

It is a "best effort" reliable multicast protocol that is well suited for trading floors, where real-time feeds are distributed to hundreds of traders' workstations.
It scales well to a large number of clients by not maintaining any group membership and by using a purely nack-based retransmission scheme.
CPU, memory, and network usage stay constant, even with a large number of subscribers.

Architecture
The OpenMulticast project consists of a collection of communication layers providing different functionality that can be assembled as the user sees fit.
The base layers are:
  • Multicast Network Layer: Implements the sending and receiving of UDP multicast datagrams.
    It also includes the discovery and configuration of the network interface(s) enabled for OpenMulticast.
  • Multicast Transport Layer: Implements the FIFO ordering and retransmission of messages.
    This is basically the heart of the OpenMulticast communication protocol and is detailed further in the following paragraphs.
  • Fragmentation Layer: Fragments large messages into smaller ones before sending and reconstructs the original message from the fragments on the receiver side.
    This keeps UDP datagrams at their optimal size for the network.
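The fragmentation/reassembly idea above can be sketched as follows. This is an illustrative Python sketch, not the actual OpenMulticast API; the function names and the 1400-byte payload limit are assumptions:

```python
def fragment(message: bytes, max_payload: int = 1400):
    """Split a message into (index, total, payload) fragments that each fit one UDP datagram."""
    total = (len(message) + max_payload - 1) // max_payload
    return [(i, total, message[i * max_payload:(i + 1) * max_payload])
            for i in range(total)]

def reassemble(fragments):
    """Rebuild the original message from its fragments, whatever order they arrived in."""
    fragments = sorted(fragments, key=lambda f: f[0])
    assert len(fragments) == fragments[0][1], "missing fragments"
    return b"".join(payload for _, _, payload in fragments)
```

For example, a 3000-byte message yields three fragments, and reassembly restores the original even if the fragments arrive out of order.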
Additional layers:
  • Asynchronous layer: Decouples the receiver thread from the delivery thread.
    This layer can be configured to disconnect slow consumers when the receiver's backlog grows beyond configurable limits (time, number of messages, or size of the backlog).
  • TCP network layer: Can be used to convert the OpenMulticast traffic to TCP for reaching remote clients.
  • Compression layer.
  • Encryption layer.
  • Batch layer: Used to improve network usage by batching messages together.
  • Throttling layer: Used to prevent flooding the network with a producer that would saturate the network bandwidth.
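The throttling layer described above is commonly built as a token bucket. The following Python sketch illustrates that general technique under assumed parameter names; it is not the OpenMulticast implementation:

```python
import time

class Throttle:
    """Token-bucket rate limiter: at most `rate` bytes per second on average,
    with short bursts of up to `burst` bytes allowed."""
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, time.monotonic()

    def acquire(self, size: int) -> float:
        """Consume `size` tokens; return how long the caller should sleep before sending."""
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= size
        # A negative balance means the sender got ahead of the allowed rate.
        return max(0.0, -self.tokens / self.rate)
```

A producer would call `acquire(len(datagram))` before each send and sleep for the returned duration, smoothing its output to the configured bandwidth.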
Slow Clients
With a large number of receivers, it is to be expected that some of them will misbehave, at least temporarily.
Clients sometimes experience CPU or memory resource issues (for example, during the start of a large application) or temporary network problems. When such a situation occurs, it should not affect the "healthy" receivers.
Some reliable multicast protocols implement flow control. In our case this feature is typically not desirable, as a slow consumer could delay the healthy ones.
For a real-time financial market feed, it is more important to continue delivering the feed to the healthy subscribers than to maintain the connection to slow clients.
Retransmissions
The server maintains a backlog of the recently sent messages and retransmits them upon request (nack).
Therefore, if a slow client recovers fast enough, it is given a chance to catch up.
Different criteria can be used to configure how long messages remain in the backlog. These are:
  • Number of messages in the backlog.
  • Total size of the messages in the backlog.
  • Maximum time messages are retained in the backlog.
The rate of retransmissions is limited (throttled) by the publisher in order to prevent saturation of the publisher CPU & network.
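The three backlog criteria above can be pictured with a small sketch. This is illustrative Python under assumed names, not the actual implementation; the publisher trims old messages whenever any limit is exceeded:

```python
import time
from collections import deque

class Backlog:
    """Retains recently sent messages, trimmed by count, total size, and age."""
    def __init__(self, max_count: int, max_bytes: int, max_age: float):
        self.max_count, self.max_bytes, self.max_age = max_count, max_bytes, max_age
        self.entries = deque()   # (seqno, timestamp, payload), oldest first
        self.total = 0

    def add(self, seqno: int, payload: bytes):
        self.entries.append((seqno, time.monotonic(), payload))
        self.total += len(payload)
        self._trim()

    def _trim(self):
        now = time.monotonic()
        while self.entries and (
                len(self.entries) > self.max_count
                or self.total > self.max_bytes
                or now - self.entries[0][1] > self.max_age):
            _, _, old = self.entries.popleft()
            self.total -= len(old)

    def get(self, seqno: int):
        """Return the payload for a nack'ed seqno, or None if it was already evicted."""
        for s, _, payload in self.entries:
            if s == seqno:
                return payload
        return None
```

A nack for a sequence number still in the backlog triggers a retransmission; a nack for an evicted one means the slow client cannot catch up and must be disconnected or resynchronized.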
Nack Storms
Multicast messages can get lost at the sender side or when the network is encountering a temporary glitch.
This can cause all, or a large number of, subscribers to miss the same message(s).
In some reliable multicast protocols, this scenario can lead to the "nack storm" syndrome, where so many nack and retransmission messages are transmitted simultaneously that network resources saturate.
OpenMulticast is designed from the start not to suffer this issue. This is achieved by the following:
  • Time randomization for sending the nacks.
  • Nacks are sent via multicast.
  • Retransmissions are sent via multicast so that all subscribers can receive them.
  • Subscribers listen for each other's nacks and retransmissions and make sure not to submit redundant nacks within a configurable delay.
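The combination of randomized nack timing and nack suppression can be sketched as below. This is an illustrative Python model with assumed names and default values, not the actual protocol code:

```python
import random
import time

class NackScheduler:
    """Randomizes nack timing and suppresses nacks already heard on the wire."""
    def __init__(self, max_jitter: float = 0.1, suppress_window: float = 0.5):
        self.max_jitter = max_jitter
        self.suppress_window = suppress_window
        self.recently_nacked = {}   # seqno -> time a nack for it was last heard

    def schedule(self, seqno: int, now: float = None):
        """Return the delay before sending our own nack, or None to suppress it."""
        now = time.monotonic() if now is None else now
        heard = self.recently_nacked.get(seqno)
        if heard is not None and now - heard < self.suppress_window:
            return None                               # another subscriber already asked
        return random.uniform(0, self.max_jitter)     # desynchronize the nacks

    def on_nack_heard(self, seqno: int, now: float = None):
        """Called when a nack from another subscriber arrives (nacks are multicast)."""
        self.recently_nacked[seqno] = time.monotonic() if now is None else now
```

Because nacks and retransmissions are multicast, one subscriber's nack (and the resulting retransmission) typically satisfies every subscriber that lost the same message, so the storm never builds up.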
Failure detection
The publisher can be configured to send keepalive messages when it has not sent any message for a configurable time.
Failure detection triggers if no message has been received from the publisher after a configurable grace period (normally three times the keepalive period).

If you do not want to use failure detection, you might still want to enable the keepalive messages, as they prevent the network infrastructure from closing multicast routes when there is no traffic.
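The grace-period rule above amounts to a simple timer that any received message resets. A minimal sketch (illustrative Python, assumed names; the factor of three follows the guideline above):

```python
class FailureDetector:
    """Flags a publisher as failed after a grace period with no traffic."""
    def __init__(self, keepalive_period: float = 1.0, grace_factor: int = 3):
        self.grace = keepalive_period * grace_factor
        self.last_seen = None

    def on_message(self, now: float):
        self.last_seen = now   # any message, data or keepalive, resets the timer

    def is_failed(self, now: float) -> bool:
        return self.last_seen is not None and now - self.last_seen > self.grace
```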

High Availability of the publisher
It is possible to implement high availability of the publisher by transferring the failed publisher's backlog to another instance of the publisher. This can be achieved by replicating the backlog with active replication (e.g., using JGroups) or by storing the backlog in a database that is reloaded when the backup takes over.
Implementation status
The base layers are fully implemented. They provide the reliable multicast protocol features.
Some of the additional layers are also implemented; the others will be added to the project shortly.

Implemented:

  • Multicast Network Layer.
  • Multicast Transport Layer.
  • Fragmentation Layer.
  • Encryption layer.
  • Configure the protocol stack using an XML configuration file.
  • Asynchronous layer.
  • TCP network layer.
  • Throttling layer.
  • Compression layer.
  • Batch layer.
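To give an idea of what the XML stack configuration mentioned above could look like, here is a hypothetical example; every element and attribute name below is an illustrative assumption, not the actual OpenMulticast schema:

```xml
<stack>
  <multicast-network group="239.1.2.3" port="7500"/>
  <multicast-transport backlog-max-messages="10000"
                       backlog-max-bytes="8388608"
                       backlog-max-age-ms="30000"/>
  <fragmentation max-payload="1400"/>
  <compression/>
  <asynchronous disconnect-after-ms="5000"/>
</stack>
```

The layers are listed bottom-up, mirroring the architecture section: each element configures one layer, and the assembled stack passes messages down through the layers on send and back up on receive.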
Under development:
  • Trace layer.
  • Improved coverage of MulticastTransportLayer unit tests.

Benoit Jardin