Lab 8: Gossip Protocols
- Due No Due Date
- Points 200
- Submitting a text entry box, a website url, a media recording, or a file upload
Objective
The purpose of this lab is to gain experience in peer-to-peer messaging using a gossip protocol by building a simple chat system using picos.
Reading and Reference
Read the following:
- Gossip Protocol Links to an external site. - Wikipedia page is a nice intro
- The promise, and limitations, of gossip protocols Links to an external site.
- Gossip Dissemination (Links to an external site.) - This is a easy to understand article, but keep in mind that Fowler is describing a system where all the nodes could send a message to any other node if they chose. In the system you'll build, nodes will have subscriptions to a group of other nodes. Picos can only communicate with picos they are directly adjacent to in the subscription graph.
The following video may be helpful:
Gossip Video
Links to an external site.
You may find the following resources useful:
- Using Gossip Protocols For Failure Detection, Monitoring, Messaging, And Other Good Things Links to an external site.
- Gossip protocols for large-scale distributed systems Links to an external site. - tutorial slides
- RFC 1036 - Standard for Interchange of USENET Messages Links to an external site. - Old-style USENET news used a variant of a gossip protocol.
Prerequisites
You should have completed:
- Lab 7: Reactive Programming Patterns
Implementation Notes
- This can be a difficult topic. You are going to be thinking in ways you likely haven't before about asynchronous, distributed processes. If you have any questions about the gossip protocol, please ask the TA's. Leave yourself plenty of time.
- By now, you have considerable experience with picos. Consequently, there is considerable leeway in how you architect the solution to this problem. You will be picking function and event names. Be sure to follow previous guidelines for naming.
- You will use scheduled events Links to an external site. in this lab to 'wake' a pico up to propagate messages. You will need a setup or initialization rule that responds to a startup event that sets the schedule for the pico. Make sure the scheduling period can be updated.
- When choosing what message type to send, the random Links to an external site. library may be of use.
- One implementation details we can conveniently ignore for an assignment like this is storage. Don't try to conserve storage for this assignment. Don't optimize pre-maturely.
- You may find it handy to have sensor picos respond to a `initialization` event and provide a rule in the sensor management pico that sends it to all its child picos. This will allow you to easily reset the state of your system as you experiment.
- Similarly, ensuring that your rulesets have a ruleset initialization rule that sets any required state is useful for easily restarting the entire ecosystem. The scatter-gather pattern is useful for envisioning how this works, although you may not, for this lab, need to gather up much information from the children.
- You will use subscriptions to create the graph of picos that constitutes the Gossip network. You can do this manually for this lab.
- Part of this assignment is to create an eventually consistent, shared state of the temperature readings for each sensor in the network. Be consistent. A pico's own temperature readings, for example, are just another part of the shared state. Don't make it a special case.
- You will save yourself time if you use the Sensor Management pico to manage your Sensor Picos. For example, rules to clear the relevant state in the children, start and stop the gossip heartbeat, adjust the emitter and gossip periods, etc. will make it easy to manage your network and debug problems.
Gossip Protocols and Picos
In this lab you'll use a gossip protocol to ensure that everyone can see everyone else's temperature readings even if they don't have direct connections to them--so long as there is a fully connected graph between each other's picos.
Gossip protocols are very important in peer to peer (heterarchical) systems. For example, most blockchains make use of gossip protocols for communication between nodes.
A gossip protocol is useful for distributing messages in a graph of connected (but not fully connected) nodes. The foundation of your gossip system will be messages that are exchanged between different picos.
In most gossip protocol implementations, you’d pass messages over a cheap, but less reliable protocol such as UDP. For simplicity, we’re going to use events between picos since you already know how to use them.
In a gossip-based messaging system, not every peer knows about all the others, so each one must be able to forward messages it has received to other nodes who might not yet have received them, while ensuring that this forwarding does not cause infinite loops (e.g., A sends a message to B, which sends it back to A, which sends it back to B, etc.).
The point of exchanging gossip messages is to create an eventually consistent, shared state of the temperature readings for each sensor in the network. In Lab 8: Reactive Programming Patterns you used a distinguished leader (the sensor pico) to coordinate the scatter-gather algorithm and create a temperature report. But what if we can't use a leader who might become a central point of failure? Gossip protocols can solve this problem.
In this lab, you will create a gossip network between the temperature sensor picos so that they can all get all the temperature data from all of the nodes. Rather than a leader who scatters the event giving the need for a report, the nodes will all share temperature information with any nodes they are connected to and build a report from that.
The idea is to create a data structure that shows the latest temperatures from all the sensors in a network (created via pico subscriptions). You should be able to query this data structure at any node and get (roughly, modulo propagation delays) the same data as from any other node.
Message IDs
Each rumor message will need a unique ID (a correlation identifier) that can be used to keep track of gossip messages. You will use this ID to ensure there are no loops, find nodes to forward messages to, and keep messages separated.
Message IDs will consist of two parts (for reasons that will become clear in a minute):
- A unique origin ID. You can use the random library to give your pico a unique name. Make sure you save it in an entity variable. This can be set in a pico initialization rule.
- A sequence number that distinguishes successive messages from a given origin. Each gossip node will assign sequence numbers to messages consecutively starting with 0. You'll need to keep track of the current sequence number in an entity variable.
We’ll separate the two parts with a colon, so a complete message ID might look like this:
ABCD-1234-ABCD-1234-ABCD-1234:5
Message IDs made in this way make it easy for peers to compare notes on which messages from which other peers they have or have not yet received. For example, if A has seen messages originating from C up to sequence number 5, and compares notes with B who has seen C’s messages only up to sequence number 3, then A knows that it should propagate C’s messages 4 and 5 to B.
Messages
There are two kinds of messages:
- Rumor Message: contains the text of the user message to be gossiped. The message is a JSON object containing the following fields:
MessageID
—a string containing the unique ID for this message as described above. If you originated the message, it will be one you generated, otherwise you will use the message ID already in the messageSensorID
—a string giving the ID (the first part of the message ID) of the sensor originating the temperatureTemperature
—a string containing the temperatureTimestamp
—an ISO Datetime stamp of when the temperature was recorded.
Here’s an example:
{
"MessageID": "ABCD-1234-ABCD-1234-ABCD-1234:5", "SensorID": "BCDA-9876-BCDA-9876-BCDA-9876", "Temperature": "78",
"Timestamp": <ISO DATETIME>,
}
- Seen Message: summarizes the set of messages the sending peer has seen so far. A seen message is a JSON object containing zero or more of the following fields:
<OriginID>
—the keys are the origin IDs that the sender knows about. The number associated with each key is the last message sequence number the node has seen from thisOriginID
. This number will be considered the highest complete sequence number. i.e. if a pico stored messages 0,1,2, and 4, then its highest reported sequence number should be 2 because it is still missing 3, even though it has stored 4. This requirement will not seem needed for this lab, but you'll see if's crucial in some forms of gossiping in the next lab.
Here’s an example:
{"ABCD-1234-ABCD-1234-ABCD-125A": 3, "ABCD-1234-ABCD-1234-ABCD-129B": 5, "ABCD-1234-ABCD-1234-ABCD-123C": 10 }
You can assume that this is a complete list (i.e., it contains all the nodes and messages that the sending node knows about).
Propagating Rumors
Each node will run the following message propagation algorithm:
when gossip_heartbeat {
subscriber = getPeer(state)
m = prepareMessage(state, subscriber)
send (subscriber, m)
update(state)
}
A scheduled event from the pico will periodically send itself a gossip_heartbeat
event that will cause a message to be prepared and sent to one of the pico's Gossip subscriptions. For scheduled events, use a "one time event" that is simply scheduled at the end of your gossip heartbeat
event. This allows us the change the value of n according to an entity variable. In the scheduled event
Links to an external site. docs, you can change "minutes" to "seconds" for the value of n.
The functions operate as follows:
getPeer()
—uses the current state to determine one peer to send a message to. This is not as simple as picking a node at random. You must choose a peer that needs something from you. If the peer knows your current temperature and it's last seen message indicates that it has seen everything you already know, then you don't need to send it anything. You should try to be fair, not simply pick the first node who needs something from you. You're going to have to keep state about who knows what (to the best of your knowledge) and use that state to determine which peer to send a message to.prepareMessage()
—return a message to propagate to a specific neighbor; randomly choose a needed message type, and if it's a rumor, which message.update()
—update state of who has been sent what.send()
—send a message to the peer
Note that this is just pseudocode. You may find it advantageous to combine, for example, getting the peer and preparing the message into a single action. Or you may need to split these apart into smaller units.
Each pico should also include two rules for responding the messages: one for responding to rumor
events and one for responding to seen
events. Use gossip rumor
and gossip seen
respectively for the domain and type to select on those events.
The rule for responding to rumors should store the information in the rumor in an entity variable representing what the pico knows about the state of the outside world. A special case to watch out for would be where pico A sends a rumor to B with sequence number 5, but B has only seen up until 3. Just store number 5 and be sure to only report the highest complete sequence number as 3 when Pico B sends a seen
message to another pico. This will allow any gaps to be filled.
The rule for responding to seen
events should check for any rumors the pico knows about that are not in the seen message and send them (as rumors) to the pico that sent the seen event. Note that how this is done affects the amount of time it takes for the network to reach consistency. For example, you could just send one needed piece of information (the stingy algorithm) or all of the needed information.
Adding Messages
The gossip ruleset should function independently of the temperature sensor rulesets. The sensor will continue to update the temperature from the sensor as it always has. The gossip heartbeat
event causes the gossip system to function, not the sensor. In other words, you should not alter the temperature ruleset to raise the gossip heartbeat
event.
Each pico can consider its temperature as just one more piece of state that gets updated, independent of how it gossips with others about that state. Consequently, you don't need a special message to tell others about your own temperature, it's just another rumor that is selected according to your algorithm. Your sequence number should only be incremented when you send a rumor about your own temperature, not every time there's a new temperature.
Note that the gossip heartbeat period should be less than the sensor heartbeat period or else gossip can never keep up.
Adding Peers
The mechanism that keeps track of a dynamically modifiable list of peers is subscriptions. This list will be the set of neighbors you pick from to gossip with. Each pico can start with a fixed set of peers that are manually added, but you will need a way to add new peers (manually adding a new subscription between two sensor picos will be fine). Ensure that the roles on the subscription are defined in a way that allows you to send gossip to other sensor peers rather than any pico that has a subscription. To do this, use Tx_role and Rx_role in each subscription to simply be "node".
The above functionality should give your application everything it needs, at least in theory, to operate “at large” over the Internet.
Do This
- Implement the simple gossip scheme described above.
- Test it by running some picos of your own.
- Subscribe picos to each other to form the connections between them. They only know about nodes they have a direct subscription to. You can fake the temperature updates for this test.
- You'll need to start enough for the test to be interesting
- The picos will have to connect to each other somehow. Don't create a fully connected graph. You could do this manually or have a setup process that randomly connects them.
- Send a startup event to each pico to set its schedule, or you could just select on the
wrangler ruleset_added
event to start the gossip on installation in a pico. Remember to schedule the event according to some customizable time intervaln
. - Vary the value of
n
, the length of time in seconds between raising the scheduled events. What do you observe? - Add the ability to subscribe to dynamic peers in your gossip application (manually setting up a new subscription is also acceptable).
- Connect your temperature pico to ones from others in the class. Make sure messages propagate. If you all connect to a few others, you could potentially have the temperatures from all the sensors in class on your node.
- Devise a way using entity variables and conditional rules so you can switch off message processing on a given node by sending it a
process
event with a status attribute ofon
oroff
. Use this to stop processing on one pico. Send some messages among the other nodes. Turn the stopped pico back on. Did it catch up?
Deliverables
Turn in the following:
- URLs for the rulesets you created
- Short screencast (< 5 min with sound) showing how your system works, message propagation, how changing the sleep time affects message propagation, how peers are added, and operation of the system after a new peer is added.
- Answers to the following questions:
Questions
- This lab uses a vector clock Links to an external site. algorithm to create unique message IDs based on a sequence number. Could we replace the sequence number with a timestamp? What are the advantages and disadvantages of such an approach?
- Are the temperature messages in order? Why or why not? If not, what could you do to fix this?
- How did you avoid looping (sending messages back to someone who already has it)? Why was the unique ID helpful?
- The propagation algorithm sleeps for
n
seconds between each iteration. What are the trade-offs between a low and high value forn
. - Did new messages eventually end on all the nodes that were connected? Were the messages displayed in the same order on each node? Why or why not?
- Why does temporarily disconnecting a node from the network not result in permanent gaps in the messages seen at that node?
- Describe, in a paragraph or two, how you could use the basic scheme implemented here to add failure detection to the system using a reachability table.