Time, Clocks, and the Ordering of Events in a Distributed System

Abstract:

The concept of one event occurring before another is shown to define a partial ordering of events.
A distributed algorithm for synchronizing a system of logical clocks is given, which can be used to totally order the events.
The use of the total ordering is illustrated with a method for solving synchronization problems.
This algorithm is then specialized for physical clocks, and a bound is derived on how far out of sync the clocks can become.

Introduction:

Distributed System: Collection of distinct processes that are spatially separated and communicate with each other using message passing.
Any system is distributed if the message transmission delay is not negligible compared to the time between events in a single process.

Partial Ordering:

Intuitively, event A happened before event B if A occurred at an earlier physical time than B.
This definition, however, relies on actual physical time and hence requires physical clocks to exist in order to define physical timestamps.
Even if there are real clocks, the ordering is not guaranteed to be completely accurate, since clocks may not be precise. The Happened-Before relation is therefore defined without using physical clocks.
Assumption: The events of a single process form a sequence, where 'a' occurs before 'b' in this sequence if 'a' happens before 'b'. That is, the events within one process are totally ordered.
Happened-Before : Relation denoted by '->' which satisfies the following conditions:
1. If 'a' and 'b' are events in the same process, and 'a' comes before 'b', then a->b.
2. If 'a' denotes sending of a message and 'b' denotes receipt of the same message, then a->b.
3. If a->b and b->c, then a->c.
Two events 'a' and 'b' are said to be concurrent if a !-> b and b !-> a.
a->b also means that it is possible for event 'a' to causally affect event 'b'.
Two events are concurrent if neither can causally affect the other.
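
The three conditions above can be checked mechanically. The following is a minimal sketch, not something taken from the paper: the helper names `happened_before` and `concurrent`, the event labels, and the two-process trace are all hypothetical. It builds the relation from program order and message edges, then takes the transitive closure.

```python
from itertools import product

def happened_before(process_events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    process_events: dict mapping a process name to its events in local order.
    messages: list of (send_event, receive_event) pairs.
    """
    events = [e for seq in process_events.values() for e in seq]
    hb = set()
    # Condition 1: earlier events in the same process come first.
    for seq in process_events.values():
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                hb.add((seq[i], seq[j]))
    # Condition 2: sending a message happens before receiving it.
    hb.update(messages)
    # Condition 3: transitive closure (Warshall style, intermediate event k
    # varies slowest because it is the first component of the product).
    for k, a, b in product(events, repeat=3):
        if (a, k) in hb and (k, b) in hb:
            hb.add((a, b))
    return hb

def concurrent(a, b, hb):
    # Two distinct events are concurrent if neither happened before the other.
    return a != b and (a, b) not in hb and (b, a) not in hb

# Hypothetical trace: P sends a message at p1 that Q receives at q2.
trace = {"P": ["p1", "p2"], "Q": ["q1", "q2"]}
hb = happened_before(trace, [("p1", "q2")])
print(("p1", "q2") in hb)          # True: send before receive
print(concurrent("p2", "q2", hb))  # True: no causal path either way
```

The cubic closure is only meant for tiny traces like this one; it is chosen for readability, not efficiency.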

Logical Clocks:

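Consider a clock function C_i for each process P_i that assigns a numerical value C_i(a) to each event 'a' in that process. The system-wide clock C is defined by C(b) = C_j(b) if 'b' is an event in process P_j.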
This clock function doesn't have to be related to the physical time at all. It could be something as trivial as a counter mechanism and is just a means of ordering.
Clock Condition: For any events a, b: if a -> b, then C(a) < C(b).
Note: The converse cannot be required, since it would imply that any two concurrent events must occur at the same time. {Check diagram from source}
The Clock Condition holds if the following two conditions are met:
1. If a, b belong to Process P_i and 'a' comes before 'b', then C_i(a) < C_i(b).
2. If a denotes sending a message from process P_i and 'b' denotes receipt of the message at process P_j, then C_i(a) < C_j(b).
Implementation Rules:
1. Each process P_i increments C_i between any two successive events.
2. a) If event 'a' denotes sending a message 'm' by process P_i, then the message 'm' contains a timestamp T_m = C_i(a).
  b) Upon receiving 'm', process P_j sets the value of C_j greater than or equal to its present value AND greater than T_m.
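
Rules 1 and 2 translate almost directly into code. The sketch below is illustrative only: the `LamportClock` class and its method names are hypothetical, and it assumes integer clocks that advance by exactly 1, which is one valid choice among many.

```python
class LamportClock:
    """Logical clock for a single process (hypothetical helper)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment the clock between successive local events.
        self.time += 1
        return self.time

    def send(self):
        # Rule 2a: sending is an event; the message carries T_m = C_i(a).
        return self.tick()

    def receive(self, t_m):
        # Rule 2b: move to a value >= the present one and > T_m.
        self.time = max(self.time, t_m) + 1
        return self.time

p, q = LamportClock(), LamportClock()
p.tick()               # local event on P, C_p = 1
t_m = p.send()         # send event on P, T_m = 2
q.tick()               # local event on Q, C_q = 1
print(q.receive(t_m))  # 3: greater than both Q's old value and T_m
```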

Ordering the Events Totally:

Define a relation '=>' such that, for an event 'a' in process P_i and an event 'b' in process P_j, a=>b ('a' precedes 'b') if and only if:
1. C_i(a) < C_j(b) OR
2. C_i(a) = C_j(b) and P_i < P_j
This relation defines a total ordering, and if a->b, then a=>b [from the Clock Condition]. Here '<' on processes is an arbitrary total ordering of the processes, used only to break ties.
NOTE: The ordering '=>' is not unique; it depends on the clocks used and on the arbitrary ordering chosen for the processes.
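
As a small illustration, the relation '=>' amounts to comparing (clock value, process) pairs lexicographically. In this sketch the integer process ids stand in for the arbitrary ordering of processes, and the event list is made up for the example.

```python
# Each event is (clock value, process id, label). The integer process ids
# are a hypothetical choice for the arbitrary ordering of processes.
events = [
    (2, 1, "b"),
    (1, 2, "c"),
    (2, 0, "a"),
    (3, 1, "d"),
]

def precedes(x, y):
    """x => y: smaller clock value first, ties broken by process id."""
    return (x[0], x[1]) < (y[0], y[1])

total_order = sorted(events, key=lambda e: (e[0], e[1]))
print([label for _, _, label in total_order])  # ['c', 'a', 'b', 'd']
```

Events "a" and "b" share clock value 2, so only the process ordering decides between them. A different process ordering would give a different, equally valid total order.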
Consider a mutual exclusion problem with 3 conditions:
1. A process which has been granted the resource must release it before the resource can be granted to another process.
2. Different requests for the resource must be granted in the order they are made.
3. If every process which is granted the resource eventually releases it, then all requests are eventually granted.
Assumptions:
1. For any two processes P_i and P_j, messages sent from P_i to P_j are received in the same order they are sent.
2. Every message is eventually received.
3. Every process can send messages to every other process.
We use the Implementation Rules to define a total ordering '=>' of all events.
Algorithm:
1. To request a resource, process P_i sends the message 'Tm:Pi requests resource' to every other process, and puts that message on its request queue. T_m is the timestamp of the message.
2. When process P_j receives this message, it puts the message on its request queue and sends a timestamped ack to P_i.
3. To release a resource, P_i removes any 'Tm:Pi requests resource' message from its request queue and sends a timestamped 'Pi releases resource' message to all other processes.
4. When process P_j receives this message, it removes any 'Tm:Pi requests resource' message from its request queue. No ack is needed.
5. Process P_i is given the resource if the following 2 conditions are met:
 a) There is a 'Tm:Pi requests resource' message in its request queue which is ordered before any other request in the queue. We use the relation '=>' to maintain this ordering.
 b) P_i has received a message (an ack or any other message) from every other process with a timestamp greater than T_m.
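
The five rules can be exercised in a small single-threaded simulation. This is a sketch under stated assumptions, not the paper's code: the `Process` class, the `deliver_all` driver, and the message tuples are hypothetical, messages are delivered reliably and in per-sender FIFO order, only requests are acknowledged, and any message with a later timestamp counts toward rule 5 b).

```python
from collections import deque

class Process:
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = 0
        self.queue = []      # request queue of (T_m, pid) pairs
        # Latest timestamp received from each other process.
        self.last_seen = {j: 0 for j in range(n) if j != pid}
        self.outbox = deque()  # pending (dest, kind, timestamp, sender)

    def _broadcast(self, kind, ts):
        for j in self.last_seen:
            self.outbox.append((j, kind, ts, self.pid))

    def request(self):                      # rule 1
        self.clock += 1
        self.queue.append((self.clock, self.pid))
        self._broadcast("request", self.clock)

    def release(self):                      # rule 3
        self.queue = [r for r in self.queue if r[1] != self.pid]
        self.clock += 1
        self._broadcast("release", self.clock)

    def on_message(self, kind, ts, sender):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[sender] = ts
        if kind == "request":               # rule 2
            self.queue.append((ts, sender))
            self.clock += 1
            self.outbox.append((sender, "ack", self.clock, self.pid))
        elif kind == "release":             # rule 4
            self.queue = [r for r in self.queue if r[1] != sender]

    def may_enter(self):                    # rule 5
        mine = [r for r in self.queue if r[1] == self.pid]
        if not mine:
            return False
        my_req = min(mine)
        first = min(self.queue) == my_req   # 5a: tuple order plays the role of '=>'
        heard = all(t > my_req[0] for t in self.last_seen.values())  # 5b
        return first and heard

def deliver_all(procs):
    # Deliver every pending message, preserving per-sender FIFO order.
    pending = True
    while pending:
        pending = False
        for p in procs:
            while p.outbox:
                dest, kind, ts, sender = p.outbox.popleft()
                procs[dest].on_message(kind, ts, sender)
                pending = True

procs = [Process(i, 3) for i in range(3)]
procs[0].request()
procs[1].request()   # concurrent request with the same timestamp
deliver_all(procs)
print([p.may_enter() for p in procs])  # [True, False, False]
procs[0].release()
deliver_all(procs)
print([p.may_enter() for p in procs])  # [False, True, False]
```

Both requests carry timestamp 1, so the process ordering breaks the tie in favor of process 0. Process 1 is granted the resource only after the release message removes process 0's request from its queue.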
Verifying the algorithm:
- Rule 5.b), coupled with the assumption that messages are received in order, guarantees that P_i has learned about all the requests which preceded its current request.
- Rules 3 and 4 are the only ones that delete messages from the request queue, which ensures that condition 1 holds.
- Since the total ordering '=>' extends the partial ordering '->', condition 2 holds.
- Rule 2 guarantees that Rule 5.b) eventually holds after a request, and Rules 3 and 4 guarantee that Rule 5.a) eventually holds provided every process granted the resource eventually releases it. Together these ensure condition 3.
This approach can be used to implement any desired form of synchronization for a distributed multiprocess system.
Perhaps the biggest shortcoming of this approach is that the failure of a single process halts the entire algorithm, since every process must take part.

Anomalous Behavior:

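Consider a situation in a distributed system where a process P_i sends a message M_i to process P_k on a far-off node. Another process P_j sends a message M_j to the same process P_k after P_i has sent its message.
Clearly M_i should be ordered before M_j. However, if nothing inside the system links the two send events (for example, the users coordinated by phone), M_j may carry a lower timestamp than M_i. Likewise, if M_i is delayed by network congestion, M_j may reach P_k first. Either way, the resulting order differs from what the users expect.
Two approaches are suggested for resolving this:
1. Introduce the missing ordering information into the system explicitly: the sender of M_j is told T_i and makes sure the timestamp T_j of M_j is greater than T_i, so the ordering is exactly what is expected.
2. Add the following clock condition to the system of clocks:
 Stronger Clock condition: For any events a, b in Set S, if a --> b then C(a) < C(b). Here S is the set of all system events, and '-->' denotes happened-before taken over system events together with the relevant external events; it is distinct from the total ordering '=>' defined earlier.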
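
The anomaly and the first approach can be shown with two plain counters. The clock values below are made up for illustration, and the variable names are hypothetical. P_i has been busy and has a high logical clock, while P_j has been idle.

```python
# Hypothetical clock values: P_i has been busy, P_j has been idle.
clock_i, clock_j = 9, 0

clock_i += 1
m_i = (clock_i, "i")       # M_i is sent first in real time, T_i = 10

clock_j += 1
m_j = (clock_j, "j")       # M_j is sent later in real time, but T_j = 1

print(sorted([m_i, m_j]))  # [(1, 'j'), (10, 'i')]: M_j is ordered first

# Approach 1: T_i is passed along out of band, and P_j advances its
# clock past T_i before sending.
t_i = m_i[0]
clock_j = max(clock_j, t_i) + 1
m_j_fixed = (clock_j, "j")
print(sorted([m_i, m_j_fixed]))  # [(10, 'i'), (11, 'j')]
```

The logical clocks are not at fault in the first ordering. They simply never saw any message connecting the two sends, so nothing forced T_j above T_i.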

Physical Clocks:

Assume clocks run continuously rather than in discrete ticks. C_i(t) represents the reading of clock C_i at physical time 't'. dC_i(t)/dt represents the rate at which the clock runs at time 't'.
All clocks must run at approximately the correct rate. That is, dC_i(t)/dt ~= 1.
Condition 1: There exists a constant k << 1 such that, for all i: |dC_i(t)/dt - 1| < k.
The clocks must also stay synchronized with one another. That is, C_i(t) ~= C_j(t).
Condition 2: There exists a sufficiently small constant e such that, for all i, j: |C_i(t) - C_j(t)| < e.
New implementation rules for clocks:
1. For each i, if P_i does not receive a message at time 't', then C_i is differentiable at t and dC_i(t)/dt > 0.
2. a) If P_i sends a message 'm' at time 't', then the message contains a timestamp T_m = C_i(t).
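  b) Upon receiving a message 'm' at time t₁, process P_j sets C_j(t₁) equal to max(C_j(t₁ - 0), T_m + u_m), where C_j(t₁ - 0) is the clock reading just before t₁ and u_m >= 0 is the minimum transmission delay of 'm', assumed known to P_j.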
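
Rule 2 b) and Condition 2 are simple enough to sketch directly. The function names `on_receive` and `satisfies_condition_2` are hypothetical, and the readings are made-up integers in milliseconds so that the arithmetic is exact.

```python
def on_receive(c_j_before, t_m, u_m):
    """Rule 2b: new reading of C_j at the moment a message arrives.

    c_j_before: reading of C_j just before arrival, C_j(t1 - 0).
    t_m: timestamp carried by the message.
    u_m: minimum transmission delay of the message (assumed known).
    """
    return max(c_j_before, t_m + u_m)

def satisfies_condition_2(readings, e):
    # Condition 2: every pair of clocks differs by less than e,
    # which is equivalent to max - min < e.
    return max(readings) - min(readings) < e

# Receiver is behind the sender: the clock jumps forward.
print(on_receive(100_000, 100_020, 5))  # 100025
# Receiver is already ahead: the clock is left unchanged.
print(on_receive(100_050, 100_020, 5))  # 100050
```

Because the rule only ever takes a maximum, a clock is never set backward, which keeps the Clock Condition intact for local events.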

Conclusion:

Described the happened-before relation, which defines a partial ordering of the events in the system.
Extended the partial ordering into a total ordering using logical clocks.
The total ordering is somewhat arbitrary and can show anomalous behavior if it disagrees with the ordering perceived by the system's users. This can be prevented by using properly synchronized physical clocks.
Described an algorithm for synchronizing physical clocks, with a theorem bounding how far out of sync they can drift.
