-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdescription.txt
43 lines (25 loc) · 3.95 KB
/
description.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
CSE 403 Assignment #2 (25 points)
Spring 2019
Distributed Replicated Hash Table (RHT):
This programming assignment involves the implementation of a distributed hash table where each of its element is replicated on multiple nodes (RHT) . The specification intentionally leaves undefined many components and designs. Please document your decisions/attempts/success/failures. They will be the main topic of the grading process, along with discussing the performance evaluation of your RHT.
The submission should be done through coursesite.lehigh.edu and should consist of:
- the source code of your assignment;
- a document containing the performance plots and the findings you consider worth to be shared with me (if any).
I will grade assignments during a one-on-one meeting with you. During the one-to-one meeting.
Good luck guys!
**** Submission deadline is Thursday 4th of April through coursesite.
========================
Specification:
- The RHT offers three APIs: put(K,V)/get(K)/put(<K1,V1>,<K2,V2>,<K3,V3>). The put(K,V) API is responsible for storing the value V associated with the key K. The get(K) API is responsible for retrieving the value associated with the key K. The put(<K1,V1>,<K2,V2>,<K3,V3>) stores the three keys K1,K2,K3 in the RHT atomically.
Keys are assumed to be of type String and values can be of any type (use generics).
- More in details, the semantics of the above operations is the following:
** The put(K,V) updates the key K in the RHT if K already exists; otherwise it adds it to the RHT. In both these cases, the put returns true. If the put operation cannot be performed due to concurrency with other operations on the same K, then the put returns false. If the put returns false, the application should retry the same put operation until it succeeds.
** The put(<K1,V1>,<K2,V2>,<K3,V3>) has the same semantics as the put(K,V), but applied to multiple keys, atomically.
** The get(K) should returns the consistent value associated with K, if any; otherwise NULL.
- Only conflicting operations should be prevented from acting in parallel (e.g., using locks). Any non-conflicting operation should be allowed to complete regardless the existence of other concurrent operations.
- Each key stored in the RHT is replicated on two different nodes in the system. For simplicity you can assume the range of keys is known a priori and their mapping to nodes is also known. Feel free to explore other solutions as well.
- It is expected that you run your RHT through a script file that starts up all the processes in your RHT. That means, you should not manually start your processes through different terminals. Since each process knows the composition of the distributed system, each process waits until all other processes are running and then it starts executing operations.
- Deploy the RHT on at least five processes. These processes can be five different nodes in sunlab, or on Amazon, or on any other platform that offers computing nodes.
- To test your DHT, write a simple application per process that activates eight threads and generates a configurable number of operations on the RHT (e.g., 100000)in a closed-loop. Each operation accesses random keys. You should have 20% of probability to execute a put on a single key, 20% to execute a put on multiple keys, and 60% of probability to execute a get.
- The test application should also collect performance metrics to be included in plots. Specifically, two metrics need to be collected: average throughput (i.e., the average number of operations per second performed by all processes) and average latency (i.e., the average time needed by the system to accomplish one operation). At least two plots need to be delivered: one plot should have on the y-axis the system throughout, and one plot should have on the y-axis the average latency. Both of them have the range of keys as x-axis. The ranges of keys to use are the following {10;100;1000;10000}.
- No specific programming language it is required to be used. Choose one wisely.