
Distributed Key Value Store System Inspired by Amazon DynamoDB

Design Principles

The design of this project follows Dynamo-style data partitioning, including the use of a consistent hash ring and replication policies, with a simplified consistency strategy. The communication framework uses gRPC.


Data Partitioning

The partitioning algorithm is based on a hash ring. Both nodes and keys are hashed into unique positions in the hash space, and these positions determine the key ranges each node is responsible for and which key range a key falls in.

The hashing function is Splitmix64, a fast 64-bit mixing function that produces deterministic, uniformly distributed tokens for nodes and keys. Because the hash is uniform, keys are spread evenly around the hash ring, which partitions them uniformly across the nodes' key ranges.
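Splitmix64's increment-and-mix step can be sketched as follows (a minimal Python sketch, assuming node and key identifiers have already been reduced to a 64-bit integer seed; the constants are the standard Splitmix64 constants):

```python
MASK64 = (1 << 64) - 1  # keep arithmetic in 64 bits, as in the C reference

def splitmix64(seed: int) -> int:
    # Advance the seed by the golden-ratio constant, then scramble it
    # with two multiply-xorshift rounds and a final xorshift.
    z = (seed + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return (z ^ (z >> 31)) & MASK64
```

The same seed always yields the same token, which is what makes node and key placement deterministic across the cluster.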

As an example, if node A is hashed to position 1 on the hash ring and node B to position 5 as the next node following A, then a key hashed to position 3 falls within the key range A–B and is therefore owned by node B, as illustrated in the figure below.

Figure 2: Partitioning and replication of keys in Dynamo ring.
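The owner lookup described above amounts to a binary search over the sorted node tokens, wrapping around at the top of the ring. A minimal sketch (function name is illustrative):

```python
import bisect

def primary_for(key_token: int, node_tokens: list[int]) -> int:
    # First node clockwise whose token is >= the key's token;
    # past the largest token, wrap around to the smallest one.
    tokens = sorted(node_tokens)
    i = bisect.bisect_left(tokens, key_token)
    return tokens[i % len(tokens)]
```

With the example above, `primary_for(3, [1, 5])` returns 5: the key at position 3 belongs to node B at position 5.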

The hashed value (token), IP address, port, and the version number of the hash ring used for maintaining consistency are stored in a Routing Table. This table is updated and maintained by the manager node, cached locally by each storage node, and fetched by the client during get or put operations.
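The routing-table fields listed above could be modeled roughly as below (field and type names are illustrative, not the project's actual schema):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class RouteEntry:
    token: int   # node's hashed position on the ring
    ip: str      # node's IP address
    port: int    # node's gRPC port

@dataclass
class RoutingTable:
    version: int                                   # hash-ring version, used for consistency
    entries: list[RouteEntry] = field(default_factory=list)
```

Storage nodes cache a `RoutingTable` locally, and clients fetch one before routing a get or put.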


Data Replication

Replication is handled by walking the hash ring and assigning replica nodes after the primary node. The primary node is defined as the first node whose token is greater than or equal to the key’s token.

For example, when the replication factor is 3 and a key K falls within the key range A–B, the replicas are assigned to the next two nodes following the primary node on the ring.

Because placement is fully determined by the hash ring and all nodes follow the same rules, every node knows exactly which key ranges it should store after any membership change. This keeps the replication algorithm simple, reliable, and deterministic.
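The placement rule above can be sketched as a preference-list computation (a minimal sketch; the function name is illustrative, and `rf` is the replication factor):

```python
import bisect

def preference_list(key_token: int, node_tokens: list[int], rf: int = 3) -> list[int]:
    # Primary = first node with token >= key token; replicas are the
    # next rf-1 nodes clockwise around the ring.
    tokens = sorted(node_tokens)
    start = bisect.bisect_left(tokens, key_token) % len(tokens)
    n = min(rf, len(tokens))  # never assign more replicas than nodes
    return [tokens[(start + j) % len(tokens)] for j in range(n)]
```

For example, with nodes at tokens [1, 5, 9, 13] and rf = 3, a key at token 3 is stored on the nodes at 5 (primary), 9, and 13.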


Data Consistency

Manager Node Behavior

The manager maintains two routing tables:

  • Current Table: Represents the state that has been synchronized with storage nodes in the previous round.
  • Next Table: Represents the new system state that has not yet been synchronized (including newly added nodes, removed failed nodes, and rehashed tokens).

Separating the two tables keeps the system stable and prevents overlap between old and new states.

The manager enforces consistency with storage nodes at a fixed interval (every 5 seconds in this implementation). During each synchronization round, the next table becomes the current table, and the manager broadcasts the updated routing table to all storage nodes.

Each storage node then uses this table to perform consistency operations and fetch key ranges it is newly responsible for.
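The two-table round described above can be sketched like this (a simplified sketch, assuming the table is a plain dict and `broadcast` is a callback standing in for the gRPC fan-out; names are illustrative):

```python
import copy
import time

class Manager:
    def __init__(self, table: dict):
        self.current = table                     # state synced in the previous round
        self.next_table = copy.deepcopy(table)   # pending changes accumulate here

    def sync_round(self, broadcast) -> None:
        # Promote the next table, then push it to every storage node.
        self.current = self.next_table
        self.next_table = copy.deepcopy(self.current)
        broadcast(self.current)

    def run(self, broadcast, interval: float = 5.0) -> None:
        # Fixed-interval synchronization loop (5 seconds in this project).
        while True:
            self.sync_round(broadcast)
            time.sleep(interval)
```

Membership changes (joins, failures, rehashed tokens) are applied only to `next_table`, so in-flight operations always see a stable `current` table until the next round.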


Client Behavior

The client.put() function writes each key–value pair to all primary and replica nodes. However, partial failures can lead to inconsistent versions across replicas. For example, with a replication factor of 3, two nodes may successfully apply an update while the third fails.

To handle this, each key is associated with a timestamp. During client.get(), the client queries all nodes that hold a copy of the key and returns the value with the most recent timestamp, ensuring the newest version is selected regardless of replica inconsistencies. Any inconsistencies are later corrected during synchronization initiated by the manager.
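The read-side resolution above is last-write-wins on the timestamp, which might look roughly like this (a sketch; the function name is illustrative, and `None` stands for a replica that failed to respond):

```python
def resolve_read(replies):
    # replies: one (value, timestamp) tuple per replica holding the key,
    # or None for replicas that did not answer; newest timestamp wins.
    live = [r for r in replies if r is not None]
    if not live:
        return None  # no replica reachable
    value, _ts = max(live, key=lambda vt: vt[1])
    return value
```

So even if one of three replicas missed an update, the client still returns the newest value as long as any replica that applied it responds.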


Storage Node Behavior

When a storage node receives a synchronization message from the manager containing the current routing table, it computes all token ranges it is now responsible for, including both primary and replica ranges.

The node then contacts other nodes that previously owned these ranges and fetches all key–value pairs that fall within them. Among replicas, the latest value is selected based on timestamps.
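The timestamp-based merge of fetched replicas might be sketched as follows (illustrative; assumes each previous owner returns a `{key: (value, timestamp)}` map for the requested range):

```python
def merge_fetched(snapshots):
    # snapshots: one {key: (value, timestamp)} map per previous owner;
    # for each key, keep the copy with the newest timestamp.
    merged = {}
    for snap in snapshots:
        for key, (value, ts) in snap.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return merged
```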

To ensure consistency, storage nodes wait for all peers to complete the fetch phase using a barrier mechanism implemented via a blocking gRPC call to the manager. Once all acknowledgments are received (or a timeout occurs), the node installs the new routing table, removes keys that no longer belong to its ranges, and inserts newly fetched keys.

The use of timestamps guarantees correctness even if a replica node fails during an update, since newer timestamps always dominate older values.

Experiment Usage

  • Run make to compile the project.
  • Run make validate to run the correctness tests.
  • Run make perf to run the throughput and load-balance performance tests.
  • Run make clean to remove build files.
  • Run make kill to terminate the manager and storage processes.

References

  1. Rosetta Code contributors. (n.d.). Pseudo-random numbers/Splitmix64. Rosetta Code.
    Retrieved November 14, 2025, from https://rosettacode.org/wiki/Pseudo-random_numbers/Splitmix64

  2. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A.,
    Sivasubramanian, S., Vosshall, P., & Vogels, W. (2007).
    Dynamo: Amazon’s highly available key-value store.
    Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP’07), 205–220.

About

A scalable and fault-tolerant distributed key-value store system inspired by the Amazon Dynamo paper: https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
