clustering #10

@wkalt

dp3 deployments will need to scale dynamically. There are some complications:

  • A given (producer, topic) pair must be routed to at most one node at a time. Zero owners (writes failing or blocking) is acceptable; more than one is not, because writes to a single tree must be serialized or data may be lost. After a node stops processing a (producer, topic), it must never process it again until it restarts and claims a higher range of versions than any other node has claimed (see the fencing sketch after this list).
  • To make things as easy as possible for the user, we want to minimize the number of deployed components. We will already have at least two in distributed environments (dp3 + rootmap), and adding more for cluster membership and state management will be unpleasant. Components written in Go that we can run in-process are attractive.
  • Since we don't want to place too many assumptions on the load balancer, we should assume any node can receive a write request and must then forward it to the node that can actually process it. Each node therefore needs an up-to-date shard mapping, and a request handled against an out-of-date mapping must be rejected or fail rather than land on the wrong node (see the routing sketch after this list).
  • Shard rebalancing must account for the write-ahead log: entries pending in the old owner's WAL presumably need to be flushed or handed off before ownership moves.
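
One standard way to enforce the single-writer rule is a fencing token: every claim on a (producer, topic) carries a monotonically increasing epoch (handed out by rootmap, or whatever coordination store we land on), and the write path rejects anything stamped with an epoch below the highest it has seen. A minimal sketch of that check; all names here are hypothetical, not existing dp3 code:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// shardKey identifies the unit of write ownership: one (producer, topic)
// pair maps onto one tree.
type shardKey struct {
	producer string
	topic    string
}

// errFenced signals a write from a node whose claim has been superseded.
var errFenced = errors.New("write fenced: claim epoch superseded")

// fencedWriter serializes writes per tree and rejects writes that carry a
// stale claim epoch. highestEpoch only ever increases, so a node that has
// lost a (producer, topic) can never write to it again until it acquires a
// fresh, higher epoch.
type fencedWriter struct {
	mu           sync.Mutex
	highestEpoch map[shardKey]uint64
}

func (w *fencedWriter) write(key shardKey, epoch uint64, data []byte) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if epoch < w.highestEpoch[key] {
		return errFenced // a newer claim exists; drop this write
	}
	w.highestEpoch[key] = epoch
	// ... append data to the tree while holding w.mu, so writes stay serialized ...
	return nil
}

func main() {
	w := &fencedWriter{highestEpoch: make(map[shardKey]uint64)}
	key := shardKey{producer: "robot-1", topic: "/camera"}

	fmt.Println(w.write(key, 2, []byte("new owner"))) // <nil>
	fmt.Println(w.write(key, 1, []byte("old owner"))) // write fenced: claim epoch superseded
}
```

The property doing the work is that highestEpoch never decreases, so a node that has lost a shard cannot slip in a late write even if it still believes it is the owner.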

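For routing, each node could cache a shard map tagged with the version of cluster state it was derived from: a write that lands on a non-owner is forwarded to the owner, and a node whose cached map is older than the version the sender acted on rejects the request rather than guess. A rough sketch of that decision logic, again with hypothetical names:

```go
package main

import (
	"errors"
	"fmt"
)

// routeKey mirrors the (producer, topic) unit of ownership.
type routeKey struct {
	producer string
	topic    string
}

// shardMap is one node's cached view of shard ownership, tagged with the
// version of the cluster state it was derived from.
type shardMap struct {
	version uint64
	owners  map[routeKey]string // (producer, topic) -> node ID
}

var errStaleMap = errors.New("shard map stale; reject so the client retries")

// route decides how a node should handle an incoming write. senderVersion
// is the map version the sender acted on; if our cached map is older, we
// cannot trust our own view and must fail the request instead of guessing.
func (m *shardMap) route(self string, key routeKey, senderVersion uint64) (forwardTo string, err error) {
	if m.version < senderVersion {
		return "", errStaleMap
	}
	owner, ok := m.owners[key]
	if !ok {
		return "", fmt.Errorf("no owner for %v", key)
	}
	if owner == self {
		return "", nil // we own it: process locally
	}
	return owner, nil // forward to the current owner
}

func main() {
	m := &shardMap{
		version: 7,
		owners:  map[routeKey]string{{"robot-1", "/camera"}: "node-b"},
	}
	fmt.Println(m.route("node-a", routeKey{"robot-1", "/camera"}, 7)) // node-b <nil>
	fmt.Println(m.route("node-a", routeKey{"robot-1", "/camera"}, 9)) // stale map error
}
```
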
A few factors work in our favor:

  • dp3 targets a deployment shape with a relatively static number of beefy nodes. Scaling is not expected to be highly dynamic, and limited blocking during scaling events (while shards are rebalanced) may be acceptable.
  • Reads can be served by any node at any time, so disruption is limited to writes, which are probably queue-fed.
