Domain-Driven Design (DDD) and microservices have fundamentally changed the way we develop software over the last decade. Data platforms on the analytics side are fast catching up. As software architectures have moved from monoliths to microservices, data platforms are likewise moving away from centralized monoliths toward domain-oriented, decentralized data spaces. This decentralization is critical to eliminating the failure modes of current centralized data platform implementations (2). In addition, each of these autonomous data environments will use one or more purpose-built, best-fit datastores (3) (event brokers, relational systems, object stores, graph databases, etc.) to persist and expose its data products.
This decentralization, together with the polyglot and hybrid-cloud characteristics of modern data environments, has created challenging consequences for the field of data governance. As adoption of this new paradigm accelerates (4), there is a pressing need to reinvent data governance so that it works at this scale and complexity without compromising an organization's agility in producing and democratizing data.
This repo contains a core idea for addressing the challenge of data governance at scale.