Data Platform Designs

Different industries and use cases require different data platform designs and architectures.

The Data Lake Platform

The focus of a Data Lake based Data Platform is the storage of data “as is”. There may be no need to transform the data to an internal data model, or the data model may be applied as “schema on read”, allowing different users and systems to pick the data they need from the “raw” data.
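As a minimal sketch of “schema on read”, the example below keeps raw records untouched and lets each consumer apply its own schema when reading. The sample records, the field names, and the `read_with_schema` helper are all hypothetical and only illustrate the idea; a real platform would read from its own storage layer.

```python
import json

# Raw records stored "as is"; the fields vary between records (hypothetical sample data).
RAW_LINES = [
    '{"id": 1, "amount": "12.50", "currency": "GBP", "note": "first"}',
    '{"id": 2, "amount": "7", "region": "EU"}',
]


def read_with_schema(lines, schema):
    """Apply a consumer-specific schema at read time.

    `schema` maps a field name to a converter function. Fields missing
    from a raw record are returned as None; fields not in the schema
    are ignored.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, convert in schema.items():
            value = record.get(field)
            row[field] = convert(value) if value is not None else None
        rows.append(row)
    return rows


# Two consumers read the same raw data with different schemas.
finance_view = read_with_schema(RAW_LINES, {"id": int, "amount": float})
regional_view = read_with_schema(RAW_LINES, {"id": int, "region": str})
```

Neither consumer changes the stored data; each one only decides which fields it needs and how to type them.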

[Diagram: Data Lake Platform]

Components Required : Storage, Orchestration, Ingestion
Optional : Data Quality, Data Lineage

The Distribution Platform

The focus of a Distribution based Data Platform is to act as a centralised data distribution point for an organisation. Organisations where data is shared between multiple departments or teams would benefit from having a single storage unit acting like a “data bus” rather than using point-to-point data transfers.
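To give a rough feel for the difference, the sketch below counts connections under two simplifying assumptions that are not stated in the text: in the point-to-point case every team exchanges data with every other team, and in the “data bus” case each team has a single connection to the shared platform. The function names are hypothetical.

```python
def point_to_point_links(teams: int) -> int:
    """Links needed if every team connects directly to every other team."""
    return teams * (teams - 1) // 2


def data_bus_links(teams: int) -> int:
    """Links needed if every team connects once to a shared platform."""
    return teams


# Compare the two approaches as the number of teams grows.
for teams in (3, 6, 12):
    print(teams, point_to_point_links(teams), data_bus_links(teams))
```

Under these assumptions the point-to-point count grows quadratically with the number of teams, while the bus count grows linearly.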

An Ingestion layer is optional, as it could be left to the individual feeder systems to load the data themselves; each may have its own preferred ingestion method. However, as the Data Platform grows in use, the natural tendency is to build one loading mechanism used by all.

Components Required : Storage, Dissemination
Optional : Ingestion, Data Quality, Data Lineage

The “Two Pillars” Platform

The “Two Pillars” Platform is a combination of a Data Lake Platform and a Distribution Platform. The transformation layer could be left out as each data consumer could perform its own transformation to its own data model.

Components Required : Storage, Orchestration, Ingestion, Dissemination
Optional : Transformation, Data Quality, Data Lineage

The “Full Monty” Platform

The full Data Platform suits organisations that pull in third-party data, re-model it, disseminate it across teams, and also store internally produced data.

Components Required : Storage, Orchestration, Ingestion, Transformation, Dissemination
Optional : Data Quality, Data Lineage
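
The required components of the four designs can be summarised as a small lookup table. The sketch below is only an illustration: the `REQUIRED` mapping copies the “Components Required” lists above, while the `candidate_designs` helper and its behaviour are hypothetical.

```python
# Required components per design, copied from the lists in this section.
REQUIRED = {
    "Data Lake": {"Storage", "Orchestration", "Ingestion"},
    "Distribution": {"Storage", "Dissemination"},
    "Two Pillars": {"Storage", "Orchestration", "Ingestion", "Dissemination"},
    "Full Monty": {
        "Storage",
        "Orchestration",
        "Ingestion",
        "Transformation",
        "Dissemination",
    },
}


def candidate_designs(available):
    """Return the designs whose required components are all available."""
    have = set(available)
    return [name for name, needed in REQUIRED.items() if needed <= have]


# The "Two Pillars" requirements are the union of the two simpler designs.
two_pillars_is_union = (
    REQUIRED["Two Pillars"] == REQUIRED["Data Lake"] | REQUIRED["Distribution"]
)

# An organisation with only storage and dissemination in place.
print(candidate_designs({"Storage", "Dissemination"}))
```

A table like this makes it easy to see which design an existing set of components already supports and which component is missing for the next one.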