Data About Data
The key difference between a data lake and a data swamp is organization: prudently organized data makes for an efficient lake, while a swamp is just data that is either over-replicated or siloed by its users. Knowing how production data is used across the organization not only helps in building a well-organized data lake, it also helps data engineers fine-tune the data pipelines or the data itself.
To understand how data is consumed, we need to figure out answers to some basic questions like:
from DZone.com Feed https://ift.tt/2CQ43dl