The Data Vault approach gives data modelers many options to choose from: how many satellites to create, how to connect hubs with links, what historicity to use, which field to use as a business key. Such flexibility leaves plenty of room for suboptimal modeling decisions.

I want to illustrate some of these choices (I call them issues), along with their risks and possible solutions drawn from other modeling techniques, such as Anchor Modeling. All of the issues come from years of evolving Data Vault and Anchor Modeling data warehouses of 100+ TB in databases such as Vertica and Snowflake.

Speaker: Nikolai Golov is Head of Data Engineering at ManyChat (a SaaS startup with offices in San Francisco and Yerevan) and a lecturer at Harbour Space University in Barcelona (data storage course). He studies modern data modeling techniques, such as Data Vault and Anchor Modeling, and their applicability to big data volumes (tens and hundreds of TB). As a consultant, he also helps companies launch their own analytical/data platforms.

Recorded at the Data Modeling Meetup Munich (DM3), 2022-07-18

Also recommended are the additional Medium articles by Anton Poliakov:

