Unitemporal Vertica support in test

With the help of Nikolay Golov (avito.ru), support for generating unitemporal Vertica implementations has been added to the test version of the online modeling tool. Vertica is an MPP columnar database, and Avito runs a cluster of 12 nodes with 50TB of data in an Anchor model. A relational Big Data solution that outperforms previously tested NoSQL alternatives!

In Vertica there are three distribution models, and they happen to coincide with Anchor modeling constructs.

In the first model, broadcast, the data is available locally (duplicated) on every node in the cluster. This suits knots very well, since they may be joined from both ties and attributes anywhere.

In the second model, segmentation, the data is split across the nodes according to some operation. For example, a modulo operation on the identity column in the anchor and the attribute tables could be used to determine on which node data should end up. This keeps an instance of an entity and its history of changes together on the same node.
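The modulo idea can be illustrated with a small Python simulation. This is only a sketch of the principle, not of how Vertica implements segmentation internally; the function names `node_for` and `distribute`, the row layouts, and the sample data are all hypothetical.

```python
from collections import defaultdict

NODE_COUNT = 12  # same size as the cluster mentioned above; any positive integer works


def node_for(identity: int, node_count: int = NODE_COUNT) -> int:
    """Return the node that owns a row, using a plain modulo on the identity."""
    return identity % node_count


def distribute(rows, node_count: int = NODE_COUNT):
    """Group rows by owning node; the first element of each row is the identity."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[0], node_count)].append(row)
    return nodes


# Hypothetical anchor rows (identity only) and
# attribute rows (identity, value, changed_at).
anchor_rows = [(1,), (2,), (13,), (14,)]
attribute_rows = [
    (1, "Alice", "2014-01-01"),
    (1, "Alicia", "2014-06-01"),  # a later version of the same attribute
    (13, "Bob", "2014-02-01"),
    (14, "Carol", "2014-03-01"),
]

anchor_nodes = distribute(anchor_rows)
attribute_nodes = distribute(attribute_rows)
```

Because the anchor and the attribute use the same function on the same identity, the anchor row for identity 1 and both versions of its attribute end up on the same node, so the history of that entity can be read without touching any other node.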

In the third model, projection, the data necessary to do a cross-node join is stored across the nodes, for all directions of the join. Ties fit this purpose perfectly and do not introduce any overhead, since they contain only the columns on which joins are done.

Anchor modeling fits MPP extremely well, since the constructs are designed in such a way that the MPP is utilized as efficiently as possible. A knotted attribute or tie can be resolved locally on a node, and a join over a tie cannot be more efficient, while the assembly of the results from different nodes is trivial.
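As a rough illustration of why a knotted attribute can be resolved locally, the following Python sketch simulates a tiny cluster in which the knot is copied to every node and the attribute rows are split by identity. The cluster size, the table contents, and the helper names `build_nodes` and `resolve_locally` are assumptions made for the example; nothing here reflects Vertica's actual execution engine.

```python
NODE_COUNT = 3  # small hypothetical cluster

# Knot: replicated in full on every node.
knot = {1: "Female", 2: "Male"}

# Hypothetical knotted attribute rows: (anchor identity, knot identity).
knotted_attribute = [(10, 1), (11, 2), (12, 1), (13, 2)]


def build_nodes(rows, knot_rows, node_count):
    """Split attribute rows by identity modulo node_count; copy the knot to each node."""
    nodes = [{"knot": dict(knot_rows), "attribute": []} for _ in range(node_count)]
    for identity, knot_id in rows:
        nodes[identity % node_count]["attribute"].append((identity, knot_id))
    return nodes


def resolve_locally(node):
    """Join a node's attribute rows to its own knot copy; no other node is consulted."""
    return [(identity, node["knot"][knot_id]) for identity, knot_id in node["attribute"]]


nodes = build_nodes(knotted_attribute, knot, NODE_COUNT)

# Assembling the final result is just a concatenation of the per-node results.
result = sorted(row for node in nodes for row in resolve_locally(node))
```

Each node produces its part of the answer from data it already holds, and the only cross-node step is collecting the partial results.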

Incidentally, a search for reported issues with Vertica suggests that implementations may suffer from having to create many projections or broadcasts of wide tables in order to support ad-hoc querying. This is a natural result of using less normalized models. Anchor modeling, by contrast, works “out of the box” for ad-hoc querying. Furthermore, what should be projected and broadcast is known beforehand, so it is not tuning work a DBA would have to worry about on a daily basis.

Since Vertica does not have table-valued functions or stored procedures, only CREATE TABLE statements are generated so far. We are still working on providing some form of latest views.


Published by

Lars Rönnbäck

Co-developer of the Anchor Modeling technique. Programmer of the online modeling tool. Site maintainer. Presenter and trainer.
