November 2, 2024

Advanced Analytics: Analytic Platforms Should Be Columnar Orientation

Author: DATAVERSITY via YouTube
Go to Source

A columnar database is an implementation of relational theory, but with a twist. The data storage layer does not store whole records together. Instead, it stores the values of each column together.

Because column lengths vary within a row, a small column with low cardinality (little variability of values) may reside entirely within one block, while a column with high cardinality and longer values may take a thousand blocks. In columnar, all the same data (your data) is there. It's just organized differently (automatically, by the DBMS).
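To make the layout difference concrete, here is a minimal Python sketch. It is an illustration only, not how any particular DBMS stores data: the sample table, the helper names `to_columns` and `blocks_needed`, and the tiny 32-byte block size are all assumptions chosen for readability.

```python
import math

# Hypothetical sample table: three records, as a row store would keep them.
rows = [
    {"id": 1, "state": "TX", "comment": "late delivery, customer called twice"},
    {"id": 2, "state": "TX", "comment": "ok"},
    {"id": 3, "state": "CA", "comment": "replacement shipped after damage report"},
]

def to_columns(records):
    """Pivot a list of records into one list of values per column."""
    return {name: [record[name] for record in records] for name in records[0]}

def blocks_needed(values, block_size=32):
    """Count fixed-size blocks needed to hold a column's values as UTF-8 text.

    The 32-byte block size is an assumption for the example; real systems
    use far larger blocks and usually compress the values as well.
    """
    total_bytes = sum(len(str(value).encode("utf-8")) for value in values)
    return max(1, math.ceil(total_bytes / block_size))

columns = to_columns(rows)
block_counts = {name: blocks_needed(values) for name, values in columns.items()}
print(block_counts)
```

In this sketch the short `state` column fits in a single block, while the longer `comment` column spills across several, which is the same effect described above at a much smaller scale.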

The main reason to use a columnar approach is simply to speed up the native performance of analytic queries.
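A rough sketch of why this helps, assuming a typical analytic query that aggregates one column across every row. The table, the function names, and the "values read" counter are hypothetical stand-ins for real I/O; the point is only that the row layout touches every field while the column layout touches one column.

```python
# Hypothetical sales table stored two ways: as records and as columns.
row_store = [
    (1, "TX", 120.0),
    (2, "CA", 80.0),
    (3, "TX", 50.0),
    (4, "NY", 200.0),
]
column_store = {
    "id": [1, 2, 3, 4],
    "state": ["TX", "CA", "TX", "NY"],
    "amount": [120.0, 80.0, 50.0, 200.0],
}

def total_amount_row_store(records):
    """Sum the amount field; every field of every record is fetched."""
    total = 0.0
    values_read = 0
    for record in records:
        values_read += len(record)  # the whole record comes along
        total += record[2]          # only the amount is actually needed
    return total, values_read

def total_amount_column_store(columns):
    """Sum the amount column; only that one column is fetched."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)

row_total, row_reads = total_amount_row_store(row_store)
col_total, col_reads = total_amount_column_store(column_store)
print(row_total, row_reads)
print(col_total, col_reads)
```

Both layouts return the same total, but the columnar version reads a third as many values here. The gap widens as tables gain more columns that the query never references.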

Learn about the columnar orientation and how it can be effective for your needs. It is the native orientation of many databases, and several others offer optional column-oriented storage layers.

There is also an equivalent in the cloud storage world: the open Parquet format.
