Raj More Raj More - 1 month ago 9
SQL Question

What is a columnar database?

I have been working with warehousing for a while now.

I am intrigued by Columnar Databases and the speed that they have to offer for data retrievals.

I have multi-part question:


  • How do Columnar Databases work?

  • How do they differ from relational databases?


mjv mjv
Answer

How do Columnar Databases work?
Columnar database is a concept rather a particular architecture/implementation. In other words, there isn't one particular description on how these databases work; indeed, several are build upon traditional, row-oriented, DBMS, simply storing the info in tables with one (or rather often two) columns (and adding the necessary layer to access the columnar data in an easy fashion).

How do they differ from relational databases? They generally differ from traditional (row-oriented) databases with regards to ...

  • performance...
  • storage requirements ...
  • ease of modification of the schema ...

...in specific use cases of DBMSes.
In particular they offer advantages in the areas mentioned when the typical use is to compute aggregate values on a limited number of columns, as opposed to try and retrieve all/most columns for a given entity.

Is there a trial version of a columnar database I can install to play around? (I am on Windows 7) Yes, there are commercial, free and also open-source implementation of columnar databases. See the list at the end of the Wikipedia article for starter.
Beware that several of these implementations were introduced to address a particular need (say very small footprint, highly compressible distribution of data, or spare matrix emulation etc.) rather than provide a general purpose column-oriented DBMS per-se.

Note: The remark about the "single purpose orientation" of several columnar DBMSes is not a critique of these implementations, but rather an additional indication that such an approach for DBMSes strays from the more "natural" (and certainly more broadly used) approach to storing record entities. As a result, this approach is used when the row-oriented approach isn't satisfactory, and therefore and tends to
a) be targeted for a particular purpose b) receive less resources/interest than work on "General Purpose", "Tried and Tested", tabular approach.

Tentatively, the Entity-Attribute-Value (EAV) data model, may be an alternative storage strategy which you may want to consider. Although distinct from the "pure" Columnar DB model, EAV shares several of the characteristics of Columnar DBs.