Column-Oriented Database Internals: Vectorized Execution, Late Materialization, and Compression in ClickHouse and DuckDB

Authors

  • Win Mathew John, Marian College Kuttikkanam (Autonomous), Kerala, India

Keywords:

Column-Oriented Database, OLAP, Vectorized Execution, Late Materialization, Lightweight Compression, ClickHouse, DuckDB, Query Processing, Analytical Databases

Abstract

The database systems that power analytics look almost nothing like the ones that power transactions once you look inside, even when they speak the same SQL. The difference begins with a single decision about how a table is laid out on storage. A transactional system stores each row contiguously, which suits writing and reading whole records; an analytical system stores each column contiguously, which suits scanning a few columns across millions of rows. That one choice cascades into a distinct set of internal techniques that together make modern analytical engines often an order of magnitude faster than row stores on the queries they target. This paper examines three of those techniques as realised in two influential open-source systems, ClickHouse and DuckDB. Vectorized execution processes data in batches of thousands of values rather than one tuple at a time, amortising interpreter overhead and unlocking the data-parallel instructions of modern CPUs. Late materialization keeps data in compact column form for as long as possible, delaying the expensive reconstruction of rows until it is unavoidable. Finally, because a column holds values of a single type with considerable local redundancy, lightweight compression schemes shrink storage and, critically, let the engine operate directly on compressed data. We explain the mechanism behind each, show how they reinforce one another, and discuss why the column store has become the default architecture for analytical workloads.
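
As a rough illustration of the ideas summarised above, the following Python sketch shows a filter that scans one whole column and yields row positions, a gather step that fetches a second column only at those positions (late materialization), and a sum computed directly on a run-length-encoded column. It is a toy model under stated assumptions, not code from ClickHouse or DuckDB: the table, the column names `region` and `amount`, and all four helper functions are hypothetical, and run-length encoding stands in for whichever lightweight schemes the engines actually apply.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a column into (value, run_length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(values)]

def rle_sum(runs):
    """Sum a column on its compressed form: one multiply per run, no decoding."""
    return sum(v * n for v, n in runs)

def select_positions(column, predicate):
    """Scan a single column as one batch and return the matching row positions."""
    return [i for i, v in enumerate(column) if predicate(v)]

def gather(column, positions):
    """Late materialization: read another column only at the surviving positions."""
    return [column[i] for i in positions]

# Hypothetical table, stored column by column rather than row by row.
region = ["EU", "EU", "US", "US", "US", "EU"]
amount = [10, 10, 10, 25, 25, 25]

runs = rle_encode(amount)                                   # two runs instead of six values
total = rle_sum(runs)                                       # aggregate without decompressing
positions = select_positions(region, lambda r: r == "US")   # filter touches only `region`
us_amounts = gather(amount, positions)                      # `amount` read only where needed
```

Real engines do this over fixed-size batches with typed arrays and SIMD instructions; the plain lists here only show the order of operations.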

Author Biography

  • Win Mathew John, Marian College Kuttikkanam (Autonomous), Kerala, India

    Associate professor, PG Department of Computer Applications

Published

2026-06-12

Section

Articles