Datafusion on Home

https://blog.rafaelfernandez.dev/tags/datafusion/
Recent content in Datafusion on Home
© 2026 Rafael Fernandez
Tue, 24 Mar 2026 00:00:00 +0000

How Query Engines Work 2. Why modern query engines think in columns
https://blog.rafaelfernandez.dev/posts/how-query-engines-work-2-columnar-storage-arrow-rust/
Tue, 24 Mar 2026 00:00:00 +0000
Why do modern query engines pass around columns instead of rows? Because the hardware loves it. This post explains why columnar layout is so fast, how Apache Arrow represents it in memory, and how to build and manipulate Arrow arrays in Rust without treating the whole thing like black magic.

How Query Engines Work 1. The small compiler hiding behind every SQL query
https://blog.rafaelfernandez.dev/posts/how-query-engines-work-1-from-sql-to-results/
Mon, 23 Mar 2026 00:00:00 +0000
You write a SQL query, hit enter, and a few milliseconds later results appear. In between, a small compiler has already parsed your text, built a plan, optimized it, and executed it. This post walks through that pipeline with a real query and real Rust code using DataFusion.

Sail. Sailing Through Giants and Sparks
https://blog.rafaelfernandez.dev/posts/sail-sailing-through-giants-and-sparks/
Wed, 16 Jul 2025 00:00:00 +0000
In this article, I share my critical view on the current state of data engineering, dominated by heavyweight platforms like Spark and Databricks, and introduce Sail, an open-source engine built on top of Apache Arrow and DataFusion, written in Rust, that offers a new path: lightweight, efficient, and powerful.