The Ultimate Hub for Data Lakehouse Engineering

Build a unified, high-performance analytics platform with the open data lakehouse architecture. Combine the flexibility of a data lake with the reliability of a warehouse using open standards.

Why Build a Data Lakehouse?

Unified Access

Eliminate data silos. Query your data where it lives—in S3, ADLS, or GCS—without moving it.

Open Standards

Avoid vendor lock-in. Use open formats like Apache Iceberg and open catalogs to keep your data accessible to any engine.

High Performance

Achieve sub-second query performance on data-lake-scale datasets using engines like Dremio.

The Modern Data Stack is Open

The Data Lakehouse Hub is your central resource for tutorials, architectural guides, and community support. Whether you're migrating from a warehouse or building from scratch, we have the resources to help you succeed with open data standards.

Must Read Articles

Deep dives into Apache Iceberg, Agentic AI, and modern data lakehouse architecture by Alex Merced.

Agentic AI

Agentic Analytics on the Apache Lakehouse

Discover how autonomous AI agents replace manual dashboard querying by reading governed semantic layers directly on your data lakehouse, delivering faster and more reliable analytical results.

Apache Iceberg

What is Apache Iceberg? The Table Format Revolution

Learn how Apache Iceberg turned raw Parquet files in S3 into a fully ACID-compliant, time-traveling analytical database without moving your data out of object storage.

Apache Iceberg

The 2025 State of the Apache Iceberg Ecosystem

Survey data from data professionals reveals adoption rates, popular tooling, and where the Apache Iceberg ecosystem is heading through 2026 and beyond.

Data Lakehouse

What Are Table Formats and Why Were They Needed?

Table formats like Apache Iceberg solved the ACID, schema evolution, and query performance problems that turned data lakes into unmanageable data swamps.

Apache Iceberg

2026 Intro to Apache Iceberg

A comprehensive beginner-to-intermediate introduction to Apache Iceberg, covering its metadata layer, catalog integrations, and why it has become the dominant open table format.

Agentic AI

Why AI Fails Without a Semantic Layer

Explore how a semantic layer provides the business context and governed metrics that AI agents require to generate accurate, trustworthy analytical answers at scale.

Stay Ahead of the Curve

Subscribe to our newsletter and event calendar to get the latest tutorials, webinars, and meetups delivered to your inbox.
