What is GPU-accelerated Fabric Data Warehouse and when is it available?

GPU-accelerated Fabric Data Warehouse (CoddSpeed) uses NVIDIA accelerated computing to run eligible analytical queries directly on GPUs inside the Fabric query engine — no query rewrites, no cluster configuration. Enable it in workspace settings and the query optimizer routes eligible queries to the GPU automatically. In internal benchmarking (May 2026), GPU-accelerated Fabric Data Warehouse ran up to 7x faster than three comparable cloud data warehouses at 64-user concurrency. UNC Health reported 5x improvement on existing production workloads. The underlying CoddSpeed research won the SIGMOD 2026 Best Industry Paper award. Early access preview is scheduled for July 2026.

When should I choose Lakehouse instead of Data Warehouse?

Choose Fabric Lakehouse when: your team works primarily in PySpark, Scala, or Python notebooks; your data includes unstructured or semi-structured formats (JSON, Parquet, CSV, images); you need flexible schema evolution and do not require multi-table ACID transactions across tables; or you are building Bronze and Silver medallion layers where data evolves rapidly. Lakehouse also has the advantage for machine learning feature engineering, Materialized Lake Views, and data science workloads that need full compute control.

When should I choose Data Warehouse instead of Lakehouse?

Choose Fabric Data Warehouse when: your team works primarily in T-SQL and SQL Server tooling; you need full multi-table transactional guarantees (INSERT, UPDATE, DELETE, MERGE across multiple tables atomically); your workload is structured data only with a stable schema; you are building Gold-layer star schemas and dimensional models for BI; or you need the highest query concurrency for interactive dashboards. As of Build 2026, the GPU-accelerated engine (CoddSpeed, SIGMOD 2026 Best Industry Paper) makes Warehouse the highest-performance choice for high-concurrency analytical workloads.

What is Materialized Lake Views and how does it change the decision?

Materialized Lake Views (MLV) reached General Availability in March 2026. They let you define transformation logic in SQL or PySpark — Fabric materialises the result as a Delta table and manages incremental refresh, dependency tracking, and retry automatically. Extended lineage (June 2026) allows MLVs to span multiple Lakehouses across workspaces, with one schedule in a Gold Lakehouse cascading through Silver and Bronze. MLVs reduce the need to choose Warehouse purely for automated Gold-layer materialisation — a Lakehouse with MLVs can serve many of the same Gold-layer use cases that previously required a Warehouse. Use Warehouse when you additionally need multi-table T-SQL transactions, ALTER COLUMN, or Time Travel via T-SQL.

Architecture Decision · Microsoft Fabric

Fabric Lakehouse vs Data Warehouse Best Practices — 2026 Guide

Q: Is Fabric Lakehouse replacing the traditional Data Warehouse?

No. In Microsoft Fabric, Lakehouse and Data Warehouse are complementary — not competing. Both store data in Delta tables on OneLake. Lakehouse is optimised for data engineering, Spark-based transformation, and data science. Warehouse provides a full T-SQL engine with multi-table ACID transactions, dimensional modelling, and governed BI consumption. The Microsoft decision guide recommends choosing based on three criteria: preferred development model (Spark vs T-SQL), whether multi-table transactions are needed, and data complexity (unstructured/semi-structured vs structured only).

The official Microsoft decision framework, the June 2026 updates that change the calculus — GPU-accelerated Warehouse, Materialized Lake Views GA, ALTER COLUMN, Time Travel at SQL analytics endpoint, Datawarehouse Monitor — and the production patterns that enterprise teams actually use.

Quick Answer

Both Lakehouse and Data Warehouse store Delta tables on OneLake — the storage choice is not the decision. Microsoft’s official framework reduces to three questions: Do you prefer Spark or T-SQL? Do you need multi-table ACID transactions? Is your data structured only, or mixed? As of June 2026, the Warehouse gained GPU acceleration (CoddSpeed, SIGMOD 2026 Best Industry Paper) delivering up to 7× faster performance at high concurrency — early access preview July 2026. The Lakehouse gained Materialized Lake Views GA with cross-workspace extended lineage, closing the Gold-layer automation gap that previously pushed teams toward Warehouse by default.

📅 Last verified: June 2026 ⏱ ~13 min read ✍️ A.J., Data Engineering Researcher & Technical Writer · UIG Data Lab 🔗 Source: Microsoft Learn

Section 00

Fabric Lakehouse vs Data Warehouse: The Decision Framework

Microsoft publishes a formal decision guide at learn.microsoft.com/fabric/fundamentals/decision-guide-lakehouse-warehouse. It reduces to a three-question tree. The framework applies regardless of data volume — it is about skill set, transaction requirements, and data shape, not scale.

Decision Criterion	Answer → Use Lakehouse	Answer → Use Warehouse
Development model How does your team prefer to work?	Spark, PySpark, Scala, Python notebooks	T-SQL, SQL Server tooling (SSMS, Azure Data Studio)
Multi-table transactions Do you need full ACID across multiple tables?	No — single-table ACID via Delta is sufficient	Yes — you need INSERT/UPDATE/DELETE/MERGE across multiple tables atomically
Data complexity What formats are you analysing?	Mixed: unstructured, semi-structured (JSON, logs, images), or don’t know yet	Structured only: relational data with stable schemas

Source: Microsoft Fabric decision guide — choose between Warehouse and Lakehouse

ℹ️

You Can Add Either One Later

Microsoft’s official guidance explicitly states that you can always add one or the other at a later point should your business needs change. Starting with Lakehouse does not lock you out of Warehouse. Both use the same Polaris SQL engine for T-SQL queries. Regardless of where you start, compute is fully compatible and data stays in the same OneLake location.

Field note — A.J., Data Engineering Researcher

The most common mistake in this decision is choosing based on team size or data volume rather than the three criteria above. A team of two data engineers can be building a Warehouse-first architecture. A team of twenty can be entirely Lakehouse-first. The decision is about what tools your team knows and what transactional guarantees your Gold layer needs — not how big you are. Defaulting to Warehouse because it “sounds more enterprise” and defaulting to Lakehouse because it “sounds more modern” are both wrong reasons to choose.

Section 01

OneLake — Why the Storage Is No Longer the Decision

In every data platform that predates Fabric, choosing a storage engine was the primary architectural decision: Parquet on ADLS vs. Synapse Dedicated Pool vs. Databricks Delta vs. Snowflake internal storage. Each choice created a silo. Moving data between them required ETL pipelines, refresh schedules, and duplicate storage costs.

OneLake changes this completely. Both Lakehouse and Data Warehouse store their data as Delta Parquet files in OneLake. There is one physical location. The Lakehouse Spark engine, the Lakehouse SQL analytics endpoint, and the Warehouse T-SQL engine all read the same files. Power BI Direct Lake reads the same files. No copies. No synchronisation.

The decision between Lakehouse and Warehouse is therefore a decision about which compute surface and access model fits your workload — not which storage system to commit to. You can switch, combine, or add later without moving data.

Capability	Lakehouse	Data Warehouse
Storage format	Delta Parquet in OneLake	Delta Parquet in OneLake
Primary compute	Apache Spark (Runtime 1.3 — Spark 3.5)	Polaris distributed T-SQL engine
SQL access	Read-only SQL analytics endpoint (T-SQL)	Full read/write T-SQL with DML
DML (INSERT/UPDATE/DELETE)	Via Spark only — not in SQL analytics endpoint	Full T-SQL DML — multi-table transactions
Power BI Direct Lake	Supported — SQL analytics endpoint must be refreshed	Supported — Refresh SQL Endpoint activity (GA June 2026)
Shortcuts (external data)	Full support — ADLS, S3, GCS, other Lakehouses	Read via cross-database queries to Lakehouse
Materialized Lake Views	GA March 2026 — SQL and PySpark, cross-workspace lineage (June 2026)	Not available
Time Travel	Via Spark or SQL analytics endpoint (Preview, June 2026)	T-SQL — FOR TIMESTAMP AS OF — configurable retention (default 30 days)

Section 02

Fabric Lakehouse — What It Is and When to Use It

A Fabric Lakehouse is a data architecture platform built on Delta tables in OneLake. It has two primary access surfaces: the Spark engine (via notebooks and Spark Job Definitions) for code-first data engineering and data science, and the SQL analytics endpoint (a read-only T-SQL interface) for SQL-based consumption without Spark.

When you create a Lakehouse, Fabric automatically creates the SQL analytics endpoint and a default Power BI semantic model alongside it. SQL analysts and Power BI users get immediate access to the same data the Spark engineer just wrote — with no separate load step.

✅

Lakehouse Is the Right Choice When

Your team writes PySpark, Scala, or Python notebooks daily
Data includes JSON, logs, images, or other semi-structured/unstructured formats
Schema evolves rapidly and schema-on-read flexibility is needed
Workload involves ML feature engineering, model training, or data science experiments
Bronze and Silver medallion layers with fast-changing raw data
You want Materialized Lake Views for automated incremental Gold-layer materialisation

⚠️

Lakehouse Limitations to Know

SQL analytics endpoint is read-only — no DML via T-SQL
Multi-table T-SQL transactions are not supported — use Spark Delta MERGE for upserts
ALTER COLUMN is not available at the SQL analytics endpoint
Time Travel via T-SQL at SQL analytics endpoint is Preview (June 2026) — limited by VACUUM retention settings per table
OneSecurity (OneLake data access roles) is incompatible with Direct Lake semantic models on the same Lakehouse (known issue, June 2026)

The Lakehouse SQL analytics endpoint gets a Refresh SQL Endpoint Activity (GA, June 2026) as a native pipeline step — refresh the endpoint on demand inside the same pipeline as your ETL load, so downstream BI queries always read the latest state without a separate scheduling workaround. See the full deep-dive in the Fabric Lakehouse Tutorial.

Section 03

Fabric Data Warehouse — What It Is and When to Use It

A Fabric Data Warehouse is a lake-centric, distributed T-SQL engine built on Delta tables in OneLake. It provides the full SQL Server-compatible T-SQL surface: DDL, DML (INSERT, UPDATE, DELETE, MERGE), stored procedures, views, functions, schemas, and cross-database querying via three-part names. Teams migrating from SQL Server, Azure Synapse Dedicated Pools, or Teradata will recognise the development model immediately.

✅

Warehouse Is the Right Choice When

Your team works primarily in T-SQL and SQL Server tooling (SSMS, Azure Data Studio)
You need full multi-table ACID transactions via T-SQL
Workload is structured data only with stable schemas and dimensional models
Gold-layer star schemas, conformed dimensions, and BI-ready fact tables
High-concurrency interactive query workloads serving many simultaneous Power BI users
You need ALTER COLUMN, Time Travel via T-SQL, or Datawarehouse Monitor for query observability

🚀

GPU Acceleration — Build 2026 (CoddSpeed)

NVIDIA accelerated computing built directly into the Warehouse query engine
No query rewrites, no infrastructure setup, no cluster to size
Enable in workspace settings — query optimizer routes eligible queries to GPU automatically
Up to 7× faster than three comparable cloud warehouses at 64-user concurrency (internal benchmarking, May 2026)
UNC Health: 5× production improvement on existing workloads
CoddSpeed: SIGMOD 2026 Best Industry Paper — first fully managed SaaS warehouse with GPU acceleration
Early access preview: July 2026

✅

Unified Analyze Data Menu — June 2026

Fabric now provides a unified Analyse data with menu that gives a consistent way to move from data to analysis across Lakehouse, Warehouse, and Eventhouse. Instead of navigating different menus per workload, analysis starts from a single, predictable entry point. The SQL Query Editor in Warehouse also expanded with integrated experiences: OneLake Analytics (KQL and Spark on warehouse data), Copy Job creation directly from the editor, Power BI Direct Lake over OneLake semantic model creation, and Fabric Ontologies for defining business concepts over warehouse tables.

Section 04

June 2026 — What Changed for Both

The June 2026 release (Build 2026, June 2) and the June Feature Summary (June 2, 2026) introduced substantial changes to both Lakehouse and Warehouse that affect architecture decisions made in 2026.

Data Warehouse — June 2026 Updates

Feature	Status	What It Changes
GPU-Accelerated Warehouse (CoddSpeed)	Early Access — July 2026	Up to 7× faster analytical queries at high concurrency. Enable in workspace settings. SIGMOD 2026 Best Industry Paper. No query rewrites required.
ALTER COLUMN	GA — June 2026	Change column length, precision, and scale with standard T-SQL syntax. Takes effect instantly with no data rewrite. Eliminates CTAS-based workarounds in CI/CD and migration workflows.
Time Travel via T-SQL	Preview	Query warehouse tables as they existed at any prior point in time using `FOR TIMESTAMP AS OF`. Default 30-day configurable retention window. Cross-warehouse time travel within the same workspace.
Time Travel at SQL Analytics Endpoint	Preview — June 2026	Extends T-SQL time travel to the Lakehouse SQL analytics endpoint. Retention limited by per-table VACUUM settings. Important: this feature now means both Lakehouse and Warehouse support Time Travel via T-SQL.
Datawarehouse Monitor (formerly Query Activity)	Preview — June 2026	Single place to monitor running and historical queries. Per-query metrics: CPU time, data scanned, cache usage. Drill into query run history without writing T-SQL. Replaces the previous Query Activity view.
SQL Analytics Endpoint CI/CD (DacFx)	Preview — June 2026	Version-control and deploy SQL analytics endpoint objects via DacFx. Item Definition APIs enable programmatic CRUD for CI/CD pipeline integration and environment management.

Source: Everything new in Fabric Data Warehouse at Build 2026 · Fabric June 2026 Feature Summary

Lakehouse — June 2026 Updates

Feature	Status	What It Changes
Materialized Lake Views (MLV) extended lineage	June 2026	MLVs now span Lakehouse boundaries across workspaces. One schedule in Gold cascades through Silver and Bronze automatically. Faulted nodes flag missing dependencies before execution. Removes the primary orchestration reason for choosing Warehouse at the Gold layer.
Refresh SQL Endpoint Activity	GA — June 2026	Refresh the Lakehouse SQL analytics endpoint as a native pipeline step. Sync refreshes with ETL loads so Direct Lake semantic models always reflect the latest data.
OneLake Storage Tiers + Lifecycle Management	Preview — June 2026	Hot/Cool/Cold tiers with automatic lifecycle policies. Move Bronze archives to Cold storage automatically — reduce cost without deleting compliance data.
High Concurrency mode for Lakehouse	GA — January 2026	Up to 5 independent Lakehouse jobs share one Spark session. Reduces startup wait time on smaller F-SKU capacities. Material improvement for interactive development on F2–F8.
Lakehouse Maintenance Activity (VACUUM + OPTIMIZE)	Preview	Run VACUUM and OPTIMIZE as native pipeline steps — no separate maintenance notebook required. Schedule nightly in the same pipeline as ETL load.

Section 05

Medallion Architecture — Layer Mapping in 2026

Microsoft’s recommended pattern places Lakehouse at Bronze and Silver, with a choice at Gold between Lakehouse-with-MLV and Data Warehouse. The June 2026 updates to both shift that choice in important ways.

Layer	Recommended Engine	Primary Tools	When to Use Warehouse at This Layer
Bronze (Raw landing)	Lakehouse	Data Factory Copy Job, Eventstream, Data Mirroring, Spark notebooks	Almost never — Lakehouse handles all source formats; Bronze data is rarely structured-only at landing
Silver (Cleaned, conformed)	Lakehouse	Spark notebooks (Delta MERGE), Materialized Lake Views, Dataflow Gen2	Rarely — only if Silver data is already structured-only and BI teams need direct T-SQL write access to Silver tables
Gold (Curated, BI-ready)	Either — depends on requirements	MLV (Lakehouse) or T-SQL star schema (Warehouse) + Direct Lake Power BI	Use Warehouse at Gold when: multi-table T-SQL transactions are required; BI team works exclusively in T-SQL; high-concurrency GPU-accelerated queries needed; or schema requires ALTER COLUMN in CI/CD

Better Together — The Standard Enterprise Pattern

The most common production pattern in Fortune 500 Fabric deployments: Lakehouse for Bronze and Silver (Spark-based ETL, CDC, feature engineering), Data Warehouse for Gold (T-SQL star schemas, dimensional models, certified semantic models). All layers remain in OneLake — the Warehouse reads Lakehouse Silver tables via cross-database queries using three-part names, and the Gold layer adds only the curated dimensional model on top, not a separate copy of the data.

ℹ️

MLV at Gold vs Warehouse at Gold — How to Choose in 2026

Before Materialized Lake Views GA (March 2026), teams often chose Warehouse at Gold purely for automated materialisation — Spark notebooks required manual scheduling that a Warehouse stored procedure did not. MLVs with extended lineage (June 2026) eliminate that gap: the MLV automatically materialises Gold tables from Silver, dependency-aware, incremental, cross-workspace. Choose Warehouse at Gold when you additionally need full T-SQL DML, multi-table transactions, ALTER COLUMN in CI/CD, or the GPU-accelerated query engine at high concurrency. Choose Lakehouse with MLV at Gold when your Gold layer is primarily SQL-definable aggregations with no multi-table write requirements and BI teams consume via Direct Lake.

Section 06

Ingestion — Which Path for Which Target

Every Fabric ingestion method can write to both Lakehouse and Warehouse. The right choice depends on the source type, latency requirement, and whether the team prefers code-first or low-code tooling — not on the destination engine.

Ingestion Method	Best Target	Typical Use
Copy Job Activity (GA April 2026)	Lakehouse Bronze	CDC from SQL Server, Azure SQL, Azure SQL MI (GA). Bulk load, incremental, SCD Type 2 for Warehouse. Auto-schema detection, no manual mapping.
Data Pipelines — Copy Activity	Lakehouse or Warehouse	90+ connectors, custom schema mapping, fault tolerance. Use when Copy Job does not expose the configuration needed.
Dataflow Gen2	Lakehouse Silver or Warehouse Gold	Visual Power Query with 200+ connectors. Low-code preparation and transformation. Writes Delta tables to OneLake destination.
Spark Notebooks	Lakehouse Bronze or Silver	Complex ETL, CDC with Delta MERGE, custom schema enforcement, ML feature engineering. Cannot write directly to Warehouse tables — land in Lakehouse, cross-query from Warehouse.
Eventstream	Lakehouse or Eventhouse	Streaming from Event Hubs, IoT Hub, Kafka, CDC sources. Routes to Lakehouse Delta tables and/or Eventhouse KQL simultaneously.
Data Mirroring	Lakehouse (as Delta tables in OneLake)	Near-real-time zero-ETL replication from Azure SQL DB, Cosmos DB, Snowflake, Databricks Unity Catalog. CDC for SQL estates now GA.

⚠️

Spark Cannot Write Directly to Warehouse Tables

This is the most common architectural surprise for teams migrating from Databricks or Synapse Spark. Spark notebooks in Fabric write to Lakehouse Delta tables — they cannot write directly to Warehouse tables via spark.write.saveAsTable(). The standard pattern: Spark writes to a Lakehouse Silver table, the Warehouse reads it via cross-database query using three-part naming (silverlakehouse.schema.tablename) or loads it via a T-SQL CREATE TABLE AS SELECT or INSERT INTO ... SELECT.

Section 07

Governance and Security — Key Differences

Both Lakehouse and Warehouse use OneLake at the storage layer, so Purview lineage, sensitivity labels, and OneLake audit logs apply to both. The difference is in how each engine exposes access control to teams who need fine-grained permissions.

🔐

Lakehouse Security Model

Workspace RBAC + item-level permissions
OneLake security roles (OneSecurity — Preview) — folder, table, row, column level
RLS at semantic model level for Direct Lake
Known limitation: OneSecurity blocks Direct Lake semantic models — manage RLS at semantic model level until resolved
Sensitivity labels via Microsoft Purview inherit automatically from M365

🔐

Warehouse Security Model

Full T-SQL security: schemas, roles, GRANT/REVOKE/DENY
Dynamic data masking for column-level obfuscation
Row-level security via T-SQL predicates
More familiar to DBAs from SQL Server and Azure Synapse
Warehouse permissions do not cascade to semantic model RLS automatically — configure at both layers

For a complete governance strategy covering both engines — OneLake catalog Govern tab, Domains, Purview DLP, and the Outbound Access Protection GA — see the Microsoft Fabric Data Governance Tutorial.

Section 08

Cost and Performance — What Drives the Numbers

Both Lakehouse and Warehouse consume CUs from the same F-SKU capacity pool. There is no separate pricing tier per engine. Cost differences come from workload patterns, not from being charged differently per engine type.

Factor	Lakehouse (Spark)	Data Warehouse (T-SQL)
Interactive query latency	5–15 second Spark cold start (Starter Pools). Subsequent queries fast within session.	No cold start — serverless SQL responds immediately. Preferred for interactive BI with many users.
High-concurrency BI	SQL analytics endpoint handles concurrent T-SQL. GPU acceleration not available.	Optimised for high-concurrency. GPU acceleration (CoddSpeed) delivers up to 7× improvement at 64-user concurrency — early access July 2026.
Heavy transformation (TB+)	Spark with Starter Pools — cost-effective for large-scale batch jobs. Partition, broadcast, and V-Order for performance.	T-SQL engine handles large loads well — no advantage over Spark for volume. Choose based on skill set.
ML and feature engineering	Spark — full Python ecosystem, scikit-learn, MLflow. Notebooks access all Lakehouse data.	Not the right tool — use Lakehouse for ML workloads even in a Warehouse-primary architecture.
Storage cost optimisation	OneLake storage tiers (Preview, June 2026) — move cold Bronze data to Cool/Cold tier automatically.	Configurable retention for Time Travel (default 30 days). Longer retention increases storage cost.

For detailed CU estimation and workload-specific cost modelling, the Microsoft Fabric Pricing Calculator covers both Lakehouse Spark CU consumption and Warehouse SQL CU consumption by workload type. For production stability, burst management, and capacity right-sizing, the Fabric Capacity Optimization guide covers the scheduling and pause strategies that prevent throttling.

Section 09

Frequently Asked Questions

Is Fabric Lakehouse replacing the traditional Data Warehouse?

No. In Microsoft Fabric, Lakehouse and Data Warehouse are designed to work together — not to replace each other. Both store Delta tables on OneLake. The choice is about compute surface, not storage. Lakehouse is optimised for Spark-based engineering and data science. Warehouse is optimised for T-SQL, multi-table transactions, and high-concurrency BI. Microsoft’s official guidance explicitly states you can add either one at any point — starting with Lakehouse does not preclude Warehouse later, and both use the same Polaris SQL engine for T-SQL queries. Source: Microsoft Learn — What is Fabric Data Warehouse.

What is GPU-accelerated Fabric Data Warehouse and how does it work?

GPU-accelerated Fabric Data Warehouse (CoddSpeed) was announced at Build 2026 (June 2, 2026). It uses NVIDIA accelerated computing built directly into the Warehouse query execution engine. The query optimizer automatically routes eligible analytical queries to the GPU — no query rewrites, no infrastructure configuration, no new system to manage. Enable it in workspace settings and it applies to all SQL Analytics Endpoints and Data Warehouses in that workspace. In internal benchmarking conducted in May 2026, it ran up to 7× faster than three comparable cloud data warehouses at 64-user concurrency. UNC Health reported 5× improvement on existing production workloads. The underlying research — CoddSpeed: Hardware Accelerated Query Processing in Microsoft Fabric — won the SIGMOD 2026 Best Industry Paper award. Early access preview is scheduled for July 2026. Source: Microsoft Fabric blog.

When should I use Lakehouse and when should I use Warehouse?

Microsoft’s official decision framework uses three criteria. Development model: if your team prefers Spark/PySpark/Python → Lakehouse; if your team prefers T-SQL → Warehouse. Multi-table transactions: if you need full ACID across multiple tables via T-SQL → Warehouse; if single-table Delta MERGE is sufficient → Lakehouse. Data complexity: if data includes unstructured or semi-structured formats → Lakehouse; if structured data only → Warehouse. When all three answers point the same direction, the decision is clear. When they point different directions — common in teams with mixed skill sets building medallion architectures — the Better Together pattern (Lakehouse for Bronze/Silver, Warehouse for Gold) is the standard approach. Source: Microsoft Fabric decision guide.

What is Materialized Lake Views and does it change the decision?

Materialized Lake Views (MLV) reached General Availability in March 2026. They let you define transformation logic in SQL or PySpark — Fabric materialises the result as a Delta table and manages incremental refresh, dependency tracking, and scheduling automatically. Extended lineage (June 2026) means MLVs now span Lakehouse boundaries across workspaces: one schedule in a Gold Lakehouse cascades through Silver and Bronze automatically. Before MLVs, teams often chose Warehouse at Gold purely for automated materialisation. MLVs close that gap — a Lakehouse with MLVs can now handle many of the same Gold-layer use cases without Warehouse. Choose Warehouse at Gold when you additionally need multi-table T-SQL transactions, ALTER COLUMN in CI/CD, or the GPU-accelerated query engine at high concurrency. See the Fabric Lakehouse Tutorial for a full MLV deep-dive.

Can Spark write directly to Warehouse tables?

No. Spark notebooks in Fabric write to Lakehouse Delta tables — they cannot write directly to Warehouse tables via spark.write.saveAsTable() or spark.write.format("delta") targeting a Warehouse. The standard pattern in a Better Together architecture: Spark transforms and writes to a Lakehouse Silver table, then the Warehouse loads or references it via cross-database T-SQL query using three-part names (silverlakehouse.schema.tablename), or via a T-SQL INSERT INTO gold_table SELECT * FROM silverlakehouse.schema.tablename. This is not a workaround — it is the documented Microsoft pattern for combining both engines.

Should every medallion tier be its own Lakehouse?

Both approaches are valid and Microsoft supports both. Separate Lakehouses per tier (Bronze Lakehouse, Silver Lakehouse, Gold Lakehouse) improves security boundaries — workspace RBAC means Bronze access can be restricted to engineering only, while Gold access is open to analysts — and makes lineage and ownership clearer in the OneLake catalog. The trade-off is higher item count and more cross-Lakehouse shortcut configuration. Single Lakehouse with schemas (bronze schema, silver schema, gold schema) reduces management overhead but requires strict governance documentation and means all workspace members see all tiers. For small teams with three engineers, a single Lakehouse with schemas is often simpler. For larger organisations with separate team ownership per tier, separate Lakehouses per tier with MLV extended lineage (June 2026) is the cleaner architecture.

When should I use Eventhouse instead of Lakehouse or Warehouse?

Use Eventhouse (backed by KQL/Azure Data Explorer) when your workload involves streaming event data and you need sub-second interactive query latency on time-series or high-granularity data — IoT telemetry, clickstream analytics, security event correlation, or operational dashboards. The Microsoft data store decision guide explicitly positions Eventhouse for streaming event data and high-granularity interactive analytics. Eventhouse is not a replacement for Lakehouse or Warehouse — it is a third path for a specific workload type. Eventstream can route data to Eventhouse and Lakehouse simultaneously, so operational real-time dashboards can run on Eventhouse while historical batch analysis runs on the same data in Lakehouse. Source: Microsoft Fabric decision guide — choose a data store.

⚠ Accuracy Disclaimer

Feature descriptions are based on official Microsoft Learn documentation, the Fabric June 2026 Feature Summary, and the Build 2026 Azure blog (published June 2, 2026). GPU-accelerated Warehouse early access is scheduled for July 2026 — not yet GA. Verify current feature status at learn.microsoft.com/fabric/fundamentals/whats-new. UIG Data Lab is an independent publication, not affiliated with or endorsed by Microsoft Corporation.

A.J. Data Engineering Researcher & Technical Writer · UIG Data Lab All articles →

A.J. researches and writes about data engineering, analytics architecture, Microsoft Fabric, and modern cloud data platforms. Coverage spans Microsoft Fabric, Power BI, Azure Data Engineering, Databricks, Snowflake, Apache Spark, dbt, Apache Airflow, and modern cloud data infrastructure. The focus is practitioner-level content that helps data professionals understand platform capabilities, evaluate technology decisions, optimise costs, and implement practical solutions using official documentation, product updates, community insights, and industry best practices. His writing covers real decisions from real deployments — not documentation rewrites.

Microsoft Fabric Lakehouse Data Warehouse OneLake Delta Lake Medallion Architecture Direct Lake Apache Spark Power BI Azure