
Microsoft Fabric Architecture Interview Questions (2026 Guide)

Master OneLake, Shortcuts, Mirroring, and multi-cloud strategies with 30 architect-level Microsoft Fabric Architecture interview questions and scenarios.

AI Snapshot

What are the top Microsoft Fabric Architecture interview questions?

The most common Microsoft Fabric Architecture interview questions focus on the “One Copy” feature of OneLake, the decision framework for using Shortcuts vs. Mirroring, and how to optimize costs using F-SKU Capacity. Architects are also tested on designing multi-cloud solutions that minimize egress fees when connecting to AWS S3 or Google Cloud, and implementing federated governance using Domains.

If you are applying for a Solution Architect or Data Lead role, preparing for Microsoft Fabric Architecture interview questions is non-negotiable. Unlike standard engineering roles, architects must demonstrate the ability to design scalable, secure, and cost-effective data platforms. You must be able to articulate why you would choose a Domain architecture over a simple Workspace structure.

This guide compiles 30 advanced Microsoft Fabric Architecture interview questions organized into six strategic modules. We have integrated real-world solutions from our Microsoft Fabric Tutorial Series to ensure your answers reflect production-ready experience. We cover everything from the OneLake folder hierarchy to complex disaster recovery scenarios.

Module A: OneLake Mechanics & Foundation

This module covers the core virtualization concepts that underpin the entire Fabric ecosystem. You must understand how Fabric abstracts storage to answer these questions effectively.

Core Concepts

Q1: What is OneLake and how does it abstract physical storage?

OneLake is a logical, SaaS-based data lake that comes automatically provisioned with every Fabric tenant. It provides a single namespace (onelake://) for the entire organization. Under the hood, it is built on top of Azure Data Lake Storage (ADLS) Gen2.

Architecture Deep Dive: OneLake abstracts physical storage accounts. Users interact with logical “Items” (Lakehouses, Warehouses) without managing Blob keys or Access Control Lists (ACLs) on individual containers. This unification enables the “One Copy” feature, where diverse compute engines like Spark, T-SQL, and KQL can all read the same physical Delta Parquet files without data duplication.
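
To make the “One Copy” idea concrete, here is a minimal PySpark sketch as you might run it in a Fabric notebook, where a `spark` session is pre-provisioned. The workspace, lakehouse, and table names are hypothetical placeholders.

```python
# Hypothetical OneLake path; Fabric notebooks expose a ready-made `spark` session.
path = (
    "abfss://SalesWorkspace@onelake.dfs.fabric.microsoft.com/"
    "SalesLakehouse.Lakehouse/Tables/orders"
)

# Spark reads the Delta Parquet files directly from OneLake.
df = spark.read.format("delta").load(path)
df.show(5)

# The SAME physical files are also queryable as a catalog table -- no second copy.
spark.sql("SELECT COUNT(*) AS order_count FROM SalesLakehouse.orders").show()
```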

Why It Matters

In traditional Azure architectures, you had to manage disparate Storage Accounts and manually sync permissions. OneLake removes this overhead, acting as the “OneDrive for Data.”

Q2: Explain the OneLake folder hierarchy from Tenant to Item.

Fabric enforces a strict 4-level hierarchy to manage governance and security. You should be able to draw this on a whiteboard:

  1. Tenant (Root): The top-level boundary (e.g., `contoso.onelake.fabric.microsoft.com`).
  2. Domain (Logical): A grouping of workspaces tailored for business units (e.g., “Finance Domain”, “HR Domain”) to support Data Mesh architectures.
  3. Workspace (Physical): The boundary for security (RBAC) and billing (Capacity). It acts as a “Project” container.
  4. Item (Functional): The artifact itself (Lakehouse, Warehouse, KQL Database).

This structure is critical for implementing Federated Governance across large enterprises, ensuring that Finance admins cannot accidentally access HR data.
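
As a quick illustration, here is how the hierarchy maps onto a physical OneLake path. The names are hypothetical, and note that Domains are purely logical, so they never appear in the path itself.

```python
# Anatomy of a hypothetical OneLake path (Python string used only for annotation):
onelake_path = (
    "abfss://"
    "FinanceWorkspace"                    # Level 3: Workspace (security/billing boundary)
    "@onelake.dfs.fabric.microsoft.com/"  # Level 1: Tenant root (single namespace)
    "GL.Lakehouse/"                       # Level 4: Item (here, a Lakehouse)
    "Tables/ledger"                       # A managed table inside the item
)
# Level 2 (Domain) is a logical grouping over workspaces and has no path segment.
```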

Storage Architecture

Q3: Managed vs. Unmanaged Tables: What is the difference?

In a Fabric Lakehouse, data storage is divided into two distinct zones, and knowing the difference is crucial for data lifecycle management (see the sketch after this list):

  • Managed Tables (Tables Folder): Spark manages both the metadata AND the physical data files. If you execute a DROP TABLE command, the underlying Parquet files in OneLake are permanently deleted. These tables support full ACID transactions and Time Travel.
  • Unmanaged Tables (Files Folder): Spark manages only the metadata. The data typically resides in the “Files” section or an external shortcut. Consequently, if you drop the table, the metadata is removed, but the actual physical files remain intact. This is often used for raw/bronze ingestion.
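
The following PySpark sketch makes the distinction concrete, assuming a Fabric notebook with a default Lakehouse attached; table and folder names are illustrative.

```python
df = spark.createDataFrame([(1, "widget"), (2, "gadget")], ["id", "product"])

# Managed table: Spark owns metadata AND files (stored under Tables/).
# DROP TABLE sales_managed would delete the Parquet files permanently.
df.write.format("delta").saveAsTable("sales_managed")

# Unmanaged table: files land under Files/; Spark tracks only the metadata.
df.write.format("delta").mode("overwrite").save("Files/raw/sales")
spark.sql("CREATE TABLE sales_raw USING DELTA LOCATION 'Files/raw/sales'")

# DROP TABLE sales_raw removes the catalog entry; Files/raw/sales remains intact.
```
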
Q4: How does Fabric handle Medallion Architecture (Bronze/Silver/Gold)?

Fabric does not enforce a physical separation of data lakes. Instead, architects typically choose between two patterns based on organizational scale (a sketch follows the list):

  • Single Lakehouse (Small Scale): Use folder prefixes or schemas (e.g., `bronze_sales`, `silver_sales`) within one Lakehouse item. This is simple but can lead to “noisy neighbor” issues.
  • Multi-Workspace (Enterprise Scale): Use separate workspaces for Bronze, Silver, and Gold. This isolates security roles (e.g., only Data Engineers see Bronze) and allows for independent capacity scaling for ingestion versus serving layers.
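
A minimal sketch of the single-Lakehouse pattern, using name prefixes as layer boundaries. Column and table names are assumptions for illustration; at enterprise scale the same logic would read from a Bronze-workspace shortcut instead.

```python
# Read the raw layer, apply basic cleansing, and publish to the Silver prefix.
bronze = spark.read.table("bronze_sales")

silver = (
    bronze
    .dropDuplicates(["order_id"])          # de-duplicate on the assumed business key
    .filter("order_amount IS NOT NULL")    # drop rows failing a basic quality rule
)

silver.write.format("delta").mode("overwrite").saveAsTable("silver_sales")
```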

Optimization

Q5: How does OneLake Caching work for external shortcuts?

When you create a Shortcut to an external source (like AWS S3), Fabric does not copy the data initially. However, when you query that data, Fabric triggers “Read-Through Caching.”

It pulls the required file segments from S3 and caches them in the OneLake backend (in Azure). Subsequent queries hit this cache, which significantly reduces egress fees from AWS and improves latency. This is a critical feature for multi-cloud architectures. For more on optimizing performance, see our Spark Shuffle Optimization Guide.

Q6: What is V-Order and why does it matter?

V-Order is a write-time optimization specific to Microsoft Fabric. It applies special sorting and compression algorithms to Parquet files to make them highly efficient for the VertiPaq engine used by Power BI Direct Lake.

The Trade-off

V-Order increases write times by approximately 15% due to the extra compute required for sorting. However, it boosts Direct Lake read performance by 2x-3x. Therefore, you should disable V-Order for Bronze/Staging layers (write-heavy) but enforce it for Gold layers (read-heavy), as in the sketch below.
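
The session-level toggle might look like this in a Fabric notebook. The property name follows Microsoft Learn documentation for Fabric Spark, but verify it against your runtime version; the DataFrames are placeholders.

```python
df = spark.range(1000).withColumnRenamed("id", "event_id")        # placeholder data
gold_df = spark.range(1000).withColumnRenamed("id", "sales_id")   # placeholder data

# Bronze/Staging (write-heavy): skip the V-Order sorting cost on ingestion.
spark.conf.set("spark.sql.parquet.vorder.enabled", "false")
df.write.format("delta").mode("append").saveAsTable("bronze_events")

# Gold (read-heavy, served via Direct Lake): pay the write cost, gain read speed.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
gold_df.write.format("delta").mode("overwrite").saveAsTable("gold_sales_summary")
```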

Learn more in our Direct Lake Optimization Guide.

Module B: Shortcuts & Data Virtualization

Shortcuts enable the “Zero Copy” architecture. These Microsoft Fabric Architecture interview questions test your ability to avoid data duplication and manage virtualization.

Shortcut Fundamentals

Q7: Shortcut vs. Copy Activity: When to use which?

This is a core decision framework that every architect must know:

| Feature | Shortcut | Pipeline (Copy) |
| --- | --- | --- |
| Data Movement | Zero (Pointer only) | Physical Copy (Bytes move) |
| Latency | Real-time access | Batch Latency |
| Source Data | Must be compatible (Parquet/Delta) | Can transform legacy formats |
| Best For | Cross-workspace sharing & Virtualization | Legacy/Slow sources (On-prem SQL) |

Conclusion: Use Shortcuts whenever data is already in a compatible format (Parquet/Delta) and accessible via ADLS/S3. Use Copy Activities for legacy ingestion.
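
Shortcuts are usually created through the UI, but they can also be scripted. The sketch below uses the Fabric REST API's shortcuts endpoint; the endpoint shape and payload keys follow the public docs at the time of writing, and every ID, URL, and name is a hypothetical placeholder. You also need a valid Microsoft Entra bearer token and a pre-created cloud connection.

```python
import requests

workspace_id = "<workspace-guid>"
lakehouse_id = "<lakehouse-item-guid>"
token = "<entra-bearer-token>"

# Shortcut pointing at an ADLS Gen2 folder; no bytes are copied.
payload = {
    "path": "Files",                 # where the shortcut appears inside the Lakehouse
    "name": "external_sales",
    "target": {
        "adlsGen2": {
            "location": "https://mystorageacct.dfs.core.windows.net",
            "subpath": "/landing/sales",
            "connectionId": "<connection-guid>",
        }
    },
}

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
    f"/items/{lakehouse_id}/shortcuts",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())
```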

Q8: How do you fix Shortcut “403 Forbidden” errors?

A common issue in production is “403 Forbidden” when accessing a shortcut. This usually happens because shortcuts use delegated authorization. If User A creates a shortcut, but User B tries to read it, User B must have permissions on the target path (the source data).

If User B does not have read access to the source ADLS container or OneLake item, the query fails. To resolve this, you must grant the consuming user (or their security group) at least “Read” permissions on the source item. For a step-by-step resolution, follow our guide on Resolving Shortcut Authorization Failures.

Advanced Scenarios

Q9: Can you shortcut to On-Premises SQL?

No. Currently, Shortcuts support ADLS Gen2, AWS S3, Google GCS, and internal OneLake items. You cannot create a direct shortcut to on-premises SQL Server tables.

Workaround: To access on-prem data, you must physically ingest data using a Data Pipeline or Dataflow Gen2 via the On-Premises Data Gateway. Once the data lands in OneLake (as Delta Parquet), you can then shortcut to that Lakehouse table from other workspaces.

Q10: Explain the “Egress Fee Trap” with Multi-Cloud shortcuts.

If you shortcut to AWS S3, data travels from AWS (e.g., us-east-1) to Azure (your Fabric region) during query execution. AWS charges for this outbound data transfer (egress).

Architect Warning

While Fabric’s caching helps, high-volume scans on massive S3 buckets can still incur significant AWS bills. For petabyte-scale data that is frequently accessed, consider a one-time migration to OneLake using the “Copy Job” to eliminate recurring egress fees. See the back-of-envelope estimate below.
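
A sketch of that estimate in Python; every rate and ratio here is an illustrative assumption, not current AWS pricing.

```python
# Back-of-envelope egress estimate -- every figure here is an assumption.
scanned_tb_per_month = 10        # data Fabric queries pull from S3 each month
cache_hit_ratio = 0.60           # assumed fraction served from the OneLake cache
egress_usd_per_gb = 0.09         # hypothetical us-east-1 internet egress rate

billable_gb = scanned_tb_per_month * 1024 * (1 - cache_hit_ratio)
print(f"Estimated monthly egress: ${billable_gb * egress_usd_per_gb:,.0f}")
# ~$369/month at these assumptions; scale the inputs to petabytes and a one-time
# Copy Job migration quickly wins.
```
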
Q11: What happens if a Shortcut source is deleted?

A Shortcut is strictly a pointer/reference. If the source file or table is deleted (or renamed) in the source system, the shortcut becomes a “broken link.” Fabric does not cascade delete the shortcut metadata automatically. Queries against it will fail with a “Path not found” error until you manually remove or update the broken shortcut.

Q12: How do Cross-Tenant shortcuts work?

Fabric supports Cross-Tenant shortcuts to ADLS Gen2 via B2B sharing mechanisms. However, creating shortcuts to OneLake items in a different Fabric tenant is currently limited and typically requires treating the external OneLake as a generic ADLS Gen2 source using SAS tokens or Service Principals for authentication.

Module C: Mirroring & Replication

Mirroring is the “Killer Feature” replacing traditional ETL. These Microsoft Fabric Architecture interview questions cover CDC and real-time replication.

Mirroring Architecture

Q13: What is Mirroring and how does it use CDC?

Mirroring creates a near-real-time replica of a database (Snowflake, Azure SQL, Cosmos DB) inside OneLake. Internally, it uses Change Data Capture (CDC) to read the transaction logs of the source database.

It automatically converts the proprietary source format into open Delta Parquet format in OneLake. Unlike Shortcuts (which read remotely), Mirroring creates a local copy, but maintains it automatically without users managing complex ETL pipelines. See our Mirroring Tutorial for setup steps.

Q14: Decision Matrix: Mirroring vs. Shortcut?

This is a definitive architect-level distinction that you must memorize:

Decision Rule
  • Use Shortcuts for File-Based Sources (ADLS, S3, GCS) where data is already in a usable format (Parquet/CSV/Delta).
  • Use Mirroring for Database Sources (Snowflake, SQL DB, Cosmos) where data is locked in a proprietary engine and needs to be converted to Delta Parquet for Fabric compute to access it.

Performance & Cost

Q15: Does Mirroring consume F-SKU Capacity?

Yes. Although Mirroring storage in OneLake is free (up to a limit based on purchased capacity), the compute resources required to read the CDC stream and write Delta Parquet files consume Capacity Units (CU) from your F-SKU. You must monitor this using the Capacity Metrics App to avoid throttling.

Q16: Can you write back to a Mirrored Database?

No. Mirrored databases in Fabric are Read-Only. They are intended for analytics, reporting (Direct Lake), and cross-database querying. Any write operations (INSERT/UPDATE/DELETE) must occur in the source system (e.g., Snowflake). The changes will then propagate to Fabric via the mirror.

Q17: How does Snowflake Mirroring handle data types?

Fabric automatically maps Snowflake data types to Spark/Delta types. However, there are limitations. Complex types like `GEOGRAPHY` or certain custom User Defined Types (UDTs) may not be supported or might be cast to strings. It is essential to validate schema compatibility during the Proof of Concept phase to ensure data fidelity.

Q18: Mirroring vs. Eventstreams: Which is faster?

Eventstreams are designed for sub-second, real-time ingestion (IoT, Telemetry). Mirroring is “near-real-time,” typically ranging from seconds to minutes depending on the log replication interval. For IoT use cases requiring immediate action, use Eventstreams instead of database mirroring.

Module D: Governance & Domains

Enterprise governance is a critical topic. These questions focus on Data Mesh, security propagation, and compliance.

Data Mesh & Security

Q19: What is a Domain and why use it over a Workspace?

A Domain is a higher-level logical container that groups multiple workspaces. It is designed for Data Mesh architectures where different business units (e.g., Finance, HR) need autonomy.

Using Domains allows you to delegate admin rights to specific business owners (“Domain Admins”) rather than having central IT manage every single workspace. It also enables “Federated Governance” where policies can be set at the Domain level.

Q20: How does security propagate from Workspace to Items?

Security cascades implicitly. If a user is an “Admin” or “Member” of a Workspace, they generally have full access to all data within that workspace’s Lakehouses and Warehouses.

Security Risk

Even a “Viewer” role in the workspace grants read access to the underlying data via the SQL Endpoint by default. To restrict this, you must implement Item-Level Sharing or use OneLake Data Access Roles.

Q21: What are OneLake Data Access Roles?

This feature (currently in preview/rollout) brings folder-level security (RBAC) to OneLake. It allows you to define roles (e.g., “Sales_Read”) and assign permissions to specific folders inside a Lakehouse, independent of the Workspace roles. This mimics the granular Access Control Lists (ACLs) found in ADLS Gen2.

Compliance

Q22: How does Purview integration work with Fabric?

Fabric has native integration with Microsoft Purview. When you apply a Sensitivity Label (e.g., “Highly Confidential”) to a Lakehouse, that label travels with the data. If a user creates a Power BI report from that Lakehouse, the report automatically inherits the label, ensuring compliance across the entire data lineage.

Q23: Scenario: How to isolate HR Data from IT Admins?

In traditional systems, IT admins often see everything. In Fabric, you can create a dedicated “HR Domain” and “HR Workspace.” You assign the HR lead as the Workspace Admin. By strictly controlling the Workspace Access List, you can prevent central IT from viewing the actual data rows, although Tenant Admins typically retain some oversight capabilities for auditing purposes.

Module E: Architect Scenarios

Use the STAR method (Situation, Task, Action, Result) for these scenario-based Microsoft Fabric Architecture interview questions.

Global Architecture

Q24: Design a Multi-Region Architecture for Data Residency.

Scenario: A global bank needs data to remain physically in Germany (GDPR) but needs to be reported on by HQ in the US.

Solution: Use Multi-Geo Capacities. Provision a Fabric Capacity in West Europe and another in East US. Create a “Germany Workspace” assigned to the EU capacity. Create a separate “US Workspace” assigned to the US capacity. Use Shortcuts in the US workspace pointing to the Germany data. Metadata travels to the US, but the physical storage compliance remains rooted in the EU capacity.

Q25: Design a Disaster Recovery (DR) plan for Fabric.

Fabric provides built-in BCDR (Business Continuity and Disaster Recovery). OneLake data is automatically replicated (ZRS or GRS depending on configuration). However, for critical resilience, architects should maintain code artifacts (Notebooks, Pipelines) in a Git repo (Azure DevOps) and be ready to redeploy them to a secondary region workspace in case of a region-wide outage.

Performance & Cost

Q26: Optimize a “Slow” Direct Lake Report.

Situation: A Direct Lake report is falling back to DirectQuery mode, causing slow performance.

Action: Check the Direct Lake fallback metrics. Likely causes are high column cardinality or unsupported DAX features. The fix involves optimizing the Delta table (ensuring V-Order is enabled) and reducing the number of unique values in columns used for filtering.

Q27: Architecting for High Concurrency.

Scenario: 1,000 users need to query the data simultaneously at 9 AM.

Solution: Use Direct Lake mode over Import mode. Direct Lake pushes the query load to the Fabric Capacity (F-SKU) rather than the individual Power BI user limits. Ensure “Smoothing” is monitored to handle the 9 AM spike without throttling.

Q28: Cost Optimization Strategy.

To reduce costs, identify “Cold” data. Move infrequently accessed data to an ADLS Gen2 account with the “Cool” or “Archive” tier enabled, and then shortcut to it. This keeps OneLake capacity free for “Hot” data while maintaining accessibility. Use our Pricing Calculator to model F-SKU needs.

Module F: Migration Strategies

Databricks & Synapse

Q29: How to migrate from Databricks to Fabric?

Do not “Lift and Shift.” Adopt a phased approach:

  1. Phase 1 (Shortcut): Leave data in ADLS (managed by Databricks). Shortcut it to Fabric to test Power BI Direct Lake.
  2. Phase 2 (Compute): Migrate simple notebooks to Fabric Spark.
  3. Phase 3 (Storage): Repoint ingestion pipelines to write to OneLake directly.

See our Fabric vs Databricks comparison for more details.

Q30: Migrating Synapse Dedicated Pools to Fabric.

The migration involves moving T-SQL logic. Fabric Warehouse does not support all Synapse features (e.g., Identity Columns are limited). You must rewrite DDL statements. Use the “Fabric Migration Assistant” to assess compatibility. Check our Synapse vs Fabric comparison for a full feature mapping.

Ready for the next step?

Now that you have mastered Architecture, let’s dive into Spark optimization.

Start Spark Questions →

References: Microsoft Learn | Delta Lake
