Welcome to the third part of our Microsoft Fabric Learning Path: Data Pipelines in Fabric. In this tutorial, you’ll master data ingestion in Microsoft Fabric, focusing on both batch and real-time strategies. We’ll explore Data Pipelines in depth and integrate Eventstream for real-time use cases.
🔍 What is Data Ingestion in Microsoft Fabric?
Data ingestion is the process of moving raw data from source systems into a central repository, such as a Lakehouse or Warehouse in Microsoft Fabric. Fabric offers two primary ingestion paths:
- Batch ingestion using Data Pipelines
- Real-time ingestion using Eventstream

🛠️ Data Pipelines in Fabric: Core Ingestion Engine
Data Pipelines in Fabric are used to orchestrate and automate data workflows. They consist of activities that connect to data sources, apply logic, and load into destinations.
📦 Copy Activity
This is the most basic ingestion tool. It copies data from a source (like Blob Storage, SharePoint, SQL Server, or local file) into Fabric destinations such as Lakehouse or Warehouse.
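Copy Activity is configured through the pipeline UI (you pick a source connection and a destination), not written as code. As a rough mental model, though, the work it performs is essentially the following, sketched here in plain Python with a hypothetical `copy_csv` helper:

```python
# Illustrative sketch only: Fabric's Copy Activity is configured in the
# pipeline designer, not coded. This shows the equivalent logic for a
# simple file-to-file copy.
import csv


def copy_csv(source_path: str, dest_path: str) -> int:
    """Copy rows from a source CSV into a destination file; return row count."""
    rows_copied = 0
    with open(source_path, newline="") as src, open(dest_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(row)
            rows_copied += 1
    return rows_copied
```

In a real pipeline, the source would be Blob Storage or SQL Server rather than a local path, and the destination a Lakehouse table, but the copy-rows-across pattern is the same.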
🧠 Dataflow Gen2 Activity
Used for low-code/no-code data transformation and ingestion. It allows you to use Power Query to cleanse, transform, and load data.
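In Dataflow Gen2 you build these cleansing steps visually in Power Query, so there is no Python to write. Purely to illustrate the *kind* of transformation a dataflow performs (trim text, drop incomplete rows, type a numeric column), here is a hypothetical `clean_rows` sketch:

```python
# Hypothetical sketch of typical Power Query-style cleansing, expressed in
# plain Python. In Fabric you would build these steps in the Dataflow Gen2
# editor, not code them.
def clean_rows(rows: list[dict]) -> list[dict]:
    cleaned = []
    for row in rows:
        region = (row.get("region") or "").strip()  # trim whitespace
        amount = row.get("amount")
        if not region or amount in (None, ""):
            continue  # drop incomplete rows
        cleaned.append({"region": region, "amount": float(amount)})  # type the column
    return cleaned
```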
📝 Notebook Activity
Executes Spark notebooks as part of the pipeline. Ideal for advanced data transformations using PySpark, Spark SQL, or machine learning logic.
🔤 Script Activity
Runs SQL scripts directly against Lakehouse or Warehouse. Useful for creating tables, inserting records, or performing updates.
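The SQL itself is what matters here. A minimal sketch of a create/insert/update script follows; it runs against SQLite purely so the example is self-contained, whereas a Script Activity in Fabric would run the same kind of statements against the Lakehouse or Warehouse SQL endpoint (the `sales` table and values are made up for illustration):

```python
# Uses SQLite only to make the example runnable locally; in Fabric, the
# Script Activity sends SQL to the Lakehouse/Warehouse SQL endpoint.
import sqlite3

DDL_AND_DML = """
CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
INSERT INTO sales (region, amount) VALUES ('EMEA', 1200.0), ('APAC', 950.5);
UPDATE sales SET amount = amount * 1.1 WHERE region = 'EMEA';
"""


def run_script(conn: sqlite3.Connection, script: str) -> None:
    """Execute a multi-statement SQL script: create a table, insert, update."""
    conn.executescript(script)
    conn.commit()
```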
🔗 Web Activity
Calls REST APIs or webhooks from within the pipeline. Useful for external system integration like notifying apps or fetching auth tokens.
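A Web Activity is configured with a URL, method, headers, and body in the pipeline designer. The sketch below shows what such a call looks like in code, using a hypothetical token endpoint (`example.com` URL, `client_id`/`client_secret` fields are placeholders); the request is only built, not sent:

```python
# Illustrative sketch of the request a Web Activity might issue to fetch an
# auth token. URL and body fields are hypothetical; the request is prepared
# but not sent here.
import json
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to a token endpoint."""
    body = json.dumps({"client_id": client_id, "client_secret": client_secret}).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request would be `urllib.request.urlopen(req)`; in a pipeline, the Web Activity handles that and exposes the response to downstream activities.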
🔁 Control Activities
- Filter Activity: Filters items in an input array based on a condition
- If Condition Activity: Adds branching logic
- Switch Activity: Similar to switch-case for routing logic
- ForEach Activity: Loops through an array of items
- Until Activity: Repeats activities until a condition is met
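These control activities map closely onto familiar programming constructs. The sketch below is not pipeline code (in Fabric you wire these up in the designer); the file names and statuses are invented for illustration:

```python
# Plain-Python analogues of the pipeline control activities above.
files = ["sales_2024.csv", "notes.txt", "sales_2023.csv"]

# Filter Activity: keep only items matching a condition
csv_files = [f for f in files if f.endswith(".csv")]

# ForEach Activity: run a step per item
processed = []
for name in csv_files:
    # If Condition Activity: branch per item
    if "2024" in name:
        processed.append((name, "current"))
    else:
        processed.append((name, "archive"))

# Switch Activity: route by a key, with a default case
route = {"current": "gold", "archive": "silver"}.get(processed[0][1], "raw")

# Until Activity: repeat until a condition is met (e.g. polling a job status)
attempts = 0
status = "running"
while status != "succeeded":
    attempts += 1
    status = "succeeded" if attempts >= 3 else "running"
```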
🧪 Stored Procedure Activity
Executes stored procedures in supported databases. Often used for ETL jobs in SQL-based environments.
⏱️ Triggers in Data Pipelines
- Manual Trigger: Runs on demand
- Schedule Trigger: Executes at a specific frequency
- Event Trigger: Executes on data arrival or external signals
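Fabric evaluates schedules for you once a trigger is configured, but the idea behind a schedule trigger is simply "last run plus frequency", as this tiny hypothetical sketch shows:

```python
# Illustrative only: Fabric computes schedule firings itself; this just
# shows the "last run + frequency" idea behind a schedule trigger.
from datetime import datetime, timedelta


def next_run(last_run: datetime, frequency_minutes: int) -> datetime:
    """Compute when a fixed-frequency schedule would fire next."""
    return last_run + timedelta(minutes=frequency_minutes)
```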
⚡ Real-Time Ingestion with Eventstream
Eventstream is Fabric’s real-time ingestion platform. It connects to Event Hubs, IoT Hubs, and other streaming sources to push real-time data into your Lakehouse or KQL Database.
🚀 Eventstream Pipeline Example
- Connect to Azure Event Hub or IoT Hub
- Transform using inline expressions or filters
- Route to destinations like Lakehouse or Power BI Direct Lake
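The transform-and-route step above is configured inside the Eventstream editor rather than coded. To make the pattern concrete, here is a plain-Python sketch with invented sensor events: a condition decides each event's destination, mirroring how a filter routes hot readings one way and normal readings another:

```python
# Illustrative sketch of Eventstream-style filtering and routing; in Fabric
# this is configured in the Eventstream editor, not written as Python.
events = [
    {"device": "sensor-1", "temp": 21.5},
    {"device": "sensor-2", "temp": 85.0},
    {"device": "sensor-1", "temp": 22.1},
]


def route(event: dict) -> str:
    """Route hot readings to an alerts destination, the rest to the Lakehouse."""
    return "alerts" if event["temp"] > 80 else "lakehouse"


destinations: dict[str, list] = {}
for e in events:
    destinations.setdefault(route(e), []).append(e)
```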
📊 Monitor & Debug Your Pipelines
Use the Monitoring Hub in Fabric to check pipeline execution history, errors, performance, and trigger logs. Add alerts and retry logic where needed.
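Fabric pipelines let you set retry counts and intervals per activity in the designer. If you ever need the same behavior in your own notebook code, a minimal retry-with-backoff sketch (names are hypothetical) looks like this:

```python
# Minimal retry-with-exponential-backoff sketch; Fabric activities have
# built-in retry settings, this shows the same idea in code.
import time


def run_with_retry(activity, max_attempts: int = 3, base_delay: float = 1.0):
    """Call a flaky activity, retrying with exponential backoff between
    attempts; re-raise the error after the final attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * (2 ** (attempt - 1)))
```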
📁 Sample Dataset for Practice
Use the dataset below to practice batch ingestion workflows:
📥 Download sales_data_1000_rows.csv
✅ Best Practices
- Organize OneLake folders into raw/silver/gold zones
- Use naming conventions for pipeline runs
- Store secrets in Azure Key Vault or Fabric credentials
- Use Git integration for version control of pipelines
🚀 What’s Next?
In the next post, we’ll dive into transforming data with Notebooks and Spark in Fabric. Stay tuned!
🔗 Read Next: Transforming Data Using Notebooks & Spark – Microsoft Fabric Tutorial Series
Made with ❤️ for data learners. This post is part of the Microsoft Fabric Tutorial Series.