Transform Data Using Notebooks – Microsoft Fabric Tutorial Series

In this tutorial, we explore how to transform data using Notebooks in Microsoft Fabric. Whether you’re a data engineer, analyst, or citizen developer, notebooks offer a powerful and flexible way to clean, transform, analyze, and visualize your data. This hands-on guide walks you through the full notebook experience in Microsoft Fabric, from creation to execution and best practices.


πŸ“˜ What is a Notebook in Fabric?

A notebook in Microsoft Fabric is an interactive environment that allows you to combine code (like PySpark or SQL), markdown text, and visualizations. It is built on top of Apache Spark, enabling distributed processing and advanced analytics within the Fabric ecosystem.

πŸš€ Why Use Notebooks for Data Transformation?

Here’s why notebooks are essential in data engineering within Fabric:

  • Run complex ETL pipelines with Python or PySpark.
  • Visualize data transformations step-by-step.
  • Collaborate with team members using comments and versioning.
  • Trigger notebooks from pipelines or schedule them independently.

πŸ› οΈ How to Create a Notebook in Microsoft Fabric

  1. Go to your Fabric workspace and click on New β†’ Notebook.
  2. Choose the Spark runtime and set your language preference (PySpark/Python, Spark SQL, Scala, or R).
  3. Add code or markdown cells to begin your transformation logic.

✨ Key Notebook Features (with Visuals)

πŸ§‘β€πŸ€β€πŸ§‘ Tagging Others in a Comment

You can @mention collaborators in cell comments for faster feedback and teamwork.

[Image: Transform Data Using Notebooks]

πŸ“ Markdown Folding

Collapse or expand long markdown explanations to keep your notebook clean and organized.

[Image: Markdown folding demo]

πŸ”’ Lock and Freeze Cell Output

Lock a cell to prevent accidental edits, or freeze it so it is skipped during runs, keeping key outputs consistent.

[Image: Lock and Freeze Cells]

βš™οΈ Parameterized Session Configuration

Tune the Spark session that backs your notebook, such as memory, cores, and Spark configuration, using session parameter settings.

[Image: Session Parameters in Fabric Notebook]
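For example, Fabric notebooks support a `%%configure` cell magic for adjusting the Spark session; the values below are illustrative, not recommendations. Run it as the first cell of the notebook (the `-f` flag forces a restart if a session is already active):

```
%%configure -f
{
    "driverMemory": "8g",
    "driverCores": 4,
    "executorMemory": "8g",
    "executorCores": 4,
    "conf": {
        "spark.sql.shuffle.partitions": "200"
    }
}
```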

πŸ“¦ Using Variables for Dynamic Input

Define variables once and reuse them across code cells to improve readability and maintainability.

[Image: Notebook Variables]
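As a minimal sketch (all names here are hypothetical), you can define parameters once in an early cell and reuse them everywhere else:

```python
# Hypothetical parameters, defined once in an early cell
source_table = "sales_data"
target_table = "cleaned_sales_data"
region = "East"

# Reused later, e.g. to build a SQL-style filter expression
filter_expr = f"Region = '{region}'"
print(filter_expr)
```

A later cell could then read `spark.read.table(source_table).filter(filter_expr)`, so changing `region` in one place updates the whole notebook.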

⚑ Executing Data Transformations

Here’s a simple transformation pipeline using PySpark in Fabric:


# Read from a Lakehouse table (the table name is illustrative)
df = spark.read.table('sales_data')

# Clean and transform: drop rows with nulls, keep only the East region
df_clean = df.dropna().filter(df['Region'] == 'East')

# Write back to a Lakehouse table
df_clean.write.mode('overwrite').saveAsTable('cleaned_sales_data')

You can also visualize intermediate steps using:


display(df_clean)

🧠 Best Practices for Fabric Notebooks

  • Always use markdown cells to describe each step.
  • Use version control or comments for collaborative tracking.
  • Schedule notebooks or integrate into Pipelines for automation.
  • Group related logic into sections using markdown headers.
  • Monitor resource usage for Spark runtime (memory, partitions).

πŸ“š Continue Learning

Notebooks are just one part of the Microsoft Fabric journey. Check out the next post on the Data Warehouse in Fabric to continue the series.


Series: Free Microsoft Fabric Tutorial: A Step-by-Step Learning Series

