Fabric and dbt Cloud - First Impressions

I recently had the chance to work on a proof of concept combining dbt Cloud with Microsoft Fabric.
It was a great opportunity to explore how well these two platforms work together.
In this blog post, I'll share my first impressions.

What I really liked

Diving into my first project with dbt Cloud was a breath of fresh air. Right from the start, the platform impressed me with its clear documentation and user-friendly setup. It took very little time to get up and running. Whether you're working on Snowflake, BigQuery, Redshift, or Databricks, dbt’s broad compatibility with major platforms makes it a versatile choice for data teams.

One of the standout features for me was how effortlessly dbt Cloud integrates with Azure DevOps. Setting up automated CI testing triggered by pull requests was straightforward. One important side note: the Azure DevOps integration does require a dbt Cloud Enterprise plan.

Beyond the smooth setup, I discovered a robust set of tools that made testing and monitoring data transformations much more efficient.

Data Tests

These are data quality checks that run during the build process.
They fall into two groups: singular and generic.

  • Singular tests are written for one specific case.
  • Generic tests are reusable and can be applied to many models or columns.

dbt comes with four standard generic tests: unique, not_null, accepted_values, and relationships.
Beyond that, packages on the dbt package hub (dbt_utils and dbt_expectations, for example) offer many more tests. You can even build your own for specific use cases.
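
To make this concrete, here's a minimal schema.yml sketch that applies all four built-in generic tests. The model and column names are hypothetical:

```yaml
version: 2

models:
  - name: dim_customer              # hypothetical model
    columns:
      - name: customer_id
        data_tests:                 # "tests:" on dbt versions before 1.8
          - unique
          - not_null
      - name: country_code
        data_tests:
          - accepted_values:
              values: ['BE', 'NL', 'FR']
      - name: region_id
        data_tests:
          - relationships:          # every region_id must exist in dim_region
              to: ref('dim_region')
              field: region_id
```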

Unit Testing

Unit tests help ensure complex logic behaves as intended during development.
For example, you might want to verify that some regex logic in a data model still works correctly.
You define input records and the expected output; the test then verifies that your transformation actually produces it.
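
Since dbt 1.8, unit tests are defined in YAML next to your models. A minimal sketch, assuming a hypothetical stg_customers model that extracts a domain from an email column with regex:

```yaml
unit_tests:
  - name: test_email_domain_regex
    description: "The regex should extract the domain from raw email addresses"
    model: stg_customers                   # hypothetical model under test
    given:
      - input: ref('raw_customers')        # hypothetical upstream model, mocked with fixed rows
        rows:
          - {customer_id: 1, email: "jane.doe@example.com"}
          - {customer_id: 2, email: "not-an-email"}
    expect:
      rows:
        - {customer_id: 1, email_domain: "example.com"}
        - {customer_id: 2, email_domain: null}
```

dbt builds the model against these fixed inputs during dbt test or dbt build and fails the run if the output doesn't match.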

Integration Testing

This type of testing happens during the release process.
All affected models are built in a separate environment to ensure they still work before merging changes.
Since only impacted models are built, this process is often called slim CI. Under the hood, dbt compares your branch against the state of production and selects only the changed models and their downstream dependencies, typically via the state:modified+ selector.


Data Health Tiles

Another cool feature is the data health tile. It allows you to embed visuals in Power BI (or other tools) showing the health and freshness of the underlying data. If something went wrong upstream, users will know right away, which helps build trust between departments.
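
Under the hood, the tile is attached to an exposure: a YAML entry that tells dbt which dashboard depends on which models. A minimal sketch with hypothetical names:

```yaml
exposures:
  - name: sales_dashboard
    label: "Sales Dashboard"
    type: dashboard
    owner:
      name: Data Team
      email: data@example.com      # placeholder contact
    depends_on:
      - ref('fct_sales')           # hypothetical model feeding the dashboard
```

dbt Cloud can then generate an embeddable tile for that exposure, reflecting the test and freshness status of everything upstream of it.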

A picture is worth a thousand words, so here’s one:

dbt - Data Health Tile


Two things I liked a little less about this setup

1. Workspace items and compute are not decoupled in Microsoft Fabric

🔒 Isolation

In Fabric a capacity is assigned at the workspace level.
Everything within that workspace shares the same compute resources.
This means:

  • You need to plan workload allocation across workspaces carefully.
  • Isolating workloads within the same workspace is difficult.
  • A heavy process can negatively impact other processes in the same workspace, or in other workspaces that share the same capacity.

This also makes it less convenient to allocate compute just for dbt.

⚙️ Configurability

Microsoft describes Fabric warehouses as:

"A processing system that is serverless in that the backend compute capacity scales up and down autonomously to meet workload demands."

But remember: it's still running on a capacity with a size that you choose.
Yes, there are optimizations in place, but the behavior depends heavily on your configuration and usage. If you're interested in diving deeper into this topic, here's a great starting point: 🔗 Microsoft Fabric Costs Explained

To me, this setup is a bit of a double-edged sword:

  • ✅ Less infrastructure to worry about.
  • ❌ Fewer controls when it comes to tuning performance.

Also worth noting: pricing is not truly serverless.
As long as the capacity is running, you're being charged.
It’s your responsibility to pause the capacity when it’s not in use.


2. Workspaces and the data layer are not decoupled in Microsoft Fabric

📖 Impact on a single-workspace configuration

In Fabric, each workspace has one SQL endpoint, and that endpoint supports cross-database queries. So, if you're using a warehouse, you can access tables from all lakehouses (read) and warehouses (read/write) in the same workspace.

Lakehouses are used when working with notebooks.
If you create a notebook and want to read warehouse tables, shortcuts are required - even when both are within the same workspace. This approach feels unnecessarily cumbersome. Shortcuts should be the solution for connecting to external data sources, not for navigating data that resides within the same platform and tenant.

A workaround, without creating table objects in the lakehouse, would be connecting to the warehouse via the SQL endpoint using the Spark connector for Microsoft Fabric Data Warehouse, or writing custom code. These solutions allow both reading and writing, but aren't particularly elegant either.

For dbt, which connects to a warehouse, this isn’t a major issue.
But for downstream processes based on tables created by dbt, this peculiarity could be more relevant.

Diagram - Fabric Single workspace


📚 Impact on a configuration with multiple workspaces

Things get more complex when working across workspaces.

Each workspace has its own SQL endpoint. So, you can’t do cross-workspace queries directly.
Once more, shortcuts are the solution for reading across workspaces.

Let's walk through an example

You have:

  • warehouse_b in workspace_b, which holds the data you need
  • warehouse_a in workspace_a, from which you want to access that data

To do this, you need to perform two actions:

  1. Create a lakehouse in workspace_a (let’s call it lakehouse_a)
  2. In lakehouse_a, create shortcuts to the tables in warehouse_b

Now, warehouse_a can access the tables from warehouse_b via the shortcuts in lakehouse_a.

But why the lakehouse, you might ask? Because shortcuts can't be created in warehouses.

So yeah, it works... but it requires some tinkering.
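
On the dbt side, those shortcuts can then be declared as a regular source on lakehouse_a. A sketch, assuming the shortcuts keep the original table names (the source and table names are hypothetical):

```yaml
version: 2

sources:
  - name: warehouse_b_data
    database: lakehouse_a        # the lakehouse holding the shortcuts
    schema: dbo
    tables:
      - name: orders             # shortcut to a table in warehouse_b
      - name: customers          # shortcut to a table in warehouse_b
```

Models running in warehouse_a can then select from {{ source('warehouse_b_data', 'orders') }} as if the data lived locally.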

Another example

Let's say we want to write across workspaces within the same dbt project and environment. Can we do it?
Unfortunately, the answer is no: your connection is pinned to the workspace and warehouse you specified.
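
You can see why in the connection settings. With the dbt-fabric adapter, both the server (the workspace's SQL endpoint) and the database (the warehouse) are fixed per target. A profiles.yml sketch with placeholder values; dbt Cloud captures the same settings in its connection UI:

```yaml
my_project:
  target: dev
  outputs:
    dev:
      type: fabric
      driver: "ODBC Driver 18 for SQL Server"
      server: "<workspace-endpoint>.datawarehouse.fabric.microsoft.com"  # one workspace
      port: 1433
      database: warehouse_a      # one warehouse
      schema: dbo
      authentication: CLI        # e.g. Azure CLI auth; other methods exist
```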

Diagram - Fabric Multiple workspaces


Conclusion

Despite some initial challenges, particularly from a data engineering point of view, Microsoft Fabric shows clear potential, especially for companies already invested in the Microsoft ecosystem. As the platform matures and its flexibility improves, its value will only grow.

Tools like dbt can play a key role in enhancing and streamlining the Fabric experience.
dbt is also compatible with more mature platforms like Snowflake and Databricks, which can offer solutions for more advanced use cases.

I’m very curious to see what the future holds, both for dbt and for Microsoft Fabric.