
Fabric Lakehouse - Load data - Native Load Control - 2.0

Before loading the data from your Source System(s) to your Fabric Lakehouse Target System, please:

You are now ready to load the data.

To load the data natively with Fabric Lakehouse, you will use a multithreaded approach that loads the Target Model Objects in parallel.

It is also possible to load data partially from a Data Modeling View.

Load the data

We will explain how to load the data in the Fabric Lakehouse target environment.

To load the data:

  • Open Microsoft Fabric at https://app.fabric.microsoft.com/
  • Open the Workspace where you deployed the artifacts by clicking on the Workspace option on the left menu:
  • Then, on the desired Workspace (bgfabricdvdm in our example):
  • Open the LoadControl/XXX_MultithreadingLoadExecution.ipynb file
  • Choose the Lakehouse by:
    • Clicking on the Lakehouses menu on the left:
    • Clicking on the Add button:
    • Selecting Existing lakehouse:
    • Selecting the Lakehouse previously created (docu_bglakehouse in our example):
  • Execute all the steps
  • The data is now loaded:
    • You should have the target Parquet files created for each Target Model Object, for example, for the Stage CreditCard:
    • Step 3 displays a summary of the number of rows loaded for each Target Model Object, for example:
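Conceptually, the multithreaded load in the notebook can be sketched as follows. This is an illustrative outline only, not the generated code: the `load_target_object` function, the object names, the worker count, and the returned row counts are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Target Model Objects; the real list comes from the load control metadata.
target_objects = ["stage_creditcard", "rdv_hub_customer", "rdv_sat_customer"]

def load_target_object(name):
    # Placeholder for the generated load logic
    # (in the notebook, each object is loaded via Spark).
    rows_loaded = 42  # pretend row count for the sketch
    return (name, rows_loaded)

# Load all Target Model Objects in parallel, then summarize
# the row counts per object, as Step 3 of the notebook does.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(load_target_object, target_objects))

for name, rows in results.items():
    print(f"{name}: {rows} rows loaded")
```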

You can now check that your data was correctly loaded with the following script:

# Create a new step with the following code:
mydf = spark.sql("select * from `rdv_hub_customer_hub_result`")
mydf.show(truncate=False)

And see the content of your Target Parquet file:

Load the data partially

It is possible to generate, deploy, and then load data partially based on the Dataflow Modeling Views content.

All the previous steps are similar to those for the complete Project:

You simply use the code generated for the View instead of the code generated for the overall Project.

Load all the data from partial deployments

If you organized your Project Model Objects in different Dataflow Modeling Views, you can:

To load the data for all the Views in a Data Factory Pipeline, the following constraints apply:

  • A Model Object must be in only one View
  • If a Model Object needs to be loaded after another, it must be in the same View or in a View loaded after the first one
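The first constraint can be checked programmatically before building the Pipeline. A minimal sketch, assuming you can list the Model Objects per View; the View and object names below are illustrative, not taken from a real Project:

```python
# Hypothetical View -> Model Objects mapping; in practice this
# reflects how you organized your Dataflow Modeling Views.
views = {
    "CreditCard": ["stage_creditcard", "rdv_hub_creditcard"],
    "Customer": ["stage_customer", "rdv_hub_customer"],
    "Order": ["stage_order", "rdv_link_customer_order"],
}

# Constraint: each Model Object must appear in exactly one View.
seen = {}
duplicates = []
for view, objects in views.items():
    for obj in objects:
        if obj in seen:
            duplicates.append((obj, seen[obj], view))
        seen[obj] = view

assert not duplicates, f"Model Objects in more than one View: {duplicates}"
```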

Example: three Views, CreditCard, Customer, and Order. Order contains Model Objects that depend on Model Objects in the Customer and CreditCard Views.
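The second constraint means the Views must be loaded in dependency order. A small sketch of deriving that order with the standard library; the dependency map is our assumption based on the example above:

```python
from graphlib import TopologicalSorter

# View -> Views that must be loaded before it
# (Order needs Customer and CreditCard loaded first).
dependencies = {
    "CreditCard": set(),
    "Customer": set(),
    "Order": {"Customer", "CreditCard"},
}

# A valid load order: CreditCard and Customer (in either order) before Order.
load_order = list(TopologicalSorter(dependencies).static_order())
print(load_order)
```

In the Pipeline below, this ordering is expressed with On Success links between the Notebook Activities rather than in code.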

The View flow is:

For each Data Modeling View:

  • Generate for a View rather than the overall Project
  • Deploy the partial generation
    • Deploy the load control only once (with the first deployed View, for example)

Then, create a Data Factory Pipeline:

  • Open Microsoft Fabric at https://app.fabric.microsoft.com/
  • Open the Workspace where you deployed the artifacts by clicking on the Workspace option on the left menu:
  • Click on the New item button:
  • Search for Pipeline:
  • Fill in a name:
  • Start with a blank canvas:
  • Search for Notebook:
     
    • General tab: 
      • Name: Load_View_XXX (XXX is the name of the first View to load)
    • Settings tab:
      • Notebook: Select the notebook XXX_MultithreadingLoadExecution.ipynb for the corresponding view
  • Create the other Activities:
    • Activities tab, click on Notebook:
    • Fill in similar information as for the first task
  • Link activities:
    • Hover over the source Activity and drag the green icon (On Success) to the target Activity:
    • Do it for each dependency
  • In our example, the complete Pipeline looks like the following:
  • For each Notebook used in the pipeline:
    • Open it
    • Choose the Lakehouse by:
      • Clicking on the Lakehouses menu on the left:
      • Clicking on the Add button:
      • Selecting Existing lakehouse:
      • Selecting the Lakehouse previously created (docu_bglakehouse in our example):
  • Click on the Run now button:
  • The data load starts: