Databricks Unity - Load data - Native Load Control - 2.0
Before loading the data from your Source System(s) to your Databricks Unity Target System, make sure the previous preparation steps are complete.
You are now ready to load the data.
To load the data natively with Databricks Unity Catalog, you use a multi-thread approach that loads the Target Model Objects in parallel.
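To illustrate the idea behind the multi-thread approach, here is a minimal sketch using Python's `concurrent.futures`. The object names and the `load_object()` body are placeholders, not the generated code: in Databricks, the generated notebook runs the actual load statements against the Unity Catalog tables.

```python
# Minimal sketch of a multi-threaded load, in the spirit of the generated
# XXX_MultithreadingLoadExecution notebook. Names and the load_object()
# body are illustrative placeholders.
from concurrent.futures import ThreadPoolExecutor

# Hypothetical Target Model Objects to load in parallel.
TARGET_OBJECTS = ["stg_creditcard", "rdv_hub_creditcard_hub", "rdv_sat_creditcard"]

def load_object(name: str) -> tuple:
    # In Databricks this would execute the generated load (e.g. a Spark SQL
    # INSERT into the Unity Catalog table); here we only simulate it.
    rows_loaded = len(name)  # placeholder row count
    return name, rows_loaded

# Each object is loaded on its own worker thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(load_object, TARGET_OBJECTS))

for name, rows in results.items():
    print(f"{name}: {rows} rows loaded")
```

The real notebook also handles load ordering and error reporting; this sketch only shows the parallel fan-out.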
It is also possible to load data partially from a Data Modeling View.
You can also load the data with an external tool such as:
Load the data
We will explain how to load the data with the Databricks Unity target environment.
To load the data:
- Open Databricks with the URL provided in the Azure Databricks Service resource:

- Databricks is open:

- Click on the Workspace menu on the left-hand side:

- Expand the folders Workspace > Users > Your_user > artifacts:

- Open the LoadControl/XXX_MultithreadingLoadExecution.ipynb file
- Select the Compute Cluster (Demo Cluster in our example)

- Execute all the steps
- The data are loaded:
- You should have the target Parquet files created for each Target Model Object, for example, for the Stage CreditCard:

- You can check the load:
- Open the Helpers/XXX_SelectResults.ipynb file
- Run it
- A summary of the number of rows loaded for each Target Model Object is displayed, for example:

You can now check that your data was correctly loaded with the following script:
%sql
select * from `rawvault`.`rdv_hub_creditcard_hub`;
And see the content of your Target Parquet file:

Load the data partially
It is possible to generate, deploy, and then load data partially based on the Dataflow Modeling Views content.
All the steps are the same as for the complete Project, except that you:
- Generate for a single View rather than the overall Project
- Deploy the partial generation
- Load the data partially
The generated code then covers only that View.
Load all the data from partial deployments
If you organized your Project Model Objects in different Dataflow Modeling Views, you can:
- Load the data partially for each View (previous chapter)
- Load the data for all the Views in a Databricks Job (see below)
To load the data for all the Views in a single Databricks Job, the following constraints apply:
- A Model Object must belong to only one View
- If a Model Object must be loaded after another, it must be in the same View or in a View loaded after the first one
Example: three Views, CreditCard, Customer, and Order. Order contains Model Objects that depend on Model Objects in the Customer and CreditCard Views.
The View flow is:

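The ordering constraint above means the Views themselves must be loaded in dependency order. As a sketch (with a hypothetical dependency map mirroring the example), a valid load order can be derived with a topological sort:

```python
# Sketch: derive a valid View load order from View-level dependencies.
# The dependency map mirrors the example above (Order depends on
# Customer and CreditCard); the names are illustrative.
from graphlib import TopologicalSorter

view_deps = {
    "CreditCard": set(),
    "Customer": set(),
    "Order": {"Customer", "CreditCard"},  # Order loads after both
}

# static_order() yields each View after all of its predecessors.
load_order = list(TopologicalSorter(view_deps).static_order())
print(load_order)
```

CreditCard and Customer can run in any order (or in parallel), but Order always comes last; in the Databricks Job this is expressed with the Depends on setting of each task.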
For each Data Modeling View:
- Generate for a View rather than the overall Project
- Deploy the partial generation
- Deploy the load control only once (with the first deployed View, for example)
Then, create a Databricks Job:
- Open Databricks with the URL provided in the Azure Databricks Service resource:

- Databricks is open:

- Click on the Jobs & Pipelines menu on the left-hand side:

- Click on Create new > Job:

- Add the first task by clicking on Notebook:

- Fill in:
- Task name: Load_View_XXX (XXX is the name of the first View to load)
- Type: keep Notebook
- Path: select the notebook XXX_MultithreadingLoadExecution.ipynb for the corresponding view
- Compute: select your compute
- Click on Create task
- Create the other tasks:
- Click on Add task

- Choose Notebook:

- Fill in the same information as for the first task
- Configure the Depends on:

You can leave Depends on empty if there is no dependency on another View load.
- Click on Add task
- In our example, the complete Job looks like the following:

- Click on the Run now button:

- The data load starts:

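The Job assembled in the UI can equivalently be described as a Databricks Jobs API 2.1 JSON definition. The sketch below mirrors the example above; the notebook paths and cluster id are hypothetical placeholders to adapt to your workspace:

```json
{
  "name": "Load_All_Views",
  "tasks": [
    {
      "task_key": "Load_View_CreditCard",
      "notebook_task": { "notebook_path": "/Users/your_user/artifacts/LoadControl/CreditCard_MultithreadingLoadExecution" },
      "existing_cluster_id": "<your-cluster-id>"
    },
    {
      "task_key": "Load_View_Customer",
      "notebook_task": { "notebook_path": "/Users/your_user/artifacts/LoadControl/Customer_MultithreadingLoadExecution" },
      "existing_cluster_id": "<your-cluster-id>"
    },
    {
      "task_key": "Load_View_Order",
      "depends_on": [
        { "task_key": "Load_View_CreditCard" },
        { "task_key": "Load_View_Customer" }
      ],
      "notebook_task": { "notebook_path": "/Users/your_user/artifacts/LoadControl/Order_MultithreadingLoadExecution" },
      "existing_cluster_id": "<your-cluster-id>"
    }
  ]
}
```

Such a definition can be submitted with the Jobs API `jobs/create` endpoint instead of building the Job by hand in the UI.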