Microsoft Fabric - Target environment

This article proposes a possible target environment for the Microsoft Fabric Data Vault and Microsoft Fabric Data Vault and Mart generators.

Installation and configuration of the target environment are not covered by biGENIUS support.

Unfortunately, we cannot provide any help beyond the example in this article.

Many other configurations and installations are possible for a Microsoft Fabric target environment.

Below is a possible target environment setup for a Microsoft Fabric generator.

The Target Platform property should be set to the Fabric value:

Setup environment

You should have access to a Microsoft Fabric target environment:

To activate the Git integration later, you should also have access to an Azure DevOps Git repository.

Enter the Synapse Data Engineering section:

The Synapse Data Engineering section is displayed:

Create a Workspace

Click on the Workspace entry in the left menu, then on + New workspace:

In this example, we will create a Workspace named bgfabricworkspace1.

Fill in the name and click on the Apply button:

Now click on the Workspace settings menu to configure the Git integration:

Choose the Git integration entry in the left menu:

Connect to your Azure DevOps Git repository, then click on the Connect and sync button:

Source data

There are three ways to provide source data to the Microsoft Fabric Data Vault or Data Vault and Mart generators:

  • From Parquet files by using the Microsoft Fabric Stage Files generator as a Linked Project
  • From any database accessed through JDBC by using the Microsoft Fabric Stage JDBC generator as a Linked Project
  • From existing Delta files by using a direct Discovery to a Spark External Table that contains the Delta file data

Parquet Files

If your source data are stored in Parquet files, please:

  • Create a first Project with the Microsoft Fabric Stage Files generator.
  • In this first Project, discover the Parquet files, create the Stage Model Object, generate, deploy, and load the data into a Lakehouse.
  • Create a second Project with the Microsoft Fabric Data Vault or Data Vault and Mart generator.
  • In this second Project, use the first Project's Stage Model Object as a source by using the Linked Project feature.
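For orientation, the load step of the first Project essentially writes the Parquet data into a Lakehouse Delta table. The sketch below only illustrates the idea; the file and table names are hypothetical, and the generated artifacts implement the actual load:

```python
# Hypothetical sketch only: the generated artifacts implement the real load.
# Assumes a Fabric notebook where a `spark` session is available.
parquet_path = "Files/Stage/creditcard.parquet"   # example path inside the Lakehouse
stage_table = "stage_creditcard"                  # example Stage table name

# In the notebook you would then run:
#   df = spark.read.parquet(parquet_path)
#   df.write.format("delta").mode("overwrite").saveAsTable(stage_table)
```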

Database

If your source data are stored in a database such as Microsoft SQL Server or Postgres (or any database you can access through JDBC), please:

  • Create a first Project with the Microsoft Fabric Stage JDBC generator.
  • In this first Project, discover the database tables, create the Stage Model Object, generate, deploy, and load the data into a Lakehouse.
  • Create a second Project with the Microsoft Fabric Data Vault or Microsoft Fabric Data Vault and Mart generator.
  • In this second Project, use the first Project's Stage Model Object as a source by using the Linked Project feature.
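For orientation, the first Project's load step reads the source tables over JDBC with Spark. The sketch below only illustrates the idea; the server, database, and credentials are placeholders, and the generated artifacts implement the actual load:

```python
# Hypothetical sketch only: the generated artifacts implement the real load.
# Server, database, and credentials below are placeholders.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;databaseName=<database>"
jdbc_props = {
    "user": "<user>",
    "password": "<password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# In a Fabric notebook you would then run:
#   df = spark.read.jdbc(url=jdbc_url, table="dbo.creditcard", properties=jdbc_props)
```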

Delta Files

You may have existing Delta files in your Lakehouse that you want to use as source data.

In that case, you should manually create a Spark external table in your Lakehouse that exposes the Delta file data.

For example, we have the following Delta file containing credit card data in our Lakehouse named bglakehouse1:

We will create a Spark external table named creditcard_delta with the following code:

# Replace bgfabricworkspace1 with your workspace name
# Replace bglakehouse1 with your lakehouse name

# The LOCATION clause makes this an external (unmanaged) table over the
# existing Delta files.
spark.sql("""
    CREATE TABLE creditcard_delta
    USING DELTA
    LOCATION 'abfss://bgfabricworkspace1@onelake.dfs.fabric.microsoft.com/bglakehouse1.Lakehouse/Files/Delta/'
""")

Then, in the Microsoft Fabric Data Vault or Data Vault and Mart Project, create a Discovery using a JSON discovery file generated by the Discovery Companion Application from the Delta file.
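Before moving on, it can help to verify that the external table resolves to the Delta files. A couple of sanity-check queries, assuming a Fabric notebook with an active spark session:

```python
# Sanity checks for the creditcard_delta table created above; run each
# query with spark.sql(query).show() in a Fabric notebook.
check_queries = [
    "SHOW TABLES",                              # creditcard_delta should be listed
    "SELECT COUNT(*) FROM creditcard_delta",    # the Delta data should be readable
]
```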

Upload Artifacts in Microsoft Fabric

Please now upload the generated Artifacts from the biGENIUS-X application to the Microsoft Fabric Workspace.

Please replace the placeholders before uploading the artifacts.
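One way to replace the placeholders in bulk is a small script over the generated notebook files. The sketch below assumes plain-text tokens such as <LAKEHOUSE_NAME>; check your generated artifacts for the actual placeholder names:

```python
import pathlib

def replace_placeholders(folder: str, mapping: dict[str, str]) -> int:
    """Replace each placeholder token in every .ipynb file under `folder`.

    Returns the number of files changed. Tokens are plain strings, e.g.
    '<LAKEHOUSE_NAME>' (hypothetical - check your generated artifacts).
    """
    changed = 0
    for path in pathlib.Path(folder).rglob("*.ipynb"):
        text = original = path.read_text(encoding="utf-8")
        for token, value in mapping.items():
            text = text.replace(token, value)
        if text != original:
            path.write_text(text, encoding="utf-8")
            changed += 1
    return changed
```

Calling replace_placeholders("Jupyter", {"<LAKEHOUSE_NAME>": "bglakehouse1"}) would then rewrite every notebook under the Jupyter folder in one pass.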

  • Click on the Workspace we just created in the left menu:
  • Click on the New button and choose Import notebook:
  • Click on the Upload button:
  • Select all the generated artifacts from the folder Jupyter and then from the folder Helpers:
  • In addition, import the following helper:

In the file 500_Deploy_and_Load_DataVault_Fabric.ipynb, replace the XXX_Deployment, XXX_SimpleLoadexecution.ipynb, and XXX_SelectResults.ipynb names with the names of your Helper files.

  • Commit all the changes into your git repository by:
    • Clicking on the Source control menu:
    • Selecting all the changes:
    • Clicking on the Commit button:

You're now ready to deploy these artifacts and subsequently load the data based on the Generator you are using.