Databricks Unity - Replace placeholders - 1.7

The Generated Artifacts zip file contains files with placeholders for the target environment parameters.

Replace all the placeholders before deploying your Databricks Unity Target System.

Replace Placeholders

A toolkit is provided in the generated artifacts to replace placeholders.

It consists of a PowerShell script that you execute: replace_placeholders.ps1.

To execute the PowerShell commands in this article, use PowerShell version 7 or later, provided by Microsoft. To install the latest version, see Microsoft's PowerShell installation documentation.
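
To check which version your current session runs, you can inspect the built-in $PSVersionTable variable. The snippet below is a minimal sketch; the $requiredMajor variable name is only illustrative.

```powershell
# Minimum major version recommended in this article.
$requiredMajor = 7

# $PSVersionTable is a built-in variable describing the running PowerShell session.
$currentVersion = $PSVersionTable.PSVersion

if ($currentVersion.Major -ge $requiredMajor) {
    Write-Output "PowerShell $currentVersion is recent enough."
} else {
    Write-Output "PowerShell $currentVersion found; version $requiredMajor or later is recommended."
}
```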

The process to follow is:

  • Update the replacement_config.json file with your values:
    • Enter the replacement value in the value node
      • Each database_name placeholder should contain the database name in your Unity Catalog
        • In our example, it is datalakehouse
      • Each schema_name placeholder should contain the schema name in your Unity Catalog
        • In our example, the schemas are docu_rawvault, docu_businessvault and docu_mart

 

Depending on your source:

  • Linked Project (from a Stage JDBC or a Stage File Project):
    • Each database_name placeholder should contain the database name of the source table previously created with the Linked Project in your Unity Catalog
      • In our example, it is datalakehouse
    • Each schema_name placeholder should contain the schema name of the source table previously created with the Linked Project in your Unity Catalog
      • In our example, it is docu_stage
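
Because replacement_config.json is edited by hand, it can be worth confirming that the file is still valid JSON before running the script. The following is a minimal sketch, not part of the toolkit: Test-JsonFile is a hypothetical helper name, and the path in the last line assumes the file sits in the current directory.

```powershell
# Hypothetical helper (not part of the toolkit): returns $true when the file parses as JSON.
function Test-JsonFile {
    param(
        [Parameter(Mandatory)][string]$Path
    )
    try {
        # -Raw reads the whole file as one string; ConvertFrom-Json throws on malformed JSON.
        Get-Content -LiteralPath $Path -Raw -ErrorAction Stop |
            ConvertFrom-Json -ErrorAction Stop |
            Out-Null
        return $true
    } catch {
        Write-Warning "Could not parse $Path as JSON: $($_.Exception.Message)"
        return $false
    }
}

# Assumption: the configuration file is in the current directory.
Test-JsonFile -Path '.\replacement_config.json'
```

This check only covers the JSON syntax (for example a missing quote or comma); it does not verify that the values themselves are correct.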

 

If you are using Airflow as a Load Control environment, please update the following placeholders:

  • databricks_job#databricks_job#user_name: it must be your Databricks user name
  • databricks_job#databricks_job#notebook_task_path: it must be the location of the Jupyter Notebooks in Databricks
    • In our example: /Workspace/Users/<username>/Artifacts_DVDM/
  • databricks_job#databricks_job#existing_cluster_id: it must be the ID of the personal compute to use. To find it:
    • Open Databricks
    • Click on the Compute menu on the left-hand side
    • Open the Personal Compute created in your Databricks environment
    • Click on the More ... button
    • Select the View JSON option
    • The Compute ID is available at the top of the JSON
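
If you copy the JSON shown by the View JSON option, the ID can also be read with PowerShell. The sketch below relies on assumptions: it assumes the ID is stored in a field named cluster_id, and the JSON excerpt and the ID in it are made-up placeholders, not values from a real environment.

```powershell
# Hypothetical excerpt of the JSON shown by "View JSON"; the ID below is a made-up placeholder.
$computeJson = @'
{
  "cluster_id": "0000-000000-example0",
  "cluster_name": "Personal Compute"
}
'@

# Assumption: the Compute ID is stored in the cluster_id field.
$clusterId = ($computeJson | ConvertFrom-Json).cluster_id

Write-Output "Value for existing_cluster_id: $clusterId"
```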

 

Replace the placeholders in the files:

  • Open a PowerShell 7 terminal (or equivalent) in the replace_placeholders.ps1 location
  • Execute the following command:
    .\replace_placeholders.ps1
  • The script replaces the placeholders in all generated artifacts with the configured values
  • You can now use these files to deploy your Target System
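
To double-check that nothing was missed, you can search the generated artifacts for a text that should no longer appear. This is a minimal sketch under stated assumptions: Find-RemainingToken is a hypothetical helper that is not part of the toolkit, and the token to search for (for example a placeholder copied from an untouched copy of the artifacts) has to be supplied by you.

```powershell
# Hypothetical helper (not part of the toolkit): lists the files under $Root that still contain $Token.
function Find-RemainingToken {
    param(
        [Parameter(Mandatory)][string]$Root,
        [Parameter(Mandatory)][string]$Token
    )
    # -SimpleMatch treats $Token as literal text; -List stops at the first match in each file.
    Get-ChildItem -LiteralPath $Root -Recurse -File |
        Select-String -SimpleMatch -Pattern $Token -List |
        ForEach-Object { $_.Path }
}

# Example call (both arguments are placeholders to adapt):
# Find-RemainingToken -Root '.' -Token '<placeholder text copied from an untouched artifact>'
```

An empty result means that no file under the given folder contains the searched text.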

The replace_placeholders.ps1 command accepts additional parameters.

To list and describe them, execute:

.\replace_placeholders.ps1 -help

The -ReplacementConfigPath parameter is mainly used to point to a replacement_config.json file stored in another path. This is useful while your project is in development.

Example of usage:

.\replace_placeholders.ps1 -ReplacementConfigPath "C:\TEMP\Replacement config files\replacement_config_