Discover Parquet or Delta files on Azure

Please first read how to use the Discovery Application.

Discover Parquet or Delta files

To discover Parquet or Delta files on Azure with the Discovery Application, please choose the following source system type:

Then fill in the following information:

The example values below correspond to the source example described in the Databricks Stage Files target environment.

  • Name: discovery name
    • Example: Parquet_AW2019_Azure
  • Storage Account Name: name of the Azure Storage account
    • Example: bgdatabrickslandingzone1

The Storage Account Name can be found in Microsoft Azure Storage Explorer: select the Storage Account and copy the Account Name displayed in the Properties tab: 

  • Storage Account Key: access key of the Azure Storage account

The Storage Account Key can be found in Microsoft Azure Storage Explorer: select the Storage Account and copy the Primary Key displayed in the Properties tab after clicking on the eye icon: 

  • Blob Container URL
    • Example: https://bgdatabrickslandingzone1.blob.core.windows.net/source
  • File Path Type:
    • Directory With Multiple Entities (Delta or Parquet): select this option if you want to discover several entities (Parquet or Delta) contained in a folder:
      • Example of 6 entities in a source folder:

      • Each entity folder contains the file to discover:
    • Folder for a Single Entity (Delta or Parquet): select this option to discover a folder containing all the Parquet or Delta files that belong to one entity:
      • Example of 1 entity Customer with multiple Parquet files stored by partition (date):
    • Single File (Parquet only): select this option if you want to discover a single Parquet file
      • Example of a single Parquet file:
  • File Path (without URL): depending on the File Path Type selected, enter:
    • Directory With Multiple Entities (Delta or Parquet): the path to the folder containing all the entity folders
      • Example: adventureworks/
    • Folder for a Single Entity (Delta or Parquet): the path to the folder containing all the files of the entity
      • Example: blackforestmarkets_parquet/CUSTOMER/
    • Single File (Parquet only): the path to the single file
      • Example: adventureworks/CreditCard/
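
To illustrate how the Storage Account Name, the Blob Container URL, and the File Path (without URL) relate to each other, here is a minimal Python sketch. It is not part of the Discovery Application and does not describe how the application resolves paths internally. The function names are hypothetical, the container name "source" is read off the example URL above, and the host suffix blob.core.windows.net assumes the public Azure cloud.

```python
from urllib.parse import urlsplit


def blob_container_url(account_name: str, container: str) -> str:
    """Build a Blob Container URL (assumes the public Azure cloud endpoint)."""
    return f"https://{account_name}.blob.core.windows.net/{container}"


def account_name_from_url(container_url: str) -> str:
    """Return the first host label, which is the Storage Account Name."""
    host = urlsplit(container_url).hostname or ""
    return host.split(".")[0]


def full_location(container_url: str, file_path: str) -> str:
    """Join the Blob Container URL and the File Path (without URL).

    Hypothetical helper: it only shows which location a given pair of
    form values points to.
    """
    return container_url.rstrip("/") + "/" + file_path.lstrip("/")


if __name__ == "__main__":
    url = blob_container_url("bgdatabrickslandingzone1", "source")
    print(url)
    print(account_name_from_url(url))
    # Directory With Multiple Entities
    print(full_location(url, "adventureworks/"))
    # Folder for a Single Entity
    print(full_location(url, "blackforestmarkets_parquet/CUSTOMER/"))
```

A quick check like this can help confirm that the File Path is entered relative to the container, without repeating the URL or the container name.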