Discover CSV files on Azure

Please first read how to use the Discovery Application.

Discover CSV files

To discover CSV files on Azure with the Discovery Application, please choose the following source system type:

Then fill in the following information:

The example values below correspond to the source example described for the Databricks Stage Files target environment.

  • Name: discovery name
    • Example: CSV_AW2019_Azure
  • Storage Account Name: name of the Azure Storage account
    • Example: bgdatabrickslandingzone1

The Storage Account Name can be found in Microsoft Azure Storage Explorer: select the Storage Account and copy the Account Name displayed in the Properties tab.

  • Storage Account Key: access key of the Azure Storage account

The Storage Account Key can be found in Microsoft Azure Storage Explorer: select the Storage Account, click the eye icon in the Properties tab, and copy the Primary Key that is displayed.

  • Blob Container URL
    • Example: https://bgdatabrickslandingzone1.blob.core.windows.net/source
  • File Path Type:
    • Single CSV File: select this option to discover a single CSV file
      • Example of a single CSV file:
    • Directory: select this option to discover a folder containing several CSV files:
      • Example of 4 CSV files in the same folder:
  • File Path (without URL): depending on the File Path Type selected, enter:
    • Single CSV File: the path to the single file
      • Example: csv/CreditCard.csv
    • Directory: the path to the folder containing all the CSV files
      • Example: csv/
  • Column Delimiter: character that separates the column values in the CSV file
    • Example: |
  • Row Delimiter: character sequence that separates the rows in the source CSV file
    • Example: \r\n
  • Field Quote (optional): character that encloses the string values in the file
    • Example: "
  • Field Length (optional): length to apply to every column in the file
    • Example: 400
      • All the columns from the file will be varchar(400)
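
To check these values before running a discovery, it can help to reproduce them locally. The following Python sketch is not part of the Discovery Application: it only illustrates, under stated assumptions, how the Blob Container URL and the File Path (without URL) combine into a full file address, and how the Column Delimiter, Row Delimiter and Field Quote settings split a file into values. The function names and the sample rows are hypothetical, and the length check assumes that Field Length is a maximum number of characters per value.

```python
import csv
import io

# Values taken from the examples on this page.
BLOB_CONTAINER_URL = "https://bgdatabrickslandingzone1.blob.core.windows.net/source"
FILE_PATH = "csv/CreditCard.csv"
COLUMN_DELIMITER = "|"
ROW_DELIMITER = "\r\n"
FIELD_QUOTE = '"'
FIELD_LENGTH = 400


def full_blob_url(container_url: str, file_path: str) -> str:
    """Join the container URL and the file path (hypothetical helper)."""
    return container_url.rstrip("/") + "/" + file_path.lstrip("/")


def parse_rows(text: str) -> list[list[str]]:
    """Split text into rows and column values with the settings above.

    csv.reader recognizes \\r\\n row endings by itself when the stream
    is opened with newline="", so ROW_DELIMITER is not passed to it.
    """
    stream = io.StringIO(text, newline="")
    reader = csv.reader(stream, delimiter=COLUMN_DELIMITER, quotechar=FIELD_QUOTE)
    return [row for row in reader]


def fits_field_length(rows: list[list[str]], length: int) -> bool:
    """Return True if every value fits in a varchar(length) column.

    Assumption: Field Length is a maximum number of characters.
    """
    return all(len(value) <= length for row in rows for value in row)


# Hypothetical sample content; the real file layout may differ.
SAMPLE = ROW_DELIMITER.join(
    [
        "CreditCardID|CardType|CardNumber",
        '1|"SuperiorCard"|"1111"',
        '2|"Vista|Gold"|"2222"',
    ]
) + ROW_DELIMITER

if __name__ == "__main__":
    print(full_blob_url(BLOB_CONTAINER_URL, FILE_PATH))
    for row in parse_rows(SAMPLE):
        print(row)
    print(fits_field_length(parse_rows(SAMPLE), FIELD_LENGTH))
```

In the sample, the value "Vista|Gold" contains the column delimiter. It stays a single value only because it is enclosed in the Field Quote character, which is the situation that setting is meant to handle.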