Please first read how to use the Discovery Application.
Discover Parquet or Delta files
To discover Parquet or Delta files on AWS with the Discovery Application, please choose the following source system type:
Then fill in the following information:
- Name: discovery name
- Example: Parquet_AW2019_AWS
- S3 bucket: name of the S3 bucket on AWS containing the files to discover
- Example: bigenius-data-training-user-01
- Access Key: AWS access key ID
- Example: AKIAWW2SPUKXNWWXEUH2
To create an access key on AWS:
- Connect to your account at https://signin.aws.amazon.com/
- Open the security credentials menu under your account name:
- Create an access key by clicking on the Create access key button:
- Copy the Access key ID and the Secret access key to a safe place:
- Secret Access Key: AWS secret access key
- File Path Type:
- Directory With Multiple Entities (Delta or Parquet): select this option if you want to discover several entities (Parquet or Delta) contained in an S3 bucket folder:
- Example of 6 entities in an adventureworks folder:
- Each entity folder contains the files to discover:
- Folder for a Single Entity (Delta or Parquet): select this option to discover a folder containing all the Parquet or Delta files of a single entity:
- Example of 1 entity Customer with multiple Parquet files stored by partition (date):
- Single File (Parquet only): select this option if you want to discover a single Parquet file
- Example of a single Parquet file:
- File Path (without URL): according to the File Path Type selected, fill:
- Directory With Multiple Entities (Delta or Parquet): the path to the folder containing all the entity folders
- Example: adventureworks/
- Folder for a Single Entity (Delta or Parquet): the path to the folder containing all the entity files
- Example: adventureworks/CUSTOMER/
- Single File (Parquet only): the path to the single file
- Example: adventureworks/CREDITCARD/dt=2023-01-01/
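The S3 bucket and File Path (without URL) fields can be derived from a full S3 URI, and the entity folders under a base path can be previewed from a list of object keys. The following is a minimal Python sketch of both steps, not part of the Discovery Application: the helper names `split_s3_uri` and `entity_folders` and the `part-*.parquet` file names are hypothetical, and the key list is hard-coded here rather than fetched from AWS.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an S3 URI into (bucket name, file path without URL)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the host part; the file path is the rest, without the leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def entity_folders(keys: List[str], base_path: str) -> Dict[str, List[str]]:
    """Group object keys under base_path by their first sub-folder (one per entity)."""
    groups: Dict[str, List[str]] = {}
    for key in keys:
        if not key.startswith(base_path):
            continue  # key is outside the folder to discover
        rest = key[len(base_path):]
        if "/" not in rest:
            continue  # file sits directly in base_path, not in an entity folder
        entity = rest.split("/", 1)[0]
        groups.setdefault(entity, []).append(key)
    return groups


# Hypothetical object keys; the file names are illustrative assumptions.
example_keys = [
    "adventureworks/CUSTOMER/dt=2023-01-01/part-0000.parquet",
    "adventureworks/CUSTOMER/dt=2023-01-02/part-0000.parquet",
    "adventureworks/CREDITCARD/dt=2023-01-01/part-0000.parquet",
]

bucket, path = split_s3_uri("s3://bigenius-data-training-user-01/adventureworks/")
print("S3 bucket:", bucket)
print("File Path (without URL):", path)
for entity, files in sorted(entity_folders(example_keys, path).items()):
    print(entity, "->", len(files), "file(s)")
```

With several entity folders under the base path, as here, Directory With Multiple Entities with the path `adventureworks/` is the matching choice. To discover only one of them, Folder for a Single Entity with a path such as `adventureworks/CUSTOMER/` applies instead.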