Snowflake - Target environment

This article outlines a potential setup for a target environment for Snowflake generators.

Please note that the installation and configuration of this target environment fall outside the scope of biGENIUS support.

We do not offer assistance beyond the example provided in this article.

A Snowflake target environment can be installed and configured in many alternative ways.

Below is a potential target environment setup for a Snowflake generator.

Setup environment

The Snowflake target environment needs at least the following items configured in your Snowflake account:

  • (Optional) A target Warehouse WH_XXX
    • You can also work with the default Snowflake Warehouse COMPUTE_WH
  • A source Database SD_XXX used with the target Warehouse WH_XXX (or COMPUTE_WH)
  • A target Database TD_XXX used with the target Warehouse WH_XXX (or COMPUTE_WH)
  • A Role R_XXX with the following privileges on the source database SD_XXX and the target database TD_XXX: OWNERSHIP, CREATE SCHEMA, MODIFY, MONITOR, USAGE, and EXECUTE TASK

  • A User XXX who can access the Role R_XXX

For example, you can add the EXECUTE TASK privilege for the Role R_XXX with the following script:

GRANT EXECUTE TASK ON ACCOUNT TO ROLE R_XXX;
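Taken together, the items above could be created with a script along these lines. This is only a sketch: WH_XXX, SD_XXX, TD_XXX, R_XXX, the user name XXX, and all sizing options are placeholders you should replace to match your account.

```sql
-- Optional: a dedicated warehouse (otherwise use the default COMPUTE_WH)
CREATE WAREHOUSE IF NOT EXISTS WH_XXX
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- Source and target databases
CREATE DATABASE IF NOT EXISTS SD_XXX;
CREATE DATABASE IF NOT EXISTS TD_XXX;

-- Role with the required privileges on both databases
CREATE ROLE IF NOT EXISTS R_XXX;
GRANT CREATE SCHEMA, MODIFY, MONITOR, USAGE ON DATABASE SD_XXX TO ROLE R_XXX;
GRANT CREATE SCHEMA, MODIFY, MONITOR, USAGE ON DATABASE TD_XXX TO ROLE R_XXX;
GRANT OWNERSHIP ON DATABASE SD_XXX TO ROLE R_XXX COPY CURRENT GRANTS;
GRANT OWNERSHIP ON DATABASE TD_XXX TO ROLE R_XXX COPY CURRENT GRANTS;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE R_XXX;
GRANT USAGE ON WAREHOUSE WH_XXX TO ROLE R_XXX;

-- User who can access the role
CREATE USER IF NOT EXISTS XXX
  PASSWORD = '<a strong password>'
  DEFAULT_ROLE = R_XXX;
GRANT ROLE R_XXX TO USER XXX;
```

Note that OWNERSHIP is transferred last with COPY CURRENT GRANTS so that the previously granted privileges are preserved.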

Source data

In our target environment example, we use the same database for source and target data.

The source data are stored in a dedicated schema named AW2017 and filled with data from Parquet files stored in an AWS S3 bucket.

Overall, the source data are loaded according to the following architecture:

(Diagram: H405_Annex)

The following AWS services are required:

  • AWS S3 buckets:
    • An Amazon S3 bucket is a public cloud storage resource available in Amazon Web Services (AWS) Simple Storage Service (S3), an object storage offering. Amazon S3 buckets, similar to file folders, store objects consisting of data and its descriptive metadata.
    • For this example, we store Parquet files containing data extracted from the Microsoft SQL Server AdventureWorks2017 sample database in the S3 bucket.
  • AWS Glue Data Catalog:
    • The AWS Glue Data Catalog is an index to your data's location, schema, and runtime metrics. Information in the Data Catalog is stored as metadata tables, where each table specifies a single data store.
    • For this example, external tables reference the data from the S3 bucket (Parquet files).
  • AWS Athena:
    • Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
    • For this example, we use Athena to create the metadata tables stored in the Glue Data Catalog.
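As an illustration, one of the AdventureWorks2017 extracts could be registered as a Glue Data Catalog table through Athena roughly as follows. The database name, bucket path, and column list are hypothetical; adjust them to the actual files you extracted.

```sql
-- Athena DDL: registers a Glue Data Catalog table over Parquet files in S3
CREATE EXTERNAL TABLE IF NOT EXISTS aw2017.person (
  BusinessEntityID INT,
  FirstName        STRING,
  LastName         STRING
)
STORED AS PARQUET
LOCATION 's3://<your-bucket>/aw2017/person/';
```

Athena only stores the table metadata in the Glue Data Catalog; the data itself stays in the S3 bucket.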

Below, you can see the source data in the schema AW2017:
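One possible way to fill the AW2017 schema from the S3 bucket is an external stage combined with COPY INTO. This is a sketch under assumptions: the stage name, bucket path, credentials, and the pre-existing target table AW2017.PERSON are placeholders, not part of the generated artifacts.

```sql
-- In the source database, create the schema and a stage pointing at the bucket
USE DATABASE SD_XXX;
CREATE SCHEMA IF NOT EXISTS AW2017;

CREATE STAGE IF NOT EXISTS AW2017.S3_STAGE
  URL = 's3://<your-bucket>/aw2017/'
  CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>')
  FILE_FORMAT = (TYPE = PARQUET);

-- Load one extracted table; MATCH_BY_COLUMN_NAME maps the Parquet
-- columns onto the columns of the existing target table by name
COPY INTO AW2017.PERSON
  FROM @AW2017.S3_STAGE/person/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```

Repeat the COPY INTO statement for each extracted table in the schema.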

Upload artifacts in Snowflake

Now upload the generated artifacts from the biGENIUS-X application to Snowflake.

Please replace the placeholders before uploading the artifacts.

  • Click on Projects, then Worksheets, in the left menu:
  • Click on the Context icon and choose Create Worksheet from SQL File:
  • Select a SQL script from the SQL folder in the generated artifacts
  • Repeat the operation for all the SQL scripts in the SQL folder and the LoadControl folder
  • All the generated artifacts should now be loaded into Snowflake Worksheets:

A known bug exists in version 1.4.0 of the Snowflake Generators.

Please add the following code at the beginning of the Load Control SQL Script (Snowflake1_4_0DataVaultandMart_tasks.sql in our example):

USE DATABASE IDENTIFIER('<Name of your target database>'); 

Then, if you don't use the default Warehouse COMPUTE_WH, please replace COMPUTE_WH with the name of your Warehouse everywhere in this file.
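For instance, assuming the script selects the warehouse with a USE WAREHOUSE statement and your Warehouse is named WH_XXX, such a statement would then read:

```sql
USE WAREHOUSE WH_XXX;
```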


You're now ready to deploy these artifacts and subsequently load the data based on the Generator you are using.