Avoid to load all data in development

In a development mode, if your source data volume is important, you may want to do a partial load of your data.

The Template module described below is optimized for a Microsoft SQL Server target technology. However, you can use the same concept for all target technologies.

The idea here is to select only the TOP X rows for the Stage Model Objects by choosing the X value.

Create a Custom Property SelectTopValue

Create a Custom Property with the following information:

  • Name: SelectTopValue
  • Property Group: Implementation
  • Model Object Type: Stage
  • Type: Single Line String

Some help is available in Create a Custom Property.

Create a Template Module

Create a template Module with the following information:

  • Model Object Type: Stage
  • Part per Model Object Type: Stage Source View
  • Use for all future Model Objects: checked
  • Model Objects list: select all the existing Stage Model Objects

Some help is available in Create a Template Module.

For the code itself inside the Editor, let's first generate the Stage source view of a Satge Model Object named Customer to understand the default code generated by the generator:

-- StageSourceView: Customer_Stage Source View_1
CREATE OR ALTER VIEW [{sqlserverdimensional1.7#stage#schema_name}].[STG_ST_Customer_Source]
AS
SELECT
     CAST(NULL AS DATETIMEOFFSET) AS [BG_LoadTimestamp]
    ,CAST(NULL AS NVARCHAR(255)) COLLATE DATABASE_DEFAULT AS [BG_SourceSystem]
    ,[s1].[CustomerID] AS [CustomerID]
    ,[s1].[PersonID] AS [PersonID]
    ,[s1].[StoreID] AS [StoreID]
    ,[s1].[TerritoryID] AS [TerritoryID]
    ,[s1].[AccountNumber] COLLATE DATABASE_DEFAULT AS [AccountNumber]
    ,[s1].[rowguid] AS [rowguid]
    ,[s1].[ModifiedDate] AS [ModifiedDate]
FROM [{sqlserverdimensional1.7#adventureworks-sales#server_name}].[{sqlserverdimensional1.7#adventureworks-sales#database_name}].[{sqlserverdimensional1.7#adventureworks-sales#schema_name}].[Customer] AS [s1]
;
GO

The objective of the custom Template Module is to generate similar code, including a TOP X instruction for the data selection, with the X value coming from the Custom Property SelectTopValue:

-- StageSourceView: Customer_Stage Source View_1
CREATE OR ALTER VIEW [{sqlserverdimensional1.7#stage#schema_name}].[STG_ST_Customer_Source]
AS
SELECT TOP 1000
     CAST(NULL AS DATETIMEOFFSET) AS [BG_LoadTimestamp]
    ,CAST(NULL AS NVARCHAR(255)) COLLATE DATABASE_DEFAULT AS [BG_SourceSystem]
    ,[s1].[CustomerID] AS [CustomerID]
    ,[s1].[PersonID] AS [PersonID]
    ,[s1].[StoreID] AS [StoreID]
    ,[s1].[TerritoryID] AS [TerritoryID]
    ,[s1].[AccountNumber] COLLATE DATABASE_DEFAULT AS [AccountNumber]
    ,[s1].[rowguid] AS [rowguid]
    ,[s1].[ModifiedDate] AS [ModifiedDate]
FROM [{sqlserverdimensional1.7#adventureworks-sales#server_name}].[{sqlserverdimensional1.7#adventureworks-sales#database_name}].[{sqlserverdimensional1.7#adventureworks-sales#schema_name}].[Customer] AS [s1]
;
GO

For this, we will implement the following Template Module code (the bold text is the default code provided by biGENIUS-X)

using biGENIUS.ArchitectureModel.Domain;
namespace CustomTemplate
{
  public class Executor
  {
    public void WriteOutput(TargetObject targetObject, ITemplateContext templateContext)
    {
    //A: Source placeholder names
      var sourceServer = "{sqlserverdimensional1.7#adventureworks-sales#server_name}";
      var sourceDatabase = "{sqlserverdimensional1.7#adventureworks-sales#database_name}";
      var sourceSchema = "{sqlserverdimensional1.7#adventureworks-sales#schema_name}";
     
//B: Target placeholder names
      var targetSchema = "{sqlserverdimensional1.7#stage#schema_name}";
 
//C: Alias name of the dataflow set model object
  var DFSMO_alias = "";
  foreach(var dataflow in targetObject.Dataflows) {
    foreach(var dataflowset in dataflow.DataflowSets) {
          foreach(var dataflowsetMO in dataflowset.DataflowSetObjects) {
              DFSMO_alias = dataflowsetMO.Alias;
          }
    }
  }
     
//D: Model object name
      var ModelObjectName = targetObject.GetPropertyValue("SemanticModelObjectName");
     
//E: Value of the SelectTopValue Property
  var SelectTopValue = targetObject.GetPropertyValue("SelectTopValue");
     
//F: Identify the source model object ID
  var sourceModelObjectID = "";
  foreach(var dataflow in targetObject.Dataflows) {
    foreach(var dataflowset in dataflow.DataflowSets) {
      foreach(var dataflowsetMO in dataflowset.DataflowSetObjects) {
        sourceModelObjectID = dataflowsetMO.SourceModelObjectId.ToString();
      }
    }
  }
     
//G: Identify the source model object name from the ID
  var source = "";
  foreach(var modelobject in templateContext.ArchitectureModel.ModelObjects) {
    if(modelobject.Id.ToString() == sourceModelObjectID) {
          source = modelobject.Name;
    }
  }
 
//H: Beginning of the output text
  templateContext.Output.WriteLine("-- Customized template for selection the TOP X rows if Property SelectTopValue is filled");
  templateContext.Output.WriteLine($"-- StageSourceView: {targetObject.Name}");
  templateContext.Output.WriteLine($"CREATE OR ALTER VIEW [{targetSchema}].[{targetObject.TargetName}]");
  templateContext.Output.WriteLine("AS");
 
//I: Add SELECT TOP X if the Property SelectTopValue is filled
  if(!string.IsNullOrWhiteSpace(SelectTopValue)) {
    templateContext.Output.WriteLine($"SELECT TOP {SelectTopValue}");
  }
  else {
    templateContext.Output.WriteLine("SELECT");
  }
 
//J: Add the default terms
  templateContext.Output.WriteLine("     CAST(NULL AS DATETIMEOFFSET) AS [BG_LoadTimestamp]");
  templateContext.Output.WriteLine("     , CAST(NULL AS NVARCHAR(255)) COLLATE DATABASE_DEFAULT AS [BG_SourceSystem]");
 
//K: List of non-default terms
  foreach(var term in targetObject.Terms) {
    if (!term.IsDefaultTerm) {
      templateContext.Output.WriteLine($"     , [{DFSMO_alias}].[{term.Name}] AS [{term.Name}]");
    }
  }

  //L: End of the output text
  templateContext.Output.WriteLine("FROM");      
  templateContext.Output.WriteLine($"[{sourceServer}].[{sourceDatabase}].[{sourceSchema}].[{source}] AS [{DFSMO_alias}]");
      templateContext.Output.WriteLine(";");
  templateContext.Output.WriteLine("GO");
  }
  }
}

Let's now explain each part of this Custom template Module:

A- Source placeholder names

In this part, we initialize some variables with source placeholder values, which will be used later in part L.

B- Target placeholder names

In this part, we initialize some variables with the target placeholder value, which will be used later in part H.

C- Alias name of the dataflow set model object

In this part, we initialize a variable with the alias of the Dataflow set Model Object. 

For this, we explore the targetObject and go through some properties of the architecture model to access the alias.

This variable will be used in part K.

D- Model object name

In this part, we initialize a variable with the Model Object name. It is contained in a Property named SemanticModelObjectName.

E- Value of the SelectTopValue Property

In this part, we initialize a variable with the SelectTopValue Custom property value. 

This variable will be used in part I.

F- Identify the source model object ID

In this part, we initialize a variable with the ID of the Source Model Object. 

For this, we explore the targetObject and examine some properties of the architecture model to access the ID.

This variable will be used in part G.

G- Identify the source model object name from ID

In this part, by using the Source Model Object ID found in part F, we can access the Source Model Object name by exploring the templateContext.

H- Beginning of the output text

In this part, we start to write the output text of our template with some comments and the first SQL statements to create a view.

I- Add SELECT TOP X if the Property is filled

In this part, we check whether the SelectTopValue Custom property is filled with a value.

If so, we can add a SELECT TOP X to the output text. 

If not, we keep a simple SELECT.

J- Add the default terms

In this part, we add the default terms mandatory in a Stage Source View: BG_LoadTimestamp and BG_SourceSystem.

K- List of non-default terms

In this part, we iterate the non-default Terms of the Source Model Object to add them to the SQL output text.

L- End of the output text

In this part, we finalize the SQL statement to create the view.

 

The output result of the Template Module for a Stage Model Object named Customer_Custom where the Custom property SelectTopValue contains the value 1000 is:

-- Customized template for selection the TOP X rows if Property SelectTopValue is filled
-- StageSourceView: Customer_Custom_Stage Source View_1
CREATE OR ALTER VIEW [{sqlserverdimensional1.7#stage#schema_name}].[STG_ST_Customer_Custom_Source]
AS
SELECT TOP 1000
     CAST(NULL AS DATETIMEOFFSET) AS [BG_LoadTimestamp]
     , CAST(NULL AS NVARCHAR(255)) COLLATE DATABASE_DEFAULT AS [BG_SourceSystem]
     , [s1].[CustomerID] AS [CustomerID]
     , [s1].[PersonID] AS [PersonID]
     , [s1].[StoreID] AS [StoreID]
     , [s1].[TerritoryID] AS [TerritoryID]
     , [s1].[AccountNumber] AS [AccountNumber]
     , [s1].[rowguid] AS [rowguid]
     , [s1].[ModifiedDate] AS [ModifiedDate]
FROM
[{sqlserverdimensional1.7#adventureworks-sales#server_name}].[{sqlserverdimensional1.7#adventureworks-sales#database_name}].[{sqlserverdimensional1.7#adventureworks-sales#schema_name}].[Customer] AS [s1]
;
GO

 

Here we go!

We implement a Template Module to be able to avoid loading all the data in a development mode.