Reduce the volume of the Cleansing Error tables

Reduce the volume of Cleansing Error tables

By default, the cleansing action logs a line of type Missing foreign key value in the CLS_XXX_Error tables if a Foreign Key has a NULL value in a relationship between a Fact and an Entity or between 2 Entities.

Example: We don't know who buys the Product X: the Person is set to NULL in the Fact Sale.

You have the same kind of line if a Foreign Key has a value that does not exist in the target object of the relationship.

Example: Product X was bought by Person Y, but Person Y didn't exist in the list of persons at that time.

For more information about the cleansing actions, see the article Take care of entries from Cleansing Error tables.

By default, the cleansing action logs an error if a Foreign Key has a NULL value in a relationship between a Fact and an Entity or between two Entities. However, you might want to treat a NULL Foreign Key as meaningful, indicating that the related record is unknown, but the data is still correct in the source.


To reduce the volume of entries in the CLS_XXX_Error tables, you may choose to suppress the creation of these Cleansing Action Log lines for such cases.

To avoid logging these as errors in the Cleansing Action error tables, apply the following updates to your Data Solution:

  • Include a new entry in the relevant Entity to represent the "N/A" value. This is called a singleton. For example, in the case of the Product_Subcategory Entity, you can add the following line:
-- Enable IDENTITY_INSERT in Product_Subcategory Entity
SET IDENTITY_INSERT [DW].[COR_EN_Product_Subcategory] ON
GO

--Add the singleton value in Product_Subcategory Entity
INSERT INTO [DW].[COR_EN_Product_Subcategory]
([BG_SourceSystem]
      ,[BG_LoadTimestamp]
      ,[BG_UpdateTimestamp]
      ,[Product_Subcategory_ID]
      ,[ProductSubcategoryID]
      ,[FK_Product_Category_ProductCategoryID]
      ,[Product_Category_Product_Category_ID]
      ,[Name])
VALUES (
      'N/A'
      ,CONVERT(DATETIME, '19000101', 112)
      ,CONVERT(DATETIME, '19000101', 112)
      ,-10
      ,-10
      ,0
      ,0
      ,'Not available');
GO

-- Disable IDENTITY_INSERT in Product_Subcategory Entity
SET IDENTITY_INSERT [DW].[COR_EN_Product_Subcategory] OFF
GO
  • To implement the use of the N/A value, add a term rule to the Foreign Key in the related Entity. For instance, in the case of Product_Product, the term rule should ensure that NULL values are replaced with the "N/A" value.
    • Edit the FK_Product_Subcategory_ProductSubcategoryID Term Mapping
    • Fill in the following Expression:
CASE
WHEN s1.ProductSubcategoryID IS NULL
THEN -10 --apply the N/A singleton value
ELSE s1.ProductSubcategoryID
END

 

Once you generate, deploy, and load your project again, no more Cleansing Action Log entries will be created for a NULL Foreign Key in the relationship affected by this update (e.g., the relationship between a Product and a Product_Subcategory).

This update allows you to distinguish between "real" unknown values (which are subject to a Cleansing Action) and N/A values, which are now treated as valid and meaningful.

The new singleton line added to the relevant Entity for the N/A value is not generated by biGENIUS-X since it was created manually.
As a result, a new deployment will remove this line.

To ensure the N/A value persists, it is recommended to create a script that can be executed manually after deploying the generated artifacts. This script should reinsert the N/A entry into the Entity table.