Reduce the volume of the Data Quality rules log
This article is valid for a Project with a Generator Configuration version higher than or equal to 1.10.X.
For lower versions, see Reduce the volume of the Cleansing Error tables.
Reduce the volume of the Data Quality rule logs
By default, the Missing Foreign Keys Data Quality rule logs a line of type Missing foreign key value in the XXX_DataQualityError tables if a Foreign Key has a NULL value in a relationship between 2 Model Objects.
Example: We don't know who buys Product X: the Person is set to NULL in the Fact Sale.
You have the same kind of line if a Foreign Key has a value that does not exist in the target object of the relationship.
Example: Product X was bought by Person Y, but Person Y didn't exist in the list of persons at that time.
For more information about the Data Quality rules logs, see the article Take care of entries from the Data Quality rules log.
By default, the Data Quality rule logs an error if a Foreign Key has a NULL value in a relationship between a Fact and an Entity or between two Entities.
However, you might want to treat a NULL Foreign Key as meaningful, indicating that the related record is unknown while the data in the source is still correct.
To reduce the volume of entries in the XXX_DataQualityError, you may choose to suppress the creation of these Data Quality rule Log lines for such cases.
To avoid logging these as errors, apply the following updates to your Data Solution:
- Include a new entry in the relevant Entity to represent the "N/A" value. This is called a singleton. For example, in the case of the Product_Subcategory Entity, you can add the following line:
-- Enable IDENTITY_INSERT in Product_Subcategory Entity
SET IDENTITY_INSERT [DW].[COR_EN_Product_Subcategory] ON
GO
--Add the singleton value in Product_Subcategory Entity
INSERT INTO [DW].[COR_EN_Product_Subcategory]
([BG_SourceSystem]
,[BG_LoadTimestamp]
,[BG_UpdateTimestamp]
,[Product_Subcategory_ID]
,[ProductSubcategoryID]
,[FK_Product_Category_ProductCategoryID]
,[Product_Category_Product_Category_ID]
,[Name])
VALUES (
'N/A'
,CONVERT(DATETIME, '19000101', 112)
,CONVERT(DATETIME, '19000101', 112)
,-10
,-10
,0
,0
,'Not available');
GO
-- Disable IDENTITY_INSERT in Product_Subcategory Entity
SET IDENTITY_INSERT [DW].[COR_EN_Product_Subcategory] OFF
GO
- To implement the use of the N/A value, add a term rule to the Foreign Key in the related Entity. For instance, in the case of Product_Product, the term rule should ensure that NULL values are replaced with the "N/A" value.
- Edit the FK_Product_Subcategory_ProductSubcategoryID Term Mapping
- Fill in the following Expression:
CASE
WHEN s1.ProductSubcategoryID IS NULL
THEN -10 --apply the N/A singleton value
ELSE s1.ProductSubcategoryID
END
Once you generate, deploy, and load your project again, no more Data Quality rule log entries will be created for a NULL Foreign Key in the relationship affected by this update (e.g., the relationship between a Product and a Product_Subcategory).
This update allows you to distinguish between "real" unknown values (which are subject to a Data Quality rule) and N/A values, which are now treated as valid and meaningful.
The new singleton line added to the relevant Entity for the N/A value is not generated by biGENIUS-X since it was created manually.
As a result, a new deployment will remove this line.
To ensure the N/A value persists, it is recommended to create a script that can be executed manually after deploying the generated artifacts. This script should reinsert the N/A entry into the Entity table.