Partition Filter (Source Term)

Description

The Partition Filter property is available on a Source Term for mapping in the Spark Generator.

It indicates whether the mapped source column should be used for partition pruning.

Partition pruning in Spark is a performance optimization that limits the number of files and partitions Spark reads when querying.

Once the data is partitioned, a query whose filter matches the partition criteria runs faster because Spark reads only a subset of the directories and files.
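
The idea of reading only a subset of directories can be illustrated with a small plain-Python sketch. It is not Spark's implementation and does not use the Spark Generator. It only mimics a Hive-style `column=value` directory layout with empty placeholder files. The column name `load_date`, the file name `part-00000.parquet`, and the helper functions are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def make_partitioned_dataset(root: Path, column: str, values: list[str]) -> None:
    """Create empty placeholder files in Hive-style `column=value` directories."""
    for value in values:
        partition_dir = root / f"{column}={value}"
        partition_dir.mkdir(parents=True)
        # Hypothetical file name; real Parquet part files are named by the writer.
        (partition_dir / "part-00000.parquet").touch()


def prune_partitions(root: Path, column: str, wanted: set[str]) -> list[Path]:
    """Return only the files whose partition directory matches the filter.

    Directories whose value is not in `wanted` are never opened.
    """
    selected: list[Path] = []
    for partition_dir in sorted(root.iterdir()):
        name, _, value = partition_dir.name.partition("=")
        if name == column and value in wanted:
            selected.extend(sorted(partition_dir.glob("*.parquet")))
    return selected


def demo_pruning() -> list[str]:
    """Build three partitions and keep only the one matching the filter."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_partitioned_dataset(
            root, "load_date", ["2024-01-01", "2024-01-02", "2024-01-03"]
        )
        files = prune_partitions(root, "load_date", {"2024-01-03"})
        return [f.parent.name for f in files]


if __name__ == "__main__":
    print(demo_pruning())
```

In this sketch the filter is applied to directory names alone, so the two non-matching partitions are skipped without touching their files. This is the effect partition pruning aims for.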

Parquet source files can be split up into partitions.

The goal is to skip partitions that were already loaded so that they are not reloaded on every load.
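
A minimal sketch of this selection, in plain Python, is shown here. How the generated code actually tracks loaded partitions is not described in this page, so the function name `partitions_to_load` and the idea of keeping a set of already loaded partition values are assumptions for illustration only.

```python
def partitions_to_load(available: list[str], already_loaded: set[str]) -> list[str]:
    """Return the partition values that still need to be loaded, in sorted order.

    `available` holds the partition values found in the source;
    `already_loaded` holds the values processed by earlier loads (assumed
    to be tracked somewhere by the loading process).
    """
    return sorted(value for value in set(available) if value not in already_loaded)


if __name__ == "__main__":
    pending = partitions_to_load(
        ["2024-01-01", "2024-01-02", "2024-01-03"],
        {"2024-01-01", "2024-01-02"},
    )
    print(pending)
```

Only the values returned by such a selection would then be used as the partition filter for the next load.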

Format

The Partition Filter is a Boolean.

The possible values are True and False.

Example:

Default Value

The default value is False.