ETL Architecture Overview
Typically an ETL process contains 10 steps. Flow Engine calls Template Engine to generate SQLs based on the ETL templates for each entity during run time and Flow Engine executes those 10 steps one by one.
ETL templates contains most of the common ETL patterns, so you only need to focus on the business logic when developing new content.
In general, ETL templates support the following ETL patterns:
Change Data Capture
ETL Engine captures only change data based on the is_cdc attributes defined in the source metadata. Deleted data can be captured based on the is_bk attributes defined in the source metadata.
Slow Changing Dimension
ETL Engine supports slow changing dimension. It automatically keeps the dimension history if you mark a field as scd2.
Late Arriving Dimension
ETL Engine automatically generates dummy record for late arriving dimension data, and updates the dummy record when data arrives.
Surrogate Key Generation
ETL Engine generates surrogate key automatically. It generates the enterprise key to identify records and the primary key to identify the history of records.
Based on the target metadata definition, ETL Engine stores hierarchy relation information in the BRIDGE_HIEARCH tables. Unlimited hierarchy levels are supported.
Date Timezone Conversion
ETL Engine converts date to the data warehouse timezone automatically. You need to specify the PPM database timezone and PPM data warehouse timezone during installation.