Scenario: Creating New Dimension Entities

If you want to create new dimension entities for the TM_TIME_SHEETS table, follow these steps. The TIME_SHEET_ID and DESCRIPTION fields are to be added.

Defining Content Packs

To define content packs, follow these steps:

  1. Create two content pack folders: CUSTOMIZATION_PPM.cp and CUSTOMIZATION_TARGET.cp.
  2. Create folders under these two content packs. For detailed structures, see Content Pack Structure .
  3. Under CUSTOMIZATION_TARGET.cp, do the following:

    1. Under the dwmetadata\entities directory, create and define the target entity JSON file CUSTOMIZATION_TIME_SHEETS.json by copying the following to the file:

      {
          "metadata_layout_version": "1.0",
          "version": "1.0",
          "content_pack": "CUSTOMIZATION_TARGET",
          "entity_name": "CUSTOMIZATION_TIME_SHEETS",
          "entity_business_name": "Customization TIME SHEETS",
          "entity_description": "Customization TIME SHEETS",
      	"dimension":{
      		"dimension_business_name":"Customization time sheets",
      		"is_conformed":"true",
      		"dimension_type":"primary",
      		"storage_strategy":{
      			"segmented_by":"default",
      			"partition_by":"na"
      		}
      	},
          "schema": [
      	    {
                  "attribute": "TIME_SHEET_ID",
                  "attribute_name": "TIME_SHEET_ID",
                  "attribute_business_name": "TIME SHEET ID",
                  "attribute_description": "TIME SHEET ID",
                  "attribute_type": "dimension",
                  "scd": "scd1",
                  "target_data_type": "INT",
                  "size": "na",
                  "is_required": "false"
              },
      	    {
                  "attribute": "DESCRIPTION",
                  "attribute_name": "Description",
                  "attribute_business_name": "Customization Description",
                  "attribute_description": "Customization Description",
                  "attribute_type": "dimension",
                  "scd": "scd1",
                  "target_data_type": "varchar",
                  "size": "650",
                  "is_required": "false"
              }
          ]
      } 
      

      CUSTOMIZATION_TIME_SHEETS.json defines:

      • Two fields in the target table: TIME_SHEET_ID and DESCRIPTION.

      After you complete this step, the target entity JSON file is defined.

    2. Create and define cp.json by copying the following to the file. The value of content_pack_name should be the same value defined in content_pack in step 3a:

      {
      	"metadata_layout_version": "1.0",
      	"content_pack_name": "CUSTOMIZATION_TARGET",
      	"version": "1.0",
      	"description": "Project Management & Portfolio Management Content Pack, Shared Entities.",
      	"require": {
      		"platform": ">=1.0.0"
      	},
      	"target_entities": [
      		{
      			"name": "CUSTOMIZATION_TIME_SHEETS"
      		}
      	]
      }
      

      cp.json defines:

      • The target entity CUSTOMIZATION_TIME_SHEETS, as defined in CUSTOMIZATION_TIME_SHEETS.json

      • The content pack name: CUSTOMIZATION_TARGET

    After you complete this step, the cp.json file is defined.

  4. In CUSTOMIZATION_PPM.cp, do the following:

    1. Under the dwmetadata\entities directory, create and define the source entity JSON file CUSTOMIZATION_TM_TIME_SHEETS.json:

      {
          "metadata_layout_version": "1.0",
          "version": "1.0",
          "content_pack": "CUSTOMIZATION_PPM",
          "source_product": "PPM",
          "entity_name": "CUSTOMIZATION_TM_TIME_SHEETS",
          "entity_business_name": "Customization PPM TIME SHEETS table",
          "entity_description": "Customization TIME SHEETS info",
          "schema": [
              {
                  "attribute": "TIME_SHEET_ID",
                  "attribute_name": "TIME_SHEET_ID",
                  "attribute_business_name": "Time sheet id",
                  "attribute_description": "Time sheet id",
                  "sql_data_type": "INT",
                  "size": "na",
                  "is_bk": "true",
                  "is_cdc": "false",
                  "is_required": "true",
                  "column_sequence": "1"
              },
              {
                  "attribute": "DESCRIPTION",
                  "attribute_name": "DESCRIPTION",
                  "attribute_business_name": "Description",
                  "attribute_description": "Description",
                  "sql_data_type": "VARCHAR",
                  "size": "650",
                  "is_bk": "false",
                  "is_cdc": "false",
                  "is_required": "false",
                  "column_sequence": "2"
              }
          ]
      }
      

      CUSTOMIZATION_TM_TIME_SHEETS.json defines:

      • Two fields in the source table: TIME_SHEET_ID and DESCRIPTION
      • The source entity name: CUSTOMIZATION_TM_TIME_SHEETS

      After you complete this step, the source entity JSON file is defined.

    2. Under dwmetadata\streams, create and define the stream entity JSON file CUSTOMIZATION_TIME_SHEETS_STREAM.json:

      {
      	"metadata_layout_version":"1.0",
      	"version":"1.0",
      	"content_pack":"CUSTOMIZATION_PPM",
      	"source_product":"PPM",
      	"stream_name":"CUSTOMIZATION_TIME_SHEETS_STREAM",
      	"source_entities_includes":[
      		{
      		"source_entity_include":"CUSTOMIZATION_TM_TIME_SHEETS"
      		}
      	],
      	"target_entities_includes":[
      		{
      		"target_entity_include":"CUSTOMIZATION_TIME_SHEETS"
      		}	
      	],
      	"transforms":[
      		{
      		"sql": "drop table if exists PPM_CUSTOMIZATION_TIME_SHEETS_${MD_SOURCE_INSTANCE_ID}_SSI CASCADE; 
      --==--==--
      create table PPM_CUSTOMIZATION_TIME_SHEETS_${MD_SOURCE_INSTANCE_ID}_SSI ( 
                      MD_BUSINESS_KEY			varchar(1000),
      			CUSTOMIZATION_TIME_SHEETS_EXTEND_BUSINESS_KEY     varchar(1000),
                      CUSTOMIZATION_TIME_SHEETS_EXTEND_ENTERPRISE_KEY     INT,
      				DESCRIPTION     varchar(650),
      				
      			
                      MD_BATCH_ID                 INT,
                      MD_PROCESS_ID               INT,
                      MD_SOURCE_INSTANCE_ID       INT,	
                      MD_FLAG                varchar(10)
      ) unsegmented all nodes; 
      --==--==--
      insert into PPM_CUSTOMIZATION_TIME_SHEETS_${MD_SOURCE_INSTANCE_ID}_SSI (
                      MD_BUSINESS_KEY	,
      			CUSTOMIZATION_TIME_SHEETS_EXTEND_BUSINESS_KEY,
      			DESCRIPTION,
                      MD_BATCH_ID ,
                      MD_PROCESS_ID ,
                      MD_SOURCE_INSTANCE_ID ,
                      MD_FLAG                       
      ) 
      select          
      			tab.MD_BUSINESS_KEY	,
      			tab.CUSTOMIZATION_TIME_SHEETS_EXTEND_BUSINESS_KEY,	
      			tab.DESCRIPTION,				
                      ${MD_BATCH_ID} AS MD_BATCH_ID ,
                      ${MD_PROCESS_ID} AS MD_PROCESS_ID ,
                      ${MD_SOURCE_INSTANCE_ID} AS MD_SOURCE_INSTANCE_ID ,
                      tab.MD_FLAG 
      from (
                      select
                                      t1.md_source_instance_id || ':' || t1.md_business_key as MD_BUSINESS_KEY, 
      t1.md_source_instance_id || ':' || t1.TIME_SHEET_ID as CUSTOMIZATION_TIME_SHEETS_EXTEND_BUSINESS_KEY,		
      		                   DESCRIPTION,
                                      decode(t1.md_flag,'NEW','NEW','DEL','DEL','UPD') as md_flag,  
                                      row_number() over( partition by t1.md_source_instance_id || ':' || t1.md_business_key 
      						   ) multi_flag
                      from CUSTOMIZATION_TIME_SHEETS_STREAM_CUSTOMIZATION_TM_TIME_SHEETS_${MD_SOURCE_INSTANCE_ID}_EXT t1
      				 where t1.md_pf_flag = 'D') tab
      where tab.multi_flag = 1; 
      --==--==--
      	SELECT ANALYZE_STATISTICS('PPM_CUSTOMIZATION_TIME_SHEETS_${MD_SOURCE_INSTANCE_ID}_SSI');
      --==--==--
      "
      		}
      	]
      }
      

      CUSTOMIZATION_TIME_SHEETS_STREAM.json defines:

      • Source entities that provide data
      • Target entities that accept data
      • SQLs mainly to transform data from source entities to target entities

      After you complete this step, the stream entity JSON file is defined.

    3. Under extmetadata, create and define the extractor entity JSON file CUSTOMIZATION_TM_TIME_SHEETS_EXT.json:

      {
          "metadata_layout_version": "1.0",
          "version": "1.0",
          "source_product": "PPM",
          "content_pack": "CUSTOMIZATION_PPM",
          "entity_name": "CUSTOMIZATION_TM_TIME_SHEETS_EXT",
          "source_entity_name": "TM_TIME_SHEETS",
          "extractor": "OracleDBExtractor",
          "extraction": [
              {
                  "extraction_view": "SELECT TIME_SHEET_ID, DESCRIPTION FROM TM_TIME_SHEETS",
                  "source_product_version": "9.30"
              }
          ]
      }
      

      CUSTOMIZATION_TM_TIME_SHEETS_EXT.json defines:

      • The extractor entity name
      • source_entity_name as the source table name
      • extraction_view: SQLs for extracting data

      After you complete this step, the extractor entity JSON file is defined.

    4. Create and define cp.json:

      {
      	"metadata_layout_version": "1.0",
      	"content_pack_name": "CUSTOMIZATION_PPM",
      	"source_product": "PPM",
      	"version": "1.0",
      	"description": "Project Management & Portfolio Management Content Pack, Extend Entities.",
      	"require": {
      	"platform": ">=1.0.0",
      		"cp": [
      			{
      				"name": "CUSTOMIZATION_TARGET",
      				"version": ">=1.0.0"
      			}
      		]
      	},
      	"streams": [
      		{
      			"name": "CUSTOMIZATION_TIME_SHEETS_STREAM"
      		}
      	],
      	"source_entities": [
      		{
      			"name": "CUSTOMIZATION_TM_TIME_SHEETS"
      		}
      	],
      	"extraction_entities": [
      		{
      			"name": "CUSTOMIZATION_TM_TIME_SHEETS_EXT"
      		}
      	]
      } 
      

      cp.json defines:

      • The content pack name: CUSTOMIZATION_PPM
      • Source entities: Defined in CUSTOMIZATION_TM_TIME_SHEETS.json
      • Stream entities: Defined in CUSTOMIZATION_TIME_SHEETS_STREAM.json
      • Extract entities: Defined in CUSTOMIZATION_TM_TIME_SHEETS_EXT.json

 

Deploying Content Packs

To deploy content packs, follow these steps:

  1. Place CUSTOMIZATION_PPM.cp and CUSTOMIZATION_TARGET.cp under <VDW_HOME>/Content.

  2. Run the ContentManager.sh script under the <VDW_HOME>/bin directory to deploy CUSTOMIZATION_TARGET.cp:

    sh ContentManager.sh --instruction install --cpname CUSTOMIZATION_TARGET;

    You can find the following message from ContentManager.log under <VDW_HOME>/logs if the content pack is deployed successfully:

    The content of package CUSTOMIZATION_TARGET was successfully installed

    The DIM_ CUSTOMIZATION_CONTACTS table is generated in the Vertica database.

  3. Run the ContentManager.sh script under the <VDW_HOME>/bin directory to deploy CUSTOMIZATION_PPM.cp:

    sh ContentManager.sh --instruction install --cpname CUSTOMIZATION_PPM;

    You can find the following message from ContentManager.log under <VDW_HOME>/logs if the content pack is deployed successfully:

    The content of package CUSTOMIZATION_PPM was successfully installed
  4. Run the ExtractorEngine.sh script under the <VDW_HOME>/bin directory to extract data from the PPM database to flat files:

    sh ExtractorEngine.sh --streamname CUSTOMIZATION_TIME_SHEETS_STREAM --instancename <PPM_Instance_Name>

    You can find the following message from ExtractorEngine.log under <VDW_HOME>/logs if the content pack is deployed successfully:

    Extractor was successfully executed. The BATCH ID is: <Batch_ID>.

    <PPM_Instance_Name> is the PPM instance name you specified when installing the Vertica for PPM Content Pack.

  5. Run the FlowEngine.sh script under the <VDW_HOME>/bin directory to process ETL:

    sh FlowEngine.sh --batch <Batch_ID> --streamname CUSTOMIZATION_TIME_SHEETS_STREAM --instancename <PPM_Instance_Name>

    You can find the following message from FlowEngine.log under <VDW_HOME>/logs if the content pack is deployed successfully:

    ETL process was executed successfully

    <Batch_ID> used in this command is the batch ID that was generated in Step 4.

  6. Connect to the Vertica database and check whether the data has been loaded successfully:

    select * from <Target_Schema>.DIM_CUSTOMIZATION_TIME_SHEETS

    <Target_Schema> is the name for the schema that contains target data and tables for reporting.