Extracting and Loading Data with vdwetljob.sh

The vdwetljob.sh command combines the functions of the ExtractorEngine.sh and FlowEngine.sh commands. Use it when you want to extract all data from the PPM instance that you have registered and run ETL for the entire content pack.

It is convenient to schedule the vdwetljob.sh command with the crontab command. ETL then runs regularly, and you do not need to obtain the batch ID from the Extractor Engine manually. For details about the crontab command, see Scheduling ETL with crontab.

The following table describes the supported parameters.

| Parameter | Mandatory? | Sample Value | Description |
|---|---|---|---|
| instancename | Yes | PPM01 | Must match the instance name that you specified when you registered the PPM instance. |
| forceinitialload | No | true/false | Runs the initial load manually. This dumps all records from the PPM Oracle instance to flat files again, just as the initial load does. The initial load duration setting still applies, which means that records created before the initial load duration are not dumped. |
| pagesize | No | 20000 | Specifies the page size that the extractor uses when it issues SQL queries to the PPM database instance. Use this parameter only when performance on the PPM Oracle instance degrades critically. The default value is 50000. The system applies the default value if no value is specified. |
| parallelism | No | 200 | Specifies the size of the thread pool that the extractor uses to extract data from PPM in parallel. The default value is 20. The system applies the default value if no value is specified. |
| cpname | Yes | PPM | Content pack name. Using this parameter is recommended, because it extracts all flat files that the ETL job needs. |
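
The parameters described above can be combined into a single command line. The following is a minimal sketch only: this section does not show the exact option syntax, so the `--name value` form used here is an assumption, and the helper function build_vdwetljob_cmd is hypothetical. Check the usage output of vdwetljob.sh on your installation before relying on it.

```shell
#!/bin/sh
# Sketch: assemble a vdwetljob.sh command line from the documented parameters.
# ASSUMPTION: options are passed as "--name value" pairs; this syntax is not
# confirmed by this section and may differ on your installation.
# build_vdwetljob_cmd is a hypothetical helper, not part of the product.

build_vdwetljob_cmd() {
    instancename=${1:-}
    cpname=${2:-}
    pagesize=${3:-}
    parallelism=${4:-}

    # instancename and cpname are marked as mandatory in the parameter table.
    if [ -z "$instancename" ] || [ -z "$cpname" ]; then
        echo "instancename and cpname are required" >&2
        return 1
    fi

    cmd="sh vdwetljob.sh --instancename $instancename --cpname $cpname"

    # Optional parameters are appended only when supplied. When they are
    # omitted, the documented defaults apply (pagesize 50000, parallelism 20).
    if [ -n "$pagesize" ]; then
        cmd="$cmd --pagesize $pagesize"
    fi
    if [ -n "$parallelism" ]; then
        cmd="$cmd --parallelism $parallelism"
    fi

    # Print the command instead of running it, so it can be reviewed first.
    echo "$cmd"
}

build_vdwetljob_cmd PPM01 PPM
build_vdwetljob_cmd PPM01 PPM 20000
```

The sketch prints the command rather than executing it, so the result can be reviewed and then pasted into a crontab entry or a wrapper script once the option syntax has been verified.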