Thursday, May 24, 2012

Tuning Sessions for Better Performance

 Tuning Sessions for Better Performance


Challenge

Running sessions is where the pedal hits the metal. A common misconception is that this is the area where most tuning should occur. While it is true that various specific session options can be modified to improve performance, PowerCenter 8 comes with PowerCenter Enterprise Grid Option and Pushdown optimizations that also improve performance tremendously.

Description
Once you optimize the source and target database, and mapping, you can focus on optimizing the session. The greatest area for improvement at the session level usually involves tweaking memory cache settings. The Aggregator (without sorted ports), Joiner, Rank, Sorter and Lookup transformations (with caching enabled) use caches. 

The PowerCenter Server uses index and data caches for each of these transformations. If the allocated data or index cache is not large enough to store the data, the PowerCenter Server stores the data in a temporary disk file as it processes the session data. Each time the PowerCenter Server pages to the temporary file, performance slows. 

You can see when the PowerCenter Server pages to the temporary file by examining the performance details. The transformation_readfromdisk or transformation_writetodisk counters for any Aggregator, Rank, Lookup, Sorter, or Joiner transformation indicate the number of times the PowerCenter Server must page to disk to process the transformation.  Index and data caches should both be sized according to the requirements of the individual lookup.  The sizing can be done using the estimation tools provided in the Transformation Guide, or through observation of actual cache sizes on in the session caching directory.

The PowerCenter Server creates the index and data cache files by default in the PowerCenter Server variable directory, $PMCacheDir. The naming convention used by the PowerCenter Server for these files is PM [type of transformation] [generated session instance id number] _ [transformation instance id number] _ [partition index].dat or .idx. For example, an aggregate data cache file would be named PMAGG31_19.dat. The cache directory may be changed however, if disk space is a constraint. Informatica recommends that the cache directory be local to the PowerCenter Server. A RAID 0 arrangement that gives maximum performance with no redundancy is recommended for volatile cache file directories (i.e., no persistent caches).
If the PowerCenter Server requires more memory than the configured cache size, it stores the overflow values in these cache files. Since paging to disk can slow session performance, the RAM allocated needs to be available on the server.  If the server doesn’t have available RAM and uses paged memory, your session is again accessing the hard disk. In this case, it is more efficient to allow PowerCenter to page the data rather than the operating system.  Adding additional memory to the server is, of course, the best solution.
Refer to Session Caches in the Workflow Administration Guide for detailed information on determining cache sizes.
The PowerCenter Server writes to the index and data cache files during a session in the following cases:
  • The mapping contains one or more Aggregator transformations, and the session is configured for incremental aggregation. 
  •  The mapping contains a Lookup transformation that is configured to use a persistent lookup cache, and the PowerCenter Server runs the session for the first time. 
  •  The mapping contains a Lookup transformation that is configured to initialize the persistent lookup cache. 
  •  The Data Transformation Manager (DTM) process in a session runs out of cache memory and pages to the local cache files. The DTM may create multiple files when processing large amounts of data. The session fails if the local directory runs out of disk space.
When a session is running, the PowerCenter Server writes a message in the session log indicating the cache file name and the transformation name. When a session completes, the DTM generally deletes the overflow index and data cache files. However, index and data files may exist in the cache directory if the session is configured for either incremental aggregation or to use a persistent lookup cache. Cache files may also remain if the session does not complete successfully.

Configuring Automatic Memory Settings

PowerCenter 8 allows you to configure the amount of cache memory. Alternatively, you can configure the Integration Service to automatically calculate cache memory settings at run time. When you run a session, the Integration Service allocates buffer memory to the session to move the data from the source to the target. It also creates session caches in memory. Session caches include index and data caches for the Aggregator, Rank, Joiner, and Lookup transformations, as well as Sorter and XML target caches. The values stored in the data and index caches depend upon the requirements of the transformation. For example, the Aggregator index cache stores group values as configured in the group by ports, and the data cache stores calculations based on the group by ports. When the Integration Service processes a Sorter transformation or writes data to an XML target, it also creates a cache.

Configuring Session Cache Memory

The Integration Service can determine cache memory requirements for the Lookup, Aggregator, Rank, Joiner, Sorter and XML.
You can configure auto for the index and data cache size in the transformation properties or on the mappings tab of the session properties

Max Memory Limits

Configuring maximum memory limits allows you to ensure that you reserve a designated amount or percentage of memory for other processes. You can configure the memory limit as a numeric value and as a percent of total memory. Because available memory varies, the Integration Service bases the percentage value on the total memory on the Integration Service process machine.
For example, you configure automatic caching for three Lookup transformations in a session. Then, you configure a maximum memory limit of 500MB for the session. When you run the session, the Integration Service divides the 500MB of allocated memory among the index and data caches for the Lookup transformations. 

When you configure a maximum memory value, the Integration Service divides memory among transformation caches based on the transformation type.
When you configure a numeric value and a percent both, the Integration Service compares the values and uses the lower value as the maximum memory limit.
When you configure automatic memory settings, the Integration Service specifies a minimum memory allocation for the index and data caches. The Integration Service allocates 1,000,000 bytes to the index cache and 2,000,000 bytes to the data cache for each transformation instance. If you configure a maximum memory limit that is less than the minimum value for an index or data cache, the Integration Service overrides this value. For example, if you configure a maximum memory value of 500 bytes for session containing a Lookup transformation, the Integration Service overrides or disable the automatic memory settings and uses the default values. 

When you run a session on a grid and you configure Maximum Memory Allowed for Auto Memory Attributes, the Integration Service divides the allocated memory among all the nodes in the grid. When you configure Maximum Percentage of Total Memory Allowed for Auto Memory Attributes, the Integration Service allocates the specified percentage of memory on each node in the grid.

Aggregator Caches

Keep the following items in mind when configuring the aggregate memory cache sizes:
  • Allocate at least enough space to hold at least one row in each aggregate group.
  • Remember that you only need to configure cache memory for an Aggregator transformation that does not use sorted ports. The PowerCenter Server uses Session Process memory to process an Aggregator transformation with sorted ports, not cache memory.
  • Incremental aggregation can improve session performance. When it is used, the PowerCenter Server saves index and data cache information to disk at the end of the session. The next time the session runs, the PowerCenter Server uses this historical information to perform the incremental aggregation. The PowerCenter Server names these files PMAGG*.dat and PMAGG*.idx and saves them to the cache directory. Mappings that have sessions which use incremental aggregation should be set up so that only new detail records are read with each subsequent run.
  • When configuring Aggregate data cache size, remember that the data cache holds row data for variable ports and connected output ports only. As a result, the data cache is generally larger than the index cache. To reduce the data cache size, connect only the necessary output ports to subsequent transformations.
Joiner Caches

When a session is run with a Joiner transformation, the PowerCenter Server reads from master and detail sources concurrently and builds index and data caches based on the master rows. The PowerCenter Server then performs the join based on the detail source data and the cache data.
The number of rows the PowerCenter Server stores in the cache depends on the partitioning scheme, the data in the master source, and whether or not you use sorted input.
After the memory caches are built, the PowerCenter Server reads the rows from the detail source and performs the joins. The PowerCenter Server uses the index cache to test the join condition. When it finds source data and cache data that match, it retrieves row values from the data cache.

Lookup Caches

Several options can be explored when dealing with Lookup transformation caches.
  • Persistent caches should be used when lookup data is not expected to change often. Lookup cache files are saved after a session with a persistent cache lookup is run for the first time. These files are reused for subsequent runs, bypassing the querying of the database for the lookup. If the lookup table changes, you must be sure to set the Recache from Database option to ensure that the lookup cache files are rebuilt. You can also delete the cache files before the session run to force the session to rebuild the caches.
  • Lookup caching should be enabled for relatively small tables. Refer to the Best Practice Tuning Mappings for Better Performance to determine when lookups should be cached. When the Lookup transformation is not configured for caching, the PowerCenter Server queries the lookup table for each input row. The result of the lookup query and processing is the same, regardless of whether the lookup table is cached or not. However, when the transformation is configured to not cache, the PowerCenter Server queries the lookup table instead of the lookup cache. Using a lookup cache can usually increase session performance.
  • Just like for a joiner, the PowerCenter Server aligns all data for lookup caches on an eight-byte boundary, which helps increase the performance of the lookup
Allocating Buffer Memory

The Integration Service can determine the memory requirements for the buffer memory:
·  DTM Buffer Size
·  Default Buffer Block Size
You can also configure DTM buffer size and the default buffer block size in the session properties. When the PowerCenter Server initializes a session, it allocates blocks of memory to hold source and target data. Sessions that use a large number of sources and targets may require additional memory blocks. 
To configure these settings, first determine the number of memory blocks the PowerCenter Server requires to initialize the session. Then you can calculate the buffer size and/or the buffer block size based on the default settings, to create the required number of session blocks.
If there are XML sources or targets in the mappings, use the number of groups in the XML source or target in the total calculation for the total number of sources and targets.

Increasing the DTM Buffer Pool Size

The DTM Buffer Pool Size setting specifies the amount of memory the PowerCenter Server uses as DTM buffer memory. The PowerCenter Server uses DTM buffer memory to create the internal data structures and buffer blocks used to bring data into and out of the server. When the DTM buffer memory is increased, the PowerCenter Server creates more buffer blocks, which can improve performance during momentary slowdowns.
If a session's performance details show low numbers for your source and target BufferInput_efficiency and BufferOutput_efficiency counters, increasing the DTM buffer pool size may improve performance.
Using DTM buffer memory allocation generally causes performance to improve initially and then level off. (Conversely, it may have no impact on source or target-bottlenecked sessions at all and may not have an impact on DTM bottlenecked sessions). When the DTM buffer memory allocation is increased, you need to evaluate the total memory available on the PowerCenter Server. If a session is part of a concurrent batch, the combined DTM buffer memory allocated for the sessions or batches must not exceed the total memory for the PowerCenter Server system. You can increase the DTM buffer size in the Performance settings of the Properties tab.

Running Workflows and Sessions Concurrently

The PowerCenter Server can process multiple sessions in parallel and can also process multiple partitions of a pipeline within a session. If you have a symmetric multi-processing (SMP) platform, you can use multiple CPUs to concurrently process session data or partitions of data. This provides improved performance since true parallelism is achieved. On a single processor platform, these tasks share the CPU, so there is no parallelism.
To achieve better performance, you can create a workflow that runs several sessions in parallel on one PowerCenter Server. This technique should only be employed on servers with multiple CPUs available.

Partitioning Sessions

Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. If you have PowerCenter partitioning available, you can increase the number of partitions in a pipeline to improve session performance. Increasing the number of partitions allows the PowerCenter Server to create multiple connections to sources and process partitions of source data concurrently.
When you create or edit a session, you can change the partitioning information for each pipeline in a mapping. If the mapping contains multiple pipelines, you can specify multiple partitions in some pipelines and single partitions in others. Keep the following attributes in mind when specifying partitioning information for a pipeline:
·  Location of partition points. The PowerCenter Server sets partition points at several transformations in a pipeline by default. If you have PowerCenter partitioning available, you can define other partition points. Select those transformations where you think redistributing the rows in a different way is likely to increase the performance considerably.
·  Number of partitions. By default, the PowerCenter Server sets the number of partitions to one. You can generally define up to 64 partitions at any partition point. When you increase the number of partitions, you increase the number of processing threads, which can improve session performance. Increasing the number of partitions or partition points also increases the load on the server. If the server contains ample CPU bandwidth, processing rows of data in a session concurrently can increase session performance. However, if you create a large number of partitions or partition points in a session that processes large amounts of data, you can overload the system. You can also overload source and target systems, so that is another consideration.
·  Partition types. The partition type determines how the PowerCenter Server redistributes data across partition points. The Workflow Manager allows you to specify the following partition types:
1.      Round-robin partitioning. PowerCenter distributes rows of data evenly to all partitions. Each partition processes approximately the same number of rows. In a pipeline that reads data from file sources of different sizes, you can use round-robin partitioning to ensure that each partition receives approximately the same number of rows. 
2.      Hash keys. The PowerCenter Server uses a hash function to group rows of data among partitions. The Server groups the data based on a partition key. There are two types of hash partitioning:
o Hash auto-keys. The PowerCenter Server uses all grouped or sorted ports as a compound partition key. You can use hash auto-keys partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations.
o Hash user keys. The PowerCenter Server uses a hash function to group rows of data among partitions based on a user-defined partition key. You choose the ports that define the partition key.
3.      Key range. The PowerCenter Server distributes rows of data based on a port or set of ports that you specify as the partition key. For each port, you define a range of values. The PowerCenter Server uses the key and ranges to send rows to the appropriate partition. Choose key range partitioning where the sources or targets in the pipeline are partitioned by key range.
4.      ­Pass-through partitioning. The PowerCenter Server processes data without redistributing rows among partitions. Therefore, all rows in a single partition stay in that partition after crossing a pass-through partition point.
5.      Database partitioning partition. You can optimize session performance by using the database partitioning partition type instead of the pass-through partition type for IBM DB2 targets.
If you find that your system is under-utilized after you have tuned the application, databases, and system for maximum single-partition performance, you can reconfigure your session to have two or more partitions to make your session utilize more of the hardware. Use the following tips when you add partitions to a session:
·  Add one partition at a time. To best monitor performance, add one partition at a time, and note your session settings before you add each partition.
·  Set DTM buffer memory. For a session with n partitions, this value should be at least n times the value for the session with one partition.
·  Set cached values for Sequence Generator. For a session with n partitions, there should be no need to use the number of cached values property of the Sequence Generator transformation. If you must set this value to a value greater than zero, make sure it is at least n times the original value for the session with one partition.
·  Partition the source data evenly. Configure each partition to extract the same number of rows. Or redistribute the data among partitions early using a partition point with round-robin.  This is actually a good way to prevent hammering of the source system. You could have a session with multiple partitions where one partition returns all the data and the override SQL in the other partitions is set to return zero rows (where 1 = 2 in the where clause prevents any rows being returned). Some source systems react better to multiple concurrent SQL queries; others prefer smaller numbers of queries.
·  Monitor the system while running the session. If there are CPU cycles available (twenty percent or more idle time), then performance may improve for this session by adding a partition.
·  Monitor the system after adding a partition. If the CPU utilization does not go up, the wait for I/O time goes up, or the total data transformation rate goes down, then there is probably a hardware or software bottleneck. If the wait for I/O time goes up a significant amount, then check the system for hardware bottlenecks. Otherwise, check the database configuration.
·  Tune databases and system. Make sure that your databases are tuned properly for parallel ETL and that your system has no bottlenecks.
Increasing the Target Commit Interval

One method of resolving target database bottlenecks is to increase the commit interval. Each time the target database commits, performance slows. If you increase the commit interval, the number of times the PowerCenter Server commits decreases and performance may improve.
When increasing the commit interval at the session level, you must remember to increase the size of the database rollback segments to accommodate the larger number of rows. One of the major reasons that Informatica set the default commit interval to 10,000 is to accommodate the default rollback segment / extent size of most databases. If you increase both the commit interval and the database rollback segments, you should see an increase in performance. In some cases though, just increasing the commit interval without making the appropriate database changes may cause the session to fail part way through (i.e., you may get a database error like "unable to extend rollback segments" in Oracle).

Disabling High Precision

If a session runs with high precision enabled, disabling high precision may improve session performance.
The Decimal datatype is a numeric datatype with a maximum precision of 28. To use a high-precision Decimal datatype in a session, you must configure it so that the PowerCenter Server recognizes this datatype by selecting Enable High Precision in the session property sheet. However, since reading and manipulating a high-precision datatype (i.e., those with a precision of greater than 28) can slow the PowerCenter Server down, session performance may be improved by disabling decimal arithmetic. When you disable high precision, the PowerCenter Server reverts to using a dataype of Double.

Reducing Error Tracking

If a session contains a large number of transformation errors, you may be able to improve performance by reducing the amount of data the PowerCenter Server writes to the session log.
To reduce the amount of time spent writing to the session log file, set the tracing level to Terse. At this tracing level, the PowerCenter Server does not write error messages or row-level information for reject data. However, if terse is not an acceptable level of detail, you may want to consider leaving the tracing level at Normal and focus your efforts on reducing the number of transformation errors. Note that the tracing level must be set to Normal in order to use the reject loading utility.
As an additional debug option (beyond the PowerCenter Debugger), you may set the tracing level to verbose initialization or verbose data.
·  Verbose initialization logs initialization details in addition to normal, names of index and data files used, and detailed transformation statistics.
·  Verbose data logs each row that passes into the mapping. It also notes where the PowerCenter Server truncates string data to fit the precision of a column and provides detailed transformation statistics. When you configure the tracing level to verbose data, the PowerCenter Server writes row data for all rows in a block when it processes a transformation.
However, the verbose initialization and verbose data logging options significantly affect the session performance. Do not use Verbose tracing options except when testing sessions. Always remember to switch tracing back to Normal after the testing is complete.
The session tracing level overrides any transformation-specific tracing levels within the mapping. Informatica does not recommend reducing error tracing as a long-term response to high levels of transformation errors. Because there are only a handful of reasons why transformation errors occur, it makes sense to fix and prevent any recurring transformation errors. PowerCenter uses the mapping tracing level when the session tracing level is set to none.

Pushdown Optimization

You can push transformation logic to the source or target database using pushdown optimization. The amount of work you can push to the database depends on the pushdown optimization configuration, the transformation logic, and the mapping and session configuration.
When you run a session configured for pushdown optimization, the Integration Service analyzes the mapping and writes one or more SQL statements based on the mapping transformation logic. The Integration Service analyzes the transformation logic, mapping, and session configuration to determine the transformation logic it can push to the database. At run time, the Integration Service executes any SQL statement generated against the source or target tables, and it processes any transformation logic that it cannot push to the database.
Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the messages related to Pushdown Optimization.
Source-Side Pushdown Optimization Sessions
In source-side pushdown optimization, the Integration Service analyzes the mapping from the source to the target until it reaches a downstream transformation that cannot be pushed to the database.
The Integration Service generates a SELECT statement based on the transformation logic up to the transformation it can push to the database. Integration Service pushes all transformation logic that is valid to push to the database by executing the generated SQL statement at run time. Then, it reads the results of this SQL statement and continues to run the session. Similarly it create the view for SQL override and then generate SELECT statement and runs the SELECT statement against this view. When the session completes, the Integration Service drops the view from the database.
Target-Side Pushdown Optimization Sessions
When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the mapping from the target to the source or until it reaches an upstream transformation it cannot push to the database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each transformation it can push to the database, starting with the first transformation in the pipeline it can push to the database. The Integration Service processes the transformation logic up to the point that it can push the transformation logic to the target database. Then, it executes the generated SQL.
Full Pushdown Optimization Sessions
To use full pushdown optimization, the source and target must be on the same database. When you run a session configured for full pushdown optimization, the Integration Service analyzes the mapping from source to target and analyze each transformation in the pipeline until it analyzes the target. It generates and executes the SQL on sources and targets,
When you run a session for full pushdown optimization, the database must run a long transaction if the session contains a large quantity of data. Consider the following database performance issues when you generate a long transaction:
·  A long transaction uses more database resources.
·  A long transaction locks the database for longer periods of time, and thereby reduces the database concurrency and increases the likelihood of deadlock.
·  A long transaction can increase the likelihood that an unexpected event may occur.
The Rank transformation cannot be pushed to the database. If you configure the session for full pushdown optimization, the Integration Service pushes the Source Qualifier transformation and the Aggregator transformation to the source. It pushes the Expression transformation and target to the target database, and it processes the Rank transformation. The Integration Service does not fail the session if it can  push only part of the transformation logic to the database and the session is configured for full optimization.

Using a Grid

You can use a grid to increase session and workflow performance. A grid is an alias assigned to a group of nodes that allows you to automate the distribution of workflows and sessions across nodes.
When you use a grid, the Integration Service distributes workflow tasks and session threads across multiple nodes. Running workflows and sessions on the nodes of a grid provides the following performance gains:
·  Balances the Integration Service workload.
·  Processes concurrent sessions faster.
·  Processes partitions faster.
When you run a session on a grid, you improve scalability and performance by distributing session threads to multiple DTM processes running on nodes in the grid.
To run a workflow or session on a grid, you assign resources to nodes, create and configure the grid, and configure the Integration Service to run on a grid.
Running a Session on Grid
When you run a session on a grid, the master service process runs the workflow and workflow tasks, including the Scheduler. Because it runs on the master service process node, the Scheduler uses the date and time for the master service process node to start scheduled workflows. The Load Balancer distributes Command tasks as it does when you run a workflow on a grid. In addition, when the Load Balancer dispatches a Session task, it distributes the session threads to separate DTM processes.
The master service process starts a temporary preparer DTM process that fetches the session and prepares it to run. After the preparer DTM process prepares the session, it acts as the master DTM process, which monitors the DTM processes running on other nodes.
The worker service processes start the worker DTM processes on other nodes. The worker DTM runs the session. Multiple worker DTM processes running on a node might be running multiple sessions or multiple partition groups from a single session depending on the session configuration.
For example, you run a workflow on a grid that contains one Session task and one Command task. You also configure the session to run on the grid.
When the Integration Service process runs the session on a grid, it performs the following tasks:
·  On Node 1, the master service process runs workflow tasks. It also starts a temporary preparer DTM process, which becomes the master DTM process. The Load Balancer dispatches the Command task and session threads to nodes in the grid.
·  On Node 2, the worker service process runs the Command task and starts the worker DTM processes that run the session threads.
·  On Node 3, the worker service process starts the worker DTM processes that run the session threads.
For information about configuring and managing a grid, refer to the PowerCenter Administrator Guide and to the best practice PowerCenter Enterprise Grid Option.
For information about how the DTM distributes session threads into partition groups, see "Running Workflows and Sessions on a Grid" in the Workflow Administration Guide.

Reference: Informatica.com