Managing LogMiner Performance

For some applications, the LogMiner can become a bottleneck.  This can happen for a number of reasons: for instance, the CPUs may be saturated so that the LogMiner cannot get sufficient CPU resources.  Most frequently, though, the issue is disk performance.

Some applications use particularly large transactions and the LogMiner must spill information to work files on disk.  Given that many concurrent transactions are interleaved in the AIJ, this can lead to a number of such spill files.

The disk used for these work files may have other serious I/O demands on it, and I/Os are then relatively slower because of queuing.  We have seen applications where 30 or more active databases are being replicated by the Loader and all 30 of the Loader families are using the same device for work files.  If the databases are busy, that adds up to a lot of contention.

This can be addressed by controlling the location of the work files used by the LogMiner.  This is done by defining the logical name RDM$BIND_AIJ_WORK_FILE to point at the disk and directory that the LogMiner should use for its work files.  The definition would be included in the DCL procedure that is used to run the CLML job.  It is probably more efficient to make the definition with $ define/job, as the logical name is then not propagated in the process logical name table to sub-processes.

If you do not specify otherwise, the LogMiner will use the directory pointed to by SYS$SCRATCH.
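
As a minimal sketch, the fragment of the CLML DCL procedure that sets the work file location might look like the following; the device and directory names are purely illustrative:

    $ ! Place the LogMiner work (spill) files on a lightly loaded device.
    $ ! DKA300:[CLML_WORK] is an example only -- use a device with spare I/O capacity.
    $ DEFINE/JOB RDM$BIND_AIJ_WORK_FILE DKA300:[CLML_WORK]
    $ ! ... the rest of the procedure that runs the CLML job follows ...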

On a side note, we have observed at times that the LogMiner can abandon work files.  You might want to take a look in the directory specified by the logical name and see if there are obsolete files there.
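
For example, assuming RDM$BIND_AIJ_WORK_FILE has been defined as above, something like the following lists what has accumulated there.  Which files are truly obsolete is installation-specific, so check the dates and make sure the LogMiner processes that created them are no longer running before deleting anything:

    $ ! List the work file directory by date and size.
    $ DIRECTORY/DATE=MODIFIED/SIZE RDM$BIND_AIJ_WORK_FILE:*.*;*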

Sorting can be another performance restriction.  The LogMiner sorts data to ensure that rows which are updated multiple times in a transaction appear only once in the output file.  A row may appear multiple times because it was flushed to the AIJ more than once, either because of page contention or because the buffer pool became full.  Sorting large files of records has all the familiar costs of sorting -- CPU and disk I/O.

There are two approaches here.  The first is to sort in memory whenever possible; the second is to sort on disk.

Sorting in memory is done using a quicksort algorithm.  It requires virtual memory, and that really translates to physical memory in the working set.  It would be a very bad idea to have the LogMiner process page or swap.
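
If paging is observed, one thing worth checking is the working set available to the process that runs the CLML job.  As a hedged sketch, and assuming the account's authorized quotas allow it, the DCL procedure could raise its own working set limits before starting the LogMiner; the values shown are illustrative only:

    $ ! Illustrative pagelet counts only -- size these for your system and UAF quotas.
    $ SET WORKING_SET/QUOTA=65536/EXTENT=262144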

The amount of virtual memory to be used by the LogMiner can be controlled by manipulating the logical name JCC_ADD_CLM_QUICK_SORT.  This logical name sets the maximum number of records that the LogMiner will sort in memory.  Take a look at the documentation section on LogMiner Quick Sort for a more complete explanation of what the possibilities and limits are here. 
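
For example, the CLML DCL procedure could include a definition such as the following; the record count of 500000 is purely illustrative, and the Quick Sort section describes the actual limits:

    $ ! Maximum number of records to sort in memory (illustrative value).
    $ DEFINE/JOB JCC_ADD_CLM_QUICK_SORT 500000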

Disk-based sorting with two files works by first forming a run of sorted records and writing it to disk.  The sorting agent then forms another block of records in sorted order, reads from the first disk file, merges it with the in-memory block, and writes the result to a second disk file.  It then forms a third set of records, reads the second disk file to merge it with the in-memory set, and writes to the first file.  This process continues until there is only one run of sorted records.

If a third sort-work file is added, the process will form two groups of sorted records and write them to the first and second files.  It then forms the in-memory group and merges all three sorted sets into the third file.

Using a third file can significantly reduce the number of records that are written to and read from disk and is a good way to reduce the overall amount of work.  In fact, for larger sorts, the more files used the better.

The logical name JCC_ADD_CLM_SORTWORK_FILES, defined in the DCL procedure that runs the CLML program, can be used to alter the number of sortwork files that the LogMiner can use.  The usual SORTWORKn logical names are used to place the sortwork files on different devices if you have discrete disks, or even on the same disk and directory if you have a SAN disk group with a large number of volumes.
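
As an illustration, assuming four sortwork files spread across separate devices (the device and directory names are examples only), the DCL procedure might contain:

    $ ! Number of sortwork files and their locations -- all names here are illustrative.
    $ DEFINE/JOB JCC_ADD_CLM_SORTWORK_FILES 4
    $ DEFINE/JOB SORTWORK0 DKA100:[SORTWORK]
    $ DEFINE/JOB SORTWORK1 DKA200:[SORTWORK]
    $ DEFINE/JOB SORTWORK2 DKA300:[SORTWORK]
    $ DEFINE/JOB SORTWORK3 DKA400:[SORTWORK]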

A tiny optimization may be gained by using just a device name as the translation value of the SORTWORKn logical names.  The associated files will then not be entered into a directory, saving the overhead of managing directory entries.  If a lot of sorts are being done, this can add up to an appreciable saving.  There is a downside: sometimes files processed this way can become "lost" and disk space simply disappears.  You can use the VMS command $ analyze/disk/repair to cause such lost files to be entered into the [SYSLOST] directory on the volume.  They may then be deleted.
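
For example, a device-only definition and the later cleanup might look like the following; the device name is illustrative, and the delete should only be done after confirming the files in [SYSLOST] really are abandoned:

    $ ! Device-only translation: no directory entries are created for this sortwork file.
    $ DEFINE/JOB SORTWORK0 DKA100:
    $ ! Later, if disk space has gone missing, recover lost files into [SYSLOST] ...
    $ ANALYZE/DISK_STRUCTURE/REPAIR DKA100:
    $ ! ... and remove them once you have confirmed they are abandoned.
    $ DELETE DKA100:[SYSLOST]*.*;*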