Wednesday, October 28, 2009

Database Links and PeopleSoft

I have seen Oracle database links used in conjuction with PeopleSoft at a number of customer sites over the years. I think some scenarios are examples where it is reasonable to use a database link, some are not.
In Oracle RDBMS, a database link (see also the Oracle Concepts Manual) specifies how a database server can create a session on another database in order to perform a remote operation. The remote database may be another Oracle database, or another type of database. The remote operation may be a query, or it may modify remote data (DML operation), in which case a two-phase commit mechanism ensures transaction integrity across the databases.

PeopleSoft does not use database links between databases because they are a database platform specific technology. Instead, the PeopleSoft methodology is to replicate data between databases with an application message. This works well with small data sets, and with larger data sets that change slowly. However, there are scenarios with very large data volumes where Application Messaging will struggle to keep up, and Oracle RDMBS specific replication technologies are an appropriate alternative.

Data can be replicated between databases with Materialised Views (formerly known as Snapshots). This is a SQL based technology. Materialised Views Logs on the source database track changes in the source table. An incremental or full refresh process is run on the target, usually as a scheduled job.

Oracle introduced an alternative replication technology in Oracle 9i called ‘Streams’. This uses supplementary redo logging on the source database which can then be applied to the target. Additional database server processes transmit, receive and apply this redo stream. This technology is aimed a large volume replication for data warehouses. I know of one site where several hundred database tables are replicated from PeopleSoft HR to a data warehouse.

I think the clearest way is to explain the good and bad use of links is by example.

At two different sites with HR and Finance, I have seen similar examples of both good and bad use of database links (one was on PeopleTools 7.0, before Application Messaging was available).
  • GL data needs to be sent from Payroll in the HR database to GL in the Financials database. The PeopleSoft delivered mechanism was to use generate and reload an interface file. This customer changed replaced the interface table in HR with a view that referenced a table with the same name and structure on the Financial system via a link. The payroll GL extract process now inserted directly into the table on Financials. I think this is a perfectly reasonable use of a database link. It was simple. It performed better than the interface file, and would certainly have been faster than application messaging. 
  • However, the same customer also used the employee expenses module in Financials. This required employee personal data. So they made the PS_PERSONAL_DATA table in Financials a view of the corresponding table in HR. The result was that every time somebody opened the expenses component in Financials, the application server process referenced the database link to HR. There is a limit on the maximum number of database links that a session can concurrently open (OPEN_LINKS) and that the whole database instance can concurrently open (OPEN_LINKS_PER_INSTANCE). They both default is 4, which gives an indication of how Oracle expect you to use this feature. This system ran out of database links. No Oracle errors were generated, instead sessions wait to be allowed to make the connection to the remote database. There are specific database link wait events that report the amount of time lost.  Eventually you reach the point where Tuxedo services timeout and then users receive errors. I think this is an example of the wrong way to use a database link. Instead I think that the data should have been replicated from HR to Financials. In a modern PeopleSoft system this should be done with an application message (on the 7.0 system a Materialised View could have been used). 
  • Not only were database links used to provide data to components, but the same views, that referenced remote objects, were used in queries and reports resulting in complex distributed queries. When multiple remote objects are referenced in a single SQL, I have seen Oracle decide to copy the whole of one or more of these objects to the local temporary tablespace. This does not result in good performance. Remember also, that even query only operations via a database link create a transaction on the remains until a commit or rollback is issued.
  • I worked on another site where Materialised Views were used to incrementally replicate data from CRM and HR to EPM databases. Processes in the EPM database then referenced the Materialised Views. In this system, there was never any question of EPM processes referencing objects via the links. The data volumes were such that Application Messaging would never have coped. In this case only the Materialised View refresh process references the database links, and so the DBA can manage the usage of links. This is a system where (after upgrade to Oracle 10g) Streams could also have been used to replicate the data.

My view is that database links should not be used to directly support functionality in either on-line components or reporting. When this is done in reports and queries it presents the problem of tuning distributed queries, and having to decide whether local or remote database should drive the query. Instead data should be replicated to the local database, preferably by PeopleSoft messaging, but if necessary by Oracle replication. However, I think that it can be reasonable to use a database link in an occasional batch process that moves data between systems.

Sunday, October 11, 2009

Aggregating & Purging Batch Timings

Application Engine can collect timing information for the programs being execution. These 'batch timings' can be written to log file and/or tables in the database. I always recommend that this is enabled in all environments. The runtime overhead is very low, and this data is extremely valuable to determine the performance of a system over a period of time, and to identify the pieces of SQL or PeopleCode code that account for the most time. The timing data collected for individual processes can be viewed directly within the Process Monitor component.

The Process Scheduler purge process does not delete batch timings, so this data remains in the database indefinitely, although it can no longer be accessed via the Process Monitor. Over time, on a busy system, a large volume of data can accumulate. In some ways this is a good thing. There are good reasons to purge the Process Scheduler as aggressively as the business will permit.  The batch timings can still be analysed by direct SQL query. However, the sheer volume of data is likely to result in queries that can take quite a while to execute. After a while, you are less likely to be interested in the performance of individual processes, but are more likely to want to aggregate the data. So, it makes sense to hold the data at least partly aggregated.

I have produced a very simple Application Engine program (GFC_TIM_ARCH) to address this problem. This program is available for download from the Go-Faster website as a PeopleTools Application Designer Project. Please note that this process has been written using Oracle RDBMS specific SQL syntax.

  • Application Engine stores batch timing data in three table PS_BAT_TIMINGS_LOG, PS_BAT_TIMINGS_FN and PS_BAT_TIMINGS_DTL.
  • I have created three new tables PS_GFC_TIMINGS_LOG, PS_GFC_TIMINGS_FN and PS_GFC_TIMINGS_DTL.
  • GFC_TIM_ARCH aggregates the data in each of these tables by the day on which the process begin, by the process name, and (where applicable) by the 'detail_id' column. The aggregated data is put into the GFC_TIMINGS% tables, the original data is removed from the BAT_TIMINGS% tables.
  • It finds the earliest three days for which timing data exists that is older than the longest Process Scheduler retention limit. The idea is that the process should be run daily (the delivered 'Daily Purge' recurrence is suitable), but if it doesn't run for some reason it will catch up the next day.
  • Only one instance of WMS_TIM_ARCH is permitted to run concurrently.

On one system, where I tested this process, BAT_TIMINGS_DTL was growing by over 1 million rows per day. This became around 5000 rows per day in GFC_TIMINGS_DTL.

If you have been running with batch timings for a while, then when you first introduce this process you will probably have a large backlog of data to be aggregated and purged. The easiest option is to run this process repeatedly until the data has been processed (possibly using the recurrence that causes a program to run every minute). After the backlog has been cleared the BAT_TIMINGS% tables should be rebuilt or shrunk in order to release the space left in the tables by the deleted rows. This will help queries that scan the BAT_TIMINGS_DTL record, otherwise these scans still need to include the empty rows because they are below the tables High Water Mark.

Once the backlog has been cleared, the GFC_TIM_ARCH process can run daily and, not withstanding variations in the load on the system, the rolling volume of data retained in the BAT_TIMINGS% tables should be fairly constant, and it should not be necessary to rebuild them frequently. Space freed by the daily delete should be used by new rows that are inserted into the table as AE processes run.