Saturday, October 27, 2007
I blithely responded that it was because if you put five characters into the search criteria of a five character field on a Record Locator Dialogue, then the resulting SQL would result in an equality condition rather than a LIKE.
If I search for a string shorter than the field, it generates a LIKE:
If you search for a string equal in length to the field you get an equality (unless you specify something else in advanced search).
That is how it was in PeopleTools 8.44, or at least that is what I wrote in my book (page 94).
But things change. I checking the behaviour in PeopleTools 8.48, and this is what I found:
Lets start with the Basic Search mode in the Record Locator Dialogue
The default operation in the basic search is 'begins with'. The search string is shorter than the field (11 characters in this case) and so we get a LIKE criteria, but there is a change here. The % has been put into the search string.
However, if I fill the search field with 11 characters I still get a LIKE, and most significantly I no longer get an equality condition.
However, in Advanced Search, I get exactly what I ask for.
'Begins with' always results in a LIKE condition, irrespective of the length of the search string.
'Contains' always produces a LIKE condition, with % symbols at both ends, again irrespective of the length of the search string.
If I specify '=', then I get an equality condition in the SQL, but the search criteria also becomes a bind variable in the SQL. This is different from all other cases, where the search criteria appears in the SQL as a literal value.
Only on a mixed case field, does PeopleSoft append the % in a separate string, because both the column and the search string are forced into upper case.
So does it matter that PeopleTools uses a LIKE instead of an equality? Sometimes it could, but not always. Here is a very simple test where I search for a single employee by the unique key columns, first with an equality
And here is the same search with a LIKE criteria
In this case, although both queries return the same result, it made the difference between a index unique scan and a single fetch operation, and an index range scan and a second fetch, requiring another consistent read. Although, I can also produce tests which produce the same execution plan and number of consistent reads for = and LIKE.
The LIKE operation does cause an additional filter operation, but this is only a small CPU overhead.
However, this does mean that the advice about having 5 character SETIDs and Business Units is no longer valid.
Tuesday, October 02, 2007
Lets start with a quick review of database isolation levels. If you run PeopleSoft on an Oracle database, you hardly ever have to worry about isolation levels, because Oracle defaults to Read Committed, where all transactions are transactionally or read consistent, and that doesn't bring any scalability problems.
(In writing this note, I have spent a fair amount of time reading various Oracle and SQL Server websites. It has been excellent revision!)
What is read consistency? Well, if I start a long-running query at say 09.00, and then I update some data at 09.01 and commit that update, and then at 09.02 the query comes to the data that was changed at 09.01, it will report the data as at 09.00. So the set of data that is retrieved by the long running query is consistent with itself.
On Oracle, this is done by keeping information to reverse the change, essentially the old data values, in the undo (or rollback) segments from which it can construct, in memory, a read consistent version of data blocks as the start of the transaction. So at 9.02, Oracle creates a copy of the block in the buffer cache and uses the undo information to roll it back to the state it was in when the query began at 9.00, and then it reads the data from that read consistent copy.
Thus, Oracle is capable of maintaining read consistency without locking, so that writers never block readers, and readers never block writers.
Oracle also supports two other isolation levels:
- Serializable: queries cannot see changes committed after the transaction began, but before the current query began.
- Read-Only: as Serializable but no INSERT, UPDATE, or DELETE statements are permitted.
However, the other database platforms that are certified for PeopleSoft have additional isolation levels that can be set at database and session level (and hence in the application). In SQL Server the isolation modes are:
- Read Uncommitted (also known as Dirty Read): Queries simply read whatever is in the data blocks, so it is possible to read changes in the database that have not been committed and that may potentially be rolled back!
- Read Committed (SQL Server default): Uncommitted changes cannot be read, queries can be blocked by shared locks created by updates unless using Read Committed Snapshot, or other transaction in higher isolation levels.
- Read Committed Snapshot (available SQL Server 2005): row versioning is used to produce a transactionally consistent snapshot of the data. This is effectively equivalent to Read Committed on Oracle.
- Repeatable Read: Locks are placed on all data that is read.
- Serializable: Locks are placed on all data that is read, and rows cannot be inserted ranges of data queried by other statements.
- Snapshot: Changes made after the start of the transaction cannot be seen.
Dirty Reads in PeopleTools 8.49
Dirty-read mode can be specified for the application server:
It can also be set for scheduled queries run by Application Engine in the Process Scheduler:
This is what Enterprise PeopleTools 8.49 PeopleBook: System and Server Administration says:
This parameter can be used for general reporting and PeopleSoft Query. You can run dirty read queries through the application server, the Process Scheduler, and in a two-tier connection. The Use Dirty-Read setting in the application server configuration controls the behavior of PSAPPSRV, PSQCKSRV, and PSQRYSRV.Note. Dirty reads are not recommended if you are reading data and doing subsequent processing based on the disposition of the data at the time that it is read. Between the time the data is read by a subsequent process and the time the unit of work is completed by the first process, any activity affecting the table data at the time a subsequent process read could be rolled back, invalidating the accuracy of the data that a subsequent process read."
However, the parameter only appears in the query server section (PSQRYSRV) of the application server configuration file (psappsrv.cfg), so I suspect that it only applies to PS/Query queries (and therefore Crystal and BI Publisher reports) and nVision reports. Although, I haven't yet found confirmation of that in PeopleBooks or other documentation.
The delivered comment in the configuration file says that dirty-reads apply to DB2/OS390 and SQL Server, but the documentation doesn't mention which databases it applies to. Dirty-reads do not occur in Oracle, but this isolation level is also available on Sybase and Informix.
You can get around the locking problems in SQL Server 2000 and DB2, but at a price. Your reports may not show you entirely accurate data! But, if you are running PeopleTools 8.49 on SQL Server 2000, I would question why you are not running SQL Server 2005.
Monday, October 01, 2007
The descending key attribute has another purpose. Prior to PeopleTools 8.14, if you specified a key field as descending in Application Designer, that column would also be indexed in descending order where it appeared in indexes. However, PeopleSoft removed this functionality because of Oracle Bug 869177 which in most versions of Oracle 8.1 could cause ORA-3113 when accessing the index.
Now that Oracle has fixed the bug in the database, PeopleTools once again builds indexes with descending columns on Oracle RDBMS from PeopleTools 8.48.
However, as always, there is a snag (or I wouldn't have bothered to write this note). The name of the descending column is not reported in catalogue views. Instead the column name is replaced with a system generated column name.
This could cause problems with SQL scripts that use the catalogue views. For example, I have had to change my DDL Trigger to protect database objects not managed by Application Designer.
So how do you find the name of the underlying column? You have to look in the sys owned catalogue tables. If you look at the query in $ORACLE_HOME/rdbms/admin/catalog.sql at the code behind the view, you find that the column names are stored on sys.col$. There is another column on that table called DEFAULT$, which stores the original column name.
My thanks to Noons for pointing out that you can get the underlying column or expression from DBA_IND_EXPRESSIONS.
I recently spent a day on a site (not a PeopleSoft system) watching a single update statement that eventually ran for 27 hours. I used this view to predict when the SQL statement would complete.
You can see incomplete long running operations by querying long operations where the work done so far is less than the total amount of work.
The operations remain in this view long after they have completed, so you can use it to get a feel for what long operations occur on a sytem. But there are a fixed number of slots for v$session_longops, determined by ksulomux_. The entry for a completed operation remains in the view until the slots is reused. They are not reused in order of allocation, instead the array element to be reused is "probably determined by a MOD function" (My thanks to Tanel Poder for his clarification). So v$session_longops contains both current incomplete long operations and an arbitrary sample of completed long operations.
You can use the SQL_ADDRESS and SQL_HASH_VALUE to get the SQL statement from the library cache.
Aggregate total amount of time spent on SQL statements. But there is a problem, SQL statements can produce multiple long operations that may or may not occur concurrently.
But what if the statement isn’t in the library cache we can try and get it out of statspack! This SQL function looks up the SQL statement by hash value and assembles the pieces up to 4000 characters
This version gets the SQL statement from library cache, and if it isn't there it looks it up from the statspack data.
This is some sample output. This particular statement has been submitted by the publication servers 25 times with a cumulative execution time of 878 seconds.