Showing posts with label Tuxedo. Show all posts
Showing posts with label Tuxedo. Show all posts

Thursday, January 25, 2024

Reducing the Operating System Priority of PeopleSoft Processes

PeopleSoft for the Oracle DBA

I wrote about controlling the operating system priority of processes in PeopleSoft Tuxedo domains in Chapters 13 of 14 of PeopleSoft for the Oracle DBA, but I think it is worth a note here.

On Linux and Unix systems, the nice command can be used to lower the operating system scheduling priority of a process (or a privileged can increase the priority). When a server has no free CPU, processes with a lower priority get less time on the CPU. However, when there is free CPU available, the scheduling priority does not affect the amount of CPU that the process can utilise. 

On Unix, the priority of a Tuxedo server process can be adjusted using the -n server command line option in the configuration. The parameters to this option are simply passed through to the nice(2) function. Hence, this option does not work on Windows.

PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n 4 -sInitiateRequest -- -C psprcs.cfg -CD HR88 -PS PSUNX -A start -S PSPRCSRV"
The operating system priority of a process is inherited from its parent. Therefore, lowering the priority of the Process Scheduler running under Tuxedo will also lower the priority of the batch processes that it spawns. 
  • Therefore Stand-alone Application Engine processes (psae) and Cobol processes inherit the priority of the process scheduler server process (PSPRCSRV).
  • However, if the Application Engine server process (PSAESRV) is used, its priority can be set directly. 
There are some potential uses for this approach.
  • If the process scheduler is co-resident with the application server, then it could be run at a lower priority to ensure the online users get preferential allocation of CPU, and that online performance does not suffer excessively at the hands of the batch.
  • A system might have two websites: one for self-service and the other for the 'back-office' users. You could configure separate application servers for each site, and run the self-service application server is run at a lower priority. 

In PeopleSoft, I prefer to create additional variables in the configuration file (psprcs.cfg).

[Process Scheduler]
;=========================================================================
; General settings for the Process Scheduler
;=========================================================================
PrcsServerName=PSUNX
;-------------------------------------------------------------------------
;Reduce priority of Process Scheduler server process, set to 0 if not needed
Niceness=4
...
From PeopleTools 8.4, the Application Engine server process is configured by default. The priority of the AE server processes can then be controlled independently of the process scheduler by creating a separate variable in the PSAESRV section of the configuration file.  However, it is generally better to use standalone PSAE, unless you have many short-lived application engine processes, as in CRM (see Application Engine in Process Scheduler: PSAESRV Server Process -v- Standalone PSAE executable).   
[PSAESRV]
;=========================================================================
; Settings for Application Engine Tuxedo Server
;=========================================================================
;-------------------------------------------------------------------------
;Reduce priority of application engine server process, set to 0 if not needed
Niceness=5
...
In this example, I have reduced the priorities of both the process scheduler and AE servers, but the process scheduler is left with a higher priority than the AE servers. The new variables can then be referenced Tuxedo template file (psprcsrv.ubx).
{APPENG}
#
# PeopleSoft Application Engine Server
#
PSAESRV         SRVGRP=AESRV
                SRVID=1
                MIN={$PSAESRV\Max Instances}
                MAX={$PSAESRV\Max Instances}
                REPLYQ=Y
                CLOPT="-n {$PSAESRV\Niceness} -- -C {CFGFILE} -CD {$Startup\DBName} -S PSAESRV"
{APPENG}
...
PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n {$Process Scheduler\Niceness} -sInitiateRequest -- -C {CFGFILE} -CD {$Startup\DBName} -PS {$Process Scheduler\PrcsServerName} -A start -S PSPRCSRV"
When the domain is configured in psadmin, the variables are resolved in the Tuxedo configuration file (psprcsrv.ubb).  The -n option can be seen in the server command-line options (CLOPT).
#
# PeopleSoft Application Engine Server
#
PSAESRV         SRVGRP=AESRV
                SRVID=1
                MIN=1
                MAX=1
                REPLYQ=Y
                CLOPT="-n 5 -- -C psprcs.cfg -CD HR88 -S PSAESRV"
...
PSPRCSRV        SRVGRP=BASE
                SRVID=101
                MIN=1
                MAX=1
                RQADDR="SCHEDQ"
                REPLYQ=Y
                CLOPT="-n 4 -sInitiateRequest -- -C psprcs.cfg -CD HR88 -PS PSUNX -A start -S PSPRCSRV"

Thursday, July 07, 2016

Setting Environment Variables in Application Server/Process Scheduler Tuxedo Domains

The topic of how to manage environment variables was mentioned recently on the PeopleSoft Administrator Podcast
Recently I have built a pair of PeopleSoft environments for a proof-of-concept and have faced into exactly this problem.  I have two PS_APP_HOMEs (application homes) share the same PS_HOME (PeopleTools home).  The environment variables need to be correct before I open psadmin to administer the application server and process scheduler domains.
On Unix, I would usually recommend running domains for different PS_APP_HOMEs in different Unix user accounts.  Thus environmental variables can be set up in the .profile or .bashrc scripts.  The processes then run as different Unix users which makes them easier to monitor.  The Unix users should be in the same Unix group and have group permissions to the PS_HOME directory.  This approach is not possible on Windows.
The alternative is to have a scripts or a menu that sets up those variables before you enter psadmin.
I noticed that there is a new menu in the domain configuration menu in psadmin which permits you to set environmental variables that are then built into the Tuxedo domain configuration.  In fact this has always been possible by editing the psappsrv.ubx and psprcs.ubx files directly to cause variables to be created in the psappsrv.env and psprcs.env files, but now you just have to enter the variables through the menu in psadmin.
When we start psadmin we can see the relevant environmental parameters.   PS_APP_HOME points to my HR installation.
PSADMIN -- PeopleTools Release: 8.54.16
Copyright (c) 1996, 2014, Oracle. All rights reserved.

--------------------------------
PeopleSoft Server Administration
--------------------------------

  PS_CFG_HOME            /home/psadm1/psft/pt/8.54
  PS_HOME                /opt/oracle/psft/pt/tools
  PS_APP_HOME            /opt/oracle/psft/pt/hr91
  PS_CUST_HOME           /opt/oracle/psft/pt/hr91/pscust

  1) Application Server
…
But what if I have a Financials installation under the same PS_APP_HOME?  Option 16 of the configuration menu lets me define environmental settings.  Something very similar happens in the Process Scheduler configuration.
----------------------------------------------
Quick-configure menu -- domain: RASC3K
----------------------------------------------
      Features                      Settings
     ==========                    ==========
  1) Pub/Sub Servers   : Yes   17) DBNAME     :[XXXXXX]
…
      Actions
     =========
 14) Load config as shown
 15) Custom configuration
 16) Edit environment settings
  h) Help for this menu
  q) Return to previous menu

HINT: Enter 17 to edit DBNAME, then 14 to load
So, I have added PS_APP_HOME, PS_CUST_HOME and INTERFACE_HOME and they have become a part of the configuration of the Tuxedo domain.
--------------------------------------
PeopleSoft Domain Environment Settings
--------------------------------------
     Domain Name: RASC3K

     TEMP                     :[{LOGDIR}{FS}tmp]
     TMP                      :[{LOGDIR}{FS}tmp]
     TM_BOOTTIMEOUT           :[120]
     TM_RESTARTSRVTIMEOUT     :[120]
     TM_BOOTPRESUMEDFAIL      :[Y]
     FLDTBLDIR32              :[{$TUXDIR}{FS}udataobj]
     FIELDTBLS32              :[jrep.f32,tpadm]
     ALOGPFX                  :[{LOGDIR}{FS}TUXACCESSLOG]
     INFORMIXSERVER           :[{$Startup\ServerName}]
     COBPATH                  :[{$PS_APP_HOME}/cblbin:{$PS_HOME}/cblbin]
     PATH                     :[{$PATH}:{$Domain Settings\Add to PATH}]
     PS_APP_HOME              :[/opt/oracle/psft/pt/fin91]
     PS_CUST_HOME             :[/opt/oracle/psft/pt/fin91/pscust]
     INTERFACE_HOME           :[/opt/oracle/psft/pt/fin91/pscust/interfaces]

  1) Edit environment variable
  2) Add environment variable
  3) Remove environment variable
  4) Comment / uncomment environment variable
  5) Show resolved environment variables
  6) Save
  h) Help for this menu
  q) Return to previous menu

Command to execute (1-6, h or q) :
What is going on here?  These variables have been added to the PS_ENVFILE section of psappsrv.ubx.
# ----------------------------------------------------------------------
*PS_ENVFILE
TEMP={LOGDIR}{FS}tmp
TMP={LOGDIR}{FS}tmp
TM_BOOTTIMEOUT=120
TM_RESTARTSRVTIMEOUT=120
TM_BOOTPRESUMEDFAIL=Y
FLDTBLDIR32={$TUXDIR}{FS}udataobj
FIELDTBLS32=jrep.f32,tpadm
ALOGPFX={LOGDIR}{FS}TUXACCESSLOG
{WINDOWS}
COBPATH={$PS_HOME}\CBLBIN%PS_COBOLTYPE%
INFORMIXSERVER={$Startup\ServerName}
# Set IPC_EXIT_PROCESS=1 to use ExitProcess to terminate server process.
# Set IPC_TERMINATE_PROCESS=1 to use TerminateProcess to terminate server process.
# If both are set, TerminateProcess will be used to terminate server process.
#IPC_EXIT_PROCESS=1
IPC_TERMINATE_PROCESS=1
PATH={$PS_HOME}\verity\{VERITY_OS}\{VERITY_PLATFORM}\bin;{$PATH};{$Domain Settings\Add to PATH}
{WINDOWS}
{UNIX}
INFORMIXSERVER={$Startup\ServerName}
COBPATH={$PS_APP_HOME}/cblbin:{$PS_HOME}/cblbin
PATH={$PATH}:{$Domain Settings\Add to PATH}
{UNIX}
PS_APP_HOME=/opt/oracle/psft/pt/fin91
PS_CUST_HOME=/opt/oracle/psft/pt/fin91/pscust
INTERFACE_HOME=/opt/oracle/psft/pt/fin91/pscust/interfaces
They then appear in the psappsrv.env file that is generated when the domain is configured.  This file contains fully resolved values of environmental variables that are set by every tuxedo application server process when it starts.
TEMP=/home/psadm1/psft/pt/8.54/appserv/XXXXXX/LOGS/tmp
TMP=/home/psadm1/psft/pt/8.54/appserv/XXXXXX/LOGS/tmp
TM_BOOTTIMEOUT=120
TM_RESTARTSRVTIMEOUT=120
TM_BOOTPRESUMEDFAIL=Y
FLDTBLDIR32=/opt/oracle/psft/pt/bea/tuxedo/udataobj
FIELDTBLS32=jrep.f32,tpadm
ALOGPFX=/home/psadm1/psft/pt/8.54/appserv/XXXXXX/LOGS/TUXACCESSLOG
# Set IPC_EXIT_PROCESS=1 to use ExitProcess to terminate server process.
# Set IPC_TERMINATE_PROCESS=1 to use TerminateProcess to terminate server process.
# If both are set, TerminateProcess will be used to terminate server process.
#IPC_EXIT_PROCESS=1
IPC_TERMINATE_PROCESS=1
IPC_TERMINATE_PROCESS=1
INFORMIXSERVER=
COBPATH=/opt/oracle/psft/pt/fin91/cblbin:/opt/oracle/psft/pt/tools/cblbin
PATH=/opt/oracle/psft/pt/fin91/bin:/opt/oracle/psft/pt/fin91/bin/interfacedrivers:/opt/oracle/psft/pt/tools/jre/bin:/opt/oracle/psft/pt/tools/appserv:/opt/oracle/psft/pt/tools/setup:/opt/oracle/psft/pt/bea/tuxedo/bin:.:/opt/oracle/psft/pt/oracle-client/12.1.0.1/bin:/opt/oracle/psft/pt/oracle-client/12.1.0.1/OPatch:/opt/oracle/psft/pt/oracle-client/12.1.0.1/perl/bin:/opt/mf/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/opt/oracle/psft/pt/tools/bin:/opt/oracle/psft/pt/tools/bin/sqr/ORA/bin:/opt/oracle/psft/pt/tools/verity/linux/_ilnx21/bin:/home/psadm1/bin::.
PS_APP_HOME=/opt/oracle/psft/pt/fin91
PS_CUST_HOME=/opt/oracle/psft/pt/fin91/pscust
INTERFACE_HOME=/opt/oracle/psft/pt/fin91/pscust/interfaces

Warning

I found that I had to specify fully resolved paths for the variables I defined.  I do try setting variables in terms of other variables,
     PS_APP_HOME              :[/opt/oracle/psft/pt/fin91]
     PS_CUST_HOME             :[{$PS_APP_HOME}{FS}pscust]
     INTERFACE_HOME           :[{$PS_CUST_HOME}{FS}interfaces]
but I started to get errors.
==============ERROR!================
Value for PS_CUST_HOME: {$PS_APP_HOME}{FS}pscust, is invalid.  Your environment
may not work as expected.
==============ERROR!================
And some variables were not fully resolved in the psappsrv.env file.
PS_APP_HOME=/opt/oracle/psft/pt/fin91
PS_CUST_HOME=/opt/oracle/psft/pt/fin91/pscust
INTERFACE_HOME={$PS_APP_HOME}{FS}pscust/interfaces

Configuration Settings in the .ubx –v- .cfg

My only reservation is that there is now environment specific configuration in the psappsrv.ubx file, rather than the psappsrv.cfg file.   When I have done this in the past I would create additional variables in psappsrv.cfg that were referenced from the psappsrv.ubx file.  Thus the psappsrv.ubx was consistent across environments, and all the configuration is in the main configuration file psappsrv.cfg.
Although, you can add additional variables in psappsrv.cfg, thus
[Domain Settings]
;=========================================================================
; General settings for this Application Server.
;=========================================================================
…
Application Home=/opt/oracle/psft/pt/fin91
Custom Directory=pscust
Interface Directory=pscust/interfaces
and then reference them in the variables, and they will resolve correctly in the psappsrv.env.
PS_APP_HOME={$Domain Settings\Application Home}
PS_CUST_HOME={$Domain Settings\PS_APP_HOME}{FS}{$Domain Settings\Custom Directory}
INTERFACE_HOME={$Domain Settings\PS_APP_HOME}{FS}{$Domain Settings\Interface Directory}
You may experience errors in psadmin
==============ERROR!================
Value for PS_APP_HOME: {$Domain Settings\Application Home}, is invalid.  Your
environment may not work as expected.
==============ERROR!================

==============ERROR!================
Value for PS_CUST_HOME: {$Domain Settings\Application Home}{FS}{$Domain
Settings\Custom Directory}, is invalid.  Your environment may not work as
expected.
==============ERROR!================

Conclusion

Using this technique, it does not matter how environment variables are set when you go into psadmin to start the application server, the correct setting is defined in the Tuxedo domain and overrides that.
You have always been able to do this in Tuxedo, but you would have had to edit the psappsrv.ubx file yourself, now the menu allows you to administer this.
There is no way to view the psappsrv.ubx and psappsrv.env files from within psadmin, only psappsrv.cfg can be opened.  If you want to check your settings have reached the psappsrv.env file, you will need to leave psadmin and look in the files for yourself.

Friday, September 04, 2015

Measuring Tuxedo Queuing in the PeopleSoft Application Server

Why Should I Care About Queuing?

Queuing in the application server is usually an indicator of a performance problem, rather than a problem in its own right.  Requests will back up on the inbound queue because the application server cannot process them as fast as they arrive.  This is usually seen on the APPQ which is serviced by the PSAPPSRV process, but applies to other server processes too.  Common causes include (but are not limited to):
  • Poor performance of either SQL on the database or PeopleCode executed within the application server is extending service duration
  • The application server domain is undersized for the load.  Increasing the number of application server domains or application server process may be appropriate.  However, before increasing the number of server process it is necessary to ensure that the physical server has sufficient memory and CPU to support the domain (if the application server CPU is overloaded then requests move from the Tuxedo queues to the operating system run queue).
  • The application server has too many server processes per queue causing contention in the systems calls that enqueue and dequeue requests to and from IPC queue structure.  A queue with more than 8-10 application server processes can exhibit this contention.  There will be a queue of inbound requests, but not all the server processes will be non-idle.
When user service requests spend time queuing in the application server, that time is part of the users' response time.  Application server queuing is generally to be avoided (although it may be the least worst alternative). 
What you do about queuing depends on the circumstances, but it is something that you do want to know about.

3 Ways to Measure Application Server Queuing

There are several ways to detect queuing in Tuxedo
  • Direct measurement of the Tuxedo domain using the tmadmin command-line interface.  A long time ago I wrote a shell script tuxmon.sh.  It periodically runs the printqueue and printserver commands on an application server and extracts comma separated data to a flat that can then be loaded into a database.  It would have to be configured for each domain in a system.
  • Direct Measurement with PeopleSoft Performance Monitor (PPM).  Events 301 and 302 simulate the printqueue and printserver commands.  However, event 301 only works from PT8.54 (and at the time of writing I am working on a PT8.53 system).  Even then, the measurements would only be taken once per event cycle, which defaults to every 5 minutes.  I wouldn't recommend increasing the sample frequency, so this will only ever be quite a coarse measurement.
  • Indirect Measurement from sampled PPM transactions.  Although includes time spent on the return queue and to unpack the Tuxedo message.  This technique is what the rest of this article is about.

Indirectly Measuring Application Server Queuing from Transactional Data

Every PIA and Portal request includes a Jolt call made by the PeopleSoft servlet to the domain.  The Jolt call is instrumented in PPM as transaction 115.  Various layers in the application server are instrumented in PPM, and the highest point is transaction 400 which where the service enters the PeopleSoft application server code.  Transaction 400 is always the immediate child of transaction 115.  The difference in the duration of these transactions is the duration of the following operations:
  • Transmit the message across the network from the web server to the JSH.  There is a persistent TCP socket connection.
  • To enqueue the message on the APPQ queue (including writing the message to disk if it cannot fit on the queue).
  •  Time spent in the queue
  • To dequeue the message from the queue (including reading the message back from disk it was written there).
  • To unpack the Tuxedo message and pass the information to the service function
  • And then repeat the process for the return message back to the web server via the JSH queue (which is not shown  in tmadmin)
I am going make an assumption that the majority of the time is spent by message waiting in the inbound queue and that time spent on the other activities is negligible.  This is not strictly true, but is good enough for practical purposes.  Any error means that I will tend to overestimate queuing.
Some simple arithmetic can convert this duration into an average queue length. A queue length of n means that n requests are waiting in the queue.  Each second there are n seconds of queue time.  So the number of seconds per second of queue time is the same as the queue length. 
I can take all the sampled transactions in a given time period and aggregate the time spent between transactions 115 and 400.  I must multiply it by the sampling ratio, and then divide it by the duration of the time period for which I am aggregating it.  That gives me the average queue length for that period.
This query aggregates queue time across all application server domains in each system.  It would be easy to examine a specific application server, web server or time period.
REM https://blog.psftdba.com/2015/09/measuring-tuxedo-queuing-in-peoplesoft.html
WITH c AS (
SELECT B.DBNAME, b.pm_sampling_rate
,      TRUNC(c115.pm_agent_Strt_dttm,'mi') pm_agent_dttm
,      A115.PM_DOMAIN_NAME web_domain_name
,      SUBSTR(A400.PM_HOST_PORT,1,INSTR(A400.PM_HOST_PORT,':')-1) PM_tux_HOST
,      SUBSTR(A400.PM_HOST_PORT,INSTR(A400.PM_HOST_PORT,':')+1) PM_tux_PORT
,      A400.PM_DOMAIN_NAME tux_domain_name
,      (C115.pm_trans_duration-C400.pm_trans_duration)/1000 qtime
FROM   PSPMAGENT A115 /*Web server details*/
,      PSPMAGENT A400 /*Application server details*/
,      PSPMSYSDEFN B
,      PSPMTRANSHIST C115 /*Jolt transaction*/
,      PSPMTRANSHIST C400 /*Tuxedo transaction*/
WHERE  A115.PM_SYSTEMID = B.PM_SYSTEMID 
AND    A115.PM_AGENT_INACTIVE = 'N'
AND    C115.PM_AGENTID = A115.PM_AGENTID
AND    C115.PM_TRANS_DEFN_SET=1
AND    C115.PM_TRANS_DEFN_ID=115
AND    C115.pm_trans_status = '1' /*valid transaction only*/
--
AND    A400.PM_SYSTEMID = B.PM_SYSTEMID 
AND    A400.PM_AGENT_INACTIVE = 'N'
AND    C400.PM_AGENTID = A400.PM_AGENTID
AND    C400.PM_TRANS_DEFN_SET=1
AND    C400.PM_TRANS_DEFN_ID=400
AND    C400.pm_trans_status = '1' /*valid transaction only*/
--
AND    C115.PM_INSTANCE_ID = C400.PM_PARENT_INST_ID /*parent-child relationship*/
AND    C115.pm_trans_duration >= C400.pm_trans_duration
), x as (
SELECT dbname, pm_agent_dttm
,      AVG(qtime) avg_qtime
,      MAX(qtime) max_qtime
,      c.pm_sampling_rate*sum(qtime)/60 avg_qlen
,      c.pm_sampling_rate*count(*) num_services
FROM   c
GROUP BY dbname, pm_agent_dttm, pm_sampling_rate
)
SELECT * FROM x
ORDER BY dbname, pm_agent_dttm
  • Transactions are aggregated per minute, so the queue time is divided by 60 at the end of the calculation because we are measuring time in seconds.
Then the results from the query can be charted in excel (see http://www2.go-faster.co.uk/scripts.htm#awr_wait.xls). This chart was taken from a real system undergoing a performance load test, and we could see


Is this calculation and assumption reasonable?

The best way to validate this approach would be to measure queuing directly using tmadmin.  I could also try this on a PT8.54 system where event 301 will report the queuing.  This will have to wait for a future opportunity.
However, I can compare queuing with the number of busy application servers at reported by PPM event 302 for the CRM database.  Around 16:28 queuing all but disappears.  We can see that there were a few idle application servers which is consistent with the queue being cleared.  Later the queuing comes back, and most of the application servers are busy again.  So it looks reasonable.
Application Server Activity

Thursday, January 02, 2014

Minimum Number of Recycling Server Processes

When I rebuilt my demo system (some while ago) with PeopleTools 8.52, I noticed a new message generated by ubbgen in PeopleTools 8.52 when the minimum number of recycling servers is set to 1.

WARNING: PSAPPSRV, PSSAMSRV, PSQRYSRV, PSQCKSRV, PSPPMSRV and PSANALYTICSRV are configured with Min instance set to 1. 
To avoid loss of service, configure Min instance to at least 2.

What Produces the Message?
ubbgen is the PeopleSoft utility that merges the template file (psappsrv.ubx) with the configuration file (psappsrv.cfg) file to produce the Tuxedo configuration file (psappsrv.ubb) and the environment file (psappsrv.env). It is invoked by psadmin during Tuxedo domain configuration.
ubbgen -t psappsrv.ubx -c psappsrv.cfg -o psappsrv.ubb -v psappsrv.val -q y -u PUBSUB=n/QUICKSRV=n/QUERYSRV=n/JOLT=y/JRAD=n/DBGSRV=n/RENSRV=n/MCF=n/PPM=n/ANALYTICSRV=n
The message is produced at this time.

Recycling Servers
Several servers in a PeopleSoft application Server domain recycle after they have handled a number of services. Recycling is a PeopleSoft behaviour and not a Tuxedo behaviour.  It is controlled by the Recycle Count parameter in the PeopleSoft configuration file (psappsrv.cfg).  This parameter is not referenced in the template file (psappsrv.ubx).

[PSAPPSRV]
;=========================================================================
; Settings for PSAPPSRV
;=========================================================================

;-------------------------------------------------------------------------
; UBBGEN settings
Min Instances=2
Max Instances=3
Service Timeout=0

;-------------------------------------------------------------------------
; Number of services after which PSAPPSRV will automatically restart.
; If the recycle count is set to zero, PSAPPSRV will never be recycled.
; The default value is zero.
; Dynamic change allowed for Recycle Count
Recycle Count=1000

PeopleSoft first started using BEA Tuxedo (as it was then) in PeopleTools 6 to remote call Cobol processes in the Financials product.  The Application Server was introduced in PeopleTools 7.  PSAPPSRV had recycling from the first release.  Legend has it that the engineers at Tuxedo where horrified when they heard that PeopleSoft had introduced recycling to resolve problems created by dynamic memory allocation and deallocation by the Panel Processor (now known as the component Processor).

Tuxedo servers are supposed to be robust, long lived and not require to be regularly restarted.  Hence the server restart functionality in Tuxedo is only designed to be invoked in the rare occasions when a server process crashes.  Server processes are started by the Restart Server (restartsrv), and only one process can be started concurrently.

The minimum of 2 applies to the following servers because they can both recycle and be configured to spawn additional instances on demand PSAPPSRV, PSANALYTICSRV, PSSAMSRV, PSQCKSRV, PSQRYSRV, PSPUBHND, PSSUBHND, PSBRKHND.  In the delivered configuration file for the developer domain, recycling is disabled for several servers by setting the recycle count to zero.  However, the message is produced by ubbgen irrespective of the value of recycle count.

The message handler servers can be set to have just a single instance, without producing any warning.  They only consume messages from the dispatcher processes so their temporary disappearance will not cause user errors.

What Happens When The Only Server Recycles?
Tuxedo server processes consume service requests placed on queues.  When a server starts up, it advertises its services on the Bulletin Board.  Tuxedo processes that submit requests (mostly the JSH processes, but also the message dispatcher processes, the Process Scheduler) look up on the Bulletin Board where a service is advertised and place it on the appropriate queue.

When a server process performs an orderly shutdown (as it does during a recycle) it removes the adverts for its services (the command is unadvertise).  If a process crashes the Bulletin Board Liaison process (BBL) detects the crash and cleans the Bullentin Board. When all services advertised on a queue have been unadvertised, the queue is also removed from the Bulletin Board.  If the process submitting the service request cannot find any server advertising a service it generates an error.  
074001.GO-FASTER-6!JSH.3124.5736.-2: JOLT_CAT:1043: "ERROR: tpacall() call failed, tperrno = 6"
This is why the minimum is 2.  If you recycled the only server process, it is possible for someone to produce this error (see also Minimum Number of Application Server Processes).

Should I set the minimum higher than 2?
The number of services handled by different server processes on the same queue is usually uneven because the service is handled by the first free server.  Therefore it is rare for the processes will reach the recycle count simultaneously, but it can still happen.  Even in a quiet system that doesn't have sufficient activity to justify 3 PSAPPSRVs, I prefer to set the minimum number of servers to at least 3. 

Debugger
NB: If the debugger process is enabled then PSADMIN will force the minimum and maximum number of PSAPPSRV processes to at least 2.  ubbgen will actually update psappsrv.cfg.
Warning:  PSAPPSRV Min Instances too small, setting to 2 for debugger.

Thursday, June 17, 2010

Configuring Large PeopleSoft Application Servers

Occasionally, I see very large PeopleSoft systems running on large proprietary Unix servers with many CPUs.  In an extreme case, I needed to configure application server domains with up to 14 PSAPPSRV processes per domain (each domain was on a virtual server with 8 CPU cores, co-resident with the Process Scheduler).

The first and most important point to make is don't have too many server processes.  If you run out of CPU or if you fully utilise all the physical memory and start to page memory from disk, then you have too many server processes.  It is better to queue on a Tuxedo queue rather than the CPU run queue, or disk queue during paging.

Multiple APPQ/PSAPPSRV Queues

A piece of advice that I originally got from BEA (prior to their acquisition by Oracle) was that you should not have more than 10 server processes on a single queue in Tuxedo.  Otherwise, you are likely to suffer from contention on the IPC queue structure because processes must acquire exclusive access to the queue in order to enqueue a service request to the queue or dequeue a request from it.  Instead multiple queues should be configured that are both serviced by the same server processes and so advertise the same services. 

If you look at the 'large' template delivered by PeopleSoft, you will see that it produces a domain that runs between 9 and 15 PSAPPSRV processes.  This does not conform to the advice I received from BEA.  I repeated this advice in PeopleSoft for the Oracle DBA.  Though I cannot now find the source for it, I stand by it.  I have recently been able to conduct some analysis to confirm it on a real production system.  Domains with two queues of 8 PSAPPPSRV server process each out performed domains with only a single queue.

Load Balancing Across Queues

If the same service is advertised on multiple queues, then Tuxedo recommended that you should specify realistic service loads and use Tuxedo load balancing to determine where to enqueue requests.  I want to emphasise that I am talking about load balancing across queues within a Tuxedo domain, and not about load balancing across Tuxedo domains in the web server.

This is what the Tuxedo documentation says about load balancing:

"Load balancing is a technique used by the BEA Tuxedo system for distributing service requests evenly among servers that offer the same service. Load balancing avoids overburdening some servers while leaving others idle or infrequently used. Before sending a request to a service routine, the BEA Tuxedo system identifies all servers capable of handling the request and selects the one most appropriate for maintaining a balanced load across all the servers in the configuration.


You can control whether a load-balancing algorithm is used on the system as a whole. Such as algorithm should be used only when necessary, that is, only when a service is offered by servers that use more than one queue. Services offered by only one server, or by multiple servers in a Multiple Server, Single Queue (MSSQ) do not need load balancing. The LDBAL parameter for these services should be set to N. In other cases, you may want to set LDBAL to Y."

It doesn't state that load balancing is mandatory for multi-queue domains, and only hints that it might improve performance.  If load balancing is not used, the listener process puts the messages on the first empty queue (one where no requests are queued).  If all queues have requests the listener round-robins between the queues.

You could consider giving ICScript, GetCertificate and other services with small service times a higher Tuxedo Service priority.  This means they jump the queue 9 times out of 10.  ICScript is generally used during navigation, GetCertificate is used at log on.  Giving these services higher priority will mean they perform well even when the system is busy.  Users often need to do several mouse clicks to navigate around the system, but these services are usually quick.  This will improve the user experience without changing the overall performance of the system.

Data

I have recently been able to test the performance of a domains with up to 14 PSAPPSRVs on a single IPC queue, versus domains with two queues with up to 7 PSAPPSRVs each, both with and without Tuxedo queue balancing.   These results come from a real production system where the multiple queue configuration was implemented on 2 of the 4 application servers.  The system has a short-lived weekly peak period of on-line processing.  During that time Tuxedo spawns additional PSAPPSRV processes, and so I get different sets of times for different numbers of process. 

The timings are produced from transactions sampled by PeopleSoft Performance Monitor.  I capture the number of spawned processes using the Tuxmon scripts on my website that use tmadmin to collect Tuxedo metrics.

1 Queue2 Queue

Server Processes per Queue

Number of Services

Mean ICPanel Service Time

Server Processes per Queue

Number of Services

Mean ICPanel Service Time

6

2,616

1.33

3

6945

1.05

7

1,949

0.97




8

1,774

1.06

4

7595

1.16

9

1,713

1.02




10

1,553

1.25

5

4629

1.17

11

1,250

1.30




12

969

1.32

6

3397

1.16

13

427

1.21




14

1,057

1.10

7

3445

1.13

 Total

13,308

1.17


26011

1.13

The first thing to acknowledge is that this data is quite noisy because it comes from a real production system, and the effects we are looking for are quite small.

I am satisfied that the domains with two PSAPPSRV queues generally perform better under high load, than those under 1.  Not only does the queue time increase on the single queue domain, the service time also increases.

However, I cannot demonstrate that Tuxedo Load Balancing makes a significant difference in either direction.

My results suggest that domains with multiple queues for requests handled by PSAPPSRV process perform slightly better without load balancing if there is no queue of requests, but perform slightly better if there is a queue of pending requests.  However, the difference is small.  It is not large enough to be statistically significant in my test data.

Conclusion

If you have a busy system with lots of on-line users, and sufficient hardware to resource it, then you might reach a point when you need more than 10 PSAPPSRVs.  In which case, I recommend that you configure multiple Tuxedo queues.

On the whole, I would recommend that Tuxedo Load Balancing should be configured.  I would not expect it to improve performance, but it will not degrade it either.

Tuesday, May 19, 2009

Manually Booting Tuxedo Application Server Processes in Parallel

Normally when an Application Server is booted, initialisation of each process completes before the next one is started. The ability to boot Application Server processes in parallel was added to the psadmin utility in PeopleTools 8.48. However, psadmin is merely a wrapper for the BEA Tuxedo tmadmin command line utility, and it has always been possible to do this manually in previous versions of PeopleTools via the tmadmin utility as follows.

1. Boot the Tuxedo Bulletin Board Liaison process.

#boot the Tuxedo administrative processes
boot -A

2. Boot the PeopleSoft Application Server processes and but specify the -w parameter so that they don't wait as they start

boot -g APPSRV -w

If you are running PUBSUB or other servers in other groups then you would also boot them here.

3. Boot the JREPSRV process (which maps Java Classes to Tuxedo Services).

boot -g JREPGRP

4. List the servers with print server so you know that the PeopleSoft servers are booted.

psr

5. When all the other processes have booted, boot the WSL and JSL processes.

boot -g BASE
boot -g JSLGRP