Thursday, June 17, 2010

Configuring Large PeopleSoft Application Servers

Occasionally, I see very large PeopleSoft systems running on large proprietary Unix servers with many CPUs.  In an extreme case, I needed to configure application server domains with up to 14 PSAPPSRV processes per domain (each domain was on a virtual server with 8 CPU cores, co-resident with the Process Scheduler).

The first and most important point is this: do not configure too many server processes.  If you run out of CPU, or if you use all the physical memory and start paging to disk, then you have too many server processes.  It is better for requests to wait on a Tuxedo queue than for processes to wait on the CPU run queue, or on the disk queue during paging.

Multiple APPQ/PSAPPSRV Queues

A piece of advice that I originally got from BEA (prior to their acquisition by Oracle) was that you should not have more than 10 server processes on a single queue in Tuxedo.  Otherwise, you are likely to suffer from contention on the IPC queue structure, because a process must acquire exclusive access to the queue in order to enqueue a service request or to dequeue one.  Instead, multiple queues should be configured, each serviced by server processes of the same type, so that every queue advertises the same services.
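The rule of thumb above can be turned into a small planning calculation. The following is a minimal Python sketch, not part of any PeopleSoft or Tuxedo tooling: the function name plan_queues and the choice to spread processes as evenly as possible are assumptions made for illustration.

```python
import math

# Rule of thumb quoted above: no more than 10 server processes per queue.
MAX_SERVERS_PER_QUEUE = 10


def plan_queues(total_servers, max_per_queue=MAX_SERVERS_PER_QUEUE):
    """Return the number of server processes to place on each queue.

    Uses the fewest queues that keep every queue at or below
    max_per_queue, and spreads the processes as evenly as possible.
    (Hypothetical helper, for illustration only.)
    """
    if total_servers < 1:
        raise ValueError("need at least one server process")
    n_queues = math.ceil(total_servers / max_per_queue)
    base, extra = divmod(total_servers, n_queues)
    # The first `extra` queues carry one more process than the rest.
    return [base + 1 if i < extra else base for i in range(n_queues)]


if __name__ == "__main__":
    for n in (8, 14, 25):
        print(n, "PSAPPSRVs ->", plan_queues(n))
```

With 14 server processes this gives two queues of 7, which is the shape of the two-queue domains discussed in this post.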

If you look at the 'large' template delivered by PeopleSoft, you will see that it produces a domain that runs between 9 and 15 PSAPPSRV processes.  This does not conform to the advice I received from BEA.  I repeated this advice in PeopleSoft for the Oracle DBA.  Though I cannot now find the source for it, I stand by it.  I have recently been able to conduct some analysis to confirm it on a real production system.  Domains with two queues of 8 PSAPPSRV server processes each outperformed domains with only a single queue.

Load Balancing Across Queues

If the same service is advertised on multiple queues, then Tuxedo recommended that you should specify realistic service loads and use Tuxedo load balancing to determine where to enqueue requests.  I want to emphasise that I am talking about load balancing across queues within a Tuxedo domain, and not about load balancing across Tuxedo domains in the web server.

This is what the Tuxedo documentation says about load balancing:

"Load balancing is a technique used by the BEA Tuxedo system for distributing service requests evenly among servers that offer the same service. Load balancing avoids overburdening some servers while leaving others idle or infrequently used. Before sending a request to a service routine, the BEA Tuxedo system identifies all servers capable of handling the request and selects the one most appropriate for maintaining a balanced load across all the servers in the configuration.


You can control whether a load-balancing algorithm is used on the system as a whole. Such an algorithm should be used only when necessary, that is, only when a service is offered by servers that use more than one queue. Services offered by only one server, or by multiple servers in a Multiple Server, Single Queue (MSSQ) set do not need load balancing. The LDBAL parameter for these services should be set to N. In other cases, you may want to set LDBAL to Y."

The documentation doesn't state that load balancing is mandatory for multi-queue domains; it only hints that it might improve performance.  If load balancing is not used, the listener process puts the messages on the first empty queue (one where no requests are queued).  If all queues have requests, the listener round-robins between the queues.
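The queue selection described above can be sketched as a toy model in Python. This is an illustration of the behaviour as described in this post, not Tuxedo code, and all the names in it are hypothetical. The second function is a simplified assumption about what load balancing does: it sends the request to the queue with the least outstanding load.

```python
class QueuePicker:
    """Toy model of queue selection when LDBAL is N (hypothetical names)."""

    def __init__(self, n_queues):
        self.depths = [0] * n_queues  # requests waiting on each queue
        self._next = 0                # round-robin pointer

    def pick(self):
        # Prefer the first queue on which no requests are waiting.
        for i, depth in enumerate(self.depths):
            if depth == 0:
                return i
        # Every queue has requests: take the queues in turn.
        i = self._next
        self._next = (self._next + 1) % len(self.depths)
        return i

    def enqueue(self):
        i = self.pick()
        self.depths[i] += 1
        return i

    def dequeue(self, i):
        if self.depths[i] > 0:
            self.depths[i] -= 1


def pick_least_loaded(outstanding_load):
    """Simplified model of LDBAL=Y (an assumption, not the exact algorithm):
    choose the queue whose queued requests carry the smallest total load."""
    return min(range(len(outstanding_load)), key=lambda i: outstanding_load[i])


if __name__ == "__main__":
    picker = QueuePicker(2)
    print([picker.enqueue() for _ in range(4)])
    print(pick_least_loaded([150, 50]))
```

The point of the comparison is that the first policy only looks at whether a queue is empty, while the second depends on the service loads being realistic.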

You could consider giving ICScript, GetCertificate and other services with small service times a higher Tuxedo Service priority.  This means they jump the queue 9 times out of 10.  ICScript is generally used during navigation, GetCertificate is used at log on.  Giving these services higher priority will mean they perform well even when the system is busy.  Users often need to do several mouse clicks to navigate around the system, but these services are usually quick.  This will improve the user experience without changing the overall performance of the system.
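The "9 times out of 10" behaviour can be illustrated with a small Python model. It is a sketch under stated assumptions rather than Tuxedo's implementation: it assumes requests are dequeued highest priority first, except that every tenth dequeue takes the oldest request regardless of priority. The class name and the priority values 50 and 60 are hypothetical.

```python
class PriorityQueueModel:
    """Toy model of priority dequeuing with a periodic FIFO dequeue."""

    def __init__(self, fifo_every=10):
        self._items = []          # (priority, arrival sequence, service name)
        self._seq = 0             # arrival counter, used to find the oldest
        self._dequeues = 0        # number of dequeues performed so far
        self._fifo_every = fifo_every

    def enqueue(self, service, priority=50):
        self._items.append((priority, self._seq, service))
        self._seq += 1

    def dequeue(self):
        if not self._items:
            raise IndexError("queue is empty")
        self._dequeues += 1
        if self._dequeues % self._fifo_every == 0:
            # Every tenth dequeue ignores priority and takes the oldest request.
            chosen = min(self._items, key=lambda item: item[1])
        else:
            # Otherwise highest priority first, oldest first among equals.
            chosen = min(self._items, key=lambda item: (-item[0], item[1]))
        self._items.remove(chosen)
        return chosen[2]


if __name__ == "__main__":
    q = PriorityQueueModel()
    q.enqueue("ICPanel", priority=50)       # arrives first, lower priority
    for _ in range(12):
        q.enqueue("ICScript", priority=60)  # arrive later, higher priority
    print([q.dequeue() for _ in range(13)])
```

In this model the ICPanel request that arrived first waits behind nine ICScript requests and is then served on the tenth dequeue, so lower-priority work is delayed but never starved.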

Data

I have recently been able to test the performance of domains with up to 14 PSAPPSRVs on a single IPC queue, versus domains with two queues with up to 7 PSAPPSRVs each, both with and without Tuxedo load balancing.  These results come from a real production system where the multiple queue configuration was implemented on 2 of the 4 application servers.  The system has a short-lived weekly peak period of on-line processing.  During that time Tuxedo spawns additional PSAPPSRV processes, and so I get different sets of times for different numbers of processes.

The timings are produced from transactions sampled by PeopleSoft Performance Monitor.  I capture the number of spawned processes using the Tuxmon scripts on my website that use tmadmin to collect Tuxedo metrics.
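For readers who want to see the general idea of collecting Tuxedo metrics with tmadmin, here is a minimal Python sketch. It is not the Tuxmon scripts mentioned above and does not reproduce them. It assumes that tmadmin is on the PATH, that TUXCONFIG is set for the domain, and that the printserver and printqueue commands are fed to tmadmin in read-only mode on standard input. The function name is hypothetical, and the output is returned as raw text rather than parsed.

```python
import subprocess


def tmadmin_snapshot(commands=("printserver", "printqueue"), run=subprocess.run):
    """Feed commands to `tmadmin -r` and return whatever it prints.

    Assumes the Tuxedo environment (PATH, TUXCONFIG) is already set up.
    The `run` argument exists only so the call can be replaced in tests.
    """
    script = "\n".join(list(commands) + ["quit"]) + "\n"
    result = run(
        ["tmadmin", "-r"],   # -r: read-only mode
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only works on a host with a booted Tuxedo domain.
    print(tmadmin_snapshot())
```

Calling this at a regular interval and counting the PSAPPSRV lines in the output would give a rough picture of how many processes have been spawned over time.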

1 Queue                                      2 Queues
Server Processes  Number of  Mean ICPanel    Server Processes  Number of  Mean ICPanel
per Queue         Services   Service Time    per Queue         Services   Service Time
6                 2,616      1.33            3                 6,945      1.05
7                 1,949      0.97
8                 1,774      1.06            4                 7,595      1.16
9                 1,713      1.02
10                1,553      1.25            5                 4,629      1.17
11                1,250      1.30
12                969        1.32            6                 3,397      1.16
13                427        1.21
14                1,057      1.10            7                 3,445      1.13
Total             13,308     1.17                              26,011     1.13
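The totals can be cross-checked from the per-row figures above. The short Python sketch below assumes that the overall mean is weighted by the number of services in each row. That weighting is an assumption, although it does reproduce the published totals to two decimal places. The variable and function names are illustrative only.

```python
# (server processes per queue, number of services, mean ICPanel service time)
ONE_QUEUE = [
    (6, 2616, 1.33), (7, 1949, 0.97), (8, 1774, 1.06),
    (9, 1713, 1.02), (10, 1553, 1.25), (11, 1250, 1.30),
    (12, 969, 1.32), (13, 427, 1.21), (14, 1057, 1.10),
]
TWO_QUEUES = [
    (3, 6945, 1.05), (4, 7595, 1.16), (5, 4629, 1.17),
    (6, 3397, 1.16), (7, 3445, 1.13),
]


def summarise(rows):
    """Return (total services, service-weighted mean time to 2 decimals)."""
    total = sum(services for _, services, _ in rows)
    weighted = sum(services * mean for _, services, mean in rows) / total
    return total, round(weighted, 2)


if __name__ == "__main__":
    print("1 queue: ", summarise(ONE_QUEUE))
    print("2 queues:", summarise(TWO_QUEUES))
```

The overall difference between the two configurations is only a few hundredths, which is in keeping with the caution about noise in this data.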

The first thing to acknowledge is that this data is quite noisy because it comes from a real production system, and the effects we are looking for are quite small.

I am satisfied that the domains with two PSAPPSRV queues generally perform better under high load than those with one.  On the single-queue domain, not only does the queue time increase, the service time also increases.

However, I cannot demonstrate that Tuxedo Load Balancing makes a significant difference in either direction.

My results suggest that domains with multiple queues for requests handled by PSAPPSRV processes perform slightly better without load balancing if there is no queue of requests, but slightly better with load balancing if there is a queue of pending requests.  However, the difference is small, and it is not large enough to be statistically significant in my test data.

Conclusion

If you have a busy system with lots of on-line users, and sufficient hardware to resource it, then you might reach a point where you need more than 10 PSAPPSRVs, in which case I recommend that you configure multiple Tuxedo queues.

On the whole, I would recommend that Tuxedo Load Balancing should be configured.  I would not expect it to improve performance, but it will not degrade it either.

10 comments :

Tom Williams Jr. said...

Application Server Load Balancing is a good thing regardless of if you are in a large environment or in a small environment in that it provides another layer of stability since the PeopleSoft Application Server has a nasty habit of acting up (especially on Windows). So, if you have two Application Servers setup in a Load Balancing (round robin) configuration, you will ensure that if one of your services goes crazy, you will have a backup running.

Are Queues synonymous with Handlers?

David Kurtz said...

This blog entry is not about load balancing requests from the web server across different application server domains. That is a different topic.
I am discussing what happens as a single application server domain is scaled up.
I am not a fan of having multiple Application Server domains for the same database on the same physical server (there are a few specific exceptions). It doesn't provide any additional tolerance to failure. You can have one domain overloaded while another has spare capacity, because the load balancing of application servers by the PeopleSoft servlet does not usually balance well.
The Jolt Listener (JSL) spawns a number of handler (JSH) processes. Each JSH has a return queue that is not exposed by the printqueue command in tmadmin, but can be seen with the ipcs command (a Windows version is delivered as a part of Tuxedo). This article is about the inbound queues to the Application Server processes that handle service requests.

Tom Williams Jr. said...

Thanks. It would be cool if you could go into further details about this and how to go about tweaking it a bit more.

Cheers...

AnnaR said...

Hi!

Thank you for your excellent articles!

So, for enabling multiple queues (APPQ) we need:
1) enable load balancing
2) calculate the load

Can you explain, please, how can we implement these steps?

Thanks a lot.

David Kurtz said...

Chapter 13 of PeopleSoft for the Oracle DBA deals with configuring multiple queues, load balancing, calculating service loads, and setting service priorities. If you don't have the book take a look at the Advanced Tuxedo document on the www.go-faster.co.uk website. That is based on PeopleTools 7.5x, but the Tuxedo aspects have not changed.

Anonymous said...

Hi
Thanks for the article and the excellent information in your books.
I was wondering what equipment you ran the test on, especially how many CPUs were on the application server?

Thanks
Greg

David Kurtz said...

These domains with up to 14 PSAPPSRVs were running on virtual servers with 8 CPU cores.

Graham said...

We need to stop crediting the Application Server layer with the ability to "load balance".... it uses a round robin connection pattern which a) takes no notice whatsoever of any useful definition of the word "load" and b) ignores any attempt to "balance" said load.

Here's one good definition http://en.wikipedia.org/wiki/Load_balancing_%28computing%29 The keywords here are ...spread evenly...

Come on JOLT engineers in Pleasanton - "load balancing" please in 8.52? Let's talk... there are several highly qualified Tuxedo and App Domain specialists out here with some very real requirements and useful ideas.

Having said that I love the App Server architecture ... it's bomb proof, stable, reliable and still highly configurable even if it doesn't have "load balancing".

Thanks for a great write up here Dave and all the support you give to the PeopleSoft community.

David Kurtz said...

Thanks Graham, and you are quite right on all points. For the avoidance of any remaining doubt I have added a comment that this posting was discussing load balancing across queues within a Tuxedo domain, and not about load balancing across Tuxedo domains in the web server.

Abdul Latif said...

Thanks David for all your comments and helpful information.

Can you please provide me with the syntax to run the tuxmon script on an AIX box? I am new to shell scripting and would like to use your script in our environment to find out the Tuxedo processes.

Thanks

Asif