The ASH data shows that most of the database time, 184 out of 192 seconds is spent in the two statements that aggregate that data. These statements are in the procedures that are called within the driving query.
SQL_ID SQL_PLAN_HASH_VALUE EXEC_SECS ASH_SECS ------------- ------------------- ---------- ---------- 515d3buvaf8us 1162018321 615 133 55a20fnkwv0ht 3972836246 615 51 ... ---------- sum 192
However, more significantly, only 192 seconds out of a total elapsed run time of 615 seconds is spent in the database. That is just 31%. So most of the time is spent executing code within the SQR program.
We need to look at the code to see exactly what is happening.
This is the driving query. It returns each employee who is paid in a specified payroll, and then for each row the procedure get_gp_acum is used to fetch the sum of certain accumulators for that payroll and employee
Begin-Select a.emplid a.cal_run_id a.cal_id a.gp_paygroup a.orig_cal_run_id a.rslt_seg_num a.tax_code_uk a.gpgb_ni_categorye. n.name Let $pin_name = 'GBR AC GRTX SEG' !Taxable Gross do get_gp_acum Let #Taxable_gross = &b.calc_rslt_val Let $pin_name = 'GBR AC NIBL SEG' !Nlable Gross do get_gp_acum Let #Niable_gross1 = &b.calc_rslt_val … from ps_gpgb_ee_rslt a, ps_person_name n where a.emplid = n.emplid and a.cal_run_id = $PNL_CAL_RUN_ID1 and a.empl_rcd = 0 and a.gp_paygroup = $PNL_PYGRP1 and a.cal_id = $PNL_CAL_ID1 and a.orig_cal_run_id = a.cal_run_id order by a.emplid,a.gp_paygroup End-Select End-Procedure
This is one of the two procedures that is called to obtain each value. It simply sums the data for that employee.
Begin-Procedure get_gp_acum begin-select sum(b.calc_rslt_val) &b.calc_rslt_val from ps_xx_gpacum_rpt1 b, ps_gp_pin c where b.emplid = &a.emplid and b.empl_rcd = 0 and b.cal_run_id = &a.cal_run_id and b.gp_paygroup = &a.gp_paygroup and b.cal_id = &a.cal_id and b.orig_cal_run_id = &a.orig_cal_run_id and b.rslt_seg_num = &a.rslt_seg_num and b.orig_cal_run_id = b.cal_run_id and b.pin_num = c.pin_num and c.pin_nm = $pin_name end-select End-Procedure
This code is very clear, well structured, and easy to maintain. The only trouble is that it is slow. Each SQL calls makes SQR do a lot of work, and that takes time.
In this case there was not much procedural code in the SQR and so I was able to coalesce the SQL from the called procedures into the driving query.
If the called procedures had been simple single row look-ups I could have used an outer-join. However, as they are using a group function (sum), I put the query into a scalar query (a query within in the select clause that returns only one row and one column). Each call to a procedure was replaced with a separate scalar query. I ended up with 21 scalar queries.
During this rewrite I encountered an SQR quirk; if the scalar query was placed in the main select clause, SQR produces errors because it is expecting an expression, and it complains that the SELECT keyword is not a variable. I then had to wrap the query in an in-line view. Each scalar query must be given a column alias, and the column alias can be referenced in the SQR select clause.
Begin-Procedure MAIN-REPORT Begin-Select a.emplid a.cal_run_id a.cal_id a.gp_paygroup a.orig_cal_run_id a.rslt_seg_num a.tax_code_uk a.gpgb_ni_category n.name a.gbr_ac_grtx_seg Let #Taxable_gross = &a.gbr_ac_grtx_seg a.gbr_ac_nibl_seg Let #Niable_gross1 = &a.gbr_ac_nibl_seg … from ( select a.emplid, a.cal_run_id, a.cal_id, a.gp_paygroup, a.orig_cal_run_id, a.rslt_seg_num, a.tax_code_uk, a.gpgb_ni_category ,NVL((SELECT sum(b.calc_rslt_val) from ps_xx_gpacum_rpt1 b, ps_gp_pin c where b.emplid = a.emplid and b.empl_rcd = 0 and b.cal_run_id = a.cal_run_id and b.gp_paygroup = a.gp_paygroup and b.cal_id = a.cal_id and b.orig_cal_run_id = a.orig_cal_run_id and b.rslt_seg_num = a.rslt_seg_num and b.orig_cal_run_id = a.cal_run_id and b.pin_num = c.pin_num and c.pin_nm = 'GBR AC GRTX SEG'),0) gbr_ac_grtx_seg ,NVL((SELECT sum(b.calc_rslt_val) from ps_xx_gpacum_rpt1 b, ps_gp_pin c where b.emplid = a.emplid and b.empl_rcd = 0 and b.cal_run_id = a.cal_run_id and b.gp_paygroup = a.gp_paygroup and b.cal_id = a.cal_id and b.orig_cal_run_id = a.orig_cal_run_id and b.rslt_seg_num = a.rslt_seg_num and b.orig_cal_run_id = a.cal_run_id and b.pin_num = c.pin_num and c.pin_nm = 'GBR AC NIBL SEG'),0) gbr_ac_nibl_seg … from ps_gpgb_ee_rslt a where a.empl_rcd = 0 and a.orig_cal_run_id = a.cal_run_id ) a ,ps_person_name n where a.cal_run_id = $PNL_CAL_RUN_ID1 and a.gp_paygroup = $PNL_PYGRP1 and a.cal_id = $PNL_CAL_ID1 and a.emplid = n.emplid order by a.emplid,a.gp_paygroup End-Select
The SQL looks much more complicated, as does the execution plan. However, the effect on performance was dramatic.
SQL_ID SQL_PLAN_HASH_VALUE EXEC_SECS ASH_SECS ------------- ------------------- ---------- ---------- f9d03ffbftv81 1694704409 154 114 5d2x9mqvvyrjk 989254841 154 2 3v550ghn6z8jv 1521271881 154 1 ---------- sum 117
The response of just the combined SQL at 117 seconds is better than the separate SQLs at 154 seconds. Much more significantly, the amount of time spent in SQR (rather than the database) has fallen from 432 seconds to just 37. Therefore, 90% of the SQR response time was spent on submitting the SQL calls in the called procedures.
Conclusions
SQL calls in SQR are expensive. The cost of making lots of calls inside a loop or another driving query can add up to a significant amount of time. SQRs that consume time in this way will also be consuming CPU and memory on the server where the Process Scheduler is located.
In this case, combining SQL statements also improved SQL performance, but that will not always be the case.
There are times when better performance can be achieved at the cost of more convoluted code. In each case there is a judgement to be made as to whether improvement in performance is worth the increase in complexity.
No comments :
Post a Comment