This is the fourth of five articles that examine the challenges posed by typical queries on the journal line table (PS_JRNL_LN).
If it had been possible to referentially partition PS_JRNL_LN by FISCAL_YEAR, then the sample query shown in earlier posts would have been able to prune partitions by fiscal year. This would have significantly reduced the cost of the full scan.
Instead, if the sample query has to full scan PS_JRNL_LN, it must scan every partition and then filter the rows against the keys retrieved from PS_JRNL_HEADER. The only way to reduce the cost of the full scan is to reduce the number of blocks being scanned, by purging and compressing data. On Exadata, Hybrid Columnar Compression (HCC) will achieve a much higher compression ratio, and unwanted compression units will be eliminated efficiently.
To meet reporting requirements, Financials systems inevitably need to keep journal data going back several years. Nonetheless, purging unneeded historical journals as aggressively as permissible is the first step in improving or at least preserving query performance.
Compression
Compressing the historical journal line partitions will further reduce the number of blocks in the segments and reduce the cost of the full scan in the query, thus making the optimiser more likely to switch away from the nested loop join to the full scan/Bloom filter/hash join.
In Oracle, basic compression and Hybrid Columnar Compression (HCC) are well suited to data warehouse applications, but they also have application in OLTP systems. Compression occurs during direct-path insert and segment reorganisation operations and does not apply to anything modified by PeopleSoft processes that use conventional DML.
Basic compression is available on any Oracle database; essentially, it avoids storing the same data values many times in the same data block. Hybrid Columnar Compression (HCC) is available on Exadata. Different HCC compression levels use different compression algorithms. DML on compressed data will decompress it.
I do not recommend using Advanced Compression to compress current periods due to the impact on day-to-day processing.
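As a quick sanity check, the compression setting declared on each partition can be seen in the data dictionary. This is just an illustrative query; adjust the owner and table to suit your system.
SELECT partition_name, compression, compress_for
FROM   dba_tab_partitions
WHERE  table_owner = 'SYSADM' AND table_name = 'PS_JRNL_LN'
ORDER  BY partition_position;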
Which Partitions are Static?
DBA_TAB_MODIFICATIONS shows the number of DML operations in each partition since statistics were last collected.
SELECT p.partition_position, m.*, p.high_value
FROM dba_tab_modifications m
INNER JOIN dba_tab_partitions p
ON p.table_owner = m.table_owner AND p.table_name = m.table_name AND p.partition_name = m.partition_name
WHERE m.table_owner = 'SYSADM' AND m.table_name = 'PS_JRNL_LN'
ORDER BY 1
/
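Note that table monitoring data is only flushed to the data dictionary periodically. To make sure DBA_TAB_MODIFICATIONS is up to date before running this report, the in-memory monitoring information can be flushed first:
EXEC dbms_stats.flush_database_monitoring_info;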
This report was generated in December 2024. Most of the DML occurs in the current and previous monthly periods; there is almost no activity on partitions more than a year old.
Part Table Drop
Pos# Owner TABLE_NAME PARTITION_NAME S INSERTS UPDATES DELETES TIMESTAMP TRU Segs HIGH_VALUE
---- -------- ------------ -------------------- - --------- --------- --------- ------------------- --- ---- --------------------------------------------------------------------------------
…
40 SYSADM PS_JRNL_LN JRNL_LN_202212 0 0 0 13/10/2024 10:08:56 NO 0 TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
41 SYSADM PS_JRNL_LN JRNL_LN_202301 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
42 SYSADM PS_JRNL_LN JRNL_LN_202302 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
43 SYSADM PS_JRNL_LN JRNL_LN_202303 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
44 SYSADM PS_JRNL_LN JRNL_LN_202304 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
45 SYSADM PS_JRNL_LN JRNL_LN_202305 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
46 SYSADM PS_JRNL_LN JRNL_LN_202306 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
47 SYSADM PS_JRNL_LN JRNL_LN_202307 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-08-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
48 SYSADM PS_JRNL_LN JRNL_LN_202308 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-09-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
49 SYSADM PS_JRNL_LN JRNL_LN_202309 0 0 0 13/10/2024 10:08:56 NO 0 TO_DATE(' 2023-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
50 SYSADM PS_JRNL_LN JRNL_LN_202310 0 0 0 27/10/2024 10:59:45 NO 0 TO_DATE(' 2023-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
51 SYSADM PS_JRNL_LN JRNL_LN_202311 0 0 0 29/09/2024 10:01:16 NO 0 TO_DATE(' 2023-12-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
52 SYSADM PS_JRNL_LN JRNL_LN_202312 34 193 34 10/12/2024 14:21:38 NO 0 TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
53 SYSADM PS_JRNL_LN JRNL_LN_202401 42374 127736 210 12/12/2024 05:27:31 NO 0 TO_DATE(' 2024-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
54 SYSADM PS_JRNL_LN JRNL_LN_202402 34803 92215 0 12/12/2024 05:26:30 NO 0 TO_DATE(' 2024-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
55 SYSADM PS_JRNL_LN JRNL_LN_202403 54940 166263 0 12/12/2024 05:12:29 NO 0 TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
56 SYSADM PS_JRNL_LN JRNL_LN_202404 5900 13730 0 13/12/2024 05:29:32 NO 0 TO_DATE(' 2024-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
57 SYSADM PS_JRNL_LN JRNL_LN_202405 6151 13869 0 13/12/2024 05:31:06 NO 0 TO_DATE(' 2024-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
58 SYSADM PS_JRNL_LN JRNL_LN_202406 18317 58263 0 13/12/2024 16:15:49 NO 0 TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
59 SYSADM PS_JRNL_LN JRNL_LN_202407 5067792 14937405 0 13/12/2024 16:02:36 NO 0 TO_DATE(' 2024-08-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
60 SYSADM PS_JRNL_LN JRNL_LN_202408 5217744 15378822 0 13/12/2024 18:02:57 NO 0 TO_DATE(' 2024-09-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
61 SYSADM PS_JRNL_LN JRNL_LN_202409 65389 243360 160 13/12/2024 12:45:25 NO 0 TO_DATE(' 2024-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
62 SYSADM PS_JRNL_LN JRNL_LN_202410 44839 152210 0 13/12/2024 00:28:54 NO 0 TO_DATE(' 2024-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
63 SYSADM PS_JRNL_LN JRNL_LN_202411 28279594 53637873 27478940 13/12/2024 18:18:00 NO 0 TO_DATE(' 2024-12-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
64 SYSADM PS_JRNL_LN JRNL_LN_202412 34761590 53485631 27484239 13/12/2024 19:16:11 NO 0 TO_DATE(' 2025-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
65 SYSADM PS_JRNL_LN JRNL_LN_202501 137138 473452 0 13/12/2024 19:18:09 NO 0 TO_DATE(' 2025-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
66 SYSADM PS_JRNL_LN JRNL_LN_202502 0 0 0 10/11/2024 10:08:21 NO 0 TO_DATE(' 2025-03-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
67 SYSADM PS_JRNL_LN JRNL_LN_202503 466 0 0 13/12/2024 03:59:20 NO 0 TO_DATE(' 2025-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
68 SYSADM PS_JRNL_LN JRNL_LN_202504 0 0 0 17/11/2024 10:03:01 NO 0 TO_DATE(' 2025-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
69 SYSADM PS_JRNL_LN JRNL_LN_202505 0 0 0 17/11/2024 10:03:01 NO 0 TO_DATE(' 2025-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
…
Therefore, compressing journal lines in the current or previous fiscal years is not viable, as subsequent updates would destroy the compression. However, I can look at explicitly compressing older partitions.
Attribute Clustering
Attribute Clustering clusters (or sorts) the rows of a table by certain columns. Like compression, it is declarative and defined at table level: it only takes effect during direct-path loads or segment reorganisations (such as a compression operation), not during conventional DML.
I have defined attribute clustering on PS_JRNL_LN to sort the rows by the unique key columns of its parent record, PS_JRNL_HEADER.
ALTER TABLE ps_jrnl_ln ADD CLUSTERING BY LINEAR ORDER (business_unit, journal_id, journal_date, unpost_seq);
ALTER TABLE ps_jrnl_ln MODIFY CLUSTERING YES ON LOAD YES ON DATA MOVEMENT;
I have found it produces a small (approximately 2-5% of the original size) improvement in the resulting compression, further reducing the cost of the full table scan. So, it is a marginal gain for no additional cost.
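The clustering definition can be verified in the data dictionary. This is a minimal check, assuming the standard 12c attribute clustering views are available in your version:
SELECT * FROM user_clustering_tables WHERE table_name = 'PS_JRNL_LN';
SELECT * FROM user_clustering_keys   WHERE table_name = 'PS_JRNL_LN';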
Hybrid Columnar Compression (HCC)
I have chosen to compress partitions older than one complete fiscal year with QUERY LOW, and those older than three complete fiscal years with QUERY HIGH.
Each historical partition can be compressed with an online partition move operation, and this can be done in parallel. It is important to specify UPDATE INDEXES so that the indexes remain valid and usable after the operation. An online move does not lock PS_JRNL_LN, so the application can continue to run during this process.
…
ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202012 COMPRESS FOR QUERY HIGH UPDATE INDEXES ONLINE PARALLEL;
ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202101 COMPRESS FOR QUERY LOW UPDATE INDEXES ONLINE PARALLEL;
…
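Rather than typing out each statement, the commands can be generated from the data dictionary. This is only a sketch: it assumes the JRNL_LN_YYYYMM partition naming convention shown above, and the cutoff values (202101 and 202401) are hypothetical and should be set to match your own fiscal year boundaries.
SELECT 'ALTER TABLE ps_jrnl_ln MOVE PARTITION '||partition_name
     ||' COMPRESS FOR QUERY '
     ||CASE WHEN SUBSTR(partition_name,-6) < '202101' THEN 'HIGH' ELSE 'LOW' END
     ||' UPDATE INDEXES ONLINE PARALLEL;' cmd
FROM   user_tab_partitions
WHERE  table_name = 'PS_JRNL_LN'
AND    SUBSTR(partition_name,-6) < '202401'
ORDER  BY partition_position;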
The statistics on the newly compressed partitions should be updated, as well as the global statistics.
Statistics
Optimizer statistics have to be regathered on the freshly compressed partitions; otherwise, the reduction in the cost of the full scan will not occur.
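For example, statistics can be regathered on an individual compressed partition with a standard DBMS_STATS call (the partition name here is just illustrative):
EXEC dbms_stats.gather_table_stats(ownname=>'SYSADM', tabname=>'PS_JRNL_LN', partname=>'JRNL_LN_202012', granularity=>'PARTITION');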
On some systems, statistics collection on the very largest tables, including PS_JRNL_LN, may not be completed within the regular maintenance window. Then, Oracle may repeatedly attempt to collect statistics on these segments, and other statistics can become stale too. Instead, I usually lock the statistics on such tables (to remove them from the maintenance window job) and create a specific recurring job to collect statistics at a convenient time (e.g. at the weekend).
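As a sketch of that approach (the job name and schedule are illustrative assumptions), the statistics can be locked and a weekend scheduler job created to gather them with force=>TRUE, which overrides the lock:
EXEC dbms_stats.lock_table_stats(ownname=>'SYSADM', tabname=>'PS_JRNL_LN');

BEGIN
  dbms_scheduler.create_job(
    job_name        => 'GATHER_PS_JRNL_LN_STATS',  --illustrative job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => q'[BEGIN dbms_stats.gather_table_stats(ownname=>'SYSADM', tabname=>'PS_JRNL_LN', force=>TRUE); END;]',
    repeat_interval => 'FREQ=WEEKLY; BYDAY=SAT; BYHOUR=6',
    enabled         => TRUE);
END;
/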
An interim option is to simply manually update the number of blocks in the partition statistics to the number of blocks in the compressed segment (if it is lower), and recalculate the total number of blocks in the whole table for the global statistics.
set serveroutput on timi on
DECLARE
  l_table_name  VARCHAR2(18) := 'PS_JRNL_LN';
  l_part_update BOOLEAN := FALSE;
BEGIN
  --for each compressed partition whose segment is now smaller than the stored statistics indicate
  FOR i IN(
    select t.table_name, t.partition_name, t.num_rows, t.blocks stat_blocks, s.blocks seg_blocks
    ,      s.tablespace_name, p.compress_for, t.num_rows/NULLIF(LEAST(t.blocks,s.blocks),0) rpb
    from   user_segments s
      inner join user_tab_partitions p ON p.table_name = s.segment_name AND p.partition_name = s.partition_name
      inner join user_tab_statistics t ON s.segment_name = t.table_name AND s.partition_name = t.partition_name AND t.blocks > s.blocks
    where  s.segment_type = 'TABLE PARTITION' and p.compress_for IS NOT NULL and s.segment_name = l_table_name
  ) LOOP
    l_part_update := TRUE;
    dbms_output.put_line(i.table_name||' ('||i.partition_name||') '||i.stat_blocks||' => '||i.seg_blocks||' blocks');
    --reduce the partition-level block count to the size of the compressed segment
    dbms_stats.set_table_stats(ownname=>'SYSADM',tabname=>i.table_name,partname=>i.partition_name,numblks=>i.seg_blocks,force=>TRUE);
  END LOOP;

  IF l_part_update THEN
    --if any partition statistics were adjusted, recalculate the table-level block count from the partition statistics
    FOR i IN (
      SELECT table_name, SUM(blocks) blocks
      FROM   user_tab_statistics
      WHERE  table_name = l_table_name AND partition_name IS NOT NULL
      GROUP BY table_name
    ) LOOP
      dbms_output.put_line(i.table_name||' = '||i.blocks||' blocks');
      dbms_stats.set_table_stats(ownname=>'SYSADM',tabname=>i.table_name,numblks=>i.blocks,force=>TRUE);
    END LOOP;
  ELSE
    dbms_output.put_line(l_table_name||' - no action required');
  END IF;
END;
/
Orphaned Index Entries & Space Recovery
One side effect of the table segment compression operation (or any other segment reorganisation operation) is that we get orphaned entries in any global indexes. I could rebuild these indexes.
ALTER INDEX psdjrnl_ln REBUILD ONLINE TABLESPACE psindex PARALLEL;
Or, I could wait for the PMO_DEFERRED_GIDX_MAINT_JOB job to run (scheduled by default during the maintenance window) to clean out orphaned index entries from all currently impacted global indexes. Another alternative is to manually run the dbms_part.cleanup_gidx procedure that this job in turn calls (see also Richard Foote: 12c Asynchronous Global Index Maintenance Part II).
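For example, the global indexes that currently contain orphaned entries can be listed, and the clean-up procedure run for just this table. This is a minimal sketch; I pass the schema and table name positionally.
SELECT index_name, orphaned_entries FROM user_indexes WHERE table_name = 'PS_JRNL_LN';
EXEC dbms_part.cleanup_gidx('SYSADM','PS_JRNL_LN');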
However, another side effect is that global indexes can grow as they are maintained by the table partition compression operations. These indexes can be coalesced, cleaned up, and then shrunk. A shrink implies a coalesce: ALTER INDEX … SHRINK SPACE COMPACT is equivalent to ALTER INDEX … COALESCE (see Jonathan Lewis's Oracle Scratchpad: Shrinking indexes). The coalesce can be done in parallel, but the shrink cannot. Therefore, I have chosen to do a parallel coalesce that includes a clean-up of orphaned entries, and then shrink the segment.
ALTER INDEX psdjrnl_ln COALESCE CLEANUP PARALLEL;
ALTER INDEX psdjrnl_ln SHRINK SPACE;
I also rebuild the corresponding partitions of the locally partitioned indexes.
…
ALTER INDEX ps_jrnl_ln REBUILD PARTITION jrnl_ln_202012 ONLINE;
ALTER INDEX ps_jrnl_ln REBUILD PARTITION jrnl_ln_202101 ONLINE;
…
At the end of this process, space has been released back to the tablespace, but the free space will be distributed throughout the tablespace, so it probably won't be possible to release space back to the file system.
If you are compressing many partitions, perhaps because you are doing this for the first time, and they are in a dedicated tablespace, then I would suggest completely rebuilding the objects into new tablespaces. Tablespaces can be renamed as required.
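As a sketch of that approach (the tablespace names here are purely illustrative), each partition can be moved into a new tablespace as part of the compression operation, and the tablespaces renamed once everything has been moved:
ALTER TABLE ps_jrnl_ln MOVE PARTITION jrnl_ln_202012 TABLESPACE psjrnl_new COMPRESS FOR QUERY HIGH UPDATE INDEXES ONLINE PARALLEL;
--repeat for the other partitions (and rebuild index partitions into a new index tablespace), then swap the names
ALTER TABLESPACE psjrnl RENAME TO psjrnl_old;
ALTER TABLESPACE psjrnl_new RENAME TO psjrnl;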