Thursday, July 16, 2015

BMC REORG using REBALANCE hidden glitch


Problem: 

Attempted BMC ROERG using option REBALANCE with SHRLEVEL CHANGE between partitions to resolve space issue on a table partition, as application needed 24/7 availability, which put the table partition in Recovery pending as the job failed during ALTER Table limit key stage after REORG Switch phase. 

 

Reason:

During utility terminate phase of BMC REORG with REBALACNE option it completes the switch to shadow dataset first and then attempt to alter statement, where the control is passed to DB2.   DB2 must invalidate dependent plans and packages on that object to complete the ALTER successfully.

The BMC Reorg was holding a DRAIN on the object and successfully completed the switch phase, But that DRAIN did not prevent dependent packages from running, as long as they are not currently accessing this object. And if the packages are running, then DB2 cannot invalidate them.  It is also likely in this case that this thread was waiting on the Reorg plus DRAIN in order to access this object, resulting in a deadlock. To put it another way, Reorg Plus is waiting on the thread in order to invalidate the package in order to complete the ALTER, and the thread is waiting on Reorg Plus to access the object, so its package cannot be invalidated, and so the ALTER cannot be completed.


The BMC recommended default option ALTRFAIL=RCVRPEND caused the table to be put in Recovery pending status to allow it to be restarted to complete the REORG.  And this option ALTRFAIL is an installation option so it cannot be overridden with REORG Control card. Either you need to change the Default ALTRFAIL=TERM or create new DOPTS module which takes time to make and needs system DBA support.

With ALTRFAIL=TERM, REORG PLUS automatically backs out the reorganization when a failure occurs during processing of limit-key ALTER statements


Fix:

We resolve the space issue with IBM REORG as it build good compression dictionary and does better space compression than BMC REORG.

 

Lessons Learnt:  

BMC REORG REBALANCE with SHRLEVEL CHANGE even though say no outage in the manual, it is not really a solution for an application that is active and busy 24/7.

Always have a user module with ALTRFAIL=TERM and use it to avoid outage to the application if at all you are going to try to attempt BMC REORG REBALANACE with SHRLEVL CHANGE for any immediate fix without an outage.

No comments:

Post a Comment