Thursday, December 19, 2013

DB2 Connection Failure / U4040 Abend in program DB2CAF / BMC Image copy jobs looping – DB2 Reason Code 00F30055

Problem:


Some of the BMC DB2 image copy jobs started looping and consuming a large amount of CPU after the BMC COPY PLUS upgrade (from COPY PLUS for DB2 V9.1.00 to COPY PLUS for DB2 V10.1.00). When a looping job is cancelled and rerun without any change, it completes within a few CPU seconds. The problem affects only a few jobs and occurs once or twice a week.

Application batch jobs started failing with a U4040 abend in program DB2CAF.

Several questions

1. What is the reason for the DB2 connection failures?

2. Production jobs and volumes have not changed, so why did this issue start occurring after the BMC upgrade?

3. Why do only the DB2 BMC image copy jobs loop, while other application jobs fail with U4040?

Reason:

According to DB2 reason code 00F30055, the failure occurred because the maximum number of concurrent identify-level agents was exceeded. We found that the IDBACK limit had been reached, which caused these connection failures.

Why it occurred

BMC Image copy jobs looping after conversion to COPY PLUS FOR DB2 V10.1.00 from COPY PLUS FOR DB2 V9.1.00

Before the upgrade, the BMC option module had the parameter MAXTASKS=1, so each job ran single-threaded. The number of subtasks was always 1.

After the upgrade, the BMC option module had the parameter set to MAXTASKS=(1, AUTO), which enables multitasking and allows a job to create more parallel DB2 threads. We could see 6 subtasks being generated by jobs that had previously run as a single thread.


If we do not want multitasking, MAXTASKS can be set to (1, 1).
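The effect of the MAXTASKS change on the DB2 connection count can be illustrated with a rough calculation. This is only a sketch: it assumes each subtask holds its own batch connection that counts toward IDBACK, and the job count and IDBACK value are hypothetical numbers, not values from our subsystem.

```python
def batch_connections(concurrent_jobs: int, subtasks_per_job: int) -> int:
    """Estimate concurrent batch connections, assuming one connection per subtask."""
    return concurrent_jobs * subtasks_per_job


def exceeds_limit(demand: int, idback: int) -> bool:
    """Return True when the estimated demand is above the IDBACK limit."""
    return demand > idback


IDBACK = 50            # hypothetical limit, for illustration only
CONCURRENT_JOBS = 20   # hypothetical number of image copy jobs running together

before = batch_connections(CONCURRENT_JOBS, 1)  # MAXTASKS=1: one subtask per job
after = batch_connections(CONCURRENT_JOBS, 6)   # 6 subtasks, as observed after the upgrade

print("before upgrade:", before, "exceeds IDBACK:", exceeds_limit(before, IDBACK))
print("after upgrade: ", after, "exceeds IDBACK:", exceeds_limit(after, IDBACK))
```

With these assumed numbers the same workload stays under the limit before the upgrade and goes over it afterwards, even though no job or volume changed.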

From BMC Manual

With multitasking, make sure that you have specified unique output data set names across tasks. You can add symbolic variables, such as &TASK or &SEQ, to your output descriptor to make sure that the data sets have unique names.

Multitasking might require changes to the following DB2 DSNZPARMS:

- CTHREAD (maximum users)

- IDFORE (maximum users from TSO)

- IDBACK (maximum number of concurrent attachments from batch)

Pending: Why the BMC image copy jobs looped and consumed a large amount of CPU, instead of failing with a connection failure, is still an open question with BMC.

Solution:

With multitasking we observed that the jobs ran faster, with no difference in CPU usage on a successful run.

We decided to increase the limits for the DB2 DSNZPARMs IDBACK, CTHREAD and IDFORE to prevent these connection failures and the DB2 BMC image copy looping issue.
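One simple way to pick a new limit is to take the observed peak number of concurrent connections and add some headroom. The sketch below is only an illustration of that arithmetic: the function name, the 25% headroom and the peak value are assumptions, and the real values should come from monitoring your own subsystem.

```python
import math


def suggested_limit(peak_connections: int, headroom: float = 0.25) -> int:
    """Return the peak connection count plus a fractional headroom, rounded up."""
    if peak_connections < 0 or headroom < 0:
        raise ValueError("peak_connections and headroom must be non-negative")
    return math.ceil(peak_connections * (1 + headroom))


# Hypothetical peak of 120 concurrent batch connections with 25% headroom.
new_idback = suggested_limit(120, 0.25)
print("suggested IDBACK:", new_idback)
```

The same calculation could be repeated for CTHREAD and IDFORE with their own observed peaks.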



Lessons Learnt:

When changing the MAXTASKS parameter in the BMC option user module, the limits for the DB2 subsystem parameters CTHREAD, IDFORE and IDBACK might need to be increased.

After an upgrade, check for parameter changes in the BMC option user module.
