Troubleshooting Guide
If you are supporting an organization using DB2, you will receive calls
from users to resolve a variety of problems. Your response depends on:
- The severity of the problem
- The specific nature of the problem
- Any related information that you can gather
- Your experience in resolving similar problems
To fix a problem, start by getting a good description of it. With this
description, you can begin to determine where the problem originates. For
example, a problem may exist in any of the following:
- Hardware
- Operating system
- Networking system or other subsystem
- DB2 server
- DB2 client
- DB2 Connect gateway to host systems
Refer to the following sections for information:
Also, use the detailed information provided in the other chapters of Part 1
to help you determine and solve your particular problem.
Most applications run in a client/server environment. You must determine if
a problem is on the client, the server, or somewhere in between (that is, in
the LAN or communication protocol stack).
Investigating where the problem is detected or reported is the best way to
start. For example, if you receive an unexpected SQL code on a client, then
investigate the SQL code on that client. (See "Responding to Unexpected Messages or SQL Codes" for information.)
Often the SQL code alone is enough information to determine the source and
cause of the problem. If the SQL code does not give enough information to
determine the source of a problem, examine the db2diag.log file at the machine
where the problem was reported. For example, if the problem was reported on a
client, first look at the db2diag.log file on that particular client.
The db2diag.log file is an ASCII file written by DB2 that contains
diagnostic information for DB2. If you know the date and time when the problem
occurred, you can go directly to the corresponding db2diag.log entries. For
information on this important file, see "Understanding First Failure Data Capture". When viewing the file, keep in mind that the most recent conditions are
always at the end.
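As a minimal sketch of the two approaches just described (jumping to a known date and time, or reading the most recent entries), the following shell functions wrap grep and tail. The function names diag_at and diag_recent are hypothetical and are not part of DB2. The file path is passed as an argument because the location of db2diag.log depends on how the instance is configured.

```shell
# Hypothetical helpers for browsing a db2diag.log file (not part of DB2).

# Usage: diag_at <file> <timestamp-prefix>
# Prints matching lines with their line numbers, so that you can jump to
# them in an editor or pager.
diag_at() {
    # -F treats the prefix as a fixed string, so dots are not wildcards.
    grep -n -F -- "$2" "$1"
}

# Usage: diag_recent <file> [line-count]
# Prints the end of the file, where the most recent entries are written.
# The default of 50 lines is an arbitrary choice.
diag_recent() {
    tail -n "${2:-50}" "$1"
}
```

For example, diag_at /path/to/db2diag.log "1999-05-11-14.3" would list entries logged between 14:30 and 14:39 on that day, assuming that the entries in your file carry timestamps written in that form. Check a few lines of your own file first to confirm how its timestamps are written.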
When you receive an unexpected message or SQL code, follow these steps
until you can determine the problem:
- When you receive a message, take note of all available information,
including the following:
- The SQL code, an 8-digit alphanumeric message identification
number. Also note all reason codes, return codes, and other information
associated with the message returned.
- Any SQL state received. SQL states are useful for diagnosing problems,
because they are consistent across all platforms. For a list of SQL states,
see the Message Reference.
- The text of the message (especially if the message does not include an
identification number or a code).
- The SQLCA if available.
- Any action suggested in the message.
- Diagnostic files, such as the db2diag.log file. In addition, note any
operating system diagnostic files such as core files (for UNIX-based systems),
event logs (for Windows NT), or SYSLOG files (for OS/2). For information, see Part 2. "Advanced DB2 Troubleshooting".
- The environment in which the message occurred. For example, what the user
was doing at the time, the steps that led up to the problem, the type of
operating system, applications that were running, and the communication
protocol.
- The SQL statement that encountered the error, and any preceding statements
in the unit of work.
- Check the online message help by typing db2 "? message", where message is
the complete SQL code, SQL state, or message number. Read and follow the
suggested actions.
- Use the SQL code or message number to search available DB2 documentation
for additional information.
- If the problem persists, ensure that you have as much of the following
information as possible before contacting DB2 Customer Service:
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
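The online message help step above can be sketched as a small shell wrapper that looks up several identifiers in one call. The function name msg_help is hypothetical, and the sketch assumes that the db2 command line processor is on the PATH of the current user.

```shell
# Hypothetical wrapper (not part of DB2): show the online help for one or
# more SQL codes, SQL states, or message numbers.
msg_help() {
    if [ "$#" -eq 0 ]; then
        echo "usage: msg_help <sqlcode|sqlstate|message-number> ..." >&2
        return 2
    fi
    for msg_id in "$@"; do
        # The double quotes keep the shell from treating "?" as a
        # filename wildcard before db2 sees it.
        db2 "? $msg_id"
    done
}
```

For example, msg_help SQL0805N runs db2 "? SQL0805N". The identifier here is only an illustration; use the complete code that was returned with your message.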
In this book the term abend includes:
- Segmentation violations and general protection faults (GPFs) on Windows
systems
- Traps on OS/2
- Exceptions on UNIX-based systems
When an abend occurs, work through the following steps until you can
determine the problem:
- Confirm that all DB2 components are at the same service level, especially
if a fix pack has recently been installed. See "Updating DB2 Products".
- Note the executable module that reported the abend.
- If the problem persists, try to collect the following additional
information before contacting DB2 Customer Service:
- Any logged information, in particular:
- If the problem can be reproduced, a trace on the client and server may be
helpful. Follow the steps in "Example of Tracing to a File".
See Part 2. "Advanced DB2 Troubleshooting" for information.
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
When the system appears to be suspended or looping, try to identify the
problem by working through the following steps:
- Recover the system:
- If the operating system is suspended (with no sign of disk activity),
reboot the machine and check the db2diag.log file for problems.
- If you can access the operating system but not the application:
- Check the status of applications with the Control Center or the LIST
APPLICATIONS FOR DATABASE database-alias command.
The status information shows whether applications are waiting for a lock or
for user input ("UOW Waiting"), rather than being suspended inside the
database manager.
- Use a CPU monitor to check for applications that are using large amounts
of CPU time, and then use your judgement to determine whether or not the
applications are suspended or behaving as expected.
- Check the db2diag.log file for DB2 problems.
- On UNIX-based environments, work through the following steps until
you can stop your DB2 instance:
- Stop the DB2 instance normally with db2stop
- Stop the DB2 instance and force any remaining applications with
db2stop force
Work through the following steps as a last resort, only if the
above steps did not stop the DB2 instance:
- Abruptly kill the DB2 instance with db2stop -kill
- Use the kill command to terminate any DB2 agents that cannot be
stopped
- Use the kill command to terminate DB2 itself (db2sysc)
- As a very last resort, reboot your entire system
If you must use the kill command, ensure that all DB2
interprocess communications (IPC) resources are removed. Either:
- Using messages, the db2diag.log file, and other information, attempt to
determine why the suspension or loop occurred.
Some common problems that cause suspensions or loops include the
following:
- The operating system has run out of swap space or paging space.
- Applications are waiting on a lock, waiting for a database restart, or
waiting for a response from a remote peer. To see if applications are still
running, check for disk activity:
- If the problem persists, try to collect the following additional
information before contacting DB2 Customer Service:
- If you determine that the problem is not with DB2 but with a
vendor-supplied application, contact the vendor.
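The escalating sequence for stopping a DB2 instance on UNIX-based environments, described in the steps above, can be sketched as a shell function. The function name stop_db2_instance is hypothetical, and the sketch assumes that db2stop returns a nonzero exit status when it fails to stop the instance; verify that on your system before relying on it. It deliberately stops short of the kill command and of rebooting, which should remain manual decisions.

```shell
# Hypothetical helper (not part of DB2): try progressively harsher ways to
# stop the DB2 instance, and stop at the first command that succeeds.
# Run it as the instance owner.
# Assumption: db2stop exits with a nonzero status on failure.
stop_db2_instance() {
    # Step 1: normal stop.
    db2stop && return 0
    echo "db2stop failed; retrying with db2stop force" >&2

    # Step 2: force any remaining applications off the instance.
    db2stop force && return 0
    echo "db2stop force failed; last resort: db2stop -kill" >&2

    # Step 3: last resort before using the kill command by hand.
    db2stop -kill && return 0
    echo "instance still running; manual cleanup is required" >&2
    return 1
}
```

A return status of 1 means that all three commands failed and that the remaining last-resort steps have to be carried out by hand.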