Troubleshooting Guide
This section describes how to deal with some frequently
encountered problems faced by users working with a DB2 server. It addresses
the following topics:
For additional troubleshooting information on this topic, see:
Important: This section represents a small sampling of the
information available from DB2 Customer Service. For a complete and
up-to-date source of DB2 information, use the DB2 Product and
Service Technical Library at
http://www.software.ibm.com/data/db2/library/.
If you experience problems migrating to Version 5, ensure that:
- [ ]
- You followed the instructions in the Quick
Beginnings guide.
- [ ]
- You checked the list of incompatibilities from previous versions. (See the
Administration Guide.)
- [ ]
- The database is consistent.
To verify that databases are consistent so that they can be migrated, use
the db2ckmig command. (See the Quick
Beginnings book for details.)
You must correct all errors that are reported for databases that are in any
of the following states:
- Backup pending: Perform a backup of the database.
- Rollforward pending: Recover the database as required by performing
or resuming the db2 ROLLFORWARD DATABASE command.
- Database inconsistent: Restart the database to make it consistent.
Additionally, this tool identifies databases that contain objects that use
SYSCAT, SYSSTAT, or SYSFUN as schema names. These objects must be dropped and
re-created using a different schema name.
- [ ]
- The disk space on the machine is adequate. (The amount of disk space
required varies with each database.)
- [ ]
- You use appropriate values for the number of primary and secondary log
files (logprimary and logsecond in database
configuration), and for the size of the log files (logfilsiz in
database configuration). See "Server Fails" for details.
- [ ]
- If migrating from a Version 1 DB2 for OS/2 database, you use the
db2resdb command.
- Symptom
- The DB2 server does not function correctly.
- Action
- Ensure that:
- [ ]
- You check the db2diag.log file on the server to diagnose a cause. For
information, see "Interpreting the db2diag.log File".
- [ ]
- DB2COMM is set to the protocols required by connecting clients.
(If not, the message SQL5043N is usually received.)
Consult the db2diag.log file to see which communication protocols were not
started successfully, and why. For information, see "Using the db2diag.log file to diagnose server communication problems".
- [ ]
- There is enough log space. (If not, the SQL0984C message is typically
received.)
To increase log space:
- Increase the number of primary and secondary log files
(logprimary and logsecond in database configuration)
-
Increase the size of the log files (
logfilsiz in database
configuration)
To initiate your changes, you must disconnect all applications from the
database and/or deactivate the database.
You must ensure that the file system or disk space that contains the logs
is big enough to hold all of the logs. To calculate the minimum required space
for initializing the logs, use the following formula:
(logprimary + logsecondary) * logfilsiz * 4096
- [ ]
- You are using the latest version of DB2, with appropriate fix packs
installed. For information, see "Updating DB2 Products".
- [ ]
- The index files are valid.
The db2diag.log file indicates if indexes need to be re-created. Use
db2dart /MI to mark the indexes as inconsistent. See "Miscellaneous Troubleshooting Tools" for details on db2dart /MI.
- [ ]
- The data is consistent.
Look for the following symptoms:
- There is a DIA3700C error in the db2diag.log file.
- The server goes down when particular data is accessed. This data resource
may be damaged.
- The server fails repeatedly.
Run db2dart to verify that the data is consistent. If data is
not consistent, contact DB2 Customer Service.
For Extended Enterprise Edition, see also "Database Manager Won't Start or Hangs".
- Symptom
- You have problems installing or administering a database server.
- Action
- Ensure that:
- [ ]
- You are using a valid user name and password that have administrator
authority.
- For OS/2 Warp Connect and later versions of OS/2: The default DB2
administrator user name and password are the same as those specified when the
operating system was installed.
- For versions of OS/2 before OS/2 Warp Connect: The default
administrator user name and password depend on the national language installed
on a machine, and on whether UPM was installed with a particular user name. In
most cases, the defaults are USERID and PASSWORD in English countries. For
more information, see the Quick Beginnings for your platform.
- For Windows NT: Your membership in a group controls which actions
you are permitted to perform. For DB2 for Windows NT, you must belong to the
"Administrators" or "Domain Admins." groups to perform installation and some
administration tasks.
You can find out what your group is by using the User Manager tool to
display user names and group memberships. To invoke this tool, select
Start, then Programs, then Administrative
Tools, and then User Manager.
To install DB2, you must have administrator authority on the local machine
with a user name that is eight characters or less (note that you cannot use
the user name "Administrator"). To be added to the administrator group, you
must be added by an existing member of that group.
- For UNIX-based environments: You must belong to a SYSADM
group. The name of the group must be eight characters or less in length.
-
For DB2 for Solaris: If a DB2 instance is started by a user who does
not belong to the same group as the DB2 instance owner, the SQL1042C message
is received for many commands. In this situation, the executing db2sysc
process inherits the user's group name and does not have the correct read
permission to open a file in the /proc directory.
You must start the instance as the instance owner, and the id/group of the
db2sysc process attempting to read a file must be the same as the id/group of
the DB2 instance.
- [ ]
- For UNIX-based environments other than AIX, you update and rebuild
the kernel configuration parameters. (If not, the SQL1016N or SQL1018N
messages will typically occur when you try to create an instance or use the
command line processor.) See the Quick Beginnings guide for your platform for details.
It is important to have a good backup and recovery strategy, and a plan to
implement this strategy. This plan should address questions such as the
following:
- How crucial is the data?
- How long can the data be unavailable without impacting users?
- How much time and resource can be spent on restoring the database?
- Is the data read-only, or is it updated?
- Can the data be rebuilt easily from another source?
- How much resource is allocated to a backup and recovery system?
Backup plans are heavily influenced by the importance of having the data in
the database available for use. If your business depends heavily on the data,
then you must develop a backup and recovery plan that will minimize downtime
and ensure that the data is available when needed at a cost that you are
willing to accept.
For help on backup and recovery see the Administration
Guide and use the Backup Database SmartGuide from the Control Center.
- Symptom
- The SQL0902C message is received when a database is backed up.
- Possible Cause
- The database is not architecturally consistent.
- Action
- Use the db2dart command to verify the architectural consistency
of the database. See "Miscellaneous Troubleshooting Tools" for information on this command.
Usually you can perform a restore with a roll forward to the end of the
logs to correct the problem. If the problem persists, contact DB2 Customer
Service.
- Symptom
- Data cannot be restored.
- Action
- Ensure that:
- [ ]
- You have enough disk space to restore the data.
Use redirected restore to specify a modified list of containers for the
table spaces being restored. See the Administration
Guide.
- [ ]
- You specify the correct path to the backup image and logs. (It is possible
that they were moved.)
- [ ]
- If an online backup was taken, all logs from the beginning to the end of
the backup are required to restore and roll forward the database.
(The rollforward must proceed until at least the point where the online
backup ended. Otherwise, the database is not accessible.)
- Symptom
- Data will not load on a database server.
- Action
- Ensure that:
- [ ]
- You used the RESTART or REPLACE parameters with the LOAD command. (If not,
the SQL3805N message is typically received.)
For more information on the LOAD command, see the Command
Reference.
- [ ]
-
The size of a database object in an SMS table space has not reached an
operating system limit.
For example, OS/2, Windows NT, and Version 3.2 of AIX have a 2 GB file
size limit.
(Refer to "Problems with Splitting and Loading Data" for more information on loading data in Extended Enterprise Edition.)
- Symptom
- You are having problems importing data.
- Action
-
- [ ]
-
Ensure that there is enough log space available. (See
"Server Fails".)
- [ ]
- Use the COMMITCOUNT n option to commit data after every
n records are imported. This option protects the committed data from
loss in case of a failure. It also reduces the log requirements for an import
operation, allowing a number of smaller transactions rather than one long
transaction.
- Symptom
- You cannot use commands, utilities, or the Command Line Processor.
- Action
- Ensure that:
- [ ]
- You bound the utilities and your applications to the database after you
installed a fix or newer executable.
(The SQL0818N or SQL0805N messages indicate that you need to rebind.)
When you precompile an SQL application, a compilable file is produced, and
optionally a bind file. Both of these files have a new timestamp. The default
behavior of the precompile operation is to automatically create a package so
that you do not need to bind anything. However, if you create a bind file but
not a package, then you must bind the new bind file to the database.
For information on binding, see your Quick Beginnings guide.
- [ ]
-
You used the correct syntax (especially for UNIX-based systems).
It is recommended that you either:
- Use double quotation marks (" ") around a command line processor request.
- Precede any special characters such as asterisks (*), brackets, or
question marks (?) with a back slash (\) character to ensure that the command
line processor interprets them correctly.
Example
For:
db2 SELECT * FROM SYSCAT.TABLES
Use either:
db2 "SELECT * FROM SYSCAT.TABLES"
db2 SELECT \* FROM SYSCAT.TABLES
The database manager provides concurrency control and prevents uncontrolled
access by means of locks. Because of the basic principles of DB2 locking, you
do not need to take action to control locks in most cases. For information on
how locking works, see the Administration Guide.
Use the following methods to investigate whether or not
applications are creating deadlocks or holding locks:
- Deadlocks occur when two or more applications connected to the
same database wait indefinitely for a resource. The waiting is never resolved
because each application is holding a resource that the other needs to
continue.
Use an event monitor for deadlocks on a database to keep a trace of each
deadlock as it occurs.
- To determine if an application is waiting for a lock, use the
snapshot monitor and see if lock_wait_time is high. A high value indicates
that an application is waiting for a lock that another application is holding.
It indicates that applications may not be committing their transactions
frequently enough.
To see which locks are causing waits, and which applications are holding
them, set the LOCK monitor switch ON and then get an application snapshot.
For information, see "The Database System Monitor" and the System Monitor Guide and Reference.
- Symptom
- Applications accessing a DB2 database are slow or appear to hang.
- Possible Cause
- There is lock contention or lock escalation.
- Action
- Use the Control Center or the LIST APPLICATIONS FOR DATABASE
database-alias SHOW DETAIL command to find out if applications
are waiting on a lock and who is holding the lock that the applications are
waiting for. Then use the database monitor to determine if lock escalation is
occurring.
If escalation is occurring, ensure that:
- [ ]
- Applications use an appropriate frequency of commits.
Take a monitor snapshot for locks to determine which applications are
causing other applications to wait for locks. Also, monitor the
db2.lock_waits performance variable.
- [ ]
- The values of the database configuration parameters maxlocks
and locklist, which determine the number of locks allowed, are
appropriate.
- [ ]
- The locktimeout database configuration parameter is set
appropriately. When it is turned on, this parameter can help avoid hangs due
to lock contention, and can tell you if lock contention is your problem.
- [ ]
- The dlchktime database configuration parameter is set
appropriately. This parameter defines the frequency at which the database
manager checks for deadlocks among all the applications connected to a
database. If it is set high, CPU time is saved but deadlocks may not be
detected soon enough.
- [ ]
- Other users are not preparing and binding applications.
The procedure of preparing and binding applications requires that locks be
obtained on system catalog tables. This procedure should be performed in
off-peak hours.
- [ ]
- Lock waits or deadlocks are not caused by next key locking.
Next key locking guarantees a cursor stability (CS) isolation
level by automatically locking the next key for all INSERT and DELETE
statements and the next higher key value above the result set for SELECT
statements. It is required to guarantee ANSII and SQL 92 standard CS, and is
the DB2 default.
Examine explain information for the application. If the problem appears to
be with next key locking, you can change the isolation level to Read Stability
(RS) by setting the DB2_RR_TO_RS flag on.
The DB2_RR_TO_RS flag stops all next key locking on user tables
(catalog tables are not affected). Any packages bound with CS are
automatically downgraded to RS, because DB2 can no longer guarantee CS. Do not
use this flag if you require ANSII and SQL standard CS,
For information on locking, configuration parameters, and explain
information, see the Administration Guide.
- Symptom
- The SQL0911N message is received.
- Possible Cause
- There are deadlocks or timeouts.
- Action
- Check the reason code in the SQLCA to determine if deadlocks or timeouts
are causing the problem (see "Interpreting an SQLCA Structure").
In the case of deadlocks, create an event monitor for deadlocks to gather
more information. Possibly increase the size of the lock list to avoid lock
escalation which can often result in a deadlock scenario.
Note that timeouts are not captured using the deadlock event monitor.
User authentication can cause problems for Windows NT users because of the
way the operating system authenticates. This section describes some
considerations for user authentication under DB2 for Windows NT:
- User names and group names are limited to 8 characters within DB2.
Therefore the default administrator user name on Windows NT ("ADMINISTRATOR")
does not work when installing or accessing DB2 for Windows NT.
- User names under Windows NT are not case sensitive; however, passwords are
case sensitive.
- User names and group names can be a combination of upper- and lowercase
characters. However, they are usually converted to uppercase when used within
DB2. For example, if you connect to the database and create the table
schema1.table1, this table is stored as SCHEMA1.TABLE1 within the database.
(If you wish to use lowercase object names, issue commands from the command
line processor, enclosing the object names in quotation marks, or use
third-party ODBC front-end tools.)
In DB2 Universal Database we have integrated the authentication of user
names and passwords into the DB2 System Controller. The Security Service is
only required when connecting a client to a server that is configured for
authentication CLIENT.
In a Windows NT LAN environment a user can be authenticated at either a
primary or backup domain controller. This feature is very important in large
distributed LANs with one central primary domain controller and one or more
backup domain controllers at each site. Users can then be authenticated on the
backup domain controller at their site instead of requiring a call to the
primary domain controller for authentication.
The advantage of having a backup domain controller (BDC), in this case, is
that users are authenticated faster and the LAN is not as congested as it
would have been had there been no BDC.
For Version 5, authentication can occur at a backup domain controller under
the following conditions:
- The DB2 for Windows NT server is installed on the backup domain controller
- The DB2DMNBCKCTLR profile variable is set appropriately
If the DB2DMNBCKCTLR profile variable is not set or is set to blank, DB2
for Windows NT performs authentication at the primary domain controller.
If the DB2DMNBCKCTLR profile variable is set to a question mark
(DB2DMNBCKCTLR=?) then DB2 for Windows NT will perform its authentication on
the backup domain controller under the following conditions:
- The Cached Primary Domain under the registry editor is the domain at which
the machine is located. (You can find this setting under
HKEY_LOCAL_MACHINE -> Software -> Microsoft
-> Windows NT -> Current Version ->
WinLogon.)
- The Server Manager shows the backup domain controller as active and
available. (That is, the icon for this machine is not greyed out.)
- The registry for the DB2 for Windows NT server indicates that the system
is a backup domain controller on the specified domain.
Under normal circumstances the setting DB2DMNBCKCTLR=? will work; however,
it will not work in all environments. The information supplied about the
servers on the domain is dynamic, and Computer Browser must be running to keep
this information accurate and current. Large LANs may not be running Computer
Browser and therefore Server Manager's information may not be current. In
this case, there is a second method to tell DB2 for Windows NT to authenticate
at the backup domain controller: set DB2DMNBCKCTLR=xxx where
xxx is the machine name of the backup domain controller and DB2 for
Windows NT server. With this setting, authentication will occur on the backup
domain controller under the following conditions:
- The Cached Primary Domain under the registry editor is the domain at which
the machine is located. (You can find this setting under
HKEY_LOCAL_MACHINE -> Software -> Microsoft
-> Windows NT -> Current Version ->
WinLogon.)
- The machine is configured as a backup domain controller for the specified
domain. (If the machine is set up as a backup domain controller but for
another domain, this setting will result in an error.)
DB2 for Windows NT supports the following types of groups:
- Local groups
- Global groups
- Global groups as members of local groups.
DB2 for Windows NT enumerates the local and global groups that the user is
a member of, using the security database where the user was found. DB2
Universal Database provides an override that forces group enumeration to occur
on the local Windows NT server where DB2 is installed, regardless of where the
user account was found. This override can be achieved using the following
commands:
- For global settings: db2set -g db2_grp_lookup=local
- For instance settings: db2set -i db2_grp_lookup=local
To view all DB2 profile variables that are set, type db2set
-all.
For DB2 for Windows NT to work with domain security, you must grant
authority and privileges to a local group. User names within the local and
global groups MUST be defined on the same domain as the local or global group
in order to be authenticated correctly.
If the db2_grp_lookup profile variable is set to local, then DB2
tries to find a user on the local machine only. If the user is not found on
the local machine, or is not defined as a member of a local or global group,
then authentication fails. DB2 does not try to find the user on
another machine in the domain or on the domain controllers.
If the db2_grp_lookup profile variable is not set then:
- DB2 first tries to find the user on the same machine.
- If the user name is defined locally, the user is authenticated locally.
- If the user is not found locally, DB2 attempts to find the user name on
its domain, and then on trusted domains.
The following examples illustrate how DB2 for Windows NT can support domain
security. In the first example, the connection works because the user name and
local group are on the same domain. In the second example, the connection does
not work because the user name and local or global group are on different
domains.
The connection works in the following scenario because the user name
and local or global group are on the same domain.
Note that the user name and local or global group do not need to be defined
on the domain where the database server is running, but they must be on the
same domain as each other.
Domain1
| Domain2
|
A trust relationship exists with Domain2.
|
- A trust relationship exists with Domain1.
- The local or global group grp2 is defined.
- The user name id2 is defined.
- The user name id2 is part of grp2.
|
The DB2 server runs in this domain. The following DB2 commands are issued
from it:
REVOKE CONNECT ON db FROM public
GRANT CONNECT ON db TO GROUP grp2
CONNECT TO db USER id2
|
|
The local or global domain is scanned but id2 is not found. Domain
security is scanned.
|
|
| The user name id2 is found on this domain. DB2 gets additional
information about this user name (that is, it is part of the group grp2).
|
The connection works because the user name and local or global group are
on the same domain.
|
|
The connection does not work in the following scenario because the user
name is defined on a different domain than where the local or global group is
defined.
Domain1
| Domain2
|
A trust relationship exists with Domain2.
|
- A trust relationship exists with Domain1.
- The local or global group grp2 is defined.
|
- The global group grp1 is defined.
- The user name id1 is defined.
- The user name id1 is part of grp1.
|
|
| Domain1\grp1 is part of grp2.
|
The DB2 server runs on this domain. The following DB2 commands are issued
from it:
REVOKE CONNECT ON db FROM public
GRANT CONNECT ON db TO GROUP grp2
CONNECT TO db USER id1
|
|
The local or global domain is scanned and id1 is found. DB2 gets
information for this user name (that is, the user name id1 is part of grp1 and
the group grp1 is part of Domain2\grp2).
|
|
| The group grp2 exists on this domain.
|
The connection does not work because the local or global group is on
Domain2 and the actual user name is defined on Domain1.
The connection would work if the following command was issued
instead: GRANT CONNECT ON db TO GROUP grp1
|
|
Database performance is a complicated subject beyond the scope of this
guide. As a starting point, be aware of the following:
- The Performance Configuration SmartGuide provides a first attempt at
optimizing a database. It asks you qualitative questions about the database,
its data, and its purpose, and then suggests optimal configuration parameters.
To invoke this SmartGuide, click with the right mouse button on the
database's icon in the Control Center, and then select Configure
performance from the pop-up menu.
- The reorgchk command returns information about the physical
characteristics of a table, and whether or not it would be beneficial to
reorganize that table.
-
The
runstats command updates catalog statistics, thereby
improving the potential for query performance. It is accessible from the
command line or from the Control Center. Rebind all static applications after
running it.
- To understand the access plan chosen for a particular SQL statement, use
Visual Explain from the Control Center. This tool accesses the integrated
Explain facility, a component of the SQL compiler that captures information
about the environment in which static and dynamic SQL statements are compiled.
This information can help you:
- Understand why a query is executing in a particular way
- Design application programs and databases
- Determine when an application should be rebound
Visual Explain provides an accurate, complete, and easy-to-understand
picture of access plans. Alternatively, the following commands are available
from the misc subdirectory of the sqllib directory:
- db2expln describes the access plan selected for static SQL
statements in the packages stored in the system catalog tables. It can be used
to obtain a quick explanation of the chosen access plan for packages for which
explain data was not captured at bind time.
- db2fmt provides a textual description of the explain tables and
the operators chosen for access plans of static SQL.
- dynexpln converts a dynamic statement into a static statement
and runs db2expln against it. This is a quick way to see the basic
access plan for a dynamic statement, but is not necessarily accurate.
For information on the Explain facility and on db2expln, see the
Administration Guide.
- Configuration parameters affect performance. For example,
sheapthres, the database manager configuration parameter for sort
heap threshold, can affect performance if it is set too low. It should also be
at least twice as big as sortheap, the database configuration
parameter for sort heap. For information, see the Administration Guide.
- Locking and how it is handled may affect performance of applications. See "Applications slow or appear to hang".
- The various monitoring tools available for your operating system will help
you identify performance problems with your DB2 server and with your network
connections:
- For OS/2, use SPM/2
- For Windows NT, use the Windows NT performance monitor
- For UNIX-based environments, use the various monitoring commands
outlined in "Performance Monitoring Tools".
For more information on performance:
- Use the keyword "performance" from the DB2 Product and Service Technical
Library at http://www.software.ibm.com/data/db2/library/.
- See the Administration Getting Started guide, the Administration Guide, and the System Monitor Guide and Reference.
[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]
[ DB2 List of Books |
Search the DB2 Books ]