IBM Books

Administration Guide


Index Management

It is important to remember that you do not decide when an index should be used; the database manager makes the decision based on the available table and index information. However, you play an important role in the process by creating the necessary indexes that can improve performance. It is also important for you to collect statistics about the indexes (using the RUNSTATS utility) after you create an index, or change the prefetch quantity (as mentioned above), and on an ongoing basis to keep the statistics up to date. This means you must understand the kinds of indexes that you can create and the ways to create them.

Indexing versus No Indexing

For each table referenced in a database query, if no index exists on the table, then a table scan must be performed on that table. The larger the table, the longer a table scan takes. A table scan occurs when the database manager sequentially accesses every row of a table. This can be compared to an index scan that occurs when the database manager accesses data using an index. (See "Index Scan Concepts".)

An index will be selected for use, if the optimizer estimates that an index scan will be faster than a table scan. Index files generally are smaller and require less time to read than an entire table, particularly as tables grow larger. In addition, the entire index may not need to be scanned. The predicates applied to the index reduce the number of rows to be read from the data pages.

Each index entry consists of a search-key value and a pointer to the row containing that value. The values are arranged in ascending or descending order of the search-key value, which makes it possible to bracket the search, given the right predicates. An index can also be used to obtain rows in an ordered sequence, eliminating the need for the database manager to sort the rows after they are read from the table.
Note:You cannot control whether an index is used by the database manager. For example, the result of a query cannot be guaranteed to be produced in an ordered sequence simply by the existence of an index on the table being queried. The database manager may use this index during the processing of the query but is not required to. Only the existence of an ORDER BY clause can "guarantee" the order of a result set.

Indexes can reduce access time significantly; however, indexes can also have adverse effects on performance. Before creating indexes, consider the effects of multiple indexes on disk space and processing time:

Indexes should be carefully chosen to address the needs of the application program.

To determine whether an index is used in a specific package you may use the SQL Explain facility, described in Chapter 13. "SQL Explain Facility".

Guidelines for Indexing

Which indexes should be created depends on the data and its intended uses. The following guidelines can help you determine which indexes would be most useful:

The following are typical circumstances in which creating an index can improve performance:

Performance Tips for Administering Indexes

The following can help you understand how performance can be impacted by properly using and managing indexes:

  1. Index Creation

    When creating indexes on large tables, and having an SMP machine, consider setting intra_parallel to YES (1) or ANY (-1) to take advantage of parallel performance improvements.

    Multiple processors can be used to scan and sort data. The only time when it is not advantageous to have multiple processors during index creation occurs when the indexsort database configuration parameter is NO. (The default for the parameter is YES). The parameter controls whether sorting of index keys is done during index creation.

  2. Index Table Space

    Indexes may be stored in a different table space from that used to store other table data. This can allow for more efficient use of DASD devices by reducing the movement of read/write heads. You can also create your index table spaces so they will be stored on faster physical devices.

    A table space may also be assigned a separate buffer pool which may protect the index pages from being pushed out of the buffer by the presence of lots of data pages.

    When indexes are not placed in separate table spaces, both data and index pages use the same extent size and prefetch quantity. If you use a different table space for indexes, you have the option of selecting different values for all the characteristics of a table space. Since indexes are typically smaller than tables and are spread over fewer containers, it is common to find smaller extent sizes such as 8 and 16. For more information see, "Index Page Prefetch". Use of faster devices for a table space will be considered by the SQL optimizer, as described in "Table Space Impact on Query Optimization". For more information about table spaces, see "Designing and Choosing Table Spaces".

  3. Degree of Clustering

    If your SQL statement requires ordering (for example, ORDER BY, GROUP BY, DISTINCT) and there is an appropriate index to satisfy the ordering, there may be times that the database manager does not choose the index. This could happen when:

    To improve the effectiveness of the index, use the REORG utility to cluster an index on particular columns. Note, however, that in general a table can only be clustered on one index. Your tables and indexes should be built in the sequence of the clustering index for that table.

  4. RUNSTATS Utility

    After creating a new index, you should use the RUNSTATS utility to collect index statistics. These statistics allow the optimizer to determine whether using the index can improve access performance. See "Collecting Statistics using the RUNSTATS Utility" for more information on this topic.

  5. Reorganizing an Index

    To get the best performance you can from your indexes, you should consider reorganizing your indexes periodically. Updates to your tables may cause index page prefetch to become less effective. To keep the effectiveness of index page prefetch you must reorganize the index.

    You can reorganize the index by either dropping and re-creating the index, or by using the REORG utility. For more information, see "Reorganizing Table Data".

    Dropping and re-creating the index gets a new set of pages that are roughly contiguous and sequential. This improves index page prefetch when it occurs.

    Although more costly to accomplish, the REORG utility also ensures clustering of the data pages. This clustering has greater benefit for index scans accessing a significant number of data pages.

    If you work in a symmetric multi-processor (SMP) system environment, the REORG utility will use multiple processors when intra_parallel is YES or ANY.

  6. Use EXPLAIN

    Periodically, run EXPLAIN on your most frequently used queries and check that each of your indexes is used at least once. If an index is not used in any query, consider dropping that index.

    Also, use EXPLAIN to see if table scans on large tables are processed as the inner of nested loop joins. This would indicate that an index on the join predicate column is either missing or thought to be ineffective at applying the join predicate. Or, perhaps the join predicate is not present.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]