
Administration Guide


Creating a Database

Creating a database sets up all the system catalog tables that are needed by the database and allocates the database recovery log. The database configuration file is created, and the default values are set. The database manager will also bind the database utilities to the database.

The following database privileges are automatically granted to PUBLIC: CREATETAB, BINDADD, CONNECT, and IMPLICIT_SCHEMA. SELECT privilege on the system catalog views is also granted to PUBLIC.

The following command line processor command creates a database called personl, in the default location, with the associated comment "Personnel DB for BSchiefer Co".

   create database personl
      with "Personnel DB for BSchiefer Co"

The tasks carried out by the database manager when you create a database are discussed in the following sections:

   Definition of Initial Nodegroups
   Definition of Initial Table Spaces
   Definition of System Catalog Tables
   Definition of Database Directories
   Definition of Database Recovery Log
   Binding Utilities to the Database

For additional information related to the physical implementation of your database, see Chapter 2. "Designing Your Physical Database".

If you wish to create a database in a different, possibly remote, database manager instance, see "Using Multiple Instances of the Database Manager". This topic also provides an introduction to the command you need to use if you want to perform any instance-level administration against an instance other than your default instance, including remote instances.
Note: See the Command Reference for information about the default database location and about specifying a different location with the CREATE DATABASE command.

Definition of Initial Nodegroups

When a database is initially created, database partitions are created for all partitions specified in the db2nodes.cfg file. Other partitions can be added or removed with the ADD NODE and DROP NODE commands.
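For example, on a UNIX-based system a db2nodes.cfg file describing four database partitions across two machines might look like the following sketch (the host names are hypothetical; each line gives a partition number, a host name, and a logical port):

   0 hosta 0
   1 hosta 1
   2 hostb 0
   3 hostb 1

A database created under this configuration would have a database partition created on each of these four nodes.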

Three nodegroups are defined:

   IBMDEFAULTGROUP, the default nodegroup for user table spaces, consisting of all database partitions
   IBMCATGROUP, the nodegroup for the catalog table space, consisting of the database partition on which the CREATE DATABASE command was issued
   IBMTEMPGROUP, the nodegroup for temporary table spaces, consisting of all database partitions

Definition of Initial Table Spaces

When a database is initially created, three table spaces are defined:

   SYSCATSPACE, for the system catalog tables
   TEMPSPACE1, for temporary tables
   USERSPACE1, for user-defined tables

If you do not specify any table space parameters with the CREATE DATABASE command, the database manager will create these table spaces using system managed storage (SMS) directory containers. These directory containers will be created in the subdirectory created for the database (see "Database Physical Directories"). The extent size for these table spaces will be set to the default.

If you do not want to use the default definition for these table spaces, you may specify their characteristics on the CREATE DATABASE command. For example, the following command could be used to create your database on OS/2:

   CREATE DATABASE PERSONL
     CATALOG TABLESPACE
       MANAGED BY SYSTEM USING ('d:\pcatalog','e:\pcatalog')
       EXTENTSIZE 16 PREFETCHSIZE 32
     USER TABLESPACE
       MANAGED BY DATABASE USING (FILE'd:\db2data\personl' 5000,
                                   FILE'e:\db2data\personl' 5000)
       EXTENTSIZE 32 PREFETCHSIZE 64
     TEMPORARY TABLESPACE
       MANAGED BY SYSTEM USING ('f:\db2temp\personl')
     WITH "Personnel DB for BSchiefer Co"

In this example, the definition for each of the initial table spaces is explicitly provided. You only need to specify the table space definitions for those table spaces for which you do not want to use the default definition.

The coding of the MANAGED BY phrase on the CREATE DATABASE command follows the same format as the MANAGED BY phrase on the CREATE TABLESPACE command. For additional examples, see "Creating a Table Space".

Before creating your database, see "Designing and Choosing Table Spaces".

Definition of System Catalog Tables

A set of system catalog tables is created and maintained for each database. These tables contain information about the definitions of the database objects (for example, tables, views, indexes, and packages), and security information about the type of access users have to these objects. These tables are stored in the SYSCATSPACE table space.

These tables are updated during the operation of a database; for example, when a table is created. You cannot explicitly create or drop these tables, but you can query and view their content. When the database is created, in addition to the system catalog table objects, the following database objects are defined in the system catalog:

   A set of routines (functions) in the SYSIBM and SYSFUN schemas
   A set of read-only views for the system catalog tables, in the SYSCAT schema
   A set of updatable catalog views, in the SYSSTAT schema

After your database has been created, you may wish to limit the access to the system catalog views, as described in "Securing the System Catalog Views".

Definition of Database Directories

Three directories are used when a new database is set up: the local database directory, the system database directory, and the node directory.

Local Database Directory

A local database directory file exists in each path (or drive on other platforms) in which a database has been defined. This directory contains one entry for each database accessible from that location. Each entry contains:

To see the contents of this file for a particular database, issue the following command, where location specifies the location of the database:

   LIST DATABASE DIRECTORY ON location

System Database Directory

A system database directory file exists for each instance of the database manager, and contains one entry for each database that has been cataloged for this instance. Databases are implicitly cataloged when the CREATE DATABASE command is issued and can also be explicitly cataloged with the CATALOG DATABASE command. For information about cataloging databases, see "Cataloging a Database".

For each database created, an entry is added to the directory containing the following information:

To see the contents of this file, issue the LIST DATABASE DIRECTORY command without specifying the location of the database directory file.

In a partitioned database environment, you must ensure that all database partitions always access the same system database directory file, sqldbdir, in the sqldbdir subdirectory of the home directory for the instance. Unpredictable errors can occur if either the system database directory or the system intention file sqldbins in the same sqldbdir subdirectory is a symbolic link to another file that is on a shared file system. These files are described in "Enabling Data Partitioning".

Node Directory

The database manager creates the node directory when the first database partition is cataloged. To catalog a database partition, use the CATALOG NODE command. To list the contents of the local node directory, use the LIST NODE DIRECTORY command. The node directory is created and maintained on each database client. The directory contains an entry for each remote workstation having one or more databases that the client can access. The DB2 client uses the communication end point information in the node directory whenever a database connection or instance attachment is requested.

The entries in the directory also contain information about the type of communication protocol used to communicate from the client to the remote database partition. Cataloging a local database partition creates an alias for an instance that resides on the same machine. A local node should be cataloged when there is more than one instance on the same workstation to be accessed from the user's client.
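For example, the following command line processor command catalogs a node entry for a remote server that is accessed using TCP/IP. This is a sketch; the node name, host name, and service name shown are hypothetical, not values from this manual:

   catalog tcpip node mynode remote svrhost server db2insvc

Databases residing on that server can then be cataloged at that node with the CATALOG DATABASE command, as described in "Cataloging a Database".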

Definition of Database Recovery Log

A database recovery log keeps a record of all changes made to a database, including the addition of new tables or updates to existing ones. This log is made up of a number of log extents, each contained in a separate file called a log file.

The database recovery log can be used to ensure that a failure (for example, a system power outage or application error) does not leave the database in an inconsistent state. In case of a failure, the changes already made but not committed are rolled back, and all committed transactions, which may not have been physically written to disk, are redone. These actions ensure the integrity of the database.

For more information, see Chapter 6. "Recovering a Database".

Binding Utilities to the Database

When a database is created, the database manager attempts to bind the utilities in db2ubind.lst to the database. This file is stored in the bnd subdirectory of your sqllib directory.

Binding a utility creates a package, which is an object that includes all the information needed to process specific SQL statements from a single source file.
Note: If you wish to use these utilities from a client, you must bind them explicitly. See the Quick Beginnings manual appropriate to your platform for information.

If for some reason you need to bind or rebind the utilities to a database, issue the following commands using the command line processor:

   connect to sample
   bind @db2ubind.lst
Note: You must be in the directory where these files reside to create the packages in the sample database. The bind files are found in the BND subdirectory of the SQLLIB directory. In this example, sample is the name of the database.

Cataloging a Database

When you create a new database, it is automatically cataloged in the system database directory file. You may also use the CATALOG DATABASE command to explicitly catalog a database in the system database directory file. The CATALOG DATABASE command allows you to catalog a database with a different alias name, or to catalog a database entry that was previously deleted using the UNCATALOG DATABASE command.

The following command line processor command catalogs the personl database as humanres:

   catalog database personl as humanres
      with "Human Resources Database"

Here, the system database directory entry will have humanres as the database alias, which is different from the database name (personl).

You can also catalog a database on an instance other than the default. In the following example, connections to database B are to INSTANCE_C.

   catalog database b as b at node instance_c
Note: The CATALOG DATABASE command is also used on client nodes to catalog databases that reside on database server machines. For more information, see the Quick Beginnings manual appropriate to your platform.

For information on the Distributed Computing Environment (DCE) cell directory, see "DCE Directory Services" and Appendix F. "Using Distributed Computing Environment (DCE) Directory Services".
Note: To improve performance, you may cache directory files, including the database directory, in memory. (See "Directory Cache Support (dir_cache)" for information about enabling directory caching.) When directory caching is enabled, a change made to a directory (for example, using a CATALOG DATABASE or UNCATALOG DATABASE command) by another application may not become effective until your application is restarted. To refresh the directory cache used by a command line processor session, issue a db2 terminate command.

In addition to the application level cache, a database manager level cache is also used for internal, database manager look-up. To refresh this "shared" cache, issue the db2stop and db2start commands.
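For example, after another application catalogs or uncatalogs a database, the following command sequence refreshes first the command line processor cache and then the database manager level cache (assuming you have the authority to stop and start the instance, and that no applications remain connected):

   db2 terminate
   db2stop
   db2start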

For more information about directory caching, see "Directory Cache Support (dir_cache)".

DCE Directory Services

DCE is an Open Software Foundation (OSF) architecture that provides tools and services to support the creation, use, and maintenance of applications in a distributed heterogeneous computing environment. It is a layer between the operating system, the network, and a distributed application that allows client applications to access remote servers.

With local directories, the physical location of the target database is individually stored on each client workstation in the database directory and node directory. The database administrator can therefore spend a large amount of time updating and changing these directories. The DCE directory services provide a central directory alternative to the local directories. They allow information about a database or a database manager instance to be recorded once in a central location, with any changes or updates made at that one location.

DCE is not a prerequisite for running DB2, but if you are operating in a DCE environment, see Appendix F. "Using Distributed Computing Environment (DCE) Directory Services" for more information.

Creating Nodegroups

You create a nodegroup with the CREATE NODEGROUP statement. This statement specifies the set of nodes on which the table space containers and table data are to reside. This statement also:

Assume that you want to load some tables on a subset of the database partitions in your database. You would use the following statement to create a nodegroup of two nodes (1 and 2) in a database consisting of at least three nodes (numbered 0 to 2):

  CREATE NODEGROUP mixng12 ON NODES (1,2)

For more information about creating nodegroups, see the SQL Reference manual.

The CREATE DATABASE command or sqlecrea() API also creates the default system nodegroups IBMDEFAULTGROUP, IBMCATGROUP, and IBMTEMPGROUP. (See "Designing and Choosing Table Spaces" for information.)

Creating a Table Space

Creating a table space within a database assigns containers to the table space and records its definitions and attributes in the database system catalog. You can then create tables within this table space.

The syntax of the CREATE TABLESPACE statement is discussed in detail in the SQL Reference manual. For information on SMS and DMS table spaces, see "Designing and Choosing Table Spaces".

The following SQL statement creates an SMS table space on OS/2 or Windows NT using three directories on three separate drives:

   CREATE TABLESPACE RESOURCE
      MANAGED BY SYSTEM
      USING ('d:\acc_tbsp', 'e:\acc_tbsp', 'f:\acc_tbsp')

The following SQL statement creates a DMS table space on OS/2 using two file containers each with 5,000 pages:

   CREATE TABLESPACE RESOURCE
      MANAGED BY DATABASE
      USING (FILE'd:\db2data\acc_tbsp' 5000,
             FILE'e:\db2data\acc_tbsp' 5000)

In the above two examples, explicit names have been provided for the containers. You may also specify relative container names, in which case, the container will be created in the subdirectory created for the database (see "Database Physical Directories").

In addition, if part of the path name specified does not exist, the database manager will create it. If a subdirectory is created by the database manager, it may also be deleted by the database manager when the table space is dropped.
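For example, the following statement is a sketch of an SMS table space defined with a relative container name; the acc_tbsp subdirectory would be created under the subdirectory created for the database:

   CREATE TABLESPACE RESOURCE
      MANAGED BY SYSTEM
      USING ('acc_tbsp')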

The assumption in the above examples is that the table spaces are not associated with a specific nodegroup. The default nodegroup IBMDEFAULTGROUP is used when the following parameter is not specified in the statement:

   IN nodegroup

The following SQL statement creates a DMS table space on a UNIX-based system using three logical volumes of 10 000 pages each, and specifies their I/O characteristics:

   CREATE TABLESPACE RESOURCE
      MANAGED BY DATABASE
      USING (DEVICE '/dev/rdblv6' 10000,
             DEVICE '/dev/rdblv7' 10000,
             DEVICE '/dev/rdblv8' 10000)
      OVERHEAD 24.1
      TRANSFERRATE 0.9

The UNIX devices named in this SQL statement must already exist and must be writable by the instance owner and the SYSADM group.

The following example creates a DMS table space on a nodegroup called ODDNODEGROUP in a UNIX partitioned database. ODDNODEGROUP must have been previously created with a CREATE NODEGROUP statement, and is assumed here to consist of database partitions numbered 1, 3, and 5. On each of these partitions, the device /dev/hdisk0 is used for 10 000 4K pages, and an additional partition-specific device is declared for 40 000 4K pages.

   CREATE TABLESPACE PLANS
      MANAGED BY DATABASE
      USING (DEVICE '/dev/hdisk0' 10000, DEVICE '/dev/n1hd01' 40000) ON NODE 1
            (DEVICE '/dev/hdisk0' 10000, DEVICE '/dev/n3hd03' 40000) ON NODE 3
            (DEVICE '/dev/hdisk0' 10000, DEVICE '/dev/n5hd05' 40000) ON NODE 5

UNIX devices are classified into two categories: character serial devices and block-structured devices. For all file-system devices, it is normal to have a corresponding character serial device (or raw device) for each block device (or cooked device). The block-structured devices are typically designated by names similar to "hd0" or "fd0". The character serial devices are typically designated by names similar to "rhd0", "rfd0", or "rmt0". These character serial devices have faster access than block devices. The character serial device names, not the block device names, should be used in the CREATE TABLESPACE statement.

The overhead and transfer rate help to determine the best access path to use when the SQL statement is compiled. For information on the OVERHEAD and TRANSFERRATE parameters, see Part 3. "Tuning Application Performance".

DB2 can greatly improve the performance of sequential I/O using the sequential prefetch facility, which uses parallel I/O. See "Understanding Sequential Prefetching" for details on this facility.

The ALTER TABLESPACE SQL statement can be used to add a container to a DMS table space and modify the PREFETCHSIZE, OVERHEAD, and TRANSFERRATE settings for a table space. The transaction issuing the table space statement should be committed as soon as possible, to prevent system catalog contention.
Note: The PREFETCHSIZE should be a multiple of the EXTENTSIZE. For example, if the EXTENTSIZE is 10, the PREFETCHSIZE should be 20 or 30. For more information, see "Understanding Sequential Prefetching".
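For example, the following statement is a sketch that modifies the RESOURCE table space created earlier; the values shown are illustrative only:

   ALTER TABLESPACE RESOURCE
      PREFETCHSIZE 64
      OVERHEAD 24.1
      TRANSFERRATE 0.9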

Creating Table Spaces in Nodegroups

When a table space is placed in a multiple database partition nodegroup, all of the tables within the table space are divided, or partitioned, across each database partition in the nodegroup. A table space is created in a nodegroup and must remain there; it cannot be changed to another nodegroup. The CREATE TABLESPACE statement is used to associate a table space with a nodegroup.
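For example, the following statement is a sketch reusing the mixng12 nodegroup created in "Creating Nodegroups" (the table space and container names are hypothetical); tables created in this table space are partitioned across nodes 1 and 2:

   CREATE TABLESPACE MIXTS12
      IN NODEGROUP MIXNG12
      MANAGED BY SYSTEM
      USING ('mixts12')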

Creating a Schema

While organizing your data into tables, it may also be beneficial to group tables (and other related objects) together. This is done by defining a schema through the use of the CREATE SCHEMA statement. Information about the schema is kept in the system catalog tables of the database to which you are connected. As other objects are created, they can be placed within this schema.

The syntax of the CREATE SCHEMA statement is described in detail in the SQL Reference manual. The new schema name cannot already exist in the system catalogs and it cannot begin with "SYS".

If a user has SYSADM or DBADM authority, then the user can create a schema with any valid name. When a database is created, IMPLICIT_SCHEMA authority is granted to PUBLIC (that is, to all users).

The definer of any objects created as part of the CREATE SCHEMA statement is the schema owner. This owner can GRANT and REVOKE schema privileges to other users.
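The schema owner could, for example, allow another user to create objects in the schema. The following statement is a sketch in which the authorization ID "mary" is hypothetical:

   GRANT CREATEIN ON SCHEMA JOESCHMA TO USER MARY

Other schema privileges that can be granted in the same way include ALTERIN and DROPIN.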

The following is an example of a CREATE SCHEMA statement that creates a schema for an individual user with the authorization ID "joe":

   CREATE SCHEMA joeschma AUTHORIZATION joe
This statement must be issued by a user with DBADM authority.

Schemas may also be implicitly created when a user has IMPLICIT_SCHEMA authority. With this authority, users implicitly create a schema whenever they create an object with a schema name that does not already exist.

If users do not have IMPLICIT_SCHEMA authority, they can create a schema using their own authorization ID.

Creating a Table

After you determine how to organize your data into tables, the next step is to create those tables, by using the CREATE TABLE statement. The table descriptions are stored in the system catalog of the database to which you are connected.

The syntax of the CREATE TABLE statement is described in detail in the SQL Reference. For information about naming tables, columns, and other database objects, see Appendix D. "Naming Rules".

The CREATE TABLE statement gives the table a name, which is a qualified or unqualified identifier, and a definition for each of its columns. You can store each table in a separate table space, so that a table space will contain only one table. If a table will be dropped and created often, it is more efficient to store it in a separate table space and then drop the table space instead of the table. You can also store many tables within a single table space. In a partitioned database environment, the table space chosen also defines the nodegroup and the database partitions on which table data is stored.

The table does not contain any data at first. To add rows of data to it, use the INSERT statement or the LOAD or IMPORT utilities.

It is possible to add data into the table without logging the change. This is done using the NOT LOGGED INITIALLY parameter on the CREATE TABLE statement. Any changes made to the table by an INSERT, DELETE, UPDATE, CREATE INDEX, DROP INDEX, or ALTER TABLE operation in the same unit of work in which the table is created are not logged. Logging begins in subsequent units of work.
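For example, the following statement is a sketch of a table created with this parameter; the table and column names are hypothetical:

   CREATE TABLE STAGING
      (KEYCOL  INTEGER NOT NULL,
       DATACOL VARCHAR(30))
      NOT LOGGED INITIALLY

Changes made to STAGING by the operations listed above, within the unit of work that creates it, are not logged.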

A table consists of one or more column definitions. A maximum of 500 columns can be defined for a table. Columns represent the attributes of an entity. The values in any column are all the same type of information. See the SQL Reference for more information.

A column definition includes a column name, a data type, and, optionally, a null attribute and a default value.

The column name describes the information contained in the column and should be something that will be easily recognizable. It must be unique within the table; however, the same name can be used in other tables. See "Object Names" for information about naming rules.

The data type of a column indicates the length of the values in it and the kind of data that is valid for it. The database manager uses character string, numeric, date, time and large object data types. Graphic string data types are only available for database environments using multi-byte character sets. In addition, columns can be defined with user-defined distinct types, which are discussed in "Creating a User-Defined Type (UDT)".

The default attribute specification indicates what value is to be used if no value is provided. The default value can be specified, or a system-defined default value used. Default values may be specified for columns with, and without, the null attribute specification.

The null attribute specification indicates whether or not a column can contain null values.

The following is an example of a CREATE TABLE statement that creates the EMPLOYEE table in the RESOURCE table space. This table is defined in the sample database:

   CREATE TABLE EMPLOYEE
      (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
       FIRSTNME  VARCHAR(12) NOT NULL,
       MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
       LASTNAME  VARCHAR(15) NOT NULL,
       WORKDEPT  CHAR(3),
       PHONENO   CHAR(4),
       PHOTO     BLOB(10M)   NOT NULL)
   IN RESOURCE

The following sections build on the previous example to cover other options you should consider:

Large Object (LOB) Column Considerations

Before creating a table that contains large object columns, you need to make the following decisions:

  1. Do you want to log changes to LOB columns?

    If you do not want to log these changes, you must turn logging off by specifying the NOT LOGGED clause when you create the table. For example:

       CREATE TABLE EMPLOYEE
          (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
           FIRSTNME  VARCHAR(12) NOT NULL,
           MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
           LASTNAME  VARCHAR(15) NOT NULL,
           WORKDEPT  CHAR(3),
           PHONENO   CHAR(4),
           PHOTO     BLOB(10M)   NOT NULL  NOT LOGGED)
       IN RESOURCE
    

    If the LOB column is larger than 1 GB, logging must be turned off. (As a rule of thumb, you may not want to log LOB columns larger than 10 MB.) As with other options specified on a column definition, the only way to change the logging option is to re-create the table.

    Even if you choose not to log changes, LOB columns are shadowed to allow changes to be rolled back, whether the roll back is the result of a system generated error, or an application request. Shadowing is a recovery technique where current storage page contents are never overwritten. That is, old, unmodified pages are kept as "shadow" copies. These copies are discarded when they are no longer needed to support a transaction rollback.
    Note: When recovering a database using the RESTORE and ROLLFORWARD commands, LOB data that was "NOT LOGGED" and was written since the last backup will be replaced by binary zeros.

  2. Do you want to minimize the space required for the LOB column?

    You can make the LOB column as small as possible using the COMPACT clause on the CREATE TABLE statement. For example:

       CREATE TABLE EMPLOYEE
          (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
           FIRSTNME  VARCHAR(12) NOT NULL,
           MIDINIT   CHAR(1)     NOT NULL WITH DEFAULT,
           LASTNAME  VARCHAR(15) NOT NULL,
           WORKDEPT  CHAR(3),
           PHONENO   CHAR(4),
           PHOTO     BLOB(10M)   NOT NULL  NOT LOGGED  COMPACT)
       IN RESOURCE
    

    There is a performance cost when appending to a table with a compact LOB column, particularly if the sizes of the LOB values increase (because of the storage adjustments that must be made).

    On platforms such as OS/2, where sparse file allocation is not supported and where LOBs are placed in SMS table spaces, consider using the COMPACT clause. Sparse file allocation determines how physical disk space is used by an operating system: an operating system that supports it uses less physical disk space to store LOBs than one that does not. The COMPACT option allows for even greater physical disk space savings regardless of sparse file allocation support, so it is particularly worth considering if your operating system does not support sparse file allocation.
    Note: DB2 system catalogs for Version 5 use LOB columns and may take up more space than in previous versions.

Defining Constraints

This section discusses how to define constraints:

   Defining a Unique Constraint
   Defining Referential Constraints
   Defining a Table Check Constraint

For more information on constraints, see "Planning for Constraint Enforcement" and the SQL Reference.

Defining a Unique Constraint

Unique constraints ensure that every value in the specified key is unique. A table can have any number of unique constraints, with at most one unique constraint defined as a primary key.

You define a unique constraint with the UNIQUE clause in the CREATE TABLE or ALTER TABLE statement. The unique key can consist of more than one column.

Once established, the unique constraint is enforced automatically by the database manager when an INSERT or UPDATE statement modifies the data in the table. The unique constraint is enforced through a unique index.

When a unique constraint is defined in an ALTER TABLE statement and an index exists on the same set of columns of that unique key, that index becomes the unique index and is used by the constraint.

You can take any one unique constraint and use it as the primary key. The primary key can be used as the parent key in a referential constraint (along with other unique constraints). There can be only one primary key per table. You define a primary key with the PRIMARY KEY clause in the CREATE TABLE or ALTER TABLE statement. The primary key can consist of more than one column.

A primary index forces the value of the primary key to be unique. When a table is created with a primary key, the database manager creates a primary index on that key.
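For example, the following statement is a sketch that adds a unique constraint to the EMPLOYEE table created earlier; the constraint name and choice of columns are for illustration only (both columns are defined NOT NULL, as a unique key requires):

   ALTER TABLE EMPLOYEE
      ADD CONSTRAINT UNIQ_NAME UNIQUE (LASTNAME, FIRSTNME)

If a unique index already existed on (LASTNAME, FIRSTNME), it would become the unique index for this constraint; otherwise one is created.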

Some performance tips for indexes used as unique constraints include:

Defining Referential Constraints

Referential integrity is imposed by adding referential constraints to table and column definitions. Referential constraints are established with the "FOREIGN KEY Clause" and the "REFERENCES Clause" in the CREATE TABLE or ALTER TABLE statements.

The identification of foreign keys enforces constraints on the values within the rows of a table or between the rows of two tables. The database manager checks the constraints specified in a table definition and maintains the relationships accordingly. The goal is to maintain integrity whenever one database object references another.

For example, the EMPLOYEE and DEPARTMENT tables each contain a department number column; it is named WORKDEPT in the EMPLOYEE table and DEPTNO in the DEPARTMENT table. The relationship between these two tables is defined by the following constraints:

   DEPTNO is the primary key of the DEPARTMENT table.
   WORKDEPT is a foreign key in the EMPLOYEE table that references DEPTNO in the DEPARTMENT table.

The SQL statement defining the parent table, DEPARTMENT, is:

   CREATE TABLE DEPARTMENT
      (DEPTNO    CHAR(3)     NOT NULL,
       DEPTNAME  VARCHAR(29) NOT NULL,
       MGRNO     CHAR(6),
       ADMRDEPT  CHAR(3)     NOT NULL,
       LOCATION  CHAR(16),
          PRIMARY KEY (DEPTNO))
   IN RESOURCE

The SQL statement defining the dependent table, EMPLOYEE, is:

   CREATE TABLE EMPLOYEE
      (EMPNO     CHAR(6)     NOT NULL PRIMARY KEY,
       FIRSTNME  VARCHAR(12) NOT NULL,
       LASTNAME  VARCHAR(15) NOT NULL,
       WORKDEPT  CHAR(3),
       PHONENO   CHAR(4),
       PHOTO     BLOB(10M)   NOT NULL,
          FOREIGN KEY DEPT (WORKDEPT)
          REFERENCES DEPARTMENT ON DELETE NO ACTION)
   IN RESOURCE

By specifying the DEPTNO column as the primary key of the DEPARTMENT table and WORKDEPT as the foreign key of the EMPLOYEE table, you are defining a referential constraint on the WORKDEPT values. This constraint enforces referential integrity between the values of the two tables. In this case, any employees that are added to the EMPLOYEE table must have a department number that can be found in the DEPARTMENT table.

The delete rule for the referential constraint in the EMPLOYEE table is NO ACTION, which means that a department cannot be deleted from the DEPARTMENT table if any employees are assigned to it.

Although the previous examples use the CREATE TABLE statement to add a referential constraint, the ALTER TABLE statement can also be used. See "Altering a Table".
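For example, assuming the EMPLOYEE table had been created without its foreign key, the following sketch adds the same referential constraint with ALTER TABLE (here the constraint is explicitly named DEPT, matching the earlier definition):

   ALTER TABLE EMPLOYEE
      ADD CONSTRAINT DEPT FOREIGN KEY (WORKDEPT)
          REFERENCES DEPARTMENT ON DELETE NO ACTION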

Another example uses the same table definitions as above, with the DEPARTMENT table created before the EMPLOYEE table. Each department has a manager, and that manager is listed in the EMPLOYEE table; MGRNO in the DEPARTMENT table is actually a foreign key referencing the EMPLOYEE table. This referential cycle poses a slight problem: the MGRNO foreign key cannot be defined when the DEPARTMENT table is created, because the EMPLOYEE table does not yet exist. You could add the foreign key later (see "Adding Primary and Foreign Keys"). You could also use the CREATE SCHEMA statement to create both the EMPLOYEE and DEPARTMENT tables at the same time (see the example in the SQL Reference).

FOREIGN KEY Clause

A foreign key references a primary key or a unique key in the same or another table. A foreign key assignment indicates that referential integrity is to be maintained according to the specified referential constraints. You define a foreign key with the FOREIGN KEY clause in the CREATE TABLE or ALTER TABLE statement.

The number of columns in the foreign key must be equal to the number of columns in the corresponding primary or unique constraint (called a parent key) of the parent table. In addition, corresponding parts of the key column definitions must have the same data types and lengths. The foreign key can be assigned a constraint name. If you do not assign a name, one is automatically assigned. For ease of use, it is recommended that you assign a constraint name and do not use the system-generated name.

The value of a composite foreign key matches the value of a parent key if the value of each column of the foreign key is equal to the value of the corresponding column of the parent key. A foreign key containing null values cannot match the values of a parent key, since a parent key by definition can have no null values. However, a null foreign key value is always valid, regardless of the value of any of its non-null parts.

The following rules apply to foreign key definitions:

REFERENCES Clause

The REFERENCES clause identifies the parent table in a relationship, and defines the necessary constraints. You can include it in a column definition or as a separate clause accompanying the FOREIGN KEY clause, in either the CREATE TABLE or ALTER TABLE statements.

If you specify the REFERENCES clause as a column constraint, an implicit column list is composed of the column name or names that are listed. Remember that multiple columns can have separate REFERENCES clauses, and that a single column can have more than one.

Included in the REFERENCES clause is the delete rule. In our example, the ON DELETE NO ACTION rule is used, which states that no department can be deleted if there are employees assigned to it. Other delete rules include ON DELETE CASCADE, ON DELETE SET NULL, and ON DELETE RESTRICT. See "DELETE Rules".
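The same relationship can also be written as a column constraint, in which case the single column carrying the REFERENCES clause forms the implicit foreign key list. The following sketch again assumes the sample DEPARTMENT table with a primary key on DEPTNO:

   CREATE TABLE EMPLOYEE
      (EMPNO     CHAR(6)  NOT NULL PRIMARY KEY,
       WORKDEPT  CHAR(3)  REFERENCES DEPARTMENT (DEPTNO)
                          ON DELETE NO ACTION)
   IN RESOURCE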

Implications for Utility Operations

The LOAD utility will turn off constraint checking for self-referencing and dependent tables, placing these tables into check pending state. After the LOAD utility has completed, you will need to turn on the constraint checking for all tables for which it was turned off. For example, if the DEPARTMENT and EMPLOYEE tables are the only tables that have been placed in check pending state, you can execute the following command:

   SET CONSTRAINTS FOR DEPARTMENT, EMPLOYEE IMMEDIATE CHECKED

The IMPORT utility is affected by referential constraints in the following ways:

Defining a Table Check Constraint

A table check constraint specifies a search condition that is enforced for each row of the table on which the table check constraint is defined. You create a table check constraint on a table by associating a check-constraint definition with the table when the table is created or altered. This constraint is automatically activated when an INSERT or UPDATE statement modifies the data in the table. A table check constraint has no effect on a DELETE or SELECT statement.

A constraint name cannot be the same as any other constraint specified within the same CREATE TABLE statement. If you do not specify a constraint name, the system generates an 18-character unique identifier for the constraint.

A table check constraint is used to enforce data integrity rules not covered by key uniqueness or a referential integrity constraint. In some cases, a table check constraint can be used to implement domain checking. The following constraint issued on the CREATE TABLE statement ensures that the start date for every activity is not after the end date for the same activity:

   CREATE TABLE EMP_ACT
      (EMPNO      CHAR(6)      NOT NULL,
       PROJNO     CHAR(6)      NOT NULL,
       ACTNO      SMALLINT     NOT NULL,
       EMPTIME    DECIMAL(5,2),
       EMSTDATE   DATE,
       EMENDATE   DATE,
       CONSTRAINT ACTDATES CHECK(EMSTDATE <= EMENDATE) )
   IN RESOURCE

Although the previous example uses the CREATE TABLE statement to add a table check constraint, the ALTER TABLE statement can also be used. See "Altering a Table".

Creating a Table in Multiple Table Spaces

For DMS table spaces only, index data and long column data can be stored in table spaces separate from the one that holds the rest of the table data. The following example shows how the EMP_PHOTO table could be created to store the different parts of the table in different table spaces:

   CREATE TABLE EMP_PHOTO
      (EMPNO        CHAR(6)      NOT NULL,
       PHOTO_FORMAT VARCHAR(10)  NOT NULL,
       PICTURE      BLOB(100K) )
   IN RESOURCE
   INDEX IN RESOURCE_INDEXES
   LONG  IN RESOURCE_PHOTO

This example will cause the EMP_PHOTO data to be stored as follows:

See "Table Space Design Considerations" for additional considerations on the use of multiple DMS table spaces for a single table.

See the SQL Reference for more information.

Creating a Table in a Partitioned Database

Before creating a table that will be physically divided or partitioned, you need to consider the following:

One additional option exists when creating a table in a partitioned database environment: the partitioning key. A partitioning key is a key that is part of the definition of a table. It determines the partition on which each row of data is stored.

It is important to select an appropriate partitioning key because it cannot be changed later. Furthermore, any unique indexes (and therefore unique or primary keys) must be defined as a superset of the partitioning key. That is, if a partitioning key is defined, unique keys and primary keys must include all of the same columns as the partitioning key (they may have more columns).

If you do not specify the partitioning key explicitly, the following defaults are used. Ensure that the default partitioning key is appropriate.

Following is an example:

   CREATE TABLE MIXREC (MIX_CNTL INTEGER NOT NULL,
                        MIX_DESC CHAR(20) NOT NULL,
                        MIX_CHR  CHAR(9) NOT NULL,
                        MIX_INT INTEGER NOT NULL,
                        MIX_INTS SMALLINT NOT NULL,
                        MIX_DEC DECIMAL NOT NULL,
                        MIX_FLT FLOAT NOT NULL,
                        MIX_DATE DATE NOT NULL,
                        MIX_TIME TIME NOT NULL,
                        MIX_TMSTMP TIMESTAMP NOT NULL)
                        IN MIXTS12
                        PARTITIONING KEY (MIX_INT) USING HASHING

In the preceding example, the table space is MIXTS12 and the partitioning key is MIX_INT. If no partitioning key had been specified explicitly, it would have defaulted to MIX_CNTL. (If no primary key is specified and no partitioning key is defined, the partitioning key defaults to the first column in the list that is not a long field.)

A row of a table, and all information about that row, always resides on the same database partition.

The size limit for one partition of a table is 64 GB, or the available disk space, whichever is smaller. The size of the table can be as large as 64 GB (or the available disk space) times the number of database partitions.

Creating a Trigger

A trigger defines a set of actions that are executed in conjunction with, or triggered by, an INSERT, UPDATE, or DELETE statement on a specified base table. Some uses of triggers are to:

You can use triggers to support general forms of integrity or business rules. For example, a trigger can check a customer's credit limit before an order is accepted or update a summary data table.

The benefits of using a trigger are:

The following SQL statement creates a trigger that increases the number of employees each time a new person is hired, by adding 1 to the number of employees (NBEMP) column in the COMPANY_STATS table each time a row is added to the EMPLOYEE table.

   CREATE TRIGGER NEW_HIRED
      AFTER INSERT ON EMPLOYEE
      FOR EACH ROW MODE DB2SQL
      UPDATE COMPANY_STATS SET NBEMP = NBEMP+1;

A trigger body can include one or more of the following SQL statements: INSERT, searched UPDATE, searched DELETE, full-selects, SET transition-variable, and SIGNAL SQLSTATE. See the SQL Reference and the Embedded SQL Programming Guide for information about creating and using triggers. The trigger can be activated before or after the INSERT, UPDATE, or DELETE statement to which it refers.
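For example, a BEFORE trigger can use SIGNAL SQLSTATE to reject invalid rows before they are inserted. The following sketch (the trigger name and the SQLSTATE value are illustrative) enforces the same start-date rule as the ACTDATES check constraint shown earlier:

   CREATE TRIGGER CHK_DATES
      NO CASCADE BEFORE INSERT ON EMP_ACT
      REFERENCING NEW AS N
      FOR EACH ROW MODE DB2SQL
      WHEN (N.EMSTDATE > N.EMENDATE)
         SIGNAL SQLSTATE '75001' ('Start date follows end date')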

Trigger Dependencies

All dependencies of a trigger on some other object are recorded in the SYSCAT.TRIGDEP catalog. A trigger can depend on the following objects:

If one of these objects is dropped, the trigger becomes inoperative but its definition is retained in the catalog. To revalidate this trigger, you must retrieve its definition from the catalog and submit a new CREATE TRIGGER statement.

If a trigger is dropped, its description is deleted from the SYSCAT.TRIGGERS catalog view and all of its dependencies are deleted from the SYSCAT.TRIGDEP catalog view. All packages having UPDATE, INSERT, or DELETE dependencies on the trigger are invalidated.

If the dependent object is a view and it is made inoperative, the trigger is also marked inoperative. Any packages dependent on triggers that have been marked inoperative are invalidated. (For more information, see "Statement Dependencies When Changing Objects".)

Creating a User-Defined Function (UDF)

User-defined functions (UDFs) extend and add to the support provided by built-in functions of SQL, and can be used wherever a built-in function can be used. You can create UDFs as either:

There are three types of UDFs:

Scalar
Returns a single-valued answer each time it is called. For example, the built-in function SUBSTR() is a scalar function. Scalar UDFs can be either external or sourced.

Column
Returns a single-valued answer from a set of like values (a column). It is also sometimes called an aggregating function in DB2. An example of a column function is the built-in function AVG(). An external column UDF cannot be defined to DB2, but a column UDF which is sourced upon one of the built-in column functions can be defined. This is useful for distinct types.

For example, if there is a distinct type SHOESIZE defined with base type INTEGER, a UDF AVG(SHOESIZE) which is sourced on the built-in function AVG(INTEGER) could be defined, and it would be a column function.

Table
Returns a table to the SQL statement which references it. Table functions may only be referenced in the FROM clause of a SELECT statement. Such a function can be used to apply SQL language processing power to data which is not DB2 data, or to convert such data into a DB2 table.

For example, table functions can take a file and convert it to a table, tabularize sample data from the World Wide Web, or access a Lotus Notes database and return information such as the date, sender, and text of mail messages. This information can be joined with other tables in the database.

A table function can only be an external function. It cannot be a sourced function.
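The sourced column function from the SHOESIZE example above could be defined as follows. This is a sketch; it assumes the distinct type is created first:

   CREATE DISTINCT TYPE SHOESIZE AS INTEGER WITH COMPARISONS

   CREATE FUNCTION AVG (SHOESIZE) RETURNS SHOESIZE
      SOURCE SYSIBM.AVG (INTEGER)

The new AVG(SHOESIZE) behaves exactly like the built-in AVG(INTEGER), but accepts and returns the distinct type.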

Information about existing UDFs is recorded in the SYSCAT.FUNCTIONS and SYSCAT.FUNCPARMS catalog views. The system catalog does not contain the executable code for the UDF. (Therefore, when creating your backup and recovery plans you should consider how you will manage your UDF executables.)

Statistics about the performance of UDFs are important when compiling SQL statements. For information about how to update UDF statistics in the system catalog, see "Updating Statistics for User-Defined Functions".

A UDF cannot be dropped if a view, trigger, table check constraint, or another UDF is dependent on it. If a UDF is dropped, packages that are dependent on it are marked inoperative. (For more information, see "Statement Dependencies When Changing Objects".)

For details on using the CREATE FUNCTION statement to write a UDF to suit your specific application, see the Embedded SQL Programming Guide. See the SQL Reference for details on UDF syntax.

Creating a User-Defined Type (UDT)

A user-defined type (UDT) is a distinct type derived from an existing type, such as an integer, decimal, or character type. UDTs support strong typing: although a distinct type shares its representation with the source type from which it is derived, the two types are by definition incompatible and cannot be compared directly to each other.

Instances of the same distinct type can be compared to each other, if the WITH COMPARISONS clause is specified on the CREATE DISTINCT TYPE statement. Instances of UDTs cannot be used as arguments of functions or operands of operations that were defined on the source type. Similarly, the source type cannot be used in arguments or operands that were defined to use a UDT.

The SYSCAT.DATATYPES catalog view allows you to see the UDTs that have been defined for your database. This catalog view also shows you the data types defined by the database manager when the database was created. For a complete list of all data types, see the SQL Reference.

The following SQL statement creates the UDT t_educ as a smallint:

   CREATE DISTINCT TYPE T_EDUC AS SMALLINT WITH COMPARISONS

Once you have created a UDT, you can use it to define columns in a CREATE TABLE statement:

   CREATE TABLE EMPLOYEE
      (EMPNO      CHAR(6)      NOT NULL,
       FIRSTNME   VARCHAR(12)  NOT NULL,
       LASTNAME   VARCHAR(15)  NOT NULL,
       WORKDEPT   CHAR(3),
       PHONENO    CHAR(4),
       PHOTO      BLOB(10M)    NOT NULL,
       EDLEVEL    T_EDUC)
   IN RESOURCE

A UDT cannot be used as an argument for most of the system-provided, or built-in, functions. User-defined functions must be provided to enable these and other operations. For more information about creating and using user-defined functions, see the SQL Reference and the Embedded SQL Programming Guide.
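For example, to allow addition on the T_EDUC type created above, you could define a function sourced on the built-in "+" operator for the SMALLINT base type (a sketch, assuming T_EDUC exists as shown):

   CREATE FUNCTION "+" (T_EDUC, T_EDUC) RETURNS T_EDUC
      SOURCE SYSIBM."+" (SMALLINT, SMALLINT)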

You can drop a UDT only if:

When a UDT is dropped, any functions that are dependent on it are also dropped.

Creating a View

Views are derived from one or more base tables or views, and can be used interchangeably with base tables when retrieving data. When changes are made to the data shown in a view, the data is changed in the table itself.

A view can be created to limit access to sensitive data, while allowing more general access to other data. For example, the EMPLOYEE table may have salary information in it, which should not be made available to everyone. The employee's phone number, however, should be generally accessible. In this case, a view could be created from the LASTNAME and PHONENO columns only. Access to the view could be granted to PUBLIC, while access to the entire EMPLOYEE table could be restricted to those who have the authorization to see salary information. For information about read-only views, see the SQL Reference manual.

With a view, you can make a subset of table data available to an application program and validate data that is to be inserted or updated. A view can have column names that are different from the names of corresponding columns in the original tables.

The use of views provides flexibility in the way your programs and end-user queries can look at the table data.

The following SQL statement creates a view on the EMPLOYEE table that lists all employees in Department A00 with their employee and telephone numbers:

   CREATE VIEW EMP_VIEW (DA00NAME, DA00NUM, PHONENO)
      AS SELECT LASTNAME, EMPNO, PHONENO FROM EMPLOYEE
      WHERE WORKDEPT = 'A00'
      WITH CHECK OPTION

The first line of this statement names the view and defines its columns. The name EMP_VIEW must be unique within its schema in SYSCAT.TABLES. The view name appears as a table name although it contains no data. The view will have three columns called DA00NAME, DA00NUM, and PHONENO, which correspond to the columns LASTNAME, EMPNO, and PHONENO from the EMPLOYEE table. The column names listed apply one-to-one to the select list of the SELECT statement. If column names are not specified, the view uses the same names as the columns of the result table of the SELECT statement.

The second line is a SELECT statement that describes which values are to be selected from the database. It may include the clauses ALL, DISTINCT, FROM, WHERE, GROUP BY, and HAVING. The name or names of the data objects from which to select columns for the view must follow the FROM clause.

The WITH CHECK OPTION clause indicates that any updated or inserted row to the view must be checked against the view definition, and rejected if it does not conform. This enhances data integrity but requires additional processing. If this clause is omitted, inserts and updates are not checked against the view definition.

The following SQL statement creates the same view on the EMPLOYEE table using the SELECT AS clause:

   CREATE VIEW EMP_VIEW
      AS SELECT LASTNAME AS DA00NAME,
             EMPNO AS DA00NUM,
             PHONENO
      FROM EMPLOYEE
      WHERE WORKDEPT = 'A00'
      WITH CHECK OPTION

Once a view is defined, it cannot be changed. To modify a view definition, you must drop and re-create it.
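For example, to change the definition of EMP_VIEW you would drop it and then re-create it; the revised WHERE clause below is purely illustrative:

   DROP VIEW EMP_VIEW

   CREATE VIEW EMP_VIEW (DA00NAME, DA00NUM, PHONENO)
      AS SELECT LASTNAME, EMPNO, PHONENO FROM EMPLOYEE
      WHERE WORKDEPT = 'A00' OR WORKDEPT = 'B01'
      WITH CHECK OPTION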

You can create a view that uses a UDF in its definition. However, to update this view so that it contains the latest functions, you must drop it and then re-create it. If a view is dependent on a UDF, that function cannot be dropped.

The following SQL statement creates a view with a function in its definition:

   CREATE VIEW EMPLOYEE_PENSION (NAME, PENSION)
     AS SELECT NAME, PENSION(HIREDATE,BIRTHDATE,SALARY,BONUS)
     FROM EMPLOYEE

The UDF PENSION calculates the current pension for which an employee is eligible, based on a formula involving the HIREDATE, BIRTHDATE, SALARY, and BONUS columns.

In addition to using views as described above, a view can also be used to:

An alternative to creating a view is to use a nested or common table expression to reduce catalog lookup and improve performance. See the SQL Reference for more information about common table expressions.

Creating an Alias

An alias is an indirect method of referencing a table or view, so that an SQL statement can be independent of the qualified name of that table or view. Only the alias definition must be changed if the table or view name changes. An alias can be created on another alias. An alias can be used in a view or trigger definition, and in any SQL statement in which an existing table or view name can be referenced, except in table check-constraint definitions.

The alias is replaced at statement compilation time by the table or view name. If the alias or alias chain cannot be resolved to a table or view name, an error results. For example, if WORKERS is an alias for EMPLOYEE, then at compilation time:

   SELECT * FROM WORKERS

becomes in effect

   SELECT * FROM EMPLOYEE

An alias name can be used wherever an existing table name can be used, and can refer to another alias if no circular or repetitive references are made along the chain of aliases.

The following SQL statement creates an alias WORKERS for the EMPLOYEE table:

   CREATE ALIAS WORKERS FOR EMPLOYEE

The alias name cannot be the same as an existing table, view, or alias, and can only refer to a table within the same database. The name of a table or view used in a CREATE TABLE or CREATE VIEW statement cannot be the same as an alias name in the same schema.

You do not require special authority to create an alias, unless the alias is in a schema other than the one owned by your current authorization ID, in which case DBADM authority is required.

An alias can be defined for a table, view, or alias that does not exist at the time of definition. However, it must exist when an SQL statement containing the alias is compiled.

When an alias, or the object to which an alias refers, is dropped, all packages dependent on the alias are marked invalid and all views and triggers dependent on the alias are marked inoperative.
Note:DB2 for MVS/ESA employs two distinct concepts of aliases: ALIAS and SYNONYM. These two concepts differ from DB2 Universal Database as follows:

  • ALIASes in DB2 for MVS/ESA:
    • Require their creator to have special authority or privilege
    • Cannot reference other aliases.

  • SYNONYMs in DB2 for MVS/ESA:
    • Can only be used by their creator
    • Are always unqualified
    • Are dropped when a referenced table is dropped
    • Do not share namespace with tables or views.

Creating an Index

An index is a list of the locations of rows, sorted by the contents of one or more specified columns. Indexes are typically used to speed up access to a table. However, they can also serve a logical data design purpose. For example, a unique index does not allow entry of duplicate values in the columns, thereby guaranteeing that no two rows of a table are the same. Indexes can also be created to specify ascending or descending order of the values in a column.

An index is defined by columns in the base table. It can be defined by the creator of a table, or by a user who knows that certain columns require direct access. Up to 16 columns can be specified for an index. A primary index is automatically created on the primary key, unless a user-defined index already exists.

Any number of indexes can be defined on a particular base table and they can have a beneficial effect on the performance of queries. However, the more indexes there are, the more the database manager must modify during update, delete, and insert operations. Creating a large number of indexes for a table that receives many updates can slow down processing of requests. Therefore, use indexes only where a clear advantage for frequent access exists.

An index key is a column or collection of columns on which an index is defined, and determines the usefulness of an index. Although the order of the columns making up an index key does not make a difference to index key creation, it may make a difference to the optimizer when it is deciding whether or not to use an index.

If the table being indexed is empty, an index is still created, but no index entries are made until the table is loaded or rows are inserted. If the table is not empty, the database manager makes the index entries while processing the CREATE INDEX statement.

Indexes for tables in a partitioned database are built using the same CREATE INDEX statement. They are partitioned based on the partitioning key of the table. An index on a table is made of the local indexes in that table on each node in the nodegroup. Note that unique indexes defined in a multiple partition environment must be a superset of the partitioning key.

Performance Tip: Create your indexes before using the LOAD utility if you are going to carry out the following series of tasks:

You should consider ordering the execution of tasks in the following way:

  1. Create the table
  2. Create the index
  3. Load the table with the index create and statistics update options requested.

For more information on LOAD performance improvements, see "System Catalog Tables".

Indexes are maintained after they are created. Subsequently, when application programs use a key value to randomly access and process rows in a table, the index based on that key value can be used to access rows directly. This is important, because the physical storage of rows in a base table is not ordered. When a row is inserted, it is placed in the most convenient storage location that can accommodate it. When searching for rows of a table that meet a particular selection condition and the table has no indexes, the entire table is scanned. An index optimizes data retrieval without performing a lengthy sequential search.

The data for your indexes can be stored in the same table space as your table data, or in a separate table space containing index data. The table space used to store the index data is determined when the table is created (see "Creating a Table in Multiple Table Spaces").

The following two sections "Using an Index" and "Using the CREATE INDEX Statement" provide more information on creating an index.

Using an Index

An index is never directly used by an application program. The decision on whether to use an index and which of the potentially available indexes to use is the responsibility of the optimizer.

The best index on a table is one that:

For a detailed discussion of how an index can be beneficial, see "Index Scan Concepts".

Using the CREATE INDEX Statement

You can create a non-unique index to enable efficient retrieval by columns other than the primary key. Duplicate values are allowed in the indexed column or columns.

The following SQL statement creates a non-unique index called LNAME from the LASTNAME column on the EMPLOYEE table, sorted in ascending order:

   CREATE INDEX LNAME ON EMPLOYEE (LASTNAME ASC)

The following SQL statement creates a unique index on the phone number column:

   CREATE UNIQUE INDEX PH ON EMPLOYEE (PHONENO DESC)

A unique index ensures that no duplicate values exist in the indexed column or columns. The constraint is enforced at the end of the SQL statement that updates rows or inserts new rows. This type of index cannot be created if the set of one or more columns already has duplicate values.

The keyword ASC puts the index entries in ascending order by column, while DESC puts them in descending order by column. The default is ascending order.

In multiple partition databases, unique indexes must be defined as supersets of the partitioning key.

For intra-partition parallelism, index create performance is improved by using multiple processors for the scanning and sorting of data performed during index creation. The use of multiple processors is enabled by setting the database manager configuration parameter intra_parallel to YES(1) or ANY(-1). The number of processors used during index creation is determined by the system and is not affected by the configuration parameters dft_degree or max_querydegree, by the application runtime degree, or by the SQL statement compilation degree. If the database configuration parameter indexsort is set to NO, index creation does not use multiple processors.
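For example, intra-partition parallelism can be enabled from the command line processor; the change takes effect the next time the instance is started:

   UPDATE DATABASE MANAGER CONFIGURATION USING INTRA_PARALLEL YES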

