The database manager does not, in general, restrict the character set available to an application except as noted below.
Each combined SBCS/DBCS code page allows for both single- and
double-byte character code points. This is accomplished by reserving a subset
of the 256 available code points of each implied SBCS code page identifier for
single-byte characters, with the remainder of the code points either undefined
or allocated to the first byte of double-byte code points. These code points
are shown in the following table.
Table 102. Mixed Character Set Code Points
Supported Mixed Code Page | Code Points for Single-byte Characters | Code Points for First Byte of Double-Byte Characters |
---|---|---|
932 | x00-7F, xA1-DF | x81-9F, xE0-FC |
942, 943 | x00-80, xA0-DF, xFD-FF | x81-9F, xE0-FC |
938 | x00-7E | x81-FC |
948 | x00-80 | x81-FC |
949 | x00-7F | x8F-FE |
950 | x00-7E | x81-FE |
1381 | x00-7F | x8C-FE |
Code points not assigned to either category above are not defined, and are processed as single-byte undefined code points.
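The partitioning in Table 102 can be sketched as a lookup. The following is a minimal illustration, not part of the database manager: the names `MIXED_CODE_PAGES` and `classify_byte` are hypothetical, and the ranges are transcribed directly from the table above.

```python
# Illustrative sketch only: ranges are copied from Table 102.
# MIXED_CODE_PAGES and classify_byte are hypothetical names, not a DB2 API.
_RANGES_942_943 = {
    "single": [(0x00, 0x80), (0xA0, 0xDF), (0xFD, 0xFF)],
    "lead": [(0x81, 0x9F), (0xE0, 0xFC)],
}

MIXED_CODE_PAGES = {
    932: {"single": [(0x00, 0x7F), (0xA1, 0xDF)],
          "lead": [(0x81, 0x9F), (0xE0, 0xFC)]},
    942: _RANGES_942_943,
    943: _RANGES_942_943,
    938: {"single": [(0x00, 0x7E)], "lead": [(0x81, 0xFC)]},
    948: {"single": [(0x00, 0x80)], "lead": [(0x81, 0xFC)]},
    949: {"single": [(0x00, 0x7F)], "lead": [(0x8F, 0xFE)]},
    950: {"single": [(0x00, 0x7E)], "lead": [(0x81, 0xFE)]},
    1381: {"single": [(0x00, 0x7F)], "lead": [(0x8C, 0xFE)]},
}


def _in_ranges(byte, ranges):
    """Return True if byte falls inside any inclusive (low, high) range."""
    return any(low <= byte <= high for low, high in ranges)


def classify_byte(code_page, byte):
    """Classify a byte value (0-255) for a mixed code page in Table 102.

    Returns "single" for a single-byte character, "lead" for the first
    byte of a double-byte character, and "undefined" otherwise.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0-255")
    ranges = MIXED_CODE_PAGES[code_page]  # KeyError for unlisted code pages
    if _in_ranges(byte, ranges["single"]):
        return "single"
    if _in_ranges(byte, ranges["lead"]):
        return "lead"
    return "undefined"
```

For example, under code page 932 the value x80 falls in neither column of the table, so `classify_byte(932, 0x80)` reports it as undefined, whereas under code pages 942 and 943 the same value is a single-byte code point.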
Within each implied DBCS code page, there are 256 code points available as the second byte for each valid first byte. These code points are also partitioned into valid and invalid second byte ranges for the purpose of determining whether a DBCS character is properly formed. Note that in DBCS environments, DB2 does not perform validity checking on individual double-byte characters.
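As a rough sketch of how an application might walk a mixed byte string, the following splits data into one- and two-byte units using only the first-byte ranges. It assumes code page 932 by default and, in line with the note above, makes no attempt to validate the second byte. The names `LEAD_BYTE_RANGES_932`, `is_lead_byte`, and `split_mixed` are hypothetical, and this is not a description of how DB2 itself processes strings.

```python
# Illustrative sketch only; not a DB2 API.
# First-byte ranges for code page 932, copied from Table 102.
LEAD_BYTE_RANGES_932 = [(0x81, 0x9F), (0xE0, 0xFC)]


def is_lead_byte(byte, lead_ranges=LEAD_BYTE_RANGES_932):
    """Return True if byte is the first byte of a double-byte character."""
    return any(low <= byte <= high for low, high in lead_ranges)


def split_mixed(data, lead_ranges=LEAD_BYTE_RANGES_932):
    """Split a bytes object into single- and double-byte units.

    A lead byte consumes the byte that follows it, whatever its value:
    the second byte is deliberately not validated. A lead byte at the
    very end of the data is returned on its own as a one-byte unit.
    """
    units = []
    i = 0
    while i < len(data):
        has_next = i + 1 < len(data)
        width = 2 if is_lead_byte(data[i], lead_ranges) and has_next else 1
        units.append(data[i:i + width])
        i += width
    return units
```

Because the second byte is taken as-is, a malformed double-byte character passes through this splitter unchanged. An application that needs well-formed DBCS data would have to add its own second-byte check.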
The basic character set that may be used in database names consists of the single-byte uppercase and lowercase Latin letters (A...Z, a...z), the Arabic numerals (0...9) and the underscore character (_). This list of letters is augmented with the three special characters #, @ and $ to provide compatibility with host database products. However, these special characters should be used with care in an NLS environment because they are not included in the NLS host (EBCDIC) invariant character set.
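A simple check against the basic character set described above might look like the following sketch. The function names are hypothetical. The check covers only which characters are permitted; it does not enforce length limits, reserved names, or any rule about the first character, none of which are stated in this section.

```python
import re

# Illustrative sketch only; not a DB2 API.
# Basic character set: A-Z, a-z, 0-9, underscore, plus #, @ and $.
_BASIC_NAME = re.compile(r"[A-Za-z0-9_#@$]+")

# The three special characters that fall outside the NLS host (EBCDIC)
# invariant character set.
_HOST_SPECIALS = frozenset("#@$")


def is_basic_name(name):
    """Return True if name is non-empty and uses only the basic character set."""
    return _BASIC_NAME.fullmatch(name) is not None


def host_specials_used(name):
    """Return the sorted list of #, @ and $ characters that appear in name."""
    return sorted(set(name) & _HOST_SPECIALS)
```

A name such as `SALES_2024` passes the check and uses no special characters. `PAY#ROLL` also passes, but `host_specials_used("PAY#ROLL")` flags the `#`, which may warrant a second look in an NLS environment.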
When naming database objects (such as tables and views), program labels, host variables, cursors, and statements, alphabetics from the extended character set (for example, letters with diacritical marks) may also be used. The available characters depend on the code page in use. If you are using the database in a multiple code page environment, you must ensure that all code pages support any alphabetics you plan to use from the extended character set. See the SQL Reference for a discussion of delimited identifiers, which can be used in SQL statements and can contain characters outside the extended character set.
In DBCS environments, the extended character set consists of all the characters in the basic character set, plus those identified as a letter or digit as follows:
The coding of SQL statements is not language dependent. SQL is a programming language and, like other programming languages such as C, it is language invariant. The SQL keywords must be typed as shown, although they may be typed in uppercase, lowercase, or mixed case. The names of database objects, host variables and program labels that occur in an SQL statement cannot contain characters outside the database manager extended character set as described above.