IBM Books

Embedded SQL Programming Guide


Interface between DB2 and a UDF

This section discusses some of the details of the interface between DB2 and a UDF, and discusses the sqludf.h include file which makes the interface manageable. This include file only applies to C and C++ UDFs. For information on coding UDFs in Java, see "Coding a Java UDF".

The Arguments Passed from DB2 to a UDF

In addition to the SQL arguments and those values passed in the DML reference to the function, DB2 passes additional arguments to the external UDF. For C and C++, all of these arguments are passed in the order shown in Figure 24. Java UDFs take only the SQL-argument and SQL-result arguments, leaving the other information available to extra function calls. Java UDFs have the same restrictions on the resulting SQL-state and diagnostic-message arguments documented below. For information on coding UDFs in Java, see "Coding a Java UDF".

Figure 24. Passing Arguments to a UDF

                          <------------<                                  
>>--.------------------.----SQL-result----.----------------------.------> 
    | <--------------< |                  | <------------------< |        
    '---SQL-argument---'                  '---SQL-argument-ind---'        
                                                                          
   <----------------<                                                     
>----SQL-result-ind----SQL-state--function-name--specific-name----------> 
                                                                          
>--diagnostic-message--.------------.--.-----------.--.--------.------->< 
                       '-scratchpad-'  '-call-type-'  '-dbinfo-'          

Note:Each of the above arguments passed to the external function is a pointer to the value, and not the actual value.

The arguments are described as follows:

SQL-argument
This argument is set by DB2 before calling the UDF. This value repeats n times, where n is the number of arguments specified in the function reference. The value of each of these arguments is taken from the expression specified in the function invocation. It is expressed in the data type of the defined parameter. How these data types map to C language constructs is described in "How the SQL Data Types are Passed to a UDF".

SQL-result
This argument is set by the UDF before returning to DB2. For scalar functions there is exactly one SQL-result. For table functions there is one SQL-result for each result column of the function defined in the RETURNS TABLE clause of the CREATE FUNCTION statement. They correspond to the position of the columns defined in the RETURNS TABLE clause. That is, the first SQL-result argument corresponds to the first column defined in the RETURNS TABLE clause, and so on.

For both scalar functions and table functions, DB2 allocates the buffer and passes its address to the UDF. The UDF puts each result value into the buffer. Enough buffer space is allocated by DB2 to contain the value expressed in the data type. For scalar functions, this data type is defined in the CAST FROM clause,if it is present, or in the RETURNS clause, if no CAST FROM clause is present. For table functions, the data types are defined in the RETURNS TABLE(...) clause. For information on how these types map to C language constructs, see "How the SQL Data Types are Passed to a UDF".

Note that for table functions, DB2 defines a performance optimization where every defined column does not have to be returned to DB2. Your UDF returns only the columns required by the statement referencing the table function. For example, consider a CREATE FUNCTION statement for a table function defined with 100 result columns. If a given statement referencing the function is only interested in two of them, this optimization enables the UDF to only return those two columns for each row. See the dbinfo argument below for more information on this optimization.

For each value returned, (that is, a single value for a scalar function, and in general, multiple values for a table function), the UDF code should not change more bytes than is required for the data type and length of the result. DB2 will attempt to determine if the UDF body has written beyond the end of the result buffer by a few bytes, returning SQLCODE -450 (SQLSTATE 39501). However, a major overwrite by the UDF may not be detected by DB2 can cause unpredictable results or an abnormal termination,

SQL-argument-ind
This argument is set by DB2 before calling the UDF. It can be used by the UDF to determine if the corresponding SQL-argument is null or not. The nth SQL-argument-ind corresponds to the nth SQL-argument (described above). It contains one of the following values:
0
The argument is present and not null.
-1
The argument is present and its value is null.

If the function is defined with NOT NULL CALL, the UDF body does not need to check for a null value. However, if it is defined with NULL CALL, any argument can be NULL and the UDF should check it.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in "How the SQL Data Types are Passed to a UDF".

SQL-result-ind
This argument is set by the UDF before returning to DB2. There is one of these for each SQL-result argument, and they correspond positionally.

This argument is used by the UDF to signal if the particular result value is null:

0 or positive
The result is not null
negative
The result is the null value(5)

Even if the function is defined with NOT NULL CALL, the UDF body must set the indicator of the result.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in "How the SQL Data Types are Passed to a UDF".

If the optimization using the dbinfo argument column list is being utilized, then only the indicators corresponding to the required columns need be set.

SQL-state
This argument is set by the UDF before returning to DB2. It takes the form of a CHAR(5) value. Ensure that the argument definition in the UDF is appropriate for a CHAR(5) as described in "How the SQL Data Types are Passed to a UDF", and can be used by the UDF to signal warning or error conditions. It contains the value '00000', when the function is called. The UDF can set the value to the following:

00000
The function code did not detect any warning or error situations.

01Hxx
The function code detected a warning situation. This results in a SQL warning, SQLCODE +462 (SQLSTATE 01Hxx ). Here 'xx' is any string.

02000
Only valid for table functions, it means that there are no more rows in the table.

38502
A special value for the case where the UDF body attempted to issue an SQL call and received an error, SQLCODE -487 (SQLSTATE 38502). because SQL is not allowed in UDFs), and chose to pass this same error back through to DB2.

Any other 38xxx
The function code detected an error situation. It results in a SQL error, SQLCODE -443 (SQLSTATE 38xxx). Here 'xxx' is any string. Do not use 380xx through 384xx because those values are reserved by the draft extensions to the SQL92 international standard, or 385xx because those values are reserved by IBM.

Any other value is treated as an error situation resulting in SQLCODE -463 (SQLSTATE 39001).

function-name
This argument is set by DB2 before calling the UDF. It is the qualified function name, passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(27) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(27). See "How the SQL Data Types are Passed to a UDF" for more information.

The form of the function name that is passed is:

     <schema-name>.<function-name>

The parts are separated by a period. Two examples are:

     PABLO.BLOOP       WILLIE.FINDSTRING

This form enables you to use the same UDF body for multiple external functions, and still differentiate between the functions when it is invoked.
Note:Although it is possible to include the period in object names, it is not recommended. For example, if a function, rotate is in a schema, obj.op, the function name that is returned is obj.op.rotate, and it is not obvious if the schema name is obj or obj.op.

specific-name
This argument is set by DB2 before calling the UDF. It is the specific name of the function passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(18) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(18). See "How the SQL Data Types are Passed to a UDF" for more information. Two examples are:
     willie_find_feb95       SQL9507281052440430

This first value is provided by the user in his CREATE FUNCTION statement. The second is a value generated by DB2 when the user does not specify a value.

As with the function-name argument, the reason for passing this value is to give the UDF the means of distinguishing exactly which specific function is invoking it.

diagnostic-message
This argument is set by the UDF before returning to DB2. The UDF can use this argument to insert a message text in a DB2 message. It takes the form of a VARCHAR(70) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(70). See "How the SQL Data Types are Passed to a UDF" for more information.

When the UDF returns either an error or a warning, using the SQL-state argument described above, it can include descriptive information here. DB2 includes this information as a token in its message.

DB2 sets the first character to null before calling the UDF. Upon return, it treats the string as a C null-terminated string. This string will be included in the SQLCA as a token for the error condition. At least the first 17 characters in this string will appear in the SQLCA or DB2 CLP message. However, the actual number of characters which will appear depends on the lengths of the other tokens. Avoid using X'FF' in the text since this character is used to delimit tokens in the SQLCA.

The UDF code should not change more than the VARCHAR(70) buffer which is passed to it. DB2 will attempt to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501). However, an overwrite by the UDF can cause unpredictable results or an abend, as it may not be detected by DB2.

DB2 assumes that any message tokens returned from the UDF to DB2 are in the same code page as the database. Your UDF should ensure that If this is the case. If you use the 7-bit invariant ASCII subset, your UDF can return the message tokens in any code page.

scratchpad
This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specified the SCRATCHPAD keyword. This argument is a structure, exactly like the structure used to pass a value of any of the LOB data types, with the following elements:

The scratchpad can be mapped in your UDF using the same type as either a CLOB or a BLOB, since the argument passed has the same structure. See "How the SQL Data Types are Passed to a UDF" for more information.

Ensure your UDF code does not make changes outside of the scratchpad buffer. DB2 attempts to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501), but a major overwrite by the UDF can cause unpredictable results, or an abend, and may not result in a graceful failure by DB2.

call-type
This argument is set by DB2 before calling the UDF. The argument is only present if the CREATE FUNCTION statement for the UDF specified the FINAL CALL keyword. It follows the scratchpad argument; or the diagnostic-message argument if the scratchpad argument is not present. This argument takes the form of an INTEGER value. Ensure that this argument definition in the UDF is appropriate for INTEGER. See "How the SQL Data Types are Passed to a UDF" for more information.

For scalar functions it contains: REFID='wq4704'.

-1
This is the first call to the UDF for this statement. A first call is a normal call in that all SQL argument values are passed. The only difference is that on a first call, in cases where a scratchpad is also passed, the scratchpad is set to binary zeros.

Note that if SCRATCHPAD is specified in the CREATE FUNCTION statement, but FINAL CALL is not, the call-type argument is not passed to the UDF. If it is important for the UDF to identify this first call situation, it must do so by some other means. For example, it could key off the all binary zeros state of the scratchpad.

0
This is a normal call, that is, all the normal input argument values are passed. If a scratchpad is also passed, it is untouched from the previous call.

1
This is a final call, that is, no SQL-argument or SQL-argument-ind values are passed, and attempts to examine these values may cause unpredictable results. If a scratchpad is also passed, it is untouched from the previous call.

A UDF is expected to release system resources (for example, memory) acquired during the first call and succeeding normal calls. On a final call, a UDF should not return any answer using the SQL-result or SQL-result-ind arguments. Both of these are ignored by DB2 upon return from a final call to a UDF.

However, on a final call, the UDF may set the SQL-state and diagnostic-message arguments. These arguments are handled the same way as on other calls to the UDF.

If a UDF that is used in a subquery both uses a scratchpad and requires a final call, DB2 may decide to make a final call (and then refresh the scratchpad) between invocations of the subquery.

For table functions it contains:

-1
This is the OPEN call to the UDF for this statement. The scratchpad (if any) is set to binary zeros when the UDF is called. All argument values are passed, and the UDF should do whatever one-time initialization actions are required, but should not return a row of data to DB2.

Remember that the FINAL CALL option is mandatory for table functions, but that SCRATCHPAD is not.

0
This is a FETCH call, that is, all the normal input argument values are passed, same as on the OPEN CALL, and the UDF is expected to return a row or the end-of-table condition ('02000' returned as SQLSTATE to DB2). If a scratchpad is also passed, it is untouched from the previous call.

1
This is a CLOSE call, that is, no SQL-argument or SQL-argument-ind values are passed, and attempts to examine these values may cause unpredictable results. If a scratchpad is also passed, it is untouched from the previous call.

A UDF is expected to release system resources (for example, memory) acquired during the OPEN call and succeeding FETCH calls. On a final call, a table UDF should not return any result row to DB2. In fact, the SQL-result and SQL-result-ind arguments are ignored by DB2 upon return from a CLOSE call to a UDF.

However, on a CLOSE call, the UDF may set the SQL-state and diagnostic-message arguments. These arguments are handled the same way as on other calls to the UDF.

If a table function UDF is used in a subquery, the table function should expect a CLOSE call for each invocation of the subquery within the higher level query, and a subsequent OPEN call for the next invocation of the subquery within the higher level query.

dbinfo
This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specifies the DBINFO keyword. The argument is a structure whose C language definition is contained in the header file sqludf.h discussed in "The UDF Include File: sqludf.h". It contains the following elements:

  1. Data base name length (dbnamelen)

    The length of data base name below. This field is an unsigned short integer.

  2. Data base name (dbname)

    The name of the currently connected database. This field is a long identifier of 128 characters. The data base name length field described above identifies the actual length of this field. It does not contain a null terminator or any padding.

  3. Application Authorization ID Length (authidlen)

    The length of application authorization ID below. This field is an unsigned short integer.

  4. Application authorization ID (authid)

    The application run time authorization ID. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The application authorization ID length field described above identifies the actual length of this field.

  5. Database code page (codepg)

    This is an union of two 48-byte long structures; one is used by DB2 Universal Database, the other is reserved for future use. The DB2 Universal Database structure contains the following fields:

    1. SBCS. Single byte code page, an unsigned long integer.
    2. DBCS. Double byte code page, an unsigned long integer.
    3. COMP. Composite code page, an unsigned long integer.
    4. DUMMY. Reserved for later use, a char of length 36.

  6. Schema name length (tbschemalen)

    The length of schema name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.

  7. Schema name (tbschema)

    Schema for the table name below. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The schema name length field described above identifies the actual length of this field.

  8. Table name length (tbnamelen)

    The length of the table name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.

  9. Table name (tbname)

    This is the name of the table being updated or inserted. This field is set only if the UDF reference is the right-hand side of a SET clause in an UPDATE statement, or an item in the VALUES list of an INSERT statement. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The table name length field described above, identifies the actual length of this field. The schema name field above, together with this field form the fully qualified table name.

  10. Column name length (colnamelen)

    Length of column name below. It contains a 0 (zero) if a column name is not passed. This field is an unsigned short integer.

  11. Column name (colname)

    Under the exact same conditions as for table name, this field contains the name of the column being updated or inserted; otherwise not predictable. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The column name length field described above, identifies the actual length of this field.

  12. Version/Release number (ver_rel)

    An 8 character field that identifies the product and its version, release, and modification level with the format pppvvrrm where:

    • ppp identifies the product as follows:
      DSN
      DB2 for MVS/ESA or OS/390
      ARI
      SQL/DS
      QSQ
      DB2 for AS/400
      SQL
      DB2 Universal Database
    • vv is a two digit version identifier.
    • rr is a two digit release identifier.
    • m is a one digit modification level identifier.

  13. Platform (platform)

    The following identifies the operating platform for the application server:

    SQLUDF_PLATFORM_OS2
    OS/2
    SQLUDF_PLATFORM_AIX
    AIX
    SQLUDF_PLATFORM_NT
    NT
    SQLUDF_PLATFORM_HP
    HP-UX
    SQLUDF_PLATFORM_SUN
    Solaris
    SQLUDF_PLATFORM_SNI
    Siemens Nixdorf
    SQLUDF_PLATFORM_UNKNOWN
    Unknown platform

    For additional platforms that are not contained in the above list, see the contents of the sqludf.h file.

  14. Number of table function column list entries (numtfcol)

    The number of non-zero entries in the table function column list specified in the table function column list field below.

  15. Reserved field (resd1)

    This field is for future use. It is defined as 24 characters long.

  16. Table function column list (tfcolumn)

    If a table function is defined, this field is a pointer to an array containing 1 000 short integers that is dynamically allocated by DB2. If a table function is not defined, this pointer is null.

    This field is used only for table functions. Only the first n entries, where n is specified in the number of table function column list entries field are of interest. n may be equal to 0, and is less than or equal to the number of result columns defined for the function in the RETURNS TABLE(...) clause of the CREATE FUNCTION statement. The values correspond to the numbers of the columns which this statement needs from the table function. A value of '1' means the first defined result column, '2' means the second defined result column, and so on, and the values may be in any order. Note that n could be equal to zero, that is, the very first array element might be zero, for a statement similar to SELECT COUNT(*) FROM TABLE(TF(...)) AS QQ, where no actual values are needed.

    This array represents an opportunity for optimization. The UDF need not return all values for all the result columns of the table function, only those needed in the particular context, and these are the columns identified (by number) in the array. Since this optimization may complicate the UDF logic in order to gain the performance benefit, the UDF can optimally choose to return every defined column.

  17. Reserved field (resd2)

    This field is for future use. It is defined as 24 characters long.

Summary of UDF Argument Use

The following is a summary of the arguments described above, and how you use them in the interface between DB2 and an external UDF.

For scalar functions are:

A table function logically returns a table to the SQL statement that references it, but the physical interface between DB2 and the table function is row by row.

For table functions, the arguments are:

Observe that the normal outputs of the UDF, SQL-result, SQL-result-ind, and SQL-state, are returned to DB2 using arguments passed from DB2 to the UDF. Indeed, the UDF is written not to return anything in the functional sense (that is, the function's return type is void). See the void definition and the return statement in the following example:

     #include ... 
      void SQL_API_FN divid( 
           ... arguments ... ) 
     { 
           ... UDF body ... 
           return; 
     }

In the above example, SQL_API_FN is a macro that specifies the calling convention for a function that may vary across each supported operating platform. This macro is required when you write stored procedures or UDFs.

For programming examples of UDFs, see "Examples of UDF Code".

How the SQL Data Types are Passed to a UDF

This section identifies the valid types, for both UDF parameters and result, and specifies for each how the corresponding argument should be defined in your C or C++ language UDF. Note that if you use the sqludf.h include file and the types defined there, you can automatically generate language variables and structures that are correct for the different data types. This include file is discussed in "The UDF Include File: sqludf.h".

It is the data type for each function parameter defined in the CREATE FUNCTION statement that governs the format for argument values. Promotions from the argument data type may be needed to get the value in that format. Such promotions are performed automatically by DB2 on the argument values; argument promotion is discussed in the SQL Reference.

For the function result, it is the data type specified in the CAST FROM clause of the CREATE FUNCTION statement that defines the format. If no CAST FROM clause is present, then the data type specified in the RETURNS clause defines the format.

In the following example, the presence of the CAST FROM clause means that the UDF body returns a SMALLINT and that DB2 casts the value to INTEGER before passing it along to the statement where the function reference occurs:

     ... RETURNS INTEGER CAST FROM SMALLINT ...

In this case the UDF must be written to generate a SMALLINT, as defined below. Note that the CAST FROM data type must be castable to the RETURNS data type, so one cannot just arbitrarily choose another data type. Casting between data types is discussed in the SQL Reference.

The following is a list of the SQL types and their C language representations. It includes information on whether each type is valid as a parameter or a result. Also included are examples of how the types could appear as an argument definition in your C or C++ language UDF:

The type udf_locator is defined in the header file sqludf.h, which is discussed in "The UDF Include File: sqludf.h". The use of these locators is discussed in "Using LOB Locators as UDF Parameters or Results".

The UDF Include File: sqludf.h

This include file contains structures, definitions and values which are useful when writing your UDF. Its use is optional, however, and in the sample UDFs shown in "Examples of UDF Code", some examples use the include file. When compiling your UDF, you need to reference the directory which contains this file. This directory is sqllib/include.

The sqludf.h include file is self-describing. Following is a brief summary of its content:

  1. Structure definitions for the passed arguments which are structures:

  2. C language type definitions for all the SQL data types, for use in the definition of UDF arguments corresponding to SQL arguments and result having the data types. These are the definitions with names SQLUDF_x and SQLUDF_x_FBD where x is a SQL data type name, and FBD represents For Bit Data.

    Also included is a C language type for an argument or result which is defined with the AS LOCATOR appendage.

  3. Definition of C language types for the scratchpad and call-type arguments, with an enum type definition of the call-type argument.

  4. Macros for defining the standard trailing arguments, both with and without the inclusion of scratchpad and call-type arguments. This corresponds to the presence and absence of SCRATCHPAD and FINAL CALL keywords in the function definition. These are the SQL-state, function-name, specific-name, diagnostic-message, scratchpad and call-type UDF invocation arguments defined in "The Arguments Passed from DB2 to a UDF". Also included are definitions for referencing these constructs, and the various valid SQLSTATE values.

  5. Macros for testing whether the SQL arguments are null.

  6. Function prototypes for the APIs which can be used to manipulate LOB values by means of LOB locators passed to the UDF.

Some of the UDF examples in the next section illustrate the inclusion and use of sqludf.h.


Footnotes:

(5) DB2 treats the function result as null (-2) if the following is true:

This is also true if you define the function with the NOT NULL CALL option.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]