Embedded SQL Programming Guide

User-Defined Functions (UDF)

A user-defined function is a mechanism with which you can write your own extensions to SQL. The built-in functions supplied with DB2 are a useful set of functions, but they may not satisfy all of your requirements. Thus, you may need to extend SQL for the following reasons:

Customization.
The function specific to your application does not exist in DB2. Whether the function is a simple transformation, a trivial calculation, or a complicated multivariate analysis, you can probably use a UDF to do the job.
Flexibility.
The DB2 built-in function does not quite permit the variations that you wish to include in your application.
Standardization.
Many of the programs at your site implement the same basic set of functions, but there are minor differences in all the implementations. Thus, you are unsure about the consistency of the results you receive. If you correctly implement these functions once, in a UDF, then all these programs can use the same implementation directly in SQL and provide consistent results.
Object-relational support.
As discussed in "User-defined Distinct Types (UDT)", UDTs can be very useful in extending the capability and increasing the safety of DB2. UDFs act as the methods for UDTs, by providing behavior and encapsulating the types.

Why Use UDFs?

In writing DB2 applications, you have a choice when implementing desired actions or operations:

As a UDF
As a subroutine or function in your application.

Although it may seem easier to implement new operations as subroutines or functions in your application, there are good reasons why you should consider using UDFs:

Re-use.
If the new operation is something of which other users or programs at your site can take advantage, then UDFs can help to reuse it. In addition, the function can be invoked directly in SQL wherever an expression can be used by any user of the database. The database will take care of many data type promotions of the function arguments automatically, for example DECIMAL to DOUBLE, allowing your function to be applied to different, but compatible data types.
It may seem easier to implement your new function as a normal function and then make it available to others for use in their programs, thereby avoiding the need to define the function to DB2. This requires that you inform all other interested application developers, and package the function effectively for their use. However, it ignores the interactive users like those who normally use the Command Line Processor (CLP) to access the database. CLP users cannot use your function unless it is a UDF in the database. This also applies to any other tools that use SQL (such as Visualizer), that do not get recompiled.
Performance.
Invoking the UDF directly from the database engine instead of from your application can have a considerable performance advantage, particularly when the function may be used in the qualification of data for further processing. Consider a simple scenario where you want to process some data, provided you can meet some selection criteria which can be expressed as a function SELECTION_CRITERIA(). Your application could issue the following select statement:
```
     SELECT A,B,C FROM T
```
When it receives each row, it runs SELECTION_CRITERIA against the data to decide if it is interested in processing the data further. Here, every row of table T must be passed back to the application. But if SELECTION_CRITERIA() is implemented as a UDF, your application can issue the following statement:
```
     SELECT A,B,C FROM T WHERE SELECTION_CRITERIA(A,B) = 1
```
In this case, only the rows of interest are passed across the interface between the application and the database. For large tables, or for cases where SELECTION_CRITERIA supplies significant filtering, the performance improvement can be very significant.
Another case where a UDF can offer a performance benefit is when dealing with Large Objects (LOB). If you have a function whose purpose is to extract some information from a value of one of the LOB types, you can perform this extraction right on the database server and pass only the extracted value back to the application. This is more efficient than passing the entire LOB value back to the application and then performing the extraction. The performance value of packaging this function as a UDF could be enormous, depending on the particular situation. (Note that you can also extract a portion of a LOB by using a LOB locator. See "Example: Deferring the Evaluation of a LOB Expression" for an example of a similar scenario.)
In addition, with table functions, you can very efficiently use relational operations and the power of SQL on data that resides outside a DB2 database (including non-relational data stores). A table function takes individual scalar values of different types and meanings as its arguments, and returns a table to the SQL statement that invokes it. You can write table functions that generate only the data in which you are interested, eliminating any unwanted rows or columns. For more information on table functions, including rules on where you can use them, see the section on Functions in the SQL Reference.
Object Orientation.
You can implement the behavior of a user-defined distinct type (UDT), also called distinct type, using a UDF. For more information on UDTs, see "User-defined Distinct Types (UDT)". For additional details on UDTs and the important concept of castability discussed herein, see the CREATE DISTINCT TYPE statement in the SQL Reference. When you create a distinct type, you are automatically provided cast functions between the distinct type and its source type, and you may be provided comparison operators such as =, >, <, and so on, depending on the source type. You have to provide any additional behavior yourself. Because it is clearly desirable to keep the behavior of a distinct type in the database where all of the users of the distinct type can easily access it, UDFs can be used as the implementation mechanism.
For example, suppose you have a BOAT distinct type, defined over a one megabyte BLOB. The BLOB contains the various nautical specifications, and some drawings. You may wish to compare sizes of boats, and with a distinct type defined over a BLOB source type, you do not get the comparison operations automatically generated for you. You can implement a BOAT_COMPARE function which decides if one boat is bigger than another based on a measure that you choose. These could be: displacement, length over all, metric tonnage, or another calculation based on the BOAT object. You create the BOAT_COMPARE function as follows:
```
     CREATE FUNCTION BOAT_COMPARE (BOAT, BOAT) RETURNS INTEGER ...
```
If your function returns 1 if the first BOAT is bigger, 2 if the second is bigger, and 0if they are equal, you could use this function in your SQL code to compare boats. Suppose you create the following tables:
```
     CREATE TABLE BOATS_INVENTORY ( 
       BOAT_ID        CHAR(5), 
       BOAT_TYPE      VARCHAR(25), 
       DESIGNER       VARCHAR(40), 
       OWNER          VARCHAR(40), 
       DESIGN_DATE    DATE, 
       SPEC           BOAT, 
       ...                  ) 
  
     CREATE TABLE MY_BOATS ( 
       BOAT_ID        CHAR(5), 
       BOAT_TYPE      VARCHAR(25), 
       DESIGNER       VARCHAR(40), 
       DESIGN_DATE    DATE, 
       ACQUIRE_DATE   DATE, 
       ACQUIRE_PRICE  CANADIAN_DOLLAR, 
       CURR_APPRAISL  CANADIAN_DOLLAR, 
       SPEC           BOAT, 
       ...                  )
```
You can execute the following SQL SELECT statement:
```
     SELECT INV.BOAT_ID, INV.BOAT_TYPE, INV.DESIGNER, 
       INV.OWNER, INV.DESIGN_DATE 
       FROM BOATS_INVENTORY INV, MY_BOATS MY 
       WHERE MY.BOAT_ID = '19GCC' 
       AND BOAT_COMPARE(INV.SPEC, MY.SPEC) = 1
```
This simple example returns all the boats from BOATS_INVENTORY that are bigger than a particular boat in MY_BOATS. Note that the example only passes the rows of interest back to the application because the comparison occurs in the database server. In fact, it completely avoids passing any values of data type BOAT. This is a significant improvement in storage and performance as BOAT is based on a one megabyte BLOB data type.

UDF Concepts

The following is a discussion of the important concepts you need to know prior to coding UDFs:

Full name of a function.
The full name of a function is <schema-name>.<function-name>. You can use this full name anywhere you refer to a function. For example:
```
     SLICKO.BOAT_COMPARE    SMITH.FOO    SYSIBM.SUBSTR    SYSFUN.FLOOR
```
However, you may also omit the <schema-name>., in which case, DB2 must determine the function to which you are referring. For example:
```
     BOAT_COMPARE    FOO    SUBSTR    FLOOR
```
Function Path
The concept of function path is central to DB2's resolution of unqualified references that occur when you do not use the schema-name. For the use of function path in DDL statements that refer to functions, see description of the corresponding statement in the SQL Reference. The function path is an ordered list of schema names. It provides a set of schemas for resolving unqualified function references to UDFs as well as UDTs. In cases where a function reference matches functions in more than one schema in the path, the order of the schemas in the path is used to resolve this match. The function path is established by means of the FUNCPATH option on the precompile and bind commands for static SQL. The function path is set by the SET CURRENT FUNCTION PATH statement for dynamic SQL. The function path has the following default value:
```
     "SYSIBM","SYSFUN","<ID>" 
```
This applies to both static and dynamic SQL, where <ID> represents the current statement authorization ID.
Overloaded function names.
Function names can be overloaded, which means that multiple functions, even in the same schema, can have the same name. Two functions cannot, however, have the same signature, which can be defined to be the qualified function name concatenated with the defined data types of all the function parameters in the order in which they are defined. For an example of an overloaded function, see "Example: BLOB String Search".
Function selection algorithm
It is the function selection algorithm that takes into account the facts of overloading and function path to choose the best fit for every function reference, whether it is a qualified or an unqualified reference. Even references to the built-in functions and the functions (also IBM-supplied) in the SYSFUN schema are processed through the function selection algorithm.
Types of function.
Each user-defined function is classified as a scalar, column or table function. A scalar function returns a single value answer each time it is called. For example, the built-in function SUBSTR() is a scalar function. Scalar UDFs can either be external (coded in a programming language such as C), or sourced (using the implementation of an existing function).
A column function receives a set of like values (a column of data) and returns a single value answer from this set of values. These are also called aggregating functions in DB2. An example of a column function is the built-in function AVG(). An external column UDF cannot be defined to DB2, but a column UDF that is sourced on one of the built-in column functions can be defined. This is useful for distinct types. For example, if a distinct type SHOESIZE exists that is defined with base type INTEGER, you could define a UDF, AVG(SHOESIZE), as a column function sourced on the existing built-in column function, AVG(INTEGER).
A table function returns a table to the SQL statement that references it. A table function can only be referenced in the FROM clause of a SELECT statement. Such a function can be used to apply the SQL language to non-DB2 data, or to capture such data and put it into a DB2 table. For example, it could dynamically convert a file consisting of non-DB2 data into a table, or it could retrieve data from the World Wide Web or an operating system and tabularize it. A table function can only be an external function.

The concept of function path, the SET CURRENT FUNCTION PATH statement, and the function selection algorithm are discussed in detail in the SQL Reference. The FUNCPATH precompile and bind options are discussed in detail in the Command Reference.

Implementing UDFs

The process of implementing a UDF requires the following steps:

Writing the UDF
Compiling the UDF
Linking the UDF
Debugging the UDF
Registering the UDF with DB2.

After these steps are successfully completed, your UDF is ready for use in data manipulation language (DML) or data definition language (DDL) statements such as CREATE VIEW. The steps of writing and defining UDFs are discussed in the following sections, followed by a discussion on using UDFs. For information on compiling and linking UDFs, see the DB2 SDK Building Applications book for your operating system. For information on debugging your UDF, see "Debugging your UDF".

Writing UDFs

You can find the details of how you write UDFs in Chapter 7. "Writing User-Defined Functions (UDFs)". This includes the details on the interface between DB2 and UDFs, UDF coding considerations, UDF coding examples, and UDF debugging information. For information on the related tasks of compiling and linking your UDFs, see the DB2 SDK Building Applications book for your operating system.

Registering UDFs

You should register the UDF to DB2 after you have written and completely tested the actual code. Note that it is possible to define the UDF prior to actually writing it. However, to avoid any problems with running your UDF, you are encouraged to write and test it extensively before registering it. For information on testing your UDF, see "Debugging your UDF".

Use the CREATE FUNCTION statement to define (or register) your UDF to DB2. You can find detailed explanations for this statement and its options in the SQL Reference.

Examples of Registering UDFs

The examples which follow illustrate a variety of typical situations where UDFs can be registered.

Note that in these examples:

The keyword or keyword/value specifications are always shown in the same order, for consistency of presentation and ease of understanding. In actually writing one of these CREATE FUNCTION statements, after the function name and the list of parameter data types, the specifications can appear in any order.
The specifications in the EXTERNAL NAME clause are always shown for DB2 for AIX. You may need to make changes if you run these examples on non-UNIX platforms. For example, by converting all the slash (/) characters to back slash characters (\) and adding a drive letter such as C:, you have examples that are valid in OS/2 or Windows environments. See the SQL Reference for a complete discussion of the EXTERNAL NAME clause.

Example: Exponentiation

Suppose you have written an external UDF to perform exponentiation of floating point values, and wish to register it in the MATH schema. Assume that you have DBADM authority. As you have tested the function extensively, and know that it does not represent any integrity exposure, you define it as NOT FENCED. By virtue of having DBADM authority, you possess the database authority, CREATE_NOT_FENCED, which is required to define the function as NOT FENCED.

     CREATE FUNCTION MATH.EXPON (DOUBLE, DOUBLE) 
       RETURNS DOUBLE 
       EXTERNAL NAME '/common/math/exponent' 
       LANGUAGE C 
       PARAMETER STYLE DB2SQL 
       NO SQL 
       DETERMINISTIC
       NO EXTERNAL ACTION 
       NOT FENCED

In this example, the system uses the NOT NULL CALL default value. This is desirable since you want the result to be NULL if either argument is NULL. Since you do not require a scratchpad and no final call is necessary, the NO SCRATCHPAD and NO FINAL CALL default values are used. As there is no reason why EXPON cannot be parallel, the ALLOW PARALLELISM default value is used.

Example: String Search

Your associate, Willie, has written a UDF to look for the existence of a given short string, passed as an argument, within a given CLOB value, which is also passed as an argument. The UDF returns the position of the string within the CLOB if it finds the string, or zero if it does not. Because you are concerned with database integrity for this function as you suspect the UDF is not fully tested, you define the function as FENCED.

Additionally, Willie has written the function to return a FLOAT result. Suppose you know that when it is used in SQL, it should always return an INTEGER. You can create the following function:

     CREATE FUNCTION FINDSTRING (CLOB(500K), VARCHAR(200)) 
       RETURNS INTEGER 
       CAST FROM FLOAT 
       SPECIFIC "willie_find_feb95" 
       EXTERNAL NAME '/u/willie/testfunc/testmod!findstr' 
       LANGUAGE C 
       PARAMETER STYLE DB2SQL 
       NO SQL 
       DETERMINISTIC
       NO EXTERNAL ACTION 
       FENCED

Note that a CAST FROM clause is used to specify that the UDF body really returns a FLOAT value but you want to cast this to INTEGER before returning the value to the statement which used the UDF. As discussed in the SQL Reference, the INTEGER built-in function can perform this cast for you. Also, you wish to provide your own specific name for the function and later reference it in DDL (see "Example: String Search over UDT"). Because the UDF was not written to handle NULL values, you use the NOT NULL CALL default value. And because there is no scratchpad, you use the NO SCRATCHPAD and NO FINAL CALL default values. As there is no reason why FINDSTRING cannot be parallel, the ALLOW PARALLELISM default value is used.

Example: BLOB String Search

Because you want this function to work on BLOBs as well as on CLOBs, you define another FINDSTRING taking BLOB as the first parameter:

     CREATE FUNCTION FINDSTRING (BLOB(500K), VARCHAR(200)) 
       RETURNS INTEGER 
       CAST FROM FLOAT 
       SPECIFIC "willie_fblob_feb95" 
       EXTERNAL NAME '/u/willie/testfunc/testmod!findstr' 
       LANGUAGE C 
       PARAMETER STYLE DB2SQL 
       NO SQL 
       DETERMINISTIC 
       NO EXTERNAL ACTION 
       FENCED

This example illustrates overloading of the UDF name, and shows that multiple UDFs can share the same body. Note that although a BLOB cannot be assigned to a CLOB, the same source code can be used. There is no programming problem in the above example as the programming interface for BLOB and CLOB between DB2 and UDF is the same; length followed by data. DB2 does not check if the UDF using a particular function body is in any way consistent with any other UDF using the same body.

Example: String Search over UDT

This example is a continuation of the previous example. Say you are satisfied with the FINDSTRING functions from "Example: BLOB String Search", but now you have defined a distinct type BOAT with source type BLOB. You also want FINDSTRING to operate on values having data type BOAT, so you create another FINDSTRING function. This function is sourced on the FINDSTRING which operates on BLOB values in "Example: BLOB String Search". Note the further overloading of FINDSTRING in this example:

     CREATE FUNCTION FINDSTRING (BOAT, VARCHAR(200)) 
       RETURNS INT 
       SPECIFIC "slick_fboat_mar95" 
       SOURCE SPECIFIC "willie_fblob_feb95"

Note that this FINDSTRING function has a different signature from the FINDSTRING functions in "Example: BLOB String Search", so there is no problem overloading the name. You wish to provide our own specific name for possible later reference in DDL. Because you are using the SOURCE clause, you cannot use the EXTERNAL NAME clause or any of the related keywords specifying function attributes. These attributes are taken from the source function. Finally, observe that in identifying the source function you are using the specific function name explicitly provided in "Example: BLOB String Search". Because this is an unqualified reference, the schema in which this source function resides must be in the function path, or the reference will not be resolved.

Example: External Function with UDT Parameter

You have written another UDF to take a BOAT and examine its design attributes and generate a cost for the boat in Canadian dollars. Even though internally, the labor cost may be priced in German marks, or Japanese yen, or US dollars, this function needs to generate the cost to build the boat in the required currency, Canadian dollars. This means it has to get current exchange rate information from the exchange_rate file, managed outside of DB2, and the answer depends on what it finds in this file. This makes the function NOT DETERMINISTIC (or VARIANT).

     CREATE FUNCTION BOAT_COST (BOAT) 
       RETURNS INTEGER 
       EXTERNAL NAME '/u/marine/funcdir/costs!boatcost' 
       LANGUAGE C 
       PARAMETER STYLE DB2SQL 
       NO SQL 
       NOT DETERMINISTIC
       NO EXTERNAL ACTION 
       FENCED

Observe that CAST FROM and SPECIFIC are not specified, but that NOT DETERMINISTIC is specified. Here again, FENCED is chosen for safety reasons.

Example: AVG over a UDT

This example implements the AVG column function over the CANADIAN_DOLLAR distinct type. See "Example: Money" for the definition of CANADIAN_DOLLAR. Strong typing prevents you from using the built-in AVG function on a distinct type. It turns out that the source type for CANADIAN_DOLLAR was DECIMAL, and so you implement the AVG by sourcing it on the AVG(DECIMAL) built-in function. The ability to do this depends on being able to cast from DECIMAL to CANADIAN_DOLLAR and vice versa, but since DECIMAL is the source type for CANADIAN_DOLLAR you know these casts will work.

     CREATE FUNCTION AVG (CANADIAN_DOLLAR) 
       RETURNS CANADIAN_DOLLAR 
       SOURCE "SYSIBM".AVG(DECIMAL(9,2))

Note that in the SOURCE clause you have qualified the function name, just in case there might be some other AVG function lurking in your function path.

Example: Counting

Your simple counting function returns a 1 the first time and increments the result by one each time it is called. This function takes no SQL arguments, and by definition it is a NOT DETERMINISTIC function since its answer varies from call to call. It uses the scratchpad to save the last value returned, and each time it is invoked it increments this value and returns it. You have rigorously tested this function, and possess DBADM authority on the database, so you will define it as NOT FENCED. (DBADM implies CREATE_NOT_FENCED.)

     CREATE FUNCTION COUNTER () 
       RETURNS INT 
       EXTERNAL NAME '/u/roberto/myfuncs/util!ctr' 
       LANGUAGE C 
       PARAMETER STYLE DB2SQL 
       NO SQL 
       NOT DETERMINISTIC
       NOT FENCED 
       SCRATCHPAD
       DISALLOW PARALLEL

Note that no parameter definitions are provided, just empty parentheses. The above function specifies SCRATCHPAD, and uses the default specification of NO FINAL CALL. In this case, as the size of the 100-byte scratchpad is sufficient, no storage has to be freed by means of a final call, and so NO FINAL CALL is specified. Since the COUNTER function requires that a single scratchpad be used to operate properly, DISALLOW PARALLEL is added to prevent DB2 from operating it in parallel. To see an implementation of this COUNTER function, refer to "Example: Counter".

EXAMPLE: Counting with an OLE automation object

This example implements the previous counting example as an OLE (Object Linking and Embedding) automation object, counter, with an instance variable, nbrOfInvoke, to keep track of the number of invocations. Every time the UDF gets invoked, the increment method of the object increments the nbrOfInvoke instance variable and returns its current state. The automation object is registered in the Windows registry with the OLE programmatic identifier (progID) bert.bcounter.

     CREATE FUNCTION bcounter ()
       RETURNS integer
       EXTERNAL NAME 'bert.bcounter!increment'
       LANGUAGE OLE 
       PARAMETER STYLE DB2SQL
       SCRATCHPAD 
       NOT DETERMINISTIC 
       FENCED
       NULL CALL 
       NO SQL
       NO EXTERNAL ACTION
       DISALLOW PARALLEL;

The implementation of the class counter is shown in "Example: Counter OLE Automation UDF in BASIC" and in "Example: Counter OLE Automation UDF in C++". For details of OLE support with DB2, see "Writing OLE Automation UDFs".

EXAMPLE: Table Function Returning Document IDs

You have written a table function which returns a row consisting of a single document identifier column for each known document in your text management system which matches a given subject area (the first parameter) and contains the given string (second parameter). This UDF uses the functions of the text management system to quickly identify the documents:

     CREATE FUNCTION DOCMATCH (VARCHAR(30), VARCHAR(255))
       RETURNS TABLE (DOC_ID CHAR(16))
       EXTERNAL NAME '/common/docfuncs/rajiv/udfmatch'
       LANGUAGE C
       PARAMETER STYLE DB2SQL
       NO SQL
       DETERMINISTIC
       NO EXTERNAL ACTION
       NOT FENCED
       SCRATCHPAD
       FINAL CALL
       DISALLOW PARALLEL
       CARDINALITY 20

Within the context of a single session it will always return the same table, and therefore it is defined as DETERMINISTIC. Note the RETURNS clause which defines the output from DOCMATCH, including the column name DOC_ID, and that FINAL CALL must be specified for each table function. In addition, the DISALLOW PARALLEL keyword is added as table functions cannot operate in parallel. Although the size of the output from DOCMATCH is highly variable, CARDINALITY 20 is a representative value, and is specified to help the DB2 optimizer to make good decisions.

Typically this table function would be used in a join with the table containing the document text, as follows:

  SELECT T.AUTHOR, T.DOCTEXT                                           
  FROM DOCS as T, TABLE(DOCMATCH('MATHEMATICS', 'ZORN''S LEMMA')) as F 
  WHERE T.DOCID = F.DOC_ID

Note the special syntax (TABLE keyword) for specifying a table function in a FROM clause. In this invocation, the docmatch() table function returns a row containing the single column DOC_ID for each mathematics document referencing Zorn's Lemma. These DOC_ID values are joined to the master document table, retrieving the author's name and document text.

Using UDFs

Scalar and column UDFs can be invoked within an SQL statement wherever an expression is valid (there are additional rules for all column functions that limit validity). Table UDFs can only be referenced in the FROM clause of a SELECT. The SQL Reference discusses all these contexts in detail. The discussion and examples used in this section focus on relatively simple SELECT statement contexts, but note that their use is not restricted to these contexts.

Refer to "UDF Concepts" for a summary of the use and importance of the function path and the function selection algorithm. You can find the details for both of these concepts in the SQL Reference. The resolution of any Data Manipulation Language (DML) reference to a function uses the function selection algorithm, so it is important to understand how it works.

Referring to Functions

Each reference to a function, whether it is a UDF, or a built-in function, contains the following syntax:

>--function_name--(--.----------------.--)---------->
                     | <-,----------< |              
                     '---expression---'

In the above, function_name can be either an unqualified or a qualified function name, and the arguments can number from 0 to 90, and are expressions which may contain:

A column name, qualified or unqualified
A constant
A host variable
A special register
A parameter marker. (For information on the limitations of parameter marker use, refer to the section in the SQL Reference that describes the rules for parameter markers.)

The position of the arguments is important and must conform to the function definition for the semantics to be correct. Both the position of the arguments and the function definition must conform to the function body itself. DB2 does not attempt to shuffle arguments to better match a function definition, and DB2 does not understand the semantics of the individual function parameters.

Use of column names in UDF argument expressions requires that the table references which contain the column have proper scope. For table functions referenced in a join, this means that for any argument which involves columns from another table or table function, that other table or table function must appear before the table function containing the reference, in the FROM clause. For a complete discussion of the rules for using columns in the arguments of table functions, see the chapter on Queries in the SQL Reference.

Examples of Function Invocations

Some valid examples of function invocations are:

      AVG(FLOAT_COLUMN) 
      BLOOP(COLUMN1) 
      BLOOP(FLOAT_COLUMN + CAST(? AS INTEGER)) 
      BLOOP(:hostvar :indicvar) 
      BRIAN.PARSE(CHAR_COLUMN CONCAT USER, 1, 0, 0, 1) 
      CTR() 
      FLOOR(FLOAT_COLUMN) 
      PABLO.BLOOP(A+B) 
      PABLO.BLOOP(:hostvar) 
      "search_schema"(CURRENT FUNCTION PATH, 'GENE')
      SUBSTR(COLUMN2,8,3) 
      SYSFUN.FLOOR(AVG(EMP.SALARY)) 
      SYSFUN.AVG(SYSFUN.FLOOR(EMP.SALARY)) 
      SYSIBM.SUBSTR(COLUMN2,11,LENGTH(COLUMN3)) 
      SQRT(SELECT SUM(length*length) 
           FROM triangles 
           WHERE id= 'J522' 
           AND legtype <> 'HYP')

Note that if any of the above functions are table functions, the syntax to reference them is slightly different than presented above. For example, if PABLO.BLOOP is a table function, to properly reference it, use:

     TABLE(PABLO.BLOOP(A+B)) AS Q

Using Parameter Markers in Functions

An important restriction involves parameter markers; you cannot simply code the following:

     BLOOP(?)

As the function selection logic does not know what data type the argument may turn out to be, it cannot resolve the reference. You can use the CAST specification to provide a type for the parameter marker, for example INTEGER, and then the function selection logic can proceed:

     BLOOP(CAST(? AS INTEGER))

Using Qualified Function Reference

If you use a qualified function reference, you restrict DB2's search for a matching function to that schema. For example, you have the following statement:

     SELECT PABLO.BLOOP(COLUMN1) FROM T

Only the BLOOP functions in schema PABLO are considered. It does not matter that user SERGE has defined a BLOOP function, or whether or not there is a built-in BLOOP function. Now suppose that user PABLO has defined two BLOOP functions in his schema:

     CREATE FUNCTION BLOOP (INTEGER) RETURNS ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS ...

BLOOP is thus overloaded within the PABLO schema, and the function selection algorithm would choose the best BLOOP, depending on the data type of the argument, column1. In this case, both of the PABLO.BLOOPs take numeric arguments, and if column1 is not one of the numeric types, the statement will fail. On the other hand if column1 is either SMALLINT or INTEGER, function selection will resolve to the first BLOOP, while if column1 is DECIMAL or DOUBLE, the second BLOOP will be chosen.

Several points about this example:

It illustrates argument promotion. The first BLOOP is defined with an INTEGER parameter, yet you can pass it a SMALLINT argument. The function selection algorithm supports promotions among the built-in data types (for details, see the SQL Reference) and DB2 performs the appropriate data value conversions.
If for some reason you want to invoke the second BLOOP with a SMALLINT or INTEGER argument, you have to take an explicit action in your statement as follows:
```
     SELECT PABLO.BLOOP(DOUBLE(COLUMN1)) FROM T
```
Alternatively, if you want to invoke the first BLOOP with a DECIMAL or DOUBLE argument, you have your choice of explicit actions, depending on your exact intent:
```
     SELECT PABLO.BLOOP(INTEGER(COLUMN1)) FROM T
     SELECT PABLO.BLOOP(FLOOR(COLUMN1)) FROM T 
     SELECT PABLO.BLOOP(CEILING(COLUMN1)) FROM T 
     SELECT PABLO.BLOOP(INTEGER(ROUND(COLUMN1,0))) FROM T
```
You should investigate these other functions in the SQL Reference. The INTEGER function is a built-in function in the SYSIBM schema. The FLOOR, CEILING, and ROUND functions are UDFs shipped with DB2, which you can find in the SYSFUN schema along with many other useful functions.

Using Unqualified Function Reference

If, instead of a qualified function reference, you use an unqualified function reference, DB2's search for a matching function normally uses the function path to qualify the reference. In the case of the DROP FUNCTION or COMMENT ON FUNCTION functions, the reference is qualified using the current authorization ID, if they are unqualified. Thus, it is important that you know what your function path is, and what, if any, conflicting functions exist in the schemas of your current function path. For example, suppose you are PABLO and your static SQL statement is as follows, where COLUMN1 is data type INTEGER:

     SELECT BLOOP(COLUMN1) FROM T

You have created the two BLOOP functions cited in "Using Qualified Function Reference", and you want and expect one of them to be chosen. If the following default function path is used, the first BLOOP is chosen (since column1 is INTEGER), if there is no conflicting BLOOP in SYSIBM or SYSFUN:

     "SYSIBM","SYSFUN","PABLO"

However, suppose you have forgotten that you are using a script for precompiling and binding which you previously wrote for another purpose. In this script, you explicitly coded your FUNCPATH parameter to specify the following function path for another reason that does not apply to your current work:

     "KATHY","SYSIBM","SYSFUN","PABLO"

If Kathy has written a BLOOP function for her own purposes, the function selection could very well resolve to Kathy's function, and your statement would execute without error. You are not notified because DB2 assumes that you know what you are doing. It becomes your responsibility to identify the incorrect output from your statement and make the required correction.

Summary of Function References

For both qualified and unqualified function references, the function selection algorithm looks at all the applicable functions, both built-in and user-defined, that have:

The given name
The same number of defined parameters as arguments in the function reference
Each parameter identical to or promotable from the type of the corresponding argument.

(Applicable functions means functions in the named schema for a qualified reference, or functions in the schemas of the function path for an unqualified reference.) The algorithm looks for an exact match, or failing that, a best match among these functions. The current function path is used, in the case of an unqualified reference only, as the deciding factor if two identically good matches are found in different schemas. The details of the algorithm can be found in the SQL Reference.

An interesting feature, illustrated by the examples at the end of "Using Qualified Function Reference", is the fact that function references can be nested, even references to the same function. This is generally true for built-in functions as well as UDFs; however, there are some limitations when column functions are involved.

Refining an earlier example:

     CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS INTEGER ...

Now consider the following DML statement:

     SELECT BLOOP( BLOOP(COLUMN1)) FROM T

If column1 is a DECIMAL or DOUBLE column, the inner BLOOP reference resolves to the second BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP resolves to the first BLOOP.

Alternatively, if column1 is a SMALLINT or INTEGER column, the inner bloop reference resolves to the first BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP also resolves to the first BLOOP. In this case, you are seeing nested references to the same function.

A few additional points important for function references are:

By defining a function with the name of one of the SQL operators, you can actually invoke a UDF using infix notation. For example, suppose you can attach some meaning to the "+" operator for values which have distinct type BOAT. You can define the following UDF:
```
     CREATE FUNCTION "+" (BOAT, BOAT) RETURNS ...
```
Then you can write the following valid SQL statement:
```
     SELECT BOAT_COL1 + BOAT_COL2 
     FROM BIG_BOATS 
     WHERE BOAT_OWNER = 'Nelson Mattos'
```
But you can also write the equally valid statement:
```
     SELECT "+"(BOAT_COL1, BOAT_COL2) 
     FROM BIG_BOATS 
     WHERE BOAT_OWNER = 'Nelson Mattos' 
```
Note that you are not permitted to overload the built-in conditional operators such as >, =, LIKE, IN, and so on, in this way. See "Example: Integer Divide Operator" for an example of a UDF which overloads the divide (/) operator.
The function selection algorithm does not consider the context of the reference in resolving to a particular function. Look at these BLOOP functions, modified a bit from before:
```
     CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS CHAR(10)...
```
Now suppose you write the following SELECT statement:
```
     SELECT 'ABCDEFG' CONCAT BLOOP(SMALLINT_COL) FROM T
```
Because the best match, resolved using the SMALLINT argument, is the first BLOOP defined above, the second operand of the CONCAT resolves to data type INTEGER. The statement fails because CONCAT demands string arguments. If the first BLOOP was not present, the other BLOOP would be chosen and the statement execution would be successful.
Another type of contextual inconsistency that causes a statement to fail is if a given function reference resolves to a table function in a context that requires a scalar or column function. The reverse could also occur. A reference could resolve to a scalar or column function when a table function is necessary.
UDFs can be defined with parameters or results having any of the LOB types: BLOB, CLOB, or DBCLOB. DB2 will materialize the entire LOB value in storage before invoking such a function, even if the source of the value is a LOB locator host variable. For example, consider the following fragment of a C language application:
```
     EXEC SQL BEGIN DECLARE SECTION; 
       SQL TYPE IS CLOB(150K) clob150K ;        /* LOB host var */ 
       SQL TYPE IS CLOB_LOCATOR clob_locator1;  /* LOB locator host var */ 
       char                   string[40];       /* string host var */ 
     EXEC SQL END DECLARE SECTION; 
```
Either host variable :clob150K or :clob_locator1 is valid as an argument for a function whose corresponding parameter is defined as CLOB(500K). Thus, referring to the FINDSTRING defined in "Example: String Search", both of the following are valid in the program:
```
     ... SELECT FINDSTRING (:clob150K, :string) FROM ... 
     ... SELECT FINDSTRING (:clob_locator1, :string) FROM ...
```
UDF parameters or results which have one of the LOB types can be created with the AS LOCATOR modifier. In this case, the entire LOB value is not materialized prior to invocation. Instead, a LOB LOCATOR is passed to the UDF, which can then use the special UDF APIs to manipulate the actual bytes of the LOB value (see "Using LOB Locators as UDF Parameters or Results" for details).
You can also use this capability on UDF parameters or results which have a distinct type that is based on a LOB. This capability is limited to UDFs defined as not-fenced. Note that the argument to such a function can be any LOB value of the defined type; it does not have to be a host variable defined as one of the LOCATOR types. The use of host variable locators as arguments is completely orthogonal to the use of AS LOCATOR in UDF parameters and result definitions.
UDFs can be defined with distinct types as parameters or as the result. (Earlier examples have illustrated this.) DB2 will pass the value to the UDF in the format of the source data type of the distinct type.
Distinct type values which originate in a host variable and which are used as arguments to a UDF which has its corresponding parameter defined as a distinct type, must be explicitly cast to the distinct type by the user. There is no host language type for distinct types. DB2's strong typing necessitates this. Otherwise your results may be ambiguous. So, consider the BOAT distinct type which is defined over a BLOB, and consider the BOAT_COST UDF from "Example: External Function with UDT Parameter", which takes an object of type BOAT as its argument. In the following fragment of a C language application, the host variable :ship holds the BLOB value that is to passed to the BOAT_COST function:
```
     EXEC SQL BEGIN DECLARE SECTION; 
       SQL TYPE IS BLOB(150K) ship; 
     EXEC SQL END DECLARE SECTION;
```
Both of the following statements correctly resolve to the BOAT_COST function, because both cast the :ship host variable to type BOAT:
```
     ... SELECT BOAT_COST (BOAT(:ship)) FROM ... 
     ... SELECT BOAT_COST (CAST(:ship AS BOAT)) FROM ...
```
If there are multiple BOAT distinct types in the database, or BOAT UDFs in other schema, you must exercise care with your function path. Otherwise your results may be ambiguous.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]