A user-defined function is a mechanism with which you can write your own extensions to SQL. The built-in functions supplied with DB2 are a useful set of functions, but they may not satisfy all of your requirements. Thus, you may need to extend SQL for the following reasons:
The function specific to your application does not exist in DB2. Whether the function is a simple transformation, a trivial calculation, or a complicated multivariate analysis, you can probably use a UDF to do the job.
The DB2 built-in function does not quite permit the variations that you wish to include in your application.
Many of the programs at your site implement the same basic set of functions, but there are minor differences in all the implementations. Thus, you are unsure about the consistency of the results you receive. If you correctly implement these functions once, in a UDF, then all these programs can use the same implementation directly in SQL and provide consistent results.
As discussed in "User-defined Distinct Types (UDT)", UDTs can be very useful in extending the capability and increasing the safety of DB2. UDFs act as the methods for UDTs, by providing behavior and encapsulating the types.
In writing DB2 applications, you have a choice when implementing desired actions or operations:
Although it may seem easier to implement new operations as subroutines or functions in your application, there are good reasons why you should consider using UDFs:
If the new operation is something of which other users or programs at your site can take advantage, then UDFs can help to reuse it. In addition, the function can be invoked directly in SQL wherever an expression can be used by any user of the database. The database will take care of many data type promotions of the function arguments automatically, for example DECIMAL to DOUBLE, allowing your function to be applied to different, but compatible data types.
It may seem easier to implement your new function as a normal function and then make it available to others for use in their programs, thereby avoiding the need to define the function to DB2. This requires that you inform all other interested application developers, and package the function effectively for their use. However, it ignores the interactive users like those who normally use the Command Line Processor (CLP) to access the database. CLP users cannot use your function unless it is a UDF in the database. This also applies to any other tools that use SQL (such as Visualizer), that do not get recompiled.
Invoking the UDF directly from the database engine instead of from your application can have a considerable performance advantage, particularly when the function may be used in the qualification of data for further processing. Consider a simple scenario where you want to process some data, provided you can meet some selection criteria which can be expressed as a function SELECTION_CRITERIA(). Your application could issue the following select statement:
SELECT A,B,C FROM T
When it receives each row, it runs SELECTION_CRITERIA against the data to decide if it is interested in processing the data further. Here, every row of table T must be passed back to the application. But if SELECTION_CRITERIA() is implemented as a UDF, your application can issue the following statement:
SELECT A,B,C FROM T WHERE SELECTION_CRITERIA(A,B) = 1
In this case, only the rows of interest are passed across the interface between the application and the database. For large tables, or for cases where SELECTION_CRITERIA supplies significant filtering, the performance improvement can be very significant.
Another case where a UDF can offer a performance benefit is when dealing with Large Objects (LOB). If you have a function whose purpose is to extract some information from a value of one of the LOB types, you can perform this extraction right on the database server and pass only the extracted value back to the application. This is more efficient than passing the entire LOB value back to the application and then performing the extraction. The performance value of packaging this function as a UDF could be enormous, depending on the particular situation. (Note that you can also extract a portion of a LOB by using a LOB locator. See "Example: Deferring the Evaluation of a LOB Expression" for an example of a similar scenario.)
In addition, with table functions, you can very efficiently use relational operations and the power of SQL on data that resides outside a DB2 database (including non-relational data stores). A table function takes individual scalar values of different types and meanings as its arguments, and returns a table to the SQL statement that invokes it. You can write table functions that generate only the data in which you are interested, eliminating any unwanted rows or columns. For more information on table functions, including rules on where you can use them, see the section on Functions in the SQL Reference.
You can implement the behavior of a user-defined distinct type (UDT), also called distinct type, using a UDF. For more information on UDTs, see "User-defined Distinct Types (UDT)". For additional details on UDTs and the important concept of castability discussed herein, see the CREATE DISTINCT TYPE statement in the SQL Reference. When you create a distinct type, you are automatically provided cast functions between the distinct type and its source type, and you may be provided comparison operators such as =, >, <, and so on, depending on the source type. You have to provide any additional behavior yourself. Because it is clearly desirable to keep the behavior of a distinct type in the database where all of the users of the distinct type can easily access it, UDFs can be used as the implementation mechanism.
For example, suppose you have a BOAT distinct type, defined over a one megabyte BLOB. The BLOB contains the various nautical specifications, and some drawings. You may wish to compare sizes of boats, and with a distinct type defined over a BLOB source type, you do not get the comparison operations automatically generated for you. You can implement a BOAT_COMPARE function which decides if one boat is bigger than another based on a measure that you choose. These could be: displacement, length over all, metric tonnage, or another calculation based on the BOAT object. You create the BOAT_COMPARE function as follows:
CREATE FUNCTION BOAT_COMPARE (BOAT, BOAT) RETURNS INTEGER ...
If your function returns 1 if the first BOAT is bigger, 2 if the second is bigger, and 0if they are equal, you could use this function in your SQL code to compare boats. Suppose you create the following tables:
CREATE TABLE BOATS_INVENTORY ( BOAT_ID CHAR(5), BOAT_TYPE VARCHAR(25), DESIGNER VARCHAR(40), OWNER VARCHAR(40), DESIGN_DATE DATE, SPEC BOAT, ... ) CREATE TABLE MY_BOATS ( BOAT_ID CHAR(5), BOAT_TYPE VARCHAR(25), DESIGNER VARCHAR(40), DESIGN_DATE DATE, ACQUIRE_DATE DATE, ACQUIRE_PRICE CANADIAN_DOLLAR, CURR_APPRAISL CANADIAN_DOLLAR, SPEC BOAT, ... )
You can execute the following SQL SELECT statement:
SELECT INV.BOAT_ID, INV.BOAT_TYPE, INV.DESIGNER, INV.OWNER, INV.DESIGN_DATE FROM BOATS_INVENTORY INV, MY_BOATS MY WHERE MY.BOAT_ID = '19GCC' AND BOAT_COMPARE(INV.SPEC, MY.SPEC) = 1
This simple example returns all the boats from BOATS_INVENTORY that are bigger than a particular boat in MY_BOATS. Note that the example only passes the rows of interest back to the application because the comparison occurs in the database server. In fact, it completely avoids passing any values of data type BOAT. This is a significant improvement in storage and performance as BOAT is based on a one megabyte BLOB data type.
The following is a discussion of the important concepts you need to know prior to coding UDFs:
The full name of a function is <schema-name>.<function-name>. You can use this full name anywhere you refer to a function. For example:
SLICKO.BOAT_COMPARE SMITH.FOO SYSIBM.SUBSTR SYSFUN.FLOOR
However, you may also omit the <schema-name>., in which case, DB2 must determine the function to which you are referring. For example:
BOAT_COMPARE FOO SUBSTR FLOOR
The concept of function path is central to DB2's resolution of unqualified references that occur when you do not use the schema-name. For the use of function path in DDL statements that refer to functions, see description of the corresponding statement in the SQL Reference. The function path is an ordered list of schema names. It provides a set of schemas for resolving unqualified function references to UDFs as well as UDTs. In cases where a function reference matches functions in more than one schema in the path, the order of the schemas in the path is used to resolve this match. The function path is established by means of the FUNCPATH option on the precompile and bind commands for static SQL. The function path is set by the SET CURRENT FUNCTION PATH statement for dynamic SQL. The function path has the following default value:
"SYSIBM","SYSFUN","<ID>"
This applies to both static and dynamic SQL, where <ID> represents the current statement authorization ID.
Function names can be overloaded, which means that multiple functions, even in the same schema, can have the same name. Two functions cannot, however, have the same signature, which can be defined to be the qualified function name concatenated with the defined data types of all the function parameters in the order in which they are defined. For an example of an overloaded function, see "Example: BLOB String Search".
It is the function selection algorithm that takes into account the facts of overloading and function path to choose the best fit for every function reference, whether it is a qualified or an unqualified reference. Even references to the built-in functions and the functions (also IBM-supplied) in the SYSFUN schema are processed through the function selection algorithm.
Each user-defined function is classified as a scalar, column or table function. A scalar function returns a single value answer each time it is called. For example, the built-in function SUBSTR() is a scalar function. Scalar UDFs can either be external (coded in a programming language such as C), or sourced (using the implementation of an existing function).
A column function receives a set of like values (a column of data) and returns a single value answer from this set of values. These are also called aggregating functions in DB2. An example of a column function is the built-in function AVG(). An external column UDF cannot be defined to DB2, but a column UDF that is sourced on one of the built-in column functions can be defined. This is useful for distinct types. For example, if a distinct type SHOESIZE exists that is defined with base type INTEGER, you could define a UDF, AVG(SHOESIZE), as a column function sourced on the existing built-in column function, AVG(INTEGER).
A table function returns a table to the SQL statement that references it. A table function can only be referenced in the FROM clause of a SELECT statement. Such a function can be used to apply the SQL language to non-DB2 data, or to capture such data and put it into a DB2 table. For example, it could dynamically convert a file consisting of non-DB2 data into a table, or it could retrieve data from the World Wide Web or an operating system and tabularize it. A table function can only be an external function.
The concept of function path, the SET CURRENT FUNCTION PATH statement, and the function selection algorithm are discussed in detail in the SQL Reference. The FUNCPATH precompile and bind options are discussed in detail in the Command Reference.
The process of implementing a UDF requires the following steps:
After these steps are successfully completed, your UDF is ready for use in data manipulation language (DML) or data definition language (DDL) statements such as CREATE VIEW. The steps of writing and defining UDFs are discussed in the following sections, followed by a discussion on using UDFs. For information on compiling and linking UDFs, see the DB2 SDK Building Applications book for your operating system. For information on debugging your UDF, see "Debugging your UDF".
You can find the details of how you write UDFs in Chapter 7. "Writing User-Defined Functions (UDFs)". This includes the details on the interface between DB2 and UDFs, UDF coding considerations, UDF coding examples, and UDF debugging information. For information on the related tasks of compiling and linking your UDFs, see the DB2 SDK Building Applications book for your operating system.
You should register the UDF to DB2 after you have written and completely tested the actual code. Note that it is possible to define the UDF prior to actually writing it. However, to avoid any problems with running your UDF, you are encouraged to write and test it extensively before registering it. For information on testing your UDF, see "Debugging your UDF".
Use the CREATE FUNCTION statement to define (or register) your UDF to DB2. You can find detailed explanations for this statement and its options in the SQL Reference.
The examples which follow illustrate a variety of typical situations where UDFs can be registered.
Note that in these examples:
Suppose you have written an external UDF to perform exponentiation of floating point values, and wish to register it in the MATH schema. Assume that you have DBADM authority. As you have tested the function extensively, and know that it does not represent any integrity exposure, you define it as NOT FENCED. By virtue of having DBADM authority, you possess the database authority, CREATE_NOT_FENCED, which is required to define the function as NOT FENCED.
CREATE FUNCTION MATH.EXPON (DOUBLE, DOUBLE) RETURNS DOUBLE EXTERNAL NAME '/common/math/exponent' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL DETERMINISTIC NO EXTERNAL ACTION NOT FENCED
In this example, the system uses the NOT NULL CALL default value. This is desirable since you want the result to be NULL if either argument is NULL. Since you do not require a scratchpad and no final call is necessary, the NO SCRATCHPAD and NO FINAL CALL default values are used. As there is no reason why EXPON cannot be parallel, the ALLOW PARALLELISM default value is used.
Your associate, Willie, has written a UDF to look for the existence of a given short string, passed as an argument, within a given CLOB value, which is also passed as an argument. The UDF returns the position of the string within the CLOB if it finds the string, or zero if it does not. Because you are concerned with database integrity for this function as you suspect the UDF is not fully tested, you define the function as FENCED.
Additionally, Willie has written the function to return a FLOAT result. Suppose you know that when it is used in SQL, it should always return an INTEGER. You can create the following function:
CREATE FUNCTION FINDSTRING (CLOB(500K), VARCHAR(200)) RETURNS INTEGER CAST FROM FLOAT SPECIFIC "willie_find_feb95" EXTERNAL NAME '/u/willie/testfunc/testmod!findstr' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL DETERMINISTIC NO EXTERNAL ACTION FENCED
Note that a CAST FROM clause is used to specify that the UDF body really returns a FLOAT value but you want to cast this to INTEGER before returning the value to the statement which used the UDF. As discussed in the SQL Reference, the INTEGER built-in function can perform this cast for you. Also, you wish to provide your own specific name for the function and later reference it in DDL (see "Example: String Search over UDT"). Because the UDF was not written to handle NULL values, you use the NOT NULL CALL default value. And because there is no scratchpad, you use the NO SCRATCHPAD and NO FINAL CALL default values. As there is no reason why FINDSTRING cannot be parallel, the ALLOW PARALLELISM default value is used.
Because you want this function to work on BLOBs as well as on CLOBs, you define another FINDSTRING taking BLOB as the first parameter:
CREATE FUNCTION FINDSTRING (BLOB(500K), VARCHAR(200)) RETURNS INTEGER CAST FROM FLOAT SPECIFIC "willie_fblob_feb95" EXTERNAL NAME '/u/willie/testfunc/testmod!findstr' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL DETERMINISTIC NO EXTERNAL ACTION FENCED
This example illustrates overloading of the UDF name, and shows that multiple UDFs can share the same body. Note that although a BLOB cannot be assigned to a CLOB, the same source code can be used. There is no programming problem in the above example as the programming interface for BLOB and CLOB between DB2 and UDF is the same; length followed by data. DB2 does not check if the UDF using a particular function body is in any way consistent with any other UDF using the same body.
This example is a continuation of the previous example. Say you are satisfied with the FINDSTRING functions from "Example: BLOB String Search", but now you have defined a distinct type BOAT with source type BLOB. You also want FINDSTRING to operate on values having data type BOAT, so you create another FINDSTRING function. This function is sourced on the FINDSTRING which operates on BLOB values in "Example: BLOB String Search". Note the further overloading of FINDSTRING in this example:
CREATE FUNCTION FINDSTRING (BOAT, VARCHAR(200)) RETURNS INT SPECIFIC "slick_fboat_mar95" SOURCE SPECIFIC "willie_fblob_feb95"
Note that this FINDSTRING function has a different signature from the FINDSTRING functions in "Example: BLOB String Search", so there is no problem overloading the name. You wish to provide our own specific name for possible later reference in DDL. Because you are using the SOURCE clause, you cannot use the EXTERNAL NAME clause or any of the related keywords specifying function attributes. These attributes are taken from the source function. Finally, observe that in identifying the source function you are using the specific function name explicitly provided in "Example: BLOB String Search". Because this is an unqualified reference, the schema in which this source function resides must be in the function path, or the reference will not be resolved.
You have written another UDF to take a BOAT and examine its design attributes and generate a cost for the boat in Canadian dollars. Even though internally, the labor cost may be priced in German marks, or Japanese yen, or US dollars, this function needs to generate the cost to build the boat in the required currency, Canadian dollars. This means it has to get current exchange rate information from the exchange_rate file, managed outside of DB2, and the answer depends on what it finds in this file. This makes the function NOT DETERMINISTIC (or VARIANT).
CREATE FUNCTION BOAT_COST (BOAT) RETURNS INTEGER EXTERNAL NAME '/u/marine/funcdir/costs!boatcost' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL NOT DETERMINISTIC NO EXTERNAL ACTION FENCED
Observe that CAST FROM and SPECIFIC are not specified, but that NOT DETERMINISTIC is specified. Here again, FENCED is chosen for safety reasons.
This example implements the AVG column function over the CANADIAN_DOLLAR distinct type. See "Example: Money" for the definition of CANADIAN_DOLLAR. Strong typing prevents you from using the built-in AVG function on a distinct type. It turns out that the source type for CANADIAN_DOLLAR was DECIMAL, and so you implement the AVG by sourcing it on the AVG(DECIMAL) built-in function. The ability to do this depends on being able to cast from DECIMAL to CANADIAN_DOLLAR and vice versa, but since DECIMAL is the source type for CANADIAN_DOLLAR you know these casts will work.
CREATE FUNCTION AVG (CANADIAN_DOLLAR) RETURNS CANADIAN_DOLLAR SOURCE "SYSIBM".AVG(DECIMAL(9,2))
Note that in the SOURCE clause you have qualified the function name, just in case there might be some other AVG function lurking in your function path.
Your simple counting function returns a 1 the first time and increments the result by one each time it is called. This function takes no SQL arguments, and by definition it is a NOT DETERMINISTIC function since its answer varies from call to call. It uses the scratchpad to save the last value returned, and each time it is invoked it increments this value and returns it. You have rigorously tested this function, and possess DBADM authority on the database, so you will define it as NOT FENCED. (DBADM implies CREATE_NOT_FENCED.)
CREATE FUNCTION COUNTER () RETURNS INT EXTERNAL NAME '/u/roberto/myfuncs/util!ctr' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL NOT DETERMINISTIC NOT FENCED SCRATCHPAD DISALLOW PARALLEL
Note that no parameter definitions are provided, just empty parentheses. The above function specifies SCRATCHPAD, and uses the default specification of NO FINAL CALL. In this case, as the size of the 100-byte scratchpad is sufficient, no storage has to be freed by means of a final call, and so NO FINAL CALL is specified. Since the COUNTER function requires that a single scratchpad be used to operate properly, DISALLOW PARALLEL is added to prevent DB2 from operating it in parallel. To see an implementation of this COUNTER function, refer to "Example: Counter".
This example implements the previous counting example as an OLE (Object Linking and Embedding) automation object, counter, with an instance variable, nbrOfInvoke, to keep track of the number of invocations. Every time the UDF gets invoked, the increment method of the object increments the nbrOfInvoke instance variable and returns its current state. The automation object is registered in the Windows registry with the OLE programmatic identifier (progID) bert.bcounter.
CREATE FUNCTION bcounter () RETURNS integer EXTERNAL NAME 'bert.bcounter!increment' LANGUAGE OLE PARAMETER STYLE DB2SQL SCRATCHPAD NOT DETERMINISTIC FENCED NULL CALL NO SQL NO EXTERNAL ACTION DISALLOW PARALLEL;
The implementation of the class counter is shown in "Example: Counter OLE Automation UDF in BASIC" and in "Example: Counter OLE Automation UDF in C++". For details of OLE support with DB2, see "Writing OLE Automation UDFs".
You have written a table function which returns a row consisting of a single document identifier column for each known document in your text management system which matches a given subject area (the first parameter) and contains the given string (second parameter). This UDF uses the functions of the text management system to quickly identify the documents:
CREATE FUNCTION DOCMATCH (VARCHAR(30), VARCHAR(255)) RETURNS TABLE (DOC_ID CHAR(16)) EXTERNAL NAME '/common/docfuncs/rajiv/udfmatch' LANGUAGE C PARAMETER STYLE DB2SQL NO SQL DETERMINISTIC NO EXTERNAL ACTION NOT FENCED SCRATCHPAD FINAL CALL DISALLOW PARALLEL CARDINALITY 20
Within the context of a single session it will always return the same table, and therefore it is defined as DETERMINISTIC. Note the RETURNS clause which defines the output from DOCMATCH, including the column name DOC_ID, and that FINAL CALL must be specified for each table function. In addition, the DISALLOW PARALLEL keyword is added as table functions cannot operate in parallel. Although the size of the output from DOCMATCH is highly variable, CARDINALITY 20 is a representative value, and is specified to help the DB2 optimizer to make good decisions.
Typically this table function would be used in a join with the table containing the document text, as follows:
SELECT T.AUTHOR, T.DOCTEXT FROM DOCS as T, TABLE(DOCMATCH('MATHEMATICS', 'ZORN''S LEMMA')) as F WHERE T.DOCID = F.DOC_ID
Note the special syntax (TABLE keyword) for specifying a table function in a FROM clause. In this invocation, the docmatch() table function returns a row containing the single column DOC_ID for each mathematics document referencing Zorn's Lemma. These DOC_ID values are joined to the master document table, retrieving the author's name and document text.
Scalar and column UDFs can be invoked within an SQL statement wherever an expression is valid (there are additional rules for all column functions that limit validity). Table UDFs can only be referenced in the FROM clause of a SELECT. The SQL Reference discusses all these contexts in detail. The discussion and examples used in this section focus on relatively simple SELECT statement contexts, but note that their use is not restricted to these contexts.
Refer to "UDF Concepts" for a summary of the use and importance of the function path and the function selection algorithm. You can find the details for both of these concepts in the SQL Reference. The resolution of any Data Manipulation Language (DML) reference to a function uses the function selection algorithm, so it is important to understand how it works.
Each reference to a function, whether it is a UDF, or a built-in function, contains the following syntax:
>--function_name--(--.----------------.--)----------> | <-,----------< | '---expression---'
In the above, function_name can be either an unqualified or a qualified function name, and the arguments can number from 0 to 90, and are expressions which may contain:
The position of the arguments is important and must conform to the function definition for the semantics to be correct. Both the position of the arguments and the function definition must conform to the function body itself. DB2 does not attempt to shuffle arguments to better match a function definition, and DB2 does not understand the semantics of the individual function parameters.
Use of column names in UDF argument expressions requires that the table references which contain the column have proper scope. For table functions referenced in a join, this means that for any argument which involves columns from another table or table function, that other table or table function must appear before the table function containing the reference, in the FROM clause. For a complete discussion of the rules for using columns in the arguments of table functions, see the chapter on Queries in the SQL Reference.
Some valid examples of function invocations are:
AVG(FLOAT_COLUMN) BLOOP(COLUMN1) BLOOP(FLOAT_COLUMN + CAST(? AS INTEGER)) BLOOP(:hostvar :indicvar) BRIAN.PARSE(CHAR_COLUMN CONCAT USER, 1, 0, 0, 1) CTR() FLOOR(FLOAT_COLUMN) PABLO.BLOOP(A+B) PABLO.BLOOP(:hostvar) "search_schema"(CURRENT FUNCTION PATH, 'GENE') SUBSTR(COLUMN2,8,3) SYSFUN.FLOOR(AVG(EMP.SALARY)) SYSFUN.AVG(SYSFUN.FLOOR(EMP.SALARY)) SYSIBM.SUBSTR(COLUMN2,11,LENGTH(COLUMN3)) SQRT(SELECT SUM(length*length) FROM triangles WHERE id= 'J522' AND legtype <> 'HYP')
Note that if any of the above functions are table functions, the syntax to reference them is slightly different than presented above. For example, if PABLO.BLOOP is a table function, to properly reference it, use:
TABLE(PABLO.BLOOP(A+B)) AS Q
An important restriction involves parameter markers; you cannot simply code the following:
BLOOP(?)
As the function selection logic does not know what data type the argument may turn out to be, it cannot resolve the reference. You can use the CAST specification to provide a type for the parameter marker, for example INTEGER, and then the function selection logic can proceed:
BLOOP(CAST(? AS INTEGER))
If you use a qualified function reference, you restrict DB2's search for a matching function to that schema. For example, you have the following statement:
SELECT PABLO.BLOOP(COLUMN1) FROM T
Only the BLOOP functions in schema PABLO are considered. It does not matter that user SERGE has defined a BLOOP function, or whether or not there is a built-in BLOOP function. Now suppose that user PABLO has defined two BLOOP functions in his schema:
CREATE FUNCTION BLOOP (INTEGER) RETURNS ... CREATE FUNCTION BLOOP (DOUBLE) RETURNS ...
BLOOP is thus overloaded within the PABLO schema, and the function selection algorithm would choose the best BLOOP, depending on the data type of the argument, column1. In this case, both of the PABLO.BLOOPs take numeric arguments, and if column1 is not one of the numeric types, the statement will fail. On the other hand if column1 is either SMALLINT or INTEGER, function selection will resolve to the first BLOOP, while if column1 is DECIMAL or DOUBLE, the second BLOOP will be chosen.
Several points about this example:
SELECT PABLO.BLOOP(DOUBLE(COLUMN1)) FROM T
SELECT PABLO.BLOOP(INTEGER(COLUMN1)) FROM T SELECT PABLO.BLOOP(FLOOR(COLUMN1)) FROM T SELECT PABLO.BLOOP(CEILING(COLUMN1)) FROM T SELECT PABLO.BLOOP(INTEGER(ROUND(COLUMN1,0))) FROM T
You should investigate these other functions in the SQL Reference. The INTEGER function is a built-in function in the SYSIBM schema. The FLOOR, CEILING, and ROUND functions are UDFs shipped with DB2, which you can find in the SYSFUN schema along with many other useful functions.
If, instead of a qualified function reference, you use an unqualified function reference, DB2's search for a matching function normally uses the function path to qualify the reference. In the case of the DROP FUNCTION or COMMENT ON FUNCTION functions, the reference is qualified using the current authorization ID, if they are unqualified. Thus, it is important that you know what your function path is, and what, if any, conflicting functions exist in the schemas of your current function path. For example, suppose you are PABLO and your static SQL statement is as follows, where COLUMN1 is data type INTEGER:
SELECT BLOOP(COLUMN1) FROM T
You have created the two BLOOP functions cited in "Using Qualified Function Reference", and you want and expect one of them to be chosen. If the following default function path is used, the first BLOOP is chosen (since column1 is INTEGER), if there is no conflicting BLOOP in SYSIBM or SYSFUN:
"SYSIBM","SYSFUN","PABLO"
However, suppose you have forgotten that you are using a script for precompiling and binding which you previously wrote for another purpose. In this script, you explicitly coded your FUNCPATH parameter to specify the following function path for another reason that does not apply to your current work:
"KATHY","SYSIBM","SYSFUN","PABLO"
If Kathy has written a BLOOP function for her own purposes, the function selection could very well resolve to Kathy's function, and your statement would execute without error. You are not notified because DB2 assumes that you know what you are doing. It becomes your responsibility to identify the incorrect output from your statement and make the required correction.
For both qualified and unqualified function references, the function selection algorithm looks at all the applicable functions, both built-in and user-defined, that have:
(Applicable functions means functions in the named schema for a qualified reference, or functions in the schemas of the function path for an unqualified reference.) The algorithm looks for an exact match, or failing that, a best match among these functions. The current function path is used, in the case of an unqualified reference only, as the deciding factor if two identically good matches are found in different schemas. The details of the algorithm can be found in the SQL Reference.
An interesting feature, illustrated by the examples at the end of "Using Qualified Function Reference", is the fact that function references can be nested, even references to the same function. This is generally true for built-in functions as well as UDFs; however, there are some limitations when column functions are involved.
Refining an earlier example:
CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... CREATE FUNCTION BLOOP (DOUBLE) RETURNS INTEGER ...
Now consider the following DML statement:
SELECT BLOOP( BLOOP(COLUMN1)) FROM T
If column1 is a DECIMAL or DOUBLE column, the inner BLOOP reference resolves to the second BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP resolves to the first BLOOP.
Alternatively, if column1 is a SMALLINT or INTEGER column, the inner bloop reference resolves to the first BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP also resolves to the first BLOOP. In this case, you are seeing nested references to the same function.
A few additional points important for function references are:
CREATE FUNCTION "+" (BOAT, BOAT) RETURNS ...
Then you can write the following valid SQL statement:
SELECT BOAT_COL1 + BOAT_COL2 FROM BIG_BOATS WHERE BOAT_OWNER = 'Nelson Mattos'
But you can also write the equally valid statement:
SELECT "+"(BOAT_COL1, BOAT_COL2) FROM BIG_BOATS WHERE BOAT_OWNER = 'Nelson Mattos'
Note that you are not permitted to overload the built-in conditional operators such as >, =, LIKE, IN, and so on, in this way. See "Example: Integer Divide Operator" for an example of a UDF which overloads the divide (/) operator.
CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... CREATE FUNCTION BLOOP (DOUBLE) RETURNS CHAR(10)...
Now suppose you write the following SELECT statement:
SELECT 'ABCDEFG' CONCAT BLOOP(SMALLINT_COL) FROM T
Because the best match, resolved using the SMALLINT argument, is the first BLOOP defined above, the second operand of the CONCAT resolves to data type INTEGER. The statement fails because CONCAT demands string arguments. If the first BLOOP was not present, the other BLOOP would be chosen and the statement execution would be successful.
Another type of contextual inconsistency that causes a statement to fail is if a given function reference resolves to a table function in a context that requires a scalar or column function. The reverse could also occur. A reference could resolve to a scalar or column function when a table function is necessary.
EXEC SQL BEGIN DECLARE SECTION; SQL TYPE IS CLOB(150K) clob150K ; /* LOB host var */ SQL TYPE IS CLOB_LOCATOR clob_locator1; /* LOB locator host var */ char string[40]; /* string host var */ EXEC SQL END DECLARE SECTION;
Either host variable :clob150K or :clob_locator1 is valid as an argument for a function whose corresponding parameter is defined as CLOB(500K). Thus, referring to the FINDSTRING defined in "Example: String Search", both of the following are valid in the program:
... SELECT FINDSTRING (:clob150K, :string) FROM ... ... SELECT FINDSTRING (:clob_locator1, :string) FROM ...
You can also use this capability on UDF parameters or results which have a distinct type that is based on a LOB. This capability is limited to UDFs defined as not-fenced. Note that the argument to such a function can be any LOB value of the defined type; it does not have to be a host variable defined as one of the LOCATOR types. The use of host variable locators as arguments is completely orthogonal to the use of AS LOCATOR in UDF parameters and result definitions.
Distinct type values which originate in a host variable and which are used as arguments to a UDF which has its corresponding parameter defined as a distinct type, must be explicitly cast to the distinct type by the user. There is no host language type for distinct types. DB2's strong typing necessitates this. Otherwise your results may be ambiguous. So, consider the BOAT distinct type which is defined over a BLOB, and consider the BOAT_COST UDF from "Example: External Function with UDT Parameter", which takes an object of type BOAT as its argument. In the following fragment of a C language application, the host variable :ship holds the BLOB value that is to passed to the BOAT_COST function:
EXEC SQL BEGIN DECLARE SECTION; SQL TYPE IS BLOB(150K) ship; EXEC SQL END DECLARE SECTION;
Both of the following statements correctly resolve to the BOAT_COST function, because both cast the :ship host variable to type BOAT:
... SELECT BOAT_COST (BOAT(:ship)) FROM ... ... SELECT BOAT_COST (CAST(:ship AS BOAT)) FROM ...
If there are multiple BOAT distinct types in the database, or BOAT UDFs in other schema, you must exercise care with your function path. Otherwise your results may be ambiguous.