Writing User-Defined Functions

There are two types of user-defined functions you can write for QDB to use: functions that transform some data (called scalar or aggregate functions), and functions that order data (called collation functions). The first type is invoked using the SELECT SQL statement, while the second by using the COLLATE clause. An example of a built in scalar function is ABS(), while BINARY() is an example of a built in collation function.

To define functions that QDB can use, you need to compile them into a DLL. You then tell QDB to load the DLL by setting the Collation and Function options in the QDB configuration file for each required function.

User scalar/aggregate functions

These are specified in the configuration file with the Function = tag@library.so option, where library.so is the name of a DLL containing your code (this can be an absolute path or a filename within the LD_LIBRARY_PATH search) and tag is the name of the struct qdb_function entry describing the function. This is set up as follows:

static void myfunc(sqlite3_context *context, int narg, sqlite3_value **value)
    {
    }
struct qdb_function ftag = { "func", SQLITE_UTF8, 1, NULL, myfunc, NULL, NULL };

The tag value in this case is ftag, the function name as visible to SQL is func, and the function called is myfunc(), which can retrieve the 4th field (here NULL) as its sqlite3_user_data().


Note: The ftag was used to clarify the example. You would probably use the name func here so it was the same as the SQL name.

There can be multiple functions defined (in the same or different DLLs), but each must have a Function= entry in the configuration file for the database it is associated with, and each must have a struct qdb_function with a unique name describing it.

The qdb_function structure has these members:

struct qdb_function {
 char       *name;
 int        encoding;
 int        narg;
 void       *arg;
 void       (*func)(struct sqlite3_context *, int, struct Mem **);
 void       (*step)(struct sqlite3_context *, int, struct Mem **);
 void       (*final)(struct sqlite3_context *);
};
name
The name used for this function in SQL statements. This is limited to 255 bytes, exclusive of the zero-terminator, and it can't contain any special tokens, or start with a digit. Any attempt to create a function with an invalid name will result in an SQLITE_ERROR error.
encoding
The character encoding of strings passed to your function. Can be one of:
narg
The number of arguments that the function or aggregate takes. If this argument is -1, then the function or aggregate may take any number of arguments. The maximum number of arguments to a new SQL function is 127. A number larger than 127 for the third argument results in an SQLITE_ERROR error.
arg
An arbitrary pointer. The function implementations can gain access to this pointer using the sqlite_user_data() API.
func, step, final
Pointers to your function or aggregate. A scalar function requires an implementation of the func callback only; NULL pointers should be passed as the step and final arguments. An aggregate function requires an implementation of step and final, and NULL should be passed for func. Specifying an inconsistent set of callback values, such as a func and a final, or an step but no final, results in an SQLITE_ERROR return.

User collation routines

Collation routines can be used to order results from a SELECT statement. You can provide your own routine, and tell qdb to use it by providing the COLLATE keyword to the ORDER BY clause.

These routines are specified in the configuration file with the Collation = tag@library.so option, where library.so is the name of a DLL object containing your code (this can be an absolute path or a filename within the LD_LIBRARY_PATH search) and tag is the name of the struct qdb_collation entry describing the collation. This is set up as follows:

static int mysort(void *arg, int l1, const void *s1, int l2, const void *s2)
    {
        return(0);
    }

struct qdb_collation ctag = { "nosort", SQLITE_UTF8, NULL, mysort, NULL };

The tag value in this case is ctag, the collation name as visible to SQL will be nosort, and the function called is mysort(), which is passed in the 3rd field (here NULL) as its arg argument (refer to SQLite docs on sqlite3_create_collation for more detail).


Note: The ctag was used to clarify the example. You would probably use the name nosort here so it was the same as the SQL name.

There can be multiple collation sequences defined (in the same or different DLLs), but each must have a Collation= entry in the configuration file for the database it is associated with, and each must have a struct qdb_collation of a unique name describing it. This replaces the old mechanism of an array of qdb_collmodule_list_t always named init_coll_list.

The qdb_collation structure has these members:

struct qdb_collation {
 char       *name;
 int        encoding;
 void       *arg;
 int        (*compare)(void *, int, const void *, int, const void *);
 int        (*setup)(void *, const void *, int, char **);
};
name
The name used for this function in SQL statements. This is limited to 255 bytes, exclusive of the zero-terminator, and it can't contain any special tokens, or start with a digit. Any attempt to create a function with an invalid name will result in an SQLITE_ERROR error.
encoding
The character encoding of strings passed to your function. Can be one of:
arg
An arbitrary pointer to user data that is passed as the first argument to your function each time it's invoked. The function implementations can gain access to this pointer using the sqlite_user_data() API.
compare
A pointer to your collation function.
setup
A pointer to a setup function to allow dynamic configuration of sort order at runtime. See below.

The setup function takes this form:

int (*setup)(void *arg, const void *data, int nbytes, char **errmsg);

The parameters of the setup function are:

void *arg
The context pointer. This is the same as the arg to the compare function, and is passed in from the arg element of the qdb_collation structure.
const void *data
int nbytes
The configuration data, used to configure the sort. When invoked from startup, this is NULL and 0. When invoked at runtime, it is the data provided to the qdb_collation() function. QDB does not interpret the format in any way; the DLL must cooperate with the caller of qdb_collation() to exchange data of a known format.
char **errmsg
A pointer to an error message string that is available to qdb_geterrmsg() displayed on failure (actually, from startup QDB will fail it, from runtime qdb_collation() will fail and this string will be available to it as qdb_geterrmsg())

The function should return a POSIX errno, or EOK if it succeeds.

If a collation entry has a non-NULL setup entry, then this is invoked at startup and passed NULL for data and 0 for nbytes, which it can use as a hint to go into some default configuration. Then, whenever you call qdb_collation(), the setup function is invoked with new data.

If a collation has no dynamic configuration, then it can use a NULL setup entry in the struct qdb_collation, and it can't be dynamically configured.

Example

Here is an example of a table-driven collation algorithm, which uses the data pointer arg to say what table to use. The DLL would have the following entries exported from it:

uca_t _en_US_ = { ... };
uca_t _fr_FR_ = { ... };

int UCAsort(void *arg, int l1, const void *s1, int l2, const void *s2) { }

struct qdb_collation en_US = {
     "en_US", SQLITE_UTF8, &_en_US_, UCAsort, NULL };

struct qdb_collation fr_FR = {
     "fr_FR", SQLITE_UTF8, &_fr_FR_, UCAsort, NULL };

Note that both collations call the UCASort() routine, but they pass in different data pointers (&_en_US_ vs &_fr_FR_), where those are tables inside the DLL that tell it how to sort in English or French. This is passed as the first argument to the function, arg.

You would install these to QDB in the configuration file as:

[DB]
Collation = en_US@/usr/lib/libqdb_uca.so
Collation = fr_FR@/usr/lib/libqdb_uca.so

SQLite C/C++ API

This is an abridged version of the C/C++ API documentation for SQLite, which covers just the functions you might call in user-defined functions. For the full API documentation, see the SQLite website (www.sqlite.org).


Note: When consulting SQLite documentation, ensure that it corresponds to the SQLite library version that QDB is using.

sqlite3_result_*

    void sqlite3_result_blob(sqlite3_context*, const void*, int n, void(*)(void*));
    void sqlite3_result_double(sqlite3_context*, double);
    void sqlite3_result_error(sqlite3_context*, const char*, int);
    void sqlite3_result_error16(sqlite3_context*, const void*, int);
    void sqlite3_result_int(sqlite3_context*, int);
    void sqlite3_result_int64(sqlite3_context*, long long int);
    void sqlite3_result_null(sqlite3_context*);
    void sqlite3_result_text(sqlite3_context*, const char*, int n, void(*)(void*));
    void sqlite3_result_text16(sqlite3_context*, const void*, int n, void(*)(void*));
    void sqlite3_result_text16be(sqlite3_context*, const void*, int n, void(*)(void*));
    void sqlite3_result_text16le(sqlite3_context*, const void*, int n, void(*)(void*));
    void sqlite3_result_value(sqlite3_context*, sqlite3_value*);

User-defined functions invoke these routines in order to set their return value. The sqlite3_result_value() routine returns an exact copy of one of the arguments to the function.

Your user-defined function should pass as the first argument the sqlite3_context* that was passed to it by QDB.

sqlite3_value_*

    const void *sqlite3_value_blob(sqlite3_value*);
    int sqlite3_value_bytes(sqlite3_value*);
    int sqlite3_value_bytes16(sqlite3_value*);
    double sqlite3_value_double(sqlite3_value*);
    int sqlite3_value_int(sqlite3_value*);
    long long int sqlite3_value_int64(sqlite3_value*);
    const unsigned char *sqlite3_value_text(sqlite3_value*);
    const void *sqlite3_value_text16(sqlite3_value*);
    const void *sqlite3_value_text16be(sqlite3_value*);
    const void *sqlite3_value_text16le(sqlite3_value*);
    int sqlite3_value_type(sqlite3_value*);

This group of routines returns information about arguments to a user-defined function. User-defined function implementations use these routines to access their arguments.

The sqlite3_value_type() routine returns one of:

If the result is a BLOB, then the sqlite3_value_blob() routine returns the number of bytes in that BLOB. No type conversions occur. If the result is a string (or a number since a number can be converted into a string), then sqlite3_value_bytes() converts the value into a UTF-8 string and returns the number of bytes in the resulting string. The value returned does not include the \000 terminator at the end of the string. The sqlite3_value_bytes16() routine converts the value into a UTF-16 encoding and returns the number of bytes (not characters) in the resulting string. The \u0000 terminator is not included in this count.

These routines attempt to convert the value where appropriate. For example, if the internal representation is FLOAT, and a text result is requested, sprintf() is used internally to do the conversion automatically. The following table details the conversions that are applied:

Internal Type Requested Type Conversion
NULL INTEGER Result is 0
NULL FLOAT Result is 0.0
NULL TEXT Result is NULL pointer
NULL BLOB Result is NULL pointer
INTEGER FLOAT Convert from integer to float
INTEGER TEXT ASCII rendering of the integer
INTEGER BLOB Same as for INTEGER to TEXT
FLOAT INTEGER Convert from float to integer
FLOAT TEXT ASCII rendering of the float
FLOAT BLOB Same as FLOAT to TEXT
TEXT INTEGER Use atoi()
TEXT FLOAT Use atof()
TEXT BLOB No change
BLOB INTEGER Convert to TEXT, then use atoi()
BLOB FLOAT Convert to TEXT, then use atof()
BLOB TEXT Add a \000 terminator if needed

sqlite3_user_data

void *sqlite3_user_data(sqlite3_context*);

The arg member to the qdb_function struct used to register user functions is available to the implementation of the function using this call.