XMLDB preliminary notes

XML database schema > XMLDB preliminary notes

This notes were collected in the early stages of the project, as long as our tests gave us more and more info about how to handle all this new stuff. They will be marked with the Moved tag as they go to their definitive place under the XMLDB documentation.

Moved to XMLDB Modifying DML functions. We need to change all the uses of the LIMIT offset, num clause to use the cross-db compatible SelectLimit() function. Two alternatives seem possible:
1. Create a new set of get_records_limit() (and get_recordset_limit() ?) functions, allowing to specify the offset, num parameters.
2. Modify the current get_records() (and get_recordset()) functions to allow two more optional parameters (offset, num)
Moved to XMLDB Problems.Analyse the impact of such SelectLimit() calls under SQL*Server, Oracle... because it's emulated by ADOdb on those DBs, because their lack of support for the LIMIT clause. As the offset parameter grows, ADOdb must iterate over more records to get the desired window and it could be a problem under long sets of records!
Moved to XMLDB Problems.What to do with all the previously created DB objects (indexes, unique indexes, sequences...) if their naming schema doesn't fit with the implemented by ADOdb. Drop/recreate everything? Leave it unmodified?
Moved to XMLDB Problems. Currently, both under MySQL and PostgreSQL, ADODB_FETCH_BOTH (default) is used so all the information is, practically, duplicated. In other side, SQL*Server, by default, uses ADODB_FETCH_NUM (although it can be changed to ADODB_FETCH_ASSOC). Proposal, after connection, change ALWAYS to ADODB_FETCH_ASSOC. It's supported by all the DB (while ADODB_FETCH_BOTH isn't!) and it will save us near 50% memory (and speed?) for record arrays! If there was some places using ADODB_FETCH_NUM structures inside Moodle (I really thing they aren't used) we must update them. UPDATED: it seems that the id field is lost in conversion to ADODB_FETCH_ASSOC so, perhaps it wouldn't be a good idea to force this mode! :-( UPDATED: We also could hack calls to getAssoc() to add the id field back. Memory gain is really big! And places requiring FETCH_NUM shouldn't be really legion....uhm...
Done!: Force ADODB_ASSOC_CASE to 0 (lowercase). Some DB could break this and until now, both mysql and postgresql are working fine with lowercased field names.
Moved to XMLDB Problems#Reserved Words in Moodle DB. Reserved words. What to do? One horrible example: "USER": http://www.petefreitag.com/tools/sql_reserved_words_checker/?word=user
1. Try to quote them.
2. Change them completely under Moodle.
Moved to XMLDB Defining one XML structure. Need to pre-parse the XML database schema to add prefix to all tables/indexes/constraints before parsing and executing!
Moved to XMLDB Creating new DDL functions. Analise how upgrades are going to succeed. Everything inside one unique upgrade.php file, ok, DML should be php, but DDL, how to handle it ? Proposal to create a bunch of functions to create, alter and drop everything! Handle special DB casuistic there (this will help to have more readable upgrade scripts).
Be able to specify, by plugin, the list of supported DBs.
Moved to XMLDB Modifying DML functions. MSSQL escape quotes by adding one more quote to the quote character, instead of standard addslashses. This can be solved by using qstr() (so a lot of addslashes() calls should be modified) or by setting magic_quotes_sybase when running against SQL*server (then addslashes() will escape only the quote char). Independent of the solution, stripslashes_safe() must be modified to strip the correct characters based on $CFG->dbtype (because the magic_quotes_sybase setting modifies what's quoted by magic_quotes_gpc too).
Moved to XMLDB Modifying DML functions. Under Oracle a similar mechanism to the used in postgresql in insert_record() could be needed: get sequence nextval, insert record... when the id is needed. Simple.
Moved to XMLDB Modifying DML functions. Important Oracle problems when inserting CLOB/BLOB data. Under Oracle the two steps (INSERT empty_b/clob() and UPDATE are needed). This will force us to review ALL the updates against current MEDIUMTEXT, TEXT, BYTEA columns to use the 2 steps approach, using the technique described in: http://phplens.com/adodb/reference.functions.updateblob.html and http://phplens.com/adodb/reference.functions.updateclob.html. Needs testing agains SQL*Server (because it isn't explicitly supported). We could implement some wrapper modifications in our central insert/update datalib functions to support this transparently (I hope), adding some parameter or detecting it dynamically (getting metadata for columns, detecting dbtype...). But all the hard-coded INSERT and UPDATE statements should disappear from Moodle code, using datalib functions always.
Moved to XMLDB Splitting datalib.php. Alleviate datalib.php from some functions currently present that aren't part of the "wrapper-over-adodb" objective, moving them to their own library (module, course...). With this we should end with one smaller and well-defined (and documented) datalib.
Moved to XMLDB Problems. Problems user Oracle (and SQL*Server) with NOT NULL fields and attempts to insert '' values. Not allowed! We should re-examine all those fields across DB and change their status to NULLABLE. Analyse why a lot of them were changed recently in MySQL!! See also Bug 6218
Moved to XMLDB Problems. Regular expressions. Problems with Oracle (not available until 10g) and MSSQL (not available). Oracle 10g implements them using directly and one package existed since ages (Oracle 8i?) to handle them (owa_pattern). MSSQL can execute them by installing some stored procedures. Oh, oh, problems with some (a few) queries...
Moved to XMLDB Modifying DML functions. Incompatibilities with the concatenation of strings: MySQL uses the CONCAT() function, while Oracle and PostgreSQL use the '||' concatenation operator and MSSQL uses the '+' operator. This affects both installation and some queries inside Moodle. At installation is could be pretty easy to define anything (some well defined syntax/nomenclature) and then transform the statement to the correct one. But on-the-fly-modifications on normal execution could be a nightmare, so some dmllib function should return the correct sql part).
Done! -- actually supported.Check compatibility of OUTER joins syntax across RDBMS. Recently I read MSSQL doesn't support it (the SQL standard syntax).
Look for all the places using $CFG->dbtype to see what happens with new RDBMS.
Done!: Modify the utf8 migration script in order to properly drop/recreate indexes with the previously existing names, to avoid any type of index duplication. Also, make the migration script to respect NULLable columns.

Documentation

XMLDB preliminary notes