XMLDB problems: Difference between revisions

Revision as of 16:11, 24 August 2006

Every section on this page will show important problems present in the adoption of the new database layer planned for 1.7. Each section will define the problem, its impact and, when available, its solution. Every section will include one line containing its status ('Open', 'Decided, 'Work in progress', 'Finished', 'Tested' and 'Closed').

Of course, fell free to use the Bug Tracker to follow the daily evolution of each one of these problems by adding one link inside each section pointing to the proper bug.

Prior to change the status from 'Decided' to 'Work in progress' the corresponding page of the roadmap must be modified, including all the actions/steps needed to solve the problem. Obviously, at the end, the expected status for each section is 'Closed'.

Regular expressions

(Status: Open - Severity: Critical - Bug: not defined)

Problems with Oracle (not available until 10g) and MSSQL (not available). Oracle 10g implements them using directly and one package existed since ages (Oracle 8i?) to handle them (owa_pattern). MSSQL can execute them by installing some stored procedures. Oh, oh, problems with some (a few) queries...

SQL Limit performance

(Status: Open - Severity: Not critical - Bug: not defined)

Analyse the impact of such SelectLimit() calls under SQL*Server, Oracle... because it's emulated by ADOdb on those DBs, because their lack of support for the LIMIT clause. As the offset parameter grows, ADOdb must iterate over more records to get the desired window and it could be a problem under long sets of records!

Naming conventions

(Status: Open - Severity: Critical - Bug: not defined)

What to do with all the previously created DB objects (indexes, unique indexes, sequences...) if their naming schema doesn't fit with the implemented by ADOdb. Drop/recreate everything? Leave it unmodified? Can we use different naming schemes for each DB. PostgreSQL one seems more formal, is it the way for Oracle and SQL*Server. Also I would propose to add some option to the Health Center in order to be able to drop all the indexes and regenerate them following the exact naming rules specified in the XMLDB naming page.

Naming conventions II

(Status: Closed - Severity: Critical - Bug: 6230)

Oracle imposes one limit of 30cc. for naming DB objects (table, field, index...), and currently we have some objects over such limit: mdl_glossary_entries_categories and mdl_question_dataset_definitions. They must be renamed to anything else ASAP. No alternate solution exists. Also, once renamed we should restrict the maximum length of $CFG->prefix to avoid undesired situations, at least for new installations to, say, 5cc, limiting table names to 25cc (without the prefix). PostgreSQL before 7.4 limit seems quite near too (32 cc.).

Solution: Limit table/field/index/key/sequence names to 30cc and deny the use of prefixes for Oracle installations. See key and index naming to get info about how XMLDB constructors will be naming all the remaining objects in DB.

NOT NULL fields using a DEFAULT '' clause

(Status: Open - Severity: Critical - Bug: 6218 )

Under Moodle 1.6 there are tons of char/varchar/text/blob columns that are defined as NOT NULL and contains one DEFAULT clause with the value '' (empty string).

There are two important problems with this type of definitions:

They haven't to much sense from a design perspective. If one field is defined as NOT NULL it shouldn't contain empty values (yep, agree that NULL != everything, included empty, but in a logical world I cannot find any reason to fill fields with empty values).
Oracle and MSSQL consider empty strings as NULL values the DEFAULT application will crash in both RDBS. (http://www.techonthenet.com/oracle/questions/empty_null.php)

The proper way to solve this situation should be to annihilate all those inconsistent/wrong combinations from the whole DB schema (NOT NULL + DEFAULT ''), but it can be really hard and risky to do it now (Moodle 1.6-1.7). So here it's the proposed solution:

Don't modify 1.6 at all. Everything should continue working properly.
In the XMLDB files, for 1.7, use the correct approach, i.e. all those fields won't have a default value in the XML structure, although they'll continue as NOT NULL there.
In the XMLDB generator (create table), for MySQL and PostgreSQL, add the DEFAULT clause for all the char/text columns that haven't a DEFAULT defined in the XML structure.
Add some check in the installation script to look for that MySQL modes being problematic.

With this approach we'll have MySQL and PostgreSQL servers running exactly without modification in 1.7 while MSSQL and Oracle versions would be more correct. It's possible that some parts of the code need some modification (if they were relying in that wrong default values) but they will be a small number, sure.

After 1.7, we should start to modify MySQL and PostgreSQL implementation to use the same (correct) field definition (nullable columns) with more time to change/test/fix everything.

Indexes on TEXT columns

(Status: Open - Severity: Not Critical - Bug: not defined)

Each RDBMS implementation has completely different mechanisms in order to create FULLTEXT indexes against TEXT columns. Also, the SQL commands to perform searches against such columns are pretty different (incompatible). This forces us to avoid TEXT column indexing for now. Here there are the list of fields we have changed in the transition to the XMLDB schema to avoid such indexing:

Also, it's important to differentiate between FULLTEXT indexes (where we are going to index and search freely any text content) and FUNCTION indexes where we just index the results of the execution of one function. This last indexes aren't a suitable replacement for TEXT indexes, and aren't used at all if the EXACT" function call used to define the index isn't used by the SQL query.

In the long term, the XMLDB schema should be extended to support this type of indexes to perform better fulltext searches. At the time of writing this, some efforts are being performed by the Global search project, part of the Student projects.

Finally, note that a lot of RDBMS flavours aren't able to keep all those FULLTEXT indexes updated online all they need to be synchronised/optimised continuously by "external" agents (stored procedures, command line executables, cron tasks...).

Some interesting links about FULLTEXT indexing/search/maintain:

http://mysql.com/doc/refman/5.0/en/fulltext-search.html: Fulltext for MySQL
http://www.devx.com/opensource/Article/21674/0/page/1 : Tsearch for PostgreSQL
http://www.oracle.com/technology/products/text/htdocs/ctxcat_primer.html : Oracle CTXCAT indexes (simpler) and http://www.oracle-base.com/articles/9i/FullTextIndexingUsingOracleText9i.php : Oracle CONTEXT indexes (the complex alternative).
http://www.databasejournal.com/features/mssql/article.php/1438211: Fulltext indexes for MSSQL.

MSSQL, PHP, UTF-8 and UCS-2

(Status: Open - Severity: Critical - Bug: not defined)

Reserved Words in Moodle DB

(Status: Open - Severity: Critical - Bug: not defined)

@@ Line 73: / Line 73: @@
 * http://www.databasejournal.com/features/mssql/article.php/1438211: Fulltext indexes for MSSQL.
-== MSSQL, PHP, UTF-8 UCS-2and  ==
+== MSSQL, PHP, UTF-8 and UCS-2  ==
-('''Status:''' <font color="red">Open</font> - '''Severity:''' Not Critical - '''Bug:''' not defined)
+('''Status:''' <font color="red">Open</font> - '''Severity:''' Critical - '''Bug:''' not defined)
+== Reserved Words in Moodle DB  ==
+('''Status:''' <font color="red">Open</font> - '''Severity:''' Critical - '''Bug:''' not defined)
 ==See also==

Documentation

XMLDB problems: Difference between revisions

Revision as of 16:11, 24 August 2006

Contents

Regular expressions

SQL Limit performance

Naming conventions

Naming conventions II

NOT NULL fields using a DEFAULT '' clause

Indexes on TEXT columns

MSSQL, PHP, UTF-8 and UCS-2

Reserved Words in Moodle DB

See also