Note:

If you want to create a new page for developers, you should create it on the Moodle Developer Resource site.

Talk:DB layer 2.0: Difference between revisions

From MoodleDocs
m (Talk:dmllib 2.0 moved to Talk:DB layer 2.0: proper naming of the whole thing (both dml and ddl))
 
(90 intermediate revisions by 3 users not shown)
Line 1: Line 1:
==Requirements==
* OOP
* easy unit testing
* full abstraction (AdoDB, PDO)
* easy to use
* code must not know on which db it is running (no condifional db family hacks anymore!)
* no magic quotes and slashing anymore


== Various Notes (before 20080501) ==


* AdoDB, AdoDB over PDO, PDO. (my initial +1 to continue using underlying ADOdb).
* Placeholder types: :named or ? (will require PHP parsing under some DBs).
* Multiple connections supported (auto-contained DB object).
* PERF debugging available.
* Logging available (to detect wrong uses).




* Total breakage (all contrib uses)
== Various Notes (before 20080515) ==
* Total breakage (current dmllib 1.0 functions definition).
* Development mode (branch with radical replacement, some compatibility layer to minimize breakage - current dmllib instantiating new dmllib + stripping slashes ?)


=== Interface proposal ===


* Mahara experience. ( it's also worthwhile to note that a long long time ago I also did this for elgg which involved doing the addslashes/stripslashes audit through the code base - we didn't have to do this with mahara as we had placeholders and no slashes right from the start - Penny )
Sample implementation with basic API definition is in MDL-14679
* Interface example.
* dmllib 2.0 tests since the beginning.
* dmllib 2.0 documentation since the beginning.
* Classes implementation order (PG, MySQL, MSSQL, Oracle).


* More ideas.
====DML (Data Manipulation Language)====


lib/dml/ classes are used to read/write data from database.


* Next meeting. TODO: document MDM20080501, AdoDb/PDO decision, Interface + PHP documentation + Docs documentation, define tests to perform. ADDRESS IT for... 20080508 ?
All classes are stored in lib/dml/ directory. The naming convention is dbtype_dblibrary_moodle_database for database classes. For example ''oci8po_adodb_moodle_database'' and ''oci_pdo_moodle_database''. There is a new configuration option $CFG->dblibrary which can be adodb, pdo or anything else in case of 3rd party modifications. The $CFG->dbtype is expected to contain internal driver name, it must not be used outside of dml.


[[User:Eloy Lafuente (stronk7)|Eloy Lafuente (stronk7)]] 19:00, 29 April 2008 (CDT)
Each database class must fully abstract all operations with database including API for XMLDB editor.


=== PDO/ADODB ===
To minimise conversion costs all function names are kept, though the prepared statements need a bit different method parameters:
* $sql parameters need $params array with values, $sql query must not contain any user submitted data - instead use ? or :name parameters, both types are supported, but not both in the same query
* $select parameters must be accompanied by $params too
* $field1, $value1 are replaced by $conditions array - you can have more than one condition in get_records() and more than three in get_record() now


I think it's interesting to discuss the potential of moving to PDO instead of ADODB but I'm not convinced it needs to be in the same conversation - we should be able to switch out ADODB and use PDO instead with completely no change outside dml - surely nowhere in the code is using $db directly - should always be using execute_sql
The use of recordsets was changed substantially - they can be used in foreach($rs as $record) directly. Recordset closing should be mandatory now, this might help with performance and memory consumption later. Each library abstraction or database class must define own recorset class with moodle_recordset interface.


[[User:Penny Leach|Penny Leach]]
::While I like the foreach iterators here... this makes me think if that won't cause people to get confused between get_record() and get_recordset() functions. First ones doesn't need closing but second ones yes. Also, while thinking on this... do you think we could profile how many open recordsets are left open per request, debugging error if necessary? That will help developers, for sure. [[User:Eloy Lafuente (stronk7)|Eloy Lafuente (stronk7)]] 19:27, 6 May 2008 (CDT)


::Yup 100% agree, underlying stuff will be completely hidden, so it's a non-priority decision. We'll be able "replace" it more or less easily or, alternatively, create new "Moodle drivers" using different internal stuff. Just prospecting your opinion about PDO. I've done some (basic) research and it seems not to be 100% ready for production under some DBs. But wanted to know if you have some experience with it (or at least you initial feeling).
get_records() and similar functions that return arrays now return empty array if nothing found and false only if error occured, this should not cause any major regressions, all code should be verified during the conversion (proposed by Donal).


::[[User:Eloy Lafuente (stronk7)|Eloy Lafuente (stronk7)]]
Database classes should not depend on $CFG settings - instead supplly them in constructors or use set_property() methods, this will allow us to use the same library for enrolment and auth plugins.


==Interface proposal==
====DDL (Database Definition Language)====
sample implementaion with nearly full API definition in MDL-14679
----


== Parameter type - :name or ? ==
lib/ddl classes are used to manipulate database structure - this is done usually from upgrade/installation scripts only.


The :name style seems to be much more flexible than ?.
Database manager instance may be obtained from database instance (ex: $dbman = $DB->get_manager();), this database manager replaces the old functions used in upgrade scripts (ex: $dbman->drop_table($table), $dbman->field_exists($table, $field))


Pros (name:)
Each database family has one generator class which may be extended to support special types of drivers, these generators are by default shared by all drivers of the same family.
* order not important
* possible validation


Pros (?)
== Various Notes (before 20080527) ==
* supported by all backends except Ora
* no probelm with overwriting


AdoDB
Agenda:
* expects ? except for ora (not sure)


PDO
* From previous meetings:
* supports both
** DDL Exceptions.
** Workshop to support 2.0 dmllib (assignee)
** Journal to support 2.0 dmllib (Petr)
** M4 way to go (HEAD, branch, patch)


== API changes ==
* Status:
** M2 status (each one). What's done, what's missing. MDL-14679
** [[dmllib_2.0_migration_docs|Migration guide]]
** What to do between May 28th and June 11th (M3)
** Commit May 27th


Donal suggested that get_records and families should return an empty array rather than false so you can iterate straight over them.
* Problems:
** Quick review of https://docs.moodle.org/en/Development:dmllib_2.0_problems


My +1 to return array() if nothing found and false if error, I do not think that if ($resultarray =get_records()) {} is wrong. [[User:Skodak|Skodak]]
* Everything else


== default $mdb object? ==
* Next meeting


In the above examples there is:
==What parts of adodb do we use/need==
* connect to database, configure and set encoding
* execute sql without result with bound parameters
* get results as recorset with bound parameters
* get results as array  with bound parameters (can be emulated)
* transactions - begin, end, rollback
* get table columns information - needed for insert and update data validation (can be partially emulated, in fact this could be nice perf boost to use the data we already have in install.xml's)
* get list of tables
* get list of indexes
* get list of keys
* debugging, logging , version info, etc.


$mdb = get_moodle_db(); // or static factory method
PDO and native drivers in PHP5 offer the same level of functionality (except the legacy mysql driver).


To make things simpler for the 99% of developers who won't need anything more, can I suggest we do this once by default somewhere in setup.php?  We already have a $db object by default, perhaps we can even re-use it to make things read nicely and reduce confusion, so developers can just do:
==Type hinting in sql parameters (will not be implemented)==
All db drivers support some sort of type hinting when binding sql parameters. There is also option to autodetect the type from native PHP data type, but unfortunately this does not work well in our codebase (Example is error when passing empty string instead of 0 in pg). The API would be relatively simple when binding parameters one by one, but it is at present unsolved when defining parameters with array.


$blah = $db->get_records(...)
In case of insert and update we can get the types from our table definitions or by looking directly into database, but in case of general query parameters it is often hard to guess (especially in case of empty string '').


Things like $db->debug = true; can be changed to  
Solution could be to use special hints which would be part of parameter name, the last letter (if capital) would describe the type of parameter :paramN for integer number (seems better than I), :paramF for floating point numbers, paramT for text, paramC for char.


$db->debug(true);  // sets the adodb debug var
We are extremely lucky we do not use dates, blobs and other complex data types that differ in each database engine. In any case we should always try to submit valid parameter data, supply data types and report potential problems to developers and not rely on various conversions especially in mysql.
 
 
Hmm, I am not sure having global $mdb is good for unit testing [[User:Skodak|Skodak]]

Latest revision as of 17:08, 25 July 2008



Various Notes (before 20080515)

Interface proposal

Sample implementation with basic API definition is in MDL-14679

DML (Data Manipulation Language)

lib/dml/ classes are used to read/write data from database.

All classes are stored in lib/dml/ directory. The naming convention is dbtype_dblibrary_moodle_database for database classes. For example oci8po_adodb_moodle_database and oci_pdo_moodle_database. There is a new configuration option $CFG->dblibrary which can be adodb, pdo or anything else in case of 3rd party modifications. The $CFG->dbtype is expected to contain internal driver name, it must not be used outside of dml.

Each database class must fully abstract all operations with database including API for XMLDB editor.

To minimise conversion costs all function names are kept, though the prepared statements need a bit different method parameters:

  • $sql parameters need $params array with values, $sql query must not contain any user submitted data - instead use ? or :name parameters, both types are supported, but not both in the same query
  • $select parameters must be accompanied by $params too
  • $field1, $value1 are replaced by $conditions array - you can have more than one condition in get_records() and more than three in get_record() now

The use of recordsets was changed substantially - they can be used in foreach($rs as $record) directly. Recordset closing should be mandatory now, this might help with performance and memory consumption later. Each library abstraction or database class must define own recorset class with moodle_recordset interface.

While I like the foreach iterators here... this makes me think if that won't cause people to get confused between get_record() and get_recordset() functions. First ones doesn't need closing but second ones yes. Also, while thinking on this... do you think we could profile how many open recordsets are left open per request, debugging error if necessary? That will help developers, for sure. Eloy Lafuente (stronk7) 19:27, 6 May 2008 (CDT)

get_records() and similar functions that return arrays now return empty array if nothing found and false only if error occured, this should not cause any major regressions, all code should be verified during the conversion (proposed by Donal).

Database classes should not depend on $CFG settings - instead supplly them in constructors or use set_property() methods, this will allow us to use the same library for enrolment and auth plugins.

DDL (Database Definition Language)

lib/ddl classes are used to manipulate database structure - this is done usually from upgrade/installation scripts only.

Database manager instance may be obtained from database instance (ex: $dbman = $DB->get_manager();), this database manager replaces the old functions used in upgrade scripts (ex: $dbman->drop_table($table), $dbman->field_exists($table, $field))

Each database family has one generator class which may be extended to support special types of drivers, these generators are by default shared by all drivers of the same family.

Various Notes (before 20080527)

Agenda:

  • From previous meetings:
    • DDL Exceptions.
    • Workshop to support 2.0 dmllib (assignee)
    • Journal to support 2.0 dmllib (Petr)
    • M4 way to go (HEAD, branch, patch)
  • Status:
    • M2 status (each one). What's done, what's missing. MDL-14679
    • Migration guide
    • What to do between May 28th and June 11th (M3)
    • Commit May 27th
  • Everything else
  • Next meeting

What parts of adodb do we use/need

  • connect to database, configure and set encoding
  • execute sql without result with bound parameters
  • get results as recorset with bound parameters
  • get results as array with bound parameters (can be emulated)
  • transactions - begin, end, rollback
  • get table columns information - needed for insert and update data validation (can be partially emulated, in fact this could be nice perf boost to use the data we already have in install.xml's)
  • get list of tables
  • get list of indexes
  • get list of keys
  • debugging, logging , version info, etc.

PDO and native drivers in PHP5 offer the same level of functionality (except the legacy mysql driver).

Type hinting in sql parameters (will not be implemented)

All db drivers support some sort of type hinting when binding sql parameters. There is also option to autodetect the type from native PHP data type, but unfortunately this does not work well in our codebase (Example is error when passing empty string instead of 0 in pg). The API would be relatively simple when binding parameters one by one, but it is at present unsolved when defining parameters with array.

In case of insert and update we can get the types from our table definitions or by looking directly into database, but in case of general query parameters it is often hard to guess (especially in case of empty string ).

Solution could be to use special hints which would be part of parameter name, the last letter (if capital) would describe the type of parameter :paramN for integer number (seems better than I), :paramF for floating point numbers, paramT for text, paramC for char.

We are extremely lucky we do not use dates, blobs and other complex data types that differ in each database engine. In any case we should always try to submit valid parameter data, supply data types and report potential problems to developers and not rely on various conversions especially in mysql.