Difference between revisions of "Backup 1.9 conversion for developers"

Revision as of 10:34, 13 May 2011

Introduction

This page follows up the tutorial at Backup 2.0 for developers and describes how to implement support of Moodle 1.9 backup conversion into the new Moodle 2.x format.

In short, since Moodle 2.1 there is a new type of core component available called backup converters. A converter is a tool that takes a directory containing some Moodle course data and converts it into another format. At the moment we are working on a converter called moodle1 that supports the 1.9 => 2.x conversion path. In the future, more converters can be written (e.g. supporting custom formats, Blackboard, IMS CC etc). Converters can be chained, so if there is a converter that supports IMS => 1.9 conversion and another one converting 1.9 => 2.x, Moodle can automatically restore IMS backups. The core of the moodle1 converter is implemented in backup/converter/moodle1/.

The moodle1 converter is still under heavy development at the moment. The first milestone is to be able to convert all activity modules without user data.

Getting familiar with the required changes of the structure

For the purpose of this tutorial, the Choice module is used as an example as it allows us to demonstrate the basic workflow of the conversion. Let us start by performing a backup in both 1.9 and 2.1. Create an empty course with a single simple Choice module instance inside in both 1.9 and 2.1. Choose a backup mode without any user data, roles, files etc and include just the instance of the module you created.

In the case of Moodle 1.9, you will end up with one monolithic file, moodle.xml. In Moodle 2.1, the module data will be stored in a file like activities/choice_x/choice.xml. The task of the converter you are going to write is to convert the data contained in moodle.xml into choice.xml.

Getting the list of XML paths

Looking at moodle.xml in a 1.9 backup, you can see that the module data are stored in XML nodes at /MOODLE_BACKUP/COURSE/MODULES/MOD and that you are interested only in those MODs having <MODTYPE>choice</MODTYPE>. To make your life easier, the core of the moodle1 converter injects one virtual node into the path so that, to our module, it appears as if its data were in /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE. That is, as if all Choice data were wrapped by yet another tag in moodle.xml.

Now look into the Moodle 1.9 code and locate the file mod/choice/backuplib.php. Reading its code, you can see that the MOD element in moodle.xml (that will be presented as the MOD/CHOICE element to our code) contains the following tags holding the corresponding fields from the mdl_choice table:

   ID              $choice->id
   NAME            $choice->name
   TEXT            $choice->text
   FORMAT          $choice->format
   PUBLISH         $choice->publish
   SHOWRESULTS     $choice->showresults
   DISPLAY         $choice->display
   ALLOWUPDATE     $choice->allowupdate
   SHOWUNANSWERED  $choice->showunanswered
   LIMITANSWERS    $choice->limitanswers
   TIMEOPEN        $choice->timeopen
   TIMECLOSE       $choice->timeclose
   TIMEMODIFIED    $choice->timemodified

Below these, the choice options data are dumped into the OPTIONS section, with the details of each option wrapped by an OPTION tag:

   ID              $cho_opt->id
   TEXT            $cho_opt->text
   MAXANSWERS      $cho_opt->maxanswers
   TIMEMODIFIED    $cho_opt->timemodified

The file moodle.xml will be parsed by a progressive parser. That basically means it will be read in sequential order and each time some interesting path is reached, the data contained in that element are dispatched to a handler (in contrast to DOM-like parsers, where the whole file would be converted into a huge in-memory tree structure). To catch the choice data in moodle.xml we will have to handle the /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE and /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE/OPTIONS/OPTION paths.
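
To make the idea of sequential parsing more concrete, here is a minimal standalone sketch. It uses PHP's built-in XMLReader extension, not Moodle's own progressive parser, and the helper collect_paths() and the sample XML are made up for this illustration. It also does not inject the virtual CHOICE node that the moodle1 converter adds.

```php
<?php
// Standalone sketch: walk an XML document sequentially and record each
// element path as it is reached. This uses PHP's XMLReader, not Moodle's
// progressive parser; collect_paths() is a hypothetical helper.

function collect_paths(string $xml): array {
    $reader = new XMLReader();
    $reader->XML($xml);
    $stack = array();   // names of the currently open elements
    $paths = array();   // every element path in document order

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $stack[] = $reader->name;
            $paths[] = '/' . implode('/', $stack);
            if ($reader->isEmptyElement) {
                // <TAG/> produces no END_ELEMENT event, so close it here
                array_pop($stack);
            }
        } else if ($reader->nodeType === XMLReader::END_ELEMENT) {
            array_pop($stack);
        }
    }
    $reader->close();
    return $paths;
}

$sample = '<MOODLE_BACKUP><COURSE><MODULES><MOD><ID>1</ID><MODTYPE>choice</MODTYPE>'
        . '<OPTIONS><OPTION><ID>7</ID></OPTION></OPTIONS></MOD></MODULES></COURSE></MOODLE_BACKUP>';

foreach (collect_paths($sample) as $path) {
    echo $path, "\n";
}
```

Only the stack of open element names is kept in memory at any time, which is the property that makes this style of parsing suitable for large backup files.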

Getting to know how data change during 1.9 => 2.x upgrade

Now let us open mod/choice/db/upgrade.php in your Moodle 2.1 code. It contains all the upgrade logic that runs during the upgrade of a 1.9 site to 2.x. Reading the code, we realize that:

  • the database field text in the {choice} table is renamed to intro
  • the database field format in the {choice} table is renamed to introformat
  • a new field completionsubmit is added to the {choice} table with the default value 0
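
The three changes above can be sketched as a plain array transformation. This is only an illustration: upgrade_choice_record() is a hypothetical helper, not part of Moodle, and it assumes the parsed record arrives as an associative array keyed by lowercase field names.

```php
<?php
// Hypothetical helper: apply the 1.9 => 2.x changes of the {choice} table
// to one record. Assumes lowercase field names as array keys.

function upgrade_choice_record(array $data): array {
    // the fields text and format are renamed to intro and introformat
    $data['intro'] = $data['text'];
    $data['introformat'] = $data['format'];
    unset($data['text'], $data['format']);

    // the new field completionsubmit is added with the default value 0
    $data['completionsubmit'] = 0;

    return $data;
}

$old = array('id' => 1, 'name' => 'Pick one', 'text' => 'Which one?', 'format' => 1);
print_r(upgrade_choice_record($old));
```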

Getting to know the structure of choice.xml

Finally look at mod/choice/backup/moodle2/ in your Moodle 2.1, particularly the file backup_choice_stepslib.php. Here you can see how the structure of the choice.xml is defined and what data it contains:

   $choice = new backup_nested_element('choice', array('id'), array(
       'name', 'intro', 'introformat', 'publish',
       'showresults', 'display', 'allowupdate', 'allowunanswered',
       'limitanswers', 'timeopen', 'timeclose', 'timemodified',
       'completionsubmit'));

The <choice> element has a child element <options> that in turn contains a set of <option> elements:

   $option = new backup_nested_element('option', array('id'), array(
       'text', 'maxanswers', 'timemodified'));

Great! Now we have all the information needed to start with coding.

Writing the conversion handler

Before we start, please make your own local branch that will track David's https://github.com/mudrd8mz/moodle/tree/backup-convert tree. That is the pre-integration branch where the development of backup conversion is happening, and your work will have to be merged there before it gets into the moodle.git master (use Github's pull request feature once you want your work to be included there).

To add this branch into your current moodle.git clone, you may want to use something like this:

   $ cd ~/public_html/moodle21
   $ git remote add mudrd8mz git://github.com/mudrd8mz/moodle.git
   $ git fetch mudrd8mz
   $ git checkout --track -b backup-convert mudrd8mz/backup-convert

Alternatively, if you want to create a pristine installation just for this project, you can simply run

   $ cd ~/public_html
   $ git clone git://github.com/mudrd8mz/moodle.git moodle21convert
   $ cd moodle21convert
   $ git checkout --track -b backup-convert origin/backup-convert

The backup conversion workflow

To be able to write the module conversion handler, you should understand what's going on in the background. This is a simplified description of the workflow:

[Figure: UML sequence diagram of the backup conversion machinery]

  • In its constructor, the restore_controller instance detects the format of the data to be restored and if it realizes it is not the standard moodle2 format, it knows the conversion will be needed (grep for backup::STATUS_REQUIRE_CONV)
  • The restore_controller::convert() method is called (@todo is it? where? by who?) and it in turn calls convert_helper::to_moodle2_format(). This helper method tries to find the most effective conversion path between the current format and the target moodle2 format as it constructs the chain of converters, though this is not very interesting yet as we have just one converter now ... well not yet even :-p
  • For each converter in the chain, convert_factory::get_converter() creates a new instance of it and its public convert() method is called.
  • The core functionality of moodle1 converter is defined in backup/converter/moodle1/lib.php in the class moodle1_converter and its subclasses. When the class is instantiated, it prepares a progressive parser and a parser processor and it registers all available handlers of the parsed data. The incoming convert() call sets up the directory to write to and runs the instance's execute() method.
  • The execute() starts up the parser of the 1.9 moodle.xml. That file is parsed sequentially and whenever it reaches a node to which some handler is attached, it dispatches the parsed data via dispatch_chunk(). The parser also triggers notify_path_start() and notify_path_end() when it is entering some registered path element and when it is leaving it.
  • The parser processor re-dispatches the parsed data and events via the path_start_reached(), process_chunk() and path_end_reached() methods defined by the moodle1_converter.
  • Finally, the parsed data and events are re-dispatched once more and they are handled by moodle1_handler subclasses via their on_xxx_start(), process_xxx() and on_xxx_end() methods. These are the places where the actual conversion of data must happen and the new data are written into the new XML files.
  • The moodle1_handler subclasses can use xml_writer's begin_tag(), full_tag() and end_tag() methods to construct the XML file contents. A helper method that dumps a complete tree-ish structure is also available: write_xml().

After the end of the parsed moodle.xml file is reached, the working directory with the new XML files is renamed so that it replaces the previous format. If there was a chain of converters, it would be the next one's turn now. For us at the moment, the job is done. Once the directory contains valid moodle2 format, the normal restore process is executed as if the course backup came from a 2.0 server.

Highlights and places to look at

While working on the converter, always keep in mind:

  • The file moodle.xml is parsed sequentially by a progressive parser. When the current node's data are available, the handler must either write them immediately into a new XML file or stash them for later processing, either by the on_element_end() event handler method or by another instance of the handler that is executed later.
  • Minimise the memory footprint of the conversion job. Do not accumulate incoming data in memory unless it is necessary, especially data that may be huge (like all forum posts). Write the data to the new XML file as soon as possible (ideally on the fly, though that is not always possible).
  • Look at the various examples of moodle1 handler classes in backup/converter/moodle1/handlerlib.php.
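
The advice about writing data on the fly can be illustrated with a small standalone sketch. It uses PHP's built-in XMLWriter class rather than Moodle's xml_writer, and write_options_file(), sample_options() and the sample records are hypothetical names invented for this example.

```php
<?php
// Sketch only: stream records into an XML file one by one instead of
// collecting them in memory first. Uses PHP's XMLWriter, not Moodle's
// xml_writer; write_options_file() and the sample data are hypothetical.

function write_options_file(string $file, iterable $options): void {
    $writer = new XMLWriter();
    $writer->openUri($file);
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('options');

    foreach ($options as $option) {
        $writer->startElement('option');
        $writer->writeAttribute('id', (string)$option['id']);
        $writer->writeElement('text', (string)$option['text']);
        $writer->writeElement('maxanswers', (string)$option['maxanswers']);
        $writer->endElement(); // option
        $writer->flush();      // push the buffered output to the file now
    }

    $writer->endElement(); // options
    $writer->endDocument();
    $writer->flush();
}

// A generator yields one record at a time, so the full list never
// has to exist in memory.
function sample_options(): Generator {
    yield array('id' => 1, 'text' => 'Yes', 'maxanswers' => 0);
    yield array('id' => 2, 'text' => 'No', 'maxanswers' => 0);
}

$file = tempnam(sys_get_temp_dir(), 'opt');
write_options_file($file, sample_options());
echo file_get_contents($file);
unlink($file);
```

Flushing after each record keeps the writer's buffer small. When a record cannot be written immediately, stashing it for later as described above is the fallback.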

Putting a new wheel into the machine

The handler library template

The module conversion logic is stored in a library mod/choice/backup/moodle1/lib.php. The basic template for such a file is:

<?php
 
// This file is part of Moodle - http://moodle.org/
//
// Moodle is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Moodle is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Moodle.  If not, see <http://www.gnu.org/licenses/>.
 
/**
 * Provides support for the conversion of moodle1 backup to the moodle2 format
 *
 * @package    mod
 * @subpackage choice
 * @copyright  2011 Your Name <your@email>
 * @license    http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later
 */
 
defined('MOODLE_INTERNAL') || die();
 
/**
 * Choice conversion handler
 */
class moodle1_mod_choice_handler extends moodle1_mod_handler {
 
    /**
     * Declare the paths in moodle.xml we are able to convert
     *
     * The method returns list of {@link convert_path} instances.
     * For each path returned, the corresponding conversion method must be
     * defined.
     *
     * Note that the path /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE does not
     * actually exist in the file. The last element with the module name was
     * appended by the moodle1_converter class.
     *
     * @return array of {@link convert_path} instances
     */
    public function get_paths() {
        return array(
            new convert_path('choice', '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE'),
            new convert_path('choice_option', '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE/OPTIONS/OPTION'),
        );
    }
 
    /**
     * This is executed every time we have one /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE
     * data available
     */
    public function process_choice($data) {
    }
 
    /**
     * This is executed every time we have one /MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE/OPTIONS/OPTION
     * data available
     */
    public function process_choice_option($data) {
    }
}

Let us look at this code template more closely so that you understand the bits and will be able to use it as a template for your own work.

As you can see, the file defines a single class that extends moodle1_mod_handler. All moodle1 handler classes must define a public get_paths() method that returns a list of convert_path instances. In our example, we declare that the handler is interested in the two paths we have identified above. We register them with the "aliases" choice and choice_option. The developer can choose any reasonable alias here and provide it as the first parameter of the convert_path constructor.

For each convert_path instance, a corresponding processing method must exist in the class. This method must be named process_xxx() where xxx is the alias declared in the get_paths().
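
The naming convention can be illustrated in isolation. The classes sketch_path and sketch_handler and the function sketch_dispatch() below are simplified stand-ins written for this example. They are not Moodle's real convert_path, moodle1_handler or dispatching code, which may well work differently internally.

```php
<?php
// Sketch of the naming convention only. The classes below are simplified
// stand-ins, not Moodle's real convert_path or moodle1_handler.

class sketch_path {
    public $alias;
    public $path;
    public function __construct(string $alias, string $path) {
        $this->alias = $alias;
        $this->path = $path;
    }
}

class sketch_handler {
    public $seen = array();

    public function get_paths(): array {
        return array(
            new sketch_path('choice', '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE'),
            new sketch_path('choice_option', '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE/OPTIONS/OPTION'),
        );
    }

    public function process_choice(array $data) {
        $this->seen[] = 'choice:' . $data['name'];
    }

    public function process_choice_option(array $data) {
        $this->seen[] = 'option:' . $data['text'];
    }
}

// Find the alias registered for $path and call process_<alias>() with $data.
function sketch_dispatch(sketch_handler $handler, string $path, array $data): bool {
    foreach ($handler->get_paths() as $registered) {
        if ($registered->path === $path) {
            $method = 'process_' . $registered->alias;
            if (!method_exists($handler, $method)) {
                throw new LogicException("Missing method $method()");
            }
            $handler->$method($data);
            return true;
        }
    }
    return false; // no handler method is registered for this path
}

$handler = new sketch_handler();
sketch_dispatch($handler, '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE', array('name' => 'Pick one'));
sketch_dispatch($handler, '/MOODLE_BACKUP/COURSE/MODULES/MOD/CHOICE/OPTIONS/OPTION', array('text' => 'Yes'));
print_r($handler->seen);
```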

Setting-up a debugging environment

At this stage of the development, this seems to be the most effective way of testing and debugging your code:

  • copy the moodle.xml file with the backup of the 1.9 course you created into $CFG->dirroot/backup/converter/moodle1/simpletest/files/moodle.xml (overwrite the one existing there)
  • modify the process_choice() method so that it just dumps the $data:
    public function process_choice($data) {
        print_object($data); // DONOTCOMMIT
    }
  • put the following into your config.php:
    $CFG->keeptempdirectoriesonbackup = true;
  • execute the unit tests in the path "backup/converter/moodle1" (there is one test method that actually runs the conversion of the file you copied)

If you are lucky enough, you should see an array dumped for each choice instance defined in the moodle.xml file. Also look at $CFG->dataroot/temp/backup/ - there should be several directories created, and the most recent one contains the converted backup in the unpacked MBZ format (for each execution of the convert process, a new directory is created). Do not continue unless this works for you.

(to be cont.)