Data formats

Jump to: navigation, search

Data formats are plugins that define how a table of data can be exported for download. Moodle comes with plugins for CSV, Excel, ODS, JSON, and HTML.

Previously if you wanted to export some table of data you either hand rolled UI of the export yourself using one of the various csvlib or excellib classes, or some of the more modern Moodle code uses table_sql which is not widely used, especially in 3rd party plugins, and highly underrated. This meant that if you want to support csv and excel, or ODS, then adding each new format was more code. This new Dataformat API provides a simple API to handle all of the downloading UI for little effort and removes all format specific code from your plugin code.


Using the Dataformat API

Using the new API is very simple and consists of two steps.

Step 1: Rendering a download selector

Use the core renderer to output the download selector UI:

echo $OUTPUT->download_dataformat_selector(get_string('userbulkdownload', 'admin'), 'download.php');

You can pass extra params to override the query param used to specify the format, and append extra query params as needed:

echo $OUTPUT->download_dataformat_selector(get_string('userbulkdownload', 'admin'), 'download.php', 'myqueryparam', array('foo' => 'bar'));


Step 2: Serving the data for download

And then in the page you just linked to, you create your data record set, define some metadata for the columns, and then pass these to the Dataformat API which then streams the results to the browser in the users selected format:

$dataformat = optional_param('dataformat', '', PARAM_ALPHA);
$columns = array(
    'idnumber' => get_string('idnumber'),);
$rs = $DB->get_recordset_sql(...);
download_as_dataformat('myfilename', $dataformat, $columns, $rs);
$rs->close();

The Dataformat API can handle any Iterator, so it can be used for data which isn't actually from the database, and has been generated by some other means.

If you are dealing with a relatively straight forward table of sql data, and you also want a standard table, it is strongly recommended you use the table_sql API which itself uses the Dataformat API and you'll get all this for free.

There are cases where the data returned from the DB needs processing in some way, this is an option:

download_as_dataformat('myfilename', $dataformat, $columns, $rs, function($record) {
    // Process the data in some way.
    // You can add and remove columns as needed
    // as long as the resulting data matches the $column metadata.
});

Column metadata considerations

The $columns metadata is an array of unique key names to human readable names. It is intended that each Dataformat has both available to use as needed. For instance a human facing document format such as Excel uses the lang string as the column headers, but the json format uses the key name instead. If the data you are exporting is primarily for import into other systems, you may decide that it is better for the columns to always be fixed and not localized, like this:

$columns = array(
    'idnumber' => 'idnumber',
    'course' => 'course',
    ...
);

Creating a new dataformat

Moodle already ships with 5 formats and it is unlikely that a large number of new dataformat plugins will emerge over time. But creating new ones is very easy, apart from the lang and version boiler plate it consists of a single autoloaded class.

A practical approach would be to clone and rename one of the existing text based formats such as /dataformat/json or /dataformat/html.

Components

dataformat/FORMATNAME/classes/writer.php 
This file is the real meat of the form, it contains class which extends \core\dataformat\base defining how the data stream should be encoded.
dataformat/FORMATNAME/lang/en/dataformat_FORMATNAME.php 
Contains the English language strings used in the format.
dataformat/FORMATNAME/version.php 
Version definitions, see version.php. It is highly recommended always to have it and it is required if you have any files in db folder

Generally no other files are needed, but all the other files available to other plugins could also be used here.

Performance

The design intent of the data formats is to have low and ideally fixed memory overhead, and to stream data to the browser record by record. This enables very large datasets to be exported easily and safely without running into a whole range of issues including php memory thresholds, php execution timeouts, and other timeouts in every layer of the stack at the apache / nginx layer, a caching layer such as varnish, or load balancers, any intermediate proxies, and finally any browser time-outs.

So please avoid building up a complete data structure in memory, or ideally not even on disk either. Instead just encode the data record by record as the data stream is processed and echo it to the browser. The HTML, JSON, and CSV data formats are good examples of this. The Excel and ODS formats are far more performant than in earlier versions of moodle, but still suffer from network inactivity timeouts at very large scale because they buffer the file to disc before finally sending when complete.