Resource module file API migration

* '''PROJECT STATE: Approved proposal'''
 

Revision as of 19:53, 20 July 2009

Moodle 2.0



Introduction

Content distribution is an important part of every VLE/CMS/LMS. Content is created in complex authoring tools by professional designers and distributed in standardised formats (IMS, SCORM, etc.), simple resources may be created by teachers directly in Moodle, and a lot of free material on the Internet can be linked from courses. Our mission is to create a user interface suitable for both beginners and advanced users; full compatibility with client computers is an important aspect too.


The Resource module is one of the oldest modules in Moodle; it was originally called reading. The current subplugin-based design dates back to 2004 (Moodle 1.4). In 2005 Eloy added the IMS type, and the Hive plugin was contributed the same year. The module does not have a permanent maintainer, and there have been no new features or significant user interface changes since 2005. There were several moments when things did not go as planned. The first was when the embedding changes were introduced in order to work around the Eolas patent trouble in IE. After multiple regressions and a rewrite of the embedding code, Microsoft finally settled the patent dispute and reverted the changes in IE. Other problems started appearing after the migration to xhtml strict/accessibility compliance in the 1.8dev phase. We started to use the new object tags instead of the good old deprecated embed tags and html frames. Browsers and especially plug-ins (flash, PDF, media) were not yet compatible with these new standards. Recent 1.9.x builds together with the latest browsers/plug-ins should work fine now; unfortunately not all client computers and servers are updated regularly. In theory the resource module is now more accessible, but in reality ordinary users may perceive it as a step backwards.


Some general problems:

  • no access control - files are protected at the course level only, even course guests may download all course files if they can guess the names
  • web page vs. text page confusion - having separate types for formatted text and for a web page with html editor is confusing for some users
  • browser caching issues - all course files except moddata are cached in browsers, this often manifests as "my resource files can not be updated" bug reports from teachers
  • outdated customised HTML Area - limited browser support, problematic media embedding, bugs (Tiny MCE is already in 2.0dev - needs more integration improvements though)
  • can not safely duplicate resources - in backup/restore/import/export we never know the list of files used in each resource, we have to drag along all course files
  • linking of site files - some servers are using site files for storage of shared resources, there is a hack in backup/restore that tries to solve this, but unfortunately it is not a good long term solution
  • unmaintained code - especially Hive and localfiles support
  • problematic options - force download for external files (we can force download only for files hosted by Moodle), new window controls seem to be confusing
  • uploaded file filtering - site-wide uploaded file filtering interferes with some plug-ins that use text or html data files


The idea of resource types implemented as subplugins has always been a little awkward and not that helpful. An incomplete list of problems follows:

  1. sub-plugin APIs are not as well-defined as module APIs
  2. refactoring inside the module is made harder
  3. subplugins can't be uninstalled
  4. subplugins need to bump the module version to upgrade
  5. universal data fields get reused for different things
  6. capabilities in subplugins are not supported
  7. subplugin upgrade is not integrated into main module upgrade - subplugin upgrade is executed after the main module upgrade (critical)
  8. subplugins can not have own lang packs
  9. subplugins do not have own log actions
  10. instead of reducing code duplication advanced plugins may duplicate a lot more core
  11. it is not easy to move unmaintained subplugins into contrib
  12. unofficial subplugins are not reported during upgrade
  13. no support for subplugins in file API

Migration of resource types in Moodle 2.0

Splitting resource into multiple modules

After some discussion, we've decided to split resource into normal modules, simplifying the structure at the same time. The benefits are cleaner, more maintainable code and a simpler Moodle.

mod/resource will still remain and will be used for two purposes - to serve uploaded resource materials and to redirect old URLs to the new modules. There will be a new table with backup of current resource table together with all other information needed for upgrades and URL redirections.

In order to still be able to group the new modules together in the user interface as "resource" plugins, we can mark these as "resource" plugins in the modules table in core, in a field called "archetypes" (later there might be "assignment" in here too), see #Module archetypes.

Resource types mapping

| Type (Moodle 1.9) | Description (Moodle 1.9) | Module (Moodle 2.0) | Description (Moodle 2.0) |
|---|---|---|---|
| directory | an arbitrary directory of files in course files | mod/folder | a collection of files in separate file area |
| html | Web page - an html page containing absolute links to media from course files or external links | mod/page | a general text page - html markup, format FORMAT_HTML and files stored in separate file area |
| text | Text page - selectable text format, plain text area for editing; may contain links to course files | mod/page | a general text page - text, format and files stored in separate file area |
| file | link to file from course files of the same course, files may contain relative links to other files | mod/resource | a file or collection of files, one file is always selected as main file |
| file | link to other course or general external URL | mod/url | a general URL |
| ims | Files extracted into moddata from an IMS content package obtained from course files or web repository | mod/imscp | Files in separate file area extracted from an IMS content package obtained from repository |
| repository | Links to external files in repositories (ie Hive) | mod/url? | (na) |
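
The mapping above can be read as a small decision rule. The following is only a sketch in Python (the real upgrade code would be PHP inside Moodle); the function name target_module and the "contains ://" test for external links are assumptions made for illustration, not part of the proposal.

```python
from typing import Optional

# Hypothetical helper: picks the Moodle 2.0 module for a Moodle 1.9 resource record.
def target_module(res_type: str, reference: str = "") -> Optional[str]:
    simple = {"directory": "folder", "html": "page", "text": "page", "ims": "imscp"}
    if res_type in simple:
        return simple[res_type]
    if res_type == "file":
        # Assumption: a reference containing "://" is an external URL or a link
        # into another course; anything else is a path in this course's files.
        return "url" if "://" in reference else "resource"
    # "repository" (e.g. Hive) is still undecided in the table ("mod/url?"),
    # so the sketch does not pick a module for it.
    return None
```

For example, target_module("file", "docs/a.pdf") gives "resource", while the same type with an http:// reference gives "url".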

Contrib resource types

There are other third party plugins in contrib that need to be migrated:

  • mod/resource/type/digitalnz - browse and add items listed in http://www.digitalnz.org/
    Sounds like it should become a repository plugin.
  • mod/resource/type/globe - browse and add items in Ariadne repositories
    Sounds like it should become a repository plugin.
  • mod/resource/type/jmol - display chemistry structures from a jmol format file
    Sounds like it should become a filter.
  • mod/resource/type/mrcuteget
    Sounds like it should become a repository plugin.
  • mod/resource/type/mrcuteput
    Sounds like it should become a portfolio plugin.
  • mod/resource/type/rss - display one RSS feed on a page
    Sounds like it could be fixed by allowing an RSS block to be added to an otherwise empty mod/page.
  • mod/resource/type/slideshow - display a directory of images as a slideshow
    Sounds like a new image gallery module.

Current resource-like modules

Separate modules will use similar migration techniques:

New resource modules

Resource file (mod/resource)

General resource type with two purposes - displaying uploaded files and redirecting legacy resource links to the new modules.


Primary use cases:

  • presentation of single file - PDF files, office documents, etc.
  • download of single file or zip archive - distribution of binary files or zip archives


Secondary use cases:

  • simple media file presentation - images, flash, movies, mp3, etc.
  • serving of general html files created in external system - web pages, html exported from Word, content packages, etc.
  • presentation of non-standard media files - no full control over embedding, might require proprietary browser plug-ins (quicktime, windows media, real media, adobe shockwave)


The user interface is not optimised for handling hundreds of files; the full file manager may be used instead after the resource instance has been created. Please note that media embedding is automatic and has no options; compared to manual embedding in the Page resource this is easier to use but less flexible.

resource database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | the title of the resource as it appears at the course outline |
| intro | text (medium) | | the description of the resource |
| introformat | int (4) | 0 | the format of the intro field |
| tobemigrated | int (4) | 0 | 0 means new resource file created in 2.0 or later, 1 means old plug-in type not migrated yet |
| mainfile | char (255) | | file area path to main file |
| legacyfiles | int (4) | 0 | 0 means no legacy files, 1 means on-demand migration off, 2 means on-demand migration on |
| legacyfileslast | int (10) | null | timestamp, last date of on-demand migration |
| display | int (4) | 0 | constants for "Same window / Embed / Frame / New window / Force download" |
| displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
| filterfiles | int (4) | 0 | constants |
| revision | int (10) | 0 | revision counter, incremented when any file changed |
| timemodified | int (10) | 0 | the timestamp when the module was modified |

Edit UI mockup

resource modedit.png

Potential problems

  1. #On-demand migration should solve relative link issues
  2. direct WebDAV management of files would be nice

Handling of legacy mod/resource/view.php links

Implemented in mod/resource/view.php: the id and f parameters are used to get the instance record from the resource table. Converted modules do not have an instance record in the resource table; the new instance id and new module name can be found in the resource_old table. The next step is easy, redirect to the new location:
 redirect("$CFG->wwwroot/mod/$newmodulename/view.php?id=$cmid");
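
A minimal sketch of this lookup, written in Python with the resource_old table modelled as a list of dicts. The function name legacy_redirect_url is hypothetical; only the cmid and newmodule fields and the target URL shape come from this page.

```python
from typing import Optional

def legacy_redirect_url(wwwroot: str, resource_old: list, cmid: int) -> Optional[str]:
    """Return the new view.php URL for a converted resource, or None."""
    for row in resource_old:
        # newmodule is null (None) while the record has not been converted yet.
        if row["cmid"] == cmid and row["newmodule"]:
            return f"{wwwroot}/mod/{row['newmodule']}/view.php?id={cmid}"
    return None
```

The cmid stays the same after conversion, so the old course module id can be reused directly in the new URL.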

This redirection would be fully transparent on the original site, but it would not work in restored courses. We need to convert the old resource URLs to new URLs during backup/restore/import/export of any activity in a course. It should be relatively easy for new backups in 2.0. Restores from 1.9 will be more problematic: we can either recode everything during restore or insert fake records into the resource_old table for each resource coming from a 1.9 restore file.

Handling of non-migrated types

There is a special flag in the resource table (tobemigrated) which indicates whether the module has already been converted. It is checked in mod/resource/view.php and a notification message is printed if needed. Please note that the original resource data and settings are stored in resource_old; the resource table structure is changed during upgrade.

Page resource (mod/page)

Formatted text pages with embedded media files (images, flash, etc.) created in on-line text editor (TinyMCE). Very easy to use.


Primary use cases:

  • short study materials - equivalent to one html page
  • general information - short notices and information for students
  • media file presentation - html page with manually embedded images, flash, mp3, etc.; embedding can be achieved via text filters too


Most of the features are actually implemented in the embedded text editor, which integrates the repository file picker. Each page should be self-contained; pages should only link to their own files or to files from the Internet. It is also possible to link to other course activities, but unfortunately these links may not work properly when restored on different servers.

Multiple-page resources may be available in Moodle 2.0 either as part of this Page resource module or as a new separate module such as Book. It would be nice to have something like the eXe editor (http://exelearning.org/) directly in Moodle.

page database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | the title of the resource as it appears at the course outline |
| intro | text (medium) | | the description of the resource |
| introformat | int (4) | 0 | the format of the intro field |
| content | text (medium) | | the content of the page |
| contentformat | int (3) | 0 | the format of the content field |
| legacyfiles | int (3) | 0 | 0 means no legacy files, 1 means on-demand migration off, 2 means on-demand migration on |
| legacyfileslast | int (10) | null | timestamp, last date of on-demand migration |
| displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
| revision | int (10) | 0 | revision counter, incremented when any file changed |
| timemodified | int (10) | 0 | the timestamp when the module was modified |

Edit UI mockup

page modedit.png

Potential problems

  • #On-demand migration should solve relative link issues
  • removed Show course blocks options - to be replaced by something else during blocks rewrite
  • removed New window support - accessibility reasons (this is going to be controversial)

Folder resource (mod/folder)

The Folder resource module replaces the original directory of files resource type. Files and folders are stored in a separate file area. This module is not intended for serving html pages.

Portfolio export will allow exporting of all files in one zip file.


Primary use cases:

  • easy distribution of many documents - PDFs, office documents, zip archives, etc.


Files may be organised into a tree structure; the tree is expanded automatically when it contains fewer than 20 files and folders in total.
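
The expansion rule can be sketched as follows. This is an illustration in Python only; the nested-dict representation of the folder tree and the names count_entries and expand_by_default are assumptions, only the threshold of 20 comes from the text.

```python
EXPAND_LIMIT = 20  # from the text: expand when fewer than 20 files+folders

def count_entries(tree: dict) -> int:
    """Count files and folders; a folder is a dict, a file is any other value."""
    total = 0
    for child in tree.values():
        total += 1
        if isinstance(child, dict):
            total += count_entries(child)
    return total

def expand_by_default(tree: dict) -> bool:
    """True when the whole tree is small enough to show fully expanded."""
    return count_entries(tree) < EXPAND_LIMIT
```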

folder database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | the title of the resource as it appears at the course outline |
| intro | text (medium) | | the description of the resource |
| introformat | int (4) | 0 | the format of the intro field |
| revision | int (10) | 0 | revision counter, incremented when any file changed |
| timemodified | int (10) | 0 | the timestamp when the module was modified |

Edit UI mockup

folder modedit.png folder edit files.png

View UI mockup

folder view.png

Potential problems

  1. we need some good file manager
  2. direct WebDAV management of files would be nice

URL resource (mod/url)

A URL to an external page or file.

url database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | the title of the resource as it appears at the course outline |
| intro | text (medium) | | the description of the resource |
| introformat | int (4) | 0 | the format of the intro field |
| externalurl | text (small) | | external url |
| display | int (3) | 0 | constants for "Same window / Embed / Frame / New window" |
| displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
| parameters | text (small) | null | list of extra url parameters added by Moodle |
| timemodified | int (10) | 0 | the timestamp when the module was modified |

Edit UI mockup

url modedit.png

Potential problems

  • teachers must understand they can not copy&paste course file links any more

IMS Content Package (mod/imscp)

IMS Content Package module, designed for easy presentation of standard compliant content packages. This is the recommended format for import of materials created in external systems. See http://www.imsglobal.org/content/packaging/


Primary use cases:

  • import of study materials from other systems

imscp database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | the title of the resource as it appears at the course outline |
| intro | text (medium) | | the description of the resource |
| introformat | int (4) | 0 | the format of the intro field |
| display | int (3) | 0 | constants for "Same window / Frame / New window" |
| displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
| revision | int (10) | 0 | revision counter, incremented when any file changed |
| timemodified | int (10) | 0 | the timestamp when the module was modified |

Edit UI mockup

ims modedit.png

Potential problems

  • custom IMS repository design obsoleted by new repository API
  • synchronisation with IMS repositories

Upgrade

Each resource type requires different upgrade code because the files are stored in different places that require special handling. They also have different settings and options with different text encoding in the table fields: reference, alltext and options.

lib/db/upgrade.php

Create new table resource_old and copy all current records from resource table. Add new fields:

  • oldid - original id from resource table
  • cmid - original course module id
  • newmodule - null means not converted, string value is name of new module
  • newid - new instance id referencing id in newmodule table
  • migrated - timestamp when the type was migrated


resource_old database table

| Field | Type | Default | Description |
|---|---|---|---|
| id | int (10) | | auto-numbered |
| course | int (10) | | the course id |
| name | char (255) | | original title of resource |
| type | char (30) | | name of resource type |
| reference | char (255) | | general reference |
| summary | text (small) | | old description in HTML format |
| alltext | text (medium) | | general text field |
| popup | text (small) | | some popup info |
| options | char (255) | | some options |
| timemodified | int (10) | 0 | the timestamp when the module was modified |
| oldid | int (10) | | the original id from resource table |
| cmid | int (10) | | the original course module id of resource, kept the same in new module |
| newmodule | char (50) | | name of new module (mod/newmodule/*) |
| newid | int (10) | | new instance id in the newmodule table |
| migrated | int (10) | 0 | timestamp when type migrated |

Switching resource types to full modules

Implemented in lib/upgradelib.php function switch_resource_type($resource_old)

  • $resource_old must already contain new module info (newmodule and newid fields)
  • links course_module to newid instead of old resource id
  • deletes old record instance
  • updates log table

After each call we must also reset course modinfo.
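
The effect of switch_resource_type() can be modelled on in-memory tables. The sketch below is Python, not the PHP that would live in lib/upgradelib.php, and it makes simplifying assumptions: tables are plain dicts and lists, course_modules stores the module name rather than a module id, and log entries carry a cmid field.

```python
def switch_resource_type(db: dict, resource_old: dict) -> int:
    """Re-point a course module at its new instance; return the course id."""
    if not resource_old.get("newmodule") or not resource_old.get("newid"):
        raise ValueError("newmodule and newid must be set before switching")

    # Link the existing course module (same cmid) to the new instance.
    cm = db["course_modules"][resource_old["cmid"]]
    cm["module"] = resource_old["newmodule"]
    cm["instance"] = resource_old["newid"]

    # Delete the old instance record from the resource table.
    db["resource"].pop(resource_old["oldid"], None)

    # Update log entries so they point at the new module.
    for entry in db["log"]:
        if entry["module"] == "resource" and entry["cmid"] == resource_old["cmid"]:
            entry["module"] = resource_old["newmodule"]

    # The caller is expected to reset modinfo for this course afterwards.
    return cm["course"]
```

Returning the course id is a convenience of the sketch, so the caller knows which course modinfo to reset.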

Following items are not required in case of Resource module:

  • no need to upgrade gradebook - no grades or outcomes in current resources
  • no need to change permissions - the same cmid means the same context
  • no need for capability migration - resource does not define any capabilities, only mod/resource:exportresource added in 2.0dev
  • old section placement retained, because it is specified in course_modules table
  • no need to migrate files, no files in areas yet

Module upgrade steps

mod/resource/db/upgrade.php

  1. change resource table structure - we do not need to keep any data because copy of everything is kept in resource_old table
  2. mark all records in resource table as not migrated
  3. search for migration candidates in resource_old table - original type file, link pointing to course file from the same course
  4. find out old setting from old record
  5. update resource table record and mark the instance as active
  6. copy linked course file to new file area, mark this file as main file
  7. enable #On-demand migration of files if file may contain relative links (whitelist image types and some other binary types)
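
Step 7 depends on a whitelist of file types that can not contain relative links. A possible shape for that check, sketched in Python; the extension list and the name needs_on_demand_migration are illustrative assumptions, the real whitelist is not specified here.

```python
import os

# Assumption: example whitelist of types that cannot contain relative links.
NO_RELATIVE_LINKS = {".png", ".jpg", ".jpeg", ".gif", ".zip", ".mp3"}

def needs_on_demand_migration(mainfile: str) -> bool:
    """True unless the main file has a whitelisted (link-free) extension."""
    ext = os.path.splitext(mainfile)[1].lower()
    return ext not in NO_RELATIVE_LINKS
```

Anything not on the whitelist, including files without an extension, is treated as possibly containing relative links, which errs on the side of enabling the migration.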

mod/url/db/install.php

  1. search for migration candidates in resource_old table - type file with URLs outside of current course files
  2. find out all new settings from existing old record
  3. create new instance in url table
  4. update resource_old
  5. call switch_resource_type($resource_old)

mod/folder/db/install.php

  1. search for migration candidates in resource_old table - type directory
  2. create new instance in folder table
  3. update resource_old
  4. call switch_resource_type($resource_old)
  5. recursively copy the chosen course directory to the folder file area

mod/page/db/install.php

  1. search for migration candidates in resource_old table - type html and text
  2. find out all new settings from existing old record
  3. create new instance in page table
  4. update resource_old
  5. call switch_resource_type($resource_old)
  6. search for any absolute links in case of html text and copy them to new page file area keeping the same path relative to course files root
  7. enable #On-demand migration of files if any copied file above may contain relative links
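
Step 6 needs a way to find course file links in the html text. A rough sketch in Python, assuming that 1.9 course files are linked as <wwwroot>/file.php/<courseid>/<path>; the function name course_file_links is hypothetical and other link forms are not handled.

```python
import re

def course_file_links(html: str, wwwroot: str, courseid: int) -> list:
    """Return linked course-file paths, relative to the course files root."""
    # Assumption: links look like <wwwroot>/file.php/<courseid>/<path>.
    prefix = re.escape(f"{wwwroot}/file.php/{courseid}/")
    pattern = re.compile(prefix + r"""([^"'\s<>?#]+)""")
    seen = []
    for path in pattern.findall(html):
        if path not in seen:  # keep first occurrence, preserve order
            seen.append(path)
    return seen
```

The returned paths can be reused unchanged as paths inside the new page file area, which keeps them relative to the old course files root as the step requires.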

mod/imscp/db/install.php

  1. search for migration candidates in resource_old table - type ims
  2. find instance setting
  3. create new instance in imscp table
  4. update resource_old
  5. call switch_resource_type($resource_old)
  6. recursively copy moddata files into ims file area
  7. copy original backup of package file to another new ims file area

Others

Local files are a special form of file type, the reference starts with RESOURCE_LOCALPATH.

Hive repositories could be converted to simple URL pages.

Contrib types do not need to be migrated at the same time; it will be possible to migrate them independently because the first upgrade step in the core upgrade.php script copies all existing data into the resource_old table. New modules are responsible for deleting or renaming any extra db tables that were previously created. The new mod/newmodule/db/install.php script is the expected migration point.

On-demand migration of files

Some new resource modules may contain relative links to other files originally stored in the course files area. We may not be able to detect all these relatively linked files during the migration/upgrade and copy them to the new file area. When a browser opens a migrated file (served via $CFG->wwwroot/pluginfile.php) and meets a relative link, it first constructs an absolute url (again starting with $CFG->wwwroot/pluginfile.php) and requests it from the server. pluginfile.php then locates our new resource module; the module instance looks for the file in its own new file area and, if it does not exist there, tries to find the forgotten file in the original course files. Any forgotten files are migrated to the new file area on demand.

The on-demand migration will be implemented only in several new modules (mod/resource and mod/page), others either know all required files or there can not be any relative links.


The on-demand flag legacyfiles is stored in the module instance table; all new instances created in 2.0 have it disabled. Upgraded instances have a modedit setting which turns it on/off as needed. The migration/upgrade process tries to detect problematic resource content and automatically turns on the on-demand migration:

  • html files possibly containing images, css, applets and javascript
  • flash applets containing relative links to media (eg sounds or images) and data files
  • java applets containing relative links to media
  • javascript that loads media via relative links (eg imagine an auto-generated slideshow)

We still want to move all these legacy files into the correct resource filearea, let's try to explain it once more ;-)

  1. When a browser requests to view one of these legacy HTML files, it needs to load all the JS and Java and Flash it contains.
  2. These small programs run and need to access other files. They do not know the absolute URLs, so they construct them from their own absolute URL and the specified relative links.
  3. These requests to Moodle will hit the pluginfile.php script
  4. The pluginfile.php script checks whether the file is still missing from the resource file area, and if so:
    1. pluginfile.php uses the relative path to copy the original file from the course files area to the resource file area
    2. pluginfile.php updates the legacyfileslast field for the resource to be the current timestamp, indicating that some processing has happened.
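
The fallback described in these steps can be sketched as follows. This is a Python model, not the pluginfile.php code: file areas are plain dicts keyed by path, and the function name serve_file and the constant name LEGACYFILES_ON are assumptions. The value 2 and the timestamp field are taken from the table definitions on this page.

```python
import time

LEGACYFILES_ON = 2  # "on-demand migration on" in the legacyfiles field

def serve_file(instance: dict, area: dict, course_files: dict, path: str, now=None):
    """Return the content for path, migrating it from course files if needed."""
    if path in area:
        return area[path]  # already in the module's own file area
    if instance["legacyfiles"] != LEGACYFILES_ON or path not in course_files:
        return None  # the real script would answer with "file not found"
    # Forgotten file: copy it into the new file area under the same relative path.
    area[path] = course_files[path]
    # Record when the last on-demand migration happened.
    instance["legacyfileslast"] = int(now if now is not None else time.time())
    return area[path]
```

Only the first request for a forgotten file takes the slow path; later requests find it in the new file area straight away.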

In any case the on-demand migration is relatively cheap, we do not need to turn it off for performance reasons - it is just one wasted db query when resource contains incorrect relative links. The cost of migration itself is not significant because it is done only once.


The problem is that we don't really know when the resource migration is completely "finished" (ie every single associated file has been copied into the new file area), so we can't automatically turn off on-demand migration after this process. That's why the manual post-upgrade script described below may help in some situations.

There are cases when we need to know list of files belonging to each resource:

  • backup
  • import resource instances into other courses

Since we have a timestamp in there we COULD consider a cron job which detects completely migrated resources - resources with a lot of recent activity and no on-demand migration for a long time. Alternatively we may use these conditions to propose finished migrations in some admin report or post-upgrade script.
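
One possible form of such a heuristic, sketched in Python. The thresholds (50 recent views, 30 quiet days) and the function name migration_looks_finished are invented for illustration; suitable values would have to be chosen from real usage data.

```python
DAY = 24 * 60 * 60

def migration_looks_finished(instance: dict, recent_views: int, now: int,
                             min_views: int = 50, quiet_days: int = 30) -> bool:
    """Guess whether on-demand migration can be proposed as finished."""
    if instance["legacyfiles"] != 2:  # on-demand migration is not on
        return False
    # A null timestamp means no file was ever migrated on demand.
    last = instance.get("legacyfileslast") or 0
    quiet = now - last >= quiet_days * DAY
    return recent_views >= min_views and quiet
```

The result is only a proposal: a rarely used link inside the resource could still point at a file that has not been copied yet.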

Post-upgrade processing

After the Moodle 2.0 upgrade has completed we take the admin to a special page that:

  1. explains the issue of resources that contain links to other files
  2. explains how it can be fixed by on-demand migration (see below)
  3. presents an option for "automatic parsing" which tries to automatically migrate simpler resources such as HTML files with links to HTML files, images, CSS
  4. presents a list of remaining flagged resources as links that need to be visited manually in the browser to be fixed

I imagine this interface might have a small frame on the left with the main pages being loaded in the larger frame on the right. The left frame has controls and info like:

  • total number of pages to process
  • next and previous buttons
  • some button to say "I have completely viewed this resource"

So the admin will have to sit there and go through all these resources and check them off to complete the upgrade. Until this process is finished none of these operations can be guaranteed to work so we might even think about disabling those operations:

  • Backup
  • Restore
  • Import of a course

Normal viewing of resources WILL work fine for teachers and students, because it will just trigger the normal on-demand migration of files.

Candidates for removal from core

List of obsoleted or unmaintained features proposed for removal from core

hive sso

Located in /sso/hive/*. Unmaintained and obsoleted by auth SSO plugins and Moodle networking. It should be possible to implement new auth/hivesso and maintain it in contrib. Any volunteers?

hive repositories

Unmaintained. Any volunteers?

IMS repository

Obsoleted by regular repositories. Maybe a new core IMS repo plugin.

file/localfile.php

No maintainer. Intended for large files stored on local networks or CDs, obsoleted by faster Internet connections and various offline Moodle projects. Volunteers may create a new plugin and maintain it in contrib.

weblinkauth hook

Obsoleted during the auth refactoring in 1.8dev. Volunteers may continue maintaining it as new contrib auth plugin.

Other ideas

Dealing with caching issues

To have more control over the caching of files, we could use the same mechanism already implemented in the SCORM module, by changing the URL:

  1. Internally store all files as itemid=0
  2. Add new revision db field into resource tables
  3. When serving files always construct links with revision as itemid
  4. Ignore itemid value for resource files in pluginfile.php
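
A small sketch of the idea in Python. The URL layout /pluginfile.php/<contextid>/<filearea>/<itemid>/<path> and both function names are assumptions for illustration; the point is only that the revision travels in the itemid position and is discarded on the way in.

```python
def file_url(wwwroot: str, contextid: int, filearea: str, revision: int, path: str) -> str:
    """Build a link that changes whenever the revision counter changes."""
    return f"{wwwroot}/pluginfile.php/{contextid}/{filearea}/{revision}/{path.lstrip('/')}"

def parse_file_request(relative: str):
    """Split '/<contextid>/<filearea>/<itemid>/<path>' and force itemid to 0."""
    contextid, filearea, _itemid, path = relative.lstrip("/").split("/", 3)
    # The itemid in the URL is only a cache-buster; files are stored as itemid=0.
    return int(contextid), filearea, 0, "/" + path
```

When a teacher changes a file the revision is incremented, every link gets a new URL, and browsers fetch a fresh copy even though the stored file location is unchanged.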

Module archetypes

One of the reasons for resource plugins was to visually group all resources in user interface - separate drop downs when adding activities, grouping of resources in Activities block, mod/resource/index.php page, reports and overviews, etc. Unfortunately this solution excluded contrib resource-like modules and in general was not flexible enough.

There are several possible ways how to find out activity types:

  1. new field archetypes in modules table - benefits are performance, ability to change types easily
  2. modulename_plugin_supports() - new core API in 2.0, each plugin may indicate various supported features
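
Whichever source is used, the user interface ends up grouping modules by archetype. A sketch of that grouping in Python; the input shape (module name mapped to a list of archetypes) and the fallback group "other" are assumptions made for the example.

```python
from collections import defaultdict

def group_by_archetype(modules: dict) -> dict:
    """Map each archetype to the sorted list of module names declaring it."""
    groups = defaultdict(list)
    for name, archetypes in sorted(modules.items()):
        # Modules without any archetype go into a hypothetical "other" group.
        for archetype in archetypes or ["other"]:
            groups[archetype].append(name)
    return dict(groups)
```

Contrib resource-like modules would join the "resource" group simply by declaring that archetype, which is the flexibility the subplugin design lacked.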

Opening of new windows

At present there is an option to force opening of resource in new window with a lot of detailed options, I think it is too complex for majority of our users. This is going to be controversial, not really sure how to improve this, please post in the page comments ;-)


Known uses of new windows are:

  • leaving of Moodle site - when clicking on external links we do not want users to leave our site, they might have trouble finding way back to us
  • loss of navigation elements - if you open pdf or image directly in browser you need to click on Back button to return to site
  • course design - some designers want flash applets to open in small new windows without chrome (annoying for many users, not accessible at all)
  • copy protection - this is plain silly, anything in browser can be downloaded easily


The concept of forced new windows is considered obsolete by authors of web standards and accessibility guidelines:

  • there is no support for target property in xhtml strict, we have to use ugly JS tricks to work around validation problems
  • not recommended by accessibility experts
  • removing browser UI elements such as address/menu/status via javascript may be considered evil - browser interfaces are not cluttered with useless dead space anymore, screens are larger, F11 full-screen support is available in all browsers, and opening in a new tab ignores these directives


The mockups above contain simplified new window controls on modedit pages.

Embedding of files

Files may be opened directly in the browser window or embedded into an html page. For security reasons only trusted users may use the object tag in html code; this is not a problem in the case of resource because it is designed only for trusted teachers.

Embedding is done using:

  • html editor - TinyMCE editor contains several plug-ins for embedding of media files, we can add more plug-ins easily
  • text filters - core Moodle feature, allows safer embedding for untrusted users
  • automatic resource embedding - special feature of current file resource type


The resulting code is very similar in all cases, we could create some general library and share code in resource-like modules and text filters.

Filtering of uploaded html files

Missing navigation is one of the biggest problems when using content created in external authoring tools. One of the possible solutions is to apply special filtering to each html file in the resource file area. This does not have to be implemented as a regular Moodle text filter; instead it can be implemented directly in the resource_pluginfile() function.

The general idea is to search for markers in html file (special html comments, divs, etc.) and replace them with current navigation elements, course module info, course name, current user information on-the-fly. We have to make sure caching is prevented when using session or current user information.
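
As an illustration of the marker idea, here is a sketch in Python that replaces special html comments with escaped values. The marker syntax <!-- MOODLE:NAME --> and the function name fill_markers are invented for this example; the proposal does not fix a marker format.

```python
import re
from html import escape

# Hypothetical marker syntax: <!-- MOODLE:COURSENAME -->
MARKER = re.compile(r"<!--\s*MOODLE:(\w+)\s*-->")

def fill_markers(html: str, values: dict) -> str:
    """Replace known markers with escaped values; leave unknown markers alone."""
    def repl(match):
        key = match.group(1).lower()
        return escape(values[key]) if key in values else match.group(0)
    return MARKER.sub(repl, html)
```

Values are escaped before insertion because course names and user names are not trusted html. Unknown markers are left untouched, so a file written for a newer marker set still displays.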

This approach was not possible in 1.9.x because we could not determine current resource instance from file.php parameters.


This feature could be implemented in mod/resource, mod/imscp and mod/scorm.

See also