Development:Resource module file API migration
Revision as of 13:04, 8 June 2009
Note: This article is a work in progress. Please use the page comments or an appropriate moodle.org forum for any recommendations/suggestions for improvement.
- PROJECT STATE: Proposal
- MAIN TRACKER ISSUE: MDL-16089 Resource module migration
- DISCUSSION AND COMMENTS: not announced on moodle.org yet
- AUTHOR: Petr Škoda (škoďák) and others
Introduction
Content distribution is an important part of every VLE/CMS/LMS. Content is created in complex authoring tools by professional designers and distributed in standardised formats (IMS, SCORM, etc.), simple resources may be created by teachers directly in Moodle, and a lot of free material on the Internet can be linked from courses. Our mission is to create a user interface suitable for both beginners and advanced users; full compatibility with client computers is an important aspect too.
The Resource module is one of the oldest modules in Moodle; it was originally called reading. The current subplugin-based design dates back to 2004 (Moodle 1.4). In 2005 Eloy added the IMS type, and the Hive plugin was contributed the same year. This module does not have a permanent maintainer, and there have been no new features or significant user interface changes since 2005. There were several moments when things did not go as planned. The first was when the embedding changes were introduced to work around the Eolas patent trouble in IE. After multiple regressions and a rewrite of the embedding code, Microsoft finally settled the patent dispute and reverted the changes in IE. Other problems started appearing after the migration to XHTML strict/accessibility compliance in the 1.8dev phase. We started to use the new object tags instead of the good old deprecated embed tags and HTML frames. Browsers and especially plug-ins (Flash, PDF, media) were not yet compatible with these new standards. Recent 1.9.x builds together with the latest browsers/plug-ins should work fine now; unfortunately, not all client computers and servers are updated regularly. In theory the resource module is now more accessible, but in reality ordinary users may perceive it as a step backwards.
Some general problems:
- no access control - files are protected at the course level only, even course guests may download all course files if they can guess the names
- hardcoded text formats - separate types for plain text and HTML seem to be confusing for many users, there is no support for other text mark-up formats
- browser caching issues - all course files except moddata are cached in browsers, this often manifests as "my resource files can not be updated" bug reports from teachers
- outdated customised HTML Area - limited browser support, problematic media embeddings, bugs (Tiny MCE already in 2.0dev - needs more integration improvements though)
- can not safely duplicate resources - in backup/restore/import/export we never know the list of files used in each resource, we have to drag along all course files
- linking of site files - some servers are using site files for storage of shared resources, there is a hack in backup/restore that tries to solve this, but unfortunately this was not a good long term solution
- unmaintained code - especially Hive and localfiles support
- problematic options - force download for external files (we can force download only for files hosted by Moodle), new window controls seem to be confusing
- uploaded file filtering - site-wide uploaded file filtering interferes with some plug-ins that use text or html data files
The idea of resource types implemented as subplugins has always been a little awkward and not that helpful. An incomplete list of problems follows:
- sub-plugin APIs are not as well-defined as module APIs
- refactoring inside the module is made harder
- subplugins can't uninstall
- subplugins need to bump the module version to upgrade
- universal data fields get reused for different things
- capability in subplugins not supported
- subplugin upgrade is not integrated into main module upgrade - subplugin upgrade is executed after the main module upgrade (critical)
- subplugins can not have own lang packs
- subplugins do not have own log actions
- instead of reducing code duplication advanced plugins may duplicate a lot more core code
- it is not easy to move unmaintained subplugins into contrib
- unofficial subplugins not reported during upgrade
- no support for subplugins in file API
Migration of resource types in Moodle 2.0
Splitting resource into multiple modules
After some discussion, we've decided to split resource into normal modules, simplifying the structure at the same time. The benefits are cleaner, more maintainable code and a simpler Moodle.
mod/resource will remain and will be used for two purposes - to serve uploaded resource materials and to redirect old URLs to the new modules. We will add a new table holding a backup of the current resource table together with all other information needed for upgrades and URL redirections.
In order to still be able to group the new modules together in the user interface as "resource" plugins, we can mark these as "resource" plugins in the modules table in core, in a field called "archetypes" (later there might be "assignment" in here too), see #Module archetypes.
Resource types mapping
Old type | Old description | New module | New description |
---|---|---|---|
directory | an arbitrary directory of files in course files | mod/folder | a collection of files in separate file area |
html | A html page, containing absolute links to media from course files or external links | mod/page | a general text page - html markup, format FORMAT_HTML and files stored in separate file area |
text | A plain text page in some format | mod/page | a general text page - no markup, format FORMAT_PLAIN and no files |
file | link to file from course files of the same course, files may contain relative links to other files | mod/resource | one file or collection of files, one file is always selected as main file |
file | link to other course or general external URL | mod/url | general URL |
ims | Files extracted into moddata from an IMS content package obtained from course files or web repository | mod/imspackage | Files in separate file area extracted from an IMS content package obtained from repository |
repository | Links to external files in repositories (ie Hive) | mod/url? | (na) |
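
The mapping above can be read as a small lookup. The following is only an illustrative sketch in Python (the real upgrade code would be PHP inside Moodle); the function name `target_module` and the `is_course_file` flag are hypothetical, and the repository entry follows the table's tentative "mod/url?" value.

```python
from typing import Optional

# Old resource type -> new module, following the mapping table above.
# "file" is handled separately because it splits into two modules.
TYPE_TO_MODULE = {
    "directory": "folder",
    "html": "page",
    "text": "page",
    "ims": "imspackage",
    "repository": "url",  # tentative in the table ("mod/url?")
}


def target_module(old_type: str, is_course_file: bool = False) -> Optional[str]:
    """Return the new module name for an old resource type, or None if unknown."""
    if old_type == "file":
        # Links to files of the same course stay in mod/resource;
        # links to other courses or external URLs move to mod/url.
        return "resource" if is_course_file else "url"
    return TYPE_TO_MODULE.get(old_type)
```

Unknown types (for example contrib types such as jmol) return None here, which matches the idea that they are migrated independently.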
Contrib resource types
There are other third party plugins in contrib that need to be migrated:
- mod/resource/type/digitalnz - browse and add items listed in http://www.digitalnz.org/
- mod/resource/type/globe - browse and add items in Ariadne repositories
- mod/resource/type/jmol - display chemistry structures from a jmol format file
- mod/resource/type/mrcuteget
- mod/resource/type/mrcuteput
- mod/resource/type/rss - display one RSS feed on a page
- mod/resource/type/slideshow - display a directory of images as a slideshow
Current resource-like modules
Separate modules will use similar migration techniques:
- mod/book - Book displays multiple pages of HTML, print support, table of contents, etc.
- mod/lightboxgallery - http://moodle.greenhead.ac.uk/external/lightbox_gallery/
- mod/label - html fragment displayed on main course page
New resource modules
Resource file (mod/resource)
General resource type with two purposes - displaying of uploaded files and redirecting of legacy resource links to new modules.
Primary use cases:
- presentation of single file - PDF files, office documents, etc.
- download of single file or zip archive - distribution of binary files or zip archives
Secondary use cases:
- simple media file presentation - images, flash, movies, mp3, etc.
- serving of general html files created in external system - web pages, html exported from Word, content packages, etc.
- presentation of non-standard media files - no full control over embedding, might require proprietary browser plug-ins (quicktime, windows media, real media, adobe shockwave)
The user interface is not optimised for handling hundreds of files; the full file manager may be used instead after the resource instance is created. Please note that media embedding is automatic and has no options. Compared to manual embedding in the Page resource, this is easier to use but less flexible.
resource database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | the title of the resource as it appears at the course outline |
intro | text (medium) | | the description of the resource |
introformat | int (3) | 0 | the format of the intro field |
mainfile | char (255) | | file area path to main file |
migrated | int (3) | 1 | 1 means new resource file created in 2.0 or later, 0 means old plug-in type not migrated yet |
legacyfiles | int (3) | 0 | 0 means no legacy files, 1 means on-demand migration off, 2 means on-demand migration on |
legacyfileslast | int (10) | null | timestamp, last date of on-demand migration |
display | int (3) | 0 | constants for "Same window / Embed / Frame / New window / Force download" |
displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
filterfiles | int (3) | 0 | constants |
revision | int (10) | 0 | revision counter, incremented when any file changed |
timemodified | int (10) | 0 | the timestamp when the module was modified |
Edit UI mockup
Potential problems
- #On-demand migration should solve relative link issues
- direct WebDAV management of files would be nice
Handling of legacy mod/resource/view.php links
Implemented in mod/resource/view.php: the id and f parameters are used to get the instance record from the resource table. Converted modules do not have an instance record in the resource table; the new instance id and new module name can be found in the resource_historic table. The next step is easy, redirect to the new location: redirect("$CFG->wwwroot/mod/$newmodulename/view.php?id=$cmid");
This redirection would be fully transparent for the original site, but it would not work in restored courses. We need to convert the old resource URLs to new URLs during backup/restore/import/export of any activity in a course. It should be relatively easy for new backups in 2.0. Restores from 1.9 will be more problematic: we can either recode everything during restore or insert fake records into the resource_historic table for each resource coming from a 1.9 restore file.
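
A minimal sketch of the lookup behind this redirect, written in Python for illustration only (the real code would live in mod/resource/view.php). The field names oldid, cmid and newmodule come from the resource_historic table described on this page; the function name, the in-memory list of rows and looking up by old instance id are assumptions.

```python
from typing import Optional


def legacy_redirect_url(wwwroot: str, historic_rows: list, old_id: int) -> Optional[str]:
    """Return the new view.php URL for a converted resource, or None.

    historic_rows stands in for the resource_historic table; each row is a
    dict with at least oldid, cmid and newmodule (None = not converted).
    """
    for row in historic_rows:
        if row["oldid"] == old_id and row["newmodule"] is not None:
            # The course module id is kept, so it still identifies the activity.
            return f"{wwwroot}/mod/{row['newmodule']}/view.php?id={row['cmid']}"
    return None
```

A None result corresponds to the two remaining cases: the resource is still served by mod/resource itself, or its type has not been migrated yet.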
Handling of non-migrated types
There is a special flag in resource table which indicates if module was already converted. It is verified in mod/resource/view.php and notification message is printed if needed. Please note the original resource data and settings are stored in resource_historic, the resource table structure is changed during upgrade.
Page resource (mod/page)
Formatted text pages with embedded media files (images, flash, etc.) created in on-line text editor (TinyMCE). Very easy to use.
Primary use cases:
- short study materials - equivalent to one html page
- general information - short notices and information for students
- media file presentation - html page with manually embedded images, flash, mp3, etc.; embedding can be achieved via text filters too
Most of the features are actually implemented in embedded text editor which integrates repository file picker. Each page should be self-contained - pages should only link own files and files from Internet. It is also possible to link other course activities, unfortunately these links may not work properly when restored on different servers.
Multiple-page resources may be available in Moodle 2.0 either as part of this Page resource module or as a new separate module such as Book. It would be nice to have something like the eXe editor (http://exelearning.org/) directly in Moodle.
page database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | the title of the resource as it appears at the course outline |
intro | text (medium) | | the description of the page |
introformat | int (3) | 0 | the format of the intro field |
content | text (medium) | | the content of the page |
contentformat | int (3) | 0 | the format of the content field |
legacyfiles | int (3) | 0 | 0 means no legacy files, 1 means on-demand migration off, 2 means on-demand migration on |
legacyfileslast | int (10) | null | timestamp, last date of on-demand migration |
displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
revision | int (10) | 0 | revision counter, incremented when any file changed |
timemodified | int (10) | 0 | the timestamp when the module was modified |
Edit UI mockup
Potential problems
- #On-demand migration should solve relative link issues
- removed Show course blocks options - to be replaced by something else during blocks rewrite
- removed New window support - accessibility reasons (this is going to be controversial)
Folder resource (mod/folder)
Folder resource module, replaces original directory of files resource type. Files and folders are stored in separate file area. This module is not intended for serving of html pages.
Primary use cases:
- easy distribution of many documents - PDFs, office documents, zip archives, etc.
Files may be organised into a tree structure; the tree is expanded automatically when fewer than 20 files and folders are found.
folder database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | the title of the resource as it appears at the course outline |
intro | text (medium) | | the description of the folder |
introformat | int (3) | 0 | the format of the intro field |
allowzipall | int (3) | 1 | allow download of all files in one zip file |
revision | int (10) | 0 | revision counter, incremented when any file changed |
timemodified | int (10) | 0 | the timestamp when the module was modified |
Edit UI mockup
View UI mockup
Potential problems
- we need some good file manager
- direct WebDAV management of files would be nice
URL resource (mod/url)
A URL to an external page or file.
url database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | the title of the resource as it appears at the course outline |
intro | text (medium) | | the description of the URL |
introformat | int (3) | 0 | the format of the intro field |
externalurl | text (medium) | | external url |
display | int (3) | 0 | constants for "Same window / Embed / Frame / New window" |
displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
parameters | text (small) | null | list of extra url parameters added by Moodle |
timemodified | int (10) | 0 | the timestamp when the module was modified |
Edit UI mockup
Potential problems
- teachers must understand they can not copy&paste course file links any more
IMS Content Package (mod/imspackage)
IMS Content Package module, designed for easy presentation of standard compliant content packages. This is the recommended format for import of materials created in external systems. See http://www.imsglobal.org/content/packaging/
Primary use cases:
- import of study materials from other systems
imspackage database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | the title of the resource as it appears at the course outline |
intro | text (medium) | | the description of the content package |
introformat | int (3) | 0 | the format of the intro field |
display | int (3) | 0 | constants for "Same window / Frame / New window" |
displayoptions | text (small) | null | arbitrary display options - serialized PHP array |
revision | int (10) | 0 | revision counter, incremented when any file changed |
timemodified | int (10) | 0 | the timestamp when the module was modified |
Edit UI mockup
Potential problems
- is imspackage a suitable name for this module?
- custom IMS repository design obsoleted by new repository API
- synchronisation with IMS repositories
Upgrade
Each resource type requires different upgrade code because the files are stored in different places that require special handling. They also have different settings and options with different text encoding in the table fields: reference, alltext and options.
lib/db/upgrade.php
Create new table resource_historic and copy all current records from resource table. Add new fields:
- oldid - original id from resource table
- cmid - original course module id
- newmodule - null means not converted, string value is name of new module
- newid - new instance id referencing id in newmodule table
- migrated - timestamp of the migration
resource_historic database table
Field | Type | Default | Description |
---|---|---|---|
id | int (10) | | auto-numbered |
course | int (10) | | the course id |
name | char (255) | | original title of resource |
type | char (30) | | name of resource type |
reference | char (255) | | general reference |
summary | text (small) | | old description in HTML format |
alltext | text (medium) | | general text field |
popup | text (small) | | some popup info |
options | char (255) | | some options |
timemodified | int (10) | 0 | the timestamp when the module was modified |
oldid | int (10) | | the original id from resource table |
cmid | int (10) | | the original course module id of resource, kept the same in new module |
newmodule | char (50) | | mod/newmodule/* name of new module |
newid | int (10) | | new instance id in the newmodule table |
migrated | int (10) | 0 | timestamp when type migrated |
Switching resource types to full modules
Implemented in lib/upgradelib.php function switch_resource_type($resource_historic)
- $resource_historic must already contain new module info (newmodule and newid fields)
- links course_module to newid instead of old resource id
- deletes old record instance
- updates log table
After each call we must also reset course modinfo.
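
The steps listed for switch_resource_type() can be sketched as follows. This is an illustrative Python model, not the proposed PHP implementation: the tables are plain dicts and lists, course_modules stores the module name directly (in Moodle it references the modules table), and returning the course id for the modinfo reset is an assumption.

```python
def switch_resource_type(db: dict, historic: dict) -> int:
    """Re-point one course module from the old resource to its new module.

    db is a toy stand-in with three "tables": course_modules (by cmid),
    resource (by id) and log (list of entries). Returns the course id whose
    modinfo the caller must reset afterwards.
    """
    if not historic.get("newmodule") or historic.get("newid") is None:
        raise ValueError("historic record must contain newmodule and newid")

    # Link the course module to the new instance; the cmid itself is unchanged.
    cm = db["course_modules"][historic["cmid"]]
    cm["module"] = historic["newmodule"]
    cm["instance"] = historic["newid"]

    # Delete the old instance record from the resource table.
    db["resource"].pop(historic["oldid"], None)

    # Update log entries of this course module to the new module name.
    for entry in db["log"]:
        if entry["cmid"] == historic["cmid"] and entry["module"] == "resource":
            entry["module"] = historic["newmodule"]

    return historic["course"]
```

Because the cmid is preserved, nothing keyed on the course module (context, section placement) needs to change, which is what the list below relies on.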
Following items are not required in case of Resource module:
- no need to upgrade gradebook - no grades or outcomes in current resources
- no need to change permissions - the same cmid means the same context
- no need for capability migration - resource does not define any capabilities, only mod/resource:exportresource added in 2.0dev
- old section placement retained, because it is specified in course_modules table
- no need to migrate files, no files in areas yet
Module upgrade steps
mod/resource/db/upgrade.php
- change resource table structure - we do not need to keep any data because copy of everything is kept in resource_historic table
- mark all records in resource table as not migrated
- search for migration candidates in resource_historic table - original type file, link pointing to course file from the same course
- find out old setting from historic record
- update resource table record and mark the instance as active
- copy linked course file to new file area, mark this file as main file
- enable #On-demand migration of files if file may contain relative links (whitelist image types and some other binary types)
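
The last step, deciding whether to enable on-demand migration, might look like the sketch below (Python, for illustration only). The extension whitelist is purely hypothetical, and so is the choice of storing 1 (off) rather than 0 for whitelisted files; only the meaning of the legacyfiles values is taken from the resource table above.

```python
import os

# Values of the legacyfiles field as described in the resource table.
LEGACYFILES_NONE = 0  # no legacy files
LEGACYFILES_OFF = 1   # on-demand migration off
LEGACYFILES_ON = 2    # on-demand migration on

# Hypothetical whitelist: formats assumed unable to load sibling files.
NO_RELATIVE_LINKS = {".png", ".jpg", ".jpeg", ".gif", ".zip", ".mp3"}


def initial_legacyfiles(main_file: str) -> int:
    """Pick the legacyfiles value for an upgraded resource from its main file."""
    ext = os.path.splitext(main_file)[1].lower()
    # Anything not whitelisted (html, flash, unknown types) may contain
    # relative links, so on-demand migration is switched on to be safe.
    return LEGACYFILES_OFF if ext in NO_RELATIVE_LINKS else LEGACYFILES_ON
```

Using a whitelist rather than a blacklist means unknown file types default to the safe setting.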
mod/url/db/install.php
- search for migration candidates in resource_historic table - type file with URLs outside of current course files
- find out all new settings from existing historic record
- create new instance in url table
- update resource_historic
- call switch_resource_type($resource_historic)
mod/folder/db/install.php
- search for migration candidates in resource_historic table - type directory
- create new instance in folder table
- update resource_historic
- call switch_resource_type($resource_historic)
- recursively copy the chosen course directory to the folder file area
mod/page/db/install.php
- search for migration candidates in resource_historic table - type html and text
- find out all new settings from existing historic record
- create new instance in page table
- update resource_historic
- call switch_resource_type($resource_historic)
- search for any absolute links in case of html text and copy them to new page file area keeping the same path relative to course files root
- enable #On-demand migration of files if any copied file above may contain relative links
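
The search for absolute links could be sketched like this (Python, illustration only). It assumes old course file links have the form $CFG->wwwroot/file.php/COURSEID/path; other link forms, such as file.php?file=..., are not covered, and the function name is hypothetical.

```python
import re


def course_file_links(html: str, wwwroot: str, courseid: int) -> list:
    """Return course-relative paths of course files linked from html.

    Only src/href attributes pointing at wwwroot/file.php/courseid/... are
    matched. The returned paths are relative to the course files root, so
    they can be reused unchanged as paths in the new page file area.
    """
    prefix = re.escape(f"{wwwroot}/file.php/{courseid}/")
    pattern = re.compile(
        r'(?:src|href)\s*=\s*["\']' + prefix + r'([^"\'#?]+)',
        re.IGNORECASE,
    )
    paths = []
    for path in pattern.findall(html):
        if path not in paths:  # keep first occurrence, drop duplicates
            paths.append(path)
    return paths
```

Links to other courses or other sites are left alone, since those files do not belong to the page being migrated.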
mod/imspackage/db/install.php
- search for migration candidates in resource_historic table - type ims
- find instance setting
- create new instance in imspackage table
- update resource_historic
- call switch_resource_type($resource_historic)
- recursively copy moddata files into ims file area
- copy original backup of package file to another new ims file area
Others
Local files are a special form of file type, the reference starts with RESOURCE_LOCALPATH.
Hive repositories could be converted to simple URL pages.
Contrib types do not need to be migrated at the same time. It will be possible to migrate them independently because the first upgrade step in the core upgrade.php script copies all existing data into the resource_historic table. New modules are responsible for deleting or renaming any extra db tables they previously created. The new mod/newmodule/db/install.php script is the expected migration point.
On-demand migration of files
Some new resource modules may contain relative links to other files originally stored in the course files area. We may not be able to detect all these relatively linked files during the migration/upgrade and copy them to the new file area. When the browser opens a migrated file (served via $CFG->wwwroot/pluginfile.php) and finds a relative link, it first constructs an absolute URL (again starting with $CFG->wwwroot/pluginfile.php) and requests it from the server. pluginfile.php then locates our new resource module; the module instance looks for the file in its own new file area, and if it does not exist there, it tries to find the forgotten file in the original course files. Any forgotten files are migrated to the new file area on demand.
The on-demand migration will be implemented only in several new modules (mod/resource and mod/page), others either know all required files or there can not be any relative links.
The on-demand flag legacyfiles is stored in the module instance table; all new instances created in 2.0 have it disabled. Upgraded instances have a modedit setting which turns it on/off as needed. The migration/upgrade process tries to detect problematic resource content and automatically turns on the on-demand migration:
- html files possibly containing images, css, applets and javascript
- flash applets containing relative links to media (eg sounds or images) and data files
- java applets containing relative links to media
- javascript that loads media via relative links (eg imagine an auto-generated slideshow)
We still want to move all these legacy files into the correct resource filearea, let's try to explain it once more ;-)
- When a browser requests to view one of these legacy HTML files, it needs to load all the JS and Java and Flash it contains.
- These small programs run and need to access other files. They do not know the absolute URLs, so they construct them from their own absolute URL and the specified relative links.
- These requests to Moodle will hit the pluginfile.php script
- The pluginfile.php script checks that the file doesn't exist yet in the resource file area, and if so:
- pluginfile.php uses the relative path to copy the original file from the courses area to the resource file area
- pluginfile.php updates the legacyfileslast field for the resource to be the current timestamp, indicating that some processing has happened.
In any case the on-demand migration is relatively cheap, we do not need to turn it off for performance reasons - it is just one wasted db query when resource contains incorrect relative links. The cost of migration itself is not significant because it is done only once.
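
The request handling described above can be modelled in a few lines. This is a Python sketch for illustration only: the file areas are plain dicts keyed by relative path, the function name is hypothetical, and the legacyfiles/legacyfileslast fields are the ones from the instance tables on this page.

```python
import time
from typing import Optional


def serve_file(instance: dict, file_area: dict, course_files: dict,
               relpath: str, now: Optional[int] = None) -> Optional[bytes]:
    """Return file content, migrating it from course files on demand."""
    if relpath in file_area:
        return file_area[relpath]          # already in the new file area
    if instance["legacyfiles"] != 2:
        return None                        # on-demand migration not enabled
    if relpath not in course_files:
        return None                        # bad relative link: one wasted lookup
    # Copy the forgotten file into the module's own file area, same path.
    file_area[relpath] = course_files[relpath]
    # Record when the last on-demand migration happened.
    instance["legacyfileslast"] = int(time.time()) if now is None else now
    return file_area[relpath]
```

The second request for the same file is served straight from the new file area, which is why the cost of the migration itself is paid only once.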
The problem is that we don't really know when the resource migration is completely "finished" (ie every single associated file has been copied into the new file area), so we can't automatically turn off on-demand migration after this process. That's why the manual post-upgrade script described below may help in some situations.
There are cases when we need to know list of files belonging to each resource:
- backup
- import resource instances into other courses
Since we have a timestamp in there we COULD consider a cron job which detects completely migrated resources - resources with a lot of recent activity and no on-demand migration for a long time. Alternatively we may use these conditions to propose finished migrations in some admin report or post-upgrade script.
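
Such a heuristic might be expressed as below (Python, illustration only). Both thresholds are invented for the example, and treating a missing timestamp as "nothing was ever migrated on demand" is an assumption.

```python
from typing import Optional


def migration_looks_finished(recent_views: int, legacyfileslast: Optional[int],
                             now: int, min_views: int = 20,
                             quiet_seconds: int = 30 * 24 * 3600) -> bool:
    """Guess whether a resource has all of its files in the new file area.

    The resource must have been viewed often enough recently, and no file
    may have been migrated on demand during the last quiet_seconds.
    """
    if recent_views < min_views:
        return False  # too little activity to conclude anything
    if legacyfileslast is None:
        return True   # viewed a lot, yet nothing ever needed migrating
    return now - legacyfileslast >= quiet_seconds
```

The result is only a proposal; an admin report could list such resources and let a person confirm before on-demand migration is switched off.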
Post-upgrade processing
After the Moodle 2.0 upgrade has completed we take the admin to a special page that:
- explains the issue of resources that contain links to other files
- explains how it can be fixed by on-demand migration (see above)
- presents an option for "automatic parsing" which tries to automatically migrate simpler resources such as HTML files with links to HTML files, images, CSS
- presents a list of remaining flagged resources as links that need to be visited manually in the browser to be fixed
I imagine this interface might have a small frame on the left with the main pages being loaded in the larger frame on the right. The left frame has controls and info like:
- total number of pages to process
- next and previous buttons
- some button to say "I have completely viewed this resource"
So the admin will have to sit there and go through all these resources and check them off to complete the upgrade. Until this process is finished none of these operations can be guaranteed to work so we might even think about disabling those operations:
- Backup
- Restore
- Import of a course
Normal viewing of resources WILL work fine for teachers and students, because it will just trigger the normal on-demand migration of files.
Candidates for removal from core
List of obsoleted or unmaintained features proposed for removal from core
hive sso
Located in /sso/hive/*. Unmaintained and obsoleted by auth SSO plugins and Moodle networking. It should be possible to implement new auth/hivesso and maintain it in contrib. Any volunteers?
hive repositories
Unmaintained. Any volunteers?
IMS repository
Obsoleted by regular repositories. Maybe a new core IMS repo plugin.
file/localfile.php
No maintainer. Intended for large files stored on local networks or CDs, obsoleted by faster Internet connections and various offline Moodle projects. Volunteers may create a new plugin and maintain it in contrib.
weblinkauth hook
Obsoleted during the auth refactoring in 1.8dev. Volunteers may continue maintaining it as new contrib auth plugin.
Other ideas
Dealing with caching issues
To have more control over the caching of files, we could use the same mechanism already implemented in the SCORM module, by changing the URL:
- Internally store all files as itemid=0
- Add new revision db field into resource tables
- When serving files always construct links with revision as itemid
- Ignore itemid value for resource files in pluginfile.php
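
A sketch of this cache-busting scheme in Python, for illustration only. It assumes pluginfile.php URLs are laid out as /contextid/filearea/itemid/path; the file area name used in the example and both function names are hypothetical.

```python
def resource_file_url(wwwroot: str, contextid: int, filearea: str,
                      revision: int, path: str) -> str:
    """Build a link whose itemid slot carries the current revision."""
    return f"{wwwroot}/pluginfile.php/{contextid}/{filearea}/{revision}/{path}"


def stored_location(url_path: str) -> tuple:
    """Map the part after pluginfile.php to the stored file location.

    The itemid from the URL is ignored: files are always stored as itemid=0,
    so every revision number resolves to the same stored file.
    """
    contextid, filearea, _itemid, path = url_path.strip("/").split("/", 3)
    return (int(contextid), filearea, 0, path)
```

When a file changes, the revision is incremented, all generated links change, and browsers fetch the new content even though old URLs were cached.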
Module archetypes
One of the reasons for resource plugins was to visually group all resources in user interface - separate drop downs when adding activities, grouping of resources in Activities block, mod/resource/index.php page, reports and overviews, etc. Unfortunately this solution excluded contrib resource-like modules and in general was not flexible enough.
There are several possible ways how to find out activity types:
- new field archetypes in modules table - benefits are performance, ability to change types easily
- modulename_plugin_supports() - new core API in 2.0, each plugin may indicate various supported features
Opening of new windows
At present there is an option to force opening of resource in new window with a lot of detailed options, I think it is too complex for majority of our users. This is going to be controversial, not really sure how to improve this, please post in the page comments ;-)
Known uses of new windows are:
- leaving of Moodle site - when clicking on external links we do not want users to leave our site, they might have trouble finding way back to us
- loss of navigation elements - if you open pdf or image directly in browser you need to click on Back button to return to site
- course design - some designers want flash applets to open in small new windows without chrome (annoying for many users, not accessible at all)
- copy protection - this is plain silly, anything in browser can be downloaded easily
The concept of forced new windows is considered obsolete by authors of web standards and accessibility guidelines:
- there is no support for target property in xhtml strict, we have to use ugly JS tricks to work around validation problems
- not recommended by accessibility experts
- removing browser UI elements such as address/menu/status via javascript may be considered evil - browser interfaces are not cluttered with useless dead space anymore, screens are larger, F11 full-screen support is available in all browsers, and opening in a new tab ignores these directives
The mockups above contain simplified new window controls on modedit pages.
Embedding of files
TODO: describe xhtml strict problems, embedding, size autodetection, code duplication in multimedia filter plug-in, etc.
Filtering of uploaded html files
Missing navigation is one of the biggest problems when using content created in external authoring tools. One of the possible solutions is to apply special filtering to each html file in the resource file area. This does not have to be implemented as a regular Moodle text filter; instead it can be implemented directly in the resource_pluginfile() function.
The general idea is to search for markers in html file (special html comments, divs, etc.) and replace them with current navigation elements, course module info, course name, current user information on-the-fly. We have to make sure caching is prevented when using session or current user information.
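
The marker replacement could work along these lines (Python, illustration only). The marker syntax `<!-- moodle:name -->`, the marker names and the set of user-specific markers are all invented for this sketch; the page does not define them.

```python
import re
from html import escape
from typing import Tuple

# Hypothetical marker syntax: an html comment such as <!-- moodle:coursename -->
MARKER = re.compile(r"<!--\s*moodle:(\w+)\s*-->")

# Hypothetical markers whose values depend on the current user or session.
USER_SPECIFIC = {"username"}


def fill_markers(html: str, values: dict) -> Tuple[str, bool]:
    """Replace known markers in html; return (new html, cacheable flag)."""
    used = set()

    def replace(match):
        key = match.group(1)
        if key not in values:
            return match.group(0)      # unknown marker: leave untouched
        used.add(key)
        return escape(str(values[key]))

    result = MARKER.sub(replace, html)
    # Output containing user data must not be cached by browsers or proxies.
    cacheable = not (used & USER_SPECIFIC)
    return result, cacheable
```

The caller would send no-cache headers whenever the returned flag is False.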
This approach was not possible in 1.9.x because we could not determine current resource instance from file.php parameters.
This feature could be implemented in mod/resource, mod/imspackage and mod/scorm.
See also
- Development:Using the file API
- Development:Repository API
- Development:Portfolio API
- Development:File API
- MDL-14589 - File API Meta issue
- MDL-16089 - Resource module migration