Repository API

From MoodleDocs
Revision as of 22:17, 28 February 2008

This page describes the specification for a future feature, currently being worked on for Moodle 2.0. This spec is STILL UNDER CONSTRUCTION.

Objectives

  1. Allow files to be added directly into Moodle (as we do now)
  2. Allow all Moodle users to easily bring content into Moodle from external repositories
  3. Allow content to be used in multiple Moodle contexts securely and simply via capabilities
  4. Consistency and simplicity for ALL file handling

Overview

The Repository API is a core set of interfaces that all Moodle code will use to:

  1. copy files from external servers
  2. store files within Moodle
  3. display files to Moodle users

It's important to remember that a repository will generally be treated as READ-ONLY. Management of the files will normally be done through the native interface provided by the repository. Publishing of Moodle content TO a repository is handled by the Portfolio API.

A typical user story:

  1. User wants to add a new resource to a course
  2. User clicks the "Choose a resource" button
  3. User is able to choose from a list of configured repositories (this step will be skipped if there's only one).
  4. User is presented with a simple file picker to choose a file
  5. User chooses a file
  6. File is COPIED into Moodle and included in the course.
  7. File is marked as owned by that user
  8. Access controls are automatically added for that file so that only those with privileges to see that course can see that file (the owner can change those permissions anytime)


When selecting the document, the user may have options to

  • return only the URL to the file, if it's desired to keep it external (but this does present security and integrity risks)
  • refresh the local file copy regularly and automatically
  • just leave it as a once-off copy (default)
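
The three options above could be modelled roughly as follows. This is only an illustrative Python sketch, not Moodle code: the names ImportMode and plan_import, and the shape of the returned dictionary, are assumptions made for this example.

```python
from enum import Enum


class ImportMode(Enum):
    """Hypothetical names for the three options offered to the user."""
    LINK_ONLY = "link"        # keep the file external, store only its URL
    AUTO_REFRESH = "refresh"  # copy the file and re-import it periodically
    COPY_ONCE = "copy"        # copy the file once (the default)


def plan_import(mode=ImportMode.COPY_ONCE, refresh_seconds=None):
    """Return what Moodle would need to do for the chosen option."""
    if mode is ImportMode.LINK_ONLY:
        return {"copy": False, "refresh_seconds": None}
    if mode is ImportMode.AUTO_REFRESH:
        if not refresh_seconds or refresh_seconds <= 0:
            raise ValueError("AUTO_REFRESH needs a positive refresh period")
        return {"copy": True, "refresh_seconds": refresh_seconds}
    return {"copy": True, "refresh_seconds": None}
```

Calling plan_import() with no arguments gives the once-off copy, matching the default named in the list.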

General Architecture

All file-handling areas in Moodle (eg adding a new resource, adding attachments to a forum post, uploading assignments) will be rewritten to talk to the standard API class methods in a standard way.

Each repository plugin (a standard Moodle plugin stored under /repository/xxx) will subclass the standard API and override methods specific to that repository.

As is usual in Moodle, there will be admin settings to disable/enable certain repository plugins as standard, as well as user settings so that users can add their own personal repositories to the standard list (eg Yahoo Briefcase or Google Docs) and to select their default repository.

Once a repository has been used the file will usually be copied into Moodle.

All files in Moodle will be listed in a table (see below) allowing us to store various metadata about each file. The file contents will not be in the database (though we could easily offer that option if we want to); they will be on disk with a name related to the id rather than the "human" name (this avoids a lot of OS Unicode problems).
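
As an illustration of an on-disk name derived from the id rather than the human name, here is a minimal Python sketch. The spec does not define a layout: the "filedir" directory, the nine-digit padding and the two levels of sharding are all assumptions chosen for this example.

```python
from pathlib import PurePosixPath


def storage_path(dataroot, file_id):
    """Build an ASCII-only path for a stored file from its numeric id.

    Assumed layout: <dataroot>/filedir/<3 digits>/<3 digits>/<9-digit id>.
    The sharding keeps any single directory from growing too large.
    """
    if file_id < 0:
        raise ValueError("file id must be non-negative")
    padded = f"{file_id:09d}"
    return PurePosixPath(dataroot) / "filedir" / padded[0:3] / padded[3:6] / padded
```

The Unicode filename stays in the database only, so the operating system never sees it.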

The current "course file manager" will be replaced by a personal file manager, which basically is a user's view of the internal Moodle repository, showing the files that are available to you in that context (which may include files from other people, repositories etc) and a nice browse/search interface.

Finally, normal Moodle modules will have easy functions they can use to add/remove permissions to particular files, according to module rules. For example, the assignment plugin may, after allowing a student to select a file to be submitted, add permissions so that people who have grade permissions in that assignment can read it.

Repository requirements

From the Moodle point of view, each repository is just a hierarchy of nodes.

The repository MUST provide:

  1. A URL to download each node (eg file).
  2. A list of the nodes (eg files and directories) under a given node (eg directory). This allows Moodle to construct a standard browse interface (much like a standard OS file picker). However, some repository plugins may choose to completely override the repository_browse() method and implement their own interface. That's OK, as long as they end up with a URL for the file.

The repository can OPTIONALLY:

  1. Require some authentication credentials
  2. Provide more metadata about each node (mime type, size, dates, related files, dublin core stuff, etc)
  3. Describe a search facility (so that Moodle can construct a search form)
  4. Provide copyright and usage rules (or just information about the rules)
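
A "hierarchy of nodes" that meets the two mandatory requirements, with optional metadata, might look like the Python sketch below. The Node class, its field names, the find and listing helpers, and the repo.example URLs are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    url: str                      # mandatory: where the node can be downloaded
    is_dir: bool = False
    children: List["Node"] = field(default_factory=list)   # mandatory listing
    metadata: Dict[str, str] = field(default_factory=dict)  # optional extras


def find(root, path):
    """Walk from the root node down a slash-separated path."""
    node = root
    for part in [p for p in path.split("/") if p]:
        matches = [c for c in node.children if c.name == part]
        if not matches:
            raise KeyError(path)
        node = matches[0]
    return node


def listing(root, path="/"):
    """Return (name, is_dir) pairs, enough to draw a simple file picker."""
    return [(c.name, c.is_dir) for c in find(root, path).children]


syllabus = Node("syllabus.pdf", "https://repo.example/docs/syllabus.pdf",
                metadata={"mimetype": "application/pdf"})
docs = Node("docs", "https://repo.example/docs/", True, [syllabus])
root = Node("", "https://repo.example/", True, [docs])
```

With this tree, listing(root, "/docs") yields the single file entry, and find(root, "/docs/syllabus.pdf").url is the URL Moodle would download from.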

Repository plugins

Some plugins I'd like to see developed for the first version are:

  • local - very similar to the current course-based file manager, except user-based
  • moodle - an interface to another Moodle site, accessed over a secure mnet connection
  • jsr170 - an interface that can talk to anything that supports JSR 170 (eg Alfresco)
  • oki - an OKI emulator allowing us to access things with OKI interfaces, like Fedora
  • briefcase - an interface to Yahoo Briefcase
  • myspace - an interface to MySpace files (perhaps via the MySpace API)
  • googledocs - an interface to Google Docs
  • skydrive - an interface to Microsoft's SkyDrive files
  • facebook - an interface to Facebook files
  • merlot - an interface to the learning materials in Merlot.org
  • flickr - an interface to flickr
  • youtube - an interface to YouTube
  • mahara - an interface to a Mahara installation

Local Files

In general, all external files will be copied locally and stored in Moodle. This section describes the storage of the files and how we define ACLs (access control lists) for them. All existing files in the Moodle dataroot course areas will be moved into this new system during the upgrade.

Tables

repository

This table contains one entry for every configured external repository instance.

Field | Type | Default | Info
id | int(10) | | autoincrementing
repositoryname | varchar | | A custom name for this repository (non-unique)
repositorytype | varchar | | The name of the plugin being used
userid | int(10) | | The person who created this repository instance
contextid | int(10) | | The context that this repository is available to ( = system context for site-wide ones)
username | varchar | | username to log in with, if required
password | varchar | | password to log in with, if required
option1 | varchar | | Other information useful to the plugin
option2 | varchar | | Other information useful to the plugin
option3 | varchar | | Other information useful to the plugin
option4 | varchar | | Other information useful to the plugin
option5 | varchar | | Other information useful to the plugin
timecreated | int(10) | | The time this repository was created
timemodified | int(10) | | The last time the repository was modified


file

This table contains one entry for every file. Enough information is kept here so that the file can be fully identified and retrieved again if necessary.

Field | Type | Default | Info
id | int(10) | | autoincrementing
userid | int(10) | | The owner of the file (person who created this entry)
filename | varchar | | The full Unicode name of this file
repositoryid | int(10) | | The repository instance this is associated with
updates | int(10) | | Specifies the update schedule (0 = none, 1 = on demand, other = some period in seconds)
repositorypath | text | | The full path to the file on the repository
timeimportfirst | int(10) | | The first time this file was imported into Moodle
timeimportlast | int(10) | | The most recent time that this file was imported into Moodle
timecreated | int(10) | | The time this file was created (if known), otherwise same as time imported
timemodified | int(10) | | The last time the file was modified
timeaccessed | int(10) | | The last time this file was accessed for any reason
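
The updates field packs three behaviours into one integer. A minimal Python sketch of how a refresh check might read it follows; the function name refresh_due and its requested flag are assumptions for this example, and times are taken to be Unix timestamps.

```python
def refresh_due(updates, timeimportlast, now, requested=False):
    """Decide whether a file should be re-imported from its repository.

    updates == 0  -> never refresh
    updates == 1  -> refresh only when someone asks for it
    otherwise     -> refresh once `updates` seconds have passed since
                     the last import
    Note: with this encoding a literal one-second period cannot be
    expressed, because the value 1 is reserved for "on demand".
    """
    if updates == 0:
        return False
    if updates == 1:
        return requested
    return now - timeimportlast >= updates
```

A cron job could call this for each file row and re-fetch repositorypath from the repository whenever it returns true.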


file_instances

This table contains one entry for every "place" a file is published to. For example, one file might appear in an assignment but also in a forum attachment, so there would be two entries here.

Field | Type | Default | Info
id | int(10) | | autoincrementing
fileid | int(10) | | The file we are defining access for
instancetype | varchar | | This defines the table in Moodle that this instance is associated with (eg 'forum_posts', 'assignment_submissions' etc)
instanceid | int(10) | | The id in the foreign table (of name instancetype) that this instance is associated with
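
To make the one-file, two-places example concrete, here is a small Python sketch using an in-memory SQLite database. Only a subset of the columns is created, the SQLite types stand in for the Moodle ones, and the ids and the filename are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    userid INTEGER NOT NULL,
    filename TEXT NOT NULL
);
CREATE TABLE file_instances (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    fileid INTEGER NOT NULL REFERENCES file(id),
    instancetype TEXT NOT NULL,   -- name of the foreign table
    instanceid INTEGER NOT NULL   -- id of the row in that table
);
""")

# One stored file, owned by user 100.
cur = conn.execute(
    "INSERT INTO file (userid, filename) VALUES (?, ?)", (100, "essay.odt"))
fileid = cur.lastrowid

# The same file is published in an assignment and in a forum post.
conn.executemany(
    "INSERT INTO file_instances (fileid, instancetype, instanceid) "
    "VALUES (?, ?, ?)",
    [(fileid, "assignment_submissions", 12), (fileid, "forum_posts", 34)])

places = conn.execute(
    "SELECT instancetype, instanceid FROM file_instances "
    "WHERE fileid = ? ORDER BY id", (fileid,)).fetchall()
```

Here places holds two rows for the single file row, which is the relationship the table is meant to capture.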

file_access

This table describes the ACL for each file, so that checks can easily be made on whether someone can see this file or not. Note there can be multiple entries per file. Users can ALWAYS see their own files, so there are no entries here for that.

Field | Type | Default | Info
id | int(10) | | autoincrementing
fileid | int(10) | | The file we are defining access for
contextid | int(10) | | The context where this file is being published
capability | text | | The capability that is required to see this file


Class methods

Repository class

This class implements the interface to a particular repository, for browsing, selecting and updating files.

get_file($path)

get_listing($parent='/', $search='')

cron()

etc
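
The relationship between the standard API and a plugin that subclasses it could be sketched as below. This is Python for illustration only, not the eventual Moodle class: the method names follow the list above, while InMemoryRepository, its constructor and the return types are assumptions.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Base interface that every repository plugin would subclass."""

    @abstractmethod
    def get_file(self, path):
        """Return the contents of the file at `path`."""

    @abstractmethod
    def get_listing(self, parent="/", search=""):
        """Return the names directly under `parent`, filtered by `search`."""

    def cron(self):
        """Periodic housekeeping; plugins override it when they need it."""


class InMemoryRepository(Repository):
    """Toy plugin backed by a dict of path -> bytes."""

    def __init__(self, files):
        self.files = dict(files)

    def get_file(self, path):
        return self.files[path]

    def get_listing(self, parent="/", search=""):
        prefix = parent if parent.endswith("/") else parent + "/"
        names = set()
        for path in self.files:
            if path.startswith(prefix):
                # Keep only the first path component below `parent`.
                name = path[len(prefix):].split("/", 1)[0]
                if search.lower() in name.lower():
                    names.add(name)
        return sorted(names)


repo = InMemoryRepository({
    "/docs/a.txt": b"A",
    "/docs/b.txt": b"B",
    "/readme.txt": b"R",
})
```

Calling code only ever talks to the Repository interface, so swapping this toy plugin for one backed by an external service would not change the caller.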

File class

This class implements the display and management of files from local storage, with full access checking. Some of the functions are for single files, while some are optimised for bulk display and searching (eg in the personal files interface).

display_file()

Similar to what file.php does now, except smarter

set_access($fileid, $accessstuff)

Grants people some access to a file

has_access($fileid, $userid=NULL)

Returns true or false depending on access to a file
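
One possible reading of set_access and has_access, based on the file_access table and the rule that owners can always see their own files, is sketched here in Python. The extra parameters (owners, acl, has_capability) replace database lookups and Moodle's capability check; they and the sample ids are assumptions for this example.

```python
def set_access(acl, fileid, contextid, capability):
    """Add one ACL entry (a file_access row) unless it already exists."""
    entry = (fileid, contextid, capability)
    if entry not in acl:
        acl.append(entry)


def has_access(fileid, userid, owners, acl, has_capability):
    """Return True if the user owns the file or satisfies any ACL entry."""
    if owners.get(fileid) == userid:
        return True  # owners can always see their own files
    return any(
        has_capability(userid, capability, contextid)
        for fid, contextid, capability in acl
        if fid == fileid
    )


# Sample data: file 7 is owned by user 100 and published in context 50.
owners = {7: 100}
acl = []
set_access(acl, 7, 50, "mod/assignment:grade")

# Stand-in for the real capability check: user 200 may grade in context 50.
grants = {(200, "mod/assignment:grade", 50)}


def user_has_capability(userid, capability, contextid):
    return (userid, capability, contextid) in grants
```

In this sample the student (100) and the grader (200) can read file 7, while an unrelated user cannot.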

Areas in Moodle that need re-writing

Examples of use

Potential problems and limitations

Missing concept of trusted files

Not all files are created equal: some can be trusted, some cannot be trusted at all. Web browsers trust everything received from the server, so files served from it may access cookie information, and scripting technologies may thus allow them to do anything the user can do. We do have to trust our teachers because they are supposed to create the learning content, but we definitely cannot trust all students.

Imagine if students were allowed to upload arbitrary files to the server, such as an HTML file loaded with JavaScript, and the server happily served them to all Moodle users. Our solution is to use HTML cleaning filters for submitted texts and to force downloads of student-uploaded files. To do this we must know whether we trust the files or not. Unfortunately, forced downloads and HTML cleaning do not solve all problems, because bugs in browsers and especially browser plug-ins may sometimes be used to work around our protections.

The best solution would be to use separate web addresses for trusted and untrusted files (two wwwroots in config.php). Not all sites can afford two different addresses, but in my opinion we should be prepared for this.
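
A rough Python sketch of how the two-address idea could influence URL generation follows. Everything here is assumed for illustration: the function name file_url, the file.php/<id>/<name> URL shape, the forcedownload query parameter and the example hostnames.

```python
from urllib.parse import quote


def file_url(fileid, filename, trusted, wwwroot, untrusted_wwwroot=None):
    """Build a download URL, keeping untrusted files off the main address.

    Trusted files are served from the main wwwroot. Untrusted files use
    the second wwwroot when the site has one, and are always marked for
    forced download as a fallback protection.
    """
    if trusted:
        base, query = wwwroot, ""
    else:
        base = untrusted_wwwroot or wwwroot
        query = "?forcedownload=1"
    return f"{base.rstrip('/')}/file.php/{fileid}/{quote(filename)}{query}"
```

Because the second address is a different origin, a script inside an untrusted file would not see the cookies of the main site.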

Relative file links

Flash, Java and SCORM require relative links and a directory hierarchy in general, so we must support them. Some SCORM packages load hundreds of files per page, which means file serving must be very fast, with a minimum of database access.

Reading the proposal above, it seems the API is about serving isolated files referenced by repository ids. HTML requires relative or absolute locators with file names, so we cannot use repository ids directly in relative links. In the case of SCORM, we have an absolute path to the base of the SCORM package for a given activity, and SCORM files use relative links inside the package. A solution could be to store relative paths directly in the filename field, eg directory1/directory2/filename.ext.
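
If relative paths were stored in the filename field as suggested, resolving a link found inside a package file could work roughly like this Python sketch. The function name, the package_files mapping and the ids in it are hypothetical.

```python
import posixpath


def resolve_relative(current_file, link):
    """Resolve `link`, found inside `current_file`, to a stored filename.

    Both values are paths relative to the package base. Links that are
    absolute or that climb above the package base are rejected.
    """
    current_dir = posixpath.dirname(current_file)
    candidate = posixpath.normpath(posixpath.join(current_dir, link))
    if candidate == ".." or candidate.startswith("../") or candidate.startswith("/"):
        raise ValueError(f"link escapes the package: {link}")
    return candidate


# filename field -> file id, for one package (sample data).
package_files = {
    "directory1/index.html": 41,
    "directory1/directory2/filename.ext": 42,
}
```

A single indexed lookup on the resolved filename then finds the file, which helps keep database access per request low.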

Virtual directories and files

Sometimes the content of files is generated on the fly (CSV exports, etc.). There are many special files spread through the codebase doing nearly the same thing; in my opinion it should be possible to use the same file API for these.

Another virtual example is assignment submissions and WebDAV. I would like to see an option to browse the assignment submissions as a directory structure: the top directory would be a list of user names, each containing HTML files with the online assignment and the uploaded files. This would allow us to implement simple zip&go or WebDAV-based offline grading solutions. The problem here is that the content of this virtual submissions directory needs to be created on the fly based on user references, and the proposed repository structure cannot be used for this.
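
The virtual submissions directory described above could be generated on the fly along these lines. This Python sketch assumes a simple in-memory shape for submissions and a fixed name, online.html, for the online text; neither comes from the spec.

```python
def virtual_listing(submissions, path="/"):
    """List a virtual directory built from assignment submissions.

    `submissions` maps a user name to a dict with an optional
    "online_text" string and a "files" list of uploaded file names.
    "/" lists the users; "/<user>" lists that user's entries.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        return sorted(submissions)
    if len(parts) == 1:
        submission = submissions[parts[0]]
        names = list(submission.get("files", []))
        if submission.get("online_text") is not None:
            names.append("online.html")  # generated, never stored on disk
        return sorted(names)
    raise KeyError(path)


submissions = {
    "alice": {"online_text": "<p>My answer</p>", "files": ["essay.odt"]},
    "bob": {"online_text": None, "files": ["report.pdf", "data.csv"]},
}
```

Nothing in the listing corresponds to a row in the file table until a client actually requests an entry, which is why a purely id-based repository structure does not cover this case.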

Backup/restore relinking

We are supporting relinking inside only. Until now it was easy to guess whether an absolute link would work after a restore on another server. There are several types of files:

  • course files - relinked during restore, works on any server if from the same course
  • module files - not relinked, URLs are not permanent, links cannot be copy-pasted (assignments, forum attachments, rss files, etc.)
  • user files - not relinked, links work only on original server (blog attachments, personal files, etc.)

Access needed for both file and its instances

We need two types of access control - first who can create instances (link files), second who can access the instance.