Development talk:Repository API
(Ideas will be deleted from the comments section as they are resolved or merged into the main spec)
Missing concept of trusted files
Files are not created equal: some can be trusted and some cannot be trusted at all. Web browsers trust everything received from the server; files served from it may access cookie information, so scripting technologies may allow them to do anything the user can do. We have to trust our teachers because they are supposed to create the learning content, but we definitely cannot trust all students.
Imagine if students were allowed to upload arbitrary files to the server, such as an HTML file loaded with JavaScript, and the server happily served them to all Moodle users. Our solution is to use HTML cleaning filters for submitted texts and to force downloads of student-uploaded files. To do this we must know whether we trust the files or not. Unfortunately, forcing downloads of student-uploaded files and cleaning HTML texts do not solve all problems, because bugs in browsers and especially browser plug-ins may sometimes be used to work around our protections.
The best solution would be to use separate web addresses for trusted and untrusted files (two wwwroots in config.php). Not all sites can afford two different addresses, but IMO we should be prepared for this. Petr Škoda (škoďák) 16:42, 28 February 2008 (CST)
Great idea, yes. In fact couldn't all the files be served via $CFG->fileroot all the time? Martin Dougiamas 07:08, 29 February 2008 (CST)
The userid field could be used for this, but a separate flag might be better in order to allow teachers to upload untrusted files (for example, a teacher uploads an assignment submission on behalf of a student).
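As a rough illustration of the trusted/untrusted distinction discussed in this section, here is a minimal sketch in Python (not Moodle code). The `StoredFile` record, its `trusted` flag and the `response_headers` helper are all hypothetical names invented for this example; the only idea taken from the discussion is that untrusted files get a forced download.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    filename: str
    mimetype: str
    trusted: bool  # hypothetical per-file flag, separate from the uploader's userid


def response_headers(f: StoredFile) -> dict:
    """Pick HTTP headers for serving a file, forcing download when untrusted."""
    if f.trusted:
        # Trusted content may be rendered inline by the browser.
        return {
            "Content-Type": f.mimetype,
            "Content-Disposition": f'inline; filename="{f.filename}"',
        }
    # Untrusted content: generic type, forced download, no MIME sniffing.
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f'attachment; filename="{f.filename}"',
        "X-Content-Type-Options": "nosniff",
    }
```

Keeping the flag on the file rather than deriving it from the uploader is what would let a teacher upload a student's submission without it becoming trusted. As noted above, this does not protect against browser or plug-in bugs; a separate address for untrusted files would still be the stronger measure.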
Relative file links
Flash, Java and SCORM require relative links and a directory hierarchy in general, so we must support it. Some SCORM packages load hundreds of files per page, which means file serving must be very fast with a minimum of DB access.
Reading the proposal above, it seems the API is about serving isolated files referenced by repository ids. HTML requires relative or absolute locators with file names; we cannot use repository ids directly in relative links. In the case of SCORM, we have an absolute path to the base of the SCORM package of the given activity, and SCORM files use relative links inside the package. A solution could be to store relative paths directly in the filename field, e.g. directory1/directory2/filename.ext.
Yes I agree, we definitely need to support (virtual) directories and slasharguments. We could just match the file argument to a path in the db. Perhaps add the fileid to the argument path as a primary key: fileid/directory1/directory2/filename.ext Martin Dougiamas 07:08, 29 February 2008 (CST)
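To make the fileid/directory1/directory2/filename.ext idea concrete, here is a minimal parsing sketch in Python (not Moodle code). The function name `parse_file_argument` is hypothetical, and the sketch assumes the leading id names the base that relative links resolve against (for example one SCORM package), which the discussion above does not settle.

```python
from typing import Tuple


def parse_file_argument(arg: str) -> Tuple[int, str]:
    """Split 'id/dir1/dir2/name.ext' into (id, 'dir1/dir2/name.ext').

    Assumption: the first segment is a numeric id and the rest is a
    virtual path stored as-is in the database.
    """
    head, _, rest = arg.strip("/").partition("/")
    if not head.isdigit() or not rest:
        raise ValueError("expected id/path, got %r" % arg)
    parts = rest.split("/")
    # Reject empty, '.' and '..' segments so a request cannot escape the base.
    if any(p in ("", ".", "..") for p in parts):
        raise ValueError("invalid path segment in %r" % arg)
    return int(head), "/".join(parts)
```

Because the result is a single (id, path) pair, the lookup could be one indexed query per request, which matters for SCORM packages that load hundreds of files per page.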
Virtual directories and files
Sometimes the content of files is generated on the fly (CSV exports, etc.). There are many special files spread through the codebase doing nearly the same thing; IMHO it should be possible to use the same file API for these.
Another virtual example is assignment submissions and WebDAV. I would like to see an option to browse the assignment submissions as a directory structure: the top directory would be a list of user names, each containing HTML files with the online assignment and the uploaded files. This would allow us to implement simple zip&go or WebDAV-based offline grading solutions. The problem here is that the content of this virtual submissions directory needs to be created on the fly based on user references; the proposed repository structure cannot be used for this.
Very interesting idea! Martin Dougiamas 07:08, 29 February 2008 (CST)
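A minimal sketch of the virtual submissions directory, in Python (not Moodle code). The `submissions` mapping of user name to {file name: content} is a stand-in for whatever the module would look up on the fly; `list_virtual_dir` and `zip_submissions` are hypothetical names.

```python
import io
import zipfile


def list_virtual_dir(submissions: dict, path: str = "") -> list:
    """List a virtual directory: user names at the top, file names below."""
    path = path.strip("/")
    if path == "":
        return sorted(submissions)
    if path in submissions:
        return sorted(submissions[path])
    raise FileNotFoundError(path)


def zip_submissions(submissions: dict) -> bytes:
    """Build a 'zip&go' archive with one directory per user, in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for user in sorted(submissions):
            for name in sorted(submissions[user]):
                zf.writestr(user + "/" + name, submissions[user][name])
    return buf.getvalue()
```

Nothing here is stored as a directory anywhere: both the listing and the archive are derived from the per-user data at request time, which is the property the proposed repository tables would need to accommodate.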
Backup/restore relinking
We support relinking inside courses only. Until now it was easy to guess whether an absolute link would work after restore on another server. There are several types of files:
- course files - relinked during restore, work on any server if from the same course
- module files - not relinked, URLs are not permanent, links cannot be copied and pasted (assignments, forum attachments, RSS files, etc.)
- user files - not relinked, links work only on the original server (blog attachments, personal files, etc.)
I think all this will be simplified in the proposed system because everything is represented using the same system: a file with an ACL (backup can quickly determine all the files in one course, probably using one SQL query). Martin Dougiamas 07:08, 29 February 2008 (CST)
Access needed for both file and its instances
We need two types of access control: first, who can create instances (link files); second, who can access an instance (download the file).
For the first don't we already have those capabilities? (like mod/forum:createattachment, moodle/course:managefiles) but they probably could use rationalising. We'll need new ones per repository, too, of course. Martin Dougiamas 07:08, 29 February 2008 (CST)
Cache lifetime
There should be a way to specify the cache lifetime for each instance of a file, for example 0 for uploaded assignments and 1 day for a resource file. It might be better to let modules decide this; at present it is hardcoded in file.php.
Great idea. Martin Dougiamas 07:08, 29 February 2008 (CST)
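A minimal sketch of a per-instance cache lifetime, in Python (not Moodle code). The `MODULE_LIFETIMES` table and the `cache_headers` helper are hypothetical; the two values are the examples given above (0 for uploaded assignments, 1 day for a resource file).

```python
# Hypothetical per-module defaults, in seconds.
MODULE_LIFETIMES = {
    "assignment": 0,          # uploaded assignments: never cache
    "resource": 24 * 60 * 60,  # resource files: one day
}


def cache_headers(lifetime_seconds: int) -> dict:
    """Translate a cache lifetime into an HTTP Cache-Control header."""
    if lifetime_seconds <= 0:
        return {"Cache-Control": "no-cache, no-store, must-revalidate"}
    return {"Cache-Control": "max-age=%d" % lifetime_seconds}
```

With something like this, the file-serving script would only need the lifetime stored with (or computed for) the instance, rather than a single hardcoded value.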
Hierarchy in tables?
It's possible that I don't understand a key piece of the concept, but I wonder why there is no reference to any "parent id" in the file or file_instance table. In other words, how is the hierarchical structure supposed to be "imported" from the repository to Moodle? If the hierarchy is reserved to the course context, and not to the repository context, it seems difficult to allow students to access a complete directory, for example.
On the other hand, maybe it's only a question for the "local" repository type, and not for the "generic" repository API? Allegre Guillaume 16:43, 5 March 2008 (CST)
Editing Repository Files / Version Control?
How do you imagine handling the "editing file" problem? I can see several solutions:
- the simplest way: write access to a file really allows editing (re-uploading) the file, with every instance being modified
- the "cheap copy" way: optionally, the modification is applied only to new file_instances (or those for which the teacher forces it). Here you have to handle two (then maybe more) revisions of this file.
This raises the file revisions (or version control) question.
A related question is about versioned files: should it depend only on the repository layer (for example, a plugin could implement an SVN "repository")? Or should Moodle be aware of the file "revision number"? Allegre Guillaume 17:00, 5 March 2008 (CST)
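The "cheap copy" option can be sketched as follows, in Python (not Moodle code). The `File`, `FileInstance` and `edit` names are hypothetical, and the sketch assumes Moodle itself tracks a revision number per instance, which is exactly the open question raised above.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class File:
    revisions: List[bytes] = field(default_factory=list)

    def add_revision(self, content: bytes) -> int:
        """Store new content and return its revision number."""
        self.revisions.append(content)
        return len(self.revisions) - 1


@dataclass
class FileInstance:
    file: File
    revision: int  # the revision this instance is pinned to

    def content(self) -> bytes:
        return self.file.revisions[self.revision]


def edit(file: File, content: bytes, force_update: Iterable[FileInstance] = ()) -> int:
    """Add a revision; only instances the teacher forces move to it."""
    rev = file.add_revision(content)
    for inst in force_update:
        inst.revision = rev
    return rev
```

The "simplest way" from the list corresponds to passing every existing instance in `force_update`; a repository plugin with its own version control could instead supply the revision numbers itself.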
Replacing Moodle's File System
The API as specified still uses the standard Moodle file system, which provides a simple file access method but has its limitations. Shouldn't we also consider completely (or partially) replacing the Moodle file system with the repository system? The file interface would then be the same to users, but where the files are and how they are accessed would depend on which repository was being used. In this system, the standard Moodle file system would just be one of the available repositories. The API could then support more robust access controls where available, or very few (as Moodle does now). This could allow for per-directory / per-file privilege granting, common file areas so that courses could access the same copy of a file (rather than copying it into the Moodle file area), etc.
The API could also support multiple repositories and allow choosing a file as a link rather than copying it.
Of course, this would also mean allowing write access to the repository area from Moodle, rather than leaving it as read-only. I know the plan was to do this through the Portfolio API, but that really is limited to a user storage function and not a file management function. I think having an API defined to allow for full file management would be a better solution - even if not supported by all repositories.
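A minimal sketch of what "file management as an API, even if not supported by all repositories" might look like, in Python (not Moodle code). All class names are hypothetical, and an in-memory store stands in for the standard Moodle file system.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Common interface; write support is optional per repository."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...

    def write(self, path: str, data: bytes) -> None:
        # Default: repositories are read-only unless they override this.
        raise PermissionError("this repository is read-only")


class MemoryRepository(Repository):
    """Stand-in for a writable store such as the local file area."""

    def __init__(self) -> None:
        self._files = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


class ReadOnlyRepository(Repository):
    """Stand-in for an external repository that only allows reading."""

    def __init__(self, files: dict) -> None:
        self._files = dict(files)

    def read(self, path: str) -> bytes:
        return self._files[path]
```

Calling code would use the same `read`/`write` calls for every repository and handle the error where writing is not available, so the user-facing file interface stays the same regardless of the backend.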