File System API

From MoodleDocs
Revision as of 01:36, 9 February 2017 by Andrew Nicols (talk | contribs) (Created page with "{{Moodle 3.3}} As standard Moodle uses the locally available file system for all files. Since Moodle 3.3 it is possible to extend the file system component of the File Storage...")

Moodle 3.3

As standard, Moodle uses the locally available file system for all files. Since Moodle 3.3 it is possible to extend the file system component of the File Storage API to support alternative file systems.

Introduction

Moodle ships with a file system API which enables the internal Moodle File Storage system to store and retrieve files and file content. The standard file system implementation uses the Moodle filedir, a locally available directory on disk which can be shared between clustered servers via network file systems such as NFS.

Since Moodle 3.3 it is possible to use alternative file systems, including remote file systems. These are easy to set up and configure, and allow for greater scalability which does not depend so heavily upon traditional network file systems.

All files accessed via the standard File Storage API are processed using this API; however, the existing tempdir, cachedir, and localcachedir parameters remain separate.

Defining a new filesystem

All file system implementations must extend the file_system class, and define the required abstract functions.

It is entirely up to the individual implementation how it handles storage, saving, and retrieval of files from its file system; however, certain key concepts apply.
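Once such a class exists, Moodle has to be told to use it. The fragment below is a minimal sketch of how that might look in config.php, assuming the alternative_file_system_class setting described in Moodle's config-dist.php. The plugin name local_myfs and its class are hypothetical placeholders.

```php
// config.php (fragment). The class name is a hypothetical example and must
// refer to a class that extends file_system and can be autoloaded.
$CFG->alternative_file_system_class = '\\local_myfs\\file_system';
```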

Concepts

Moodle File API

The Moodle File API is broken into different components, each having a related but fundamentally separate purpose.

File Storage API

The File Storage API is responsible for all interactions with the rest of Moodle.

Files can be accessed using this API.

Stored File

Any file stored in Moodle's File Storage can be represented as a stored_file.

The stored_file class holds various metadata about the files in the repository.

Content Hash

The Moodle File Storage API performs de-duplication by generating a SHA1 checksum of the content of the file. SHA1 checksums are sufficiently unique for the purposes of this de-duplication.

This SHA1 checksum, referred to as a Content Hash (typically $contenthash), is stored for each file in the Moodle File Storage database tables.

Files are always referred to using this content hash, and can be both stored and fetched using it.

Helper functions allow conversion of stored_file objects into a content hash.
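As an illustration of the de-duplication idea, the following standalone sketch uses PHP's built-in sha1() and sha1_file() functions. The sketch_* function names and the in-memory $pool array are hypothetical and are not part of the Moodle API.

```php
<?php
// Standalone sketch: identical content always yields the same content hash,
// so only one copy needs to be stored. All names here are hypothetical.

/** Return the content hash of a string. */
function sketch_contenthash_from_string(string $content): string {
    return sha1($content);
}

/** Return the content hash of a file on disk. */
function sketch_contenthash_from_file(string $pathname): string {
    $hash = sha1_file($pathname);
    if ($hash === false) {
        throw new RuntimeException("Cannot read {$pathname}");
    }
    return $hash;
}

// Three "uploads", two of which have identical content.
$pool = [];
foreach (['hello', 'hello', 'world'] as $content) {
    // Keyed by content hash, so the duplicate overwrites itself.
    $pool[sketch_contenthash_from_string($content)] = $content;
}
echo count($pool), " unique file(s) stored\n"; // prints "2 unique file(s) stored"
```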

Distinction between local and remote file paths

The file system API distinguishes between local and remote file paths.

Local file paths must be capable of existing on disk. Remote file paths may be either a valid local path, or a URL using one of PHP's supported protocols (http://au1.php.net/manual/en/wrappers.php).

Local file paths

Local file paths must be formatted as a standard local file path. They must not be a streamable URL. This is because some PHP functions are unable to work with seekable, or streamable resources and can only work with local files. These include, but are not limited to:

  • the ZipArchive used in the zip_packer;
  • curl_file_create, used to add files to a curl request; and
  • finfo, used to determine mime information about files.

Additionally there are some cases which may suffer performance issues when dealing with streamable files. This includes:

  • getimagesize() which must fetch the entire image first in order to determine size.

Remote file paths

Remote file paths may be formatted as a standard local file path, or they may be a streamable URL.

See the PHP documentation on supported protocols (http://au1.php.net/manual/en/wrappers.php) for more information on the accepted formats.

Remote files may be passed into PHP functions such as:

  • file_get_contents; and
  • readfile.
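The distinction can be sketched as a simple check on the path format. The helpers below are hypothetical (they are not part of the Moodle API) and assume that anything with a scheme:// prefix other than file:// should be treated as a streamable URL.

```php
<?php
// Standalone sketch with hypothetical helper names.

/** True if the path looks like a streamable URL rather than a local path. */
function sketch_is_streamable_url(string $path): bool {
    // Require at least two characters in the scheme so that Windows drive
    // letters such as "C:" are never mistaken for a protocol.
    if (!preg_match('#^([a-z][a-z0-9+.\-]+)://#i', $path, $matches)) {
        return false;
    }
    // Assumption: file:// URLs still point at the local disk.
    return strtolower($matches[1]) !== 'file';
}

/** Return the path unchanged, or throw if it cannot be used as a local path. */
function sketch_require_local_path(string $path): string {
    if (sketch_is_streamable_url($path)) {
        throw new InvalidArgumentException('A local file path is required here');
    }
    return $path;
}
```

A caller about to hand a path to ZipArchive or finfo could pass it through sketch_require_local_path() first, whereas file_get_contents() and readfile() can accept either form.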

Trash

Whilst files are in use, they should always be accessible within your file system and may be interacted with using their contenthash.

If a file ceases to have any uses within the Moodle File Storage API, the File Storage API will attempt to mark the file as being trash.

The trash system is periodically emptied, and files may be restored from trash back into the standard file system.

Implementation of trash functions is optional, but recommended.

Note: If you intend to share your file system between many Moodle instances, you should not implement the trash system.

Explanation of required functions

setup_instance

The setup_instance() function is called during instantiation and allows you to set up any required configuration for your file system implementation.

An example implementation might look like:

protected function setup_instance() {
    // Set up the client.
    self::$client = new Awesome\Remote\File\Storage\System();
    // Create a directory for use during the current request.
    // This directory will be automatically removed at the end of the request.
    self::$filedir = make_request_directory();
}

get_local_path_from_hash

The get_local_path_from_hash function is responsible for returning the correct path to the file on disk.

This path must be consistent for each contenthash.

The file does not need to exist on disk unless the $fetchifnotfound parameter is truthy.

An example implementation might look like:

protected function get_local_path_from_hash($contenthash, $fetchifnotfound = false) {
    $path = self::$filedir . DIRECTORY_SEPARATOR . $contenthash;
    if ($fetchifnotfound && !is_readable($path)) {
        $this->fetch_local_copy($contenthash, $path);
    }
    return $path;
}
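The requirement that the path be consistent for each contenthash is easiest to meet by deriving the path purely from the hash. The standalone sketch below spreads files over two levels of sub-directories taken from the start of the hash, similar in spirit to the layout of the standard filedir. The function name and the $filedir argument are hypothetical.

```php
<?php
// Standalone sketch: the same contenthash always maps to the same path.

/** Build a deterministic local path for a content hash (hypothetical helper). */
function sketch_local_path_from_hash(string $filedir, string $contenthash): string {
    if (!preg_match('/^[0-9a-f]{40}$/', $contenthash)) {
        throw new InvalidArgumentException('Not a SHA1 content hash');
    }
    $level1 = substr($contenthash, 0, 2); // First two hex characters.
    $level2 = substr($contenthash, 2, 2); // Next two hex characters.
    return implode(DIRECTORY_SEPARATOR, [
        rtrim($filedir, '/\\'),
        $level1,
        $level2,
        $contenthash,
    ]);
}
```

Splitting on hash prefixes keeps any single directory from growing too large, and nothing other than the hash is needed to locate the file again.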

get_remote_path_from_hash

The get_remote_path_from_hash($contenthash) function is responsible for returning the correct path to the file.

The returned path must be either a standard local file path or a streamable URL in one of the supported protocols described above.

Remote paths should not be passed outside of the File System implementation.

An example implementation might look like:

protected function get_remote_path_from_hash($contenthash) {
    return $this->get_presigned_url($contenthash, '+6 hours');
}

Note: If using a one-time/pre-signed URL, please ensure that the lifetime of the URL is sufficient for larger files.

add_file_to_pool

The add_file_to_pool function is responsible for storing the provided local file on disk into your file system.

It is your responsibility to:

  • generate the file's contenthash (if it is not provided);
  • check whether an existing file with the same contenthash exists in the file system;
  • ensure that the contenthash matches if there is a matching file;
  • copy the file to your file system; and
  • ensure that file permissions are correct.
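The responsibilities above can be sketched against a plain directory on disk. This is a standalone illustration, not the Moodle implementation: the function name, the $pooldir argument, the returned array shape and the 0644 permission are all assumptions made for the example.

```php
<?php
// Standalone sketch of the listed steps, using a local directory as the pool.

/**
 * Store a local file in the pool directory (hypothetical helper).
 * Returns [contenthash, filesize, newfile].
 */
function sketch_add_file_to_pool(string $pooldir, string $pathname, ?string $contenthash = null): array {
    if (!is_readable($pathname)) {
        throw new RuntimeException("Cannot read {$pathname}");
    }
    // Generate the contenthash if it was not provided.
    $contenthash = $contenthash ?? sha1_file($pathname);
    $filesize = filesize($pathname);
    $target = $pooldir . DIRECTORY_SEPARATOR . $contenthash;

    // Check for an existing file and make sure its content really matches.
    if (is_file($target)) {
        if (sha1_file($target) === $contenthash) {
            return [$contenthash, $filesize, false];
        }
        unlink($target); // Existing copy is damaged, so replace it.
    }

    if (!is_dir($pooldir) && !mkdir($pooldir, 0777, true) && !is_dir($pooldir)) {
        throw new RuntimeException("Cannot create {$pooldir}");
    }
    // Copy under a temporary name, then rename, so that readers never
    // see a partially written file.
    $temp = $target . '.tmp';
    if (!copy($pathname, $temp) || !rename($temp, $target)) {
        throw new RuntimeException("Cannot store {$pathname}");
    }
    // Ensure that file permissions are correct (0644 is an assumption).
    chmod($target, 0644);

    return [$contenthash, $filesize, true];
}
```

A remote implementation would replace the copy and rename step with an upload through its own client, but the hashing and duplicate checks would stay much the same.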

add_string_to_pool

Similar to add_file_to_pool, the add_string_to_pool function is responsible for storing the provided string content into your file system.

Although a default implementation is available, it may be more efficient in some cases to define your own.
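One reason to define your own version is to avoid writing the string to a temporary file before storing it. The standalone sketch below writes the content straight to its final location in a local pool directory. The function name, the $pooldir argument and the returned array shape are hypothetical.

```php
<?php
// Standalone sketch: store string content directly, keyed by its content hash.

/**
 * Store string content in the pool directory (hypothetical helper).
 * Returns [contenthash, filesize, newfile].
 */
function sketch_add_string_to_pool(string $pooldir, string $content): array {
    $contenthash = sha1($content);
    $filesize = strlen($content);
    $target = $pooldir . DIRECTORY_SEPARATOR . $contenthash;

    // Nothing to do if a valid copy is already stored.
    if (is_file($target) && sha1_file($target) === $contenthash) {
        return [$contenthash, $filesize, false];
    }
    if (!is_dir($pooldir) && !mkdir($pooldir, 0777, true) && !is_dir($pooldir)) {
        throw new RuntimeException("Cannot create {$pooldir}");
    }
    // Write in one call while holding an exclusive lock.
    if (file_put_contents($target, $content, LOCK_EX) !== $filesize) {
        throw new RuntimeException('Cannot write content to the pool');
    }
    return [$contenthash, $filesize, true];
}
```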

copy_content_from_storedfile

The copy_content_from_storedfile function is responsible for copying an existing file in the file system to a new local file.

If you are using a local file system, you will likely just copy the file:

public function copy_content_from_storedfile(stored_file $file, $target) {
    return copy($this->get_local_filepath_from_storedfile($file), $target);
}

However, if you are implementing a remote file system, you can likely make certain performance improvements by downloading the file straight to the intended target:

public function copy_content_from_storedfile(stored_file $file, $target) {
    if ($this->is_readable_locally_from_storedfile($file, false)) {
        return copy($this->get_local_filepath_from_storedfile($file), $target);
    } else {
        return $this->fetch_local_copy($file->get_contenthash(), $target);
    }
}

_move_file_to_trash

The _move_file_to_trash function is responsible for moving a file from the file system to the trash system.

recover_file_from_trash

The recover_file_from_trash function is responsible for attempting to restore a file from the trash system back into the standard file system area.

If defining this function, it is your responsibility to ensure that the filesize and contenthash are both correct.

empty_trash

The empty_trash function is responsible for removing all files in the trash system.
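The three trash operations can be sketched together using two local directories, one for the pool and one for the trash. This is a standalone illustration under assumed names: the sketch_trash_store class, its directory layout and its method signatures are hypothetical and do not reflect the signatures in the file_system class.

```php
<?php
// Standalone sketch of a trash area backed by local directories.

final class sketch_trash_store {
    private $pooldir;
    private $trashdir;

    public function __construct(string $basedir) {
        $this->pooldir = $basedir . DIRECTORY_SEPARATOR . 'pool';
        $this->trashdir = $basedir . DIRECTORY_SEPARATOR . 'trash';
        foreach ([$this->pooldir, $this->trashdir] as $dir) {
            if (!is_dir($dir)) {
                mkdir($dir, 0777, true);
            }
        }
    }

    public function pool_path(string $contenthash): string {
        return $this->pooldir . DIRECTORY_SEPARATOR . $contenthash;
    }

    private function trash_path(string $contenthash): string {
        return $this->trashdir . DIRECTORY_SEPARATOR . $contenthash;
    }

    /** Move an unused file out of the pool and into the trash. */
    public function move_file_to_trash(string $contenthash): bool {
        $source = $this->pool_path($contenthash);
        return is_file($source) && rename($source, $this->trash_path($contenthash));
    }

    /** Restore a file, but only if its size and content hash are both correct. */
    public function recover_file_from_trash(string $contenthash, int $filesize): bool {
        $trashed = $this->trash_path($contenthash);
        if (!is_file($trashed)) {
            return false;
        }
        if (filesize($trashed) !== $filesize || sha1_file($trashed) !== $contenthash) {
            return false;
        }
        return rename($trashed, $this->pool_path($contenthash));
    }

    /** Remove every file in the trash and return how many were deleted. */
    public function empty_trash(): int {
        $removed = 0;
        foreach (glob($this->trashdir . DIRECTORY_SEPARATOR . '*') ?: [] as $file) {
            if (is_file($file) && unlink($file)) {
                $removed++;
            }
        }
        return $removed;
    }
}
```

Verifying both the size and the hash before recovery guards against restoring a truncated or corrupted copy under a valid contenthash.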