Difference between revisions of "File System API"


Revision as of 01:48, 9 February 2017

Moodle 3.3

By default, Moodle uses the locally available file system for all files. Since Moodle 3.3, it is possible to extend the file system component of the File Storage API to support alternative file systems.

Introduction

Moodle ships with a file system API which enables the internal Moodle File Storage system to store and retrieve files and file content. The standard file system implementation uses the Moodle filedir, which is a locally available directory on disk, and which can be shared between clustered servers via network file systems such as NFS.

Since Moodle 3.3, it is possible to use alternative file systems, including remote file systems. These are easy to set up and configure, and allow for greater scalability which does not depend so heavily upon traditional network file systems.

All files accessed via the standard File Storage API are processed using this API; however, the existing tempdir, cachedir, and localcachedir parameters remain separate.

Defining a new filesystem

All file system implementations must extend the file_system class and define the required abstract functions.

It is entirely up to the individual implementation how it handles storage, saving, and retrieval of files from its file system; however, certain key concepts apply.

Concepts

Moodle File API

The Moodle File API is broken into different components, each having a related but fundamentally separate purpose.

File Storage API

The File Storage API is responsible for all interactions with the rest of Moodle.

Files can be accessed using this API.

Stored File

Any file stored in Moodle's File Storage can be represented as a stored_file. The stored_file class holds various metadata about the files in the repository.

Content Hash

The Moodle File Storage API performs de-duplication by generating a SHA1 checksum of the content of the file. SHA1 checksums are sufficiently unique for the purposes of this de-duplication.

This SHA1 checksum, referred to as a Content Hash (typically $contenthash), is stored for each file in the Moodle File Storage database tables.

Files are always referred to using this content hash, and can be both stored and fetched using it.

Helper functions allow conversion of stored_file objects into a content hash.
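
The de-duplication described above can be illustrated with a minimal sketch. It uses a plain in-memory array as a stand-in for a file system, and the example_store function is a hypothetical helper written for this illustration; it is not part of Moodle.

```php
<?php
// Hypothetical in-memory pool, used only to illustrate de-duplication by content hash.
$pool = [];

/**
 * Store content under its SHA1 content hash and return that hash.
 * Content which is already present is not stored a second time.
 */
function example_store(array &$pool, string $content): string {
    $contenthash = sha1($content);
    if (!array_key_exists($contenthash, $pool)) {
        $pool[$contenthash] = $content;
    }
    return $contenthash;
}

$first = example_store($pool, 'Hello, Moodle!');
$second = example_store($pool, 'Hello, Moodle!');
$third = example_store($pool, 'Something else');

// Identical content shares one entry, so the pool holds two items rather than three.
$poolsize = count($pool);
```

Because the hash is derived only from the content, two uploads of the same file resolve to the same entry regardless of their file names.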

Distinction between local and remote file paths

The file system API makes a distinction between local and remote file paths.

Local file paths must be capable of existing on disk. Remote file paths may be either a valid local path or a URL using one of PHP's Supported Protocols (http://au1.php.net/manual/en/wrappers.php).

Local file paths

Local file paths must be formatted as a standard local file path. They must not be a streamable URL. This is because some PHP functions are unable to work with seekable or streamable resources and can only work with local files. These include, but are not limited to:

  • The ZipArchive used in the zip_packer; and
  • curl_file_create, used to add files to a curl request; and
  • finfo, used to determine mime information about files.

Additionally, some functions may suffer performance issues when dealing with streamable files. These include:

  • getimagesize(), which must fetch the entire image first in order to determine size.

Remote file paths

Remote file paths may be formatted as a standard local file path. They may be a streamable URL.

See the PHP documentation on Supported Protocols (http://au1.php.net/manual/en/wrappers.php) for more information on the accepted formats.

Remote files may be passed into PHP functions such as:

  • file_get_contents; and
  • readfile.
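
As a rough illustration of the difference, the sketch below reads the same content through a local file path and through a streamable URL. It uses PHP's built-in data:// wrapper purely as a convenient stand-in for a remote protocol; the variable names are illustrative only.

```php
<?php
// A local file path: a real file which exists on disk.
$localpath = tempnam(sys_get_temp_dir(), 'fsapi');
file_put_contents($localpath, 'Hello');

// A streamable URL using the data:// wrapper ("SGVsbG8=" is base64 for "Hello").
$remotepath = 'data://text/plain;base64,SGVsbG8=';

// Functions such as file_get_contents accept either form.
$fromlocal = file_get_contents($localpath);
$fromremote = file_get_contents($remotepath);

// Only the local path refers to a file on disk.
$localisfile = is_file($localpath);
$remoteisfile = is_file($remotepath);

// Tidy up the temporary file.
unlink($localpath);
```

Code which needs a real file on disk, such as the functions listed under local file paths, should therefore be given a local path rather than a remote one.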

Trash

Whilst files are in use, they should always be accessible within your file system and may be interacted with using their contenthash.

If a file ceases to have any uses within the Moodle File Storage API, the File Storage API will attempt to mark the file as being trash.

The trash system is periodically emptied, and files may be restored from trash back into the standard file system.

Implementation of trash functions is optional, but recommended.

Note: If you intend to share your file system between many Moodle instances, you should not implement the trash system.

Explanation of required functions

setup_instance

The setup_instance() function is called during instantiation and allows you to set up any required configuration for your file system implementation.

An example implementation might look like:

protected function setup_instance() {
    // Setup the client.
    self::$client = new Awesome\Remote\File\Storage\System();

    // Create a directory for use during the current request.
    // This directory will be automatically removed at the end of the request.
    self::$filedir = make_request_directory();
}

get_local_path_from_hash

The get_local_path_from_hash function is responsible for returning the correct path to the file on disk.

This path must be consistent for each contenthash.

The file does not need to exist on disk unless the $fetchifnotfound parameter is truthy.

An example implementation might look like:

protected function get_local_path_from_hash($contenthash, $fetchifnotfound = false) {
    $path = self::$filedir . DIRECTORY_SEPARATOR . $contenthash;

    if ($fetchifnotfound && !is_readable($path)) {
        $this->fetch_local_copy($contenthash, $path);
    }

    return $path;
}
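
The example above relies on a fetch_local_copy helper which is left to the implementation. A self-contained sketch of the same idea follows, assuming an in-memory array stands in for the remote store; the example_local_cache class and its members are hypothetical and do not extend the real file_system class.

```php
<?php
/**
 * Hypothetical stand-in for a remote file system with a local cache directory.
 */
class example_local_cache {
    /** @var array Map of contenthash => content, standing in for the remote store. */
    private $remote;

    /** @var string Directory used to hold local copies. */
    private $filedir;

    public function __construct(array $remote, string $filedir) {
        $this->remote = $remote;
        $this->filedir = $filedir;
    }

    /**
     * Return the local path for a contenthash, fetching the content only when asked to.
     */
    public function get_local_path_from_hash(string $contenthash, bool $fetchifnotfound = false): string {
        // The path depends only on the contenthash, so it is consistent between calls.
        $path = $this->filedir . DIRECTORY_SEPARATOR . $contenthash;

        if ($fetchifnotfound && !is_readable($path)) {
            // Stand-in for downloading the file from the remote store.
            file_put_contents($path, $this->remote[$contenthash]);
        }

        return $path;
    }
}

$filedir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'fsapi_cache_' . uniqid();
mkdir($filedir);
$contenthash = sha1('example');
$filesystem = new example_local_cache([$contenthash => 'example'], $filedir);

// Without $fetchifnotfound only the path is returned; nothing is written to disk.
$pathonly = $filesystem->get_local_path_from_hash($contenthash);
$existedbeforefetch = file_exists($pathonly);

// With $fetchifnotfound the file is fetched to the same path.
$fetchedpath = $filesystem->get_local_path_from_hash($contenthash, true);
$fetchedcontent = file_get_contents($fetchedpath);

// Tidy up.
unlink($fetchedpath);
rmdir($filedir);
```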

get_remote_path_from_hash

The get_remote_path_from_hash($contenthash) function is responsible for returning the correct path to the file.

The returned path must be in either:

  • a local file format; or
  • a remote file path as per the Supported Protocols documentation.

Remote paths should not be passed outside of the File System implementation.

An example implementation might look like:

protected function get_remote_path_from_hash($contenthash, $fetchifnotfound = false) {
    return $this->get_presigned_url($contenthash, '+6 hours');
}

Note: If using a one-time/pre-signed URL, please ensure that the lifetime of the URL is sufficient for larger files.

add_file_to_pool

The add_file_to_pool function is responsible for storing the provided local file on disk into your file system.

It is your responsibility to:

  • generate the file's contenthash (if it is not provided);
  • check whether an existing file with the same contenthash exists in the file system;
  • ensure that the contenthash matches if there is a matching file;
  • copy the file to your file system; and
  • ensure that file permissions are correct.
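
The responsibilities listed above can be sketched as follows. This is a minimal illustration which uses a plain directory as the pool, not a definitive implementation. The example_add_file_to_pool function, its $pooldir parameter, the 0644 permissions, and the returned array of contenthash, filesize, and a "new file" flag are all assumptions made for this example.

```php
<?php
/**
 * Hypothetical sketch: add a local file to a directory-based pool.
 *
 * @return array [contenthash, filesize, whether a new copy was stored]
 */
function example_add_file_to_pool(string $pathname, string $pooldir, ?string $contenthash = null): array {
    if (!is_readable($pathname)) {
        throw new RuntimeException("Cannot read file: {$pathname}");
    }

    // Generate the contenthash if it was not provided.
    if ($contenthash === null) {
        $contenthash = sha1_file($pathname);
    }

    $target = $pooldir . DIRECTORY_SEPARATOR . $contenthash;
    $newfile = true;

    // Check for an existing file and confirm that its content really matches the hash.
    if (file_exists($target)) {
        if (sha1_file($target) === $contenthash) {
            $newfile = false;
        } else {
            // The stored copy is damaged, so remove it and store a fresh copy.
            unlink($target);
        }
    }

    if ($newfile) {
        // Copy the file into the pool and make sure the permissions are correct.
        if (!copy($pathname, $target)) {
            throw new RuntimeException("Cannot copy file to pool: {$target}");
        }
        chmod($target, 0644);
    }

    return [$contenthash, filesize($pathname), $newfile];
}

$pooldir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'fsapi_pool_' . uniqid();
mkdir($pooldir);
$source = tempnam(sys_get_temp_dir(), 'fsapi');
file_put_contents($source, 'example content');

// The first call stores the file; the second finds the existing copy.
[$hash, $size, $new] = example_add_file_to_pool($source, $pooldir);
[$hashagain, $sizeagain, $newagain] = example_add_file_to_pool($source, $pooldir);
```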

add_string_to_pool

Similar to add_file_to_pool, the add_string_to_pool function is responsible for storing the provided string content into your file system.
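
A comparable sketch for string content is shown below. As the content is already in memory, it can be hashed and written directly without an intermediate file. The example_add_string_to_pool function, its $pooldir parameter, and the shape of the returned array are assumptions made for this illustration.

```php
<?php
/**
 * Hypothetical sketch: add string content to a directory-based pool.
 *
 * @return array [contenthash, filesize, whether a new copy was stored]
 */
function example_add_string_to_pool(string $content, string $pooldir): array {
    // The content is already in memory, so hash it directly.
    $contenthash = sha1($content);
    $target = $pooldir . DIRECTORY_SEPARATOR . $contenthash;

    // Only write the content if it is not already in the pool.
    $newfile = !file_exists($target);
    if ($newfile) {
        if (file_put_contents($target, $content) === false) {
            throw new RuntimeException("Cannot write to pool: {$target}");
        }
        chmod($target, 0644);
    }

    return [$contenthash, strlen($content), $newfile];
}

$pooldir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'fsapi_strings_' . uniqid();
mkdir($pooldir);

[$hash, $size, $new] = example_add_string_to_pool('some text', $pooldir);
[$hashagain, $sizeagain, $newagain] = example_add_string_to_pool('some text', $pooldir);
$stored = file_get_contents($pooldir . DIRECTORY_SEPARATOR . $hash);
```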

copy_content_from_storedfile

The copy_content_from_storedfile function is responsible for copying an existing file in the file system to a new local file.

If you are using a local file system, you will likely just copy the file:

public function copy_content_from_storedfile(stored_file $file, $target) {
    return copy($this->get_local_path_from_storedfile($file), $target);
}

However, if you are implementing a remote file system, you can likely make certain performance improvements by downloading the file straight to the intended target:

public function copy_content_from_storedfile(stored_file $file, $target) {
    if ($this->is_file_readable_locally_by_storedfile($file, false)) {
        // A local copy already exists, so copy it directly.
        return copy($this->get_local_path_from_storedfile($file), $target);
    } else {
        // Otherwise download straight to the target using your own helper.
        return $this->fetch_local_copy($file->get_contenthash(), $target);
    }
}

_move_file_to_trash

The _move_file_to_trash function is responsible for moving a file from the file system to the trash system.

recover_file_from_trash

The recover_file_from_trash function is responsible for attempting to restore a file from the trash system back into the standard file system area.

If defining this function, it is your responsibility to ensure that the filesize and contenthash are both correct.

empty_trash

The empty_trash function is responsible for removing all files in the trash system.
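
The three trash functions can be sketched together using two plain directories. This is only an illustration under stated assumptions: the example_trash_store class, its directory layout, and its method signatures are hypothetical and do not extend the real file_system class.

```php
<?php
/**
 * Hypothetical directory-based store with a separate trash directory.
 */
class example_trash_store {
    private $filedir;
    private $trashdir;

    public function __construct(string $basedir) {
        $this->filedir = $basedir . DIRECTORY_SEPARATOR . 'filedir';
        $this->trashdir = $basedir . DIRECTORY_SEPARATOR . 'trashdir';
        mkdir($this->filedir, 0777, true);
        mkdir($this->trashdir, 0777, true);
    }

    /** Store content in the pool and return its contenthash. */
    public function add(string $content): string {
        $contenthash = sha1($content);
        file_put_contents($this->filedir . DIRECTORY_SEPARATOR . $contenthash, $content);
        return $contenthash;
    }

    /** Move a file from the pool into the trash. */
    public function move_file_to_trash(string $contenthash): bool {
        return rename(
            $this->filedir . DIRECTORY_SEPARATOR . $contenthash,
            $this->trashdir . DIRECTORY_SEPARATOR . $contenthash
        );
    }

    /** Restore a file from trash, but only if its size and contenthash are both correct. */
    public function recover_file_from_trash(string $contenthash, int $filesize): bool {
        $trashed = $this->trashdir . DIRECTORY_SEPARATOR . $contenthash;
        if (!is_readable($trashed)) {
            return false;
        }
        if (filesize($trashed) !== $filesize || sha1_file($trashed) !== $contenthash) {
            return false;
        }
        return rename($trashed, $this->filedir . DIRECTORY_SEPARATOR . $contenthash);
    }

    /** Remove every file currently in the trash. */
    public function empty_trash(): void {
        foreach (glob($this->trashdir . DIRECTORY_SEPARATOR . '*') ?: [] as $path) {
            unlink($path);
        }
    }

    public function is_in_pool(string $contenthash): bool {
        return file_exists($this->filedir . DIRECTORY_SEPARATOR . $contenthash);
    }

    public function is_in_trash(string $contenthash): bool {
        return file_exists($this->trashdir . DIRECTORY_SEPARATOR . $contenthash);
    }
}

$store = new example_trash_store(sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'fsapi_trash_' . uniqid());
$contenthash = $store->add('trash me');

// Trash the file, then restore it after checking its size and hash.
$store->move_file_to_trash($contenthash);
$wasintrash = $store->is_in_trash($contenthash);
$recovered = $store->recover_file_from_trash($contenthash, strlen('trash me'));
$backinpool = $store->is_in_pool($contenthash);

// Trash it again and empty the trash for good.
$store->move_file_to_trash($contenthash);
$store->empty_trash();
$stillintrash = $store->is_in_trash($contenthash);
```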