Server clustering improvements proposal

From MoodleDocs

Revision as of 15:31, 3 July 2013

Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.

Moodle 2.6

This is a basis for discussion about potential server clustering improvements in Moodle 2.6.

Current config.php settings

Each node in the cluster may use its own local copy of the PHP files, including config.php. These copies may be synchronised via git, rsync, etc.

$CFG->wwwroot

It must be the same on all nodes and it must be the public-facing URL. It cannot be dynamic.

$CFG->sslproxy = true

Enable if wwwroot uses https:// but SSL is not handled by Apache.

$CFG->reverseproxy = true

Enable if your nodes are accessed via a different URL. Please note that it is not compatible with $CFG->loginhttps.
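As a rough illustration, the three settings above might look like this in config.php on a node that sits behind an SSL-terminating proxy. The hostname is a placeholder, and whether either proxy flag applies depends on the actual setup.

```php
<?php
// Sketch of a cluster node's config.php fragment; all values are placeholders.
$CFG = new stdClass();

// Identical on every node: the public-facing URL, never computed per request.
$CFG->wwwroot = 'https://moodle.example.com';

// SSL is handled by a load balancer or proxy in front of Apache.
$CFG->sslproxy = true;

// Enable only if the nodes are reached via a different URL than wwwroot.
// Not compatible with $CFG->loginhttps.
// $CFG->reverseproxy = true;
```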

$CFG->dirroot

It is strongly recommended that $CFG->dirroot (which is set automatically via realpath(config.php)) contains the same path on all nodes. It does not need to point to the same shared directory though. The reason is that some low-level code may use the dirroot value for cache invalidation.

The simplest solution is to have the same directory structure on each cluster node and synchronise these during each upgrade.

The dirroot should always be read-only for the Apache process, because otherwise built-in add-on installation and plugin uninstallation would get the nodes out of sync.

$CFG->dataroot

This MUST be a shared directory that each cluster node accesses directly. It must be very reliable, and the data in it must not be manipulated directly.

Locking support is not required. If any code tries to use file locks in dataroot outside of cachedir or tempdir, it is a bug.

$CFG->tempdir

It is recommended to use a separate RAM disk on each node. Scripts may use this directory during one request only. The contents of this directory may be deleted if there is no pending HTTP request; otherwise delete only files with older timestamps.

Always purge this directory when starting a cluster node.

If any script tries to use files that were not created during the current request, it is a bug that needs to be fixed.

$CFG->cachedir

This MUST be a shared directory. The existing caching code is not designed to deal with local node cache directories, and the code is not going to be changed to hack around this restriction.

Why does it have to be shared? Because the developers who wrote the current caching code in all stable 2.x branches expected that there is only one cachedir and that no changes in that directory are lost when another HTTP request is processed on a different node.

Shared filesystems are usually slow; ideally it should be possible to use MUC caches instead of $CFG->cachedir. It is also possible to create new MUC backends that are clustering-aware.

You can safely purge cachedir when restarting the whole cluster.

File locking is required for now.
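The directory rules above could translate into a config.php fragment along these lines. This is only a sketch: every path is a placeholder, and the mount points are assumptions about how a site might lay out its shared and local storage.

```php
<?php
// Sketch of directory settings for one cluster node; every path is a placeholder.
$CFG = new stdClass();

// Shared by all nodes (for example a network filesystem mount).
$CFG->dataroot = '/mnt/shared/moodledata';

// Must also be shared for now, on a filesystem that supports file locking.
$CFG->cachedir = '/mnt/shared/moodledata/cache';

// Local to the node, ideally a RAM disk; purge it when the node starts.
$CFG->tempdir = '/mnt/ramdisk/moodletemp';
```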

Proposed new $CFG->localcachedir

Local node caches (not shared) require revision numbers (such as $CFG->themerev, $CFG->jsrev). We found out that the revision number should be based on the current time and always incrementing; this prevents fatal cache invalidation problems on restored sites. A value of -1 usually means do not cache.

The revision numbers must be the same on all nodes. The simplest solution is to store them in the database and bump them up after any change in the cached data.
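A time-based, always-incrementing revision can be sketched in a few lines. The function name below is hypothetical and this is not an existing Moodle API; storing the result in the database is left out.

```php
<?php
/**
 * Return the next revision number: normally the current time, but always
 * strictly greater than the previous revision.
 * Hypothetical helper for illustration only.
 */
function example_next_revision(int $current, ?int $now = null): int {
    $next = $now ?? time();
    if ($next <= $current) {
        // The stored revision is ahead of the clock, for example on a site
        // restored from another server, so fall back to a plain increment.
        $next = $current + 1;
    }
    return $next;
}
```

Falling back to an increment keeps the number moving forward even when the clock is behind the stored value, which is the case that matters on restored sites.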

Component cache

The core_component class uses a $CFG->cachedir/core_component.php cache that contains a complete list of all plugins and all classes present in $CFG->dirroot. The implementation must be as fast as possible and the results must be extremely reliable.

The cache is automatically invalidated on the admin/index.php page, during installation and during every upgrade. It is also cleared during purge_all_caches(), but that is only a side effect of storing it in cachedir and it is not required.

The core_component class and its cache cannot depend on the database, MUC or any core libraries. That is the reason why there cannot be any revision flag: there is nowhere to store it, so the sha1() of the file itself is the revision.
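To illustrate the idea of the file hash acting as the revision, here is a minimal sketch that depends on nothing but PHP itself. The function name is hypothetical and this is not the actual core_component code.

```php
<?php
/**
 * Use the SHA-1 of the cache file contents as its revision.
 * Returns null when the file is missing or unreadable.
 * Hypothetical helper, not Moodle's core_component implementation.
 */
function example_component_cache_revision(string $cachefile): ?string {
    if (!is_readable($cachefile)) {
        return null;
    }
    $hash = sha1_file($cachefile);
    return $hash === false ? null : $hash;
}
```

Two nodes holding byte-identical cache files would report the same value, so comparing these hashes is one way to check that nodes are in sync.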

See MDL-40475 for a proposed workaround that allows you to use an alternative component cache file.

Typically $CFG->alternative_component_cache = '/local/cache/dir/core_component.php' would point to a local node cache directory. Before an upgrade the administrator would have to manually execute the following on each node:

$ php admin/cli/alternative_component_cache.php --rebuild

Alternatively you could put the cache file directly into dirroot and distribute it together with the new PHP source files. Yet another possibility would be to purge all local caches on all nodes before the upgrade.

MUC and clustering

The requirement is to make MUC stores aware of revision numbers. If we do that, we can store the data in multiple backends without the caches going stale. Another benefit is that we would not have to purge all existing data, which would help on shared servers.

Default MUC file stores could decide to use either $CFG->cachedir or $CFG->localcachedir depending on the availability of the revision number. The current workaround is to include the revision number in the cache key, but that does not solve the problem that $CFG->cachedir must be shared across all nodes.
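The cache-key workaround can be sketched as follows. The helper name and the key format are assumptions made for illustration; they are not part of MUC.

```php
<?php
/**
 * Build a cache key that embeds the revision, so entries written under an
 * older revision are never read again. Returns null when the revision
 * is -1, meaning "do not cache". Hypothetical helper.
 */
function example_revisioned_key(string $key, int $revision): ?string {
    if ($revision === -1) {
        return null;
    }
    return $key . '_' . $revision;
}
```

With this approach stale entries are not purged, they simply stop being requested once the revision is bumped, so they still need to be cleaned up or allowed to expire separately.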

Another problem is that the MUC cache configuration is stored in a shared directory, which makes tweaking individual node cache configurations problematic. It would probably be better to allow an alternative MUC cache file location in config.php.

Theme caching

Javascript caching

Language installation and customisations

Op code caches

The standard opcache extension is strongly recommended for Moodle 2.6 onwards; it is the only solution officially supported by PHP developers.

Potential problems:

  • file time stamps must be verified in Moodle 2.5 and below

See MDL-40415 for more details.

Browser sessions

File pool

At present there is a filedir in dataroot where we store the contents of all files in Moodle. Theoretically the code could be abstracted to allow storage of file contents in non-filesystem storage.

Rule number one is to not store automated backups in the file pool; always use real external filesystem directories.

The original idea was to have a setting that prevents file removals from the $CFG->filedir directory. This would allow you to put files from multiple installations into just one giant file pool that could be replicated to all nodes: files would only be added there and never deleted automatically. The only potential problem is the size of this huge directory, and there would have to be a special new tool that exports used files from one site to another that is not using the shared file pool. The setting was supposed to be checked in the file_storage::deleted_file_cleanup() method (a single if statement that simply exits the method).

$CFG->filedir does not require locking and it may be outside of $CFG->dataroot. The file system or sharing layer may use very aggressive caching techniques because the files never change; a node only needs to contact the master when writing or on a cache miss. This means we could even create a very simple local cluster node cache for filedir in PHP, which would be very fast: a few lines of hacks in the current file storage code. The master would not have to be a directory at all; anything that stores files would be OK.
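A read-through local cache of this kind might look roughly like the sketch below. It assumes the two-level directory layout keyed by SHA-1 content hash, and it represents the master as a plain callable; both function names and the callable's contract are hypothetical, not existing file storage code.

```php
<?php
/**
 * Path of a pool file inside a filedir-style directory, assuming two
 * directory levels taken from the start of the content hash.
 */
function example_pool_path(string $filedir, string $contenthash): string {
    return $filedir . '/' . substr($contenthash, 0, 2) . '/'
        . substr($contenthash, 2, 2) . '/' . $contenthash;
}

/**
 * Return a local path for the file, fetching it from the master on a miss.
 * $fetch is a hypothetical callable: it takes a content hash and returns
 * the file contents as a string, or null if the master lacks the file.
 */
function example_local_file(string $localdir, string $contenthash, callable $fetch): ?string {
    if (!preg_match('/^[0-9a-f]{40}$/', $contenthash)) {
        return null; // Not a SHA-1 hash; refuse to build a path from it.
    }
    $path = example_pool_path($localdir, $contenthash);
    if (is_file($path)) {
        return $path; // Cache hit: pool files never change, no revalidation.
    }
    $content = $fetch($contenthash);
    if ($content === null || sha1($content) !== $contenthash) {
        return null; // Missing on the master, or corrupted in transit.
    }
    if (!is_dir(dirname($path))) {
        mkdir(dirname($path), 0777, true);
    }
    // Write under a temporary name, then rename, so that other requests
    // never see a partially written file.
    $tmp = $path . '.' . uniqid('tmp', true);
    file_put_contents($tmp, $content);
    rename($tmp, $path);
    return $path;
}
```

Because the content hash doubles as the file name, the fetched data can be verified locally and a cached copy never needs to be checked against the master again.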

One elegant hack might be to use S3 for persistent file storage and just copy the files to a local cluster node cache filedir. This would require minimal changes in the current codebase, performance would be better because the filedir would not be shared, and it would be more fault tolerant. The lack of deleting would mean that the data in the central S3 store may grow substantially; on the other hand, this would make file-less backups reliable on the same site or on sites that share the same S3 pool. The size of the local node cache filedir could be kept under control by periodically deleting files that have not been accessed for some time or are oversized, or you could purge the local cache filedir and just let it refetch everything from the central S3 file pool.
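Keeping the local cache filedir under control could be done with a periodic job along these lines. This is a sketch only: the function name and thresholds are hypothetical, and it assumes the filesystem actually records access times (a mount with access-time updates disabled would make every file look stale).

```php
<?php
/**
 * Delete files in a local cache filedir that have not been accessed for
 * $maxidle seconds or that are larger than $maxsize bytes.
 * Hypothetical helper. Returns the number of files removed.
 */
function example_prune_local_filedir(string $localdir, int $maxidle, int $maxsize, ?int $now = null): int {
    $cutoff = ($now ?? time()) - $maxidle;
    $removed = 0;
    $items = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($localdir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($items as $item) {
        if (!$item->isFile()) {
            continue;
        }
        $stale = $item->getATime() < $cutoff;
        $oversized = $item->getSize() > $maxsize;
        // Deleting is safe here only because the central store keeps a copy.
        if (($stale || $oversized) && unlink($item->getPathname())) {
            $removed++;
        }
    }
    return $removed;
}
```

Anything removed this way would simply be fetched again from the central store on the next cache miss.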

Please note that there are major performance problems and bugs in the file linking code; maybe it would be good to rework it completely to rely on the file pool.

Installation procedure

  • Install Moodle first without any clustering.
  • ...

TODO: describe possible setups for distribution of HTTP requests, failovers, etc.

Upgrade procedure

Database clustering