Cache API

From MoodleDocs
Revision as of 23:29, 3 October 2017

This document details the Cache API, also known as MUC (Moodle Universal Cache). I've chosen to use a tutorial/example flow for this document, always working with a theoretical module plugin called myplugin. There is also a Cache API - Quick reference if you would rather read that.

Basic usage

It's very easy to get started with the Cache API. It is designed to be as easy and as quick to use as possible and is predominantly self-contained. All you need to do is add a definition for your cache and you are ready to start working with the Cache API.

Creating a definition

Cache definitions exist within the db/caches.php file for a component/plugin.
In the case of core that is the moodle/lib/db/caches.php file, in the case of a module that would be moodle/mod/myplugin/db/caches.php.

The definition is used by the API in order to understand a little about the cache and what it is being used for; it also allows the administrator to set things up especially for the definition if they want. From a development point of view the definition allows you to tell the API about your cache, what it requires, and any (if any) advanced features you want it to have.
The following shows a basic definition containing just the bare minimum:

    // moodle/mod/myplugin/db/caches.php
    $definitions = array(
        'somedata' => array(
            'mode' => cache_store::MODE_APPLICATION
        )
    );

This informs the API that the myplugin module has a cache called somedata and that it is an application (globally shared) cache.
That is the bare minimum when creating a definition: provide an area (somedata) and declare the mode of the cache: application, session, or request.

An application cache is a shared cache: all users can access it.
Session caches are, well, just stored in the user's session.
Request caches you can think of as static caches: only available to the user owning the request, and only alive until the end of the request.

There are of course many more options available that allow you to really take the cache by the reins. You can read about some of the important ones further on, or skip ahead to #The definition section which details the available options in full.

Please note that for each definition a language string with the name cachedef_ followed by the name of the definition is expected. Using the example above you would have to define:

    // moodle/mod/myplugin/lang/en/mod_myplugin.php
    $string['cachedef_somedata'] = 'This is the description of the cache somedata';

Getting a cache object

Once your definition has been created you should bump the version number so that Moodle upgrades and processes the definitions file, at which point your definition will be usable.

Now within code you can get a cache object corresponding to the definition created earlier.

    $cache = cache::make('mod_myplugin', 'somedata');

The cache::make method is a factory method; it will create a cache object to allow you to work with your cache. The cache object will be one of several classes chosen by the API based upon what your definition contains. All of these classes will extend the base cache class, and in nearly all cases you will get one of cache_application, cache_session, or cache_request depending upon the mode you selected.

Using your cache object

Once you have a cache object (which extends the cache class and implements cache_loader) you are ready to start interacting with the cache.

Of course there are three basic operations: get, set, and delete.

The first is to send something to the cache.

    $result = $cache->set('key', 'value');

Easy enough. The key must be an int or a string. The value can be absolutely anything you want. The result is true if the operation was a success, false otherwise.

The second is to fetch something from the cache.

    $data = $cache->get('key');

$data will either be whatever was being stored in the cache, or false if the cache could not find the key.

The third and final operation is delete.

    $result = $cache->delete('key');

Again, just like set, the result will either be true if the operation was a success, or false otherwise.

You can also set, get, and delete multiple key=>value pairs in a single call.

    $result = $cache->set_many(array(
        'key1' => 'data1',
        'key3' => 'data3'
    ));
    // $result will be the number of pairs successfully set.

    $result = $cache->get_many(array('key1', 'key2', 'key3'));
    print_r($result);
    // Will print the following:
    // array(
    //     'key1' => 'data1',
    //     'key2' => false,
    //     'key3' => 'data3'
    // )

    $result = $cache->delete_many(array('key1', 'key3'));
    // $result will be the number of records successfully deleted.

That covers the basic operation of the Cache API.
In many situations there is not going to be any more to it than that.
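Putting those operations together, the typical pattern is to check the cache first and fall back to computing and storing the value on a miss. A minimal sketch (get_expensive_data() here is a hypothetical function standing in for your real work):

    $cache = cache::make('mod_myplugin', 'somedata');
    $data = $cache->get('mykey');
    if ($data === false) {
        // Cache miss: compute the value and store it for next time.
        $data = get_expensive_data();
        $cache->set('mykey', $data);
    }

Note that because get() returns false on a miss, this pattern only works cleanly when false is never a valid value to cache.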

Ad-hoc Caches

This is the alternative method of using the cache API.
It involves creating a cache using just the required params at the time that it is required. It doesn't require that a definition exists, making it quicker and easier to use; however, it can only use the default settings and is only recommended for insignificant caches (rarely used during operation, never to be mapped or customised, only existing in a single place in code).

Once a cache object has been retrieved it operates exactly the same as a cache that has been created for a definition.

To create an ad-hoc cache you would use the following:

    $cache = cache::make_from_params(cache_store::MODE_APPLICATION, 'mod_myplugin', 'mycache');

Really, don't be lazy: if you don't have a good reason to use an ad-hoc cache, you should spend an extra five minutes creating a definition.

The definition

The above section illustrated how to create a basic definition, specifying just the area name (the key) and the mode for the definition, those being the two required properties for a definition.
There are many other options that will let you make the most of the Cache API and will undoubtedly be required when implementing and converting cache solutions to the Cache API.

The following details the options available to a definition and their defaults if not applied:

$definitions = array(

   // The name of the cache area is the key. The component/plugin will be picked up from the file location.
   'area' => array(
       'mode' => cache_store::MODE_*,
       'simplekeys' => false,
       'simpledata' => false,
       'requireidentifiers' => array('ident1', 'ident2'),
       'requiredataguarantee' => false,
       'requiremultipleidentifiers' => false,
       'requirelockingread' => false,
       'requirelockingwrite' => false,
       'maxsize' => null,
       'overrideclass' => null,
       'overrideclassfile' => null,
       'datasource' => null,
       'datasourcefile' => null,
       'staticacceleration' => false,
       'staticaccelerationsize' => null,
       'ttl' => 0,
       'mappingsonly' => false,
       'invalidationevents' => array('event1', 'event2'),
       'canuselocalstore' => false,
       'sharingoptions' => cache_definition::SHARING_DEFAULT,
       'defaultsharing' => cache_definition::SHARING_DEFAULT,
   )

);

Setting requirements

The definition can specify several requirements for the cache.
This includes identifiers that must be provided when creating the cache object, that the store guarantees data stored in it will remain there until removed, a store that supports multiple identifiers, and finally read/write locking. The options for these are as follows:

simplekeys
[bool] Set to true if your cache will only use simple keys for its items.
Simple keys consist of digits, underscores and the 26 letters of the English alphabet: a-zA-Z0-9_
If true, the keys won't be hashed before being passed to the cache store for gets/sets/deletes. This is better for performance, and is possible only because we know the keys are safe.
simpledata
[bool] If set to true we know that the data is scalar, or an array of scalar values.
requireidentifiers
[array] An array of identifiers that must be provided to the cache when it is created.
requiredataguarantee
[bool] If set to true then only stores that can guarantee data will remain available once set will be used.
requiremultipleidentifiers
[bool] If set to true then only stores that support multiple identifiers will be used.
requirelockingread
[bool] If set to true then a lock will be gained before reading from the cache store. It is recommended not to use this setting unless 100% absolutely positively required.
Remember 99.9% of caches will NOT need this setting.
This setting will only be used for application caches presently.
requirelockingwrite
[bool] If set to true then a lock will be gained before writing to the cache store. As above this is not recommended unless truly needed. Please think about the order of your code and deal with race conditions there first.
This setting will only be used for application caches presently.
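As an illustration, a definition for the somedata cache that opts into simple keys and simple data might look like the following sketch. Both flags are only safe if every key and value genuinely meets the constraints described above:

    // moodle/mod/myplugin/db/caches.php
    $definitions = array(
        'somedata' => array(
            'mode' => cache_store::MODE_APPLICATION,
            'simplekeys' => true,  // Keys only ever contain a-zA-Z0-9_.
            'simpledata' => true   // Values are scalars or arrays of scalars.
        )
    );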

Cache modifiers

You are also able to modify the way in which the cache is going to operate when working for your definition.
By enabling static acceleration the Cache API will only ever generate a single cache object for your definition on the first request for it; further requests will be returned the original instance.
This greatly speeds up the collecting of a cache object.

Enabling static acceleration also enables a static store within the cache object: anything set to the cache, or retrieved from it, will be stored in that static array for the life of the request. This makes static acceleration one of the most powerful options. If you know you are going to be using your cache over and over again, or if you know you will be making lots of requests for the same items, then this will provide a great performance boost. Of course the static storage of cache objects and of data is costly in terms of memory and should only be used when actually required; as such it is turned off by default.

As well as static acceleration you can also set a maximum number of items that the cache should store (not a hard limit, it's up to each store) and a time to live (ttl), although both are discouraged as efficient design negates the need for them in most situations.

staticacceleration
[bool] This setting does two important things. First it tells the cache API to only instantiate the cache structure for this definition once, further requests will be given the original instance.
Second the cache loader will keep an array of the items set and retrieved to the cache during the request.
This has several advantages including better performance without needing to start passing the cache instance between function calls, the downside is that the cache instance + the items used stay within memory.
Consider using this setting when you know that there are going to be many calls to the cache for the same information or when you are converting existing code to the cache and need to access the cache within functions but don't want to add it as an argument to the function.
staticaccelerationsize
[int] This supplements the above setting by limiting the number of items in the cache's static acceleration array.
Tweaking this setting lower will allow you to minimise the memory implications above while hopefully still managing to offset calls to the cache store.
ttl
[int] A time to live for the data (in seconds). It is strongly recommended that you don't make use of this and instead try to create an event driven invalidation system (even if the event is just time expiring, better not to rely on ttl).
Not all cache stores will support this natively and there are undesired performance impacts if the cache store does not.
maxsize
[int] If set this will be used as the maximum number of entries within the cache store for this definition.
It's important to note that cache stores don't actually have to acknowledge this setting or maintain it as a hard limit.
canuselocalstore
[bool] This setting specifies whether the cache can safely be local to each frontend in a cluster which can avoid latency costs to a shared central cache server. The cache needs to be carefully written for this to be safe. It is conceptually similar to using $CFG->localcachedir (can be local) vs $CFG->cachedir (must be shared). Look at purify_html() in lib/weblib.php for an example.
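For example, a definition that enables static acceleration and bounds the per-request array might look like this sketch (the size of 30 is an arbitrary illustration):

    // moodle/mod/myplugin/db/caches.php
    $definitions = array(
        'somedata' => array(
            'mode' => cache_store::MODE_APPLICATION,
            'staticacceleration' => true,
            'staticaccelerationsize' => 30  // Keep at most 30 items in the static array.
        )
    );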

Overriding a cache loader

This is a super advanced feature and should not be done, ever, unless you have a very good reason to do so. It allows you to create your own cache loader and have it be used instead of the default cache loader class. The cache object you get back from the make operations will be an instance of this class.

overrideclass
[string] A class to use as the loader for this cache. This is an advanced setting and will allow the developer of the definition to take 100% control of the caching solution.
Any class used here must implement the cache_loader interface and must extend the default cache loader for the mode it is using.
overrideclassfile
[string] Supplements the above setting, indicating the file containing the class to be used. This file is included when required.

Specifying a data source

This is a great wee feature, especially if your code is object oriented.
It allows you to specify a class that must implement the cache_data_source interface and will be used to load any information requested from the cache that is not already being stored.
When the requested key cannot be found in the cache the data source will be asked to load it. The data source will then return the information to the cache, the cache will store it, and it will then return it to the user as the result of their get request. Essentially no get request should ever fail if you have a data source specified.

datasource
[string] A class to use as the data loader for this definition.
Any class used here must implement the cache_data_source interface.
datasourcefile
[string] Supplements the above setting, indicating the file containing the class to be used. This file is included when required.
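As a sketch, a data source for the somedata definition could look like the following. The class name, file location, and load_somedata_somehow() loader are all hypothetical; the three methods shown are the ones the cache_data_source interface requires:

    // In moodle/mod/myplugin/db/caches.php:
    'somedata' => array(
        'mode' => cache_store::MODE_APPLICATION,
        'datasource' => 'mod_myplugin_somedata_source',
        'datasourcefile' => 'mod/myplugin/somedata_source.php'
    )

    // mod/myplugin/somedata_source.php
    class mod_myplugin_somedata_source implements cache_data_source {

        /** @var mod_myplugin_somedata_source Singleton instance. */
        protected static $instance = null;

        public static function get_instance_for_cache(cache_definition $definition) {
            if (is_null(self::$instance)) {
                self::$instance = new mod_myplugin_somedata_source();
            }
            return self::$instance;
        }

        public function load_for_cache($key) {
            // Load the missing item from its canonical source (e.g. the database).
            return load_somedata_somehow($key); // Hypothetical loader.
        }

        public function load_many_for_cache(array $keys) {
            $results = array();
            foreach ($keys as $key) {
                $results[$key] = $this->load_for_cache($key);
            }
            return $results;
        }
    }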

Misc settings

The following are standalone settings that don't fall into any of the above categories.

invalidationevents
[array] An array of events that should cause this cache to invalidate some or all of the items within it.
mappingsonly
[bool] If set to true only the mapped cache store(s) will be used and the default mode store will not. This is a super advanced setting and should not be used unless absolutely required. It allows you to avoid the default stores for one reason or another.
sharingoptions
[int] The sharing options that are appropriate for this definition. Should be the sum of the possible options.
defaultsharing
[int] The default sharing option to use. It's highly recommended that you don't set this unless there is a very specific reason not to use the system default.
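For invalidation events to be useful, something must trigger them. Assuming an event name of your own choosing (changesinsomedata below is hypothetical), the code that modifies the underlying data calls cache_helper::invalidate_by_event() with the keys that are now stale:

    // In the definition:
    'invalidationevents' => array('changesinsomedata'),

    // In the code that changes the data:
    cache_helper::invalidate_by_event('changesinsomedata', array('key1', 'key2'));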

The loader

Using a data source

Developing cache plugins