Search engine adapters

Revision as of 14:04, 18 November 2007 by Valery Fremaux (talk | contribs) (Providing The indexer Documents From The Module)


The global search engine of Moodle, available as an experimental feature until the Moodle 1.9 release, allows new document types to be plugged in so that they can be indexed and searched by the Lucene indexer.

Each module or block should have an adapter that wraps the plugin's internal data model as searchable documents. The current implementation allows a module to provide the search engine with a set of virtual documents. The search engine indexes the text content of these documents and records enough data to get back to the exact context in which the content appeared (e.g. access URL, course context).

Virtual documents are defined as subclasses of the SearchDocument class. Only the constructor of the subclass must be written in order to map an input record to an internal document definition.

The goals of the adapter are:

  • extract all virtual documents from the module data model and hand them to the indexer through an iterator (initial construction of the index)
  • extract a single document for an index update
  • provide sufficient information to delete obsolete content from the index
  • define the access URL that leads back to the resource
  • define the access check algorithm, so that users can only reach resources they are allowed to see

Adapters For Modules And Blocks

Any adapter for a module or a block must reside in the
subdirectory, and will be named such as

Search defines

Each searchable module or block should add at least the following two (typical) defines to /search/lib.php, the first pair for a module and the second pair for a block:

define('SEARCH_TYPE_<MODULE>', '<module>');
define('PATH_FOR_SEARCH_TYPE_<MODULE>', 'mod/<module>');


define('SEARCH_TYPE_<BLOCK>', '<block>');
define('PATH_FOR_SEARCH_TYPE_<BLOCK>', 'blocks/<block>');
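As an illustration, the placeholders might be filled in as follows for the glossary module. This is a sketch only: the choice of module is an example, and example_search_path_for() is a hypothetical helper that is not part of the search API. It only shows how the two constants relate to each other through the naming convention.

```php
<?php
// Illustrative values: the two defines as they might look for the glossary module.
define('SEARCH_TYPE_GLOSSARY', 'glossary');
define('PATH_FOR_SEARCH_TYPE_GLOSSARY', 'mod/glossary');

// Hypothetical helper (not part of the search API): looks up the path
// constant that matches a document type, relying on the naming convention.
function example_search_path_for($type) {
    return constant('PATH_FOR_SEARCH_TYPE_' . strtoupper($type));
}
```

With these defines in place, example_search_path_for(SEARCH_TYPE_GLOSSARY) resolves to the value of PATH_FOR_SEARCH_TYPE_GLOSSARY.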


The constructor of the SearchDocument class has the following signature:

public function __construct(&$doc, &$data, $course_id, $group_id, $user_id, $path)

where:

  • &$doc is a reference to a PHP object that should provide the fields:
    • docid : the id of the document, suitable for reconstructing the access URL
    • documenttype : in general, the name of the module itself
    • itemtype : a subclassifier, if the module provides more than one kind of virtual document to the search engine
    • contextid : the id of the context object that should be considered when checking the accessor's capabilities
    • title : the title string that appears as a caption in search results
    • author : the user id representing the author, if the author is known
    • contents : the bulk text of the document content, stripped of any formatting attributes or tags
    • url : the document URL, constructed by the adapter to lead back to the resource
    • date : usually the date when the resource was created
  • &$data is a reference to a contextual metadata object that is serialized along with the record, but is not used as searchable content
  • $course_id is the id of the course the resource is in
  • $group_id is the id of the group the resource belongs to, if the resource is in a group scope (e.g. separate group wiki attachments), 0 otherwise
  • $user_id is the id of the user the resource belongs to, if the resource is in a user-specific scope (e.g. post or assignment attachments), 0 otherwise
  • $path is one of the above PATH defines for the module
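The mapping from a module record to these constructor arguments can be sketched as below. Everything except the constructor signature is an assumption made for the example: the SearchDocument class here is a stand-in that only stores its arguments, the "label" document type, the record fields, the 'standard' item type and the URL are illustrative, and the record is passed as a plain array so that the sketch runs without Moodle.

```php
<?php
// Stand-in for the real SearchDocument base class, reduced to storing its
// arguments. ASSUMPTION: only the constructor signature comes from the text.
class SearchDocument {
    public $doc, $data, $course_id, $group_id, $user_id, $path;

    public function __construct(&$doc, &$data, $course_id, $group_id, $user_id, $path) {
        $this->doc = $doc;
        $this->data = $data;
        $this->course_id = $course_id;
        $this->group_id = $group_id;
        $this->user_id = $user_id;
        $this->path = $path;
    }
}

define('SEARCH_TYPE_LABEL', 'label');
define('PATH_FOR_SEARCH_TYPE_LABEL', 'mod/label');

// Hypothetical subclass: only the constructor is written, and it maps one
// input record to the document definition expected by the parent class.
class LabelSearchDocument extends SearchDocument {
    public function __construct(&$label, $context_id) {
        $doc = new stdClass();
        $doc->docid = $label['id'];
        $doc->documenttype = SEARCH_TYPE_LABEL;
        $doc->itemtype = 'standard';                    // assumed single item type
        $doc->contextid = $context_id;
        $doc->title = strip_tags($label['name']);
        $doc->author = '';                              // author not known for this record
        $doc->contents = strip_tags($label['content']); // searchable text without tags
        $doc->url = 'http://moodle.example/mod/label/view.php?id=' . $label['id']; // made-up URL
        $doc->date = $label['timemodified'];

        // Metadata stored with the record but not searched.
        $data = new stdClass();
        $data->label = $label['id'];

        // No group or user scope in this sketch, hence the two zeros.
        parent::__construct($doc, $data, $label['course'], 0, 0, PATH_FOR_SEARCH_TYPE_LABEL);
    }
}
```

A call such as new LabelSearchDocument($record, $context_id) then yields a document whose contents field holds the tag-free text and whose path is the module's PATH define.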

Providing The indexer Documents From The Module

When first constructing the index, the indexer needs to scan all the instances of the plugin.

The adapter API must provide the

<module>_iterator(){ ... }

function, which returns a consistent set of plugin instances. Here is a standard template for this function:

function <module>_iterator() {
    $<module> = get_records('<module>');
    return $<module>;
} //<module>_iterator

For each instance, the function:

function <module>_get_content_for_index(&$plugininstance) { ... }

is called to construct the relevant instances of the SearchDocument subclass. This function MUST return an array of SearchDocuments or false. A typical outline of this function is:

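function <module>_get_content_for_index(&$plugininstance) {
    $documents = array();

    // invalid plugin: nothing to index
    if (!$plugininstance) return $documents;

    // TODO : get an indexable item set
    $indexableitems = array();

    foreach ($indexableitems as $indexableitem) {

        // TODO : Prepare params with 

        // one SearchDocument subclass instance per indexable item
        // (ForumSearchDocument is the forum module's subclass, used here as an example)
        $documents[] = new ForumSearchDocument(... params ...);
    }
    return $documents;
} //<module>_get_content_for_index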
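How the two callbacks fit together can be sketched with a hypothetical module called "example". The assumptions are stated in the comments: the iterator returns a hard-coded array instead of calling get_records(), the SearchDocument class is a two-field stand-in for the real one, and example_collect_all() is only a guess at what the calling side of the indexer does.

```php
<?php
// Minimal stand-in for the real SearchDocument class (hypothetical, two fields only).
class SearchDocument {
    public $docid;
    public $contents;

    public function __construct($docid, $contents) {
        $this->docid = $docid;
        $this->contents = $contents;
    }
}

// Returns every instance of the hypothetical "example" module.
// ASSUMPTION: a hard-coded array replaces the database query.
function example_iterator() {
    return array(
        1 => (object) array('id' => 1, 'content' => 'Read chapter one.'),
        2 => (object) array('id' => 2, 'content' => 'Read chapter two.'),
    );
}

// Builds the virtual documents for one instance; here one document per instance.
function example_get_content_for_index(&$instance) {
    $documents = array();
    if (!$instance) return $documents;
    $documents[] = new SearchDocument($instance->id, $instance->content);
    return $documents;
}

// Sketch of the calling side: walk all instances and gather their documents.
function example_collect_all() {
    $all = array();
    foreach (example_iterator() as $instance) {
        $documents = example_get_content_for_index($instance);
        if ($documents === false) continue; // the callback may return false
        foreach ($documents as $document) {
            $all[] = $document;
        }
    }
    return $all;
}
```

Calling example_collect_all() gathers one document per instance; a module with several item types would simply append more than one document per instance in its get_content_for_index function.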

Making The Backaccess Link

The constructor of the SearchDocument subclass must construct a backaccess link for the document and pass it as the 'url' attribute of the first constructor parameter (&$doc). This is usually done using a callback to the document API. The synopsis is:

function <module>_make_link(...contextual params...) {
    global $CFG;
    return $CFG->wwwroot.<moodle path expression that drives back to the content>;
} //<module>_make_link

Contextual params are usually ids of the course module or of internal entities, depending on how the module is built, along with modal values, etc.
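A minimal sketch of such a link builder follows, assuming a hypothetical module called "example" whose content is reached through view.php with a course module id. The wwwroot value, the module name and the path are all made up for the illustration, and $CFG is replaced by a stand-in object so that the code runs outside Moodle.

```php
<?php
// Stand-in for Moodle's global configuration object; the URL is a made-up value.
$cfg = new stdClass();
$cfg->wwwroot = 'http://moodle.example';
$GLOBALS['CFG'] = $cfg;

// Hypothetical link builder: the path '/mod/example/view.php?id=' is an assumption.
function example_make_link($cm_id) {
    global $CFG;
    // Cast to int so that only a numeric id ends up in the URL.
    return $CFG->wwwroot . '/mod/example/view.php?id=' . (int) $cm_id;
}
```

The subclass constructor would call example_make_link() with the ids it has at hand and store the result in the url field of $doc.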

Checking Access Back To The Content

Physical Document Adapters