Search engine adapters

Revision as of 21:49, 17 November 2007

Moodle's global search engine, available as an experimental feature until the Moodle 1.9 release, allows new document types to be plugged in so that they can be indexed and searched by the Lucene indexer.

Each module or block should have an adapter written to wrap the plugin's internal data model into searchable documents. The current implementation allows a module to provide the search engine with a set of virtual documents. The search engine indexes the text content of these documents and records enough data to get back to the exact context in which the content appeared (e.g. access URL, course context).

Virtual documents are defined as subclasses of the SearchDocument class. Only the constructor of the subclass needs to be written; it maps an input record to an internal document definition.

The goals of the adapter are to:

  • extract all virtual documents from the module data model and hand them to the indexer through an iterator (initial construction of the index)
  • extract a single document for an index update
  • provide sufficient information to delete obsolete content from the index
  • define the access URL that leads back to the resource
  • define the access check algorithm, so that users can only reach resources they are allowed to access

Adapters For Modules And Blocks

Any adapter for a module or a block must reside in the
/search/documents
subdirectory, and must be named following the pattern
<module>_document.php

Search defines

Each searchable module should add at least the following two (typical) defines to /search/lib.php:

define('SEARCH_TYPE_<MODULE>', '<module>');
define('PATH_FOR_SEARCH_TYPE_<MODULE>', 'mod/<module>');

or, for a block:

define('SEARCH_TYPE_<BLOCK>', '<block>');
define('PATH_FOR_SEARCH_TYPE_<BLOCK>', 'blocks/<block>');
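As a concrete illustration, here is a minimal sketch of what the two defines could look like for an activity module. The module name `mymodule` and the helper `search_adapter_file()` are hypothetical and only serve to show how the defines relate to the adapter file naming rule described above; they are not part of the search engine API.

```php
<?php
// 'mymodule' is a hypothetical module name used only for illustration.
define('SEARCH_TYPE_MYMODULE', 'mymodule');
define('PATH_FOR_SEARCH_TYPE_MYMODULE', 'mod/mymodule');

// Hypothetical helper (not a Moodle function): builds the adapter file
// location from the "<module>_document.php" naming rule.
function search_adapter_file($type) {
    return '/search/documents/' . $type . '_document.php';
}

// The adapter for 'mymodule' would then be expected at this location.
$adapterfile = search_adapter_file(SEARCH_TYPE_MYMODULE);
```

With these values, `$adapterfile` holds the path of the adapter inside the /search/documents subdirectory, while the PATH define records where the module itself lives.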

Constructor

The constructor of the SearchDocument class has the following signature:

public function __construct(&$doc, &$data, $course_id, $group_id, $user_id, $path)

where:

  • &$doc is a reference to a PHP object that should provide the following fields:
    • docid: the id of the document, suitable for reconstructing the access URL
    • documenttype: in general, the name of the module itself
    • itemtype: a subclassifier, used if the module provides more than one kind of virtual document to the search engine
    • contextid: the id of the context object to consider when checking the accessing user's capabilities
    • title: the title string that appears as a caption in search results
    • author: if the author is known, the user id (mdl_user.id) of the author
    • contents: the bulk text of the document content, stripped of any formatting attributes or tags
    • url: the document URL, constructed by the adapter to lead back to the resource
    • date: usually the date when the resource was created
  • &$data is a reference to a contextual metadata object that is serialized along with the record, but is not used as searchable content
  • $course_id is the id of the course the resource is in
  • $group_id is the id of the group the resource belongs to if the resource is in a group scope (e.g. separate-group wiki attachments), 0 otherwise
  • $user_id is the id of the user the resource belongs to if the resource is in a user-specific scope (e.g. post or assignment attachments), 0 otherwise
  • $path is one of the above PATH defines for the module
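The parameter list above can be sketched as a subclass constructor. This is a minimal illustration under stated assumptions, not the code of any shipped adapter: the `SearchDocument` class below is a simplified stand-in so the example runs outside Moodle, and the module name `mymodule`, the record fields (`id`, `name`, `intro`, `timemodified`), the subclass constructor arguments and the URL pattern are all hypothetical.

```php
<?php
// Simplified stand-in for the real SearchDocument class; only the
// constructor signature is taken from the text above.
class SearchDocument {
    public $doc, $data, $course_id, $group_id, $user_id, $path;

    public function __construct(&$doc, &$data, $course_id, $group_id, $user_id, $path) {
        $this->doc = $doc;
        $this->data = $data;
        $this->course_id = $course_id;
        $this->group_id = $group_id;
        $this->user_id = $user_id;
        $this->path = $path;
    }
}

// Hypothetical defines for a module named 'mymodule'.
define('SEARCH_TYPE_MYMODULE', 'mymodule');
define('PATH_FOR_SEARCH_TYPE_MYMODULE', 'mod/mymodule');

class MyModuleSearchDocument extends SearchDocument {
    // $record is a database row object; its field names are assumptions.
    public function __construct(&$record, $course_id, $context_id) {
        // Fields the search engine indexes or displays.
        $doc = new stdClass();
        $doc->docid = $record->id;
        $doc->documenttype = SEARCH_TYPE_MYMODULE;
        $doc->itemtype = 'standard';
        $doc->contextid = $context_id;
        $doc->title = $record->name;
        $doc->author = '';
        $doc->contents = strip_tags($record->intro); // drop formatting tags
        $doc->url = '/mod/mymodule/view.php?id=' . $record->id; // hypothetical URL
        $doc->date = $record->timemodified;

        // Metadata stored with the record but not searched.
        $data = new stdClass();
        $data->mymodule = $record->id;

        // No group or user scope in this sketch, hence the two zeros.
        parent::__construct($doc, $data, $course_id, 0, 0, PATH_FOR_SEARCH_TYPE_MYMODULE);
    }
}
```

The subclass only assembles the `$doc` and `$data` objects and delegates everything else to the parent constructor, which matches the statement that only the constructor needs to be written.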

Providing The Indexer With Documents From The Module

When first constructing the index, the indexer needs to scan all the instances of the plugin.

The adapter API must provide the

<module>_iterator() { ... }

function, which returns a consistent set of plugin instances. Here is a very standard template for this function:

function <module>_iterator() {
    $<module> = get_records('<module>');
    return $<module>;
} //<module>_iterator

For each instance, the function

function <module>_get_content_for_index(&$plugininstance) { ... }

is called to construct the relevant instances of the SearchDocument subclass.
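The following sketch shows how the two functions could fit together for a hypothetical module named `mymodule`. Several things here are assumptions made so that the example runs on its own: `SearchDocument` and `get_records()` are simplified stand-ins for the Moodle originals, the record fields are invented, the content function is assumed to return an array of documents, and `mymodule_collect_all_documents()` is only a hypothetical stand-in for the loop the indexer would run.

```php
<?php
// Stand-ins so the sketch runs outside Moodle; both are simplified assumptions.
class SearchDocument {
    public $doc, $course_id;

    public function __construct(&$doc, &$data, $course_id, $group_id, $user_id, $path) {
        $this->doc = $doc;
        $this->course_id = $course_id;
    }
}

function get_records($table) {
    // Fake table content, keyed by record id.
    $a = new stdClass();
    $a->id = 1; $a->course = 10; $a->name = 'First'; $a->intro = '<p>Alpha</p>';
    $b = new stdClass();
    $b->id = 2; $b->course = 11; $b->name = 'Second'; $b->intro = '<p>Beta</p>';
    return array(1 => $a, 2 => $b);
}

// Hypothetical SearchDocument subclass for 'mymodule'.
class MyModuleSearchDocument extends SearchDocument {
    public function __construct(&$instance) {
        $doc = new stdClass();
        $doc->docid = $instance->id;
        $doc->documenttype = 'mymodule';
        $doc->title = $instance->name;
        $doc->contents = strip_tags($instance->intro);
        $data = new stdClass();
        parent::__construct($doc, $data, $instance->course, 0, 0, 'mod/mymodule');
    }
}

// Returns every instance of the plugin, following the template above.
function mymodule_iterator() {
    $instances = get_records('mymodule');
    // Guard added for this sketch: fall back to an empty array if the
    // lookup does not return an array.
    return is_array($instances) ? $instances : array();
}

// Builds the documents for one instance (assumed here to be returned as an array).
function mymodule_get_content_for_index(&$instance) {
    $documents = array();
    $documents[] = new MyModuleSearchDocument($instance);
    return $documents;
}

// Hypothetical driver loop: iterate over instances, collect their documents.
function mymodule_collect_all_documents() {
    $all = array();
    foreach (mymodule_iterator() as $instance) {
        foreach (mymodule_get_content_for_index($instance) as $document) {
            $all[] = $document;
        }
    }
    return $all;
}
```

Returning an array from the content function leaves room for a module that produces several virtual documents per instance, for example one per attachment or per page.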

Checking Access Back To The Content

Physical Document Adapters