Global search (GSoC2013)

Revision as of 11:18, 12 June 2013

Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.

Global search
Project state: Community bonding period
Tracker issue: MDL-31989
Discussion: Writing Moodle's Global Search
Assignee: Prateek Sachan

GSOC '13

Introduction

Global Search will let users search for keywords across the entire Moodle site and all of its modules while keeping existing access restrictions intact.

  • It will display results ranked by relevance weighting.
  • Security will be preserved throughout the search.
  • Search modules will make it easy to integrate the chosen search engine. Admins will have the option of selecting which modules are made "searchable".
  • It will include keywords from other file types (such as PDFs, PPTs, HTML content and others).
  • Following are the features that I'm considering implementing in the first version of Global Search:
  1. Grouping of AND and OR, e.g. ("query1" AND "query2") OR "query3"
  2. Searching for phrases. Results that match the whole phrase will have higher priority and hence will be shown higher in the results.
  3. Wildcards (* and ?).
  4. Stemming, e.g. a search for "bag" will return results for both "bag" and "bags".
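
As a rough illustration of the four query features listed above, the following sketch writes each one in standard Lucene/Solr query syntax and sends it to a Solr select handler with curl. The endpoint URL is an assumption (the stock Solr 4.x example server with its default collection1 core), and the search terms are made up for illustration. Whether stemming applies depends on the analyzers configured for the searched field in schema.xml.

```shell
# Assumed endpoint: stock Solr 4.x example server, default core "collection1".
SOLR_SELECT_URL="${SOLR_SELECT_URL:-http://localhost:8983/solr/collection1/select}"

# Example query strings in Lucene/Solr syntax, one per planned feature.
q_grouped='("query1" AND "query2") OR "query3"'  # grouping of AND and OR
q_phrase='"global search"'                       # exact phrase match
q_wildcard='mood* te?t'                          # * = any run of characters, ? = one character
q_stemmed='bag'                                  # also matches "bags" only if the field type stems

# Send one query to Solr and print the JSON response.
# curl -G appends the URL-encoded parameters to the query string.
solr_search() {
  curl -sS -G "$SOLR_SELECT_URL" \
    --data-urlencode "q=$1" \
    --data-urlencode "wt=json"
}
```

With a Solr server running, a call such as solr_search "$q_grouped" would return the matching documents as JSON.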

Milestones

  • Writing cron jobs:
    • Addition of records.
    • Deletion of records. (Not a priority for now.)
    • Update of records.
  • Design the Solr schema and solrconfig files. These files will be kept in a separate directory under the Global Search directory. Users will have to copy these two files into the example directory of the Apache Solr download, which runs the Solr Jetty server. See Installation for more.
    • schema.xml defines the properties of the document fields that are indexed. Different modules may have different fields.
    • solrconfig.xml contains the configuration parameters for Solr.
  • Integrating Apache Tika to handle indexing from external files (PDFs, PPTX etc.)
  • Writing the search functions for querying. (Just a basic search UI to be used at this moment).
  • Adding proper security to the search results that are returned.
    • Use of Access API
  • Admin page for search configuration options. (The UI page will use the default layout currently in use, for example Site administration > Advanced features.)
    • Start initial indexing. (When the admin installs Global Search into Moodle.)
    • Resume indexing. (If a previous indexing was paused).
    • Deletion of entire index.
    • Re-indexing.
    • Deletion of the index for specific modules only. (For example, deleting only the index of records belonging to the 'forum' module.)
    • Configuration options for cron
      • Time of cron run.
  • Preparing for mid-term evaluation.
  • Re-designing the search page. (Taking ideas from the community and the forum discussion.)
  • Implementing the first prototype.
    • Asking the community developers to test it.
      • Taking their guidance to optimize the code wherever possible.
      • Getting feedback.
  • Running test cases and performance testing. (Performance testing makes sense at this point because the code will already have been optimized to some degree following the developers' guidance above.)
  • Debugging.
  • Finalizing the Global Search documentation.
    • Discussing with my mentors whether everything has been properly covered.
  • Submitting my code to Moodle and Google.
  • Suggested Pencils Down Period.
  • Performing edits to the documentation after feedback from the Moodle community.
  • Firm Pencils Down and Final Evaluation.

Installation

To use Global Search, users will have to install the PHP Solr PECL extension on the server.

The following is the procedure for installing the extension on UNIX.

The extension has two dependencies:

  • CURL extension (libcurl 7.15.0 or later is required)
  • LIBXML extension (libxml2 2.6.26 or later is required)
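
The minimum versions above can be checked from a terminal. The sketch below is one way to compare version strings, assuming GNU sort (for the -V option) is available; version_ge is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort, GNU coreutils): if the required
# version sorts first (or the two are equal), the requirement is met.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: the libcurl release built later on this page against the minimum.
if version_ge "7.19.6" "7.15.0"; then
  echo "libcurl version is sufficient"
fi
```

If the development packages are installed, the installed versions can usually be read with curl-config --version (prints a line such as "libcurl 7.19.6") and xml2-config --version, and then passed to the helper.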

Test whether the required extensions are installed by executing the following in a PHP file (remember to delete the file afterwards, as it exposes important information about your system):

phpinfo();

If the system does not have the required versions of the libcurl or libxml2 libraries, follow the steps given below. You will have to download the libraries and compile them from source into a separate install prefix.

  1. For libcurl:
    wget http://curl.haxx.se/download/curl-7.19.6.tar.gz
    tar -zxvf curl-7.19.6.tar.gz
    cd curl-7.19.6
    sudo ./configure --prefix=/root/custom/software
    sudo make
    sudo make install

  2. For libxml:
    wget ftp://xmlsoft.org/libxml2/libxml2-2.7.6.tar.gz
    tar -zxvf libxml2-2.7.6.tar.gz
    cd libxml2-2.7.6
    sudo ./configure --prefix=/root/custom/software
    sudo make
    sudo make install

After installing the above dependencies, you will need to restart your Apache server by executing sudo service apache2 restart

Next, install the PECL extension for Solr by cloning the following repository, which targets Solr 4.x. (Please note: currently, the official php-pecl-solr is not compatible with Solr 4.x. The following repository provides a small fix to make it compatible with Solr 4.x, and the fix will go into the official release.)

  • git clone https://github.com/lukaszkujawa/php-pecl-solr.git
  • cd php-pecl-solr/
  • phpize
    • This is a shell script used to prepare the build environment for a PHP extension to be compiled. If you don't have phpize, you can install it by executing sudo apt-get install php5-dev

If the libxml2 and libcurl libraries were compiled from source, you will have to pass their install prefix to the configure script through the CURL and LIBXML options, as shown below:

  • sudo ./configure --enable-solr --with-curl=/root/custom/software --with-libxml-dir=/root/custom/software
  • If your system already has suitable versions of the libraries, executing sudo ./configure is sufficient.
  • sudo make
  • sudo make install

The above procedure will compile the extension and install it in the directory set by extension_dir in the php.ini file. To enable the installed extension, follow either of the following two steps:

1. Navigate to the directory /etc/php5/conf.d and create a new solr.ini file with the following line:

extension=solr.so

OR

2. Open your php.ini file and include the following line:

extension=solr.so

You may follow either of the above two steps. After that, restart your Apache server by executing sudo service apache2 restart

You can now confirm that the solr extension is loaded by viewing the output of phpinfo(); in a browser or by running php -m in a terminal (Ctrl+Alt+T).
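
The php -m check can also be scripted. The following is a minimal sketch: missing_modules is a hypothetical helper that reads a module list (one name per line, as printed by php -m) on standard input and prints each required module that is absent.

```shell
# Print every module named in the arguments that does not appear
# in the module list read from standard input.
missing_modules() {
  loaded=$(cat)
  for name in "$@"; do
    # -x: whole-line match, -i: ignore case, -F: literal string, -q: quiet
    if ! printf '%s\n' "$loaded" | grep -qixF -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

For example, php -m | missing_modules curl libxml solr prints nothing when both dependencies and the solr extension are loaded by the command-line PHP. The command-line PHP may read a different php.ini from the one Apache uses, so the phpinfo(); page remains the more reliable check for the web server.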

After installing the php-pecl-solr extension, users will have to download Apache Solr, unzip it and keep it in a directory outside Moodle.

Users will have to replace solrconfig.xml and schema.xml inside the downloaded directory /example/solr/collection1/conf with the ones that Global Search will provide. Once the files have been copied and replaced, users will have to start the Java Jetty server, start.jar, located in the /example/ directory, by executing java -jar start.jar.
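
The copy step could be scripted along the following lines. This is only a sketch: install_solr_conf is a hypothetical helper, and both of its arguments (the directory holding the Global Search copies of the two files, and the root of the unzipped Solr download) are placeholders for wherever those end up on your system.

```shell
# Copy the Global Search schema.xml and solrconfig.xml into a Solr download,
# keeping a .orig backup of each stock file that gets replaced.
#   $1: directory containing the Global Search copies (placeholder)
#   $2: root of the unzipped Apache Solr download (placeholder)
install_solr_conf() {
  src_dir=$1
  conf_dir="$2/example/solr/collection1/conf"

  if [ ! -d "$conf_dir" ]; then
    echo "Solr conf directory not found: $conf_dir" >&2
    return 1
  fi

  for f in schema.xml solrconfig.xml; do
    if [ ! -f "$src_dir/$f" ]; then
      echo "Missing source file: $src_dir/$f" >&2
      return 1
    fi
    # Back up the stock file so the change can be undone.
    if [ -f "$conf_dir/$f" ]; then
      cp "$conf_dir/$f" "$conf_dir/$f.orig"
    fi
    cp "$src_dir/$f" "$conf_dir/$f"
  done
}
```

After a successful run, change into the example directory of the Solr download and start Jetty with java -jar start.jar as described above.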

Schedule

Design

See also