Scheduled Tasks Proposal

Revision as of 17:44, 9 December 2009

Moodle 2.0

Introduction

This proposal is meant both to provide a replacement for the Moodle cron job and to provide a means of scheduling one-off tasks to be run outside of the user's request lifecycle.

Terminology

  • *Subtask*: an individual piece of cron processing that should be run (equivalent to forum_cron now, or maybe even smaller)
  • *Moodle cron instance*: a cron.php process

Rationale

The Moodle cron job currently delegates all scheduling to each subtask that is run. For example, the forum cron is responsible for checking when it last ran and deciding whether or not it should run again. This sort of decision process should be centralised, and individual cron subtasks should be called by the central controller.
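
As a rough illustration of what "centralised" could mean here, the following sketch lets a single function decide which subtasks are due, so that no subtask has to inspect its own last run time. The function name `subtasks_due`, the array keys `lastrun` and `interval`, and the example values are all hypothetical and are not part of the proposal.

```php
<?php
// Hypothetical sketch: a central controller decides which subtasks are due.
// Names, keys and values are illustrative assumptions only.

/**
 * Return the names of the subtasks whose interval has elapsed.
 *
 * @param array $subtasks name => array('lastrun' => int, 'interval' => int seconds)
 * @param int   $timenow  current timestamp
 * @return array names of due subtasks, in the order given
 */
function subtasks_due(array $subtasks, $timenow) {
    $due = array();
    foreach ($subtasks as $name => $info) {
        if ($info['lastrun'] + $info['interval'] <= $timenow) {
            $due[] = $name;
        }
    }
    return $due;
}

$subtasks = array(
    'forum_cron' => array('lastrun' => 1000, 'interval' => 60),
    'chat_cron'  => array('lastrun' => 1000, 'interval' => 300),
);

// 100 seconds after the last run, only the one-minute subtask is due.
$due = subtasks_due($subtasks, 1100);
print_r($due);
```

With this shape, a subtask such as forum_cron would only do its work, and the decision about when to call it would live in one place.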

Additionally, there is no central locking of subtasks. At the moment, some subtasks that expect to take a long time to run implement their own locking (for example, statistics), but this is not centralised. Each Moodle cron instance runs to completion, no matter how long it takes, and processes tasks in the order in which they are programmed, regardless of whether any other Moodle cron instances are running and might be processing subtasks in parallel.
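
A minimal sketch of central, per-subtask locking might look like the following. It uses an in-memory array as a stand-in for a lock table with a unique key per subtask. The class name `sketch_lock_table` and its methods are hypothetical, and a real implementation would presumably rely on a unique database index so that the locks are shared between cron instances.

```php
<?php
// Hypothetical in-memory stand-in for a central lock table keyed by subtask
// name. This only illustrates the acquire/release contract.

class sketch_lock_table {
    /** @var array subtask name => time the lock was taken */
    private $locks = array();

    /** Try to take the lock; returns false if it is already held. */
    public function acquire($subtask, $timenow) {
        if (isset($this->locks[$subtask])) {
            return false;
        }
        $this->locks[$subtask] = $timenow;
        return true;
    }

    /** Release the lock so another cron instance can pick the subtask up. */
    public function release($subtask) {
        unset($this->locks[$subtask]);
    }
}

$locks  = new sketch_lock_table();
$first  = $locks->acquire('statistics', 1000); // Lock is free, so it is taken.
$second = $locks->acquire('statistics', 1010); // Already held, so this fails.
$other  = $locks->acquire('forum_cron', 1010); // Unrelated subtask, succeeds.
$locks->release('statistics');
$third  = $locks->acquire('statistics', 1020); // Free again after release.
```

Because the lock is per subtask rather than per cron run, a second cron instance could skip a locked subtask and carry on with the rest.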

Finally, we need to be able to run unrelated tasks in parallel so that the entire Moodle queue isn't held up by a single long-running job.

Goals

  • An admin screen for rearranging the schedule.
  • ...


Unresolved issues

  • Do we need to allow for scheduling different subtasks on different servers?
  • We need to find a way to separate subtasks by some logic so that subtasks that write to the same areas of the database never run at the same time. We could do this by getting each cron job to say which areas of Moodle it writes to, but this is problematic.
  • We also have to deal with the order of some subtasks. We could perhaps do this by introducing dependencies.
  • When the first cron in a long time is running, we should lock the entire cron and let it run to completion, because the order is really important then.
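
To make the second and third issues more concrete, here is one possible shape for the check, assuming each subtask declares the areas it writes to and the subtasks it depends on. The keys `writes` and `dependson`, the function `can_start`, and the example subtasks are hypothetical. The proposal itself notes that declaring write areas is problematic, so this is a thought experiment rather than a recommendation.

```php
<?php
// Hypothetical sketch: a subtask may start only if its dependencies have
// completed and no running subtask writes to the same area.

/**
 * @param array $candidate array('writes' => string[], 'dependson' => string[])
 * @param array $running   definitions of the subtasks currently running
 * @param array $completed names of subtasks already completed in this run
 * @return bool
 */
function can_start(array $candidate, array $running, array $completed) {
    // Every dependency must have finished first.
    foreach ($candidate['dependson'] as $dependency) {
        if (!in_array($dependency, $completed, true)) {
            return false;
        }
    }
    // No running subtask may write to the same area.
    foreach ($running as $other) {
        if (array_intersect($candidate['writes'], $other['writes'])) {
            return false;
        }
    }
    return true;
}

$gradehistory = array('writes' => array('grades'), 'dependson' => array());
$lockgrades   = array('writes' => array('grades'), 'dependson' => array());
$forummail    = array('writes' => array('forum'),  'dependson' => array());
$digest       = array('writes' => array('forum'),  'dependson' => array('forummail'));

var_dump(can_start($forummail, array($gradehistory), array()));  // bool(true): different areas
var_dump(can_start($lockgrades, array($gradehistory), array())); // bool(false): both write to grades
var_dump(can_start($digest, array(), array()));                  // bool(false): dependency not done
var_dump(can_start($digest, array(), array('forummail')));       // bool(true)
```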

Pseudo code proposal

This is just a rough idea to provoke thought: --Tim Hunt 11:06, 9 December 2009 (UTC)

<?php
function handle_cron($maxprocesstime) {

   $timestart = time();
   $runid = cron_record_starting_processing($timestart);
   while (($timenow = time()) < $timestart + $maxprocesstime) {
       $task = cron_get_next_task($timenow);
       if (!$task) {
           // Assumption: cron_get_next_task() returns null when nothing is due.
           break;
       }
       cron_acquire_lock($task);
       try {
           $next = $task->execute($timenow);
           cron_record_task_success($task, $next, $timenow);
       } catch (Exception $e) {
           cron_record_task_failure($task, $timenow, $e);
       }
       // Release the lock whether the task succeeded or failed.
       cron_release_lock($task);
   }
   cron_record_ending_processing($runid, $timenow);

}

/**
 * Instances of this class are rows in the scheduled_tasks table.
 */
abstract class scheduled_task {

   protected $id;
   protected $plugin; // = 'mod_quiz', 'qtype_multichoice'.
   protected $function; // Presumably the name of a function defined in the plugin's cron.php.
   protected $lastruntime;
   protected $nextscheduledtime;
   protected $priority; // LOW, MEDIUM, HIGH, or, a number like nice 19, or something.
   protected abstract function get_start_end_times($timenow);
   protected abstract function schedule_next($timenow);
   public function execute($timenow) {
       $file = get_plugin_dir($this->plugin) . '/cron.php';
       if (!is_readable($file)) {
           throw new cron_exception('required file does not exist.');
       }
       if (!include_once($file)) {
           throw new cron_exception('required file could not be included.');
       }
       // Ask the concrete task type which time window to process.
       list($start, $end) = $this->get_start_end_times($timenow);
       call_user_func($this->function, $start, $end);
       return $this->schedule_next($timenow);
   }

}

class one_off_task extends scheduled_task {

   // Process everything up to now, and never run again.
   protected function get_start_end_times($timenow) {
       return array(null, $timenow);
   }
   protected function schedule_next($timenow) {
       return null;
   }

}

class catchup_task extends scheduled_task {

   protected $desiredinterval; // seconds
   // Process everything since the last run, however long ago that was.
   protected function get_start_end_times($timenow) {
       return array($this->lastruntime, $timenow);
   }
   protected function schedule_next($timenow) {
       return $timenow + $this->desiredinterval;
   }

}

class every_day_task extends scheduled_task {

   protected $switchovertime; // Seconds after midnight.
   protected $runtime; // Seconds after midnight.
   // For now this code assumes $runtime > $switchovertime, but that is just me
   // being lazy. It can be fixed.
   protected function get_start_end_times($timenow) {
       $midnight = get_midnight_before($this->nextscheduledtime);
       $previousmidnight = get_previous_midnight($midnight);
       $starttime = $previousmidnight + $this->switchovertime;
       $endtime = $midnight + $this->switchovertime;
       return array($starttime, $endtime);
   }
   protected function schedule_next($timenow) {
       $midnight = get_midnight_before($this->nextscheduledtime);
       $nextmidnight = get_following_midnight($midnight);
       return $nextmidnight + $this->runtime;
   }

}

class weekly_task extends scheduled_task {

   // ...

}

/* Approximate database tables.

scheduled_tasks

   id AUTOINCREMENT
   type varchar
   plugin varchar
   function varchar UNIQUE
   lastruntime int 
   nextscheduledtime int
   priority int
   custom1 int
   custom2 int 
   customextra text

scheduled_task_locks

   id int auto
   function fk unique
   locktime int
*/
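
To check the arithmetic in catchup_task and every_day_task, the following stand-alone sketch reproduces both calculations as plain functions and runs them on concrete numbers. The `sketch_` names are hypothetical, and midnights are computed in UTC purely for simplicity. The pseudocode above leaves helpers such as get_midnight_before() undefined, so the timezone handling here is an assumption.

```php
<?php
// Hypothetical stand-alone versions of the scheduling arithmetic above.
// Assumption: midnights are UTC; real code would use the site timezone.

define('SKETCH_DAYSECS', 86400);

/** Midnight (UTC) at or before $time. */
function sketch_midnight_before($time) {
    return $time - ($time % SKETCH_DAYSECS);
}

/** catchup_task: process everything since the last run, then wait $interval. */
function sketch_catchup($lastruntime, $timenow, $interval) {
    return array(
        'window' => array($lastruntime, $timenow),
        'next'   => $timenow + $interval,
    );
}

/** every_day_task: process one switchover-to-switchover day, run again tomorrow. */
function sketch_every_day($nextscheduledtime, $switchovertime, $runtime) {
    $midnight = sketch_midnight_before($nextscheduledtime);
    $previousmidnight = $midnight - SKETCH_DAYSECS;
    return array(
        'window' => array($previousmidnight + $switchovertime, $midnight + $switchovertime),
        'next'   => $midnight + SKETCH_DAYSECS + $runtime,
    );
}

// A catch-up task last run at t=1000, running now at t=1300, every 5 minutes.
$catchup = sketch_catchup(1000, 1300, 300);

// A daily task scheduled for 03:00 on day 10, with the day switching over at 01:00.
$daily = sketch_every_day(10 * SKETCH_DAYSECS + 10800, 3600, 10800);

print_r($catchup);
print_r($daily);
```

In the daily example, the window runs from 01:00 on day 9 to 01:00 on day 10, and the next run lands at 03:00 on day 11. A catch-up task, by contrast, always covers the full gap since its last run, which is why it suits work such as mailing forum posts.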

Audit of current cron

| Main section | Subtask | Frequency | Notes |
| | session_gc | every run | |
| mod/assignment | plugins (none) | every minute | |
| mod/assignment | message submissions | every minute | checks last run time |
| mod/chat | update chat times | every five minutes | |
| mod/chat | update_events | every five minutes | |
| mod/chat | delete old chat_users and add quits | every five minutes | |
| mod/chat | delete old messages | every five minutes | |
| mod/data | | every minute | no _cron function (includes file unnecessarily) |
| mod/forum | mail posts | every minute | checks last run time |
| mod/forum | digest processing | every minute | |
| mod/forum | delete old read tracking | every minute | |
| mod/scorm | reparse all scorms | every five minutes | does hourly checking |
| mod/wiki | delete expired locks | every hour | |
| blocks/rss_client | update feeds | every five minutes | |
| quiz/report/statistics | delete old statistics | every run | |
| admin/reports | none | every run | |
| | language_cache | every run | |
| | remove expired enrolments | every run | |
| main gradebook | lock pending grades (*2) | every run | |
| main gradebook | clean old grade history | every run | has a TODO to not process as often |
| | event queue | every run | potentially large |
| portfolio cron | clean expired exports | every run | potentially large |
| | longtimenosee | 20% | |
| | deleteunconfirmedusers | 20% | |
| | deleteincompleteusers | 20% | |
| | deleteoldlogs | 20% | |
| | deletefiltercache | 20% | |
| | notifyloginfailures | 20% | |
| | metacourse syncing | 20% | |
| | createpasswordemails | 20% | |
| | tag cron | 20% | |
| | clean contexts | 20% | |
| | gc_cache_flags | 20% | |
| | build_context_path | 20% | |
| | scheduled backups | daily (admin defined) | |
| | make rss feeds | every run | |
| auth/mnet | keepalives | every run | |
| auth/mnet | delete old sessions | every run | |
| auth/ldap | sync users | custom | not scheduled (external cronjob) |
| auth/cas | sync users | custom | not scheduled (external cronjob) |
| auth/db | sync users | custom | not scheduled (external cronjob) |
| enrol/authorize | clears old data | daily (admin defined) | |
| enrol/authorize | notifies administrators of old data | daily (admin defined) | |
| enrol/authorize | process orders & email teachers | every run | |
| enrol/flatfile | read file and sync users | every run | !?! |
| enrol/imsenterprise | read file and sync users | every run | !?!?! |
| enrol/manual | notify people of pending unenrolments | daily | |
| | statistics | daily (admin defined) | huge |
| grade/import | none | every run | |
| grade/export | none | every run | |
| grade/reports | none | every run | |
| | fetch blog entries | every run | |
| | file gc | (optional) daily, else every run? | |
| | local cron | | |
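
If the audited subtasks were moved onto a catchup_task-style schedule, the fixed frequencies above would need to be expressed as intervals in seconds. The helper below is a hypothetical sketch of that translation. It deliberately returns null for 'every run', '20%' and 'custom', since the audit gives no fixed interval for those.

```php
<?php
// Hypothetical helper: translate the audit's frequency labels into a desired
// interval in seconds. Labels without a fixed interval are left undecided.

function sketch_frequency_to_interval($frequency) {
    $intervals = array(
        'every minute'       => 60,
        'every five minutes' => 300,
        'every hour'         => 3600,
        'daily'              => 86400,
    );
    // 'every run', '20%' and 'custom' have no fixed interval in the audit.
    return isset($intervals[$frequency]) ? $intervals[$frequency] : null;
}

var_dump(sketch_frequency_to_interval('every five minutes')); // int(300)
var_dump(sketch_frequency_to_interval('20%'));                // NULL
```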

Tasks