Note:

If you want to create a new page for developers, you should create it on the Moodle Developer Resource site.

Logging 2

From MoodleDocs

Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.

Logging stage 2
Project state In very early specification
Tracker issue MDL-37658
Discussion
Assignee moodle.com DEV team


Why is Logging an important topic?

Logging is fundamental to research, reporting and analytics.

  • Not much of new Moodle development is based on measured data. We need to
    • Prove there is a problem
    • Prove that a given change is an improvement.
  • Data comes from logging
  • Better data would allow researchers to study what happens in online teaching.
  • Better data would give admins more feedback on how to run their site (both technically and process)
  • Better data would give teachers better feedback to improve their teaching process.
  • Better data would give students better feedback to improve the learning process.
  • We can't predict all the analysis that might happen in future.


Examples of possible developments that would require better logs

Here are some random ideas, please add more!

  • Timely notifications of critical events (defined by users), eg more than five people posted in the same forum within an hour.
  • "Red circle" notification badge counter icons everywhere in Moodle highlighting things that need attention for each user.
  • Colour heatmaps overlaid on course page showing recent usage patterns on a course
  • Visualisation of live activity across a whole site (for admins)
  • Live engagement analytics that take every little action into account
  • Identification of students at risk based on complete history

Problems

  • Logging is added adhoc by developers, has spotty coverage. No admin logging!
  • Log tables are joined in many popular queries and are slow.
  • No archiving, so have to delete old logs
  • Deletion is wholesale, no control.
  • No synchronization with server logs to help debugging performance issues
  • No storage of traces when there are problems, so we can debug Moodle issues

Users of log table at the moment

A lot of the time-critical uses of the log table can be moved to new tables specifically for this purpose. It will speed things up a lot.

  • recent activity (block, myMoodle) EVENTS -> NEW TABLE
  • conditional access EVENTS -> NEW TABLE
  • failed logins EVENTS -> NEW TABLE
  • files via clamav EVENTS -> NEW TABLE

Other things can can continue to query the main logs:

  • course activity report, participation report, user outline report
  • stats
  • cron
  • automated backup locking/heartbeats (probably)
  • auth/mnet/auth.php
  • backup/restore (target logs and own logging)
  • grade history (currently separate log)

And some future things that will need the full log:

  • Engagement reporting

Places where we need more logs

  • Errors and warnings
    • People report errors, but cannot duplicate it.
    • Logging history with stack trace would be very useful
    • Also dump session variables and all other data that can be useful for debugging
  • Admin activities
  • Micro activities in a page (more than one per load)
  • AJAX calls (eg on course page)
  • Should be careful about logging confidential information (e.g. do not log any POST data)
  • Shouldn’t delete logs when course is deleted
  • All events should all be logged
    • Add more events (they are cheap and very useful)
    • Need more standardisation of event data for all triggered events
    • Add a catch-all handler that pushes events to the logging system
  • --> Action logging improvements META - MDL-28443

Existing log reading analysis

Files that make use of get_logs*

There are core functions (get_logs, get_logs_usercourse and get_logs_userday) but other classes also contain functions with the same name which are not related to the mdl_log table. Will we will be replacing these?

Locations using or declaring the functions get_logs*

backup/util/structure/backup_nested_element.class.php

The backup process contains it's own get_logs function that does not obtain data from the Moodle log table, but it's own backup_logs table. The structure is as follows -

id bigint backupid character varying(32) loglevel smallint message character varying(255) timecreated bigint

A few things to consider. How will this integrate into the new system with the backupid? Which field would that be placed in? It also has a loglevel, is that going to be a universal log level or unique to backup? If so, which fields to store in? Are we just going to leave it as it is?

course/lib.php

This is called for the function build_logs_array which is only ever called when the user has requested to print the logs for the course via the log report (report/log). So, is not called all the time (fortunately).

lib/datalib.php

This is where the core log functions are created, but are not used here. The functions get_logs_usercourse and get_logs_userday are only ever called in the report/log/graphs.php.

mod/workshop/allocation/scheduled/lib.php

This is a stdClass unique to the workshop that is json_encoded before being saved in the database field resultlog in the table 'workshopallocation_scheduled'.

The log stdClass is as follows. $log = new stdClass(); $log->message = $message; $log->type = $type; $log->indent = $indent;

report/log/graph.php

Self explanatory. This is a report based on the log records. Could potentially execute large queries but at least it is only performed when requested by user.

Files that contain usages of {log} in database queries

auth/mnet/auth.php

$mnethostlogssql = " SELECT

   mhostlogs.remoteid, mhostlogs.time, mhostlogs.userid, mhostlogs.ip,
   mhostlogs.course, mhostlogs.module, mhostlogs.cmid, mhostlogs.action,
   mhostlogs.url, mhostlogs.info, mhostlogs.username, c.fullname as coursename,
   c.modinfo

FROM

   (
        SELECT
            l.id as remoteid, l.time, l.userid, l.ip, l.course, l.module, l.cmid,
            l.action, l.url, l.info, u.username
        FROM
             {user} u
             INNER JOIN {log} l on l.userid = u.id
        WHERE
             u.mnethostid = ?
             AND l.id > ?
        ORDER BY remoteid ASC
        LIMIT 500
   ) mhostlogs
   INNER JOIN {course} c on c.id = mhostlogs.course

ORDER by mhostlogs.remoteid ASC";

Function keepalive_client: A heavy SQL query is performed that joins the log table to check whether the user who logged in via mnet is still active.

lib/statslib.php

$sql = "INSERT INTO {stats_user_weekly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
          FROM (
                 SELECT $nextstartweek AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                   FROM {log} l
                 WHERE action = 'login' AND $logtimesql
               ) inline_view
        GROUP BY timeend, courseid, userid
          HAVING COUNT(statsreads) > 0";

$DB->execute($sql);

Function stats_cron_weekly: Used to gather weekly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_weekly table.

$sql = "INSERT INTO {stats_user_monthly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
         FROM (
                  SELECT $nextstartmonth AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                    FROM {log} l
                   WHERE action = 'login' AND $logtimesql
              ) inline_view
     GROUP BY timeend, courseid, userid";

$DB->execute($sql);

Function stats_cron_monthly: Used to gather monthly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_monthly table.

if ($firstlog = $DB->get_field_sql('SELECT MIN(time) FROM {log}')) {

   return $firstlog;

}

Function stats_get_start_from: Returns starting date of the statistics. If there are currently no stats the variable $CFG->statsfirstrun is checked, and if it is equal to 'all', then a simple database query is performed on the log table to retrieve the lowest time recorded.

$sql = 'INSERT INTO {temp_log1} (userid, course, action)

       SELECT userid, course, action FROM {log}
        WHERE time >= ? AND time < ?';

$DB->execute($sql, array($timestart, $timeend));

Function stats_temp_table_fill: Fills the temporary stats tables (temp_log1 and temp_log2) with new data by performing a single select query on the log table to retrieve data from a given time range.

lib/datalib.php

These are where the functions mentioned in the section above are used as they perform the actual query on the log table.

  1. Function get_logs.
  2. Function get_logs_usercourse.
  3. Function get_logs_userday.

lib/deprecatedlib.php

$sql = "SELECT u.id, u.firstname, u.lastname, MAX(l.time)

         FROM {user} u, {role_assignments} ra, {log} l
        WHERE l.time > ?
              AND l.course = ?
              AND l.module = 'course'
              AND l.action = 'enrol'
              AND ".$DB->sql_cast_char2int('l.info')." = u.id
              AND u.id = ra.userid
              AND ra.contextid ".get_related_contexts_string($context)."
     GROUP BY u.id, u.firstname, u.lastname
     ORDER BY MAX(l.time) ASC";

$params = array($timestart, $courseid); return $DB->get_records_sql($sql, $params);

Function get_recent_enrolments: Performs a query with the user, role_assignments and log table to return recent enrolments.

lib/cronlib.php

$sql = "SELECT info, count(*)

         FROM {log}
        WHERE module = 'login' AND action = 'error'
              AND time > ?
     GROUP BY ip
       HAVING count(*) >= ?";

Function notify_login_failures: The above query is performed twice except the group by variable changes.

  1. Gets all the IPs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.
  2. Get all the INFOs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.

The following is then performed -

$sql = "SELECT * FROM (

       SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.ip = cf.name
        LEFT JOIN {user} u         ON l.userid = u.id
            WHERE l.module = 'login' AND l.action = 'error'
                  AND l.time > ?
                  AND cf.flagtype = 'login_failure_by_ip'
       UNION ALL
           SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.info = cf.name
        LEFT JOIN {user} u         ON l.userid = u.id
            WHERE l.module = 'login' AND l.action = 'error'
                  AND l.time > ?
                  AND cf.flagtype = 'login_failure_by_info') t
       ORDER BY t.time DESC";
   $params = array($CFG->lastnotifyfailure, $CFG->lastnotifyfailure);

This selects all the login errors logged belonging to the IPs and infos since lastnotifyfailure that are stored in the cache_flags table.

report/outline/index.php

if (!$logstart = $DB->get_field_sql("SELECT MIN(time) FROM {log}")) {

   print_error('logfilenotavailable');

}

A simple query to return the minimum time in the log table - used to make sure there are logs to report.

$sql = "SELECT cm.id, COUNT('x') AS numviews, MAX(time) AS lasttime

         FROM {course_modules} cm
              JOIN {modules} m ON m.id = cm.module
              JOIN {log} l     ON l.cmid = cm.id
        WHERE cm.course = ? AND l.action LIKE 'view%' AND m.visible = 1
     GROUP BY cm.id";

$views = $DB->get_records_sql($sql, array($course->id));

Returns the number of views a module has had.

report/participation/index.php

$minlog = $DB->get_field_sql('SELECT min(time) FROM {log} WHERE course = ?', array($course->id));

A simple query to return the minimum time in the log table - used in later query.

$sql = "SELECT ra.userid, u.firstname, u.lastname, u.idnumber, l.actioncount AS count

       FROM (SELECT * FROM {role_assignments} WHERE contextid $relatedcontexts AND roleid = :roleid ) ra
       JOIN {user} u ON u.id = ra.userid
       LEFT JOIN (
           SELECT userid, COUNT(action) AS actioncount FROM {log} WHERE cmid = :instanceid AND time > :timefrom AND $actionsql GROUP BY userid
       ) l ON (l.userid = ra.userid)";

A more complex query used to return the number of times a user has either posted or viewed a module.

Files that make use of $DB functions to query the logs table directly

I am ignoring the usages of log_display - based on what Petr said I assume we can just get rid of this completely

admin/handlevirus.php

$log = $DB->get_record("log", array("module"=>"upload", "info"=>$file, "action"=>"upload")

A simple check to see if the file was recorded as being uploaded.

backup/util/helper/backup_cron_helper.class.php

$logexists = $DB->record_exists_select('log', $sqlwhere, $params)

Function run_automated_backup: This query is performed twice in this function. It is used to check the log if there were any modifications to the course content.

blocks/recent_activity/block_recent_activity.php

$logs = $DB->get_records_select('log',

                               "time > ? AND course = ? AND
                                module = 'course' AND
                                (action = 'add mod' OR action = 'update mod' OR action = 'delete mod')",
                                array($timestart, $course->id), "id ASC");

Function get_structural_changes: Returns list of recent changes in the course structure. Used only within this block.

course/lib.php

$params = array('userid'=>$USER->id, 'url'=>"view.php?id=$courseid", 'since'=>$since); $select = "module = 'course' AND action = 'new' AND userid = :userid AND url = :url AND time > :since";

return $DB->record_exists_select('log', $select, $params);

Function can_delete_course: If the user does not have the capability moodle/course:delete but has the capability moodle/course:create. The above checks when the user created the course, and if it was less than a day ago they can delete it.

lib/datalib.php

if (is_siteadmin()) {

   if ($count->attempts = $DB->count_records_select('log', $select, $params)) {
       $count->accounts = $DB->count_records_select('log', $select, $params, 'COUNT(DISTINCT info)');
       return $count;
   }

} else if ($mode == 'everybody') {

   if ($count->attempts = $DB->count_records_select('log', "$select AND info = :username", $params)) {
       return $count;
   }

}

Function count_login_failures: Is used by renderers (both lib/outputrenderers.php and mymobile/renderers.php).

lib/deprecatedlib.php

$logs = $DB->get_records_select('log', "time > ? AND course = ? AND

                                       module = 'course' AND
                                       (action = 'add mod' OR action = 'update mod' OR action = 'delete mod')",
                                       array($timestart, $course->id), "id ASC");

Function print_recent_activity - self explanatory.

lib/uploadlib.php

if (!$record = $DB->get_record('log', array('info'=>$oldpath, 'module'=>'upload'))) {

   return false;

}

Function clam_change_log: This function does not seem to be used anywhere within Moodle.

mod/book/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'book',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function book_user_outline.

mod/folder/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'folder',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function folder_user_outline.
  2. Used in the function folder_user_complete.

mod/imscp/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'imscp',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function imscp_user_outline.
  2. Used in the function imscp_user_complete.

mod/page/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'page',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function page_user_outline.
  2. Used in the function page_user_complete.

mod/resource/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'resource',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function resource_user_outline.
  2. Used in the function resource_user_complete.

mod/url/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'url',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function url_user_outline.
  2. Used in the function url_user_complete.

Solution summary

We can't actually predict all the use cases so we must think generically and plan for worst cases.

  • Things that don't need to use logs should not use logs. (Recent activity, etc)
  • Make logging calls as cheap as possible
  • Use Events API for the originating calls to create lots of hooks for other things
  • Use MUC-like plug-ins to determine where to put logs permanently (NoSQL, SAS, file...). Default will be to the current log table.
  • Log everything we possibly can think of.
  • Define log level settings (on each logging call) so different sites can choose what they want logged and so control size, speed etc
  • add_to_log retained for backward compatibility, make it a wrapper for new logger function.
  • provide hooks to expose logs everywhere. For example, each page should have a link that shows logs and stats for that page.

Moodle -> Logs

Logs -> Moodle

  • Each plugin implements some methods to query and extract logs back to Moodle
  • Do we auto-sync anything back to Moodle tables for convenience?

See also

Examples of data-based learning technology development: