Note:

If you want to create a new page for developers, you should create it on the Moodle Developer Resource site.

Logging 2: Difference between revisions

From MoodleDocs
No edit summary
No edit summary
Line 80: Line 80:
* --> Action logging improvements META - MDL-28443
* --> Action logging improvements META - MDL-28443


= Existing log reading analysis =
== Existing log reading analysis ==


== Files that make use of get_logs* ==
=== Files that make use of get_logs* ===


There are core functions (get_logs, get_logs_usercourse and get_logs_userday) but other classes also contain functions with the same name which are not related to the mdl_log table. Will we will be replacing these?
There are core functions (get_logs, get_logs_usercourse and get_logs_userday) but other classes also contain functions with the same name which are not related to the mdl_log table. Will we will be replacing these?
Line 88: Line 88:
Locations using or declaring the functions get_logs*
Locations using or declaring the functions get_logs*


=== backup/util/structure/backup_nested_element.class.php ===
==== backup/util/structure/backup_nested_element.class.php ====


The backup process contains it's own get_logs function that does not obtain data from the Moodle log table, but it's own backup_logs table. The structure is as follows -
The backup process contains it's own get_logs function that does not obtain data from the Moodle log table, but it's own backup_logs table. The structure is as follows -
Line 102: Line 102:
A few things to consider. How will this integrate into the new system with the backupid? Which field would that be placed in? It also has a loglevel, is that going to be a universal log level or unique to backup? If so, which fields to store in? Are we just going to leave it as it is?
A few things to consider. How will this integrate into the new system with the backupid? Which field would that be placed in? It also has a loglevel, is that going to be a universal log level or unique to backup? If so, which fields to store in? Are we just going to leave it as it is?


=== course/lib.php ===
==== course/lib.php ====


This is called for the function build_logs_array which is only ever called when the user has requested to print the logs for the course via the log report (report/log). So, is not called all the time (fortunately).
This is called for the function build_logs_array which is only ever called when the user has requested to print the logs for the course via the log report (report/log). So, is not called all the time (fortunately).


=== lib/datalib.php ===
==== lib/datalib.php ====


This is where the core log functions are created, but are not used here. The functions get_logs_usercourse and get_logs_userday are only ever called in the report/log/graphs.php.
This is where the core log functions are created, but are not used here. The functions get_logs_usercourse and get_logs_userday are only ever called in the report/log/graphs.php.


=== mod/workshop/allocation/scheduled/lib.php ===
==== mod/workshop/allocation/scheduled/lib.php ====


This is a stdClass unique to the workshop that is json_encoded before being saved in the database field resultlog in the table 'workshopallocation_scheduled'.
This is a stdClass unique to the workshop that is json_encoded before being saved in the database field resultlog in the table 'workshopallocation_scheduled'.
Line 122: Line 122:
</code>
</code>


=== report/log/graph.php ===
==== report/log/graph.php ====


Self explanatory. This is a report based on the log records. Could potentially execute large queries but at least it is only performed when requested by user.
Self explanatory. This is a report based on the log records. Could potentially execute large queries but at least it is only performed when requested by user.


== Files that contain usages of {log} in database queries ==
=== Files that contain usages of {log} in database queries ===


=== auth/mnet/auth.php ===
==== auth/mnet/auth.php ====


<code php>
<code php>
Line 157: Line 157:
Function keepalive_client: A heavy SQL query is performed that joins the log table to check whether the user who logged in via mnet is still active.
Function keepalive_client: A heavy SQL query is performed that joins the log table to check whether the user who logged in via mnet is still active.


=== lib/statslib.php ===
==== lib/statslib.php ====


<code php>
<code php>
Line 211: Line 211:
Function stats_temp_table_fill: Fills the temporary stats tables (temp_log1 and temp_log2) with new data by performing a single select query on the log table to retrieve data from a given time range.
Function stats_temp_table_fill: Fills the temporary stats tables (temp_log1 and temp_log2) with new data by performing a single select query on the log table to retrieve data from a given time range.


=== lib/datalib.php ===
==== lib/datalib.php ====


These are where the functions mentioned in the section above are used as they perform the actual query on the log table.
These are where the functions mentioned in the section above are used as they perform the actual query on the log table.
Line 219: Line 219:
# Function get_logs_userday.
# Function get_logs_userday.


=== lib/deprecatedlib.php ===
==== lib/deprecatedlib.php ====


<code php>
<code php>
Line 239: Line 239:
Function get_recent_enrolments: Performs a query with the user, role_assignments and log table to return recent enrolments.
Function get_recent_enrolments: Performs a query with the user, role_assignments and log table to return recent enrolments.


=== lib/cronlib.php ===
==== lib/cronlib.php ====


<code php>
<code php>
Line 280: Line 280:
This selects all the login errors logged belonging to the IPs and infos since lastnotifyfailure that are stored in the cache_flags table.
This selects all the login errors logged belonging to the IPs and infos since lastnotifyfailure that are stored in the cache_flags table.


=== report/outline/index.php ===
==== report/outline/index.php ====


<code php>
<code php>
Line 302: Line 302:
Returns the number of views a module has had.
Returns the number of views a module has had.


=== report/participation/index.php ===
==== report/participation/index.php ====


<code php>
<code php>
Line 321: Line 321:
A more complex query used to return the number of times a user has either posted or viewed a module.
A more complex query used to return the number of times a user has either posted or viewed a module.


== Files that make use of $DB functions to query the logs table directly ==
=== Files that make use of $DB functions to query the logs table directly ===


I am ignoring the usages of log_display - based on what Petr said I assume we can just get rid of this completely
I am ignoring the usages of log_display - based on what Petr said I assume we can just get rid of this completely


=== admin/handlevirus.php ===
==== admin/handlevirus.php ====


<code php>
<code php>
Line 333: Line 333:
A simple check to see if the file was recorded as being uploaded.
A simple check to see if the file was recorded as being uploaded.


=== backup/util/helper/backup_cron_helper.class.php ===
==== backup/util/helper/backup_cron_helper.class.php ====


<code php>
<code php>
Line 341: Line 341:
Function run_automated_backup: This query is performed twice in this function. It is used to check the log if there were any modifications to the course content.
Function run_automated_backup: This query is performed twice in this function. It is used to check the log if there were any modifications to the course content.


=== blocks/recent_activity/block_recent_activity.php ===
==== blocks/recent_activity/block_recent_activity.php ====


<code php>
<code php>
Line 353: Line 353:
Function get_structural_changes: Returns list of recent changes in the course structure. Used only within this block.
Function get_structural_changes: Returns list of recent changes in the course structure. Used only within this block.


=== course/lib.php ===
==== course/lib.php ====


<code php>
<code php>
Line 364: Line 364:
Function can_delete_course: If the user does not have the capability moodle/course:delete but has the capability moodle/course:create. The above checks when the user created the course, and if it was less than a day ago they can delete it.
Function can_delete_course: If the user does not have the capability moodle/course:delete but has the capability moodle/course:create. The above checks when the user created the course, and if it was less than a day ago they can delete it.


=== lib/datalib.php ===
==== lib/datalib.php ====


<code php>
<code php>
Line 381: Line 381:
Function count_login_failures: Is used by renderers (both lib/outputrenderers.php and mymobile/renderers.php).
Function count_login_failures: Is used by renderers (both lib/outputrenderers.php and mymobile/renderers.php).


=== lib/deprecatedlib.php ===
==== lib/deprecatedlib.php ====


<code php>
<code php>
Line 392: Line 392:
Function print_recent_activity - self explanatory.
Function print_recent_activity - self explanatory.


=== lib/uploadlib.php ===
==== lib/uploadlib.php ====


<code php>
<code php>
Line 402: Line 402:
Function clam_change_log: This function does not seem to be used anywhere within Moodle.
Function clam_change_log: This function does not seem to be used anywhere within Moodle.


=== mod/book/lib.php ===
==== mod/book/lib.php ====


<code php>
<code php>
Line 411: Line 411:
# Used in the function book_user_outline.
# Used in the function book_user_outline.


=== mod/folder/lib.php ===
==== mod/folder/lib.php ====


<code php>
<code php>
Line 421: Line 421:
# Used in the function folder_user_complete.
# Used in the function folder_user_complete.


=== mod/imscp/lib.php ===
==== mod/imscp/lib.php ====


<code php>
<code php>
Line 431: Line 431:
# Used in the function imscp_user_complete.
# Used in the function imscp_user_complete.


=== mod/page/lib.php ===
==== mod/page/lib.php ====


<code php>
<code php>
Line 441: Line 441:
# Used in the function page_user_complete.
# Used in the function page_user_complete.


=== mod/resource/lib.php ===
==== mod/resource/lib.php ====


<code php>
<code php>
Line 451: Line 451:
# Used in the function resource_user_complete.
# Used in the function resource_user_complete.


=== mod/url/lib.php ===
==== mod/url/lib.php ====


<code php>
<code php>

Revision as of 07:40, 23 May 2013

Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.

Logging stage 2
Project state In very early specification
Tracker issue MDL-37658
Discussion
Assignee moodle.com DEV team


Why is Logging an important topic?

Logging is fundamental to research, reporting and analytics.

  • Not much of new Moodle development is based on measured data. We need to
    • Prove there is a problem
    • Prove that a given change is an improvement.
  • Data comes from logging
  • Better data would allow researchers to study what happens in online teaching.
  • Better data would give admins more feedback on how to run their site (both technically and process)
  • Better data would give teachers better feedback to improve their teaching process.
  • Better data would give students better feedback to improve the learning process.
  • We can't predict all the analysis that might happen in future.


Examples of possible developments that would require better logs

Here are some random ideas, please add more!

  • Timely notifications of critical events (defined by users), eg more than five people posted in the same forum within an hour.
  • "Red circle" notification badge counter icons everywhere in Moodle highlighting things that need attention for each user.
  • Colour heatmaps overlaid on course page showing recent usage patterns on a course
  • Visualisation of live activity across a whole site (for admins)
  • Live engagement analytics that take every little action into account
  • Identification of students at risk based on complete history

Problems

  • Logging is added adhoc by developers, has spotty coverage. No admin logging!
  • Log tables are joined in many popular queries and are slow.
  • No archiving, so have to delete old logs
  • Deletion is wholesale, no control.
  • No synchronization with server logs to help debugging performance issues
  • No storage of traces when there are problems, so we can debug Moodle issues

Users of log table at the moment

A lot of the time-critical uses of the log table can be moved to new tables specifically for this purpose. It will speed things up a lot.

  • recent activity (block, myMoodle) EVENTS -> NEW TABLE
  • conditional access EVENTS -> NEW TABLE
  • failed logins EVENTS -> NEW TABLE
  • files via clamav EVENTS -> NEW TABLE

Other things can can continue to query the main logs:

  • course activity report, participation report, user outline report
  • stats
  • cron
  • automated backup locking/heartbeats (probably)
  • auth/mnet/auth.php
  • backup/restore (target logs and own logging)
  • grade history (currently separate log)

And some future things that will need the full log:

  • Engagement reporting

Places where we need more logs

  • Errors and warnings
    • People report errors, but cannot duplicate it.
    • Logging history with stack trace would be very useful
    • Also dump session variables and all other data that can be useful for debugging
  • Admin activities
  • Micro activities in a page (more than one per load)
  • AJAX calls (eg on course page)
  • Should be careful about logging confidential information (e.g. do not log any POST data)
  • Shouldn’t delete logs when course is deleted
  • All events should all be logged
    • Add more events (they are cheap and very useful)
    • Need more standardisation of event data for all triggered events
    • Add a catch-all handler that pushes events to the logging system
  • --> Action logging improvements META - MDL-28443

Existing log reading analysis

Files that make use of get_logs*

There are core functions (get_logs, get_logs_usercourse and get_logs_userday) but other classes also contain functions with the same name which are not related to the mdl_log table. Will we will be replacing these?

Locations using or declaring the functions get_logs*

backup/util/structure/backup_nested_element.class.php

The backup process contains it's own get_logs function that does not obtain data from the Moodle log table, but it's own backup_logs table. The structure is as follows -

id bigint backupid character varying(32) loglevel smallint message character varying(255) timecreated bigint

A few things to consider. How will this integrate into the new system with the backupid? Which field would that be placed in? It also has a loglevel, is that going to be a universal log level or unique to backup? If so, which fields to store in? Are we just going to leave it as it is?

course/lib.php

This is called for the function build_logs_array which is only ever called when the user has requested to print the logs for the course via the log report (report/log). So, is not called all the time (fortunately).

lib/datalib.php

This is where the core log functions are created, but are not used here. The functions get_logs_usercourse and get_logs_userday are only ever called in the report/log/graphs.php.

mod/workshop/allocation/scheduled/lib.php

This is a stdClass unique to the workshop that is json_encoded before being saved in the database field resultlog in the table 'workshopallocation_scheduled'.

The log stdClass is as follows. $log = new stdClass(); $log->message = $message; $log->type = $type; $log->indent = $indent;

report/log/graph.php

Self explanatory. This is a report based on the log records. Could potentially execute large queries but at least it is only performed when requested by user.

Files that contain usages of {log} in database queries

auth/mnet/auth.php

$mnethostlogssql = " SELECT

   mhostlogs.remoteid, mhostlogs.time, mhostlogs.userid, mhostlogs.ip,
   mhostlogs.course, mhostlogs.module, mhostlogs.cmid, mhostlogs.action,
   mhostlogs.url, mhostlogs.info, mhostlogs.username, c.fullname as coursename,
   c.modinfo

FROM

   (
        SELECT
            l.id as remoteid, l.time, l.userid, l.ip, l.course, l.module, l.cmid,
            l.action, l.url, l.info, u.username
        FROM
             {user} u
             INNER JOIN {log} l on l.userid = u.id
        WHERE
             u.mnethostid = ?
             AND l.id > ?
        ORDER BY remoteid ASC
        LIMIT 500
   ) mhostlogs
   INNER JOIN {course} c on c.id = mhostlogs.course

ORDER by mhostlogs.remoteid ASC";

Function keepalive_client: A heavy SQL query is performed that joins the log table to check whether the user who logged in via mnet is still active.

lib/statslib.php

$sql = "INSERT INTO {stats_user_weekly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
          FROM (
                 SELECT $nextstartweek AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                   FROM {log} l
                 WHERE action = 'login' AND $logtimesql
               ) inline_view
        GROUP BY timeend, courseid, userid
          HAVING COUNT(statsreads) > 0";

$DB->execute($sql);

Function stats_cron_weekly: Used to gather weekly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_weekly table.

$sql = "INSERT INTO {stats_user_monthly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
         FROM (
                  SELECT $nextstartmonth AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                    FROM {log} l
                   WHERE action = 'login' AND $logtimesql
              ) inline_view
     GROUP BY timeend, courseid, userid";

$DB->execute($sql);

Function stats_cron_monthly: Used to gather monthly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_monthly table.

if ($firstlog = $DB->get_field_sql('SELECT MIN(time) FROM {log}')) {

   return $firstlog;

}

Function stats_get_start_from: Returns starting date of the statistics. If there are currently no stats the variable $CFG->statsfirstrun is checked, and if it is equal to 'all', then a simple database query is performed on the log table to retrieve the lowest time recorded.

$sql = 'INSERT INTO {temp_log1} (userid, course, action)

       SELECT userid, course, action FROM {log}
        WHERE time >= ? AND time < ?';

$DB->execute($sql, array($timestart, $timeend));

Function stats_temp_table_fill: Fills the temporary stats tables (temp_log1 and temp_log2) with new data by performing a single select query on the log table to retrieve data from a given time range.

lib/datalib.php

These are where the functions mentioned in the section above are used as they perform the actual query on the log table.

  1. Function get_logs.
  2. Function get_logs_usercourse.
  3. Function get_logs_userday.

lib/deprecatedlib.php

$sql = "SELECT u.id, u.firstname, u.lastname, MAX(l.time)

         FROM {user} u, {role_assignments} ra, {log} l
        WHERE l.time > ?
              AND l.course = ?
              AND l.module = 'course'
              AND l.action = 'enrol'
              AND ".$DB->sql_cast_char2int('l.info')." = u.id
              AND u.id = ra.userid
              AND ra.contextid ".get_related_contexts_string($context)."
     GROUP BY u.id, u.firstname, u.lastname
     ORDER BY MAX(l.time) ASC";

$params = array($timestart, $courseid); return $DB->get_records_sql($sql, $params);

Function get_recent_enrolments: Performs a query with the user, role_assignments and log table to return recent enrolments.

lib/cronlib.php

$sql = "SELECT info, count(*)

         FROM {log}
        WHERE module = 'login' AND action = 'error'
              AND time > ?
     GROUP BY ip
       HAVING count(*) >= ?";

Function notify_login_failures: The above query is performed twice except the group by variable changes.

  1. Gets all the IPs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.
  2. Get all the INFOs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.

The following is then performed -

$sql = "SELECT * FROM (

       SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.ip = cf.name
        LEFT JOIN {user} u         ON l.userid = u.id
            WHERE l.module = 'login' AND l.action = 'error'
                  AND l.time > ?
                  AND cf.flagtype = 'login_failure_by_ip'
       UNION ALL
           SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.info = cf.name
        LEFT JOIN {user} u         ON l.userid = u.id
            WHERE l.module = 'login' AND l.action = 'error'
                  AND l.time > ?
                  AND cf.flagtype = 'login_failure_by_info') t
       ORDER BY t.time DESC";
   $params = array($CFG->lastnotifyfailure, $CFG->lastnotifyfailure);

This selects all the login errors logged belonging to the IPs and infos since lastnotifyfailure that are stored in the cache_flags table.

report/outline/index.php

if (!$logstart = $DB->get_field_sql("SELECT MIN(time) FROM {log}")) {

   print_error('logfilenotavailable');

}

A simple query to return the minimum time in the log table - used to make sure there are logs to report.

$sql = "SELECT cm.id, COUNT('x') AS numviews, MAX(time) AS lasttime

         FROM {course_modules} cm
              JOIN {modules} m ON m.id = cm.module
              JOIN {log} l     ON l.cmid = cm.id
        WHERE cm.course = ? AND l.action LIKE 'view%' AND m.visible = 1
     GROUP BY cm.id";

$views = $DB->get_records_sql($sql, array($course->id));

Returns the number of views a module has had.

report/participation/index.php

$minlog = $DB->get_field_sql('SELECT min(time) FROM {log} WHERE course = ?', array($course->id));

A simple query to return the minimum time in the log table - used in later query.

$sql = "SELECT ra.userid, u.firstname, u.lastname, u.idnumber, l.actioncount AS count

       FROM (SELECT * FROM {role_assignments} WHERE contextid $relatedcontexts AND roleid = :roleid ) ra
       JOIN {user} u ON u.id = ra.userid
       LEFT JOIN (
           SELECT userid, COUNT(action) AS actioncount FROM {log} WHERE cmid = :instanceid AND time > :timefrom AND $actionsql GROUP BY userid
       ) l ON (l.userid = ra.userid)";

A more complex query used to return the number of times a user has either posted or viewed a module.

Files that make use of $DB functions to query the logs table directly

I am ignoring the usages of log_display - based on what Petr said I assume we can just get rid of this completely

admin/handlevirus.php

$log = $DB->get_record("log", array("module"=>"upload", "info"=>$file, "action"=>"upload")

A simple check to see if the file was recorded as being uploaded.

backup/util/helper/backup_cron_helper.class.php

$logexists = $DB->record_exists_select('log', $sqlwhere, $params)

Function run_automated_backup: This query is performed twice in this function. It is used to check the log if there were any modifications to the course content.

blocks/recent_activity/block_recent_activity.php

$logs = $DB->get_records_select('log',

                               "time > ? AND course = ? AND
                                module = 'course' AND
                                (action = 'add mod' OR action = 'update mod' OR action = 'delete mod')",
                                array($timestart, $course->id), "id ASC");

Function get_structural_changes: Returns list of recent changes in the course structure. Used only within this block.

course/lib.php

$params = array('userid'=>$USER->id, 'url'=>"view.php?id=$courseid", 'since'=>$since); $select = "module = 'course' AND action = 'new' AND userid = :userid AND url = :url AND time > :since";

return $DB->record_exists_select('log', $select, $params);

Function can_delete_course: If the user does not have the capability moodle/course:delete but has the capability moodle/course:create. The above checks when the user created the course, and if it was less than a day ago they can delete it.

lib/datalib.php

if (is_siteadmin()) {

   if ($count->attempts = $DB->count_records_select('log', $select, $params)) {
       $count->accounts = $DB->count_records_select('log', $select, $params, 'COUNT(DISTINCT info)');
       return $count;
   }

} else if ($mode == 'everybody') {

   if ($count->attempts = $DB->count_records_select('log', "$select AND info = :username", $params)) {
       return $count;
   }

}

Function count_login_failures: Is used by renderers (both lib/outputrenderers.php and mymobile/renderers.php).

lib/deprecatedlib.php

$logs = $DB->get_records_select('log', "time > ? AND course = ? AND

                                       module = 'course' AND
                                       (action = 'add mod' OR action = 'update mod' OR action = 'delete mod')",
                                       array($timestart, $course->id), "id ASC");

Function print_recent_activity - self explanatory.

lib/uploadlib.php

if (!$record = $DB->get_record('log', array('info'=>$oldpath, 'module'=>'upload'))) {

   return false;

}

Function clam_change_log: This function does not seem to be used anywhere within Moodle.

mod/book/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'book',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function book_user_outline.

mod/folder/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'folder',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function folder_user_outline.
  2. Used in the function folder_user_complete.

mod/imscp/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'imscp',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function imscp_user_outline.
  2. Used in the function imscp_user_complete.

mod/page/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'page',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function page_user_outline.
  2. Used in the function page_user_complete.

mod/resource/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'resource',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function resource_user_outline.
  2. Used in the function resource_user_complete.

mod/url/lib.php

$logs = $DB->get_records('log', array('userid'=>$user->id, 'module'=>'url',

                                     'action'=>'view', 'info'=>$folder->id), 'time ASC')

  1. Used in the function url_user_outline.
  2. Used in the function url_user_complete.

Solution summary

We can't actually predict all the use cases so we must think generically and plan for worst cases.

  • Things that don't need to use logs should not use logs. (Recent activity, etc)
  • Make logging calls as cheap as possible
  • Use Events API for the originating calls to create lots of hooks for other things
  • Use MUC-like plug-ins to determine where to put logs permanently (NoSQL, SAS, file...). Default will be to the current log table.
  • Log everything we possibly can think of.
  • Define log level settings (on each logging call) so different sites can choose what they want logged and so control size, speed etc
  • add_to_log retained for backward compatibility, make it a wrapper for new logger function.
  • provide hooks to expose logs everywhere. For example, each page should have a link that shows logs and stats for that page.

Moodle -> Logs

Logs -> Moodle

  • Each plugin implements some methods to query and extract logs back to Moodle
  • Do we auto-sync anything back to Moodle tables for convenience?

See also

Examples of data-based learning technology development: