Note:

If you want to create a new page for developers, you should create it on the Moodle Developer Resource site.

Logging usage: Difference between revisions

From MoodleDocs
No edit summary
No edit summary
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
= Existing log reading analysis =
= Existing log reading analysis =


[http://tracker.moodle.org/browse/MDL-39886 MDL-39886]
Tracker issue - MDL-39886


== admin/handlevirus.php ==
== admin/handlevirus.php ==
Line 357: Line 357:
Returns the number of views a module has had.
Returns the number of views a module has had.


= Existing log writing analysis =
= Reports =


TODO
There are different types of reports in Moodle, this sections describes reports that use the logged data.
 
== Live logs report ==
 
This report lists logged events in real time.
 
It is compatible with any log storage that implements get_events() with “created more than (now−60*60)” select, ordered by timecreated in reverse order and paging.
 
== Logs report ==
 
Simple listing of events with very basic filtering by course, user, date (in current timezone) and actions.
 
It is compatible with all storages that implement get_events() with basic “equals” and date filtering, ordered by timecreated and paging.
 
== Activity report (outline) ==
 
This report shows the total number of educational actions for each activity in course.
 
The data cannot be obtained using get_events(). The events need to be counted in database, that is why only SQL based log storages may be used for this. It would be absurdly slow if we were iterating get_events() results one by one in PHP memory.
 
Alternative for external databases could be some count_events() with appropriate parameters, but again that would be very should for non-ssql storages.
 
This report needs to select one active log storage with get_log_table() method. It is very important to know the dictionary of interesting action words or some education level of each action. This might be also improved by new plugin callbacks.
 
Nice to have features:
* specify custom date range instead of all data in logs
* views per each week/day
* limit to enrolled users only
* limit to guests only
* more action types for each activity
 
== Course participation report ==
 
This report shows counts of actions of selected users in selected course activities. It is similar to Activity report which is showing totals for all users in course. There is also an option to send bulk messages to selected users.
 
It requires the same get_log_table() method, it might theoretically use count_events() for external database tables.
 
Nice to have features:
* show actions for more activities on one page
* manually specify date ranges
* more action options (post/view)
 
== Statistics report ==
 
This report calculates monthly, weekly and daily action counts for view and post actions. The data is aggregated for each course and for each user.
 
Again it is not possible to loop through existing logged events from the whole day using PHP iteration over get_events(). It is not even possible to use count_events() because it would have to be done for each user and course in the system (tens of thousands of complex queries on a large site once per day).
 
Theoretically this could be implemented via a new observer that would be doing the bean counting continually, but maybe the performance would not be good enough on large sites. Theoretically two update queries per page could do the trick -one for page totals, other for current user totals.
 
It might be interesting to add some hooks that allow processing of stats in replicated database.


= See also =
= See also =


[[Logging_2]]
[[Logging_2]]

Revision as of 04:29, 14 August 2013

Existing log reading analysis

Tracker issue - MDL-39886

admin/handlevirus.php

$log == $DB->get_record("log", array("module"==>"upload", "info"==>$file, "action"==>"upload")

A simple check to see if the file was recorded as being uploaded.

auth/mnet/auth.php

$mnethostlogssql == " SELECT

   mhostlogs.remoteid, mhostlogs.time, mhostlogs.userid, mhostlogs.ip,
   mhostlogs.course, mhostlogs.module, mhostlogs.cmid, mhostlogs.action,
   mhostlogs.url, mhostlogs.info, mhostlogs.username, c.fullname as coursename,
   c.modinfo

FROM

   (
        SELECT
            l.id as remoteid, l.time, l.userid, l.ip, l.course, l.module, l.cmid,
            l.action, l.url, l.info, u.username
        FROM
             {user} u
             INNER JOIN {log} l on l.userid == u.id
        WHERE
             u.mnethostid == ?
             AND l.id > ?
        ORDER BY remoteid ASC
        LIMIT 500
   ) mhostlogs
   INNER JOIN {course} c on c.id == mhostlogs.course

ORDER by mhostlogs.remoteid ASC";

Function keepalive_client: A heavy SQL query is performed that joins the log table to check whether the user who logged in via mnet is still active.

backup/util/helper/backup_cron_helper.class.php

$logexists == $DB->record_exists_select('log', $sqlwhere, $params)

Function run_automated_backup: This query is performed twice in this function. It is used to check the log if there were any modifications to the course content.

backup/util/structure/backup_nested_element.class.php

The backup process contains it's own get_logs function that does not obtain data from the Moodle log table, but it's own backup_logs table. The structure is as follows -

id bigint backupid character varying(32) loglevel smallint message character varying(255) timecreated bigint

A few things to consider. How will this integrate into the new system with the backupid? Which field would that be placed in? It also has a loglevel, is that going to be a universal log level or unique to backup? If so, which fields to store in? Are we just going to leave it as it is?

blocks/recent_activity/block_recent_activity.php

$logs == $DB->get_records_select('log',

                               "time > ? AND course == ? AND
                                module == 'course' AND
                                (action == 'add mod' OR action == 'update mod' OR action == 'delete mod')",
                                array($timestart, $course->id), "id ASC");

Function get_structural_changes: Returns list of recent changes in the course structure. Used only within this block.

course/lib.php

This is called for the function build_logs_array which is only ever called when the user has requested to print the logs for the course via the log report (report/log). So, is not called all the time (fortunately).

$params == array('userid'==>$USER->id, 'url'==>"view.php?id==$courseid", 'since'==>$since); $select == "module == 'course' AND action == 'new' AND userid == :userid AND url == :url AND time > :since";

return $DB->record_exists_select('log', $select, $params);

Function can_delete_course: If the user does not have the capability moodle/course:delete but has the capability moodle/course:create. The above checks when the user created the course, and if it was less than a day ago they can delete it.

lib/cronlib.php

$sql == "SELECT info, count(*)

         FROM {log}
        WHERE module == 'login' AND action == 'error'
              AND time > ?
     GROUP BY ip
       HAVING count(*) >== ?";

Function notify_login_failures: The above query is performed twice except the group by variable changes.

  1. Gets all the IPs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.
  2. Get all the INFOs with more than notifyloginthreshold failures since lastnotifyfailure and insert them into the cache_flags temp table.

The following is then performed -

$sql == "SELECT * FROM (

       SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.ip == cf.name
        LEFT JOIN {user} u         ON l.userid == u.id
            WHERE l.module == 'login' AND l.action == 'error'
                  AND l.time > ?
                  AND cf.flagtype == 'login_failure_by_ip'
       UNION ALL
           SELECT l.*, u.firstname, u.lastname
             FROM {log} l
             JOIN {cache_flags} cf ON l.info == cf.name
        LEFT JOIN {user} u         ON l.userid == u.id
            WHERE l.module == 'login' AND l.action == 'error'
                  AND l.time > ?
                  AND cf.flagtype == 'login_failure_by_info') t
       ORDER BY t.time DESC";
   $params == array($CFG->lastnotifyfailure, $CFG->lastnotifyfailure);

This selects all the login errors logged belonging to the IPs and infos since lastnotifyfailure that are stored in the cache_flags table.

lib/datalib.php

This is where the core log functions are created, but are not used here. The functions get_logs_usercourse and get_logs_userday are only ever called in the report/log/graphs.php.

if (is_siteadmin()) {

   if ($count->attempts == $DB->count_records_select('log', $select, $params)) {
       $count->accounts == $DB->count_records_select('log', $select, $params, 'COUNT(DISTINCT info)');
       return $count;
   }

} else if ($mode == 'everybody') {

   if ($count->attempts == $DB->count_records_select('log', "$select AND info == :username", $params)) {
       return $count;
   }

}

Function count_login_failures: Is used by renderers (both lib/outputrenderers.php and mymobile/renderers.php).


lib/deprecatedlib.php

$sql == "SELECT u.id, u.firstname, u.lastname, MAX(l.time)

         FROM {user} u, {role_assignments} ra, {log} l
        WHERE l.time > ?
              AND l.course == ?
              AND l.module == 'course'
              AND l.action == 'enrol'
              AND ".$DB->sql_cast_char2int('l.info')." == u.id
              AND u.id == ra.userid
              AND ra.contextid ".get_related_contexts_string($context)."
     GROUP BY u.id, u.firstname, u.lastname
     ORDER BY MAX(l.time) ASC";

$params == array($timestart, $courseid); return $DB->get_records_sql($sql, $params);

Function get_recent_enrolments: Performs a query with the user, role_assignments and log table to return recent enrolments.

$logs == $DB->get_records_select('log', "time > ? AND course == ? AND

                                       module == 'course' AND
                                       (action == 'add mod' OR action == 'update mod' OR action == 'delete mod')",
                                       array($timestart, $course->id), "id ASC");

Function print_recent_activity - self explanatory.

lib/statslib.php

$sql == "INSERT INTO {stats_user_weekly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
          FROM (
                 SELECT $nextstartweek AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                   FROM {log} l
                 WHERE action == 'login' AND $logtimesql
               ) inline_view
        GROUP BY timeend, courseid, userid
          HAVING COUNT(statsreads) > 0";

$DB->execute($sql);

Function stats_cron_weekly: Used to gather weekly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_weekly table.

$sql == "INSERT INTO {stats_user_monthly} (stattype, timeend, courseid, userid, statsreads)

       SELECT 'logins', timeend, courseid, userid, COUNT(statsreads)
         FROM (
                  SELECT $nextstartmonth AS timeend, ".SITEID." as courseid, l.userid, l.id AS statsreads
                    FROM {log} l
                   WHERE action == 'login' AND $logtimesql
              ) inline_view
     GROUP BY timeend, courseid, userid";

$DB->execute($sql);

Function stats_cron_monthly: Used to gather monthly statistics. Retrieves all 'login' actions in the log table between a certain time period and inserts them into the stats_user_monthly table.

if ($firstlog == $DB->get_field_sql('SELECT MIN(time) FROM {log}')) {

   return $firstlog;

}

Function stats_get_start_from: Returns starting date of the statistics. If there are currently no stats the variable $CFG->statsfirstrun is checked, and if it is equal to 'all', then a simple database query is performed on the log table to retrieve the lowest time recorded.

$sql == 'INSERT INTO {temp_log1} (userid, course, action)

       SELECT userid, course, action FROM {log}
        WHERE time >== ? AND time < ?';

$DB->execute($sql, array($timestart, $timeend));

Function stats_temp_table_fill: Fills the temporary stats tables (temp_log1 and temp_log2) with new data by performing a single select query on the log table to retrieve data from a given time range.

lib/uploadlib.php

if (!$record == $DB->get_record('log', array('info'==>$oldpath, 'module'==>'upload'))) {

   return false;

}

Function clam_change_log: This function does not seem to be used anywhere within Moodle.

mod/book/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'book',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function book_user_outline.

mod/folder/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'folder',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function folder_user_outline.
  2. Used in the function folder_user_complete.

mod/imscp/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'imscp',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function imscp_user_outline.
  2. Used in the function imscp_user_complete.

mod/page/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'page',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function page_user_outline.
  2. Used in the function page_user_complete.

mod/resource/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'resource',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function resource_user_outline.
  2. Used in the function resource_user_complete.

mod/url/lib.php

$logs == $DB->get_records('log', array('userid'==>$user->id, 'module'==>'url',

                                     'action'==>'view', 'info'==>$folder->id), 'time ASC')

  1. Used in the function url_user_outline.
  2. Used in the function url_user_complete.

mod/workshop/allocation/scheduled/lib.php

This is a stdClass unique to the workshop that is json_encoded before being saved in the database field resultlog in the table 'workshopallocation_scheduled'.

The log stdClass is as follows. $log == new stdClass(); $log->message == $message; $log->type == $type; $log->indent == $indent;

report/log/graph.php

Self explanatory. This is a report based on the log records. Could potentially execute large queries but at least it is only performed when requested by user.

report/participation/index.php

$minlog == $DB->get_field_sql('SELECT min(time) FROM {log} WHERE course == ?', array($course->id));

A simple query to return the minimum time in the log table - used in later query.

$sql == "SELECT ra.userid, u.firstname, u.lastname, u.idnumber, l.actioncount AS count

       FROM (SELECT * FROM {role_assignments} WHERE contextid $relatedcontexts AND roleid == :roleid ) ra
       JOIN {user} u ON u.id == ra.userid
       LEFT JOIN (
           SELECT userid, COUNT(action) AS actioncount FROM {log} WHERE cmid == :instanceid AND time > :timefrom AND $actionsql GROUP BY userid
       ) l ON (l.userid == ra.userid)";

A more complex query used to return the number of times a user has either posted or viewed a module.

report/outline/index.php

if (!$logstart == $DB->get_field_sql("SELECT MIN(time) FROM {log}")) {

   print_error('logfilenotavailable');

}

A simple query to return the minimum time in the log table - used to make sure there are logs to report.

$sql == "SELECT cm.id, COUNT('x') AS numviews, MAX(time) AS lasttime

         FROM {course_modules} cm
              JOIN {modules} m ON m.id == cm.module
              JOIN {log} l     ON l.cmid == cm.id
        WHERE cm.course == ? AND l.action LIKE 'view%' AND m.visible == 1
     GROUP BY cm.id";

$views == $DB->get_records_sql($sql, array($course->id));

Returns the number of views a module has had.

Reports

There are different types of reports in Moodle, this sections describes reports that use the logged data.

Live logs report

This report lists logged events in real time.

It is compatible with any log storage that implements get_events() with “created more than (now−60*60)” select, ordered by timecreated in reverse order and paging.

Logs report

Simple listing of events with very basic filtering by course, user, date (in current timezone) and actions.

It is compatible with all storages that implement get_events() with basic “equals” and date filtering, ordered by timecreated and paging.

Activity report (outline)

This report shows the total number of educational actions for each activity in course.

The data cannot be obtained using get_events(). The events need to be counted in database, that is why only SQL based log storages may be used for this. It would be absurdly slow if we were iterating get_events() results one by one in PHP memory.

Alternative for external databases could be some count_events() with appropriate parameters, but again that would be very should for non-ssql storages.

This report needs to select one active log storage with get_log_table() method. It is very important to know the dictionary of interesting action words or some education level of each action. This might be also improved by new plugin callbacks.

Nice to have features:

  • specify custom date range instead of all data in logs
  • views per each week/day
  • limit to enrolled users only
  • limit to guests only
  • more action types for each activity

Course participation report

This report shows counts of actions of selected users in selected course activities. It is similar to Activity report which is showing totals for all users in course. There is also an option to send bulk messages to selected users.

It requires the same get_log_table() method, it might theoretically use count_events() for external database tables.

Nice to have features:

  • show actions for more activities on one page
  • manually specify date ranges
  • more action options (post/view)

Statistics report

This report calculates monthly, weekly and daily action counts for view and post actions. The data is aggregated for each course and for each user.

Again it is not possible to loop through existing logged events from the whole day using PHP iteration over get_events(). It is not even possible to use count_events() because it would have to be done for each user and course in the system (tens of thousands of complex queries on a large site once per day).

Theoretically this could be implemented via a new observer that would be doing the bean counting continually, but maybe the performance would not be good enough on large sites. Theoretically two update queries per page could do the trick -one for page totals, other for current user totals.

It might be interesting to add some hooks that allow processing of stats in replicated database.

See also

Logging_2