Logging 2
Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.
Logging stage 2 | |
---|---|
Project state | Specification under review |
Tracker issue | MDL-37658 |
Discussion | https://moodle.org/mod/forum/discuss.php?d=235318 |
Assignee | moodle.com BACKEND team |
Mandate
To rewrite the logging API in Moodle to:
- allow capturing of richer information from plugins/core about actions;
- provide control over how much information is logged;
- abstract log writing and reading so logging can be scaled;
- control how much logging history is kept; and
- support research, reporting and analytics.
Why is Logging an important topic?
Logging is fundamental to research, reporting and analytics.
Not much of new Moodle development is based on measured data. We need to be able to:
- prove there is a problem and
- prove that a given change is an improvement.
Logging data can be used by different Moodle users.
- Better data would allow researchers to study what happens in online teaching.
- Better data would give admins more feedback on how to run their site - both technically and process (please note this issue is not about profiling).
- Better data would give teachers better feedback to improve their teaching process.
- Better data would give students better feedback to improve their learning.
We can't predict all forms of analysis that might happen in future.
- The logging system needs to be flexible in order to accommodate applications by these different users.
- The logging API needs to be constructed in a way that is possible to improve in future.
Existing problems
This project will help to resolve a number of problems.
- Logging is added ad-hoc by developers and has spotty coverage.
- Some logged actions do not contain important information needed later.
- There are performance and scalability problems, such as:
- log tables are joined in many popular queries and are slow;
- all logs go to one standard moodle database table, which is not scalable;
- it is not possible to configure level of logging at present; and
- it is not possible to disable logging completely.
Logging
The new Events API introduced in Moodle 2.6 provides a notification mechanism and single collection point for actions within Moodle. The Events system also defines information about events that can be extended with additional information and functionality by core code and plugins, as needed.
In order to capture events a logging API will be specified, consisting of a number of plugin types. Simple examples will be provided that exemplify the new plugin types.
To transition from the old logging system to the new, work is required in a number of areas.
- In places where we previously added information to 'the' log table, we must instead generate a new event.
- The standard Moodle distribution will include simple DB logging storage plugin that will store events in new format.
- The original logging code and DB tables will be available for full backwards compatibility for users of customised reports. It will be possible to enable logging to the old {log} table if required.
- All standard reports will be upgraded to use the new logging system that will use a logging retrieval plugin to access log information.
- Custom report plugins (add-ons) will be able to query the {log} table, but should ideally shift to the new logging system over time, so that sites are not forced to continue double logging.
Existing log usage analysis
Log data is currently stored in a single standard Moodle database table, which is accessed directly. The new API will still encourage log storage plugins to expose data in the form of standardised database table, but it will be possible to use read only database views with an arbitrary name. Alternatively simplified log readers may return arrays of event instances only.
See this analysis to see log usage in Moodle 2.5 core code.
Example scenario
Writing to the log
- A student submits an assignment.
- An event is triggered by the Assignment module.
- The log manager plugin observes the event and determines if it is log-worthy depending on some admin settings.
- The log manager sends the event to all enabled log storage subplugins implementing \tool_log\log\writer. Storage plugins may decide to filter out some events based on their admin settings.
- The log store plugin(s) stores the event data optionally including extra information such as remote IP or original logged-in-as user id.
Reading from the log
- A teacher opens activity report.
- Activity reports queries available log readers from the current log manager (implementing \core\log\manager).
- If there is more than one log reader (implementing \core\log\sql_reader) available they can select the source of data for the report.
- A teacher requests report of student assignment submissions.
- The report requests events from the log reader and displays them.
From event to report
There are four steps of how event is captured, logged and potentially presented in a report. These steps are summarised in Figure 1 below and can be described as follows.
- Handling of events and filtering what needs to be logged. See interface \core\log\manager below.
- Storing the events data in the log storage (DB, filesystem, etc.). See interface \tool_log\log\writer below.
- Retrieving the data from log storage - each plugin implements some methods to query and extract logs back to Moodle. See interfaces \core\log\reader and \core\log\sql_reader below.
- Displaying the data in the report.
One plugin may cover multiple steps. For example we expect that most of log writer plugins will also implement log reader. It is also possible to create a report that has built-in event handling, log writing and reading and report presentation (all four steps). Also external logging systems may not wish to allow log data they capture to be reported within Moodle, handling it through an external system, so they may not implement steps 3 and 4 at all.
The Moodle standard distribution will provide a suggestion for a logging-report chain that may be followed by add-on plugins, but the API will be flexible enough to allow alternatives.
The Figure 2 below shows how the requests are made between user actions, plugins and core when:
- configuring logging - which log writer plugins are enabled
- event occurs
- report - selecting log store
- viewing report
Public API
This section describes all interfaces that need to be implemented by log manager and each log storage class.
Reading API
- function \get_log_manager() - returns the current log manager if configured or some dummy instance otherwise, it uses the signleton pattern.
- interface \core\log\manager - log manager interface, in standard distribution implemented by \tool_log\log\manager and \core\log\dummy_manager
- public function get_readers($context) returns the active storages implementing reader interface, returns array(classname => storage instance, ...)
- interface \core\log\reader extends \core\log\reader - general log storage that supports sql selects (or emulates them)
- public function get_name() - returns name of the log storage, this will be used when selection data source in reports.
- public function get_description() - some help for admins.
- public function can_access($context) - returns true if user can access the logged data, individual event access control must be verified too.
- public function get_events($selectwhere, array $params, $order, $limitfrom, $limitnum) - returns event object instances matching given parameters.
- public function get_events_count($selectwhere, array $params) - returns number of events that would be returned from get_events() with the same parameters.
- $selectwhere is supposed to be a limited sql predicate.
- The set of operators allowed in $selectwhere are :- =, <>, >, >=, <, <=
- The set of keywords allowed in $selectwhere are :- IN, BETWEEN, LIKE, IS NULL, IS NOT NULL
- Logical operators OR, AND
- NOT operator
- ( and ) brackets
- ? and :named parameter placeholders
- Use of anything besides the mentioned keywords/operators is strictly forbidden.
- interface \core\log\sql_reader extends \core\log\reader - log storage that can expose data as read-only database view or table.
- public function get_log_table() - returns database table or view; structure is described in phpdocs, it will include all columns defined by log_sql
Reports are expected to use only the reading API. Please note that all non-sql readers need to somehow translate the $selectwhere SQL query syntax.
When reports are searching for log readers they use the get_log_manager()->get_readers() and verify if the returned log reader instances implement the requested interface (one class can implement multiple interfaces). If there are multiple readers returned by the log manager, then the report is responsible for letting the user choose which storage to create the report on.
Writing API
- interface \tool_log\log\store - must be implemented by each store subplugin.
- public function __construct(\tool_log\log\manager $manager) - no parameters, configuration is taken from admin settings.
- public function dispose() - called when reading/writing finished.
- interface \tool_log\log\writer extends \tool_log\log\store - interface indicating the log store may be used for writing.
- public function write(\core\event\base $event) - called from event observer defined in log manager.
The writing API is defined in tool_log plugin, the log stores are defined as new subplugin of this plugin.
Each log store plugin will be responsible for filtering of events, it may use standardised admin settings and filtering or decide to implement custom filtering.
Standard logging plugins
Standard Moodle distribution will include four plugins.
Logging management plugin (tool_log)
Defines:
class \tool_log\log\manager implements \core\log\manager;
The log manager is responsible for step 1 from Figure 1 above. It allows admins to configure which log store subplugin is enabled and in which order they appear. The log store plugins are implemented as tool_log subplugins with class "store" in their namespace (\logstore_xxxx\log\store).
The log manager registers a '*' event observer, this observer is external - it means that events are received only when database transactions are not in progress.
DB log storage plugin (logstore_standard)
Defines:
class \logstore_standard\log\store implements \core\log\sql_reader, \tool_log\log\writer;
The name logstore_standard does not seem to be optimal - logstore_db is not allowed because it collides with standard /db/ subdirectory in plugins and logstore_database should be probably reserved for external database logging (to prevent any confusion).
This log storage plugin is responsible for steps 2 and 3 from Figure 1 above. It is the primary replacement for the former {log} table. The performance will be similar to former log table, but for smaller sites, this simple default will suffice.
Each plugin decides how much data is kept, when it is deleted, etc.
Storage settings
Each storage plugin with writer interface is responsible for backlisting or filtering of events. There will be a set of standardised admin settings for the following types of filters:
- Include guests actions (default Yes).
- Exclude log levels (default empty), accepts constants from Event_2#Level_property.
- Exclude by CRUD (default empty), accepts C, R, U, and D.
- Exclude by component (default empty)
- Exclude by event name (default empty)
To simplify implementation in other log storages the setting classes will be accompanied by some simple implementation of event filter that accepts values from these settings.
External database log storage plugin (logstore_database)
Defines:
class \logstore_database\log\store implements \tool_log\log\writer;
This log storage plugin is responsible for step 2 only from Figure 1 above. There will be no interface to read from it provided with Moodle core.
Legacy log table reader (logstore_legacy)
Defines:
class \logstore_legacy\log\store implements \core\log\reader, \tool_log\log\store;
This plugin will not conduct any log writing; it will translate existing {log} table data into the new event format and transforms new SQL selects to match the legacy table format. This serves as an emulation layer that allows new reports to access historical data stored in mdl_log table. Users will have to select this datasource manually if they want to see historical data that is not present in new log tables.
In new reports the legacy data will be presented using a special legacy event class that can not be triggered: \logstore_legacy\event\legacy
Performance
Compared to older Moodle version the new system will need to deal with higher number of log entries. This can be improved by buffering of log entries together with new improved bulk insert operations in the DML layer, this would also require some new hooks system because the log storages would need to be notified that the script processing was finished. This might be an experimental feature at first because it will not be a trivial change and there might be some problems that might result in loss of logged data.
Running new and old event logging at the same time with result in 2x number of DB inserts, it will be therefore recommended to disable the old logging as soon as possible.
In [https://tracker.moodle.org/browse/MDL-42894 MDL-42894], logging api comparison is done and it seems promising. Results are:
Enabled Stores | dbwrites | dbreads | Time used | Session size | Memory | Server load |
---|---|---|---|---|---|---|
Legacy + Standard store | Little increase | Same | Same | Same | Same | Increased |
Only Standard store | Decrease | Same | Same | Same | Same | Increased |
Only Legacy store | Same | Same | Same | Same | Same | Decrease |
Neither Legacy nor Standard store | Decrease | Same | Same | Same | Same | Decrease |
FYI: Serverload is showing worse in most cases, but as this was done on my local machine with all interference, I am ignoring this for now. --Rajesh Taneja (talk) 16:19, 14 February 2014 (WST)
Mockups
Log management page
Log store settings pages
Overarching considerations
We can't actually predict all use cases, so we must think generically and plan for worst cases. Here are some overarching principles that will be followed in the implementation of the new logging system.
- Things that don't need to use logs should not use logs. (Recent activity, etc)
- Logging calls should be as cheap as possible.
- The Events API will be the central communication mechanism, allowing lots of potential information gathering tools to exist in future.
- Use plug-ins to determine where to put logs permanently (NoSQL, SAS, file...). Initially, the default will be to the current log table.
- Log everything we possibly can think of.
- Define log level settings (on each logging call) so different sites can choose what they want logged, which allows controls cotrol of log size, speed, etc..
- Function add_to_log should be retained for backward compatibility, without changing existing implementation, but allowing admins to disable it.
Implementation steps
The primary goal of this project is to write a new logging API (manager, writers, readers, legacy reader).
- Define all interfaces - MDL-41266
- Define the expected log table structure with extra request information - MDL-41266
- Add support for admin tool subplugins - MDL-41267
- Implement tool_log plugin - MDL-41268 (blocked by MDL-41266, MDL-41267)
- Implement logstore_standard plugin - MDL-41269 (blocked by MDL-41266, MDL-41267)
- Implement new admin log store management UI - MDL-41270
- Implement new log store setting and related filtering code - MDL-41271 (blocked by MDL-41269, MDL-41270)
- Implement logstore_legacy emulation reader plugin - it needs to map new selects to old mdl_log table and restore the legacy log data as new events - MDL-41272 (blocked by MDL-41266, MDL-41267)
- Optionally implement log_database plugin or any other sample external storage - MDL-39933 (blocked by MDL-41266, MDL-41267, MDL-41270)
- Implement log buffering to delay writes to the end of the process/page load - MDL-42893
- Buffering in standard log storage - MDL-41287 (blocked by MDL-41269)
- Investigating performance of the new logging API in practice - MDL-42894
Reports
Logging usage#Reports describes current reports.
Migration to new logging API
Implementation steps
- Alter reports to accept new event object instances instead of array data.
- Alter reports to understand new event properties.
- Rework statistics report to use new event data.
By report:
(All issues require completion of either log_legacy or log_standard: MDL-41269, MDL-41272)
- Reproduce log report to use the new logging system (log and loglive) - MDL-39950
- Convert Activity report (outline) to use new logging API - MDL-41282
- Convert Course participation report to new logging API - MDL-41283
- Convert Statistics report to new logging API - MDL-41284
- Convert block recent activity to use new Logging API - MDL-41285
- Convert outline report to use new logging API - MDL-41286
The following usages of mdl_log table need to be implemented differently:
- auth/mnet/auth.php - MDL-42888
- admin/handlevirus.php - MDL-42889
- backup/util/helper/backup_cron_helper.class.php - MDL-42890
- Function can_delete_course() - MDL-42892
- Function notify_login_failures(), count_login_failures() - MDL-42891
Follow-up Projects
- Implement an external logging plugin with writer and reading support - MDL-42895
- Converting completion to use events - MDL-42897
- Developing a Timeline page/block so each user to see a stream of activity that is relevant to them individually - MDL-42898
Related
Here are some ideas of how new even data and observers can be put to use:
- Please note that the log storage API is not optimised for these tasks.
- Timely notifications of critical events (defined by users), for example more than five people posted in the same forum within an hour.
- "Red circle" notification badge counter icons everywhere in Moodle highlighting things that a user needs to pay attention attention to, such as ???.
- Colour heatmaps overlaid on course pages showing recent usage patterns of the course.
- Visualisation of live activity across a whole site (for admins) (is that logging or event handling?)
- Live engagement analytics reporting ??? that take specific actions into account, such as ???.
- Identification of students at risk based on their involvement in the course (or lack of it).
Logging of exceptions, errors and debugging:
- Logging of Errors and warnings is not part of this specification, it meeds to be handled via a completely separate mechanism.
- People report errors that cannot be duplicated. Logging errors with a stack trace, session variables and other relevant data could be useful for debugging.
- Standard logs may theoretically produce errors and exceptions that need to be logged somewhere else.
- There is no synchronization with server logs to help debugging performance issues. (This will not be addressed by new logging API.)
- There is no logging of error information to help debug Moodle issues. (This will not be addressed by new logging API.)
Performance logging and profiling
We need to measure the performance of logging system and store the information somewhere.
Other
- There is no mechanism for archiving, so old logs are deleted and lost.
- Log deletion is indiscriminate, with no control of what log information is deleted or retained.
Places where we need more logs
The principle work of this project is to replace the existing logging API. Once the new system is in place, the coverage of logging around Moodle can be improved.
Suggestions for places in Moodle where logging is needed should be reported as Tracker issues and linked to the Epic MDL-28443 "Action logging improvements". Feel free to echo significant areas here.
- Activities on site, course and activity administration pages.
- Micro activities in a page (more than one per load), such as ???.
- AJAX calls (eg on course page), such as ???.
- Logging of long events (such as an LDAP synchronisation) should have:
- one unique identifier per event (like for example postfix mail server) to help admin monitor the cron processes and
- at least 2 log entries (begining and end) for each event, possibly more.
- Should be careful about logging confidential information. The privacy is controlled by event access control and can_access() method in each storage plugin..
- Shouldn’t delete logs when course is deleted. There is no API to delete logs, each log storage controls the rotation or deleting of old data individually.
See also
Other logging systems:
Examples of data-based learning technology development: