Logging 2

Jump to: navigation, search
Logging stage 2
Project state Completed
Tracker issue MDL-37658
Discussion https://moodle.org/mod/forum/discuss.php?d=235318
Assignee moodle.com BACKEND team


Benefits of the new logging system

A few benefits of the new logging system are:

  • it allows capturer of richer information from Moodle about actions;
  • it provides control over how much information is logged;
  • it abstracts log writing and reading so logging can be scaled;
  • it lets you write/read logs to/from any external system through the use of logstore plugins;
  • it allows control over how much logging history is kept; and
  • it supports research, reporting and analytics.

Why is Logging an important topic?

Logging is fundamental to research, reporting and analytics. Logging allows informed, evidence based decisions to be made.

It is important that changes to Moodle are based on measured data. We need to be able to:

  • prove there is a problem and
  • prove that a given change is an improvement.

Logging data can be used by different Moodle users.

  • Better data allows researchers to study what happens in online teaching.
  • Better data gives admins more feedback on how to run their site - both technically and in relation to processes. (Please note this change is not about profiling).
  • Better data gives teachers better feedback to improve their teaching processes.
  • Better data gives students better feedback to improve their learning.

We can't predict all forms of analysis that might happen in future.

  • The logging system needs to be flexible in order to accommodate applications by these different users.
  • The logging API needs to be constructed in a way that is possible to improve in future.

Problems with the old logging approach

The old logging infrastructure had a number of issues that were resolved by this project.

  • Logging was added ad-hoc by developers and had spotty coverage.
  • Some logged actions did not contain important information needed later.
  • There were performance and scalability problems, such as:
    • log tables were joined in many popular queries and were slow;
    • logged events were stored in a single Moodle database table, which was not scalable;
    • it was not possible to configure levels of logging ; and
    • it was not possible to disable logging completely.

Logging

The new Events API introduced in Moodle 2.6 provides a notification mechanism and single collection point for actions within Moodle. The Events system also defines information about events that can be extended with additional information and functionality by core code and plugins, as needed.

In order to capture events a logging API was specified, consisting of a number of plugin types.

To transition from the old logging system to the new, work was required in a number of areas.

  • In places where we previously added information to 'the' log table, we must instead generate a new event.
  • The standard Moodle distribution now includes three simple logging storage plugins that will store events in new format (to an internal or external database).
  • The original logging code and DB tables are available for full backwards compatibility for users of customised reports. It is possible to enable logging to the old {log} table if required.
  • All standard reports have been upgraded to use the new logging system and use a logging retrieval plugin to access log information.
  • Custom report plugins (add-ons) will be able to query the {log} table, but should ideally shift to the new logging system over time, so that sites are not forced to continue double logging.

Old log usage analysis

See this analysis to see log usage in Moodle 2.5 core code.

Example scenario

Writing to the log

  1. A student submits an assignment.
  2. An event is triggered by the Assignment module.
  3. The log manager plugin observes the event and determines if it is log-worthy depending on some admin settings.
  4. The log manager sends the event to all enabled log storage subplugins implementing \tool_log\log\writer. Storage plugins may decide to filter out some events based on their admin settings.
  5. The log store plugin(s) stores the event data optionally including extra information such as remote IP or original logged-in-as user id.

Reading from the log

  1. A teacher opens an activity report.
  2. The activity report queries available log readers by using the current log manager (implementing \core\log\manager).
  3. If there is more than one log reader (implementing \core\log\sql_select_reader) available they can select the source of data for the report.
  4. The report requests events from the log reader and displays them.

From event to report

There are four steps of how event is captured, logged and potentially presented in a report. These steps are summarised in Figure 1 below and can be described as follows.

  1. Handling of events and filtering what needs to be logged. See interface \core\log\manager below.
  2. Storing the event data in the log storage (DB, filesystem, etc.). See interface \tool_log\log\writer below.
  3. Retrieving the data from log storage - each plugin implements some methods to query and extract logs back to Moodle. See interfaces \core\log\reader and \core\log\sql_reader below.
  4. Displaying the data in the report.

One plugin may cover multiple steps. For example we expect that most of log writer plugins will also implement log reader. It is also possible to create a report that has built-in event handling, log writing and reading and report presentation (all four steps). Also external logging systems may not wish to allow log data they capture to be read back into Moodle, handling such data instead through an external system, so they may not implement steps 3 and 4.

logging plugins relation.png

Figure 1. Logging plugins relationship diagram

The Figure 2 below shows how the requests are made between user actions, plugins and core when:

  • configuring logging (which log writer plugins are enabled);
  • an event occurs;
  • a report is configured (selecting log store) and
  • a report is viewed.

logging_sequence.png

Figure 2. Logging plugins sequence diagram

Public API

This section describes all interfaces that need to be implemented by log manager and each log storage class.

Reading API

  • function \get_log_manager() - returns the current log manager if configured or some dummy instance otherwise, it uses the signleton pattern.
  • interface \core\log\manager - log manager interface, in standard distribution implemented by \tool_log\log\manager and \core\log\dummy_manager
    • public function get_readers($interface = null) returns the active storages implementing reader interface, returns array(classname => storage instance, ...)
  • interface \core\log\reader - general event reader
    • public function get_name() - returns name of the log storage, this will be used when selection data source in reports.
    • public function get_description() - some help for admins.
    • public function is_logging() - is latest data available?
  • interface \core\log\sql_select_reader extends \core\log\reader - general log storage that supports sql selects (or emulates them)
    • public function get_events_select($selectwhere, array $params, $order, $limitfrom, $limitnum) - returns event object instances matching given parameters.
    • public function get_events_select_count($selectwhere, array $params) - returns number of events that would be returned from get_events() with the same parameters.
      • $selectwhere is supposed to be a limited sql predicate.
      • The set of operators allowed in $selectwhere are :- =, <>, >, >=, <, <=
      • The set of keywords allowed in $selectwhere are :- IN, BETWEEN, LIKE, IS NULL, IS NOT NULL
      • Logical operators OR, AND
      • NOT operator
      • ( and ) brackets
      •  ? and :named parameter placeholders
      • Use of anything besides the mentioned keywords/operators is strictly forbidden.
  • interface \core\log\sql_internal_reader extends \core\log\reader - log storage that can expose data as read-only database view or table.
    • public function get_internal_log_table_name() - returns database table or view; structure is described in phpdocs, it will include all columns defined by log_sql

Most reports are expected to use only the reading API. Please note that all non-sql readers need to somehow translate the $selectwhere SQL query syntax.

When reports are searching for log readers, they use the get_log_manager()->get_readers($interface = null) where $interface is the name of reader interface that the report supports. If there are multiple readers returned by the log manager, then the report is responsible for letting the user choose which storage to create the report on.

Writing API

  • interface \tool_log\log\store - must be implemented by each store subplugin.
    • public function __construct(\tool_log\log\manager $manager) - no parameters, configuration is taken from admin settings.
    • public function dispose() - called when reading/writing finished.
  • interface \tool_log\log\writer extends \tool_log\log\store - interface indicating the log store may be used for writing.
    • public function write(\core\event\base $event) - called from event observer defined in log manager.

The writing API is defined in tool_log plugin, the log stores are defined as new subplugin of this plugin.

Each log store plugin will be responsible for filtering of events, it may use standardised admin settings and filtering or decide to implement custom filtering.

Standard logging plugins

Standard Moodle distribution will include four plugins.

Logging management plugin (tool_log)

Defines:

 class \tool_log\log\manager implements \core\log\manager;

The log manager is responsible for step 1 from Figure 1 above. It allows admins to configure which log store subplugin is enabled and in which order they appear. The log store plugins are implemented as tool_log subplugins with class "store" in their namespace (\logstore_xxxx\log\store).

The log manager registers a '*' event observer, this observer is external - it means that events are received only when database transactions are not in progress.

Standard log storage plugin (logstore_standard)

Defines:

 class \logstore_standard\log\store implements \core\log\sql_internal_reader, \tool_log\log\writer;

The name logstore_standard does not seem to be optimal - logstore_db is not allowed because it collides with standard /db/ subdirectory in plugins and logstore_database should be probably reserved for external database logging (to prevent any confusion).

This log storage plugin is responsible for steps 2 and 3 from Figure 1 above. It is the primary replacement for the former {log} table. The performance will be similar to former log table, but for smaller sites, this simple default will suffice.

Each plugin decides how much data is kept, when it is deleted, etc.

Storage settings

Each storage plugin with a writer interface is responsible for backlisting or filtering of events. Some of the examples of filtering that can done is as follows.

  • Include guests actions (default Yes)
  • Include log levels (default empty), accepts constants from Event_2#Level_property
  • Include by CRUD (default empty), accepts C, R, U, and D
  • Include by component (default empty)
  • Include by event name (default empty)

External database log storage plugin (logstore_database)

Defines:

 class \logstore_database\log\store implements \core\log\sql_select_reader, \tool_log\log\writer;

This log storage plugin is responsible for step 2 only from Figure 1 above. There will be no interface to read from it provided with Moodle core.

Legacy log table reader (logstore_legacy)

Defines:

 class \logstore_legacy\log\store implements \core\log\sql_select_reader, \tool_log\log\store;

This plugin will not conduct any log writing; it will translate existing {log} table data into the new event format and transforms new SQL selects to match the legacy table format. This serves as an emulation layer that allows new reports to access historical data stored in mdl_log table. Users will have to select this datasource manually if they want to see historical data that is not present in new log tables.

In new reports the legacy data will be presented using a special legacy event class that can not be triggered: \logstore_legacy\event\legacy

Performance

Compared to older Moodle versions, the new system will need to deal with higher number of log entries. This has been improved by buffering log entry writes together with new bulk insert operations in the DML layer. Helper traits were introduced \tool_log\helper\buffered_writer to help stores easily buffer the events.

Running new and old event logging at the same time with result in twice the number of DB inserts. It is therefore recommended to disable the old logging as soon as possible.

In [http://tracker.moodle.org/browse/MDL-42894 MDL-42894], a logging API comparison was done and it's results were promising.

Enabled Stores dbwrites dbreads Time used Session size Memory Server load
Legacy + Standard store Little increase Same Same Same Same Increased
Only Standard store Decrease Same Same Same Same Increased
Only Legacy store Same Same Same Same Same Decrease
Neither Legacy nor Standard store Decrease Same Same Same Same Decrease

FYI: Serverload is showing worse in most cases, but as this was done on my local machine with all interference, I am ignoring this for now. --Rajesh Taneja (talk) 16:19, 14 February 2014 (WST)

Screenshots

Log management page

Log management page

Log store settings pages

settings page for a the standard log setup
settings page for an external log: Database setup
settings page for an external log: Filtering

Overarching considerations

We can't actually predict all use cases, so we must think generically and plan for worst cases. Here are some overarching principles that will be followed in the implementation of the new logging system.

  • Things that don't need to use logs should not use logs. (Recent activity, etc)
  • Logging calls should be as cheap as possible.
  • The Events API will be the central communication mechanism, allowing lots of potential information gathering tools to exist in future.
  • Use plug-ins to determine where to put logs permanently (NoSQL, SAS, file...). Initially, the default will be to the current log table.
  • Log everything we possibly can think of.
  • Define log level settings (on each logging call) so different sites can choose what they want logged, which allows controls cotrol of log size, speed, etc..
  • The functionality of the pre-existing function add_to_log() should be retained for backward compatibility, but allowing admins to disable it.

Reports

Logging usage#Reports describes current reports.

Migration to new logging API

Follow-up Projects

  • Converting completion to use events - MDL-42897
  • Developing a Timeline page/block so each user to see a stream of activity that is relevant to them individually - MDL-42898

Related

Here are some ideas of how new even data and observers can be put to use. Please note that the log storage API is not optimised for these tasks.

  • Timely notifications of critical events (defined by users), for example more than five people posted in the same forum within an hour.
  • "Red circle" notification badge counter icons everywhere in Moodle highlighting things that a user needs to pay attention attention to, such as ???.
  • Colour heatmaps overlaid on course pages showing recent usage patterns of the course.
  • Visualisation of live activity across a whole site (for admins) (is that logging or event handling?)
  • Live engagement analytics reporting ??? that take specific actions into account, such as ???.
  • Identification of students at risk based on their involvement in the course (or lack of it).

Logging of exceptions, errors and debugging

  • Logging of Errors and warnings is not part of this specification, it meeds to be handled via a completely separate mechanism.
    • People report errors that cannot be duplicated. Logging errors with a stack trace, session variables and other relevant data could be useful for debugging.
  • Standard logs may theoretically produce errors and exceptions that need to be logged somewhere else.
  • There is no synchronization with server logs to help debugging performance issues. (This will not be addressed by new logging API.)
  • There is no logging of error information to help debug Moodle issues. (This will not be addressed by new logging API.)

Performance logging and profiling

We need to measure the performance of logging system and store the information somewhere.

Other

  • There is no mechanism for archiving, so old logs are deleted and lost.
  • Log deletion is indiscriminate, with no control of what log information is deleted or retained.

Places where we need more logs

The principle work of this project is to replace the existing logging API. Once the new system is in place, the coverage of logging around Moodle can be improved.

Suggestions for places in Moodle where logging is needed should be reported as Tracker issues and linked to the Epic MDL-28443 "Action logging improvements". Feel free to echo significant areas here.

  • Activities on site, course and activity administration pages.
  • Micro activities in a page (more than one per load), such as ???.
  • AJAX calls (eg on course page), such as ???.
  • Logging of long events (such as an LDAP synchronisation) should have:
    • one unique identifier per event (like for example postfix mail server) to help admin monitor the cron processes and
    • at least 2 log entries (begining and end) for each event, possibly more.
  • Should be careful about logging confidential information. Anonymous events can be used to controll privacy.
  • Shouldn’t delete logs when course is deleted. There is no API to delete logs, each log storage controls the rotation or deleting of old data individually.

See also

Other logging systems:

Examples of data-based learning technology development: