Perth Hackfest October 2012/Logging User activity / Reporting / Analytics


Revision as of 07:27, 17 December 2012

Why is Logging an important topic?

  • Little new Moodle development is based on measured data. We need data to:
    • Prove that there is a problem.
    • Prove that a given change is an improvement.
  • Data comes from logging
  • Better data would allow researchers to study what happens in online teaching.
  • Better data would give admins more feedback on how to run their site (both technically and in terms of process).
  • Better data would give teachers better feedback to improve their teaching process.
  • Better data would give students better feedback to improve the learning process.
  • Note that we can't predict every analysis that will be needed.

Problems

  • Logging is added ad hoc by developers, so coverage is spotty. Admin actions are not logged at all!
  • Log tables are joined in many popular queries, which makes those queries slow.
  • There is no archiving, so old logs have to be deleted.
  • Deletion is wholesale, with no fine-grained control.
  • There is no synchronization with server logs to help debug performance issues.
  • No traces are stored when there are problems, which makes Moodle issues hard to debug.


Parts of Moodle that aren’t being logged but should be

  • Errors and warnings
    • People report errors, but the errors cannot be reproduced.
    • Logging history with stack trace would be very useful
    • Also dump session variables and all other data that can be useful for debugging
  • Admin activities
  • Micro activities in a page (more than one per load)
  • AJAX calls (e.g. on the course page)
  • We should be careful about logging confidential information (e.g. do not log any POST data)
  • Logs shouldn’t be deleted when a course is deleted
  • All events should be logged
    • Add more events (they are cheap and very useful)
    • Need more standardisation of event data for all triggered events
    • Add a catch-all handler that pushes events to the logging system
  • See also: Action logging improvements META (MDL-28443)
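
The catch-all handler idea above could look roughly like the following sketch. It is written in Python purely for illustration and is not Moodle code: the Event and EventDispatcher classes, the on_any method and the event names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    """Hypothetical event record; field names are assumptions."""
    name: str
    userid: int
    data: Dict[str, Any] = field(default_factory=dict)


class EventDispatcher:
    """Hypothetical dispatcher with per-event and catch-all handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}
        self._catch_all: List[Callable[[Event], None]] = []

    def on(self, name: str, handler: Callable[[Event], None]) -> None:
        """Register a handler for one specific event name."""
        self._handlers.setdefault(name, []).append(handler)

    def on_any(self, handler: Callable[[Event], None]) -> None:
        """Register a handler that receives every triggered event."""
        self._catch_all.append(handler)

    def trigger(self, event: Event) -> None:
        for handler in self._handlers.get(event.name, []):
            handler(event)
        for handler in self._catch_all:
            handler(event)


# Stand-in for the logging system: a plain in-memory list.
log_records: List[Dict[str, Any]] = []


def log_every_event(event: Event) -> None:
    """Catch-all handler: push each event into the log."""
    log_records.append({"event": event.name, "userid": event.userid, **event.data})


dispatcher = EventDispatcher()
dispatcher.on_any(log_every_event)
dispatcher.trigger(Event("course_viewed", userid=7, data={"courseid": 3}))
dispatcher.trigger(Event("login_failed", userid=0, data={"username": "guest"}))
```

Because the handler is registered once for all events, adding a new event elsewhere in the code gets it logged without any extra logging call. This only works well if event data is standardised, as noted above.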

Current users of the log table

Many of the time-critical uses of the log table can be moved to new tables built specifically for each purpose and populated from events. This will speed things up a lot.

  • recent activity (block, myMoodle) EVENTS -> NEW TABLE
  • conditional access EVENTS -> NEW TABLE
  • failed logins EVENTS -> NEW TABLE
  • files via clamav EVENTS -> NEW TABLE

Other things can continue to query the main logs:

  • course activity report, participation report, user outline report
  • stats
  • cron
  • automated backup locking/heartbeats (probably)
  • auth/mnet/auth.php
  • backup/restore (target logs and own logging)
  • grade history (currently separate log)

And some future things that will need the full log:

  • Engagement reporting


Basic approach

We can't actually predict all the use cases so we must think generically and plan for worst cases.

  • Make logging calls as cheap as possible
  • Use MUC-like plug-ins to determine where to put logs permanently (NoSQL, SAS, file...). Default will be to the current log table.
  • Log everything we can possibly think of.
  • Define a log level on each logging call so that sites can choose what they want logged and thereby control log size, speed, etc.
  • Retain add_to_log for backward compatibility by making it a wrapper for the new logger function.
  • Provide hooks to expose logs everywhere. For example, each page should have a link that shows logs and stats for that page.
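
A minimal sketch of how these points might fit together: a logger that filters by a site-wide log level, hands records to pluggable stores, and is wrapped by a legacy-style add_to_log. Everything here is hypothetical and written in Python for illustration. In particular, the level constants, the Logger class and the simplified add_to_log parameter list are assumptions and do not reproduce Moodle's actual function signature.

```python
from typing import Any, Dict, List, Protocol

# Hypothetical log levels: a lower number means more important.
LEVEL_MINIMAL, LEVEL_NORMAL, LEVEL_VERBOSE = 1, 2, 3


class LogStore(Protocol):
    """Anything with a write() method can act as a store plug-in."""

    def write(self, record: Dict[str, Any]) -> None: ...


class MemoryStore:
    """Trivial store used here in place of the default log table."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    def write(self, record: Dict[str, Any]) -> None:
        self.records.append(record)


class Logger:
    def __init__(self, stores: List[LogStore], site_level: int) -> None:
        self.stores = stores
        self.site_level = site_level

    def log(self, level: int, action: str, **fields: Any) -> bool:
        """Write the record to every store unless the site filters it out."""
        if level > self.site_level:
            return False  # cheap early exit keeps unwanted calls inexpensive
        record = {"level": level, "action": action, **fields}
        for store in self.stores:
            store.write(record)
        return True


store = MemoryStore()
logger = Logger([store], site_level=LEVEL_NORMAL)


def add_to_log(courseid: int, module: str, action: str,
               url: str = "", info: str = "") -> bool:
    """Legacy-style wrapper (simplified, assumed parameters)."""
    return logger.log(LEVEL_NORMAL, f"{module} {action}",
                      courseid=courseid, url=url, info=info)


add_to_log(3, "forum", "view", "view.php?id=5")                # stored
logger.log(LEVEL_VERBOSE, "ajax section toggle", courseid=3)   # filtered out
```

A small site could set site_level to LEVEL_MINIMAL to keep its logs small, while a research site could raise it to LEVEL_VERBOSE without any change to the calling code.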


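Moodle -> Logs

  • MUC-like plugins to interface between our logging calls and log backends
    • File
    • logstash
    • fluentd
    • memcached?
    • http://logging.apache.org/log4php/ ?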

Logs -> Moodle

  • Each plugin implements some methods to query and extract logs back to Moodle
  • Do we auto-sync anything back to Moodle tables for convenience?
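
One possible shape for such a plugin contract is sketched below: every store implements both a write method and a query method, so reports can read logs back regardless of where they are kept. This is illustrative Python only. The LogStorePlugin interface, the JsonLinesStore class and the filter parameters are assumptions, not an agreed design.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, Optional


class LogStorePlugin(ABC):
    """Hypothetical contract that each log store plugin would implement."""

    @abstractmethod
    def write(self, record: Dict[str, Any]) -> None:
        """Moodle -> Logs: persist one record."""

    @abstractmethod
    def query(self, courseid: Optional[int] = None,
              userid: Optional[int] = None) -> Iterator[Dict[str, Any]]:
        """Logs -> Moodle: yield the records that match the filters."""


class JsonLinesStore(LogStorePlugin):
    """File back-end: one JSON object per line."""

    def __init__(self, path: str) -> None:
        self.path = path

    def write(self, record: Dict[str, Any]) -> None:
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    def query(self, courseid: Optional[int] = None,
              userid: Optional[int] = None) -> Iterator[Dict[str, Any]]:
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as fh:
            for line in fh:
                record = json.loads(line)
                if courseid is not None and record.get("courseid") != courseid:
                    continue
                if userid is not None and record.get("userid") != userid:
                    continue
                yield record


path = os.path.join(tempfile.mkdtemp(), "moodle.log")
file_store = JsonLinesStore(path)
file_store.write({"courseid": 3, "userid": 7, "action": "forum view"})
file_store.write({"courseid": 4, "userid": 7, "action": "quiz attempt"})
course3 = list(file_store.query(courseid=3))
```

A back-end that cannot answer queries efficiently, such as a plain file or a UDP stream, is where the auto-sync question above becomes relevant: such a store might need to copy a subset of records back into Moodle tables.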



See also

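Examples of data-based learning technology development:

  • http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8535.2008.00928.x/abstract
  • http://www.sciencedirect.com/science/article/pii/S0360131510000461 -> qtype_pmatch
  • http://www.tandfonline.com/doi/abs/10.1080/02680513.2011.567754
  • http://oro.open.ac.uk/24619/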