Learning Analytics Specification

Learning Analytics

  • Project state: Planning
  • Tracker issue: XXX
  • Discussion: XXX
  • Assignee: XXX

Note: This page is a work-in-progress. Feedback and suggested improvements are welcome. Please join the discussion on moodle.org or use the page comments.


This is a specification for a Learning Analytics system, including an API for collecting and combining learning analytics data and a number of interfaces for presenting this information.

The specification was started by the Learning Analytics working group at MoodleMoot US 2015 (course, notes).

What are Learning Analytics?

Learning Analytics are pieces of information that can help LMS users improve learning outcomes. Users include students, teachers, administrators and decision-makers.

There are a number of existing reports, blocks and other plugins for Moodle (standard and additional) that provide analytics. Learning Analytics plugins are listed in the Moodle user docs.

Why is a Learning Analytics system needed?

Although there are already a number of sources of learning analytics, there is no API or consistent source of learning analytics that combines data from multiple sources and makes it accessible to plugins throughout Moodle. A framework that relies on a common language will allow the development of learning analytics tools to gather and present information.

Learning analytics are needed to inform teachers about students, to inform the staff that support teachers and to inform students themselves. The primary need for learning analytics has been identified as promoting student retention by identifying each student's status (at risk, normal, ahead), based on their activity in a course.

??? Will we ever need more than three states? How flexible do we need to be? Adding this as a constraint does simplify a lot of the design, but will we regret it later? ???

A learning analytics system should be:

  • predictive (combining currently available data to show current status and enable prediction of future status)
  • proactive (notifying users when action is needed to change status)
  • verifiable (able to compare predictions to actual outcomes to measure accuracy)

??? How important is history and validation? Keeping a history of LA values is not hard, but how could it be used later? ???

Validation against historical data is key because different institutions (and even programs within institutions) have different outcomes they wish to predict (e.g. course completion or grades) and different criteria that may be relevant, such as levels of participation, mastery of specific outcomes (if used), or the depth and breadth of social networks or content access. Creating a universal learning analytics schema is therefore impossible. Even a prepared suite of broadly applicable learning analytics will need to be validated by an institution against its own historical data before being considered usable for predictive purposes by that institution.

A learning analytics system will be able to make use of pre-fetched data that can be combined in varying ways, on the fly.

Predictions will be based on a configuration (weights applied to source data) derived from institution/organisation calibration and adjustable on the fly.

Learning Analytics data gathering API

The learning analytics system is made up of a number of components.

  • Sources represent individual, pre-defined learning analytics. They are objects based on an extensible API, so new kinds of sources can be added.
  • The Learning Analytics API consists of a number of data tables and a calculator that takes learning analytics and calculates a status value, based on the current configuration, as it is requested. It is an API in the sense that other components can request the current status value for a user (student) in a given context (course).
  • A number of essential user interfaces are defined initially, but there are other potential interfaces possible.

Learning Analytics API diagram.png

Sources of Learning Analytics

??? Can we use the word "source" instead of "metric"? I think "source" would be clearer to most users. ???

Each of the designated Learning Analytics sources should be collected so they can be combined in calculations on the fly. Teachers and other users should be able to manipulate weights to explore the effect this has on student status.

The learning analytics source API is pluggable so that additional sources can be added. In this sense, sources are sub-plugins of the Learning Analytics API.
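
As a rough illustration only, a source sub-plugin could be built around a small base class like the one sketched below. All class and method names here (la_source_base, courseaccess_source) are hypothetical and not part of any existing Moodle API; the only firm requirement taken from this specification is that a source returns a value in the {0..1} range (described below) for a user in a context.

  // Hypothetical base class provided by the Learning Analytics API.
  // Every name in this sketch is illustrative only.
  abstract class la_source_base {

      // Return a learning analytics value between 0 and 1 for a user
      // (student) in a context (course).
      abstract public function calculate(int $userid, int $contextid): float;

      // Clamp a raw value into the required {0..1} range.
      protected function clamp(float $value): float {
          return max(0.0, min(1.0, $value));
      }
  }

  // A hypothetical "course access" source: the proportion of elapsed
  // course days on which the student accessed the course.
  class courseaccess_source extends la_source_base {
      public function calculate(int $userid, int $contextid): float {
          // In a real source these figures would come from the logstore.
          $daysaccessed = 12; // placeholder value
          $dayselapsed = 20;  // placeholder value
          return $this->clamp($daysaccessed / $dayselapsed);
      }
  }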

The following sources were identified as important learning analytics that need to be made available.

  • Course access/views (useful early in course) (on the fly DB access)
  • Activity/resource views, reads (event observer, using R in CRUD)
  • Activity resource submissions, postings, etc. (event observer, using C,U in CRUD)
  • Gradebook current grades and course grade (on the fly DB access?)
  • Completion status, if used (on the fly DB access)
  • Assessment Feedback views (quiz, assignment, workshop, others?) (events observer, specific events)
  • Interaction between students (Social Network Analysis) (forum posts vs views, diversity of interaction vs number of) (pre-calculated from DB)

Learning analytics source values are pre-gathered so values are available for combination/calculation as needed. They will be recalculated when dependent information changes. Such changes are observed through specific events. For example, when a user accesses a course, the course access source is triggered and recalculates the related learning analytic value.
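
To make the event-driven recalculation concrete, a source packaged as a sub-plugin could declare an observer in its db/events.php in the usual Moodle way. The event name \core\event\course_viewed is an existing core event; the plugin name lasource_courseaccess and the \local_learninganalytics\api::recalculate() call are hypothetical placeholders for the storage API described later.

  <?php
  // Sketch of db/events.php in a hypothetical lasource_courseaccess sub-plugin.
  defined('MOODLE_INTERNAL') || die();

  $observers = [
      [
          'eventname' => '\core\event\course_viewed', // existing core event
          'callback'  => '\lasource_courseaccess\observer::course_viewed',
      ],
  ];

  <?php
  // Sketch of classes/observer.php in the same hypothetical sub-plugin.
  namespace lasource_courseaccess;

  class observer {
      public static function course_viewed(\core\event\course_viewed $event): void {
          // Ask the (hypothetical) Learning Analytics API to recalculate and
          // store the 'courseaccess' value for this user and course context.
          \local_learninganalytics\api::recalculate(
              'courseaccess', $event->userid, $event->contextid);
      }
  }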

Sources produce a floating point value between zero and one and pass this to the Learning Analytics API for storage. This value limitation is important so that learning analytics can have relative meaning. The way of calculating this value will differ for each source because each draws on different information. The value may be calculated based on:

  • a proportion of a possible maximum grade/value,
  • a normalised relative score for a user within a context,
  • events taking place in relation to expected dates (e.g., submitting before a deadline), or
  • a more complex algorithmic combination of information.
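
For illustration only, the first and third approaches in the list above might look like the following inside a source's calculation; the variable names and the one-week grace period are placeholders, not a defined API or policy.

  // A proportion of a possible maximum grade.
  $value = ($maxgrade > 0) ? min(1.0, $grade / $maxgrade) : 0.0;

  // Events in relation to expected dates: full value when submitted on or
  // before the deadline, decaying linearly to zero over a one-week grace
  // period afterwards (the grace period is an arbitrary example).
  $graceperiod = 7 * 24 * 3600; // one week, in seconds
  $late = max(0, $submittedtime - $duedate);
  $value = max(0.0, 1.0 - $late / $graceperiod);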

??? How will individual LA values be calculated? It will need to differ for each source? Can we describe how this will be done for the desired learning analytics, such as course access? ???

??? Should status values relate to relative or absolute performance in relation to other students? Should they be relative to expected status at a particular point in the course? How should this be calculated at varying stages through a course? Would this be relevant in courses that have no defined beginning or end? ???

LA sources themselves may have some configuration so they can be tweaked under different circumstances (such as education sector or class scales). Such configuration may be limited to the site level or it could be configured within a given context (course).

Some sources would relate to multiple activities. For example, viewing may relate to all activities/resources within a course. ??? Is it necessary for different activities to be distinguished by a source? Could it be the case that some activities should be included/excluded? Would this make the system too complex? ???

Structure and Functions of the Sources API

XXX

Learning Analytics API data structures

To calculate the status of individual users (students), learning analytics values are combined as a weighted sum, according to a configuration, and stored as status values. These status values can be used by numerous interfaces via requests to the Learning Analytics API.

Learning Analytics

  • User ID
  • Context ID
  • Source
  • Timestamp
  • Value {0..1}

Learning Analytics are gathered and/or pre-calculated pieces of information from LA sources. They relate to an individual user's (student's) success as reported by the source. Each calculated learning analytics value is stored so it can be accessed later. Usually only the latest values will be used.
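
For illustration, a learning analytics value with the fields listed above could be stored through Moodle's standard $DB API. The table name local_learninganalytics_values and the field names are placeholders that simply mirror the structure above; they are not final.

  global $DB;

  $record = (object) [
      'userid'      => $userid,
      'contextid'   => $contextid,
      'source'      => 'courseaccess', // name of the source that produced the value
      'timecreated' => time(),
      'value'       => $value,         // float in the range 0..1
  ];
  $DB->insert_record('local_learninganalytics_values', $record);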

Configuration

  • Context ID
  • Source
  • Weight
  • Thresholds
  • Timestamp

The configuration defines how learning analytics should be combined within a particular context (course). Each source is given a weight in the configuration and the weighted sum is calculated to form the student status. Threshold values for at risk, normal and ahead are defined in the configuration.

Configurations can be validated by comparing predicted status against final outcomes, such as course completion or course grade.

Configurations can be saved as revisions with a timestamp, so that a series of status values can be recalculated later and so that the validity of different configurations can be compared over time.
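
As a very rough sketch of what validation could mean in practice (assuming course completion is the outcome being predicted, and treating only the at risk band as a prediction of non-completion), a configuration's accuracy could be measured like this. The function and array shapes are hypothetical.

  // $predictions: userid => status code produced under a given configuration.
  // $completions: userid => true if the student ultimately completed the course.
  function validate_configuration(array $predictions, array $completions): float {
      $correct = 0;
      foreach ($predictions as $userid => $statuscode) {
          $completed = !empty($completions[$userid]);
          $predictedrisk = ($statuscode === 'At risk');
          // A prediction counts as correct when "at risk" matches non-completion.
          if ($predictedrisk === !$completed) {
              $correct++;
          }
      }
      return count($predictions) ? $correct / count($predictions) : 0.0;
  }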

Student Status

  • User ID
  • Context ID
  • Status value {0..1}
  • Status code {'At risk','Normal','Ahead'}
  • Timestamp

Student status values can be requested for each user (student) in each relevant context (their enrolled courses). This information is available from the Learning Analytics API to various interfaces throughout Moodle.

Status values are the weighted sum of learning analytics values, calculated based on the configuration within the context (course). Status values relate to status codes, which denote whether the user (student) is at risk, making normal progress or is ahead within the context (course). These status codes are bounded by thresholds defined in the configuration for the context (course).

Calculations should continue to work when a value from a particular source is absent; in that case, a zero value should be assumed.
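
A minimal sketch of the status calculation described in this section, assuming the configuration has already been loaded as an array of weights plus the two thresholds; all names are illustrative.

  // $values:  source => latest learning analytics value {0..1} for one user.
  // $weights: source => weight, with all weights summing to 1.
  // $atrisk / $normal: upper thresholds of the "at risk" and "normal" bands.
  function calculate_status(array $values, array $weights, float $atrisk, float $normal): array {
      $statusvalue = 0.0;
      foreach ($weights as $source => $weight) {
          // A missing value from a source is assumed to be zero.
          $statusvalue += $weight * ($values[$source] ?? 0.0);
      }
      if ($statusvalue < $atrisk) {
          $statuscode = 'At risk';
      } else if ($statusvalue < $normal) {
          $statuscode = 'Normal';
      } else {
          $statuscode = 'Ahead';
      }
      return [$statusvalue, $statuscode];
  }

For example, calculate_status(['courseaccess' => 0.4, 'grades' => 0.2], ['courseaccess' => 0.5, 'grades' => 0.5], 0.4, 0.7) would return [0.3, 'At risk'].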

Structure and functions of the Learning Analytics API

XXX

Interfaces

The Learning Analytics system development includes a number of interfaces. Initially, a basic set of interfaces will be developed, but the system will be expandable, with future interfaces created as standard or additional plugins.

Configuration Interface (narrowest audience)

The configuration interface is a mandatory part of the system. It allows teachers/administrators to control which learning analytics sources are involved in status calculations, their weights and their configuration. A configuration is relative to a specific context (course).

Learning Analytics configuration.png

  • Thresholds can be set for the at risk and normal status values, implying a threshold for ahead.
  • Each available source is listed in a table of sources. As the LA API is extensible, this list is not fixed and sources could be added.
  • Each source has a weight {0..1}, represented as a percentage. Weights must always sum to 1 (100%). Weights can be adjusted up and down, with the weights of other sources dynamically adjusting to compensate. Adjustments will be saved automatically. ??? When weights are adjusted, how should other weights change to compensate? ??? (One possible approach is sketched after this list.)
  • When a source has a weight of zero, it is not included in status calculations and the source will not appear in other interfaces. To include a source in calculations and to show it on other interfaces, its weight will need to be increased above zero.
  • Sources may have the potential to be configured, although this may not be the case for all sources. Configuration may include tweaking how the source presents its score for students, such as the inclusion/exclusion of certain activities. The configuration will appear in a form presented by the source and this configuration interface will differ between sources. The configuration set for an individual source will be relevant to the context (course). It is assumed that a default configuration will be sufficient for most teaching situations.
  • At the site level, thresholds and a default set of weights can be established and applied for new courses. In this way, a relevant default setup can be created for an institution and the general way that teaching happens there. The learning analytics system should be able to work, even if the configuration interface is never visited in a particular context.
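
One possible (not decided) answer to the open question about compensating weights: when one weight is changed, rescale the remaining weights in proportion to their previous shares so the total stays at 1. A sketch, with illustrative names:

  // $weights: source => weight, currently summing to 1.
  // Set $source to $newweight and rescale the other weights proportionally.
  function adjust_weight(array $weights, string $source, float $newweight): array {
      $otherstotal = 1.0 - $weights[$source];
      $remaining = 1.0 - $newweight;
      foreach ($weights as $key => $weight) {
          if ($key === $source) {
              $weights[$key] = $newweight;
          } else if ($otherstotal > 0) {
              // Keep each other source's share of the remainder unchanged.
              $weights[$key] = $weight / $otherstotal * $remaining;
          } else {
              // All other weights were zero: share the remainder equally.
              $weights[$key] = $remaining / (count($weights) - 1);
          }
      }
      return $weights;
  }

For example, raising one of four equal weights (0.25 each) to 0.4 would leave the other three at 0.2 each.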

List of students sorted by status (teachers, support staff)

The list of students in a context (course) will be displayed on one page, with their status value and number of notifications sent from the learning analytics system. The list of students is always sorted by status value, but may be filtered.

LA List interface.png

  • The list is always sorted by status value, with students having the lowest status values (at risk) listed at the top and ordered downwards towards students whose status is normal or ahead. This sorting denotes the relative performance/position/rank of the student in the class.
  • When the size of the class exceeds a certain number, say 50, it may be useful to include pagination. To assist in finding individuals, filtering by name may be included.
  • The row below the table header includes controls for on-the-fly weight adjustments. (These controls should be visually distinct from those that indicate sorting.) When weights are adjusted, other weights will dynamically compensate and adjustments will be saved automatically. The effect of weighting changes will be an immediate recalculation of status values, potentially reordering students and providing new status icons.
  • The status column will include a colour-coded status icon. These colours will need to be configurable at the site level as colours can have different meanings in different cultures; also, some sites may wish to match status icons to theme colours. The default colours will be:
    • red for at risk,
    • blue for normal and
    • green for ahead.
  • Hovering over the status icon will reveal the status and status value (as a percentage).
  • Notifications sent to students from the learning analytics system will be shown. An envelope icon appears with the number of sent notifications beside it (when greater than zero). Clicking on the envelope takes the user to the notification interface, where they can send an individual notification to the relevant user (student).

Detail view per user (student)

The detail view shows details for an individual user (student).

Detail view per student.png

  • The status icon is shown with the textual term for the status and the status value (as a percentage).
  • The relative performance of the user is given in relation to other users in the course, representing their position in the class; a percentile is also given.
  • The number of notifications sent from the system can be seen and there is an explicit link to allow individual notifications to be sent to the user.
  • Components of the status, from each source, are given and there could potentially be additional information provided by the relevant source (such as the last access date).
  • ??? What other information could be shown on the Details view page? It's not much more informative than the table. ???
  • By default, the user (student) referred to in details should not be able to see this page. A capability controlling this should be created so exceptions can be made. (Studies have shown that showing at-risk students detailed evidence of their progress leads to poorer outcomes.)

Push notifications (widest audience)

The notification interface allows notifications to be sent to users in response to their status. This interface is similar to the messaging interface, but needs to be independent from it so that:

  • notifications can be sent based on triggers,
  • specific information can be included in notifications and
  • notifications can be tracked in relation to the learning analytics system.

Notifications interface.png

  • Notifications can be sent to a group of students who have been identified as at risk (default) or to everyone in the course.
  • Notifications can be sent at a certain point in time, for example, every two weeks, or they can be sent Manually (default). When a frequency other than Manually is selected, the Send now button will be replaced by a Save button.
  • Notifications sent to users (students) can also be sent to secondary users as a way of tracking notifications. The list of secondary recipients would be based on roles of users enrolled in the course, excluding those based on archetypal student roles. For example, a teacher could be sent notifications at the same time they are sent to students. If other roles are defined in the course, such as a manager or mentor, these could be nominated as a secondary recipient. A note should be automatically added to the subject when sent to secondary recipients.
  • The message subject and body can contain place-holders, replaced by values when the message is sent. Help will need to be provided on using these place-holders. Default values for subject and body will be made available, based on language strings. (A possible substitution step is sketched after this list.)
  • The notification interface can be used to send a message to an individual user (student) when this action is selected from the list or detail views. When sending to an individual, the status and frequency controls will be hidden. The only button possible will be Send now as this is an act of manual intervention.
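
To illustrate the place-holder idea only: a simple substitution step, run when the notification is sent, could look like the following. The place-holder names and the function are examples, not a defined set.

  // Replace example place-holders in a notification template.
  function expand_placeholders(string $template, \stdClass $user, \stdClass $course,
          string $statuscode, float $statusvalue): string {
      $placeholders = [
          '{firstname}'   => $user->firstname,
          '{coursename}'  => $course->fullname,
          '{status}'      => $statuscode,                      // e.g. 'At risk'
          '{statusvalue}' => round($statusvalue * 100) . '%',  // e.g. '30%'
      ];
      return strtr($template, $placeholders);
  }

  $subject = expand_placeholders($subjecttemplate, $user, $course, $statuscode, $statusvalue);
  $body    = expand_placeholders($bodytemplate, $user, $course, $statuscode, $statusvalue);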

How it works

XXX

Gathering data:

  • Event takes place (e.g., a student accesses a course)
  • Observing source is triggered
  • Learning analytics value is calculated
  • Learning analytics value is passed to the LA API for storage
  • LA value is stored

Requesting a status:

  • Status is requested
  • Configuration is retrieved
  • LA values are gathered
  • Status calculation sums the weighted LA values
  • Status codes are calculated, based on thresholds
  • Status is passed back

Benefits

  • XXX

Usage scenarios

  • XXX

Future work

XXX

Other potential extensions to the system

  • Reporting of resource/activity effectiveness in contributing to learning outcomes
  • Potential for students to opt in/out of data collection, analysis, reporting or exports. This may be necessary in some places. It might be possible to have student logging be anonymous or exported anonymously.
  • Ability to export data as a static dump or as a stream of analytics information, possibly conforming to the IMS Caliper standard, for extra-system analysis.
  • Third-party plugins that use the Learning Analytics API

Other potential sources of Learning Analytics data

In addition to the sources suggested above, the following are other potential sources of Learning Analytics data.

  • performance in other courses enrolled (now & previously)
  • student ratings on forums
  • self-evaluation
  • history of activity within the current course
  • history of course activity within a faculty/category
  • mood/sense-making
  • activity in relation to deadlines
  • time spent online based on activity captured through JavaScript
  • demographic/profile data
    • year/progress
    • past grades, prior credit, GPA
    • completion of prerequisites vs recommended prerequisites
    • personal background
    • native language, exceptionalities / accommodations
  • cross-instance performance across multiple Moodle instances

Other potential interfaces

  • An institutional summary view showing an overview of student retention in courses
  • Activity restrictions based on risk status derived from learning analytics
  • A block showing a student their current status and linking to their details, showing teachers a list of students (with at risk status?)
  • A profile page element showing the student's status

See also