Students at risk of dropping out

Overview

Beginning in version 3.4, Moodle core implements open source, transparent, next-generation learning analytics. In Moodle 3.4, this system ships with a built-in model called "Students at risk of dropping out." This documentation describes that model in detail.

This model predicts students who are at risk of non-completion (dropping out) of a Moodle course, based on low student engagement. In this model, the definition of "dropping out" is "no student activity in the last quarter of the course." This prediction model uses the Community of Inquiry model of student engagement, consisting of three parts:

Cognitive presence
Social presence
Teaching presence

See below for further details of how these constructs are defined within the model.

Features

By abstracting the concepts of "cognitive presence" and "social presence," this prediction model is able to analyze and draw conclusions from a wide variety of courses, and apply those conclusions to make predictions about new courses, even courses never taught on the Moodle system before. The model is not limited to making predictions about student success only in exact duplicates of courses offered in the past.

Limitations

This prediction model assumes that courses have fixed start and end dates, and is not designed to be used with rolling enrollment courses. Models that support a wider range of course types will be included in future versions of Moodle.
1. Because of this design assumption, it is very important to set the start and end dates correctly for every course that uses this model. If the start and end dates of both past and ongoing courses are not set correctly, predictions cannot be accurate.
2. Courses will not be included in training or predictions if the end date is before the start date.
This model requires the use of sections within the courses, in order to split all activities into time ranges.
Courses with start and end dates further than one year apart will not be used.
This model requires a certain amount of in-Moodle data with which to make predictions. At the present time, only core Moodle activities are included in the indicator set (see below). Courses which do not include several core Moodle activities per “time slice” will have poor predictive support in this model. This prediction model will be most effective with fully online or “hybrid” or “blended” courses with substantial online components.
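
The date-related restrictions above can be summarized in a short Python sketch. It is illustrative only and is not Moodle's implementation (which is written in PHP). The function name `course_dates_usable`, the treatment of unset dates as unusable, and the reading of "one year" as 365 days are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Assumption: "one year" is treated as 365 days for illustration.
MAX_COURSE_LENGTH = timedelta(days=365)


def course_dates_usable(start, end):
    """Return True if a course's dates satisfy the limitations listed above.

    Hypothetical helper; `start` and `end` are datetimes, or None if unset.
    """
    if start is None or end is None:
        # Assumption: a course without both dates cannot be analysed.
        return False
    if end < start:
        return False  # end date before start date: course is excluded
    if end - start > MAX_COURSE_LENGTH:
        return False  # dates further than one year apart: course is not used
    return True


print(course_dates_usable(datetime(2018, 1, 8), datetime(2018, 4, 30)))  # True
print(course_dates_usable(datetime(2018, 4, 30), datetime(2018, 1, 8)))  # False
```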

Because the course end date field was only introduced in Moodle 3.2 and some courses may not have set a course start date in the past, we include a command line interface script:

$ php admin/tool/analytics/cli/guess_course_start_and_end.php

This script attempts to estimate past course start and end dates by looking at student enrolments and students' activity logs. After running this script, please check that the estimated start and end dates are reasonably correct.

Target

The target here is "students drop out of the course" (negative target) and is defined as: enrolled students show no activity in the final quarter of the course.

Additionally,

Enrolments whose end time is before the course's current start date are excluded from predictions
Enrolments lasting more than 1 year are excluded
Course completion can be used as a success metric if it is enabled
Otherwise, activity within the last quarter of the course is considered "not dropping out"

The code for this target is located at moodlesite/lib/classes/analytics/target/course_dropout.php

where moodlesite is the root directory of your Moodle site.
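
As a rough illustration of the definition above, the following Python sketch labels a single enrolment. It is not the code in course_dropout.php. The function name `dropped_out` and its inputs are hypothetical, and the sketch covers only the "no activity in the final quarter" rule, ignoring course completion and the enrolment exclusions.

```python
from datetime import datetime, timedelta


def dropped_out(course_start, course_end, activity_times):
    """Return True when the student shows no activity in the final quarter.

    Hypothetical helper: `activity_times` is a list of datetimes at which
    the student did anything in the course.
    """
    # The final quarter begins three quarters of the way through the course.
    final_quarter_start = course_start + (course_end - course_start) * 0.75
    active_late = any(
        final_quarter_start <= t <= course_end for t in activity_times
    )
    return not active_late


start = datetime(2018, 1, 1)
end = start + timedelta(days=100)  # final quarter starts on day 75

print(dropped_out(start, end, [start + timedelta(days=10)]))  # True
print(dropped_out(start, end, [start + timedelta(days=80)]))  # False
```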

Indicators

Indicators can be defined at any context level. The indicators used in this model are based on the concepts of "cognitive depth" and "social breadth," which are implemented for each of the core activity modules.

Cognitive depth

Cognitive depth is a measure of the construct "cognitive presence" within the Community of Inquiry theoretical framework. Cognitive presence is defined as “The extent to which the participants in any particular configuration of a community of inquiry are able to construct meaning through sustained communication” (Garrison, Anderson & Archer, 2000, p 89). Cognitive presence has usually been determined in research by manual content analysis. In this model, we define this construct based on the type of activity offered to the student, and the extent to which the student demonstrates cognitive engagement in that activity. The level of depth ranges from 0 to 5, where 0 indicates that the learner has not even viewed the activity. The levels of potential cognitive depth are:

1. The learner has viewed the activity details
2. The learner has submitted content to the activity
3. The learner has viewed feedback from an instructor or peer for the activity
4. The learner has provided feedback to the instructor or a peer within the activity
5. The learner has revised and/or resubmitted content to the activity

This model begins by assigning a maximum potential value of cognitive depth to each activity module. For example, the Assignment module allows up to cognitive depth of 4. See below for more details of how these levels are assigned for core activity modules.

Once the potential levels are assigned, each student enrolled in a course is evaluated based on the proportion of the potential depth reached. For example, if an activity only supports up to level 3 and the student has reached level 3, the student is participating at 100 percent of the possible level of cognitive depth.
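
The proportion described above is simple arithmetic, sketched here in Python. The function name `depth_proportion` is hypothetical, and the sketch follows only the wording of this page; how Moodle scales indicator values internally is not described here and may differ.

```python
def depth_proportion(reached_level, potential_level):
    """Proportion of an activity's potential cognitive depth that was reached.

    Hypothetical helper: levels are integers, 0 means "not even viewed",
    and an activity's potential level is between 1 and 5.
    """
    if not 1 <= potential_level <= 5:
        raise ValueError("potential level must be between 1 and 5")
    if not 0 <= reached_level <= potential_level:
        raise ValueError("reached level must be between 0 and the potential level")
    return reached_level / potential_level


print(depth_proportion(3, 3))  # 1.0 (100 percent of the possible depth)
print(depth_proportion(2, 4))  # 0.5
```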

Social breadth

Social breadth is a measure of the construct "social presence" within the Community of Inquiry theoretical framework. It is defined as “The ability of participants to identify with the group or course of study, communicate purposefully in a trusting environment, and develop personal and affective relationships progressively by way of projecting their individual personalities” (Garrison, 2009, p 352). In the past, social presence has usually been measured via post-course surveys and manual discourse analysis, though there have been increasing attempts to automate this process. This model implements social presence as "social breadth" by examining the breadth of opportunities the participant has to communicate with others. The level of breadth ranges from 0 to 5, where 0 indicates the learner has not interacted with anyone. The levels of potential social breadth are:

1. The learner has not interacted with any other participant in this activity (e.g. they have read a page)
2. The learner has interacted with at least one other participant (e.g. they have submitted an assignment or attempted a self-grading quiz that provides feedback)
3. The learner has interacted with multiple participants in this activity (e.g. by posting to a discussion forum, wiki, database, etc.)
4. The learner has interacted with participants in at least one "volley" of communications back and forth
5. The learner has interacted with people outside the class (e.g. in an authentic community of practice)

This model begins by assigning a maximum potential value of social breadth to each activity module. For example, the Assignment module allows up to social breadth of 2. See below for more details of how these levels are assigned for core activity modules.

Once the potential levels are assigned, each student enrolled in a course is evaluated based on the proportion of the potential breadth reached. For example, if an activity only supports up to level 3 and the student has reached level 3, the student is participating at 100 percent of the possible level of social breadth.

Potential indicator levels for selected activity modules

The potential for engagement via cognitive presence and social presence constitutes instructional design, which is one key element of teaching presence. This diagram shows the potential cognitive depth and social breadth of all core activities and select non-core activities:

By categorizing each activity by potential cognitive depth and social breadth, we can anticipate what level of engagement is supported for (and possibly expected of) the learner, even without a history of many learners’ actions in that activity instance. Note that higher levels along each axis include all lower levels. For example, an activity that involves a student and all peers (social breadth 3) automatically includes levels 1 (student only) and 2 (student +1). In many cases, the specific level can only be determined by analyzing the parameter settings for the activity. (Note that the model included in Moodle 3.4 only supports activities in Moodle Core. Some non-Core activities are included here as examples.)

Analysable

The "analysable" element of Moodle for this model is the Course. This indicates that the model will iterate through the courses on the site and process each one, either to train the model or to make predictions. Predictions are made for each "sample" entity (see below) within the context of the course.

For scalability reasons, all course-level calculations are executed on a per-course basis, and the resulting datasets are merged once the analysis of all site courses is complete.

Samples

"Samples" in the context of machine learning indicate the unit of analysis. In this model, the samples are student enrolments in courses. Predictions will be made for each student enrolment in a course, based on the data observed during the training of the model for all previous student enrolments in courses that have ended.

Valid samples

Valid samples are defined for each model in terms of model training and model predictions. For this model, the criteria are:

For prediction = ongoing courses
For training = finished courses with activity
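
The two criteria above can be expressed as a small classification sketch in Python. The function name `course_role` and its parameters are hypothetical; Moodle's own validity checks are implemented in PHP and apply further conditions (such as the date limitations described earlier) that are omitted here.

```python
from datetime import datetime


def course_role(start, end, now, has_activity):
    """Return "prediction", "training", or None for a course.

    Hypothetical helper illustrating the two criteria listed above.
    """
    if start <= now <= end:
        return "prediction"  # ongoing course
    if end < now and has_activity:
        return "training"  # finished course with activity
    return None  # not yet started, or finished without activity


now = datetime(2018, 6, 1)
print(course_role(datetime(2018, 4, 1), datetime(2018, 7, 1), now, True))   # prediction
print(course_role(datetime(2017, 9, 1), datetime(2017, 12, 15), now, True))  # training
```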

Insights

Insights are the specific predictions generated by a model for each unit defined in the sample (in this case, student enrolments in a course) within the context of that model (in this case, each course). The context is used to define who will receive notifications based on the moodle/analytics:listinsights capability for that context. For this model, this permission is defined for the teacher role by default.

In this model, the insights are binary, i.e. "student at risk of dropping out" or "student not at risk of dropping out."

Actions

Each insight can have one or more actions defined. For this model, the actions are:

Send a message to the student
View the Outline report for the student in this course
View prediction details
Acknowledge the notification
Mark the notification as "not useful"

Recommended Time Splitting Methods

The time-splitting method is selected when the model is enabled. The choice depends on the typical length of courses and the length of the add/drop period (if relevant). If you wish to see predictions within the first two weeks of a 16-week course, you will need to use "tenths." (16 weeks = 112 days, so predictions will be calculated approximately every 11 days.) However, if you wish to see predictions every two weeks in an 8-week course, "quarters" will suffice. Remember, the evaluation process will iterate through all enabled time-splitting methods, so the more time-splitting methods enabled, the slower the evaluation process will be each time it runs (and the longer the model will take to train).
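
The arithmetic behind these recommendations can be checked with a few lines of Python. The function name `days_between_predictions` is hypothetical, and the sketch assumes the course is divided into equal parts.

```python
def days_between_predictions(course_weeks, parts):
    """Approximate number of days between predictions.

    Hypothetical helper: `parts` is 10 for "tenths" and 4 for "quarters".
    """
    return course_weeks * 7 / parts


print(days_between_predictions(16, 10))  # 11.2 (16-week course, tenths)
print(days_between_predictions(8, 4))    # 14.0 (8-week course, quarters)
```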

Documentation