Workshop module

Moodle 2.0

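A Workshop is a peer assessment activity with many options. Workshop was the very first contributed module, but for a long time it was largely unmaintained, apart from emergency fixes made by various developers to keep it working. In Moodle 1.9 it is hidden by default in Site administration > Modules > Manage activities. Because of the many known issues in those versions, use of the Workshop module in Moodle 1.x was not recommended.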

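The Workshop module has been completely redesigned and recoded for Moodle 2.0, using the new technologies, APIs and features that Moodle 2.0 provides. This page documents Workshop in Moodle 2.0. For details on using Workshop in Moodle 1.x, see the resources listed in the "See also" section.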

Key features

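Workshop is similar to the Assignment module and extends its functionality in many ways. However, it is recommended that both the course facilitator (teacher) and the course participants (students) have at least some experience with the Assignment module before Workshop is used in a course.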

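As in the Assignment module, course participants submit their work during the Workshop activity. Every course participant submits their own work. A submission may consist of text and attachments, so a Workshop submission merges the Online text and Upload file types of the Assignment module. Support for team work (one submission per group of participants) is outside the scope of the Workshop module.
Submissions are assessed using an assessment form defined by the course facilitator (teacher). Workshop supports several types of assessment form. All of them allow multi-criteria assessment, whereas the Assignment module allows only a single grade to be given to a submission.
Workshop supports peer assessment. Course participants may be asked to assess a selected set of their peers' submissions. The module coordinates the collection and distribution of these assessments.
In a single Workshop activity, course participants actually receive two grades: a grade for submission (how good their submitted work was) and a grade for assessment (how well they assessed their peers). The Workshop activity creates two grade items in the course Gradebook, and they can be aggregated as needed (in Moodle 1.x, Workshop automatically summed these grades and sent the total to the Gradebook as a single grade item).
The peer assessment process and the assessment form can be practised in advance on so-called example submissions. These examples are provided by the facilitator (teacher) together with a reference assessment. Workshop participants can assess these examples and compare their assessment with the reference one.
The course facilitator (teacher) can select some submissions and publish them so that other users can view them once the Workshop has ended (in contrast, submissions in the Assignment module are visible only to their author and the teacher).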

Workshop phases

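A typical Workshop is not a short-term activity and may take several days or even weeks to complete. The workflow is divided into five phases. The course facilitator (teacher) switches the activity from one phase to another manually. Automatic scheduled switching is not supported yet. The activity can be switched from any phase to any other phase. The most common scenario is to switch linearly from the first phase to the last one. However, an advanced recursive model can also be applied.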

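The progress of the activity is visualised in a tool called the Workshop planner. It displays all Workshop phases and highlights the current one. It also lists all the tasks the user has in the current phase, with information on whether each task is finished, not yet finished or failed.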

Setup phase

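In this initial phase, Workshop participants cannot do anything (they can modify neither their submissions nor their assessments). Course facilitators (teachers) use this phase to change the Workshop settings and modify the grading strategy. You can switch to this phase at any time when you need to change the Workshop settings and prevent users from modifying their work.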

Submission phase

In the submission phase, Workshop participants submit their work. Access control dates can be set so that even if the Workshop is in this phase, submitting can be allowed in the given time frame only. Submission start date (and time), submission end date (and time) or both can be specified.

Assessment phase

If the Workshop uses the peer assessment feature, this is the phase in which Workshop participants assess the submissions allocated to them for review. As in the submission phase, access can be controlled by specifying the date and time from which and/or until which assessment is allowed.

Grading evaluation phase

The major task during this phase is to calculate the final grades for submissions and for assessments and provide feedback for authors and reviewers. Workshop participants cannot modify their submissions or their assessments in this phase any more. Course facilitators can manually override the calculated grades. Also, selected submissions can be set as published so they become available to all Workshop participants in the next phase.

Closed

Whenever the Workshop is switched into this phase, the final grades calculated in the previous phase are pushed into the course Gradebook, where the Workshop grades then appear. In this phase, participants may view their submissions, the assessments of their submissions and, where available, other published submissions.
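The phase workflow described above can be pictured as a small state model. The following Python sketch is only an illustration: the class, attribute and action names are hypothetical and do not correspond to the actual Moodle code. It assumes that a switch is always manual, that any phase can be reached from any other, and that entering the Closed phase pushes the calculated grades into a stand-in for the Gradebook.

```python
from enum import Enum


class Phase(Enum):
    SETUP = "Setup"
    SUBMISSION = "Submission"
    ASSESSMENT = "Assessment"
    EVALUATION = "Grading evaluation"
    CLOSED = "Closed"


# Modifying actions open to participants in each phase (illustrative only).
PARTICIPANT_ACTIONS = {
    Phase.SETUP: set(),
    Phase.SUBMISSION: {"submit"},
    Phase.ASSESSMENT: {"assess"},
    Phase.EVALUATION: set(),
    Phase.CLOSED: set(),
}


class Workshop:
    """Hypothetical model of a Workshop activity and its current phase."""

    def __init__(self):
        self.phase = Phase.SETUP
        self.calculated_grades = {}  # grades kept inside the module
        self.gradebook = {}          # stand-in for the course Gradebook

    def switch_phase(self, new_phase):
        """Manual switch; any phase can be reached from any other."""
        self.phase = new_phase
        if new_phase is Phase.CLOSED:
            # Closing the Workshop pushes the calculated grades out.
            self.gradebook.update(self.calculated_grades)

    def can(self, action):
        """Return True if participants may perform the action right now."""
        return action in PARTICIPANT_ACTIONS[self.phase]
```

Because `switch_phase` places no restriction on the target phase, the same model covers both the linear scenario and a workflow that returns to an earlier phase.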

Grading strategies

Simply put, the selected grading strategy determines what the assessment form looks like and how the final grade for a submission is calculated from all the assessment forms filled in for that submission. Workshop ships with four standard grading strategies. More strategies can be developed as pluggable extensions.

Accumulative grading strategy

In this case, the assessment form consists of a set of criteria. Each criterion is graded separately using either a number grade (e.g. out of 100) or a scale (either one of the site-wide scales or a scale defined in the course). Each criterion can have its own weight. Reviewers can add comments to all assessed criteria.

When calculating the total grade for the submission, the grades for the individual criteria are first normalized to a range from 0% to 100%. The total grade given by an assessment is then calculated as the weighted mean of the normalized grades. Scales are treated as grades from 0 to M-1, where M is the number of scale items.

$$G_{s}=\frac{\sum_{i=1}^{N}\frac{g_{i}}{max_{i}}w_{i}}{\sum_{i=1}^{N}w_{i}}$$

where $g_{i}\in\mathbb{N}$ is the grade given to the i-th criterion, $max_{i}\in\mathbb{N}$ is the maximal possible grade of the i-th criterion, $w_{i}\in\mathbb{N}$ is the weight of the i-th criterion and $N\in\mathbb{N}$ is the number of criteria in the assessment form.

It is important to realize that the influence of a particular criterion is determined by its weight only, not by the grade type or range used. Suppose the form has three criteria, the first using a 0-100 grade, the second a 0-20 grade and the third a three-item scale. If they all have the same weight, then giving grade 50 for the first criterion has the same impact as giving grade 10 for the second criterion.
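The weighted mean above can be checked with a few lines of Python. This is a minimal sketch of the formula only, not the Workshop implementation. The function names and the tuple layout `(grade, max_grade, weight)` are chosen here for illustration.

```python
def accumulative_grade(criteria):
    """Weighted mean of normalized grades.

    criteria: iterable of (grade, max_grade, weight) tuples,
    one tuple per criterion of the assessment form.
    Returns a value between 0.0 and 1.0.
    """
    criteria = list(criteria)
    total_weight = sum(weight for _, _, weight in criteria)
    if total_weight == 0:
        raise ValueError("at least one criterion needs a non-zero weight")
    weighted_sum = sum(grade / max_grade * weight
                       for grade, max_grade, weight in criteria)
    return weighted_sum / total_weight


def scale_as_grade(item_index, item_count):
    """Treat a scale with M items as a grade from 0 to M-1.

    item_index is the zero-based position of the chosen scale item.
    Returns (grade, max_grade).
    """
    return item_index, item_count - 1


# The three equally weighted criteria from the text: 50 out of 100,
# 10 out of 20, and the middle item of a three-item scale.
form = [
    (50, 100, 1),
    (10, 20, 1),
    (*scale_as_grade(1, 3), 1),
]
print(accumulative_grade(form))  # 0.5
```

Each of the three responses normalizes to 50%, which is why the criteria contribute equally despite their different ranges.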

Comments

The assessment form is similar to the one used in the accumulative grading strategy, but no grades can be given, just comments. The total grade for the assessed submission is always set to 100%. This strategy can be effective in repetitive workflows in which reviewers first only comment on the submissions to give the authors initial feedback. The Workshop is then switched back to the submission phase so that the authors can improve their work according to the comments. Finally, the grading strategy can be changed to one that uses proper grading, and the submissions are assessed again using a different assessment form.

Number of errors

In Moodle 1.x, this was called the Error banded strategy. The assessment form consists of several assertions, each of which the reviewer can mark as passed or failed. Various words can be set to express the pass or failure state, e.g. Yes/No, Present/Missing, Good/Poor, etc.

The grade given by a particular assessment is calculated from the weighted count of negative assessment responses (failed assertions). Here, the weighted count means that a response with weight $w_{i}$ is counted $w_{i}$ times. Course facilitators define a mapping table that converts the number of failed assertions to a percent grade for the given submission. Zero failed assertions are always mapped to a 100% grade.

This strategy may be used to make sure that certain criteria were addressed in the submission. Examples of such assessment assertions are: Has fewer than 3 spelling errors, Has no formatting issues, Has creative ideas, Meets length requirements, etc. This assessment method is considered easier for reviewers to understand and deal with. It is therefore suitable even for younger participants or those just starting with peer assessment, while still producing quite objective results.
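The weighted count and the mapping table can be sketched as follows. The function name, the data layout and the mapping values are hypothetical. How the real module treats a count that is missing from the mapping table is not described here, so this sketch simply raises an error in that case.

```python
def number_of_errors_grade(responses, mapping):
    """Grade (in percent) for one assessment.

    responses: list of (passed, weight) pairs, one per assertion.
    mapping:   dict {weighted count of failed assertions: percent grade},
               as defined by the course facilitator.
    """
    # A failed response with weight w is counted w times.
    failed = sum(weight for passed, weight in responses if not passed)
    if failed == 0:
        return 100.0  # zero failed assertions always map to 100%
    if failed not in mapping:
        # Assumption of this sketch: the table must cover every count.
        raise KeyError(f"no grade mapped for {failed} failed assertions")
    return float(mapping[failed])


# Hypothetical mapping table for a form with three assertions.
mapping = {1: 75, 2: 25, 3: 0}

# One failed assertion of weight 1.
print(number_of_errors_grade([(True, 1), (True, 1), (False, 1)], mapping))  # 75.0

# A single failed assertion of weight 2 counts as two failures.
print(number_of_errors_grade([(False, 2), (True, 1)], mapping))  # 25.0
```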

Rubric

See the description of this scoring tool at Wikipedia. The rubric assessment form consists of a set of criteria. For each criterion, several ordered descriptive levels are provided. A number grade is assigned to each of these levels. The reviewer chooses the level that best describes the submission with respect to the given criterion.

The final grade is aggregated as

$$G_{s}=\frac{\sum_{i=1}^{N}g_{i}}{\sum_{i=1}^{N}max_{i}}$$

where $g_{i}\in\mathbb{N}$ is the grade given to the i-th criterion, $max_{i}\in\mathbb{N}$ is the maximal possible grade of the i-th criterion and $N\in\mathbb{N}$ is the number of criteria in the rubric.

An example of a single criterion is: Overall quality of the paper, with the levels 5 - An excellent paper, 3 - A mediocre paper, 0 - A weak paper (the number represents the grade).
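A short Python sketch of the rubric aggregation follows. It is an illustration of the formula, not the module's code. The function name and the dictionary layout are assumptions, and the second criterion ("Clarity of argument") is invented for the example.

```python
def rubric_grade(rubric, chosen):
    """Sum of chosen level grades divided by the sum of maximal grades.

    rubric: dict {criterion name: list of level grades}
    chosen: dict {criterion name: grade of the level the reviewer picked}
    Returns a value between 0.0 and 1.0.
    """
    for name, grade in chosen.items():
        if grade not in rubric[name]:
            raise ValueError(f"{grade} is not a level of {name!r}")
    total = sum(chosen[name] for name in rubric)
    maximum = sum(max(levels) for levels in rubric.values())
    return total / maximum


rubric = {
    "Overall quality of the paper": [5, 3, 0],  # levels from the text
    "Clarity of argument": [4, 2, 0],           # hypothetical criterion
}
chosen = {
    "Overall quality of the paper": 3,  # A mediocre paper
    "Clarity of argument": 4,
}
print(rubric_grade(rubric, chosen))  # (3 + 4) / (5 + 4), about 0.78
```

Unlike the accumulative strategy, the criteria carry no explicit weights here: a criterion with a larger maximal grade automatically has more influence on the result.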

The assessment form can be rendered in two modes: either as the common grid or as a list. It is safe to switch the representation of the rubric at any time, and it is better to actually try it than to read a description here :-)

Note on backwards compatibility: This strategy merges the legacy Rubric and Criterion strategies from Moodle 1.x into a single one. Conceptually, the legacy Criterion was just one dimension of Rubric. In Workshop 1.x, Rubric could have several criteria (categories) but was limited to a fixed scale of 0-4 points. On the other hand, the Criterion strategy in Workshop 1.9 could use a custom scale but was limited to a single aspect of assessment. The new Rubric strategy combines the two. To mimic the legacy behaviour, old Workshops are automatically upgraded so that:

Criterion strategy from 1.9 is replaced with Rubric 2.0 using just one dimension
Rubric from 1.9 is replaced with Rubric 2.0 using a 0-4 point scale for every criterion.

In Moodle 1.9, a reviewer could suggest an optional adjustment to the final grade. This is no longer supported. It may be supported again in a future version as a standard feature of all grading strategies, not only rubric.

Calculation of final grades

The final grades for a Workshop activity are obtained gradually, in several stages. The following scheme illustrates the process and also shows in which database tables the grade values are stored.

Figure (workshop grades calculation.png): The scheme of grades calculation in Workshop

As you can see, every participant gets two numerical grades in the course Gradebook. During the Grading evaluation phase, the course facilitator can let the Workshop module calculate these final grades. Note that they are stored only in the Workshop module until the activity is switched to the final (Closed) phase. It is therefore quite safe to play with the grades until you are happy with them, and then close the Workshop and push the grades into the Gradebook. You can even switch the phase back, recalculate or override the grades and close the Workshop again so that the grades are updated in the Gradebook (note that you can override the grades in the Gradebook, too).

During the grading evaluation, Workshop grades report provides you with a comprehensive overview of all individual grades. The report uses various symbols and syntax:

Value	Meaning
- (-) < Alice	There is an assessment allocated to be done by Alice, but it has been neither assessed nor evaluated yet.
68 (-) < Alice	Alice assessed the submission, giving the grade for submission 68. The grade for assessment (grading grade) has not been evaluated yet.
23 (-) > Bob	Bob's submission was assessed by a peer, receiving the grade for submission 23. The grade for this assessment has not been evaluated yet.
76 (12) < Cindy	Cindy assessed the submission, giving the grade 76. The grade for this assessment has been evaluated as 12.
67 (8) @ 4 < David	David assessed the submission, giving the grade for submission 67 and receiving the grade 8 for this assessment. His assessment has weight 4.
80 (20 / 17) > Eve	Eve's submission was assessed by a peer. Eve's submission received 80 and the grade for this assessment was calculated as 20. The teacher has overridden the grading grade to 17, probably with an explanation for the reviewer.
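To make the notation concrete, the following Python sketch splits one report entry into its parts. It is not part of Moodle: the pattern is inferred only from the examples in the table, and the names `ReportEntry` and `parse_report_entry` are hypothetical. It assumes that grades are shown as whole numbers and that a missing "@ weight" part means the default weight 1.

```python
import re
from typing import NamedTuple, Optional


class ReportEntry(NamedTuple):
    submission_grade: Optional[int]  # None while not assessed ("-")
    assessment_grade: Optional[int]  # None while not evaluated ("-")
    overridden_grade: Optional[int]  # value after "/", if overridden
    weight: int                      # value after "@", default 1
    direction: str                   # "<" given by, ">" received by
    name: str


PATTERN = re.compile(
    r"^(?P<sub>-|\d+)\s*"
    r"\((?P<ass>-|\d+)(?:\s*/\s*(?P<over>\d+))?\)\s*"
    r"(?:@\s*(?P<weight>\d+)\s*)?"
    r"(?P<dir>[<>])\s*(?P<name>.+)$"
)


def parse_report_entry(text):
    """Parse an entry such as '67 (8) @ 4 < David'."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised report entry: {text!r}")

    def number(value):
        return None if value in (None, "-") else int(value)

    return ReportEntry(
        submission_grade=number(match["sub"]),
        assessment_grade=number(match["ass"]),
        overridden_grade=number(match["over"]),
        weight=int(match["weight"] or 1),
        direction=match["dir"],
        name=match["name"].strip(),
    )


print(parse_report_entry("80 (20 / 17) > Eve"))
```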

Grade for submission

The final grade for every submission is calculated as the weighted mean of the individual assessment grades given by all reviewers of the submission. The value is rounded to the number of decimal places set in the Workshop settings form.

The course facilitator can influence the grade for a given submission in two ways:

by providing their own assessment, possibly with a higher weight than usual peer reviewers have
by overriding the grade to a fixed value
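The calculation and the two ways of influencing it can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch ignores how the module stores the values.

```python
def grade_for_submission(assessments, decimals=0, override=None):
    """Weighted mean of the assessment grades for one submission.

    assessments: list of (grade, weight) pairs, one per reviewer.
    decimals:    number of decimal places to round to.
    override:    fixed grade set by the facilitator, if any.
    """
    if override is not None:
        return override  # an overriding grade replaces the calculation
    total_weight = sum(weight for _, weight in assessments)
    if total_weight == 0:
        return None  # nothing to aggregate yet
    mean = sum(grade * weight for grade, weight in assessments) / total_weight
    return round(mean, decimals)


peers = [(68, 1), (76, 1)]
print(grade_for_submission(peers))  # 72.0

# The teacher adds an assessment with a higher weight than the peers have.
print(grade_for_submission(peers + [(80, 2)]))  # 76.0

# Or the teacher overrides the grade with a fixed value.
print(grade_for_submission(peers, override=90))  # 90
```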

Grade for assessment

The grade for assessment tries to estimate the quality of the assessments that the participant gave to their peers. This grade (also known as the grading grade) is calculated by the "artificial intelligence" hidden within the Workshop module, which tries to do what is typically the teacher's job.

During the grading evaluation phase, you use a Workshop subplugin to calculate the grades for assessment. At the moment, only one subplugin is available, called Comparison with the best assessment. The following text describes the method used by this subplugin. Note that more grading evaluation subplugins can be developed as Workshop extensions.

Grades for assessment are displayed in parentheses () in the Workshop grades report. The final grade for assessment is calculated as the average of the individual grading grades.

No single formula describes the calculation, but the process is deterministic. Workshop picks one of the assessments as the best one, namely the one closest to the mean of all assessments, and gives it a 100% grade. Then it measures the 'distance' of every other assessment from this best one and gives each a grade that is lower the more it differs from the best (given that the best one represents a consensus of the majority of assessors). The parameter of the calculation is how strict we should be, that is, how quickly the grades fall if they differ from the best one.

If there are just two assessments per submission, Workshop cannot decide which of them is 'correct'. Imagine you have two reviewers, Alice and Bob. They both assess Cindy's submission. Alice says it is rubbish and Bob says it is excellent. There is no way to decide who is right. So Workshop simply says: OK, you are both right and I will give you both a 100% grade for this assessment. To prevent this, you have two options:

Either you provide an additional assessment so that the number of assessors (reviewers) is odd and Workshop is able to pick the best one. Typically, the teacher steps in and provides their own assessment of the submission to judge it.
Or you may decide that you trust one of the reviewers more. For example, you know that Alice is much better at assessing than Bob is. In that case, you can increase the weight of Alice's assessment, say to "2" (instead of the default "1"). For the purposes of the calculation, Alice's assessment will be treated as if there were two reviewers with exactly the same opinion, and it is therefore likely to be picked as the best one.

Backward compatibility note: In Workshop 1.x, this case of exactly two assessors with the same weight is not handled properly and leads to wrong results, as only one of them is lucky enough to get 100% and the other gets a lower grade.

It is very important to know that the grading evaluation subplugin Comparison with the best assessment does not compare the final grades. Regardless of the grading strategy used, every filled-in assessment form can be seen as an n-dimensional vector of normalized values. The subplugin therefore compares the responses to all assessment form dimensions (criteria, assertions, ...). Then it calculates the distance between two assessments, using the variance statistics.

To demonstrate this with an example, let us say you use the grading strategy Number of errors to peer-assess research essays. This strategy uses a simple list of assertions, and the reviewer (assessor) just checks whether each assertion is passed or failed. Let us say you define the assessment form using three criteria:

Does the author state the goal of the research clearly? (yes/no)
Is the research methodology described? (yes/no)
Are references properly cited? (yes/no)

Let us say the author gets 100% grade if all criteria are passed (that is answered "yes" by the assessor), 75% if only two criteria are passed, 25% if only one criterion is passed and 0% if the reviewer gives 'no' for all three statements.

Now imagine the work by Daniel is assessed by three colleagues - Alice, Bob and Cindy. They all give individual responses to the criteria in order:

Alice: yes / yes / no
Bob: yes / yes / no
Cindy: no / yes / yes

As you can see, they all gave a 75% grade to the submission. But Alice and Bob also agree in their individual responses, while the responses in Cindy's assessment are different. The evaluation method Comparison with the best assessment tries to imagine what a hypothetical, absolutely fair assessment would look like. In the Development:Workshop 2.0 specification, David refers to it as "how would Zeus assess this submission?" and we estimate it would be something like this (we have no other way):

Zeus 66% yes / 100% yes / 33% yes

Then we try to find the assessments that are closest to this theoretically objective assessment. We find that Alice's and Bob's are the best ones and give them a 100% grade for assessment. Then we calculate how far Cindy's assessment is from the best one. As you can see, Cindy's responses match the best ones in only one criterion of the three, so Cindy's grade for assessment will not be very high.

The same logic applies analogously to all other grading strategies. The conclusion is that the grade given by the best assessor need not be the one closest to the average, because the assessments are compared at the level of individual responses, not of the final grades.
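The selection of the best assessment in the example above can be reproduced with a short Python sketch. This is a deliberate simplification and not the subplugin's actual code: it uses a plain squared distance between response vectors instead of the variance-based statistics mentioned earlier, and it does not convert distances into grades, because that conversion depends on the strictness setting. All function names are hypothetical. Responses are encoded as 1 (yes) and 0 (no).

```python
def reference_assessment(assessments, weights=None):
    """Weighted per-dimension mean ("how would Zeus assess this?").

    assessments: dict {reviewer: list of normalized responses}
    weights:     optional dict {reviewer: weight}, default weight 1
    """
    weights = weights or {}
    total = sum(weights.get(name, 1) for name in assessments)
    dimensions = len(next(iter(assessments.values())))
    return [
        sum(vec[d] * weights.get(name, 1) for name, vec in assessments.items()) / total
        for d in range(dimensions)
    ]


def squared_distance(a, b):
    """Simplified distance between two response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_assessors(assessments, weights=None):
    """Reviewers whose responses are closest to the reference assessment."""
    zeus = reference_assessment(assessments, weights)
    distances = {name: squared_distance(vec, zeus)
                 for name, vec in assessments.items()}
    smallest = min(distances.values())
    return sorted(name for name, d in distances.items() if d - smallest < 1e-9)


essay = {"Alice": [1, 1, 0], "Bob": [1, 1, 0], "Cindy": [0, 1, 1]}
print(reference_assessment(essay))  # roughly [0.67, 1.0, 0.33]
print(best_assessors(essay))        # ['Alice', 'Bob']

# Two reviewers who disagree completely: neither is closer to the mean.
pair = {"Alice": [0.0], "Bob": [1.0]}
print(best_assessors(pair))                # ['Alice', 'Bob']
print(best_assessors(pair, {"Alice": 2}))  # ['Alice']
```

The last two calls mirror the two-reviewer case: with equal weights both assessments tie, and raising Alice's weight to 2 moves the reference towards her responses so that hers is picked as the best one.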

More explanations

Thread at moodle.org where David explains particular Workshop results
Presentation by Mark Drechsler

Research papers dealing with Workshop module

Peer assessments using the moodle workshop tool by John F. Dooley
Easy-to-use Workshop Module by Álvaro Figueira and Elisabete Cunha

For developers

Please see Development:Workshop for more information on the module infrastructure and on how to extend the provided functionality by developing your own Workshop subplugins.
