Data Protection Impact Assessment
Current version: 0.1 (June 2019)
According to the UK's Information Commissioner's Office (ICO), a Data Protection Impact Assessment (DPIA) is "a process to help you identify and minimise the data protection risks of a project". Under the terms of the EU's General Data Protection Regulation (GDPR), it is mandatory to create a DPIA "for processing that is likely to result in a high risk to individuals". However, it is also "good practice to do a DPIA for any other major project which requires the processing of personal data".
As a result, this DPIA for MoodleNet aims to:
- describe the nature, scope, context and purposes of the processing
- assess necessity, proportionality and compliance measures
- identify and assess risks to individuals
- identify any additional measures to mitigate those risks
Identify the need for a DPIA
MoodleNet is a social network for educators. They can engage in discussions and share resources with one another. What makes MoodleNet different is that it is a federated network, so individuals and organisations can install and set up their own ‘instance’.
We believe that a DPIA is necessary because we are processing personal data at scale, including processing sensitive personal data. Moodle is committed to privacy-enhancing technology, which gives its users choices about their personal data and processes that data only insofar as is necessary and proportionate for the services offered.
Further details may be found at: https://moodle.com/moodlenet
Describe the processing
MoodleNet is first and foremost a software application which allows for the creation of a federated network, as shown in the diagram below. An organisation can install a MoodleNet instance to allow its users to share resources and ideas with one another, but also potentially with the users of any other MoodleNet instance. A MoodleNet instance consists of a ‘database’, a ‘backend’ and a ‘frontend’, which are typically deployed via Docker containers on servers hosted either on-site or in a data centre.
The main uses of MoodleNet are:
- Joining communities of users
- Curating educational resources into ‘collections’
- Engaging in discussions
Each instance of MoodleNet can be run independently, and can (optionally) connect to a service run by Moodle HQ (which we refer to as the ‘Mothership’). This service will be separate from any regular MoodleNet instances run by Moodle HQ to which end users may sign up.
The administrator of a MoodleNet instance may request a Mothership API key, subject to agreeing to Terms of Service. This does two things:
- Allows the HQ Mothership to receive public metadata from the instance (relating to communities, collections, resources, and user profiles) and store that in a search index
- Gives users of the instance the ability to search across the federated network, meaning they can discover communities, collections, users, and content from other instances
The HQ Mothership stores a copy of the metadata from the federated instances which are connected via an API key. Search results provided to connected instances link directly to content on the originating federated instance.
Although a lot of the richness of search will come through tagging, we will also index public fields from user profiles. This is to surface the most relevant information to users.
The HQ Mothership indexes the public data provided by connected instances, making it quickly and easily searchable (using Algolia, a third party search engine service). This data includes information such as username, profile images, location, comments, and communities joined.
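The indexing step described above can be illustrated as an allowlist filter applied before anything leaves an instance. This is a minimal sketch, not MoodleNet's actual code: the field names, the `hidden` set used to represent what a user has chosen to hide, and the `public_metadata` function are all hypothetical.

```python
from typing import Any

# Hypothetical allowlist based on the fields named above; the real schema may differ.
INDEXABLE_FIELDS = ("username", "profile_image", "location", "communities")


def public_metadata(profile: dict[str, Any], hidden: set[str]) -> dict[str, Any]:
    """Return only the fields that are both allowlisted and not hidden by the user."""
    return {
        name: profile[name]
        for name in INDEXABLE_FIELDS
        if name in profile and name not in hidden
    }


profile = {
    "username": "alice",
    "email": "alice@example.org",  # not on the allowlist, so never sent for indexing
    "location": "Bilbao",
    "communities": ["open-education"],
}
record = public_metadata(profile, hidden={"location"})
```

An allowlist fails safe: a field added to the profile later is not indexed until someone deliberately lists it.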
Even instances that are not connected to the HQ Mothership are nevertheless connected to one another thanks to the ActivityPub protocol. In fact, whether connected to the HQ Mothership or not, because of the way ActivityPub works, they are potentially connected to any other ActivityPub-enabled server (including servers running software other than MoodleNet).
MoodleNet will carefully differentiate between structured personal data fields and data that the individual contributes as part of their participation within communities. For example, users will be able to delete their account at any time. However, deleting their account will not delete any contributions they may have made to collections of resources.
To explain further, there are two main types of contributions to MoodleNet: adding resources, and adding comments. While comments are text-based and always attributed to a user, adding resources can happen in one of two ways:
- URL - the user points to a resource which is available on the open web. The metadata from the site hosting the resource is automatically pulled in by MoodleNet.
- Upload - the user uploads a resource and adds it to a collection using an open license. This is a similar process to uploading resources to Wikimedia Commons, the process for which will inform our work and which is detailed here: https://commons.wikimedia.org/wiki/Commons:Project_scope/Summary
The highest risk for users could arise if they are unaware that comments they add in discussion areas on one instance of MoodleNet are potentially viewable and storable by any application or server running the ActivityPub protocol (e.g. Mastodon, Pleroma, Osada). We will take steps to make sure this is clear to users, both through the user agreement and through the visual and textual elements in the User Interface of the application.
Our proposed moderation flow for content or metadata on MoodleNet is outlined in the diagram below:
If a user ‘flags’ a resource, collection, comment, or profile, this is reviewed by the moderators of the relevant community. They may choose to take action, which could include removing content and/or warning a user. Ultimately, moderators may choose to ban a user from a community.
Should a community not be moderated effectively, users have the option of flagging that to the instance administrator. They can review the complaint and choose to warn the moderators, warn or ban the offending users from the instance, directly remove content, or close the community.
If an entire instance is problematic, users of other instances may complain (preferably through their instance admin) to Moodle HQ, which runs the ‘HQ Mothership’. Moodle HQ will review the complaint and, if appropriate, issue a warning to the admin of the problematic instance. Should the problem persist, or if the initial complaint was serious enough, Moodle HQ would revoke the instance’s API key and delete its copies of data from that instance from the HQ Mothership (including any search indexes stored via third-party services such as Algolia).
While this would not fully shut down the instance, the instance would no longer be included in the search results on other MoodleNet instances. Moodle HQ does not process any of the data from federated instances without an API key, including those instances whose API key has been revoked.
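The revocation behaviour described above can be sketched as follows. This is an illustrative model only: the `Mothership` class, its method names, and the record layout are hypothetical, and purging the third-party search index is not shown.

```python
import hmac


class Mothership:
    """Toy stand-in for the HQ Mothership: issued API keys plus a search index."""

    def __init__(self) -> None:
        self.api_keys: dict[str, str] = {}  # instance URL -> issued key
        self.index: list[dict] = []         # indexed public metadata records

    def issue_key(self, instance_url: str, key: str) -> None:
        self.api_keys[instance_url] = key

    def ingest(self, instance_url: str, key: str, record: dict) -> bool:
        """Index a record only if the instance presents its current key."""
        expected = self.api_keys.get(instance_url)
        if expected is None or not hmac.compare_digest(expected, key):
            return False  # no key, or a revoked key: the data is not processed
        self.index.append({"instance": instance_url, **record})
        return True

    def revoke(self, instance_url: str) -> None:
        """Withdraw the key and purge everything indexed from that instance."""
        self.api_keys.pop(instance_url, None)
        self.index = [r for r in self.index if r["instance"] != instance_url]


hq = Mothership()
hq.issue_key("https://a.example", "key-a")
hq.issue_key("https://b.example", "key-b")
hq.ingest("https://a.example", "key-a", {"community": "Maths teachers"})
hq.ingest("https://b.example", "key-b", {"community": "Open science"})
hq.revoke("https://a.example")
accepted_after_revocation = hq.ingest(
    "https://a.example", "key-a", {"community": "Maths teachers"}
)
```

Revocation both removes the key and purges the stored copies, so the revoked instance drops out of search and cannot re-enter it with the old key.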
For users of Moodle HQ's instance(s) of MoodleNet, we will be collecting personal data, including: name, location, resources uploaded, comments, browser version, and IP address. This does not include criminal offence data, but may include special categories such as political beliefs and accessibility requirements. We could also infer ethnicity through avatars, including photographs, that users choose to represent themselves. This would be as a by-product of using the system, through optional rather than mandatory activity (e.g. tagging, photo-upload, discussion replies).
We will check the age and jurisdiction of a user attempting to create an account, and deny the ability to create an account to those under the age of majority in their jurisdiction. In our user agreement we will specify that users must collect valid consent from individuals whose data is being processed.
Users will be able to register from anywhere in the world, on one or more of many federated MoodleNet instances operated by third parties. These may be hosted in any jurisdiction, including the European Union. As a result, MoodleNet instances are subject to the GDPR. A user’s location may be inferred from their IP address, but they are free to choose whether or not to enter a city-based location in their profile.
One thing we need to be careful of is the possibility of identifying location or ethnicity based on the use of minority languages. For example, the community have already localised the testing version of MoodleNet into the Basque language, which is spoken by relatively few people, and mainly in a particular geographical region. Likewise, we may identify users with special needs through accessibility settings and recognising that they are using assistive technologies.
While users will have the ability to hide information about themselves on their profile so that only they can see it, we are aware that this only hides information from other users, not from the operator of the instance. As a result, we will add a statement in the user agreement making users aware of the ways in which they might, implicitly and explicitly, reveal sensitive information such as their preferences, any disabilities, and ethnicity or location data. To make this clear we will use straightforward language and diagrams.
We will be continuously collecting data made public by users for as long as they have an account on a federated instance. We will initially be collecting data on hundreds of users, but this is likely to increase to at least tens of thousands of users.
All users of Moodle HQ's instance(s) of MoodleNet will be above the age of digital consent according to their jurisdiction. As part of our user agreement, we are using a Code of Conduct to govern Moodle HQ's relationship with users, interactions amongst users, and users' contributions to MoodleNet, as detailed at https://www.contributor-covenant.org. We are not targeting vulnerable groups, we are using data in a way that users would expect, and we are not aware of any issues of public concern in this area. Users are registering on a voluntary basis.
Our relationship with users on instances that Moodle HQ hosts is:
- Data Controller
- Data Processor
On instances not hosted by Moodle HQ, we provide the HQ Mothership API service, which indexes data and provides search and discovery across federated instances. In this case, we index public content from other instances to allow this to happen. Our relationship with instance administrators is therefore:
- Joint Data Controller (in respect of certain elements of personal data made public by users which Moodle HQ makes searchable across the federated instances connected to the Mothership)
- Data Processor
Instance administrators act as the Data Controller, and can ask Moodle HQ to delete personal data relating to a data subject from the HQ Mothership.
As mentioned above and in reference to the diagram, contributions of content that do not include personal data will remain in MoodleNet after a user deletes their account. This is consistent with the open licenses we will use for uploaded resources and our user agreement.
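As a rough illustration of this separation, the sketch below deletes a user's profile while keeping the resources they contributed. The `Resource` and `Instance` types are hypothetical, and clearing the `added_by` field on deletion is an assumption made for the example. The source does not specify how attribution is handled after an account is removed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    title: str
    url: str
    added_by: Optional[str]  # username, or None once the account is gone


@dataclass
class Instance:
    profiles: dict[str, dict] = field(default_factory=dict)
    resources: list[Resource] = field(default_factory=list)

    def delete_account(self, username: str) -> None:
        # Structured personal data (the profile) is removed outright.
        self.profiles.pop(username, None)
        # Contributions stay in their collections; the link back to the
        # person is cut (an assumption for this sketch).
        for resource in self.resources:
            if resource.added_by == username:
                resource.added_by = None


instance = Instance(
    profiles={"alice": {"location": "Bilbao"}},
    resources=[Resource("Fractions worksheet", "https://example.org/fractions", "alice")],
)
instance.delete_account("alice")
```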
We are processing personal data for the purposes of identification on a social network. We envisage that this will lead to increased trust and sharing of resources and ideas amongst the educators using federated MoodleNet instances. Users will be able to identify one another, talk about shared interests and goals, and both link to and upload resources that will help their communities.
Moodle HQ gains from this by building something that fits with our values, but also provides a vehicle to complement and point towards our own paid offerings (e.g. MoodleCloud), as well as to our Partners, from whom we receive a share of their revenue.
We are attempting to collect the minimum amount of data to create a federated network of MoodleNet instances. To enable search across the instances connected to the HQ Mothership, we must act as Joint Data Controllers with administrators of those federated instances for the data elements over which we exercise independent control. In addition, we are Data Processors for the administrators of any federated instances, as we are processing data from those federated instances on their behalf, as part of the service offered to their users by them as Data Controllers.
Assess necessity and proportionality
We are gaining the consent of users to process their personal data when they register on one of the instances of MoodleNet which are connected with the HQ Mothership. They will agree to a privacy notice, we will capture the minimum amount of personal data required, and users will be able to access, rectify, export, and delete this data.
As a federated social network, our aim is to process only the data that users have made publicly available, and which is necessary to meet the aims of MoodleNet. Namely, we want users to be able to identify one another and indicate affinity based on factors such as geographical location and interests. Other than a controlled taxonomy for parts of the tagging system, we are allowing users to add free text into these fields, thus allowing them to be pseudo-anonymous.
If we were building a centralised system, we would potentially have access to all user data, including that shared (as far as the user is concerned) ‘privately’. By creating a federated network, users can choose to join an instance they trust, or indeed run their own.
Built into MoodleNet will be the ability to request that data be deleted from other instances after being deleted from the user’s ‘home’ instance. By default, MoodleNet servers will comply with this request, but we cannot control whether third-party servers comply with the request. We will work with the main ActivityPub-compatible social networking apps to encourage them to comply with these requests by default.
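ActivityPub defines a `Delete` activity for this kind of request: a receiving server should remove its copy of the object and may replace it with a `Tombstone`. Assuming MoodleNet uses that activity for the requests described above, the message might look roughly like the sketch below. The `build_delete` helper and the example.org-style URLs are hypothetical; delivery to remote inboxes and request signing are not shown.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_delete(actor_id: str, object_id: str, recipients: list[str]) -> dict:
    """Build an ActivityStreams Delete activity for an object the actor owns."""
    return {
        "@context": AS_CONTEXT,
        "type": "Delete",
        "actor": actor_id,
        "object": object_id,
        "to": recipients,
    }


activity = build_delete(
    actor_id="https://home.example/users/alice",      # hypothetical actor URL
    object_id="https://home.example/comments/42",     # hypothetical object URL
    recipients=[AS_PUBLIC],
)
payload = json.dumps(activity, indent=2)  # body to send to each remote inbox
```

Nothing in the message can force a remote server to act on it, so compliance depends on the receiving software.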
We have tested MoodleNet with around 250 volunteers, both to ascertain the value proposition and to ensure that they are happy with the product. We collected the minimum amount of personal data in order to facilitate the successful use of the service.
In order to ensure that MoodleNet is as secure and privacy-respecting as possible, we are implementing industry best practices. This will include starting a ‘bug bounty’ program specifically focused on security and privacy. To enable this we will use a crowdsourcing platform that gives us access to security and privacy researchers and ethical hackers. The MoodleNet team works closely with Moodle’s Privacy Officer and DPO to ensure that the procedures we are following start from a ‘privacy by design’ point of view. As we implement MoodleNet, the team will continue to meet regularly with the Privacy Officer and DPO to inform upcoming changes and decisions.
In addition, we plan to share this DPIA with the Moodle community for their feedback, as we intend to be as transparent as possible with this project. Anyone may give feedback via the following methods: