A summer in the life of a DLAM Employ.Ed Intern
My name is Craig. I’m 20 years old, from nearby Dunfermline, and I’m entering the fourth year of an Artificial Intelligence and Computer Science degree here at Edinburgh. After my degree I intend to enter the computing industry in either a data analysis or natural language processing role, although I’m interested in most practical applications of informatics.
I’m currently working on three separate projects supporting Massive Open Online Courses (MOOCs). These are online distance learning courses offered by the university on the condition that they are free and open to all. With ubiquitous use of the internet and the huge international standing of the University of Edinburgh, it is important that the university has a large presence in the online learning space. To maintain the quality of these courses, the university needs to be able to analyse the data they generate and put improvements in place year on year. My internship has supported this activity and will hopefully make the process of MOOC data analysis as simple, efficient and quick as possible for anyone supporting the administration and design of these courses. Course creators will use the data generated from these courses to shape the design of future iterations, helping them to understand more deeply the age, level and location of their online users.
The first project is the automation of the existing procedures for the data analysis of the Massive Open Online Courses offered by Edinburgh University on the Coursera platform. Currently, when a session of a course concludes, information about how users interacted with the course (for example, how many times a user watched a lecture or posted on a discussion forum) is made available to the university. This data is downloaded and passed between various people, each of whom carries out their own step of changing or using it, over a period of days or months, until it is ready to be made into easy-to-interpret graphs and charts on the Edinburgh MOOCs website. Four extracts for research purposes, which contain concise summaries of the data in an easy-to-process format, are also created.
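To give a flavour of the kind of summarising involved, here is a minimal sketch in Python. It is not the code used at Edinburgh: the records, the field names and the `summarise` function are all made up for illustration, and the real Coursera export format is not shown here.

```python
from collections import Counter, defaultdict

# Hypothetical interaction records: (user_id, event_type).
# These fields are assumptions, not the real Coursera export format.
events = [
    ("u1", "lecture_view"),
    ("u1", "lecture_view"),
    ("u1", "forum_post"),
    ("u2", "lecture_view"),
    ("u2", "forum_post"),
    ("u2", "forum_post"),
]


def summarise(events):
    """Count how many times each user performed each type of event."""
    summary = defaultdict(Counter)
    for user_id, event_type in events:
        summary[user_id][event_type] += 1
    # Convert to plain dicts so the result is easy to print or export.
    return {user: dict(counts) for user, counts in summary.items()}


if __name__ == "__main__":
    for user, counts in sorted(summarise(events).items()):
        print(user, counts)
```

A per-user summary like this is the sort of thing that could then feed a chart or a research extract.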
My hope is to create a new process which will make data analysis of MOOCs possible for University staff without technical programming skills, and free up the time of the data analysts who currently complete these tasks manually. I’ve taken each individual step that was previously completed manually and automated it. This brings the timescale of the whole process down from months to minutes. Previously, whoever was tasked with using the data would have to go through a series of very repetitive and time-consuming steps. Now I have built a website which reduces the entire process to filling in a single form on a webpage and clicking two buttons. We hope to share this website with the wider data community to allow other universities to simplify their data analysis processes as well.
My second project is the migration of the storage format for this data from the current Edinburgh-specific structure to the MOOCdb structure. This free and open structure is being developed at MIT and Stanford to define standards for the storage and visualisation of MOOC data. Its use would allow Edinburgh to collaborate further with other universities at the forefront of distance learning – sharing processes and optimisations for the collection, storage and publishing of MOOC data. Currently, I’m in the initial stages of translating the set of procedures outlined in the first project over to the new format, so that we can produce the same data extracts. That way, any projects that rely on our data won’t have to change their processes, and everything will be familiar to end users.
I’m currently liaising with a member of the MOOCdb team based at Edinburgh with a view to having the foundations for a switchover from our current systems to MOOCdb completed before the conclusion of my internship.
My third project is extensive documentation of both the old processes for the data analysis of MOOCs – the manual editing of data outlined in my first project – and the new processes I’ve developed. This is being realised through the creation of two user guides: one for internal use at Edinburgh, which specifically outlines how to use the data generated by our MOOCs with the new website I have developed, and a second containing the old procedure for going from the raw data output by Coursera to informative charts and easy-to-access data for research. Together these should form a “Dummy’s guide to MOOC analytics” and should hopefully help open up this field of data analysis, in an easy-to-follow format, to users without technical or programming skills.
I hope to publish this guide within the next few weeks, with some revisions and updates to reflect the new procedures implemented in the last few weeks of my internship.
Finally, I hope to open up most of the processes used for the analysis of MOOCs, so that other institutions can learn from the MOOCs work being done at Edinburgh and apply our processes to their own data. Until all these processes are implemented, however, Edinburgh’s data results can be viewed at http://moocs.is.ed.ac.uk/.