Identifying movement in lecture recordings
I’m Tallulah, and this summer I am working as the Lecture Recording Data intern. My job involves working on ways to obtain and analyse data from lecture recordings.
Recording lectures has many benefits: it improves accessibility by providing lectures in an alternative format, gives students a way to look back on previous lectures as a form of revision, prevents students who are unable to attend due to illness or timetable clashes from missing out on learning, and can be helpful for non-native English speakers. Unfortunately, while the university records many lectures, quite a lot of these have issues, such as microphones not being turned on or the slides not being displayed.
I have been working with the Digital Learning Applications and Media (DLAM) team, as well as Learning Spaces Technology (LST) to develop ways these problems can be identified, hopefully improving the lecture recording service for students!
The university uses Echo360 to record lectures, and I had hoped its reporting API would surface data to identify recording problems. However, after extensive API calls and a thorough review of the documentation, I found it doesn’t provide the metrics, such as audio volume levels or number of displays that we need to spot issues.
To work around this, I initially built a tool that scraped live data from the lecture recording XMLs. Although this worked for obtaining data, constantly scraping recordings and deploying this over hundreds of lecture theatres would use a large amount of resources (a rough sketch of that polling approach is below). Luckily, Euan from LST developed a Power Automate flow: if a lecture recording has had no audio for 10 minutes, Echo360 sends Euan an alert, the alert is added to the Learning Spaces Datastore (LSD) audio alerts, and for the next 20 minutes the recording's audio data and snapshots are saved every minute. This solved the problem of identifying audio issues during lectures, but camera and display issues still needed to be addressed.
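For the curious, the scraping prototype boiled down to polling a status XML in a loop, something along these lines. The URL structure and element name here are placeholders I have made up for illustration, not the real Echo360 format:

```python
# Illustrative only: the status URL and element names are placeholders,
# not the real Echo360 XML structure.
import time
import xml.etree.ElementTree as ET

import requests


def poll_audio_level(status_url: str, interval_seconds: int = 60) -> None:
    """Repeatedly fetch a recording's status XML and print its audio level."""
    while True:
        response = requests.get(status_url, timeout=10)
        root = ET.fromstring(response.content)
        level = root.findtext("AudioLevel")  # placeholder element name
        print(f"Audio level: {level}")
        time.sleep(interval_seconds)
```

Running a loop like this for every room, around the clock, adds up quickly, which is why the alert-driven flow is a much better fit.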
At first, I looked into some AI tools, such as the Google Cloud Video Intelligence API, which can detect movement in videos. However, using external AI tools can raise ethical issues, such as intellectual property infringement or unwanted data collection, and requires paperwork that can take months to get approved. Since snapshots from lecture recordings can be accessed via the LSD API, I investigated ways to identify differences between the 20 snapshots from a recording without using AI.
A very helpful resource I used was this GeeksforGeeks article, “Algorithms for Image Comparison” [https://www.geeksforgeeks.org/computer-vision/algorithms-for-image-comparison/], which describes multiple ways to identify differences between snapshots using only pixel values, and no AI whatsoever! After some trial and error, I decided that calculating the Mean Squared Error (MSE) between consecutive pairs of snapshots would be the best method to use. The MSE is the average of the squared differences between corresponding pixel values of two images, which quantifies their overall difference, and fortunately it was quite straightforward to implement.
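In Python, the calculation only takes a few lines. This is a simplified sketch rather than my exact code (converting to greyscale is a choice I've made here to keep the numbers simple, not a requirement of the method):

```python
import numpy as np
from PIL import Image


def mse(path_a: str, path_b: str) -> float:
    """Mean Squared Error between two equally sized snapshots."""
    # Greyscale keeps the numbers simple; comparing RGB channels works too,
    # it just changes the scale of the values slightly.
    a = np.asarray(Image.open(path_a).convert("L"), dtype=np.float64)
    b = np.asarray(Image.open(path_b).convert("L"), dtype=np.float64)
    # Average of the squared differences between corresponding pixels.
    return float(np.mean((a - b) ** 2))
```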
I then applied the MSE to the snapshot data from the LSD API. After some initial testing, I discovered that the MSE is incredibly sensitive: on what seemed to be two identical snapshots of an empty room, it still returned a value greater than 0. After further testing, I settled on a threshold of 100: if at least one of the MSE values calculated across the consecutive pairs of snapshots is greater than 100, the movement flag is set to true. This threshold is quite arbitrary and could change with further testing to prevent false alerts.
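Putting it together, the movement check is roughly the following, reusing the mse function above and assuming the snapshots have already been downloaded in chronological order:

```python
MOVEMENT_THRESHOLD = 100  # arbitrary for now; likely to change with more testing


def movement_detected(snapshot_paths: list[str]) -> bool:
    """True if any consecutive pair of snapshots differs by more than the threshold."""
    scores = [mse(a, b) for a, b in zip(snapshot_paths, snapshot_paths[1:])]
    return any(score > MOVEMENT_THRESHOLD for score in scores)
```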
The next stage for me is to do further testing to ensure the accuracy of the MSE metric and threshold. So far, I have tested this in a couple of rooms in Argyle House and one lecture theatre, and it seems to be working well, but with hundreds of lecture theatres, broader testing is essential!
This experience has taught me so much and has been immensely valuable. Working with live data for the first time was really exciting, and the large amount of trial and error significantly enhanced my problem-solving skills. I also gained a better understanding of the technologies involved in lecture recordings, including Echo360, the booker, the scheduler, and the tools developed by LST, and I came to appreciate the collaboration between DLAM and LST. I have really enjoyed applying my Informatics studies in a practical, meaningful way that will hopefully benefit other university students. This has also sparked an interest in EdTech, which aligns with my passion for purpose-driven projects, especially those improving accessibility, an area that is personally very important to me.