Anonymising data using R

I often work with student data for research purposes, and one of the first steps I take before doing any analyses is to remove identifying details.

Our students have a “University User Name” (UUN) which is an S followed by a 7-digit number (e.g. “S1234567”), which can be used to link different datasets together. I need to replace these identifiers with new IDs, so that the resulting datasets have no personal identifiers in them.

I’ve written a simple R script that can read in multiple .csv files, and replace the identifiers in a consistent way. It also produces a lookup table so that I can de-anonymise the data if needed. But the main thing is that it produces new versions of all the .csv files with all personal identifiers removed!
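The core idea can be sketched in a few lines. The author's script is written in R; this is only a rough Python illustration of the same steps, not that script. The function name `anonymise_tables`, the `ID0001`-style replacement IDs and the in-memory CSV strings are all assumptions made for the example.

```python
import csv
import io
import re

# A UUN is an "S" followed by exactly 7 digits, e.g. S1234567.
UUN_PATTERN = re.compile(r"\bS\d{7}\b")


def anonymise_tables(tables):
    """Replace UUNs consistently across several CSV tables.

    `tables` maps a (hypothetical) file name to its CSV text.
    Returns the anonymised tables and the UUN -> new ID lookup.
    """
    lookup = {}

    def replace(match):
        uun = match.group(0)
        # Reuse the ID if this UUN was seen before, in any table.
        if uun not in lookup:
            lookup[uun] = f"ID{len(lookup) + 1:04d}"
        return lookup[uun]

    anonymised = {}
    for name, text in tables.items():
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        for row in csv.reader(io.StringIO(text)):
            writer.writerow([UUN_PATTERN.sub(replace, cell) for cell in row])
        anonymised[name] = out.getvalue()
    return anonymised, lookup
```

In this sketch the new IDs are handed out in order of first appearance, so in practice you might prefer to shuffle or randomise them. The lookup table should be stored separately from the anonymised files.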

Code on GitHub

Scaffolded proofs in a Moodle quiz

In my online course Fundamentals of Algebra and Calculus, there were several places where I wanted to encourage students to engage with a key proof while reading the text.

One approach to this is to ask proof comprehension questions after giving the proof, but I’ve also tried writing some sequences of questions that lead the students through the proof in a scaffolded/structured way.

Here’s a simple example, of a sketch proof of the Fundamental Theorem of Calculus:

[Screenshot of question showing a sketch and asking students to complete an expression for a shaded area in the sketch]

Students can’t see the next part of the proof until they give an answer. Once they have submitted their answer, the next part is revealed:

[Screenshot: solution to the task, followed by the rest of the proof]

I’ve used this approach in other places in the course, sometimes with more than one step.

The way to do this in Moodle is by having the quiz settings set to “Interactive with multiple tries”, then using the little padlock symbols that appear at the right-hand side between questions on the “Edit questions” page.

After clicking the padlock, it changes to locked to indicate that students must answer the first question to see the second.

I’ve not done any serious evaluation of this approach, but my intuition is that it’s a good way to direct students’ attention to certain parts of a proof and encourage them to be more active in their reading.

Moodle gradebook setup for mastery grading

In my course Fundamentals of Algebra and Calculus, students complete weekly Unit Tests. Their grade is determined by the number of Unit Tests passed at Mastery (80%+) or Distinction (95%+) levels. For instance, to pass the course, students need to get Mastery in at least 7 of the 10 units. You can find more details about the course in this paper:

  • Kinnear, G., Wood, A. K., Gratwick, R. (2021). Designing and evaluating an online course to support transition to university mathematics. International Journal of Mathematical Education in Science and Technology. https://doi.org/10.1080/0020739X.2021.1962554

All the Unit Tests are set up as Moodle quizzes, and I needed a way to compute the number of tests completed at Mastery level (and at Distinction level) for each student.

To make matters more complicated, there are 4 different versions of each Unit Test:

  • Unit Test – the first attempt
  • Unit Test Resit – a second attempt, available to students shortly after the first attempt if they did not reach Mastery
  • Unit Test (Extra Resit) – a third attempt, available at the end of semester
  • Unit Test (Resit Diet) – a fourth attempt, available during the resit diet in August

Each subsequent attempt replaces the result of previous ones – e.g. if a student with a Mastery result on the first attempt decides to take the Unit Test (Extra Resit) to try to get a Distinction, then they will lose the Mastery result if they do not reach the 80% threshold.

To set this up in the Moodle gradebook, I have given each of the variants an ID, with the pattern:

  • WnFT
  • WnFTR
  • WnFTR2
  • WnFTRD

(where n is the week number).

Then I have added a calculated grade item called “Number of Mastery results”, with a complicated formula to determine this. It is the sum of 10 terms like this:

ceil([[W1FTRD]]/32)*floor([[W1FTRD]]/25.5)+(1-ceil([[W1FTRD]]/32))*(ceil([[W1FTR2]]/32)*floor([[W1FTR2]]/25.5)+(1-ceil([[W1FTR2]]/32))*floor(max([[W1FT]],[[W1FTR]])/25.5))

where this snippet computes the number of Mastery results in week 1 (i.e. it will return 0 or 1).

Note that the 25.5 appears throughout this expression because that is the threshold for 80% on these tests.

  • ceil([[W1FTRD]]/32)*floor([[W1FTRD]]/25.5) means “if they took the Resit Diet version, then use their score on that to decide if they got a Mastery result”
  • (1-ceil([[W1FTRD]]/32))*(...) means “if they didn’t take the Resit Diet version, then use their other scores to decide”
  • There’s then a similar pattern with W1FTR2
  • And finally, if students didn’t take either W1FTRD or W1FTR2, we use the best of the W1FT and W1FTR results to decide (simple “best of” is OK here, since students can only take W1FTR if they did not get Mastery on W1FT).
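To check that the formula does what the bullet points say, it can be transcribed into Python and compared with a plain "if" version of the same logic. This is only a sketch: the function names are made up, and the maximum score of 32 is inferred from the `/32` in the formula rather than stated anywhere. Note that a score of 0 is treated as "did not take this version", because `ceil(0/32)` is 0.

```python
import math

MAX_SCORE = 32   # assumed maximum mark, inferred from the /32 in the formula
MASTERY = 25.5   # Mastery threshold used in the formula


def mastery_formula(ft, ftr, ftr2, ftrd):
    """Direct transcription of the gradebook formula for one week."""
    took_rd = math.ceil(ftrd / MAX_SCORE)   # 1 if a Resit Diet score exists
    took_r2 = math.ceil(ftr2 / MAX_SCORE)   # 1 if an Extra Resit score exists
    return (took_rd * math.floor(ftrd / MASTERY)
            + (1 - took_rd) * (took_r2 * math.floor(ftr2 / MASTERY)
                               + (1 - took_r2)
                               * math.floor(max(ft, ftr) / MASTERY)))


def mastery_conditional(ft, ftr, ftr2, ftrd):
    """The same decision written with explicit conditionals."""
    if ftrd > 0:        # latest attempt replaces earlier ones
        deciding = ftrd
    elif ftr2 > 0:
        deciding = ftr2
    else:               # best of first attempt and first resit
        deciding = max(ft, ftr)
    return 1 if deciding >= MASTERY else 0
```

For example, `mastery_formula(30, 0, 20, 0)` gives 0: a student with 30 on the first attempt who then scores 20 on the Extra Resit loses the Mastery result, as described above.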

This is all quite complicated, I know! It has grown up over time, as the FTR2 and FTRD versions were added after I first set up this approach.

Also, when I first implemented this, our version of Moodle did not support “if” statements – since the Moodle grade calculations can now make use of “if” statements, this calculation could be greatly simplified.

APA referencing in LaTeX using Overleaf

I’m now very used to the referencing style used in education journals (e.g. “according to Author (1999)”), to the point where the numbered style more commonly used in science (e.g. “according to [1]”) really annoys me!

This year I’m supervising three undergraduate projects, and I’ve asked them to use the APA style for referencing in their reports.

It took me a while to find a way of doing this in LaTeX that I was happy with, so to smooth the path for my students I shared this version of the project template, where I’d made all the necessary changes to implement APA style:

https://www.overleaf.com/read/yjkyzmpmkcdm

The key parts are as follows.

In the preamble:

% formatting of hyperlinks
\usepackage{url}
\usepackage{hyperref}
\usepackage{xcolor}
\hypersetup{
    colorlinks,
    linkcolor={red!50!black},
    citecolor={blue!50!black},
    urlcolor={blue!80!black}
}

% Use biblatex for references - change style= as appropriate
\usepackage[natbib=true,backend=biber,sorting=nyt,style=apa]{biblatex}
\renewcommand*{\bibfont}{\fontsize{10}{12}\selectfont}

% add your references to this file
\addbibresource{references.bib}

At the end of the document:

\printbibliography{}

And make sure to add references.bib to your project, with all the BibTeX references. I’ve found Mybib.com a really useful tool for this, though I mainly use Mendeley as my reference manager (and this can import easily into Overleaf).

Proof comprehension questions

Proofs are an important part of mathematics. In many courses, proofs will be important in two ways:

  1. reading proofs – e.g. to understand new ideas in the course through the proofs of important results, and to see applications of earlier concepts or theorems,
  2. writing proofs – e.g. to show understanding of ideas from the course by being able to apply them to solving “unseen” problems, including proving results that go a bit beyond what was covered in the course.

This post is focused on the first of these. Developing students’ abilities to read proofs is something that is not often done explicitly – there may be an assumption that students will pick it up by osmosis. There is some research into how to help students to develop these abilities (e.g., Hodds et al., 2014), and a key part of this is having a good way to measure students’ level of comprehension of a given proof.

Proof comprehension framework

Mejia-Ramos et al. (2012) give a framework for assessing proof comprehension, with 7 different types of questions that can be asked:

Local
    • Meaning of terms and statements
    • Logical status of statements and proof framework
    • Justification of claims

Holistic
    • Summarizing via high-level ideas
    • Identifying modular structure
    • Transferring the general ideas or methods to another context
    • Illustrating with examples

You can see some more detail about these different categories in a recent talk by Pablo.

The framework is helpful when trying to write questions to assess students’ understanding of a given proof, as it gives ideas for different types of questions you can ask.

Examples

A few years ago, I used this framework to put together some multiple-choice proof comprehension questions for our Year 3 course, Honours Analysis.

My experience of these is that students found them quite hard – the mean score was around 75%, so they are not trivial for students to answer.

References

Hodds, M., Alcock, L., & Inglis, M. (2014). Self-Explanation Training Improves Proof Comprehension. Journal for Research in Mathematics Education, 45(1), 62. https://doi.org/10.5951/jresematheduc.45.1.0062

Mejia-Ramos, J. P., Fuller, E., Weber, K., Rhoads, K., & Samkoff, A. (2012). An assessment model for proof comprehension in undergraduate mathematics. Educational Studies in Mathematics, 79(1), 3–18. https://doi.org/10.1007/s10649-011-9349-7

Taking good screenshots of webpages

I’m working on a paper just now about my online course, Fundamentals of Algebra and Calculus. I’d like to include a high quality screenshot to show what the online course materials look like, and have finally found “one weird trick” that makes it easy!

I learned about this from a bit of googling, which led to this guide to producing a screenshot as an SVG (scalable vector graphic).

Based on that, here’s an easy way to take a screenshot as a PDF:

  1. Using Chrome, on the page you want to screenshot, open the developer tools (e.g. by right clicking the page and choosing “Inspect”)
  2. Click on the “…” menu at the top right of the developer tools window, then choose “More tools” > “Rendering”. This should open a new pane with various options.
  3. Set “Emulate CSS media” to screen
  4. Now when you go to print the page in Chrome, and choose “Save as PDF” for the printer, you will get the webpage as it looks normally, rather than the special printer-friendly style.

For the page I was saving, I found that setting the paper size to A2 gave good results. I also set Margins to “Custom” and made the page slightly narrower. I think you just need to play around with the page size, scaling and margins until you are happy.

I also used the developer tools window to tidy up the page a little, e.g. deleting some irrelevant navigation boxes, and instructor-only tools.

Et voila!

[Example screenshot: Screenshot_Polynomials-3-2-3_sketching-cubics]

STACK: Checking answers in polar form

Last week’s topic in FAC was complex numbers, and I’ve had some difficulties with STACK questions asking students to give their answer in polar form, e.g. when the correct answer was 4*(cos(pi/3)+i*sin(pi/3)) an answer of 4*(cos((1/3)*pi)+i*sin((1/3)*pi)) would be marked incorrect!

The issue was that:

  • with simplification turned on, Maxima will automatically simplify polar form to cartesian form, so I need simplification off.
  • with simplification off, Maxima won’t see those equally valid ways of writing the argument as the same.

I was using the EqualComAss answer test to check whether the student answer (ans1) was equal to the model answer (ta1), and this was failing in the cases above.

The solution I came up with is to add some code to the feedback variables box at the top of the PRT, to replace cos and sin with alternate versions so that Maxima can’t simplify the expressions to cartesian form. I can then use ev(…,simp) to make use of simplification when comparing the expressions:

form_ans1:subst([cos=COSINE, sin=SINE], ans1);
form_ta1:subst([cos=COSINE, sin=SINE], ta1);
proper_form:is(ev(expand(form_ans1-form_ta1),simp)=0);

This will ensure that COSINE(pi/3) and COSINE((1/3)*pi) will cancel out, thanks to the simplification being turned on.

But since Maxima doesn’t know anything about COSINE, it can’t cancel out COSINE(-pi/3) and COSINE(5*pi/3) (as it would do with cos) if students give their answer with the wrong value for the principal argument.

It was then just a case of replacing the test for EqualComAss(ans1,ta1) in the PRT with a test that AlgEquiv(proper_form, true), and regrading. Out of ~160 attempts this picked up 8 students who deserved full marks!
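The reason the inert names help can be seen in a toy model outside Maxima. The Python sketch below is not STACK code, and the helper names are invented for illustration. It keeps COSINE and SINE as unevaluated labels while normalising their arguments (stored as exact multiples of pi), and contrasts this with actually evaluating cos and sin.

```python
import cmath
import math
from fractions import Fraction


def inert_polar(modulus, arg_over_pi):
    """Model r*(COSINE(t) + i*SINE(t)) with t = arg_over_pi * pi.

    COSINE and SINE are just labels here, so nothing is evaluated;
    only the argument is put into a normal form by Fraction.
    """
    t = Fraction(arg_over_pi)
    return (Fraction(modulus), ("COSINE", t), ("SINE", t))


def cartesian(modulus, arg_over_pi):
    """What happens with real cos and sin: the polar form collapses."""
    t = float(arg_over_pi) * math.pi
    return complex(modulus * math.cos(t), modulus * math.sin(t))


# pi/3 and (1/3)*pi normalise to the same argument, so the forms match.
same_form = inert_polar(4, Fraction(1, 3)) == inert_polar(4, Fraction(2, 6))

# -pi/3 and 5*pi/3 give the same complex number, but the inert forms
# differ, so an answer with the wrong principal argument is still caught.
same_number = cmath.isclose(cartesian(4, Fraction(-1, 3)),
                            cartesian(4, Fraction(5, 3)))
same_inert = inert_polar(4, Fraction(-1, 3)) == inert_polar(4, Fraction(5, 3))
```

Here `same_form` and `same_number` are true while `same_inert` is false, which mirrors the behaviour wanted from the PRT.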

Update (08/11/2021): One year on, and STACK now has a new feature which makes it easier to grade these answers correctly! The new EqualComAssRules answer test lets you add a list of different algebraic rules so that two answers should count as equivalent if they differ only by those rules – e.g. x and 1*x.

To fix this question, it’s enough to change the first PRT node to the following, using the “Test options” box to specify the list of algebraic rules:

ATEqualComAssRules(ans1, ta1, [ID_TRANS,NEG_TRANS,DIV_TRANS,INT_ARITH]);

Marking exams using Gradescope

This post is part of a series on marking remote exams.

We’re using Gradescope to mark 5 of our remote exams at the moment. Here, I’ll outline the process that we’ve used.

Preparation

As with all our exams, we go through a process to prepare a folder of anonymised PDFs, one for each student.

Gradescope provides two different types of assignment:

  • Exam / Quiz – where the instructor uploads a batch of scripts, but these all need to have a consistent layout (e.g. when students complete an exam on a pre-printed booklet).
  • Homework / Problem Set – where the student can upload a script of any length, and then identify which questions appear on which pages.

Unfortunately, neither of these quite fits our situation – we have a set of variable-length scripts, but we need to upload them ourselves (since we wanted students to have a consistent submission experience across exams, whether or not we used Gradescope to mark them).

Fortunately my colleague Colin Rundel is a wizard with R and he was able to semi-automate the process of uploading each script individually to the “Homework / Problem Set” assignment type. All our marking is done anonymously, so we’re only using the students’ Exam Number in the Gradescope class list, and for each student Colin’s R script uploads their PDF submission.

Zoning

Once the scripts are uploaded, we still need to identify which questions are on which pages – a process I’ve taken to calling “zoning” since that’s the terminology used in RM Assessor (one of the other tools we’ve been trying).

To do this, we’ve employed several PhD students, who would normally have been helping out with various marking jobs for our 1st/2nd year courses (but those exams were cancelled for this diet).

These PhD students were set up as TAs in the course, and tasked with marking up which questions were on each page, just like the students would normally do in Gradescope. This is a surprisingly difficult workflow in Gradescope, requiring multiple clicks to move between scripts (and there is no summary of which scripts have been “zoned”). To get round this, Colin prepared a spreadsheet with direct links to each script in Gradescope, and the zoners used this to keep track of which ones they had completed (and note any issues). I wrote some very brief instructions on the process (PDF) – this included a short video clip of me demonstrating how to do it, but I’ve redacted that here because it shows student work.

Marking

The process in Gradescope is based on using rubrics (see https://www.gradescope.com/get_started). These can work with either positive or negative marking; we have been using the default of negative marking in the exams so far, which is different to our usual practice but seems to work best in this system. Essentially, for each question, you develop a set of common errors and the associated number of marks to take off. That way you can then tag responses with any errors that occur, giving more useful feedback about what went wrong.

Each course has worked a little differently, but the basic idea is for the Course Organiser to develop a rubric and make sure it works on the first 10-15 scripts.

  • Sometimes the CO has developed the rubric first, then gone through the first 10-15 scripts to check it made sense and make any adjustments
  • Other times, the CO has just started marking the first 10-15 scripts, and used that process to develop the rubric.

Other markers can then be assigned a question (or group of question parts) each, and they go through applying the rubric. We’ve asked that they flag any issues with the rubric to the CO rather than editing it directly themselves (e.g. to add a new item for an error that doesn’t appear already, or if it seems that the mark deduction for one item is too harsh given the number of students making the error).

A nice feature is that you can adjust the marks associated to rubric items, and this is applied to all previously marked scripts in the same way.
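A small model shows why this works: each script stores which rubric items were tagged, not a mark, so the mark is recomputed from the current deductions. The Python sketch below is only an illustration of that idea, not Gradescope's implementation. The class name, the example rubric items and the floor at zero are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """One exam question marked negatively against a shared rubric."""
    max_marks: float
    deductions: dict = field(default_factory=dict)  # rubric item -> marks off
    tags: dict = field(default_factory=dict)        # script id -> set of items

    def tag(self, script_id, item):
        """Record that a rubric item applies to a script."""
        self.tags.setdefault(script_id, set()).add(item)

    def score(self, script_id):
        """Recompute the mark from the current rubric deductions."""
        lost = sum(self.deductions[item]
                   for item in self.tags.get(script_id, ()))
        return max(0, self.max_marks - lost)  # assumed: marks cannot go below 0


q = Question(max_marks=10,
             deductions={"sign error": 2, "missing justification": 3})
q.tag("B001", "sign error")
q.tag("B002", "sign error")
q.tag("B002", "missing justification")

before = (q.score("B001"), q.score("B002"))
q.deductions["sign error"] = 1   # the CO decides the deduction was too harsh
after = (q.score("B001"), q.score("B002"))
```

Changing one deduction updates every script carrying that tag, without anyone revisiting the scripts.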

Feedback from markers so far has been very positive. They have found the system intuitive to use, and commented that being able to move quickly between all attempts at a particular question has meant that they can mark much more quickly than on paper. Gradescope have also done some analysis of data from many courses and found that markers tend to get quicker at marking as they work through the submissions.

Moderating

Once marking is completed, the CO can look through the marking to check for any issues. The two main ways of doing this are:

  • checking through a sample (10-20 scripts) of each question, to make sure the rubric is being applied consistently (and following up with further checking if there are any issues).
  • checking whole scripts, particularly any which are failing or near the borderline.

Gradescope provides the facility to download a spreadsheet showing the mark breakdown for each script, and also a PDF copy of the script showing which rubric items were selected for each question part. We’ll be able to make those available for the moderation and Exam Board process.

Conclusions

Gradescope is clearly a powerful tool for marking, and I think we will need something like this if we are to do significant amounts of on-screen marking in future.

However it does come with some issues – we had to work around the fact that it is not designed for the way we needed to use it. For long term use it would make sense to have the students tag up which questions appear on which pages, but that would require integration with our VLE and would add a further layer to the submission process for students (and another tool/system to learn to use). I was also concerned to see news that Gradescope crashed during an exam for a large class in Canada, and there are obvious issues about outsourcing such a sensitive function.

Marking remote exams

Throughout April and May 2020, we’ve had 42 different exams running remotely in the School of Mathematics. These would normally have happened in exam halls, with students writing on paper and the scripts then passed around between staff for marking and moderation.

With the exams now happening remotely, students have instead been scanning their work and uploading it in PDF format.

So, how are we dealing with marking and moderating over 2200 PDF exam scripts?

We’ve taken the opportunity to try out three different approaches:

  1. GradeX PDF – this is a tool that was rapidly developed by Tim Drysdale in Engineering, with the aim of keeping the experience for markers close to that of marking scripts on paper.

    We’ve used this as our default approach, with a mean of around 50 scripts per exam.

  2. Gradescope – this is a popular web-based service, now owned by Turnitin, that we had heard about from colleagues in other mathematics departments some time ago.

    We have used this for 5 of our largest exam papers, in courses with over 150 students, because it has enabled multiple markers to work together more easily.

  3. RM Assessor – this is a service used by many school exam boards, including the SQA, with tools in place for managing large distributed marking teams.

    We have used this for our largest exam, with around 180 students.

Students have the consistent experience of submitting their exam script through a Learn assignment. These have been set up across all 42 courses by a team in our teaching office, using a consistent template. We then use the bulk download feature to export all the scripts, and do some offline processing (described in a previous blog post) to get a set of PDFs ready for whichever tool will be used for marking.

Marking (and moderation) is still underway, so it’s too early to say what our preferred solution is. But so far, it seems that on-screen marking is working well – some markers have reported saving time by not having to shuffle through lots of paper, and by not having to do mundane jobs like adding up marks.

Sharing mathematical writing – using video

On Friday, I took part in a workshop on “sharing mathematical writing” with some colleagues from Mathematics, Informatics and Physics.

We’re all thinking about how best to arrange for students to be able to work together online, and being able to work together on written mathematics is a key requirement for our disciplines.

In the workshop, we tried four different approaches:

  1. Video – using smartphones as document cameras, to show writing on paper,
  2. Digital whiteboard – using a free online whiteboard tool, where participants can collaboratively write (we tried NoteBookCast.com),
  3. Collaborative notebook – using a shared OneNote notebook,
  4. Typing – in the collaborative LaTeX editor, Overleaf.

I was in the Video group, and found it worked quite well.

Setup

The approach is to use a smartphone (a common piece of kit!) as a document camera.

Ahead of the meeting, I installed software on my phone and on my laptop, that enables the phone’s camera to appear as a webcam on the laptop. I followed the instructions in this article and installed DroidCam on both phone and laptop.

We were using Microsoft Teams for the meeting. It’s easy to switch between different webcams, once you know to click the “…” on the toolbar, and select the “Show device settings” option:

Assuming you’ve got DroidCam (or equivalent) running, you should then see it appearing as an option in the dropdown list under Camera.

During the meeting, I was able to switch between using my laptop’s webcam, and my phone’s camera. I have a cheap tripod that holds my smartphone, so I was able to use that to point the phone camera at my page of writing. Others were making do with holding their phone above the page, but given how cheaply you can get hold of a reasonable holder (e.g. this one for £12) I think that would be worth the investment!

Alternative method in Teams

It’s also possible to join the Teams meeting from the smartphone, if you have the Teams app installed. Make sure you use the options when joining the meeting on your phone to turn off the mic and the audio output – otherwise you will get horrible audio feedback.

If you have the video turned on when joining the call on your phone, that will take over from your laptop webcam. Other participants will see the video from your phone instead. If you want to switch back to your laptop webcam, simply pause the video on your phone (tap the video icon on the toolbar, like the one at the left of the screenshot above).

This approach was much simpler to set up – just install the Teams app on your phone, and no fussing with apps like DroidCam.

However I found it a bit counterintuitive in Teams at first – the view of the call on my laptop showed no indication that my phone was involved at all. It was still showing the little preview of my laptop’s webcam, and I couldn’t see the view from my phone that other people were seeing (though of course, I could see that on my phone screen).

Advanced setup

I’ve been doing some further experimenting, to see if I could get round the issue of having to choose which camera to show. The solution I found is to use the DroidCam approach, with some further software on top of it:

  • OBS Studio – which lets you control multiple video streams
  • OBS VirtualCam – which makes the output from OBS available as a webcam on your system

With those installed, I fired up DroidCam, then set up a picture-in-picture view in OBS Studio by adding both my laptop webcam and the DroidCam as sources:

Here is what it looks like:

Final thoughts

During the session, it took colleagues about 20 minutes to get settled in to using the technology. I think that’s a one-off setup cost, that we would obviously need to sort out with all students as part of induction.

On balance, my rank order of recommendations would be:

  1. using the Teams app on the phone to join the call,
  2. using DroidCam or similar to let you switch between cameras on your laptop,
  3. the more complicated picture-in-picture setup.

Using the Teams app on your phone is the simplest technically – and once you are used to how it works, it’s quite straightforward.

This whole approach does have its downsides – in particular, it’s hard for participants to work collaboratively since it relies on one person doing the writing. I think there are ways around that, like having well-defined group roles like “scribe” which students would rotate through. But it would need some careful thought and lots of practice on our part – not least to make sure tutors are comfortable facilitating groups working in this way.

Even if we decide not to use this approach in our teaching, I’m confident that it would be useful for students who might want to work together outside of scheduled classes.

Further reading