Shrinkflation and synthetic speech: Rethinking the educational podcast

Pixelated face with 1 and 0's spinning around it — Image credit: Gerd Altmann (Pixabay)

In this post, Dr Brian McGrail describes how he uses Microsoft Co-Pilot to synthetically create podcast episodes to digitise course content that has been fully created by humans for Lifelong Learning education. Brian is a Lecturer (Social Sciences) and Course Organiser in the Centre for Open Learning. This post is part of the Podcasting in Learning and Teaching series.

Maintaining affordability, engagement, and student satisfaction within Lifelong Learning (LLL) education is becoming increasingly challenging within the higher education sector. General factors such as staff resource and increasing costs are accompanied by specific challenges to the LLL model. For example, lack of access to government fee-subsidies (which are tied to accreditation) and the cost-of-living crisis have made the LLL 10-week short course model less affordable, especially for demographics targeted under widening participation.

Amid these challenges, I was tasked with delivering a 10-week online LLL course on Utopianism at intermediate level. I had written and taught a previous Utopianism credit-bearing course that ran successfully for nine years. However, I had to reduce the contact time from two hours per week to just 1.5 hours, whilst maintaining student satisfaction. From previous experience, I knew students preferred contact time to be interactive (student-led, free-form discussion, or semi-structured comprehension activities). So, I decided to take a blended learning approach (Raouna, 2025) and flip the classroom (Advance HE, 2026) by removing the initial live chalk-and-talk, and replacing it with digitalised content to maintain the contact time for discussions and other activities.

Whilst lecture recordings, delivered via the University’s Media Hopper, are widely utilised with great success, past experience taught me that these are very time-consuming to produce, so my response was to ‘stick to audio’. This choice also had the benefit of removing the time-consuming process of ensuring images adhered to the relevant copyright for presentations (see Secker, 2025). The VLE (Canvas) used for this course can carry hyperlinks to Creative Commons (CC) pictures and reproduce these with CC licences and attributions that can accompany the audio. Consequently, the desired format for chalk-and-talk became a podcast – a contemporary version of the radio documentary.

Why use synthetic voices?

Producing a podcast is not without its challenges, both in terms of time and costs. Recording the human voice requires not only specialised and costly equipment but also trained, practised and confident performers, and, of course, a knowledge of the required editing software to put it all together. Faced with the prospect of multiple takes and a crash course in editing, with the overarching pressure to produce and deliver content, I opted to utilise Co-Pilot – Microsoft’s ‘AI companion’ – to produce my podcasts. I scripted the content and Co-Pilot converted this into audio through synthetic text-to-speech voices, allowing the content to be ‘performed’ without the need for live recording or voice actors.

This circumvented many challenges:

The quality of the recording is catered for and multiple synthetic voices are offered (male, female, various accents and ages).
By using different voices, I can turn typed text (playwright mode) into performed conversations, without involving, training, or rehearsing other staff or relying on guest speakers.
As everything is recorded ‘inside the box’ (computer), I can produce the podcasts anywhere at any time.
Because I write the voicing text, I generate a 100% accurate accompanying transcript, removing the reliance on automatically produced transcripts within Media Hopper.

There are well-placed concerns regarding the use of AI to assist with the generation of podcasts using synthetic voices:

Is the final product a bit wooden and lacking the emotion and uniqueness of a real person?
Would the authentic me – with my east Scotland, Irish-mothered accent – communicate the content better?

Possibly, but the conversational tone is present and the delivery is clear and therefore accessible. It’s also worth noting at this point that sight-impaired students frequently learn by listening to synthetic voices (via screen readers). Despite these concerns, the podcasts were produced to a high standard within the allocated time, and the LLL course ran successfully over the ten weeks.

Were there challenges?… Absolutely!

A constant problem I encountered (which may ease as the technology develops) was getting the software to pronounce certain words correctly. Surprisingly, the number “four” proved to be my bête noire. To have it pronounced correctly, it needs to be written “foe-are”; otherwise, it becomes “foo-err”. All other numbers are fine. “Plato” is pronounced as it should be, but “Sophocles” took a few screen-reading tests to get right. For me, this puzzling issue was a minor frustration.
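For anyone wanting to keep the accurate transcript and the phonetic ‘voicing’ text in step, the respelling workaround could be automated along the following lines. This is only a minimal sketch in Python, not the process used for the course: the function name and the respelling table are hypothetical, and the only entry taken from this post is “four” written as “foe-are”. It does not call Co-Pilot or any other text-to-speech service; it simply prepares the text that would then be pasted in.

```python
import re

# Hypothetical respelling table. Only the "four" -> "foe-are" entry comes
# from the post; any further entries would be found by trial and error.
RESPELLINGS = {
    "four": "foe-are",
}


def to_voicing_script(transcript, respellings=RESPELLINGS):
    """Return a copy of the transcript with whole-word respellings applied.

    The original transcript is left untouched, so it can still be published
    as the accurate written version alongside the audio.
    """
    if not respellings:
        return transcript
    # \b restricts matches to whole words, so "fourteen" is not altered.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in respellings) + r")\b",
        flags=re.IGNORECASE,
    )
    lookup = {word.lower(): spoken for word, spoken in respellings.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], transcript)


transcript = "Week four covers fourteen short readings."
voicing_script = to_voicing_script(transcript)
print(voicing_script)  # Week foe-are covers fourteen short readings.
```

Keeping the substitutions in one table means the clean transcript stays the single source of truth, and a newly discovered mispronunciation only needs one extra entry.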

A further issue was novel words and terms unknown to Co-Pilot. This is a particular issue given that academics take pride in inventing words. For example, I believe I have invented the term ‘Apocalyptia’ to merge the idea of ‘apocalypse’ with that of a ‘place’ – topos – of revelation. Even Microsoft’s AI-infused vocabulary generation can be ‘old news’ and out-of-date within a slither of time. Oh dear, I feel a new word emerging within!

Here is a link to the first lesson on my course on Utopianism, uploaded to Media Hopper as a sample (or promotional ‘taster’) – a synthetically-voiced introductory lesson entitled ‘Utopian as an Insult’: 01 Utopian as an Insult.

As the saying goes: the proof of the pudding is in the eating. Listeners have reported back positively, and discussions around the podcasts have noted the flexibility of self-pacing course material and the reduced pressure (even anxiety) that rewind and replay offer. It was pointed out that podcasts can be studied whilst driving and are easily downloadable to be listened to offline. Students are aware that AI has only been used for the delivery of the speech, with the content being a 100% human creation. Finally, the question has been raised of whether, in the not-too-distant future, students will be using the same approach in their submissions.

References

Advance HE (2026). Flipped Learning [online]. Available at: Flipped learning | Advance HE. Accessed on: 22/05/26.

Raouna, K. (2025). What is blended learning? Why it matters and how to apply it [online]. Available at: What is blended learning? Why it matters and how to apply it. Accessed on: 21/05/26.

Secker, J. (2025). Copyright anxiety in higher education: findings from our research [online]. Available at: Copyright anxiety in higher education: findings from our research – Copyright Literacy. Accessed on: 22/05/26.

Brian McGrail

Dr Brian McGrail is a Lecturer (Social Sciences) and Course Organiser in the Centre for Open Learning, the University of Edinburgh, where he teaches and designs courses on Access, International Foundation, and Short Courses (Open Studies) Programmes. He is also an Associate Lecturer with the Open University and has specialised in adult returner education for 25 years. Brian is currently External Examiner for Lifelong Learning (Access and Study Abroad Experience) at University of Glasgow and for Foundation Pathways at University of Derby.
http://linkedin.com/in/brian-mcgrail-5a579034
Brian McGrail’s OpenLearn Profile – OpenLearn – Open University
Brian McGrail (academia.edu)
www.socialisingsense.net

Jun 17, 2026

Teaching Matters

Promoting, discussing and celebrating teaching at The University of Edinburgh
