Short Project Description
This project is a pilot study, financed by a Carnegie Trust Research Incentive Grant, exploring the feasibility of digitising historical church records using state-of-the-art machine learning algorithms. In many European countries, churches kept the official records of births (baptisms), marriages, and deaths before governments started to collect register data on their populations. Recent efforts by European archives have led to the availability of scanned images of the books in which churches kept these records. In this project we explore to what extent current tools allow computers to recognise the text written by the priests, for example dates of birth, characteristics of the parents, and causes of death.
In the social sciences, researchers have used the register data collected by governments to study a wealth of questions with statistical tools. Unfortunately, these data typically only become available for periods from the 1970s and 1980s onwards. To study longer-running questions, such as generational inequality or the impact of historical events like the Spanish Flu, data on earlier time periods are needed.
This is where this project comes in. If historical church records can be digitised systematically using automated processes, a wealth of data becomes available to researchers, genealogists, and the wider public. Given the substantial number of available scans, transcribing them manually would be extraordinarily time-intensive and costly. Recent advances in machine learning and character recognition lead us to believe that automating the transcription process may be feasible.
PI: Andreas Steinhauer (https://www.ed.ac.uk/profile/andreas-steinhauer)
RA: Mirjam Eiswirth