CDCS Digitisation Internship: Creating the Managing Digitised Documents Pathway
In February 2022, PhD students Vesna Curlic and Ash Charlton began their digitisation internships in partnership with the University of Edinburgh’s Library and University Collections and the Centre for Data, Culture and Society (CDCS). Together, they reflect on the creation of a CDCS Training Pathway which will direct people towards resources for managing and working with digitised materials, including undertaking their own digitisation projects.
Our internship had two main parts. We spent half our time in the Cultural Heritage Digitisation Service’s Digital Imaging Unit (DIU) in the University’s Main Library, scanning early volumes of The Student, the University of Edinburgh’s student newspaper. In the other half, we developed a training pathway for the Centre for Data, Culture and Society (CDCS) which will direct people towards resources for undertaking their own digitisation projects. This post is part two of two, reflecting on our experiences creating that training pathway.
The most exciting part of digital humanities is the near-endless possibilities – people are doing work beyond our wildest imaginations. But this makes creating training resources difficult. How do you account for the many possible types of project, and for the different kinds of data they create, whether images or text? This was the first challenge we had to tackle in creating the pathway. The CDCS aims to help arts, humanities, and social science scholars navigate the world of data and digital technology, and it caters for a wide range of technical knowledge. So, naturally, the pathway also had to cater for a range of backgrounds, levels of knowledge and types of project involving digitised documents.
At first glance, the pathway suggests that this process is quite linear, but many of the steps are interlinked, and the decisions you take at each stage must be considered in the context of all your other decisions. We advise that you read through the pathway in its entirety to familiarise yourself with each step before you start work with your material, and then explore each step in more depth as you need.
The pathway begins with establishing permissions and copyright for your project, which are likely to be informed by the requirements of what you are digitising (step 3) and how you intend to use your digitised materials (steps 5 and 6). Preparation is key when working with lots of digital material, which is why the pathway addresses document management before you create any material through digitisation. Knowing how to manage your digital documents will help to keep you organised as you work through your project, from file creation to your final outputs, and should save you time in the long run!
We don’t get to what many people think of as the ‘actual’ digitisation – the process of capturing documents with a camera or scanner – until step 3, and even then, this is only a small element of the whole process. Digitisation is a highly variable process that can happen at a variety of scales and for different purposes. We wanted the pathway to account for that, but naturally, how you approach it will depend on the specifics of the material you are digitising, your available resources and your overall project. Your desired final outputs (steps 5 and 6) will also influence how you approach the creation of digitised material.
Text extraction and preparation in step 4 is not part of every digitisation project, but for some it may be the most important step – again, there is so much variation depending on your project type. Many researchers rely on text in their research, and reading or searching through it manually can be time-consuming. Being able to extract text from images, and knowing what potential issues to look out for, can open up new research avenues and methodologies. Of course, if you only intend to use digitised images in your research, this step will not be necessary.
Steps 5 and 6 are intended to generate ideas for how digitised content can be used in research outputs or projects. They are by no means a complete list of the ways digital text and images can be used, but they cover some common ones, and the exciting thing about research using data and digital skills is that new approaches are emerging all the time. Keep your final goals for your digitised materials in mind throughout the process so that you can make the best decisions at each stage of your project.
In creating this pathway, we hope that we have opened up new ways for researchers across multiple disciplines to approach digital research methods and outputs. Learning new skills can be difficult when you don’t know where to look, especially if you come from a background such as the humanities, where these methodologies and skills are less common. Each step of the pathway draws on expertise and lessons found across the web, and we have brought these together to give researchers a starting point from which to explore them more fully in ways that suit them. Finding and compiling these resources and distilling the information into easy-to-access steps was our biggest challenge while creating the pathway, but it has been an excellent exercise in researching and reviewing, and we hope you find it useful!
View the Managing Digitised Documents Pathway here.
Find out more about the University of Edinburgh’s Library and University Collections internship here.
(Image created using Canva. Original image by C.M. via Unsplash)