Case Study Part One: The Data

The overarching aim of this project is to get contemporary cultural events data into the hands of arts and humanities researchers. As part of this project, we have undertaken a case study using cultural events data supplied by our industry partners Data Thistle to ask some of our own research questions and to explore the complexities of the data.

Data Thistle’s What’s On data includes details of events (date, time, description, ticket prices) and their associated venues (venue name, post code). Further details can be found in their API documentation.

They provided us with a sample of data relating to cultural events that took place between 2017 and 2021 within the Edinburgh and South East Scotland City Region Deal area.

The sample was provided in nine JSON files, amounting to 153 MB and containing approximately 350,000 performances at 2,500 venues. Each event in the dataset is allocated to one of 15 discrete genre categories (Film, Music, etc.), and events can also be tagged (the dataset contains 2,175 unique tags used a total of 97,532 times).
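For readers who want to produce similar summary figures, here is a minimal sketch of how category and tag counts could be computed with pandas. The column names `category` and `tags` and the three sample rows are our own illustrative assumptions, not the actual Data Thistle schema or data.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's field names may differ.
events = pd.DataFrame(
    {
        "name": ["Jazz night", "Film festival opener", "Lunchtime jazz"],
        "category": ["Music", "Film", "Music"],
        "tags": [["jazz", "live"], ["festival"], ["jazz"]],
    }
)

# Number of events in each genre category.
events_per_category = events["category"].value_counts()

# One row per (event, tag) pair, dropping events that carry no tags.
tag_uses = events["tags"].explode().dropna()

unique_tags = tag_uses.nunique()   # how many distinct tags exist
total_tag_uses = len(tag_uses)     # how many times tags were applied overall
```

Exploding the list column keeps the tag count independent of how many tags any single event carries.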

The first step was to make sense of the data and get it into a form where we could ask the questions we wanted to ask of it. For this, project Co-Investigator and data wizard Dr Rosa Filgueira used pandas.DataFrame in a Python 3.9 environment.
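As a rough illustration of this step, the sketch below reads a folder of JSON files into a single pandas DataFrame. It is not the project's actual code: the function name `load_events` is hypothetical, and we assume each file holds a list of event records, which may not match the real file layout.

```python
import json
from pathlib import Path

import pandas as pd


def load_events(folder):
    """Read every *.json file in `folder` into one flat DataFrame."""
    frames = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            # Assumption: each file contains a JSON list of event dictionaries.
            records = json.load(handle)
        # json_normalize flattens nested fields, e.g. venue -> "venue.name".
        frames.append(pd.json_normalize(records))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

Flattening nested fields at load time means venue details sit in ordinary columns, which makes later filtering and grouping simpler.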

She created a Jupyter notebook that requires the user to input three parameters: 1) the city to study (e.g., Edinburgh); 2) a list of categories to analyse (e.g., Music, Visual art, Film, Books); 3) a month on which to focus a more detailed analysis across the years (e.g., August). It then generates different types of analyses (frequencies, histograms, Gantt charts, maps) using these parameters.
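To give a sense of how three such parameters might drive an analysis, here is a small sketch. The function names, the column names (`city`, `category`, `start`) and the sample rows are all hypothetical; the notebook itself may be organised quite differently.

```python
import pandas as pd


def select_events(events, city, categories, month):
    """Return events for one city and set of categories, plus the subset in `month`."""
    in_scope = events[(events["city"] == city) & (events["category"].isin(categories))]
    # month_name() gives English month names such as "August" by default.
    in_month = in_scope[in_scope["start"].dt.month_name() == month]
    return in_scope, in_month


def monthly_counts_by_year(in_month):
    """Count events per year and category for the chosen month."""
    with_year = in_month.assign(year=in_month["start"].dt.year)
    return with_year.groupby(["year", "category"]).size().unstack(fill_value=0)


# Hypothetical sample data, for illustration only.
events = pd.DataFrame(
    {
        "city": ["Edinburgh", "Edinburgh", "Edinburgh", "Dunbar", "Edinburgh"],
        "category": ["Music", "Film", "Music", "Music", "Music"],
        "start": pd.to_datetime(
            ["2018-08-03", "2018-08-10", "2019-08-05", "2019-08-06", "2019-01-12"]
        ),
    }
)

in_scope, in_month = select_events(events, "Edinburgh", ["Music", "Film"], "August")
counts = monthly_counts_by_year(in_month)
```

A table like `counts` (years as rows, categories as columns) is the kind of frequency summary that could then feed a histogram or bar chart.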

See Part Two for some of our results.

Photo by Conny Schneider on Unsplash