In this extra post, Serveh Sharifi, Vanda Inacio, Ozan Evkaya, and Amanda Lenzi, academics from the School of Mathematics, share their experience and insights on hosting the American Statistical Association (ASA) DataFest 2024 at The University of Edinburgh and Heriot-Watt University. The hackathon attracted 82 students who worked in teams to analyse a large dataset from CourseKata.
What is ASA DataFest?
The American Statistical Association (ASA) DataFest is a celebration of data where teams of undergraduate students work over a weekend to find and share meaning in a large and complex dataset. The dataset and the analysis are typically beyond the scope of what students encounter in their university courses. Founded at UCLA in 2011, ASA DataFest has grown rapidly and is now hosted by many prestigious colleges and universities across the USA, as well as several renowned institutions abroad.
This friendly competition provides students with invaluable real-world experience and an opportunity to showcase their skills, explore a data scientist’s job, and network with professionals and peers. During the competition, professional data scientists supervise students in their work. The event ends with brief presentations of the teams’ work, assessing the students’ ability to communicate results clearly and effectively. Prizes are awarded to winning teams in various categories.
ASA DataFest in Edinburgh
We organised the ASA DataFest 2024 in Edinburgh from March 22 to 24, inviting all undergraduate (UG) students across The University of Edinburgh and Heriot-Watt University with an interest in data to join. A total of 82 students in 20 teams, from 10 different schools, participated and worked together over the weekend in the Hawthorn Teaching room at the Nucleus Building. Academic staff, PhD students, and data scientists from industry also joined us as consultants to guide participants during the event.
The event was supported by the School of Mathematics, the Actuarial Mathematics and Statistics Department at Heriot-Watt University, the Centre for Statistics, the Maxwell Institute, the Bayes Centre, the International Centre for Mathematical Sciences, and the Royal Statistical Society.
The surprise large dataset for this run of DataFest was from CourseKata. CourseKata is a platform that creates and publishes a series of e-books for introductory statistics and data science classes, utilising demonstrated learning strategies to help students learn these subjects. The developers of CourseKata are interested in improving statistics learning by examining students’ interactions with online interactive textbooks. The challenge was to analyse the data and make suggestions to help CourseKata enhance the student experience of learning statistics.
Participants were asked to present their team analyses in five minutes on Sunday evening. Each presentation was reviewed by two of the four judges, followed by discussions to finalise the decisions on winners. Prizes were awarded to six teams in the categories of “Best Insights”, “Best Visualisation”, “Best Use of Outside Data”, and “Judges Pick”. Winners received prizes at the end of the event, including gift cards and free membership to the American Statistical Association and the Royal Statistical Society.
The value of DataFest (and other hackathons) for students
Hackathons like DataFest can be powerful tools for enhancing student learning. Here are some key motivations:
- Multidisciplinary practice: Hackathons are great opportunities for students to work in multidisciplinary teams and on multidisciplinary problems. Data science, as a multidisciplinary field, sits at the intersection of statistics, computer science, domain expertise, and critical thinking. Events like DataFest encourage interdisciplinary collaboration by bridging the gap between such traditionally separate fields. Additionally, introducing students to data science early in their academic journey provides them with an understanding of how data can be used to address intricate real-world problems and equips them with invaluable career skills.
- Working with real data: Experience in working with real and challenging data is a key component of the undergraduate curriculum, especially in statistics and data science. Guidelines published by the American Statistical Association (2014) and the Royal Statistical Society (2017 and 2021) emphasise the importance of students becoming familiar with analysing non-textbook data and possessing the ability to communicate complex statistical methods to a non-technical audience. ASA DataFest provides an excellent opportunity for students to gain such experience (Horton, 2015; Dalzell and Evans, 2023).
- Networking: During this event, volunteer consultants guide students in their team analysis of the dataset, answer their questions, and provide insight about the data and analysis. Apart from academic staff and PhD students, we also had consultants from local programming groups, local data-centred companies, and the Scottish Qualifications Authority. This allows students to socialise with these experts and build valuable professional connections.
- Equity, diversity, and inclusion: Such events provide opportunities to encourage students from all backgrounds to explore and experiment in a specific field, in this case data science, a subject that might seem both enticing and perhaps intimidating to some. In a supportive environment, assisted by peers and guided by academic and industry consultants, students can gain confidence and skills regardless of their prior experience.
Reflection and advice
Because we enjoyed the event – spending time with students and professionals over three days, discussing the data, modelling, coding, and enjoying good food – we hope to make this an annual event in Edinburgh in collaboration with the ASA. A student described the event as:
“The event all together was a great joint experience for students and staff I guess, it created a platform to have another style of communication. The challenging data set, students’ commitment and their good final talks are the rewards of a long weekend.”
We strongly suggest that colleagues organise similar events to bring students together to focus on problem-solving skills. However, organising such extended events must be done carefully. The venue, food, and technical and scientific support from staff must be well provided to ensure the event’s success.
References
American Statistical Association Undergraduate Guidelines Workgroup (2014), 2014 Curriculum Guidelines for Undergraduate Programs in Statistical Science, Alexandria, VA: American Statistical Association. https://www.amstat.org/education/curriculum-guidelines-for-undergraduate-programs-in-statistical-science-
Royal Statistical Society (2017), “Master’s (level 7) Standards in Statistics,” available at https://rss.org.uk/RSS/media/File-library/Membership/Prof%20Dev/rss-level7-standards.pdf
Royal Statistical Society (2021), The RSS Accreditation and Quality Mark Schemes, A Guide for Accredited Partners, available at: https://rss.org.uk/RSS/media/File-library/Membership/Prof%20Dev/RSS-Accreditation-and-Quality-Mark-Guidelines-0421.pdf
Horton, N.J. (2015). Challenges and Opportunities for Statistics and Statistical Education: Looking Back, Looking Forward, The American Statistician, 69, 138-145. https://www.tandfonline.com/doi/full/10.1080/00031305.2015.1032435
Dalzell, N.M., and Evans, C. (2023). Increasing Student Access to and Readiness for Statistical Competitions, Journal of Statistics and Data Science Education, 31, 258-263. https://www.tandfonline.com/doi/full/10.1080/26939169.2023.2167750
Serveh Sharifi
Serveh Sharifi is a Lecturer in Mathematical Data Science at the School of Mathematics and Edinburgh Futures Institute, and a Fellow of the Higher Education Academy. She teaches undergraduate and postgraduate courses in statistics, and one of her research interests is data science pedagogy.
Vanda Inacio
Vanda Inacio is a Reader in Statistics in the School of Mathematics. She teaches UG and MSc courses in statistics and conducts research on Bayesian biostatistics. Additionally, she has an interest in statistical pedagogy and how best to teach students.
Ozan Evkaya
Ozan Evkaya is a University Teacher in Statistics at the School of Mathematics and has been teaching mathematics students in higher education across different subjects. Outside of university teaching, Ozan is a co-organiser of the TEMSE seminars and the Gen-AI discussion group in the School of Mathematics and of the EdinbR group, and a member of the RSS Edinburgh local community.
Amanda Lenzi
Amanda Lenzi is a Lecturer in Statistics at the School of Mathematics at the University of Edinburgh. She researches and teaches computational statistics and deep learning.