Is big data the past? Is privacy the future for education?

This was one of my first tweets of the week with the #mscidel hashtag. Are big data and data-driven approaches concepts that we have already moved beyond? I mean, I am very aware of the importance and relevance of data collection these days. However, I think big data has been proven to generate many ethical issues around privacy and biased results. It offers generalisations and trends, but it does not actually give us truly “useful” information, or at least not information that is free of ethical issues.

Big Data represents a number of ethical considerations, particularly around privacy, informed consent, and protection of harm, and raises wider questions of what kinds of data should be combined and analysed, and the purposes to which this should be put.

Eynon, R. (2013).

For years (and it is still happening), data has been collected indiscriminately, with the idea that having as much data as possible will help us solve the problems of the education system. Different data points are collected just to see what they can offer and what conclusions we can draw from them. It seems that over these years teachers, educational institutions, and families have become data producers, responsible for collecting and recording data, as Williamson (2017) points out in the first chapter of his book, “Introduction: Learning machines, digital data and the future of education”.

Considering all of this, I wonder why we would want to use big data in education. Are we not moving towards a more inclusive system, where minorities and diversity are taken into consideration?

I started the week asking this, and after this week's readings, the discussions on Twitter, and cheating a little with my husband's knowledge, I can say that big data collection is on its way out. I am being optimistic, and I think we are moving towards a reality where, yes, we accept that our data is being collected, and we assume that it is used in ways beyond our complete understanding and capacity.

 

At the same time, though, we are aware of the importance of being conscious and careful about the data we share and with whom. People are not as open to sharing personal data as they were years ago, and I would say that the trend now is to talk about models of data collection that guarantee a high level of privacy. If we focus on education, we can see that students are more careful with their data, as are activists elsewhere in society. People are aware and are taking a more active role in the collection and use of their own personal data.

As I said, I cheated a little here and used my husband's expertise to learn more about privacy and where things are currently heading. He works as a researcher at Google in machine learning and privacy, so I thought it would be really useful to know what big tech companies care about. From him I learned that the current tendency, and what companies are working on, is building trust with users. How? By collecting data that is as anonymous as possible. Storage is also an issue, and collecting everything has a cost. Privacy is a delicate topic and, considering what has happened in the past, tech companies know that they need to guarantee the maximum level of privacy. To that end, different methods and protocols are implemented to guarantee encrypted data collection. You can read here what I shared on Twitter about the topic. He explained what differential privacy is, along with some protocols that are used to minimise the identification of people and to keep data as private as possible.
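To give a rough feeling for what differential privacy does, here is a minimal sketch of one textbook technique, the Laplace mechanism, applied to a simple count. It is only an illustration and not necessarily what any company actually runs. The function names, the `epsilon` value, and the class data are all made up for the example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values that satisfy `predicate`."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: did each student submit the assignment late?
submitted_late = [True, False, False, True, True, False, False, False]

# Smaller epsilon means more noise and therefore more privacy.
noisy = private_count(submitted_late, lambda late: late, epsilon=0.5)
print(f"Reported number of late submissions: {noisy:.1f}")
```

The reported number is close to the real one on average, but it is never exact, so nobody can tell from the published figure whether one particular student is in the data or not.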

What he pointed out, and what I think is very important when we talk about education, is the tendency to move to focused data collection instead of big data. This means having only the information that is relevant to the purpose and nothing else, which makes it more difficult to identify the person. As I mentioned while reviewing the MOOC, for example, a lot of demographic data is requested. This should be considered bad practice, because demographic data easily makes us unique. If we have several data points about each person, anonymity is difficult: the more data there is, the harder it is to hide within aggregated data. You will never find anything more random than real data. For that reason, moving towards models based on focused data collection is a necessity. And education can and should play an active role in this battle.
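The point about demographic data can be checked with a small sketch. The learners below are invented, and `unique_fraction` is just a helper written for this example. It counts what share of people are the only ones with their particular combination of answers.

```python
from collections import Counter


def unique_fraction(records, fields):
    """Share of records that are the only one with their combination of `fields`."""
    if not records:
        return 0.0
    keys = [tuple(record[f] for f in fields) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Hypothetical sign-up data from a course, made up for illustration.
learners = [
    {"country": "ES", "age_band": "30-39", "gender": "F", "job": "teacher"},
    {"country": "ES", "age_band": "30-39", "gender": "M", "job": "teacher"},
    {"country": "ES", "age_band": "40-49", "gender": "F", "job": "engineer"},
    {"country": "UK", "age_band": "30-39", "gender": "F", "job": "teacher"},
    {"country": "UK", "age_band": "30-39", "gender": "F", "job": "nurse"},
    {"country": "UK", "age_band": "40-49", "gender": "M", "job": "teacher"},
]

# Each extra field makes more learners one of a kind.
for fields in (["country"],
               ["country", "age_band"],
               ["country", "age_band", "gender", "job"]):
    share = unique_fraction(learners, fields)
    print(f"{len(fields)} field(s): {share:.0%} of learners are unique")
```

With only the country, nobody in this toy data stands out. With all four fields, every learner is unique, even though no name was ever collected.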

In conclusion, and answering my own question, I would say that education should not care about big data, but it should care a lot about focused data collection. I think education has an important role in making people aware of this and in pushing for regulation and policies. As my dear classmate Paul suggested in this tweet, education should be working together with data scientists to find consensus and achieve good practices in all the different contexts.

 


Eynon, R. (2013). The rise of Big Data: what does it mean for education, technology, and media research? Learning, Media and Technology, 38(3), pp. 237-240.

Williamson, B. (2017). Introduction: Learning machines, digital data and the future of education (chapter 1). In Big Data and Education: The Digital Future of Learning, Policy and Practice.

Wood, A., Altman, M., Bembenek, A., Bun, M., Gaboardi, M., Honaker, J., Nissim, K., O’Brien, D. R., Steinke, T. & Vadhan, S. Differential Privacy: A Primer for a Non-Technical Audience.

Nissim, K., Bembenek, A., Wood, A., Bun, M., Gaboardi, M., Gasser, U., O’Brien, D. R., Steinke, T. & Vadhan, S. (2018). Bridging the Gap between Computer Science and Legal Approaches to Privacy. Harvard Journal of Law & Technology, 31(2), Spring 2018.