
Early Big Data and AI Algorithmic Bias

“The Ethical Data Futures” proved not just engaging but deeply insightful in highlighting the ethical minefield surrounding data handling. “Excavating AI” especially captivated me by excavating the historical layers that laid the foundation for today’s AI landscape. The authors’ observation that diverse political perspectives shaped AI development from the outset resonated deeply. Their characterisation of ImageNet imagery as political went beyond finite-infinite data states, exposing the biases embedded in mislabelled images, particularly those depicting people.

Professor Vallor’s seminar analysis further illuminated this politicisation. Having previously viewed labelling as purely descriptive, I found that her explanation challenged my perspective. She revealed that the very act of sourcing these images and data from the internet of 2009 was inherently political, reflecting the elitist demographics of the major internet users of the time. Her point about the exclusion of those who could not afford internet access, which effectively silenced a significant portion of the population, resonated deeply with me.

A particularly troubling aspect I noted, both in the readings and the discussions, was the inappropriate labelling of people in the images. Assigning labels like “slut” or “loser” sends a damaging message. It raises the question: how can one possibly discern someone’s personality from a single picture? This practice speaks volumes about the values and qualifications of those working through labelling services like Amazon Mechanical Turk. The emphasis on speed and monetary gain, with little regard for accuracy or sensitivity, paints a concerning picture.

While Professor Vallor suggested that the labellers might have come from low-income countries, I disagree. Accessibility limitations would have restricted participation in many countries. Priority likely went to users in the US, the dominant demographic in labelling work. Moreover, internet access was scarce in low-income countries during that era and would not have been readily available to just anyone. I therefore argue that the labellers likely came from a similar cultural background and potentially shared the biases prevalent in their communities. This lack of diversity further exacerbates the ethical issues.

The insufficient representation during AI development leaves room for political manipulation and the misrepresentation of certain communities. This constitutes both an ethical and a political bias. To understand and counteract these biases, further readings and group discussions were assigned, culminating in a case study reflection.


1) Crawford, K. and Paglen, T. (2019). Excavating AI: The Politics of Images in Machine Learning Training Sets. The AI Now Institute. Available online:

