You recently made a big discovery – that an academic library containing millions of images used to train artificial intelligence systems had privacy and ethics issues, and that it included racist, misogynistic and other offensive content.
Yes, I worked on this with Vinay Prabhu – a chief scientist at UnifyID, a privacy start-up in Silicon Valley – examining the 80-million-image dataset curated by the Massachusetts Institute of Technology. We spent months looking through this dataset, and we found thousands of images labelled with insults and derogatory terms.
Using this kind of content to build and train artificial intelligence systems, including face recognition systems, would embed harmful stereotypes and prejudices and could have grave consequences for individuals in the real world.
What happened when you published the findings?
The media picked up on it, so it got a lot of publicity. MIT withdrew the database and urged people to delete their copies of the data. That was humbling and a nice result.
How does this finding fit into your PhD research?
I study embodied cognitive science, which is at the heart of how people interact and go about their daily lives and what it means to be a person. The background assumption is that people are ambiguous; they come to be who they are through interactions with other people.
It is a different perspective to traditional cognitive science, which is all about the brain and rationality. My research looks at how artificial intelligence and machine learning have limits in how they can understand and predict the complex messiness of human behaviour and social outcomes.
Can you give me an example?
Take the Shazam app: it works very well at recognising a piece of music that you play to it. It searches for the pattern of the music in a database, and this narrow search suits the machine approach. But predicting a social outcome from human characteristics is very different.
As humans we have infinite potential; we can react to situations in different ways, and a machine that uses a finite set of parameters cannot predict whether someone is a good hire or at risk of committing a crime in the future. Humans and our interactions represent more than just a few parameters. My research looks at existing machine learning systems and the ethics of this dilemma.
How did you get into this work?
I studied psychology and philosophy, and then I did a master’s. The master’s course had lots of elements – neuroscience, philosophy, anthropology, and computer science, where we built computational models of various cognitive faculties – and it is where I really found my place.
How has Covid-19 affected your research?
At the start of the pandemic, I thought this might be a chance to write up a lot of my project, but I found it hard to work at home and to unhook my mind from what was going on around the world.
I also missed the social side: going for coffee and talking with my colleagues about work and everything else. So I am glad to be back in the lab now and seeing my lab mates – even at a distance.