Is your AI racist?

Why bias is AI’s biggest challenge.

Over the last couple of years, a rapidly increasing number of organizations have embraced the idea that Artificial Intelligence (AI) will be an essential driver for their company. However, as the Gartner Hype Cycle teaches us, after the so-called 'innovation trigger' most companies land in the 'trough of disillusionment': interest wanes as implementations and experiments fail.

Partly responsible for this 'trough of disillusionment' are biased data and unfairness towards minorities, one of the biggest challenges AI faces today. This blog explains why this is something to act upon now.

Bias in AI

Although there are many contributors to failed AI experiments and implementations, one that stands out (especially in terms of media coverage) is unanticipated behaviour of algorithms once deployed. There are many examples of this, like the foul-mouthed, racist chatbot Tay, which had to be taken offline after a short life of only 16 hours because of its inflammatory and offensive tweets. Even more disturbing is the US hospital algorithm that was more likely to refer white patients to personalized care programmes than equally ill people of color. Put simply, reality turns out to be completely different from the 'historic, biased representation of reality' we capture in a dataset. Training a machine learning model on such an imperfect dataset will inevitably produce unreliable results. Especially in cases like medical diagnosis and credit risk application reviews, this is a dangerous thing to do. AI models must therefore be equally fair, regardless of demographic group or sex.

Racist tweets by chatbot Tay

This topic is commonly referred to as 'fairness in AI' and is slowly evolving into one of the most crucial aspects of the field. Google Cloud describes fairness as the process of understanding bias introduced by your data and ensuring your model provides equitable predictions across all demographic groups. The examples above (chatbot Tay and the US hospital algorithm) seem like something that could have been easily prevented. Yet for some reason, someone decided these models were ready for production. This mistake shows how easily unfairness can be overlooked, even with the best intentions, especially if you haven't properly applied ways to ensure fairness at the various stages of model development. If you do use techniques and checks for bias during model development, you can reduce bias. Reduce, not prevent, because productive bias is necessary for an algorithm to be able to model the data and make relevant predictions (Hildebrandt, 2019).

"Reality turns out to be completely different from the 'historic, biased representation of reality' we capture in a dataset."

Several techniques exist to deal with bias and unfairness, such as measuring fairness by demographic parity or enforcing fairness through adversarial learning. But this blog is not about the technical aspects of reducing bias. It emphasises the importance of awareness of unfair bias, because we are all data points in someone else's model. These models are not only built to influence our everyday choices (like which shoes to buy), but can also profoundly affect our lives: credit risk applications, medical diagnosis, fraud detection, job applications, and so on. Moreover, building AI models is easier than ever before. Any organisation or person can deploy a black-box model claiming it is unbiased and works perfectly for the task at hand. But can they handle the effects of unfair bias?
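To give a flavour of what such a check looks like in practice, here is a minimal, hypothetical sketch of measuring demographic parity: comparing the rate of positive predictions across demographic groups. All names and data below are illustrative, not taken from any real system.

```python
def demographic_parity_gap(predictions, groups):
    """Return the largest difference in positive-prediction rate between groups.

    predictions: iterable of 0/1 model outputs (e.g. loan approved or not)
    groups: iterable of group labels, aligned with predictions
    A gap of 0.0 means every group receives positive predictions at the same rate.
    """
    counts = {}
    for pred, group in zip(predictions, groups):
        total, positives = counts.get(group, (0, 0))
        counts[group] = (total + 1, positives + pred)
    # Selection rate per group: share of members receiving a positive prediction.
    rates = {g: positives / total for g, (total, positives) in counts.items()}
    return max(rates.values()) - min(rates.values())


# Toy data: group "a" is approved 75% of the time, group "b" only 25%.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)
print(gap)  # 0.5, a large gap; a model fair by this metric would score 0.0
```

A check like this slots naturally into a machine learning pipeline as a test that fails when the gap exceeds some agreed threshold. Note that demographic parity is only one of several fairness definitions, and which one is appropriate depends on the application.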

Though we cannot back this claim with evidence and therefore speak from experience, we strongly feel that the vast majority of machine learning applied in organisations isn't tested for unfair biases, simply because it is not part of their 'machine learning pipeline' (yet!). It is essential that everyone involved in project teams developing AI solutions is aware of the dangers biases bring and knows how to act accordingly. By doing so, we reduce harmful biases and may successfully enter the next phase of the Gartner Hype Cycle: the Slope of Enlightenment.

If you feel like discussing this further, feel free to get in touch with us.

More examples:

- Apple was embarrassed in 2017 by actively suggesting the 'male businessman' emoji when users typed 'CEO'. In all fairness, Apple has been one of the few to also propose female emoji variations when suggesting professions (e.g. typing 'firefighter' suggests both a male and a female firefighter).

- The COMPAS Recidivism Algorithm is used in courtrooms in a number of US states. Its risk scores turned out to be not only unreliable and non-transparent, but also undeniably racist.

- Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes

- YOLO Creator Joseph Redmon Stopped CV Research Due to Ethical Concerns.

It is essential to keep investing in ways to detect and overcome bias. At Gyver we are currently working on this topic extensively and have vacancies for graduate interns to research it.

Let's get in touch