Thank you for your interest!
On this page we would like to introduce you to the concepts of bias and fairness in artificial intelligence (AI) systems.
The main page is divided into seven sections that build on each other. You can access each section by clicking on the icon in the box below.
Editorial note:
Our exercises build on the basics taught on this page. We therefore recommend that you work on the basics and the exercises in parallel. But don’t worry! At the end of each section, we will link to the relevant exercises and direct you back to them.
First, learn the definitions of the key terms ‘AI systems’, ‘bias’ and ‘fairness’. To do this, click on one of the three boxes.
Learn about the importance of artificial intelligence (AI) for business, society and your everyday life. For a deeper understanding, we also distinguish between the terms “artificial intelligence” and “machine learning” in this section.
Many businesses and governments are already using data-driven, algorithmic decision-making systems. In the foreseeable future, there will be few industries or areas of daily life in which artificial intelligence (AI) systems do not play a role.
You already use AI in your everyday life. Have you tried Spotify or binge-watched a Netflix series today? An AI system helps you find the music and movies you like. An algorithm learns from your choices and then recommends new songs you are most likely to add to your Spotify playlist. Smart home devices like Alexa, or automatic facial recognition when unlocking your smartphone, are two more examples of the importance of AI in our everyday lives.
AI is now behind many things – from chatbots and shopping recommendations to navigation with Google Maps. Google, for example, uses AI to understand search queries and evaluate relevant results. Companies such as Facebook and LinkedIn are using AI systems to identify questionable content. This includes potentially violent, pornographic or politically extreme content. Images, text and videos that (could) fall into these categories are automatically flagged by the AI. AI systems are also being used in many areas of medicine: AI systems learn to make diagnoses based on image data.
Machine learning (ML) as a subfield of AI enables machines to find patterns in data sets without explicit programming of rules and to make decisions and predictions based on this analysis. This is made possible by the increasing availability of big data and high computing power. ML applications typically become more accurate the more data they have available – without the need for additional programming.
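To make this more concrete, here is a minimal sketch (using scikit-learn and made-up toy data, purely for illustration) of how a model learns its decision rule from examples instead of being given hand-written rules:

    # Minimal sketch: the classifier derives its own rule from labelled examples.
    from sklearn.tree import DecisionTreeClassifier

    # Made-up toy data: [minutes listened per day, liked a similar song (0/1)]
    X = [[5, 0], [10, 0], [120, 1], [90, 1], [30, 0], [200, 1]]
    y = [0, 0, 1, 1, 0, 1]  # 1 = song was added to the playlist, 0 = it was skipped

    model = DecisionTreeClassifier().fit(X, y)  # no rules are programmed; they are learned
    print(model.predict([[150, 1]]))            # prediction for a new, unseen example

The more examples such a model sees, the better its learned rule typically becomes, which is exactly the point made above.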
We are seeing more and more headlines about AI systems being used inappropriately and making discriminatory decisions. One example is an AI system that filters job applications and selects only male candidates. Considering bias and fairness in the development of such systems has therefore become very important.
However, there is still a problem: Many people, not only in the general public but also among those who develop and use AI systems, still make the blanket assumption that algorithmic decisions are objective and neutral. But this is not a given, nor is a decision based solely on objective characteristics necessarily fair and non-discriminatory. Given this situation, there is a great need to gain a deep understanding of the challenges in the use and implementation of AI systems and possible solutions.
In this section you will learn about different types of bias within the ML lifecycle.
Machine learning (ML) is increasingly being used to make decisions that affect people’s lives. Typically, algorithms learn from existing data and apply the learned patterns to unseen data. As a result, problems can arise in data collection, model development and system deployment that can lead to various biases.
Bias can arise at any stage of the ML lifecycle. The ML lifecycle comprises a series of decisions and practices in the development and use of ML systems. Each stage involves decisions that can introduce bias. The process starts with data collection. This involves defining a target population and drawing a sample from it, as well as identifying and measuring characteristics and labels. This data set is divided into training and test data. The training data is used to ‘learn’ an ML model. The test data is used to evaluate the model. The model is then used in a real application to make decisions for its users. This process is cyclical: for example, the model’s decisions influence the state of the world at the time of the next data collection or decision.
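The stages of this cycle can be illustrated with a minimal sketch (scikit-learn and a synthetic dataset are used here purely as stand-ins):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # 1. Data collection: define a population, draw a sample, measure features and labels
    #    (replaced here by a synthetic dataset).
    X, y = make_classification(n_samples=1000, n_features=5, random_state=0)

    # 2. Split the data set into training and test data.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # 3. 'Learn' an ML model from the training data.
    model = LogisticRegression().fit(X_train, y_train)

    # 4. Evaluate the model on the test data.
    print('Accuracy:', accuracy_score(y_test, model.predict(X_test)))

    # 5. Deployment: the model's decisions for real users influence the state of the
    #    world and thus the data collected in the next iteration of the cycle.
    decisions = model.predict(X_test[:5])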
Click on the red dots to learn about the eleven bias types in the ML life cycle:
No problem. You can find all the information from the interactive graphic in this box.
Historical bias is the pre-existing bias and socio-technical issues in the world that can enter the data generation process even with perfect sampling and feature selection.
Representation bias results from the way we collect data. Non-representative samples lack the diversity of the population (e.g. missing subgroups).
Measurement bias occurs when features and labels are selected, recorded or calculated to be used in a prediction problem. Typically, a feature or label is a proxy (a concrete measure) chosen to approximate a construct that is not directly coded or observable.
Omitted variable bias occurs when one or more important variables are omitted from the model.
Evaluation bias occurs when the data used to evaluate a model for a particular task is not representative of the user population. Evaluation bias can also be exacerbated by the choice of performance metrics.
Algorithmic bias refers to distortions that are not present in the data but are introduced by the algorithm itself.
Aggregation bias occurs when a one-size-fits-all model is applied to data with underlying groups or types of examples that should be considered differently.
User interaction bias refers to bias that can arise from the user interface and from the users themselves, who impose their own self-selected behaviour and interaction on the system.
Population bias occurs when the statistics, demographics, representation and user characteristics of the platform's user population differ from those of the original target group.
Deployment bias generally refers to any bias that occurs during use, when a system is used or interpreted in an inappropriate way that was not intended by the designers or developers.
A feedback loop between data, algorithms and users can exacerbate existing sources of bias.
In this box we describe an illustrative example of each of the types of bias listed above.
Note that the different types of bias are not mutually exclusive, i.e. an AI system can suffer from more than one type of bias. For example, AI systems in fitness trackers may suffer from representation bias if darker skin tones are not included in the training dataset, measurement bias if the fitness tracker performs worse for darker skin tones, and evaluation bias if the dataset used to evaluate the AI system does not include darker skin tones.
Learn more: https://www.youtube.com/watch?v=vVRWeGlMkGk
What should you know?
Bias does not only come from biased data. Bias can also come from how the AI system is modelled, how the system is evaluated, or how users interpret the final results of the AI system.
Use the first course in this unit to better understand the different types of bias. Being aware of the many types of bias will help you to better identify them in AI systems.
Learn about the statistical definitions of fairness.
The concept of fairness is about ensuring that an AI system does not lead to unfair decisions or discrimination. Respecting the concept of fairness is required of an AI system from both an ethical and a legal perspective. Indeed, it is forbidden to treat essentially equal circumstances unequally, or essentially unequal circumstances equally, unless a different treatment is objectively justified. In particular, this means that individuals should not be discriminated against because they belong to a marginalised or disadvantaged group.
The potential harm scenario that the concept of fairness primarily addresses is discrimination by an AI system against a particular group of people – be it on the basis of an individual’s ethnic origin, gender, age, religion/belief or other indicators. These indicators are considered sensitive characteristics for which non-discrimination should be established. The concept of fairness involves identifying the potential impact of discrimination from the perspective of those affected. This is particularly relevant for AI systems that make decisions about individuals. Examples include AI-based lending, selection of job applicants or recommendations for medical treatment. The consequences of discrimination by AI systems can be violations of personal rights, financial loss or damage to reputation.
While unfairness can be captured intuitively through various examples, the challenge is to define fairness objectively, in a metrics-based way and as scalably as possible. Below we present concrete ways to quantify fairness.
There are different ways to define when an AI system is fair. We look at statistical definitions of fairness below. We focus here on classification in the ML field. Classification here refers to the identification of a category (e.g. creditworthy vs. not creditworthy) for a data instance (e.g. the data of a bank customer) based on training data whose categories are known.
Most of the metrics used to assess the fairness of a model relate either to the types of errors a model might make, or to the predictive power of the model for different groups. They can often be derived from the values of the so-called confusion matrix, which contains the number of correctly and incorrectly classified test instances per class.
We will consider the confusion matrix for the example of binary classification. For example, a model could classify x-rays into the classes “sick” or “healthy”, or a model could classify the data of a bank customer into the classes “creditworthy” or “uncreditworthy”. It is important that the actual classes of the test data are verified beforehand: for example, only images for which there is no doubt whether they really show the clinical picture are used as test data.
Classification models do not have to be binary – they can be trained on more than two classes. Most performance metrics can be derived from those of binary classification, so we restrict ourselves to binary classification in this learning unit.
Note: Turn on the English subtitles in the videos (cogwheel).
So, as explained in the video, when working with a binary classifier, both the predicted and the actual class can take one of two values: class 1 and class 2. We will first look at the different possible relationships between predicted and actual results:
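The four possible combinations are: a true positive (TP, class 1 correctly predicted as class 1), a false negative (FN, class 1 incorrectly predicted as class 2), a false positive (FP, class 2 incorrectly predicted as class 1) and a true negative (TN, class 2 correctly predicted as class 2). A minimal sketch (with hypothetical labels; class 1 coded as 1, class 2 as 0; scikit-learn used for illustration) of how these values and the confusion matrix can be computed:

    from sklearn.metrics import confusion_matrix

    y_true = [1, 1, 0, 0, 1, 0, 1, 0]  # actual classes of the test data (hypothetical)
    y_pred = [1, 0, 0, 1, 1, 0, 1, 0]  # classes predicted by the model (hypothetical)

    # With labels=[1, 0] the matrix is ordered as [[TP, FN], [FP, TN]].
    tp, fn, fp, tn = confusion_matrix(y_true, y_pred, labels=[1, 0]).ravel()
    print('TP:', tp, 'FN:', fn, 'FP:', fp, 'TN:', tn)

From these four values, performance metrics such as the true positive rate (TPR = TP / (TP + FN)) and the false positive rate (FPR = FP / (FP + TN)) can be derived; both appear again in the ROC curve below.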
There are many more performance metrics, such as the following four:
Often, however, a model does not directly output one of the two classes (class 1 or class 2) for a data instance, but rather a specific value. In order to assign the data instance to one of the two classes, a threshold value is set above or below which one or the other class is output. For this reason, the Receiver Operator Characteristic (ROC) curve is often used in the analysis of binary outcomes to show the performance of a model. The ROC curve provides information about performance over a range of thresholds and can be summarised by the area under the curve (AUC), a single number.
The ROC curve plots the performance metric TPR against the performance metric FPR at different classification thresholds. The following figure shows a typical ROC curve.
The AUC (area under the curve) measures the two-dimensional area under the entire ROC curve (think calculus). The area under the curve is a measure of a classifier’s ability to discriminate between classes and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model in distinguishing between class 1 and class 2.
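Here is a minimal sketch (with hypothetical scores) of how the ROC curve and the AUC can be computed with scikit-learn for a model that outputs a score per data instance rather than a class:

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true  = [1, 1, 0, 0, 1, 0, 1, 0]                  # actual classes (hypothetical)
    y_score = [0.9, 0.4, 0.3, 0.6, 0.8, 0.2, 0.7, 0.1]  # model outputs before thresholding

    fpr, tpr, thresholds = roc_curve(y_true, y_score)   # FPR and TPR over a range of thresholds
    print('AUC:', roc_auc_score(y_true, y_score))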
Below we present three statistical definitions of fairness based on the performance metrics mentioned above. We have also prepared a video for you:
Note: Turn on the English subtitles in the videos (cogwheel).
We repeat the three definitions of fairness presented:
In practice, it is not possible to optimise a model for all definitions of fairness. So which definition of fairness should you choose? As with most ethical questions, the answer is usually not easy to find and the choice of definition should be discussed in a conversation involving all members of your team.
By the way: If you are working on real problems, the data will be much, much larger. In this case, the confusion matrix is still a useful tool for analysing performance. An important point, however, is that real-world models cannot usually be expected to meet every definition of fairness perfectly. For example, if “demographic parity” is chosen as the fairness definition and a model is expected to select 50% men, the final model may select a percentage close to 50% but not exactly 50% (such as 48% or 53%).
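As an illustration of this last point, here is a minimal sketch (with hypothetical decisions and a hypothetical sensitive attribute) of how the selection rates behind “demographic parity” could be compared across groups:

    import numpy as np

    y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])                      # model decisions
    group  = np.array(['m', 'm', 'm', 'm', 'm', 'f', 'f', 'f', 'f', 'f'])  # sensitive attribute

    for g in np.unique(group):
        rate = y_pred[group == g].mean()
        print(f'Selection rate for group {g}: {rate:.0%}')
    # In practice the rates will rarely be exactly equal (e.g. 48% vs. 53%);
    # how large a gap is acceptable has to be decided by the team.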
In this section you will learn how to deal with issues of bias & fairness in the real world.
It is not easy to detect bias in your AI system, as it can occur at any point in the ML lifecycle. Furthermore, different people perceive different outcomes as ‘fair’.
You have already learned that bias can occur in a variety of specific ways in an ML lifecycle. You have also seen that there is no single approach to fairness, but rather different interpretations. So how do you deal with bias and fairness in AI systems in the real world? There are many questions that need to be asked to address this issue. The following graphic shows possible questions for selected phases of the ML lifecycle that can be asked to avoid unfair decisions.
One big problem is that bias is rarely obvious. Think of the comments posted under a post on a social networking site. If a comment violates the platform’s policies, such as hate speech, it may be deleted by the platform after it is posted. Some platforms use AI systems to automatically select and delete such hateful comments. But who decides which comments are hateful? Could such an AI system produce unfair results? Could someone be discriminated against?
Use the third course in this unit to explore this scenario in more detail. You can work directly on a real dataset, check for bias in an AI system, and examine performance metrics.
In this section you can learn about different mitigation strategies.
Bias in AI systems can take many forms, leading to unfair or discriminatory decisions. But there are also many ways to mitigate bias. In this section, we give you an insight into possible mitigation strategies.
There are some approaches to mitigating or eliminating bias at different stages of the ML lifecycle. However, there is no ‘one size fits all’ approach. Approaches range from formulating an application that is relatively free of bias, to collecting relatively unbiased data, to developing algorithms that minimize bias. Below we present three concrete approaches.
All models are created by humans and reflect human biases. Machine learning models can reflect the biases of organisational teams, the designers on those teams, the data scientists implementing the models, and the data engineers collecting the data. Of course, they also reflect the biases inherent in the data itself. Just as we expect a certain level of trustworthiness from human decision-makers, we expect a certain level of trustworthiness from our models. So to reduce bias, it is important that teams are as diverse as possible.
Explainable Artificial Intelligence (XAI) is a field that is essentially about making AI systems more transparent, so that people can trust and scrutinise an AI system – including in terms of bias and fairness. More specifically, XAI encompasses a variety of technologies and measures that ensure the transparency of an AI system. The goal is always to make the results or internal workings of AI systems understandable to human users. This can also significantly support the detection and correction of biases in the ML lifecycle. XAI can therefore be seen as a way to mitigate bias and improve the fairness of AI.
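As one very simple illustration of this idea, the following sketch (synthetic data; scikit-learn's permutation importance is just one of many techniques that can make a model's behaviour more transparent) shows how strongly a model relies on each input feature, including a hypothetical sensitive attribute:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))                  # columns stand for: income, age, gender
    y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # the label partly depends on 'gender'

    model = RandomForestClassifier(random_state=0).fit(X, y)
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

    for name, importance in zip(['income', 'age', 'gender'], result.importances_mean):
        print(f'{name}: {importance:.3f}')
    # A high importance for 'gender' would be a signal to investigate possible bias.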
Techniques for minimizing bias in algorithms fall into three categories: pre-processing (adjusting the data before training), in-processing (modifying the learning algorithm or its objective) and post-processing (adjusting the model's predictions after training).
The field of algorithmic fairness is a young area of research in which methods are still being refined. Nevertheless, there is already a large body of research proposing fair algorithms and bias mitigation techniques, and comparing different bias mitigation algorithms.
Below we give three brief examples of existing work and show how they fall into one of these categories. The main goal of the algorithms is to achieve a model with higher accuracy while ensuring that the models are less discriminatory with respect to sensitive attributes. In simple terms, the output of the classifier should not correlate with sensitive attributes. Building such ML models becomes a multi-objective optimisation problem. The quality of the classifier is measured by its accuracy and the discrimination it makes based on sensitive attributes; the more accurate the better, and the less discriminatory (based on sensitive attributes) the better.
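To make the idea tangible, here is a minimal, hypothetical sketch of a post-processing-style adjustment (a generic illustration, not one of the works referred to above): decision thresholds are chosen per group so that the selection rates of two groups become more similar, typically at some cost in accuracy:

    import numpy as np

    rng = np.random.default_rng(1)
    group = rng.choice(['a', 'b'], size=200)                # hypothetical sensitive attribute
    scores = rng.uniform(size=200)                          # model scores before thresholding
    scores = np.where(group == 'b', scores * 0.8, scores)   # group 'b' systematically scores lower

    # A single shared threshold ...
    shared = scores >= 0.5
    # ... versus per-group thresholds tuned so that both groups are selected at roughly
    # the same overall rate.
    target_rate = shared.mean()
    thresholds = {g: np.quantile(scores[group == g], 1 - target_rate) for g in ['a', 'b']}
    adjusted = scores >= np.array([thresholds[g] for g in group])

    for g in ['a', 'b']:
        print(g, 'shared:', round(shared[group == g].mean(), 2),
                 'adjusted:', round(adjusted[group == g].mean(), 2))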
There is no single right answer to how to ensure fairness in an AI system. The “right answers” are constantly evolving as AI fairness is an active area of research. See also Benjamin van Giffen et al. (2022).
In this section you will learn how to see and reap the benefits of a future with fair algorithms.
You cannot shy away from the challenge of making the use of algorithmic decision systems fair. The deeper you delve into the topic of bias and fairness in AI systems, the more you feel the complexity of the problem. On the philosophical side, there is a discourse about the right definition of fairness, a discourse that runs up against the technical impossibility of satisfying all theoretical understandings of fairness at the same time, as some of them are mutually exclusive. On the technical side, one is confronted with the challenge of the black box of a complex algorithm: what is actually going on inside my algorithm?
The discussion about fair algorithms needs to happen now. Ultimately, if algorithms are fair, they can help us overcome our own biases. Existing discriminatory practices should be exposed as soon as possible and reflection on the underlying decision-making criteria should be initiated. This can stimulate the next steps in the use of algorithms and AI. This includes, for example, a trained, enlightened use of algorithmic decision-making systems.
Bias and fairness are only pieces of the puzzle of trustworthy and ethical AI. In addition to bias and fairness, it also includes privacy and accountability.