You can find the video of our Algorithmic Bias session at the end of this article. This was a presentation and panel session hosted on International Women’s Day 2022.
In November 2019 Steve Wozniak, co-founder of Apple, was mad enough about something to take to Twitter and complain.
The reason? He and his wife had both applied for Apple’s new Goldman Sachs credit card, with different results. While Steve had been approved with a high credit limit, his wife had only been given access to a tenth of the amount. The couple shared assets and bank accounts, so why had Apple’s black box denied her the level of credit it had given him?
The answer? Algorithmic Bias.
Algorithms make data-based decisions
Algorithms are behind many of the most impactful decisions made today by banks, lenders, and credit checking companies. Decisions that will affect how you and your family live your lives. But what is an algorithm?
Put simply, an algorithm is a set of instructions that a computer follows to complete a task.
Algorithms can vary dramatically in complexity. Some algorithms use a traditional ‘rules-based approach,’ where a human decision maker predetermines specific rules; for example, an algorithm could make a decision based on a rule that loans should be approved if the income shown is above a certain level.
In a rules-based system, the logic is expressed as a set of rules that together capture the knowledge held by the system. Unlike in machine learning, these rules are usually written by a human (e.g., an in-house developer) at the start of the process, and there is limited flexibility to change them once the system is deployed. One prominent example is a traditional credit scorecard.
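As a minimal sketch of the income rule described above, the snippet below hard-codes a single human-chosen threshold. The `Application` class, the `approve_loan` function and the 30,000 cut-off are all hypothetical, chosen only for illustration and not taken from any real lender.

```python
from dataclasses import dataclass

# Hypothetical threshold set by a human decision maker (illustrative value only).
MIN_ANNUAL_INCOME = 30_000


@dataclass
class Application:
    applicant_id: str
    annual_income: float


def approve_loan(application: Application, min_income: float = MIN_ANNUAL_INCOME) -> bool:
    """Approve when the declared income is above the fixed, predetermined threshold."""
    return application.annual_income > min_income


applications = [
    Application("A-001", 45_000),
    Application("A-002", 22_500),
]

# The rule never changes unless a person edits MIN_ANNUAL_INCOME.
decisions = {a.applicant_id: approve_loan(a) for a in applications}
print(decisions)
```

The rule is fully transparent, but it only changes when someone rewrites it, which is the limited flexibility mentioned above.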
Other algorithms are more complicated and can effectively ‘learn’ by teaching themselves their own rules. In credit risk, banks typically use these more complex ‘Machine Learning’ algorithms to predict their customers’ ability to repay, learning from historical examples in order to predict a future outcome.
Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
SOURCE: Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, Christoph Molnar, 2022-02-28
https://christophm.github.io/interpretable-ml-book/terminology.html
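To illustrate what ‘learning its own rules’ can mean, here is a deliberately tiny sketch in which the income cut-off is not chosen by a person but picked from historical examples. The `history` records and the `learn_income_threshold` function are hypothetical, and real credit models are far more complex than this single-threshold rule.

```python
# Hypothetical historical records: (annual income, whether the loan was repaid).
history = [
    (18_000, False),
    (24_000, False),
    (29_000, True),
    (35_000, True),
    (41_000, False),
    (52_000, True),
    (67_000, True),
]


def learn_income_threshold(records):
    """Pick the cut-off that classifies the most historical records correctly.

    The learned rule has the form "predict repayment if income >= threshold".
    Candidate thresholds are the incomes observed in the data.
    """
    candidates = sorted(income for income, _ in records)
    best_threshold, best_correct = candidates[0], -1
    for threshold in candidates:
        correct = sum((income >= threshold) == repaid for income, repaid in records)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(records)


threshold, accuracy = learn_income_threshold(history)
print(f"Learned threshold: {threshold}, accuracy on history: {accuracy:.2f}")
```

On this toy data the learned cut-off is 29,000, which classifies six of the seven records correctly. The rule comes entirely from past outcomes, so any unfairness in those outcomes is learned along with it.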
As a regulated industry, the financial services sector has been slow to catch up with the tech giants that pioneered Machine Learning. But as digitalisation increases the volume and variety of data available, Machine Learning has become widely used to process that data usefully at scale.
Algorithms are not just making existing decision making more efficient. They are enabling new ways of using data to increase access to, and the affordability of, financial services. Researchers are exploring the use of alternative data including payment behaviours, current account information, location, and even social media posts in order to predict credit risk [1].
Algorithms are also being used to assess mortgage lending applications to better predict default rates, benefitting both lenders and potential borrowers [2].
Decisions made by algorithms can be biased
For centuries, decisions about credit risk and lending were made by humans. As we all know, human decision making is prone to inaccuracy, incompetence, and subconscious bias.
In some ways, data-based decisions using Machine Learning offer the financial services industry a more sophisticated way to assess credit risk. Decisions made by computers will be more consistent than those made by humans. An algorithm is cheaper and more scalable than hiring and training a team of people.
But there are flaws with algorithms too.
Algorithms are created and trained by humans, who often feed their own subconscious biases into the code, reproducing “pre-existing patterns of exclusion and inequality” [3].
Machine Learning models are trained to ‘teach’ themselves how to solve problems and make optimised decisions, but this means it is often not transparent how a given decision was reached. For example, even when it is not told an applicant’s gender, a Machine Learning algorithm may infer it from the applicant’s other data, such as their recent purchase history, and use that information against the applicant.
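One hedged way to check for this kind of proxy effect is to ask how well a supposedly neutral feature predicts the protected attribute on its own. The sketch below uses a made-up audit sample and a hypothetical `proxy_strength` function. Gender is assumed to be withheld from the credit model and used only for the audit.

```python
from collections import Counter, defaultdict

# Hypothetical audit sample: (most frequent purchase category, self-reported gender).
audit_sample = [
    ("category_a", "F"), ("category_a", "F"), ("category_a", "F"), ("category_a", "M"),
    ("category_b", "M"), ("category_b", "M"), ("category_b", "M"), ("category_b", "F"),
    ("category_c", "F"), ("category_c", "M"),
]


def proxy_strength(sample):
    """Return (proxy accuracy, baseline accuracy) for guessing the protected attribute.

    Proxy accuracy: guess the most common gender within each feature value.
    Baseline accuracy: guess the most common gender overall, ignoring the feature.
    """
    by_value = defaultdict(Counter)
    for feature_value, protected in sample:
        by_value[feature_value][protected] += 1

    correct = sum(max(counts.values()) for counts in by_value.values())
    proxy_accuracy = correct / len(sample)

    overall = Counter(protected for _, protected in sample)
    baseline_accuracy = max(overall.values()) / len(sample)
    return proxy_accuracy, baseline_accuracy


proxy_accuracy, baseline_accuracy = proxy_strength(audit_sample)
print(f"Proxy accuracy: {proxy_accuracy:.2f}, baseline: {baseline_accuracy:.2f}")
```

In this toy sample the purchase category guesses gender correctly for 7 of 10 applicants, against a baseline of 5 in 10. A gap like that suggests a model could recover gender from the feature even though gender itself was never supplied.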
Since algorithms are used at scale, when the decisions they make are flawed they can impact hundreds of thousands of people unfairly.
One example that illustrates these issues is a University of California, Berkeley study, which found that both online and face-to-face lenders charge higher interest rates to African American and Latino borrowers, earning 11-17% higher profits on such loans. The researchers calculated that these borrowers paid up to half a billion dollars more in interest every year than white borrowers with comparable credit scores. Biases like these are affecting poor and disadvantaged people all over the world.
There are different types of bias
Bias occurs when a decision is weighted in favour of, or against, an individual (or subgroup of individuals) due to some preconceived opinion that is not based on reason. This can occur when data are heterogeneous, particularly in big datasets made up of subgroups that each have their own characteristics or behaviours.
A model that has been trained using biased data will produce biased and unfair outcomes. Let’s take the Apple and Goldman Sachs credit card example to illustrate how different types of bias may have affected the algorithm behind the scenes.
- Historical Bias – occurs when existing bias and societal inequalities from the real world leak into the data generation process, even after careful sampling (for example, if most credit applications have historically been made by men, the model may learn from this historical disparity).
- Representation Bias – occurs as a result of how samples from a population are defined and selected (for example, a lack of gender diversity in the selected training data).
- Measurement Bias – occurs as a result of how features are selected and used. In other words, differences in how minority subgroups are treated and controlled feed back into the type of measurements taken (for example, if women tend to have more credit cards than men, their purchasing behaviour may be scrutinised more heavily).
- Evaluation Bias – occurs if inappropriate or disproportionate benchmarks are used for evaluation (for example, the benchmarks used for creditworthiness may be biased towards one gender).
- Aggregation Bias – occurs when false assumptions are made about a subgroup based on what has been learned from other subgroups with different behavioural characteristics (for example, different genders may exhibit different behavioural patterns, so a single model would probably not be well suited to all groups in a population).
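Some of these biases can be checked for directly in the data. As a rough sketch for representation bias, the snippet below compares each group’s share of a training sample with its share of a reference population. The sample, the 50/50 reference shares and the `representation_gap` function are all hypothetical.

```python
from collections import Counter

# Hypothetical gender labels in a training sample: 80 men and 20 women.
training_genders = ["M"] * 80 + ["F"] * 20

# Hypothetical share of each group in the population the model will serve.
population_share = {"M": 0.5, "F": 0.5}


def representation_gap(sample_groups, reference_share):
    """Return sample share minus reference share for every group in the reference.

    Positive values mean the group is over-represented in the sample,
    negative values mean it is under-represented.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_share.items()
    }


gaps = representation_gap(training_genders, population_share)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A check like this says nothing about historical, measurement, evaluation or aggregation bias, which generally need domain knowledge as well as data to uncover.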
Since bias can leak into the modelling process at many points, those designing and implementing algorithms must actively check for it at each stage, so that adverse historical trends are not reproduced in the future.
Biased algorithms perpetuate bias in a way that’s self-fulfilling. Therefore, we should work to detect bias in these models and eliminate it as much as possible.
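One simple starting point for detecting bias is to compare a model’s approval rates across groups. The sketch below does this for a made-up set of decisions. The data and the `approval_rates` function are hypothetical, and this is only one of several possible fairness measures, not a complete audit.

```python
from collections import defaultdict

# Hypothetical model decisions: (group, whether the application was approved).
model_decisions = (
    [("M", True)] * 8 + [("M", False)] * 2
    + [("F", True)] * 5 + [("F", False)] * 5
)


def approval_rates(decisions):
    """Return the share of approved applications for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


rates = approval_rates(model_decisions)

# Gap and ratio between the lowest and highest approval rates.
rate_gap = max(rates.values()) - min(rates.values())
rate_ratio = min(rates.values()) / max(rates.values())

print(rates)
print(f"Approval-rate gap: {rate_gap:.2f}, ratio: {rate_ratio:.3f}")
```

A large gap, or a ratio well below 1, does not prove the model is unfair, since the groups may differ in other relevant ways. It is a signal that the decisions deserve closer investigation.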
Algorithmic Bias: Presentation and panel session. March 2022.
We want to tackle algorithmic bias
At Smart Data Foundry we want to work with governments, regulators, and industry to tackle algorithmic bias, as part of our mission to open finance for all.
It’s a topic that many people are looking at. Oliver Dowden MP recently said that “the government recognises the urgent need for the world to do better in using algorithms in the right way: to promote fairness, not undermine it. Algorithms, like all technology, should work for people, and not against them.”
If you want to learn more about algorithmic bias, we at Smart Data Foundry are running a free online event for International Women’s Day: ‘What is Algorithmic Bias and what can we do about it?’ Get your ticket on Eventbrite. Sign up to attend and receive a link to the recorded session as well as our whitepaper on the topic.
References
[1] Hurley & Adebayo 2016. Credit scoring in the era of big data.
[2] Fuster et al. 2017. The Role of Technology in Mortgage Lending.
[3] Barocas & Selbst 2016. Big Data’s Disparate Impact.