Fintech, AI, and discrimination

Two recent papers suggest AI lending could be a powerful tool for financial inclusion and help victims of discrimination access credit. 

By applying machine learning algorithms to large datasets in order to provide rapid and accurate risk assessments, fintech has the potential to dramatically cut the cost of accessing credit. Yet not everyone is convinced that the rise of algorithmic lending is an unalloyed good. 

Two recent books, Caroline Criado Perez’s Invisible Women and Cathy O’Neil’s Weapons of Math Destruction, highlight how algorithms can recreate, and even wrongly legitimise, real-world prejudices. These books serve a useful function: they remind us that a machine learning algorithm is only as good as the data it was trained on, and that algorithms shouldn’t be overhyped or treated as infallible.

Regulators such as the Financial Conduct Authority’s Charles Randell have expressed concern that algorithms could “automate and intensify unacceptable human biases that created the data of the past”. 

When we talk about bias there’s a need to be clear about terms. As Chris Stucchio and Lisa Mahapatra point out, ‘bias’ often means different things to different people:

“When an ordinary person uses the term ‘biased’, they think this means that incorrect decisions are made — that a lender systematically refuses loans to blacks who would otherwise repay them. When the media uses the term ‘bias’, they mean something very different — a lender systematically failing to issue loans to black people regardless of whether or not they would pay them back.”

The economics of discrimination

We should also be precise when talking about discrimination. Economists distinguish between taste-based and statistical discrimination. 

Say I’m hiring for a graduate job and have limited information about candidates. I might pass over an Oxford-educated candidate in favour of one educated at Manchester out of loyalty to my alma mater, even though I have reason to believe that the Oxford-educated candidate might be a better employee. This would be taste-based discrimination, and it comes with a clear cost: lower profits. Unconscious bias would be another example of taste-based discrimination.

Suppose instead I decided to interview a candidate who studied at Oxford over one who studied at a less well-known university, on the grounds that Oxford graduates tend, on average, to perform better. This would be statistical discrimination. Absent better information, a business that discriminates in this way will be more profitable than one that doesn’t, provided the stereotypes it relies on are accurate.

Statistical discrimination might be economically rational (i.e. a way to maximise profits), but it can create self-reinforcing vicious cycles. Take the example of lending. If a group is on average less creditworthy and as a result is likely to be charged higher interest rates, more creditworthy members of the group may simply choose not to apply for credit. Worse still, by losing its most creditworthy members, the group becomes on average less creditworthy. And the cycle begins again.
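To make that cycle concrete, here is a deliberately crude sketch in Python. The repayment probabilities, the pricing rule, and the drop-out rule are all invented for illustration; nothing here is drawn from a real lending model.

```python
# Toy illustration of the self-reinforcing cycle described above.
# All numbers are invented; this is not a calibrated model of any lender.

# Each borrower in the group has a true repayment probability.
group = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60]

for round_number in range(1, 4):
    avg_repayment = sum(group) / len(group)
    # The lender prices credit off the group average: the lower the average
    # repayment probability, the higher the rate it charges everyone.
    interest_rate = 0.02 + (1 - avg_repayment) * 0.5

    # The most creditworthy members find the group-based rate unattractive
    # and stop applying (here: anyone well above the group average).
    group = [p for p in group if p < avg_repayment + 0.10]

    print(f"Round {round_number}: average repayment {avg_repayment:.2f}, "
          f"rate charged {interest_rate:.1%}, applicants remaining {len(group)}")
```

Each round, the best risks leave, the group average falls, and the rate charged to everyone who remains goes up.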

Statistical discrimination might also discourage the accumulation of human capital. I recently watched the Netflix series When They See Us, based on the true story of the wrongly convicted Central Park Five, and it provides a good example of the problem. On his release from prison, Raymond struggles to get a job due to his criminal record, and resorts to a life of crime. If the returns to doing the right thing (e.g. finishing school) are lower, people will invest less in it and inequality will increase.

In general, we should expect taste-based or inaccurate statistical discrimination to die out in competitive markets, as discriminating firms miss out on talent or overpay for it. But statistical discrimination based on accurate information about group averages still poses a problem: in the absence of better information, companies that fail to discriminate in this way will lose out to companies that do.

By thinking about discrimination like an economist, a few things become apparent.

First, different types of discrimination require different solutions. For instance, increased competition might reduce taste-based discrimination but increase statistical discrimination.

Second, the impact of algorithmic bias on discrimination will be different in different sectors. Competition will be a factor. If there are multiple banks using multiple lending algorithms, banks using unbiased algorithms will gain market share at the expense of biased lenders.

We’ll need to keep a closer eye on discrimination in sectors where competitive pressures are weak or non-existent, e.g. the NHS or the justice system. One solution would be to make the algorithms used in criminal justice or public sector hiring open source.

But, while keeping that in mind, we shouldn’t make the perfect the enemy of the good. Humans are biased too, and anti-discrimination legislation rarely eliminates discrimination and bias entirely. As Caleb Watney writes:

“Places where human bias is most prevalent offer some of the most exciting opportunities for the application of algorithms. Humans appear to be really, really bad at administering justice by ourselves. We judge people based on how traditionally African American their facial features look, we penalize overweight defendants, we let unrelated factors like football games affect our decision making, and more fundamentally, we can’t systematically update our priors in light of new evidence. All this means that we can gain a lot by partnering with AI, which can offset some of our flaws.”

When competitive pressures are strong, our focus should shift to preventing statistical discrimination. The danger isn’t necessarily that machine learning algorithms will discriminate directly based on gender, class, or race. Rather, the concern is that a machine learning algorithm will stumble onto a proxy variable that stands in for a protected characteristic.

For example, Amazon created an algorithm trained on ten years’ worth of hiring data to assess job applications. The intention was to automate the hiring process, but it ended up automating past biases too. At first, it marked down candidates who used the word ‘women’ on their application (as in ‘I was women’s chess club captain’).

Unlike humans, algorithms can be edited to avoid explicit forms of discrimination. But while Amazon edited the algorithm to be neutral to such terms, discrimination wasn’t eliminated. Instead, the algorithm discovered proxy variables, such as ‘being named Jared’ or ‘playing lacrosse’. The danger is that, in cases where statistical discrimination is advantageous, an algorithm could outsmart attempts to prevent overt discrimination.
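To see how that can happen, here is a small, self-contained sketch with entirely synthetic data. Nothing below reflects Amazon’s actual system; the groups, the ‘hobby’ proxy, and the bias baked into the historical decisions are all assumptions, chosen only to show how a model can reconstruct a protected attribute from a correlated feature even after that attribute has been removed.

```python
# A stylised sketch of proxy discrimination, using made-up data.
# The protected attribute is excluded from the model, but a correlated
# "hobby" feature lets the model reconstruct it anyway.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.integers(0, 2, n)                  # protected attribute (0 or 1)
hobby = (rng.random(n) < np.where(group == 1, 0.8, 0.2)).astype(int)  # proxy
skill = rng.normal(0, 1, n)                    # genuinely job-relevant signal

# Historical hiring decisions were biased against group 1.
hired = ((skill - 0.8 * group + rng.normal(0, 0.5, n)) > 0).astype(int)

# Train on skill and hobby only -- the protected attribute is "removed".
X = np.column_stack([skill, hobby])
model = LogisticRegression().fit(X, hired)

scores = model.predict_proba(X)[:, 1]
print(f"mean score, group 0: {scores[group == 0].mean():.3f}")
print(f"mean score, group 1: {scores[group == 1].mean():.3f}")
# The gap persists: the model has learned to use the hobby as a stand-in
# for the protected attribute it was never shown.
```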

The promise of fintech

It’s worth understanding why banks might discriminate in the first place. When a signal (e.g. a credit rating) is noisy and unreliable, a rational lender will place a greater weight on prior beliefs about group averages.

But as the quality of a signal improves, prior beliefs about group membership carry less weight. For instance, suppose that instead of merely knowing a loan applicant’s education level, I also knew where they studied, what they studied, and had access to references from their tutors. With that much individual information, I would have far less need to fall back on group averages.
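One way to see the logic is with the textbook rule for blending a noisy signal with a prior. The numbers below are invented, and the weighting formula is the standard normal-normal updating rule rather than any particular lender’s model.

```python
# A back-of-the-envelope illustration of why better signals crowd out
# group averages. All numbers are invented.

def estimate(signal, group_mean, signal_var, prior_var):
    """Blend an individual's signal with their group's average.

    The weight on the signal rises as the signal becomes more precise
    (lower variance) relative to the spread within the group.
    """
    weight_on_signal = prior_var / (prior_var + signal_var)
    return weight_on_signal * signal + (1 - weight_on_signal) * group_mean

group_mean = 600     # say, the group's average credit score
signal = 700         # the individual's own (noisy) score

for signal_var in (400, 100, 25):
    blended = estimate(signal, group_mean, signal_var, prior_var=100)
    print(f"signal variance {signal_var:>3}: blended estimate {blended:.0f}")

# As the signal variance falls, the lender's estimate moves away from the
# group average and towards the individual's own score.
```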

One issue with statistical discrimination is that standard data sources tend to favour the majority. For instance, members of a minority group that’s less likely to get a loan might have shorter repayment histories and thus find it harder to get credit in the future. In such cases, lenders will be more likely to discriminate statistically.

The promise of fintech is that it can enable individuals without detailed repayment histories to access credit by using alternative sources of data, such as an individual’s social media footprint, spending records, or rent payments. The greater precision enabled by algorithmic lending should reduce the weight placed by lenders on group averages.

In his paper “On Fintech and Financial Inclusion”, economist Thomas Philippon uses a standard model of discrimination to show that algorithmic lending reduces biases against minority borrowers. Even if “fintech engineers suffer from the same prejudice as loan officers and export their bias into their algorithms”, Philippon finds that minority borrowers may still be better off using fintech lenders, because the impact of the prejudice declines as the algorithm becomes more precise.
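The intuition (though this is a stylised illustration, not Philippon’s actual model) can be seen by adding a prejudiced prior to the weighting sketch above: two applicants send identical signals, but the lender holds an unfairly low prior about one applicant’s group, and the gap that prejudice creates shrinks as the signal becomes more precise. All the numbers are invented.

```python
# Two identical applicants; the lender's prior about one group is unfairly low.
# A stylised illustration of the point above, not Philippon's actual model.

def blended(signal, prior_mean, signal_var, prior_var=100):
    weight_on_signal = prior_var / (prior_var + signal_var)
    return weight_on_signal * signal + (1 - weight_on_signal) * prior_mean

signal = 700                              # both applicants look identical
fair_prior, prejudiced_prior = 650, 550   # the second prior embeds the bias

for signal_var in (400, 100, 25, 1):
    gap = (blended(signal, fair_prior, signal_var)
           - blended(signal, prejudiced_prior, signal_var))
    print(f"signal variance {signal_var:>3}: gap caused by prejudice {gap:.0f}")

# As the algorithm's signal becomes more precise, the gap driven purely by
# the prejudiced prior shrinks towards zero.
```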

The implication of Philippon’s research is that minorities who face discrimination will be better off if fintechs are able to use larger and richer datasets. Instead of forcing lenders to open-source their algorithms, which may discourage investment, the priority should be to make it easier for them to access new sources of data. Projects such as the Government’s Rent Recognition Challenge, a £2m challenge to encourage developers to create applications that “enable rental tenants to record and share their rental payment data with lenders and credit reference agencies,” are steps in the right direction.

Philippon’s arguments are theoretical and shouldn’t end the conversation. He shows that minorities are more likely to get credit as new sources of data become available, but if the new sources (e.g. spending patterns) turn out not to be useful predictors, minorities will be worse off.

So are fintechs widening access to credit? Yes, according to a recent study from UC Berkeley. Looking at mortgage lending, the authors find that lenders charge Latinx and African-American borrowers 7.9 basis points more for purchase mortgages. However, fintech lenders discriminate 40% less than face-to-face lenders on pricing (i.e. interest rates). Furthermore, fintech lenders do not discriminate on approvals.

While fintechs discriminate less on price than face-to-face lenders, some discrimination remains. This may be the result of minority borrowers having less incentive or ability to shop around for better prices. 

The good news is that discrimination is on the decline. If the remaining discrimination is the result of weaker competition, it should fall further as more fintech lenders enter the market and minority borrowers gain a greater ability to shop around.

It’s right that people are concerned about the risk of algorithmic discrimination, but it should not be a one-sided discussion. Fundamentally, it should be a debate informed by clear economic analysis and evidence.