Newswise – Although the US Equal Credit Opportunity Act prohibits discrimination in mortgage lending, bias still impacts many borrowers. A 2021 Journal of Financial Economics study found that borrowers from minority groups were charged almost 8% higher interest rates and were rejected for loans 14% more often than those from privileged groups.
When these biases trickle down to the machine-learning models that lenders use to streamline decision-making, they can have far-reaching consequences for housing equity and even contribute to widening the racial wealth gap.
If a model is trained on an unfair dataset, such as one in which a higher proportion of Black borrowers were denied loans than white borrowers with the same income, credit score, and so on, those biases will carry over into the model's predictions when it is applied to real situations. To stem the spread of mortgage discrimination, MIT researchers created a process that removes bias from the data used to train these machine-learning models.
While other methods attempt to tackle this bias, the researchers’ technique is new to mortgage lending because it can remove bias from a dataset with multiple sensitive attributes, such as race, gender, and ethnicity, as well as several “sensitive” options for each attribute, such as Black or white for race, and Hispanic or Latino versus non-Hispanic or Latino for ethnicity. Sensitive attributes and options are features that distinguish a privileged group from a disadvantaged group.
The researchers used their technique, which they call DualFair, to train a machine-learning classifier that makes fair predictions about whether borrowers will receive a mortgage. When they applied it to mortgage data from several US states, their method significantly reduced discrimination in the model’s predictions while maintaining high accuracy.
“As Sikh Americans, we frequently face bias and believe it is unacceptable to see this trickle down to algorithms in real-world applications. For things like mortgages and financial systems, it’s very important that bias doesn’t seep into those systems because it can accentuate loopholes that are already in place against certain groups,” says Jashandeep Singh, a senior at Floyd Buchanan High School and co-lead author of the article with his twin brother, Arashdeep. The Singh brothers were recently accepted into MIT.
Joining Arashdeep and Jashandeep Singh on the paper are MIT sophomore Ariba Khan and lead author Amar Gupta, a researcher in MIT’s Computer Science and Artificial Intelligence Laboratory who studies the use of evolving technology to combat inequality and other societal problems. The research was recently published online and will appear in a special issue of Machine Learning and Knowledge Extraction.
DualFair addresses two types of bias in a mortgage dataset: label bias and selection bias. Label bias occurs when the balance of favorable or unfavorable outcomes for a particular group is unfair (for instance, Black applicants are denied loans more often than they should be). Selection bias arises when the data are not representative of the population as a whole (for instance, the dataset includes only individuals from one neighborhood with historically low incomes).
The DualFair process eliminates label bias by splitting a dataset into the largest possible number of subgroups based on combinations of sensitive attributes and options, such as white men who are not Hispanic or Latino, Black women who are Hispanic or Latino, and so on.
By breaking down the dataset into as many subgroups as possible, DualFair can handle discrimination based on multiple attributes simultaneously.
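The subgrouping step can be sketched as a partition over every combination of attribute options. This is an illustrative sketch only: the attribute names, options, and record format below are assumptions for demonstration, not the paper's exact encoding of the mortgage data.

```python
from itertools import product

# Assumed sensitive attributes and options (illustrative, not exhaustive).
SENSITIVE = {
    "race": ["White", "Black"],
    "ethnicity": ["Hispanic or Latino", "Not Hispanic or Latino"],
    "sex": ["Male", "Female"],
}

def subgroup_key(row):
    """Map a borrower record to its subgroup, e.g.
    ('White', 'Not Hispanic or Latino', 'Male')."""
    return tuple(row[attr] for attr in SENSITIVE)

def split_into_subgroups(rows):
    """Partition the dataset into one bucket per combination of options."""
    groups = {combo: [] for combo in product(*SENSITIVE.values())}
    for row in rows:
        groups[subgroup_key(row)].append(row)
    return groups
```

With two races, two ethnicities, and two sexes, this yields 2 × 2 × 2 = 8 subgroups, and each additional attribute or option multiplies that count, which is why the method can address several dimensions of discrimination at once.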
“So far, researchers have mainly tried to classify biased cases as binary. There are multiple parameters to bias, and these multiple parameters have their own impact in different cases. They are not weighed the same. Our method is able to calibrate it much better,” says Gupta.
Once the subgroups are generated, DualFair equalizes the number of borrowers in each subgroup by duplicating individuals from the minority groups and removing individuals from the majority group. DualFair then balances the proportion of loan acceptances and denials in each subgroup so that they match the median in the original dataset before recombining the subgroups.
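The label-balancing step can be sketched as oversampling or undersampling approved records within a subgroup until its approval rate hits a target. This is a simplified sketch under assumptions of our own (record format, a single `approved` label, both labels present in the subgroup); the paper's actual procedure also resizes subgroups relative to one another.

```python
import random

def balance_label_ratio(subgroup, target_rate, seed=0):
    """Duplicate or drop approved records until the subgroup's approval
    rate equals target_rate (in DualFair, the median approval rate of
    the original dataset). Assumes 0 < target_rate < 1 and that the
    subgroup contains at least one record of each label."""
    rng = random.Random(seed)
    approved = [r for r in subgroup if r["approved"] == 1]
    denied = [r for r in subgroup if r["approved"] == 0]
    # Number of approved records needed so that
    # want / (want + len(denied)) == target_rate.
    want = round(target_rate * len(denied) / (1.0 - target_rate))
    while len(approved) < want:                     # oversample approvals
        approved.append(dict(rng.choice(approved)))
    while len(approved) > want:                     # undersample approvals
        approved.pop(rng.randrange(len(approved)))
    return approved + denied
```

For example, a subgroup with one approval and three denials, balanced toward a 50 percent target, gains two duplicated approvals.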
DualFair then eliminates selection bias by iterating over each data point to see if discrimination is present. For example, if a person is a non-Hispanic or Latino black woman who was rejected for a loan, the system will adjust their race, ethnicity and gender one by one to see if the result changes. If that borrower is granted a loan when their race is changed to white, DualFair considers that data point to be biased and removes it from the dataset.
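This per-record check resembles situation testing, and can be sketched as follows. The attribute encoding and the `predict` callable are stand-ins of our own for a classifier trained on the dataset; this is not the paper's implementation.

```python
# Assumed attribute encoding for illustration (not the paper's schema).
SENSITIVE = {"race": ["White", "Black"], "sex": ["Male", "Female"]}

def is_flagged(row, predict):
    """Re-predict with each sensitive attribute flipped, one at a time.
    If any flip changes the decision, treat the record as biased."""
    baseline = predict(row)
    for attr, options in SENSITIVE.items():
        for alt in options:
            if alt != row[attr] and predict({**row, attr: alt}) != baseline:
                return True
    return False

def drop_biased_points(rows, predict):
    """Keep only records whose outcome is stable under attribute flips."""
    return [r for r in rows if not is_flagged(r, predict)]
```

A record whose decision depends only on non-sensitive features (income, credit score) survives the filter; one whose decision changes when race alone is flipped is removed.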
Fairness vs. accuracy
To test DualFair, the researchers used the publicly available Home Mortgage Disclosure Act dataset, which covers 88 percent of all mortgages in the United States in 2019 and includes 21 features, including race, gender, and ethnicity. They used DualFair to “debias” the full dataset and smaller datasets for six states, then trained a machine-learning model to predict loan acceptances and rejections.
After applying DualFair, prediction fairness increased while accuracy remained high across all states. They used an existing fairness metric known as average odds difference, but it can only measure fairness in one sensitive attribute at a time.
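For a single binary sensitive attribute, average odds difference is the mean of the gaps in true-positive rate and false-positive rate between the unprivileged and privileged groups. A minimal sketch (the function name and the 0/1 group encoding are our own conventions):

```python
def average_odds_difference(y_true, y_pred, group):
    """Mean of the FPR gap and TPR gap between the unprivileged
    (group == 0) and privileged (group == 1) populations.
    0 means both groups see the same error rates; the sign shows
    which group the classifier favors."""
    def tpr_fpr(flag):
        idx = [i for i, g in enumerate(group) if g == flag]
        pos = [i for i in idx if y_true[i] == 1]   # truly qualified
        neg = [i for i in idx if y_true[i] == 0]   # truly unqualified
        tpr = sum(y_pred[i] for i in pos) / len(pos)
        fpr = sum(y_pred[i] for i in neg) / len(neg)
        return tpr, fpr

    tpr_u, fpr_u = tpr_fpr(0)
    tpr_p, fpr_p = tpr_fpr(1)
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))
```

Because the metric compares exactly two populations, handling race, ethnicity, and gender jointly requires something beyond it, which motivated the researchers' own metric described next.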
So they created their own fairness metric, called the Alternative World Index, which considers bias across several sensitive attributes and options as a whole. Using this metric, they found that DualFair increased prediction fairness in four of the six states while maintaining high accuracy.
“It is the common belief that if you want to be accurate, you have to give up fairness, or if you want to be fair, you have to give up accuracy. We are showing that we can make progress toward closing that gap,” says Khan.
The researchers now want to apply their method to debiasing different types of datasets, such as those that capture health care outcomes, auto insurance rates, or job applications. They also plan to address the limitations of DualFair, including its instability when there are small amounts of data with multiple sensitive attributes and options.
Although only a first step, the researchers hope their work may one day have an impact on mitigating bias in lending and beyond.
“Technology, quite frankly, only works for a certain group of people. In the realm of mortgage lending in particular, African American women have always been discriminated against. We are passionate about making sure that systemic racism does not extend to algorithmic models. There’s no point in creating an algorithm that can automate a process if it doesn’t work for everyone the same way,” says Khan.
This research is supported, in part, by the [email protected] initiative.