Profit vs. Privacy: Impacts of Using Provably Private Data for Credit Scoring

Woman selling tomatoes in a local market receiving payment via mobile phone transfer | PC: Adobe Stock
Context
Financial service providers have traditionally relied on loan officers and standard financial data to make underwriting decisions, excluding many vulnerable people who may live far from bank branches or lack the financial history and documentation needed to obtain credit. The shift to digital loans, together with the ability to leverage new data sources and algorithmic decision-making, has the potential to lower costs and expand access to financial services.
Many of these new data sources, however, are highly sensitive. A rise in data breaches and data misuse has led to new restrictions on data collection, use, and sharing, which could limit the potential of digital financial channels to reach new populations.
These data restrictions reduce consumer harms, but at a steep cost. Researchers are now investigating whether algorithmic credit scores can be computed using Privacy Enhancing Technologies (PETs), safeguarding consumer data while still providing social welfare benefits.
Study Design
This project aims to characterize the loss of underwriting accuracy, and thus lender profit, from using provably private data to construct algorithmic credit scores. Accuracy, in this case, refers to the proportion of clients who repay their loans on time.
To provide privacy, researchers are turning to differential privacy, a leading PET that leverages random "noise" to hide the presence of individuals in a dataset. Although differentially private methods are among the most rigorous for guaranteeing privacy, they are best suited to aggregate statistics (e.g., median income for a region) rather than tasks that require learning about individuals, such as credit underwriting. Because of this, the team is using an approach similar to its work in Togo, applying a novel variant of differential privacy to the underlying data used by a partner fintech in Nigeria.
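The core noise-addition idea behind differential privacy can be sketched for a simple aggregate statistic. The function below is purely illustrative and is not the team's implementation or its targeted variant: it releases a mean income after clipping values to a known range and adding Laplace noise calibrated to the sensitivity of the mean and a privacy parameter epsilon (smaller epsilon means more noise and stronger privacy).

```python
import math
import random

def dp_mean(values, lower, upper, epsilon):
    """Differentially private mean via the Laplace mechanism (illustrative).

    Each value is clipped to [lower, upper], so any one person's record
    can shift the mean by at most (upper - lower) / n -- the sensitivity.
    Noise is drawn from a zero-mean Laplace distribution with scale
    sensitivity / epsilon, so smaller epsilon yields noisier answers.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Inverse-CDF sampling from a zero-mean Laplace distribution
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return sum(clipped) / n + noise

# Hypothetical incomes for a small region; a lower epsilon hides any
# single individual's contribution more strongly.
incomes = [120, 300, 250, 80, 410, 95, 310, 220]
print(dp_mean(incomes, lower=0, upper=500, epsilon=1.0))
```

With many records the sensitivity shrinks, so the same epsilon needs far less noise: privacy is cheapest for large aggregates, which is exactly why classic differential privacy struggles with per-individual decisions like underwriting.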
The research team is able to run simulations first without consumer privacy and then with progressively higher levels of privacy. For each simulation, researchers will build a model to predict which individuals should receive loans and compare those predictions to the actual underwriting decisions made by the lender. Leveraging the differences between their predictions and the lender’s actual decisions, the researchers will assess how payment outcomes and profit would change for the lender at each level of privacy protection.
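The simulation design above can be illustrated with a toy model. Everything here is an assumption for the sketch, not the team's actual pipeline: applicants get a synthetic creditworthiness score in [0, 1], the lender approves anyone above 0.5, and we measure how often the approve/deny decision survives Laplace noising of the score at each privacy level epsilon.

```python
import math
import random

def laplace(rng, scale):
    """Sample zero-mean Laplace noise via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def decision_agreement(epsilon, n=5000, seed=7):
    """Fraction of approve/deny decisions unchanged after noising.

    Scores live in [0, 1], so the per-record sensitivity is 1 and the
    Laplace scale is 1 / epsilon. Higher epsilon (weaker privacy) means
    less noise, so more decisions match the no-privacy baseline.
    """
    rng = random.Random(seed)
    same = 0
    for _ in range(n):
        score = rng.random()
        noised = score + laplace(rng, 1.0 / epsilon)
        if (score >= 0.5) == (noised >= 0.5):
            same += 1
    return same / n

# Agreement with the no-privacy decisions rises as epsilon grows.
for eps in [0.1, 1.0, 10.0]:
    print(eps, decision_agreement(eps))
```

Sweeping epsilon like this traces out a privacy-accuracy curve; in the actual study the analogous step compares model-based lending decisions and resulting profit against the lender's real outcomes at each privacy level.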
Results & Policy Lessons
Written by research team: Building on the theoretical foundation from the work in Togo, researchers applied the same privacy-enhancing technology to the data used in credit scoring by the Nigerian fintech. They measured the privacy-accuracy tradeoff of their approach against two baselines: (1) the "original" approach, where the data is used "as is" with no explicit privacy protections built in, and (2) the classic differentially private approach. The results are qualitatively similar to those from Togo.
Overall, the researchers found that (1) relative to the original approach, large increases in privacy can be achieved for relatively modest reductions in targeting accuracy, resulting in only a modest reduction in the firm's profit; and (2) relative to classic differential privacy, large increases in targeting accuracy can be achieved for relatively modest reductions in privacy. Taken together, the results suggest that targeted differential privacy interpolates between these two baselines, allowing decision-makers to better balance privacy concerns against the profitability that accurate targeting algorithms provide.
Interested readers can consult the working paper for more details about targeted differential privacy, the algorithm developed, and the experiments conducted.