In 2020, as COVID-19 lockdowns began, the Togolese government launched an emergency social protection program, “Novissi,” to distribute cash to its poorest citizens. In the absence of a traditional social registry to determine eligibility for the program, the government needed a fast and accurate way to target the cash to those who needed it most. A team of CEGA and IPA researchers worked with government officials to develop a novel targeting approach leveraging mobile phone data and machine learning. The government had first considered targeting benefits to specific geographic regions; compared to this alternative, the researchers’ approach reduced exclusion errors (people who should have received benefits but did not) by 4–21%.
Big data and machine learning hold extraordinary potential to improve the efficiency of aid delivery. However, as with similar uses of mobile phone data in digital financial services, there are important privacy implications to consider: individuals can be identified from just a handful of cell phone location data points. This identifiability is useful for delivering cash transfer relief payments to the right people, but in the wrong hands the same data could be used to target minorities, dissenters, or political officials.
This project seeks to analyze the loss of accuracy that would result from using provably private data to construct targeting algorithms (the “privacy-accuracy tradeoff”). Accuracy is measured in multiple ways, such as the proportion of people correctly assigned to receive (or not receive) a cash transfer, while privacy refers specifically to differential privacy, the “gold standard” for provable privacy in statistical tasks. Conceptually, differentially private methods strategically add random “noise” to guarantee that very little can be learned about any individual in the dataset. Some methods insert noise directly into the data, while others add noise to aggregate statistics computed from the data. For example, the US Census Bureau publishes detailed records that users can query, but only after adding a calculated amount of noise to the statistics so that individuals or housing units cannot be identified.
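As a concrete illustration of the second approach, the sketch below applies the standard Laplace mechanism to a single aggregate statistic. It is a minimal example, not code from this project: the function name, the count, and the epsilon value are assumptions chosen for illustration.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a statistic with epsilon-differential privacy by adding
    Laplace noise with scale sensitivity / epsilon."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise

# Example: privately release a count of beneficiaries in a region.
# Adding or removing one person changes a count by at most 1,
# so the sensitivity is 1. Smaller epsilon means more noise, more privacy.
true_count = 5_432
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
```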
Using differential privacy for targeting algorithms poses a unique challenge because targeting for humanitarian relief requires learning about individuals: classifying whether an individual falls below a wealth threshold, and so should receive cash transfer payments, requires knowing their place in the population’s wealth distribution. In other words, aggregate statistics (e.g., median income for a region) are insufficient.
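To see why, consider the toy example below: two hypothetical regions share the same median income yet contain very different numbers of people below an eligibility threshold. All values are made up for illustration.

```python
import numpy as np

threshold = 2.0  # hypothetical eligibility cutoff

# Two made-up regions with identical medians but different poor populations.
region_a = np.array([1.0, 1.5, 3.0, 3.2, 3.4])
region_b = np.array([2.5, 2.8, 3.0, 3.2, 3.4])

print(np.median(region_a), np.median(region_b))                    # 3.0 and 3.0
print((region_a < threshold).sum(), (region_b < threshold).sum())  # 2 and 0
```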
To proceed, the researchers will first need to construct a novel variant of differential privacy that provides mathematical guarantees of privacy at the individual level, rather than only for aggregate statistics. Having demonstrated that this new definition of privacy is needed and that their algorithms satisfy it, they will analyze the privacy-accuracy tradeoff of applying this differentially private method: what percentage of people who should receive a cash transfer will be misclassified as ineligible, and vice versa?
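The flavor of that tradeoff can be previewed with a toy simulation. The sketch below is not the researchers’ method (their individual-level privacy variant is not published here); it simply adds per-record Laplace noise to simulated wealth scores and measures how classification errors grow as the privacy budget epsilon shrinks. The wealth distribution, sensitivity bound, and epsilon values are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated wealth scores for 10,000 people; the poorest 30% should
# receive the transfer. All numbers are illustrative, not project data.
wealth = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)
threshold = np.quantile(wealth, 0.30)
eligible = wealth < threshold

# Crude per-record sensitivity bound: the full range of observed scores.
sensitivity = wealth.max() - wealth.min()

for epsilon in [0.1, 1.0, 10.0]:
    noisy = wealth + rng.laplace(0.0, sensitivity / epsilon, size=wealth.size)
    classified = noisy < threshold
    # Exclusion error: share of eligible people the noisy rule misses.
    exclusion = (eligible & ~classified).sum() / eligible.sum()
    # Inclusion error: share of ineligible people the noisy rule pays.
    inclusion = (~eligible & classified).sum() / (~eligible).sum()
    print(f"epsilon={epsilon:5.1f}  exclusion={exclusion:.1%}  inclusion={inclusion:.1%}")
```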
Results from this project are forthcoming.