New privacy technologies for better data sharing
Challenge
UNHCR and the Joint Data Center on Forced Displacement launched the UNHCR Microdata Library (MDL) in 2020 to safely share microdata with external researchers. However, because the data is sensitive, statistical disclosure control often allows only a small subsample to be shared, which limits the utility of the shared data.
Solution
Test the use of artificially generated data, differential privacy, and other privacy-enhancing technologies to generate dummy data from the MDL that can be safely shared with external researchers while maintaining a good level of utility.
Impact
Researchers have access to useful datasets that accurately convey the content of the MDL without breaching privacy. An improved evidence base drives improved initiatives to protect, empower, and include refugees.
Project impact
Other impact
UNHCR became one of the first humanitarian organizations to operationalize a hybrid approach combining synthetic data and differential privacy, a methodology independently validated at UNECE SDC2025, where National Statistical Offices were found to be using the same synthesizer family. The project produced a freely downloadable practical guide, integrated OpenDP libraries into the UNHCR Microdata Library, and positioned UNHCR as a recognized leader in privacy-enhancing technologies at global forums. The work was funded by the Data Innovation Fund and is directly replicable for other sensitive humanitarian datasets.