
Leveraging AI for community-based protection

As millions fled Ukraine, their online conversations ballooned. Using new technologies, those exchanges could inform a more efficient, data-driven response.

Poland. Polish border town welcomes largest number of refugees fleeing Ukraine. Photo: UNHCR/Maciej Moskwa.

Since Russia’s full-scale invasion of Ukraine in 2022, more than 5.7 million Ukrainians have fled the country, around 600,000 of them finding themselves in Slovenia, Czechia, Slovakia, and Hungary. In their scramble to set up new lives in safety, the displaced turned to trusted sources of information: each other. “Informal communication within our community can be challenging,” a Ukrainian refugee in Prague told UNHCR, the UN Refugee Agency, “but it’s often the most helpful way to get information.”

Much of this information exchange took place in digital spaces, with online activity — particularly on Telegram and Facebook — ballooning during initial waves of displacement. “The people fleeing Ukraine were probably on average much more digitally literate compared to other responses that UNHCR has been working on — meaning that they’ve been using digital tools and platforms at a much higher rate, and accessing services and information in a slightly different way to what we’re used to,” explains Márton Elődi, former Digital Expert with UNHCR Hungary. “This creates both new challenges and also possibilities for UNHCR.”

These online interactions generated a flood of unstructured data about the communities UNHCR works with and for — a potentially valuable resource to inform more targeted and timely interventions, if it could be responsibly and efficiently analyzed. Rapid advancements in generative artificial intelligence (GenAI), specifically large language models (LLMs), made such analysis possible. So, Márton and Protection Officer Antonia Haegner set about leveraging this technology to better understand community concerns — prototyping a custom-built solution tailored to the needs of refugees and aligned with UNHCR’s data protection and operational standards.

A lifeline and a risk

When the war began, displaced people quickly turned to digital communities for support. These platforms offered fast, localized information but also introduced risks like misinformation and oversharing of sensitive data. While helpful, the surge in online engagement created new protection challenges.

“We had to acknowledge that we are kind of unable to directly respond to these risks from a technical perspective,” Márton recalls. UNHCR couldn’t comprehensively monitor these communities and remove problematic posts directly, so the Hungary Operation instead embarked on a project — supported through the Digital Innovation Fund — to raise awareness about online safety and digital protection. Called Wise Browsing, Safe Posting, that project developed and disseminated educational materials via YouTube, Instagram, Facebook, and Telegram. The campaign eventually reached more than 5 million people across Slovenia, Czechia, Slovakia and Hungary.

This work, focusing more on traditional community-based protection techniques, also sparked a new idea. “We realized that the type of questions community members are asking and the information they’re sharing could also help to guide our response strategies,” Márton says. By monitoring this activity, “we can better understand their perspectives and coordinate responses accordingly.”

A flood of information

UNHCR was already keeping an eye on online interactions, but there was no out-of-the-box solution to enable comprehensive monitoring. As a result, efforts were manual, ad hoc, and resource intensive. The sheer volume of online engagement posed a critical challenge. In the early days of the response, UNHCR Hungary was seeing more than 500 relevant posts/messages on Facebook daily, and thousands of Telegram messages weekly. As Antonia notes, “Community based protection is the bread and butter of UNHCR, but the data points are soon overwhelming.”

Manual monitoring made it impossible to achieve a full picture, and it meant certain protection issues could be missed. For instance, UNHCR made an effort to map all young Ukrainian sports teams arriving in Hungary, because often the players, who were minors, were traveling without their usual caregivers. One team only came to their attention in 2024, even though the coach had posted a relevant message to a public Telegram channel in 2022. “Potentially we could have identified their presence two years earlier,” Márton notes, “meaning more assistance and more support for these 20 to 30 children.”

A thorough review indicated that existing commercial and open-source monitoring tools focused on keyword tracking and brand monitoring, rather than on generating the type of nuanced, community-based insights needed for protection work. GenAI made such insights possible, but it also raised data privacy and ethics concerns. Before doing anything else, Márton and Antonia, with support from colleagues, worked hard to understand these concerns and the perspectives of the communities they hoped to support.

Gathering community insights

UNHCR Protection teams under the Representation for Central Europe consulted with more than 100 refugees across four countries through twelve focus group discussions, which aimed to understand community perspectives on media monitoring and information provision. In addition, they gathered insights from expert organizations, partners, and national media authorities. “Obviously, the most important part was the community engagement and to get the support and approval of the refugees themselves by first giving them a full picture of how this would work, how we would use their data, what kind of channels we would be monitoring, and how they could provide feedback to us,” Márton says.

 
The project consulted with more than 100 refugees across four countries. Photo: UNHCR/Marton Elodi.

Interestingly, most of the feedback suggested that Ukrainians were aware the information they were sharing was public, and they were already expecting humanitarian organizations to be collecting and using it. Ideally, however, they indicated that they would like to see evidence of how the information and feedback gathered from them was used to inform programming. “If monitoring our online communities helps provide timely assistance, especially in urgent situations involving minors or people with disabilities, I think it’s beneficial,” said one participant in a discussion in Czechia’s Hradec Kralove.

At the same time, the team conducted separate research to understand and define responsible AI in this context, in order to be able to design a solution grounded in human rights, personal data protection and privacy, and humanitarian ethics. To ensure long-term impact and cross-team learning, they initiated the development of standard operating procedures for ethical online monitoring, aligned with UNHCR’s framework on Accountability to Affected People (AAP).

Mapping community concerns

It took Márton, Antonia, and a network of technical colleagues more than a year to complete the project’s preparatory phases, including community consultations, compliance processes, human rights due diligence, and privacy safeguards (including a comprehensive data protection impact assessment). Then, they set about engineering prompts and testing the utility of available OpenAI models — privately deployed and running on UNHCR’s Microsoft Azure cloud to ensure complete data protection — to analyze the information being generated by online conversations. “The project kind of turned into this data-driven intelligence platform,” Márton says, “exploring how we can use data from different sources in different formats so that they all inform the same protection work.”

This process — designed to render information from these divergent data sources in a uniform format, to make it easily searchable and comparable — was guided by three core principles:

  • Stay in control.
  • Everything must be verifiable.
  • Evidence-based process must be traceable from start to finish.

To address the second point, the team built a user interface that would enable feedback on the results to be given line-by-line, with specific information either accepted or rejected, to ensure accuracy. The result is a system that maintains responsible AI principles with mandatory human oversight and full output traceability.
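As a rough illustration of that line-by-line review, the following Python sketch shows one way accepted and rejected lines could be tracked while keeping a link back to the source post. The class, field names, identifiers, and sample text are all hypothetical and are not taken from UNHCR’s tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractedLine:
    """One statement produced by the model (hypothetical schema)."""
    text: str                        # the model's output line
    source_id: str                   # ID of the originating post, for traceability
    accepted: Optional[bool] = None  # None until a human has reviewed the line
    reviewer: Optional[str] = None   # who made the decision


def review(line: ExtractedLine, reviewer: str, accept: bool) -> ExtractedLine:
    """Record a human decision on a single line."""
    line.accepted = accept
    line.reviewer = reviewer
    return line


def verified(lines: List[ExtractedLine]) -> List[ExtractedLine]:
    """Keep only lines that a human explicitly accepted."""
    return [line for line in lines if line.accepted is True]


# Placeholder data, invented purely for illustration.
lines = [
    ExtractedLine("Question about school enrolment", "post-001"),
    ExtractedLine("Request for housing information", "post-002"),
    ExtractedLine("Unclear statement", "post-003"),
]
review(lines[0], reviewer="officer-a", accept=True)
review(lines[1], reviewer="officer-a", accept=False)
# lines[2] is left unreviewed, so it is excluded from verified output.

accepted_lines = verified(lines)
```

In this sketch an unreviewed line is treated the same as a rejected one: nothing reaches the verified output without an explicit human decision, and each accepted line still carries the identifier of the post it came from.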

The testing was startlingly successful from the beginning. With careful prompt engineering, very little fine-tuning was required. “The fact that our initial prompts were producing reliable and useful results was something that was pretty impressive for me,” Márton recalls, “but also at first surprising, because I wasn’t expecting to make it work so easily.” The team was able to review and instantly classify information, using a labelling system that enabled them to, among other things, disaggregate information according to age, gender, and diversity (AGD) characteristics.

Among its key features, the tool maps community concerns by identifying locations mentioned across online platforms and displaying them on a heat map. Users can filter by date to track trends over time, helping teams quickly assess protection challenges in specific areas before missions or interventions.
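The aggregation behind such a map can be pictured with a short Python sketch: count how often each location is mentioned within a chosen date range, then hand those counts to whatever mapping layer draws the heat map. The data structure, function name, and sample records below are assumptions made for illustration; the article does not describe how the tool itself is implemented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class Mention:
    """One location mention extracted from a public post (hypothetical schema)."""
    location: str
    posted_on: date
    source: str  # e.g. "telegram" or "facebook"


def location_counts(mentions: Iterable[Mention], start: date, end: date) -> Counter:
    """Count mentions per location with a posting date in [start, end], inclusive."""
    return Counter(
        m.location for m in mentions if start <= m.posted_on <= end
    )


# Placeholder records, invented purely for illustration.
mentions = [
    Mention("Town A", date(2022, 3, 1), "telegram"),
    Mention("Town A", date(2022, 3, 2), "facebook"),
    Mention("Town B", date(2022, 3, 2), "telegram"),
    Mention("Town A", date(2022, 4, 10), "telegram"),
]

# Filter to March 2022; the April mention falls outside the range.
march = location_counts(mentions, date(2022, 3, 1), date(2022, 3, 31))
```

Running the same function over successive date ranges would give the trend over time that the date filter is meant to expose.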

Accounting for imperfect tools

A key risk of LLMs is that they are nondeterministic, meaning, as Márton puts it, “that for the same input they might be generating a different output.” AI-based algorithms and analysis are much easier to break than traditional statistical methods. Moreover, we don’t fully understand how these models work; their internal logic is somewhat opaque. Importantly, too, LLMs are better at extracting data from certain document formats and certain languages than others, which could pose more of a challenge when seeking to develop a standardized, widely used tool. Nevertheless, the project’s rigorous research and three core principles helped to mitigate these issues through verification, triangulation, and follow-up action, and to ensure accurate, verifiable results.

While the project did not set out to train a bespoke GenAI tool on specific datasets, Márton acknowledges that this would have manifold benefits. “UNHCR could play a very crucial role in making AI more inclusive and more specialized and more safe, of course,” he says.

Identifying potentially life-saving opportunities

As resources shrink, being able to efficiently understand refugee concerns and respond to them becomes ever more essential. While many are concerned about the rapid advancements in AI, it undoubtedly creates new ways for humanitarians to enhance effectiveness and close the feedback loop with communities — if deployed creatively and responsibly. This project carefully addressed potential pitfalls and uncovered key opportunities this technology presents for UNHCR’s protection work, developing a scalable, technical solution that could be deployed across different contexts.

The team’s highly effective, fully documented prototype software — grounded in community preferences and aligned to humanitarian principles — is now available to be tested by different UNHCR Operations for different use cases. Its rapid, comprehensive analysis of community-produced data from different sources has high potential — particularly in new displacement situations when a high volume of data is generated.

As Márton notes, “In emergencies, this tool helps preserve and interpret fast-moving information.”

Find out more about UNHCR’s Digital Innovation work.