On Uppsala’s Old Town Square, along the banks of the Fyris river, stands the University of Uppsala’s Institute for Peace and Conflict Resolution. The institute oversees the Uppsala Conflict Data Programme (UCDP), the world’s leading producer of conflict data; its data are the most widely used on armed conflict and its consequences. Together with RISE, UCDP is investigating whether its work can be automated using artificial intelligence.
– “Our analysts review around 50,000 pieces of news per year and record between ten and twelve thousand events in the database,” says Kristine Eck, Senior Lecturer at UCDP.
The quality of the data produced at UCDP is unparalleled. For example, all major reports from various UN agencies that have studied armed conflict in recent years are based on data from UCDP. The data are also used in the World Bank’s World Development Indicators and serve as the basis for World Health Organization (WHO) conflict epidemiology. From across the globe, students, journalists, researchers and international organisations rely on UCDP to obtain systematic and up-to-date information on conflicts occurring around the world.
Collaboration with leading operators
In the type of work carried out by UCDP, which involves analysing news and then categorising and storing the information in a structured database, the question of automation is bound to arise. And UCDP is no exception.
– “We’ve been asked about automation,” says Eck. “But due to the kind of data and operations we have, the technology has not existed before. We’re not completely sure that it does now, but, in order to find out, we felt we had to collaborate with experts in the field.”
Challenge for RISE
UCDP approached RISE with two questions: ‘Can what we do be automated?’ and ‘Will someone else be able to do something similar?’ Because reports come from diverse sources of varying credibility and relevance, and a single article sometimes reports multiple incidents, it was – suffice it to say – a challenge for RISE.
– “There are primarily two real challenges when it comes to machine learning: the scope of information taken into account by UCDP analysts when annotating texts, and the quality of the data,” explains Fredrik Olsson, Senior Researcher at RISE. “Each event processed by UCDP analysts is described according to twenty or so attributes, including where the incident took place, who was involved, and how many people died. Each attribute can comprise anything from three to more than 4,000 values. What this means is that there is essentially a lack of examples from which a machine can learn. The other challenge – data quality – has to do with the work process and raw data.”
Focus on data quality
UCDP considers data quality absolutely critical, with no room for compromise.
– “Stipulations on quality are non-negotiable,” says Eck. “Our data are used by the UN and the World Bank, and in the unlikely event that an incident in the database is miscoded, it could have major political consequences.”
For this reason, the collaboration with RISE has mainly focused on the broader issues, and not on ways to simplify matters for UCDP and its employees.
– “Above all, we want to ensure that technology does not overtake us,” concludes Kristine Eck.