AI as a tool for International Development professionals

Pilot learnings

17 Jun

A blog by Seb Mhatre, an FCDO Pioneer

Pilot: Using LLMs as a tool for International Development professionals

International development organisations both publish and fund the publishing of tens of thousands of documents every year. There are strategy documents, policy documents, programme documents, evaluation reports, guidance notes, research, grey literature, contracts and tenders. Publishing documents about their activities helps ensure transparency. Stakeholders, including donors, partner countries, beneficiaries and the general public, can see where and how funds are being used. This openness, showing how resources are being allocated towards intended development outcomes, builds trust and supports accountability. Publishing information about their work fosters collaboration between development organisations, governments, the private sector and civil society, creating opportunities for new partnerships. Publishing policy documents can shape development agendas, influence other stakeholders and mobilise resources. Publishing programme documents, research and grey literature can share lessons on what works and what doesn’t and help other organisations understand the complexity of specific areas of international development.

However, there is a problem. Most people working in international development only have time to read a tiny fraction of the documents that are potentially relevant to their work. As a result, decisions are routinely made and tasks performed without a considerable amount of information that would have been useful. Lack of time is the overarching theme, but it can be broken down into more specific reasons why a development professional might not end up using relevant and useful information. Sometimes people don't actively search for information because they either don't know it exists or don't think they need it. Even if they do look for it, they may struggle to find what they need due to time constraints. Even when they find the information, it may be too complex to understand, too hard to distil into something usable or too hard to verify. Similarly, they may use information incorrectly because they assumed it was trustworthy or relevant when in fact it was unreliable, or reliable only in an unrelated context.

Advances in digital technologies continue to address the problems in knowledge management. Databases store vast amounts of information in an organised and structured manner. The internet connects individuals and organisations worldwide, facilitating the sharing of ideas, research findings and expertise. Search engines have made information retrieval far simpler and more effective, enabling us to sift through the sea of online content for useful information. However, one problem that remains is that the wealth of information contained within documents, like those published about international development, is unstructured data. The content does not follow a predefined, structured data model, and without that structure it is hard to analyse using traditional data science methods. A second problem is that, until recently, analysing the semantic content of unstructured data such as the text of documents relied on earlier Natural Language Processing (NLP) techniques. These can handle basic tasks like keyword extraction or tagging parts of speech, but they struggle with more complex semantic relationships. This meant that, even using what was then state-of-the-art NLP, it would not have been possible to ask a question about the information contained in a document and receive an accurate response in plain English.

In November 2022, OpenAI released ChatGPT, a chatbot built on a Large Language Model (LLM). An LLM is trained on a huge amount of data to predict the best next word in a sequence and generate realistic language content. ChatGPT was much better than most people expected. Since then, there has been an explosion of activity in the development of LLMs, with several billion-dollar companies in competition with OpenAI and a vibrant open source community. LLMs are already being used to power commercial chatbots, to scale content creation and to support new approaches to search. However, much of the interest in LLMs is due to the speed of innovation, both in the underlying LLM technology and in its application to new tasks. So the question is whether LLMs can help international development professionals to do their work by making it easier to use the knowledge stored in the huge number of published documents about international development.

My guess is that the answer is likely to be yes. Firstly, there are already examples in other domains of organisations using LLMs to support their teams with tasks that depend on extracting knowledge from documents. For example, law firms are using LLMs to find similar cases or to summarise past legal advice and case precedents. Secondly, LLMs are improving all the time, for example through larger context windows, multi-modal capabilities, better retrieval augmented generation (RAG) architectures, the use of knowledge graphs and the use of agent-based approaches.

One of the maxims of data science is ‘rubbish in, rubbish out’. This means that the potential for LLM tools within international development is limited by the usefulness of the information in the documents they use. So a good starting point for thinking about the potential value is to look at what useful information published documents within international development contain and what the limitations of that information are. For example, programme documents, like business cases or annual reviews, contain information such as what was done, when, where and how; what was achieved; what the relationships with other stakeholders were; and what the risks and lessons learned were. This has the potential to be valuable in common use cases within development organisations such as programme design (e.g. writing a business case, concept note or theory of change), programme implementation (applying lessons learned from other programmes), strategic thinking (creating a thematic or geographical portfolio summary) or accountability/transparency (enabling better tracking of programme activities).

This is the idea we are exploring through our Frontier Tech project, i.e. how to use LLMs to leverage the information within publicly available documents to support common evidence-based tasks in international development. The team consists of Robbie Phillips and myself from FCDO and Olivier Mills, Founder of Baobab Tech. In addition to hopefully creating a useful tool that anyone can use, and given the constantly evolving capabilities of LLMs, the focus of the pilot is on learning what best practice looks like when using LLMs and on creating a collaborative approach and community with other international development organisations. At the moment we are still in the early stages of the pilot. We have run some focus groups with FCDO colleagues, which suggest that we are on the right track and that there are some tasks where an LLM tool would be useful. We have also developed a basic RAG prototype that can retrieve information from the FCDO documents within the International Aid Transparency Initiative (IATI) dataset, such as annual reviews, programme completion reviews, evaluations and business cases. Over the next couple of months, the next steps are to keep testing and iterating on user needs, prompts and RAG recipes and to develop approaches for quality assurance and benchmarking performance. Over that period, there are two main measures of success. The first is to develop an LLM tool that improves the delivery of at least one task commonly performed by international development professionals, and to measure the value and quality assure the performance of the tool on that task. The second is to capture what we learn in a way that helps us and others think about what comes next.
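For readers unfamiliar with RAG, the retrieval step mentioned above can be illustrated with a minimal sketch. This is not the pilot's prototype: the document snippets, the identifiers and the function names `retrieve` and `build_prompt` are all hypothetical, and simple word-count similarity stands in for the embedding-based search a real system would more likely use. The sketch only shows the general pattern of finding the most relevant documents for a question and placing them in the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for programme documents; not real IATI records.
DOCUMENTS = {
    "annual_review_a": "The programme trained health workers and improved clinic attendance in rural districts.",
    "business_case_b": "The business case proposes cash transfers to households affected by drought.",
    "evaluation_c": "The evaluation found that teacher training improved literacy outcomes for girls.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Return the ids of the top_k documents most similar to the question."""
    query = Counter(tokenize(question))
    scored = [
        (cosine_similarity(query, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    # Drop documents that share no words with the question.
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(question, documents, top_k=1):
    """Assemble a prompt that asks the LLM to answer from the retrieved sources."""
    doc_ids = retrieve(question, documents, top_k)
    sources = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite them.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("What did the evaluation find about literacy?", DOCUMENTS))
```

Keeping the document identifier next to each retrieved passage is one simple way to let a user trace an answer back to its source, which matters for the quality assurance work described above.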

In the scenario where we do find a reliable value add from the tool, there are a few potential ways to build on that success over the next year. We can explore more tasks. Each task needs a customised approach, its own user needs analysis, performance metrics and its own quality assurance. Linked to that we can also explore new datasets. A bigger knowledge base with more documents (that contain useful information) leads to more opportunities to extract value and the potential for new tasks but also to a greater need for careful design of user experience and optimising the RAG approach. However, the most exciting way we can make further progress is by building/contributing to an international development community that is working together to explore LLM use cases in the international development space. A bigger community means more feedback, more experimentation, faster iteration and more awareness of the potential of LLMs. That greater awareness of both use cases and best practice then leads to more international development professionals using LLMs and more feedback into the design of future tools.

Next steps

As we make progress on learning about how to use LLMs for supporting work in international development, we will continue to write new blog posts. We are interested in collaboration and building a community of international development professionals interested in the use of LLMs, so if you are interested in or working on LLMs for International Development, or want to take part in one of our user needs focus groups or to test the prototype, please do get in touch by contacting jenny.prosser@dt-global.com.


Frontier Tech Hub

The Frontier Technologies Hub works with UK Foreign, Commonwealth and Development Office (FCDO) staff and global partners to understand the potential for innovative tech in the development context, and then test and scale their ideas.
