Natural language processing (NLP) is a sub-field of AI focused on the analysis, interpretation and generation of language by computers. It combines rule-based deconstruction of the complexities of human language with AI modelling to create applications that can process the meaning of texts and generate text of their own (IBM, n.d.).

NLP can be broadly broken down into two sub-categories:  

  • Natural language understanding (NLU), which is focused on the interpretation of human language.

  • Natural language generation (NLG), which is focused on the creation of human language (DeepLearning, 2023).

When combined, these two functions act as the foundation for a powerful toolkit of applications which can be leveraged to tackle many of the challenges international organisations currently face. Whether that means synthesising a distributed evidence base into summaries that inform programme design, or providing farmers with up-to-date knowledge of best practices tailored to their farms, there is growing evidence of the potential impact of NLP in international development. Before we explore the scope of this potential impact, it will be useful to understand some of the key NLP tasks and to untangle NLP from two related concepts: Generative AI (GenAI) and Large Language Models (LLMs).

What can NLP be used for?  

The items listed below are distinct tasks that fall under NLP. Any given use case might involve several of these tasks at different steps in the development process. For example, a customer service chatbot would involve a classification layer to categorise queries into specific classes, a summarisation layer to synthesise the relevant policies and information, and a text generation layer to produce a response; a rough sketch of such a pipeline is given below.
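To make this concrete, here is a minimal sketch of such a pipeline built with the open-source Hugging Face transformers library. The query, category labels, file path and generation settings are illustrative assumptions rather than part of any particular deployment, and in practice each layer would use a carefully chosen model.

from transformers import pipeline

query = "My seed delivery has not arrived yet. What should I do?"

# 1. Classification layer: route the query into one of a set of predefined categories.
classifier = pipeline("zero-shot-classification")
categories = ["delivery", "payments", "agronomic advice"]  # hypothetical categories
topic = classifier(query, candidate_labels=categories)["labels"][0]

# 2. Summarisation layer: condense the policy document for that category.
summariser = pipeline("summarization")
with open(f"policies/{topic}.txt") as f:  # hypothetical policy store
    policy_text = f.read()
policy_summary = summariser(policy_text, max_length=80, min_length=20)[0]["summary_text"]

# 3. Text generation layer: draft a reply grounded in the summarised policy.
generator = pipeline("text-generation")
prompt = (
    f"Customer question: {query}\n"
    f"Relevant policy: {policy_summary}\n"
    "Helpful reply:"
)
reply = generator(prompt, max_new_tokens=100)[0]["generated_text"]
print(reply)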

You can find some of the key tasks below; a short code sketch after the list illustrates how a few of them can be run with off-the-shelf tools:

  • Text classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness (Hugging Face, n.d.).

  • Sentiment analysis is the process of identifying the underlying emotion behind a piece of text. A common use case is identifying positive or negative reviews, but it also has more nuanced applications (Jagota, 2020).

  • Summarisation involves condensing a longer document into a concise summary which captures its key concepts (Accern, 2023).

  • Chatbots are conversational agents which can process and respond to questions in a natural manner – they are able to “remember” information from previous questions (Pollock, 2023).

  • Question answering involves answering a question posed about the content of a document or set of documents.

  • Table question answering involves answering a question posed about the data in a table (Hugging Face, n.d.).

  • Named entity recognition involves extracting different pieces of information within a text and classifying them into predefined categories (Turing, n.d.).

  • Text generation uses models which can produce texts in a specific style or format (DeepLearning, 2023).

  • Translation involves detecting the language used in a text and translating it into another language.
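As a brief illustration of how some of these tasks look in practice, the sketch below runs sentiment analysis, named entity recognition and translation through off-the-shelf pipelines from the Hugging Face transformers library. The example sentences are illustrative, and the default models the library downloads are placeholders rather than recommendations.

from transformers import pipeline

# Sentiment analysis: label a piece of feedback as positive or negative.
sentiment = pipeline("sentiment-analysis")
print(sentiment("The training sessions were extremely useful for our cooperative."))

# Named entity recognition: extract entities and assign them to predefined categories.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("The FAO office in Nairobi partnered with a local telecoms provider in 2021."))

# Translation: convert English text into French (translation pipelines are
# direction-specific; detecting the source language would be a separate step).
translator = pipeline("translation_en_to_fr")
print(translator("Rainfall is expected to be below average this season."))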