Machine learning systems can identify patterns in data and use those patterns to make predictions.
There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning
In 2019, one of our FCDO colleagues saw that, despite South African miners being entitled to recompense if they had developed TB or silicosis, very few people were receiving payments. The issue was a massive backlog of X-ray scans, and too few doctors available to read them. In collaboration with the University of British Columbia, we developed an AI tool capable of distinguishing healthy X-rays from those that needed to be checked for respiratory diseases.
In supervised machine learning, you start with a training set of labelled data: in this case, a set of X-rays correctly labelled by a doctor as either silicosis present, TB present, or neither present. The system is fed this data and trained to identify the patterns that link each input to its label. As it learns these patterns, it adjusts its internal parameters so that inputs are accurately matched with the correct labels. Once the model is trained, you can feed it data that was not part of the training set, and it will predict the correct label for examples it has never seen.
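To make this concrete, here is a minimal, purely illustrative sketch of supervised learning in Python using scikit-learn. The data is synthetic; the actual X-ray tool described above used real labelled scans and a far more sophisticated model.

```python
# Minimal supervised-learning sketch using scikit-learn.
# The data here is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labelled training data: feature vectors (X) and
# labels (y), e.g. 0 = "nothing present", 1 = "needs review".
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)   # learn the patterns linking inputs to labels

# Predict labels for data the model has never seen.
print("Accuracy on unseen data:", model.score(X_test, y_test))
```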
Unsupervised learning
Unsupervised learning, on the other hand, involves a machine learning application identifying patterns within a dataset and using those patterns to uncover the structure of the data. It does this without being explicitly trained by humans to identify what those structures will be. The key difference is that the training data in unsupervised learning is not labelled, so the system has no preset range of expected outputs around which to organise the data.
One use case of unsupervised learning is clustering, which groups data points based on the similarities and differences between them. One example of where this technique is commonly used is customer segmentation. Suppose you have a database of information about different customers' spending habits, e.g. the frequency of their purchases, the kinds of products they buy, and the amount they spend. The machine learning application will look for clusters of similarity between the datapoints and use these to create organic groupings. The system might group together certain customers who make frequent, large purchases, and separately those who make infrequent large purchases. These clusters can then be used by the company to classify customers into specific groups, without the system ever being explicitly trained to sort the data into predefined categories (IBM, n.d.).
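A minimal sketch of clustering, using scikit-learn's KMeans on invented customer figures (monthly purchase frequency and average spend); the numbers are placeholders chosen only to show how unlabelled data falls into groups.

```python
# Minimal clustering sketch with scikit-learn's KMeans.
# Each row is an invented customer: (purchases per month, average spend).
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([
    [12, 250], [10, 300], [11, 280],   # frequent, large purchases
    [2, 310],  [1, 295],  [3, 270],    # infrequent, large purchases
    [15, 20],  [14, 25],  [13, 18],    # frequent, small purchases
])

# No labels are provided; the algorithm groups customers by similarity alone.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)   # cluster assignment for each customer
```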
Reinforcement learning
One of the most famous examples of reinforcement learning comes from DeepMind’s AlphaGo. In 2016, AlphaGo shocked computer scientists by beating a world champion in the ancient game of Go – a territory-based board game (Google, n.d.). In a strategy game like Go, players make sequential decisions in a dynamic environment (the optimal move changes depending on what the other player does), under uncertainty (you don’t know what the other player will do next). In these circumstances, a machine learning technique called reinforcement learning is particularly effective.
In reinforcement learning, an AI system is placed in an environment – in our example, the starting position of a game of Go. It then makes a decision in that environment, which is either rewarded or punished. The rewards relate to actions that the developers identify as the goal of the system, such as making an optimal move. The system is trained to try to receive as many rewards as possible. Over time, as it receives punishments and rewards, it gradually learns to pick sequences of decisions that maximise its rewards and minimise its punishments (Bhatt, 2018). AlphaGo simulated numerous games of Go, following various decisions until it ultimately identified strategies that maximised its reward function.
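The reward-driven learning loop can be illustrated with a deliberately tiny example: tabular Q-learning on a five-state corridor, where moving towards the final state earns a reward. This is only a sketch of the general technique; AlphaGo combined deep neural networks with tree search and self-play at a vastly larger scale.

```python
# Toy reinforcement-learning sketch: tabular Q-learning on a 5-state corridor.
import random

n_states = 5
actions = [-1, +1]                        # move left or move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1     # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != n_states - 1:
        if random.random() < epsilon:
            action = random.choice(actions)                      # explore
        else:
            action = max(actions, key=lambda a: Q[(state, a)])   # exploit
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else -0.01    # reward vs. punishment
        best_next = max(Q[(next_state, a)] for a in actions)
        # Update the estimated value of this state-action pair.
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the learned policy is to always move right towards the reward.
print({s: max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states - 1)})
```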
Natural Language Processing and Transformers: the birth of Generative AI
The field of Natural Language Processing (NLP) underwent a revolutionary change with the introduction of the Transformer architecture in the 2017 paper "Attention is All You Need" by Vaswani et al. This new approach, built around a mechanism called "attention", allowed models to weigh the relevance of every word in a sequence against every other word. Transformers could look at entire sequences of text at once, rather than processing words one by one, enabling them to capture context and relationships between words much more effectively than previous methods.
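For the technically curious, the core of the Transformer can be sketched in a few lines: scaled dot-product attention lets every position in a sequence attend to every other position at once. The matrices below are random placeholders; in a real model they are learned projections of the input text.

```python
# Minimal sketch of scaled dot-product attention ("Attention is All You Need").
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # relevance of each token to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the whole sequence at once
    return weights @ V                                 # weighted mix of all token representations

seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((seq_len, d_model)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)     # (4, 8): one vector per token
```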
The true potential of Transformer-based models became apparent to the wider public with the release of GPT-3 (Generative Pre-trained Transformer 3) by OpenAI in 2020. This massive language model, with 175 billion parameters, demonstrated an unprecedented ability to generate human-like text, understand context, and perform a wide range of language tasks with minimal task-specific instruction. The release of GPT-3 marked a turning point in public perception of AI, showcasing capabilities that seemed almost magical to many observers. It sparked widespread discussions about the potential applications and implications of such powerful language models, bringing AI into the mainstream consciousness like never before.