Healthcare is a complicated sector with highly specialised jargon. It was therefore crucial for the team to create a model that could cut through that jargon and hold natural, human-like conversations that users could understand.

For the kind of application Inavya set about building, it was also incredibly important that the model could process both the context and the detail of user requests. One way to improve an LLM's performance on language tasks in a specific sector is Retrieval Augmented Generation (RAG).

RAG is a hybrid framework that combines the baseline generative power of an LLM, its ability to create human-like text, with a semantic similarity search function. We'll explore this in more detail later in the case study, but in essence the semantic search draws information from specific sources that were not part of the LLM's training data and uses that information to augment the model's baseline generative abilities, i.e. its "knowledge".

Consider this analogy:

Before doctors train, they already know how to understand language and hold a conversation; this is equivalent to the baseline functionality of the generative model. During training they also master a body of medical knowledge. When they first start talking with a patient, they use their pre-existing language understanding and conversational ability, but draw only on the body of medical knowledge they were trained on, not on any new information (until the patient starts sharing). During the consultation, though, the doctor might also want to retrieve new information from a knowledge base, or pull up the patient's files, information that was never part of their training. That "in the moment" querying of data is RAG. It combines the LLM's pre-existing conversational ability and capacity to process new data with the specific knowledge held in the knowledge base you connect it to.

The technique of retrieval-augmented generation is therefore well suited to enterprise use cases such as healthcare. By augmenting the generative power of an LLM with a knowledge base of specific information, which may include personal and private data, it can be used to build chatbots that process the unique requirements of individual user requests and maintain maximum relevance within a specific domain. This context-specificity, and the opportunity for personalisation, provided a strong foundation for a tool that could meet the needs surfaced in the assessment phase.
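To make the two RAG steps concrete, here is a minimal sketch of the retrieve-then-augment loop. This is purely illustrative and not Inavya's actual implementation: the documents, the bag-of-words "embedding", and the prompt template are all assumptions standing in for a real embedding model, vector store, and clinical knowledge base.

```python
import math
import re
from collections import Counter

# Toy knowledge base standing in for domain documents that were
# never part of the LLM's training data (contents are illustrative).
KNOWLEDGE_BASE = [
    "Metformin is a first-line medication for type 2 diabetes.",
    "Hypertension is persistently elevated blood pressure.",
    "Statins lower LDL cholesterol and reduce cardiovascular risk.",
]

def embed(text: str) -> Counter:
    """Bag-of-words vector; a production system would use a neural embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """The semantic-similarity search step: rank documents against the query."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """The augmentation step: retrieved context is prepended to the user
    request before the combined prompt is sent to the generative model."""
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using the context above."

docs = retrieve("What medication treats type 2 diabetes?")
prompt = build_prompt("What medication treats type 2 diabetes?", docs)
print(prompt)
```

The generative model never sees the whole knowledge base, only the top-ranked snippets for the current request, which is what keeps the conversation both domain-relevant and, where patient files are involved, scoped to the right person.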