Unlike the early warning system, there wasn't a binary classification layer in this use case. There were various faults that the team wanted the system to identify, so that the right resources could be directed to fix them.
A binary classification system that simply categorised images as working or faulty wouldn't have provided this level of detail.
As such, the system focused mainly on object detection, identifying the various objects and the faults in the images.
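To make the distinction concrete, here is a minimal sketch (with hypothetical fault labels) of the difference in output between the two approaches: a binary classifier yields a single verdict, while an object detector reports each fault with its location.

```python
# Illustrative sketch with hypothetical fault labels: the difference in
# output between a binary classifier and an object detector.

# A binary classifier collapses the whole image into one coarse verdict,
# with no information about what is broken or where:
binary_result = "faulty"

# An object detector returns one entry per fault, each with a class and a
# bounding box, which is what lets maintenance teams direct the right
# resources to the right problem:
detection_result = [
    {"label": "leaking_tap",  "box": (412, 150, 520, 260), "confidence": 0.91},
    {"label": "cracked_pipe", "box": (80, 300, 310, 420),  "confidence": 0.84},
]
```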
Data Collection
As explained earlier, one of the key enablers for developing the solution was the huge number of images that the NWASH portal already contained. While these images were incorrectly labelled, the images themselves formed a strong body of data that could be used to train the system to identify faults and label the images correctly.
Some real-world challenges were observed during the project:
The first challenge encountered was scoping the data collection required to develop a model capable of achieving specific benchmarks, such as a predetermined level of accuracy. Real-world scenarios are inherently complex, necessitating careful consideration of the various factors relevant to the prediction task at hand.
The second challenge stemmed from the quality and consistency of the images. Issues included inconsistent object orientations, image noise (e.g. unwanted objects in frame), and object size (often too small for effective analysis). These factors presented significant obstacles, as conventional approaches struggled to adapt to the complexities of real-world situations. For instance, classification algorithms faltered in the presence of image noise, while dimensional analysis required well-oriented photographs free of obstructions. Moreover, the lack of uniformity among WASH objects across projects posed additional difficulties. For example, if there were 10 different types of taps across all projects, a sample of each type had to be incorporated into the training process; any type not represented in training would inherently be treated as an outlier. Until the Minimum Viable Product (MVP) was extensively used, identifying such outliers in the dataset remained challenging.
Training
The team took a supervised learning approach. They randomly selected a sample from the data, then worked with a WASH expert who manually looked through the images, identified specific faults in the infrastructure, and added correct labels. Leveraging the expert's knowledge allows you to create a training set that highlights the aspects of the images you are most interested in identifying. In this case, the expert could put more weight on the faults that were most important from a resource allocation perspective, and less weight on those that weren't.
Accordingly, supervised learning was the right approach in this context. If the team had set out to create an unsupervised model that identified every distinct object in an image, they would have ended up with numerous useless labels that didn't correspond to what they were interested in. By taking a supervised learning approach, they could create a simpler model that identified only the relevant objects within each image.
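As an illustration of what such an expert-labelled training set can look like, the sketch below uses the plain-text label format consumed by YOLO-family models; the class names, sample size, and directory layout are assumptions, not details from the project.

```python
from pathlib import Path
import random

# Hypothetical fault classes, ordered so the faults the WASH expert weighted
# most heavily for resource allocation come first.
FAULT_CLASSES = ["leaking_tap", "cracked_pipe", "broken_valve", "damaged_tank"]

def write_yolo_label(label_path: Path, annotations: list[dict]) -> None:
    """Write one YOLO-format label file: `class_id x_center y_center w h`,
    with coordinates normalised to [0, 1] relative to the image size."""
    lines = [
        f"{FAULT_CLASSES.index(a['label'])} {a['x']} {a['y']} {a['w']} {a['h']}"
        for a in annotations
    ]
    label_path.write_text("\n".join(lines))

# Randomly sample images for the expert to annotate, as described above.
all_images = sorted(Path("nwash_images").glob("*.jpg"))  # path assumed
sample_for_labelling = random.sample(all_images, k=500)  # sample size assumed
```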
Creating the model and fine-tuning it to the data
Rara Labs used a similar process to LUMs to develop the model. They used YOLO for object detection, fine-tuning its parameters to their specific use case. Through this fine-tuning process, they fed in the labelled training data and refined the model until it could distinguish between the different faults present in the images.
YOLO is a CNN-based object detection algorithm that offers high speed and moderately good accuracy on good-quality photographs. The challenge when tailoring it to a specific real-world use case is determining whether it works well with the data you're leveraging.
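The source doesn't specify which YOLO variant or framework was used; as a rough sketch, fine-tuning a pretrained model with the Ultralytics package looks like this:

```python
from ultralytics import YOLO  # pip install ultralytics

# Start from pretrained weights and fine-tune on the expert-labelled fault
# dataset. The model variant, dataset config, and hyperparameters are all
# assumptions; the source does not state which settings Rara Labs used.
model = YOLO("yolov8n.pt")  # small pretrained checkpoint

model.train(
    data="nwash_faults.yaml",  # hypothetical config: image paths + class names
    epochs=100,
    imgsz=640,
)
```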
When employing YOLO for object and fault detection, two critical factors were considered:
The dataset should predominantly comprise large or medium-sized objects.
The objects within the images should exhibit minimal clutter and overlap.
To ensure adherence to these criteria, the dataset was carefully curated and processed before training. Furthermore, a comprehensive set of guidelines was created for capturing photographs, to raise their quality for future use.
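One way to enforce the two criteria above is to screen each labelled image before it enters the training set; the sketch below uses illustrative thresholds rather than the project's actual rules.

```python
def passes_quality_checks(annotations: list[dict],
                          min_box_area: float = 0.02,
                          max_objects: int = 5) -> bool:
    """Keep an image only if its objects are reasonably large and the scene
    is uncluttered. Thresholds are illustrative, not the project's rules."""
    if len(annotations) > max_objects:
        return False  # too cluttered / too much overlap for reliable detection
    # With YOLO-normalised coordinates, w * h is the fraction of the image
    # each box covers, so small values flag objects too small to detect well.
    return all(a["w"] * a["h"] >= min_box_area for a in annotations)
```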
Testing
During the testing process, a random sample was selected from the data that hadn't been used in training. This unlabelled data was fed through the system, which picked out the faults present in the images. The next stage was to share the images the system had processed and labelled with an expert, who reviewed whether all the relevant faults had been identified.
Following this assessment, the team went through several iterations, adjusting the model based on feedback from the WASH expert. Ultimately the model achieved roughly 96–98% accuracy for object detection.
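A minimal sketch of this testing loop, assuming the Ultralytics API and hypothetical paths, would run the fine-tuned model over the held-out images and surface its predictions for expert review:

```python
from ultralytics import YOLO

# Run the fine-tuned model over a held-out sample and save the annotated
# images for the WASH expert to review, mirroring the loop described above.
# Paths and the confidence threshold are assumptions.
model = YOLO("runs/detect/train/weights/best.pt")

results = model.predict(source="nwash_images/holdout/", conf=0.25, save=True)
for r in results:
    for box in r.boxes:
        print(r.path, model.names[int(box.cls)], float(box.conf))
```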
While refining the model is an ongoing process, the initial findings from the first phase of work are promising. To understand the real impact, we need to appreciate the challenges of manual data collection, where someone must survey WASH assets one by one, answering several questions about each; data collected this way is highly likely to contain some level of error.
The team is currently exploring new use cases, such as assessing the size of reservoirs and the height of taps in the images contained in the portal. Furthermore, there is no one-size-fits-all solution across these use cases, so experiments with classification and segmentation are also underway.
Classification can be used to separate objects without faults from those with faults, and to distinguish between types of fault, while segmentation can be used to identify the boundaries between objects within an image. Among the three approaches, however, the pilot concentrates mostly on object detection for the final MVP. With the findings from the classification and segmentation experiments, a hybrid approach can be developed for better, more accurate coverage of the different use cases.
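For reference, the YOLO family ships variants for all three approaches, which is part of what makes a hybrid pipeline practical; the checkpoint names below are the standard Ultralytics ones, used purely as an illustration of this direction.

```python
from ultralytics import YOLO

# The YOLO family ships variants for all three approaches, which makes a
# hybrid pipeline practical. Checkpoint names are the standard Ultralytics
# ones, used here purely as an illustration of the direction described.
detector   = YOLO("yolov8n.pt")      # detection: fault class + bounding box
classifier = YOLO("yolov8n-cls.pt")  # classification: faulty vs working, fault type
segmenter  = YOLO("yolov8n-seg.pt")  # segmentation: pixel-level object boundaries
```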