Last updated: May 9, 2020
The goal of this project is to create a ROS-integrated computer vision system for mosquito detection and keypoint identification to guide an automated mosquito dissection robotic system for live malaria vaccine production. The target keypoints on the mosquito to be identified include the proboscis, the head, and the neck. This project is in conjunction with Sanaria Inc. (Rockville, MD).
Malaria is a mosquito-borne disease that affects humans and is caused by single-celled parasites of the Plasmodium group. In 2017, there were over 200 million clinical cases of malaria, causing over 435,000 deaths and a loss of over $12 billion USD in Africa alone. Despite this impact and the clear need, there is currently no effective malaria vaccine on the market. However, Sanaria, a biotechnology company in Rockville, MD, has recently developed a live malaria vaccine that has been shown to be up to 100% effective in clinical trials. These vaccines are made from attenuated Plasmodium falciparum sporozoites (PfSPZ), the very parasite that causes malaria. Because the vaccine is live, the sporozoites must be cultivated within live mosquitoes and then extracted from the mosquito salivary glands before they can be used as a vaccine.
Currently, the workflow to create this vaccine is slow, requiring manual extraction of the attenuated PfSPZ from salivary glands using syringes. An autonomous robotic system is therefore being developed to automate the process. The full workflow for automated PfSPZ extraction is outlined in Figure 2. Our project focuses on using computer vision techniques so that the robot can autonomously detect keypoints on the mosquito. This will allow the robot to position the mosquito optimally for decapitation and salivary gland extraction while minimizing human input.
Project goals:
All of these components are required to provide robust vision guidance to support a fully autonomous system, ultimately streamlining the production of an impactful vaccine.
Model-Based Approach
The main goals of the model-based approaches are to (1) detect and locate mosquitoes, (2) locate the center of the neck, and (3) locate the center of the proboscis. Figure 3 below shows these as algorithms within a proposed Sanaria CV library that communicates with the robot via the ROS client.
The goals of these steps are, respectively, to find where the mosquitoes are on the rotating wheel, to determine how far the robot must drag the mosquito to align it with the blade cutter, and to determine where to grab the mosquito. Each of these steps will be performed using traditional computer vision methods as follows:
Ultimately, these algorithms will be encapsulated in a single library with one function each to find a mosquito, locate its neck, and locate the endpoints of its proboscis. Furthermore, since the entire robot system and its individual components will be integrated in ROS, we will create a ROS service that the robot driver client can call to run these algorithms and pass their outputs to later stages in the workflow.
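The specific image processing steps are not listed in this section, so the following is only a minimal sketch of what one function in such a library could look like. It assumes a dark mosquito on a light background and uses plain intensity thresholding, which may differ from the method actually used in the project. The function name `find_mosquito` and the parameters `dark_threshold` and `min_pixels` are hypothetical.

```python
import numpy as np


def find_mosquito(gray, dark_threshold=80, min_pixels=20):
    """Return the bounding box (x_min, y_min, x_max, y_max) of dark pixels.

    Returns None when fewer than `min_pixels` dark pixels are found.
    Assumption: the mosquito is darker than the background. Both parameters
    are hypothetical and would need tuning on real images.
    """
    mask = gray < dark_threshold          # True where the pixel is "dark"
    if mask.sum() < min_pixels:
        return None                       # treat as "no mosquito present"
    ys, xs = np.nonzero(mask)             # row and column indices of dark pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Synthetic example: a white image with one dark rectangle.
image = np.full((100, 100), 255, dtype=np.uint8)
image[40:50, 30:60] = 10
print(find_mosquito(image))  # (30, 40, 59, 49)
```

A ROS service wrapper would then call a function like this on each incoming image and return the coordinates to the robot driver client.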
Deep Learning Approach
The main goals of the deep learning approaches for this project are to (1) classify the orientation of the mosquito, (2) detect and locate the mosquitoes, and (3) locate keypoints of the mosquito, which include the proboscis tip, the proboscis end, the head, the neck, the thorax, and the abdomen. The process of creating deep learning neural network models and using them in practice is outlined below in figure 4.
A deep learning model can be obtained either by training from scratch or by transfer learning. In training from scratch (shown in figure 4 as the right light-blue shaded box), the network is initialized with random weights and trained on the provided images through repeated forward propagation and backpropagation, so that it learns the image features it uses for classification. Although this specializes the model for our purposes, training is slow, and with our limited dataset the resulting accuracy is low. Validation images are used to evaluate the training progress of the models.
Therefore, transfer learning was used for this project, which involves taking a pretrained network (shown in figure 4 as the left light-blue shaded box) and fine-tuning it. Because the network has been pretrained on other images, it already has a representation of image features that it can use for classification, which speeds up training. The fine-tuning process outlined in the figure uses our training and validation datasets to adjust the model weights so that the model can be used for our purposes.
Once the model has been trained, inference can occur, as shown in the light-red shaded box at the bottom of figure 4. The model is loaded, and the relevant image is imported and processed. The image is then fed into the trained neural network, which outputs the classification or other predictions.
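A minimal sketch of the inference step for a classifier might look like the following. It assumes the image has already been loaded and preprocessed into a batched tensor. The label names and their order are hypothetical and must match whatever order was used during training.

```python
import torch

# Hypothetical label order; it must match the class indices used in training.
ORIENTATION_LABELS = ["missing", "back_or_stomach", "left_side", "right_side"]


def predict_orientation(model, image_batch, labels=ORIENTATION_LABELS):
    """Return a (label, confidence) pair for each image in the batch."""
    model.eval()                      # disable dropout / batch-norm updates
    with torch.no_grad():             # no gradients are needed at inference
        probs = torch.softmax(model(image_batch), dim=1)
    confidences, indices = probs.max(dim=1)
    return [(labels[i], float(c))
            for i, c in zip(indices.tolist(), confidences.tolist())]
```

Returning the softmax confidence alongside the label lets a caller reject low-confidence predictions instead of acting on them.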
Orientation Classification
The orientation classification model aims to classify the mosquito as missing, lying on its back or stomach, lying on its left side, or lying on its right side, as seen in figure 5. Deep learning is preferred over image processing here because orientation is conveyed by the textures of the mosquito rather than by distinct, easily identified features, so a network that learns these implicit features automatically is more likely to succeed. Transfer learning was performed with PyTorch: pretrained models such as ResNet18, ResNet152, VGG16, and DenseNet121 were loaded and trained on our orientation image dataset. Conditions such as data augmentation, learning rate, number of epochs, and optimizer were varied to find the combination yielding the highest accuracy.
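One simple way to organize such a comparison is an exhaustive grid search, sketched below. The source does not say how the conditions were actually varied, so this is an assumption. The callable `train_and_evaluate` is a hypothetical placeholder for a function that trains a model with the given settings and returns its validation accuracy.

```python
from itertools import product


def grid_search(train_and_evaluate, search_space):
    """Try every combination in `search_space`; return (best_config, best_accuracy).

    `search_space` maps a setting name to the list of values to try.
    `train_and_evaluate` is called with one keyword argument per setting.
    """
    names = list(search_space)
    best_config, best_accuracy = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        config = dict(zip(names, values))
        accuracy = train_and_evaluate(**config)
        if accuracy > best_accuracy:
            best_config, best_accuracy = config, accuracy
    return best_config, best_accuracy


# Hypothetical search space; the values are illustrative only.
example_space = {
    "architecture": ["resnet18", "resnet152", "vgg16", "densenet121"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "optimizer": ["sgd", "adam"],
    "augment": [False, True],
}
```

The number of training runs is the product of the list lengths (48 for the example space above), so in practice the grid is usually kept small.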
Besides using deep learning for orientation determination, an exploratory part of this project is to use deep learning for automatic detection of mosquito keypoints, such as the proboscis tip and base, the neck, the midpoint between thorax and abdomen, and the end of the abdomen. The hope is that, if successful, deep learning will be able to support (or even replace) the image processing algorithms: sufficient training should make detection robust to lighting conditions and other factors that may stump the image processing algorithms, providing more confident estimates of the location of the neck and other keypoints.
Mosquito Detection
Transfer learning was also performed for the mosquito detection task, which involves locating the mosquito in the image. The Detectron2 framework, developed by Facebook AI Research and providing implementations of Mask R-CNN and related models, was used for training because it hosts a large number of pretrained models and supports object detection, as required for this project. Pretrained Detectron2 models were loaded from the library and trained through the framework on training and validation images from the old system setup. Once training was complete, testing images were provided to the model for evaluation.
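The evaluation metric is not stated here. A common way to score a predicted bounding box against a ground-truth box is intersection over union (IoU), shown below as an assumption about how such an evaluation could be done rather than a description of the project's procedure. Boxes are taken to be `(x_min, y_min, x_max, y_max)` tuples.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes overlapping in a 5x5 region: 25 / (100 + 100 - 25) = 1/7.
print(round(box_iou((0, 0, 10, 10), (5, 5, 15, 15)), 4))  # 0.1429
```

A detection is then typically counted as correct when its IoU with the ground truth exceeds a chosen threshold, with 0.5 being a frequent choice.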
Mosquito Pose Estimation
For mosquito pose estimation (keypoint detection), the Detectron2 library was also used, since it supports human pose estimation, which we repurposed for mosquito keypoints. Again, pretrained Detectron2 models were loaded from the library and trained through the framework on training and validation images from the old system setup. Once training was complete, testing images were provided to the model for evaluation.
The proposed workflow for training both mosquito detection and mosquito pose estimation is as follows:
Items in red are no longer possible due to COVID-19 updates.
Source code and Documentation for deep learning can be found here
Source code and Documentation for image processing algorithms can be found here