Last updated: 2023/5/9
The project goal is to develop a deep-learning-based kinematics model for the Nitinol wire-based continuum manipulator for future surgical use. Traditional kinematic predictions are challenging for soft continuum robots; thus, deep learning is employed to establish the relationship between input angles and manipulator output configuration. The resulting kinematic model can be adapted to various Nitinol continuum wire manipulator designs through transfer learning techniques.
Retinal diseases impact around 200 million people globally, with degenerative diseases being a major cause of vision loss. To safely and precisely perform surgery within the limited retinal structure, a soft robot called the Continuum Wire Manipulator has been developed and utilized in medical procedures. Soft robots, like concentric tube robots, offer valuable advantages in navigating tight spaces due to their flexibility. However, controlling the position and movement of soft robots is more complex compared to rigid body robots. Consequently, extensive research has been conducted on the kinematics of concentric tube robots to establish the relationship between motor actuation and the configuration of the soft robot.
Retinal Disease Symptom Areas and Soft Robot for Vein Cannulation
Background: Retinal diseases affect approximately 200 million people worldwide, and degenerative diseases in particular are a leading cause of vision loss. Current treatment options include subretinal injections of pharmaceuticals and retinal vein cannulation, both of which require precise and delicate manipulation of tools within the confined space of the eye. This is where soft robots, such as concentric tube robots, can be particularly useful due to their flexibility and ability to navigate tight spaces. However, controlling the position and movement of a soft robot is more challenging than for a rigid-body robot. This challenge has motivated a great deal of research on the kinematics of concentric tube robots, aiming to find the mapping between the motor actuation input and the resulting soft robot configuration.
Significance: Previous studies have taken two approaches to the kinematics of concentric tube robots. The first analyzes the mechanics of the concentric tube system and derives general equations. The second builds a simulated dataset from those mechanics-based equations for deep learning, then fine-tunes the network with transfer learning to obtain a model for a specific concentric tube setup. Because of its novel geometric configuration, the Nitinol continuum wire manipulator has no prior mechanics-based derivation of its general equations, so this study could not build on an existing theoretical model. Instead, a deep learning method was proposed to learn the general kinematics of the Nitinol continuum wire manipulator. Around 10,000 data points would be collected from real-world experiments, combining motor encoder readings with OpenCV detection. Once the general kinematics model had been tested and validated, it could serve as the starting point for transfer learning, allowing quick fine-tuning toward specific Nitinol continuum wire manipulator designs. These designs would have similar but distinct geometric parameters, and their data would be collected separately for transfer learning.
Overview
Motor Control Design
Angle Controller with PID and the RoboClaw Controller
A Proportional-Integral-Derivative (PID) controller was implemented in conjunction with the RoboClaw package to control the motor's rotation, enabling it to reach and hold a specific angle. The PID controller calculates the error between the desired angle and the motor's current angle, then adjusts the motor's output accordingly. The three components of the PID controller (proportional, integral, and derivative) cooperate to minimize the error and stabilize the motor's position.
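The control law described above can be sketched in a few lines. This is a minimal illustration, not the project's controller: the gains, the time step, and the toy motor model (where the command is treated directly as an angular velocity) are all assumptions chosen for the example, and no RoboClaw calls are used.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller; the gains are illustrative, not tuned for real hardware."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float, dt: float) -> float:
        error = target - measured                     # proportional term input
        self.integral += error * dt                   # accumulated error
        derivative = (error - self.prev_error) / dt   # rate of change of the error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(target_deg: float, steps: int = 1000, dt: float = 0.01) -> float:
    """Drive a hypothetical motor (command = angular velocity) toward target_deg."""
    pid = PID(kp=6.0, ki=8.0, kd=0.05)
    angle = 0.0
    for _ in range(steps):
        command = pid.step(target_deg, angle, dt)
        angle += command * dt   # toy plant: integrate the commanded velocity
    return angle


final_angle = simulate(90.0)
```

On real hardware the `angle` update would be replaced by an encoder reading, and the gains would need to be tuned for the actual motor and load.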
Integration with the Computer Vision System and the Motor Control
During the experimentation phase, the PID control sometimes required a longer execution time. However, given the necessity to collect a large volume of data, often in the thousands, we developed a conditional program to optimize the process. This program assessed whether the encoder readings had reached the target value, and upon achieving the desired value, the system would exit the PID control loop and proceed directly to the computer vision algorithm. This approach significantly improved the efficiency of the data collection process.
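A sketch of the early-exit idea follows. The function names, the tolerance, and the `ToyMotor` class are hypothetical stand-ins for the real encoder and motor interface, and a simple proportional law replaces the full PID to keep the example short.

```python
from typing import Callable, Tuple


def drive_to_target(
    read_encoder: Callable[[], float],
    apply_command: Callable[[float], None],
    target: float,
    tolerance: float = 0.5,
    kp: float = 0.2,
    max_steps: int = 500,
) -> Tuple[bool, int]:
    """Run the control loop, leaving it as soon as the encoder is within tolerance.

    Returns (reached, steps_used).
    """
    for step in range(max_steps):
        error = target - read_encoder()
        if abs(error) <= tolerance:
            return True, step          # exit early and hand over to the vision stage
        apply_command(kp * error)
    return False, max_steps            # gave up: the caller can skip or retry this sample


class ToyMotor:
    """Hypothetical hardware stand-in: each command shifts the shaft angle directly."""

    def __init__(self) -> None:
        self.angle = 0.0

    def read(self) -> float:
        return self.angle

    def command(self, delta: float) -> None:
        self.angle += delta


motor = ToyMotor()
reached, steps_used = drive_to_target(motor.read, motor.command, target=45.0)
```

Returning a flag rather than raising on timeout lets a long data-collection run log the failed sample and continue with the next target angle.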
Dual-Camera Computer Vision Based Data Generation
Dual-Camera Calibration
Dual-camera calibration is the process of determining the relative positions and orientations of two cameras with respect to each other. These parameters are used to transform the 2D image coordinates into 3D world coordinates through point cloud registration, which is necessary for the back-projection step of the Shape from Silhouette algorithm.
Data Generation for Training
The data is generated using the Shape from Silhouette (SfS) algorithm, which processes binary images from the top and side cameras to obtain strip-like clusters representing the wire's 3D configuration. Gaussian Mixture Model clustering is applied to isolate the data and find the mean point of each cluster, while Silhouette Analysis, the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC) help determine the optimal number of clusters. Because these criteria did not fully agree, visual evaluation was used to select 50 clusters, which gave an accurate representation and was convenient for deep learning.
Deep Learning Neural Networks
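The back-projection step can be illustrated with a deliberately simplified sketch. It assumes orthographic, axis-aligned cameras (the top image is indexed by y and x, the side image by z and x), which a real calibrated setup would not satisfy exactly. It also replaces the Gaussian Mixture Model step with a plain per-slice average, so it shows only the carving idea, not the project's clustering.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]            # binary silhouette, indexed [row][col]
Voxel = Tuple[int, int, int]       # (x, y, z) grid coordinates


def carve(top: Image, side: Image) -> List[Voxel]:
    """Keep voxels whose projections fall inside both silhouettes.

    Assumption: top rows are y, side rows are z, and both images share x as columns.
    """
    voxels: List[Voxel] = []
    for x in range(len(top[0])):
        ys = [y for y in range(len(top)) if top[y][x]]
        zs = [z for z in range(len(side)) if side[z][x]]
        voxels.extend((x, y, z) for y in ys for z in zs)
    return voxels


def centerline(voxels: List[Voxel]) -> List[Tuple[float, float, float]]:
    """Average the voxels in each x-slice (a simple stand-in for GMM cluster means)."""
    slices: Dict[int, List[Tuple[int, int]]] = {}
    for x, y, z in voxels:
        slices.setdefault(x, []).append((y, z))
    return [
        (float(x),
         sum(y for y, _ in pts) / len(pts),
         sum(z for _, z in pts) / len(pts))
        for x, pts in sorted(slices.items())
    ]


top = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
side = [
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
points = centerline(carve(top, side))
# points == [(0.0, 1.0, 0.0), (1.0, 1.5, 0.0), (2.0, 2.0, 0.5), (3.0, 2.0, 1.0)]
```

With only two views the carved volume is a superset of the true wire shape, which is one reason a clustering step is useful for reducing the strip-like voxel cloud to a fixed number of representative points.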
Data Collection and Preprocessing
To develop the forward kinematics mapping for the Nitinol soft robot manipulator, 9,556 labeled images were collected using motor encoders and computer vision. These images provided the motor-actuated input angles and the output configuration for the CWM.
The input and output data were loaded and preprocessed as follows. One quarter of the data was reserved as a test set to evaluate prediction performance, and the remaining three quarters were divided into training and validation sets (80% and 20%, respectively). The input data were standardized using the StandardScaler from the scikit-learn library. The output data were reshaped and standardized in a similar manner.
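The split sizes implied by this description can be worked out with a short sketch. It assumes the test quarter is taken first and the remainder is split 80/20, and it uses a pure-Python stand-in for zero-mean, unit-variance scaling instead of calling scikit-learn. The function names, the seed, and the choice to fit the scaler on training values only are assumptions made for the example.

```python
import random
import statistics
from typing import List, Tuple


def split_indices(n: int, test_fraction: float = 0.25,
                  val_fraction: float = 0.2, seed: int = 0
                  ) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle sample indices, reserve a test set, then split the rest into train/val."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    n_val = round((n - n_test) * val_fraction)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def fit_scaler(values: List[float]) -> Tuple[float, float]:
    """Mean and population standard deviation, computed on training values only."""
    return statistics.fmean(values), statistics.pstdev(values)


def transform(values: List[float], mean: float, std: float) -> List[float]:
    return [(v - mean) / std for v in values]


train, val, test = split_indices(9556)
sizes = (len(train), len(val), len(test))   # (5734, 1433, 2389)

mean, std = fit_scaler([1.0, 2.0, 3.0])
scaled = transform([1.0, 2.0, 3.0], mean, std)
```

Reusing the training mean and standard deviation for the validation and test sets keeps information from held-out data out of the preprocessing step.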
Overview of Deep Learning Model
The multilayer perceptron (MLP) model was trained using the mean squared error (MSE) loss function and the Adam optimization algorithm, with a custom learning rate of 1e-5. The model was trained for 500 epochs with a batch size of 30. Each training step consisted of a forward pass and a backward pass, updating the model's parameters to minimize the loss.
Throughout training, evaluation metrics such as mean squared error, mean absolute error, and R-squared score were computed and recorded for both the training and validation sets. These metrics were used to evaluate the model's performance and monitor the progress of training.
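To make the training loop concrete, here is a minimal sketch of mini-batch Adam minimizing an MSE loss. A two-parameter linear model stands in for the MLP so the example stays self-contained, and the learning rate and epoch count are raised well above the values quoted above so the toy problem converges quickly. Only the batch size of 30 is taken from the text; everything else is an assumption.

```python
import math
import random
from typing import List, Tuple


def adam_fit(xs: List[float], ys: List[float], lr: float = 0.05, epochs: int = 300,
             batch_size: int = 30, beta1: float = 0.9, beta2: float = 0.999,
             eps: float = 1e-8, seed: int = 0) -> Tuple[float, float]:
    """Fit y ~ w * x + b by mini-batch Adam on the MSE loss."""
    rng = random.Random(seed)
    params = [0.0, 0.0]   # w, b
    m = [0.0, 0.0]        # first-moment estimates
    v = [0.0, 0.0]        # second-moment estimates
    t = 0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            # Forward pass: residuals of the current model on this batch.
            residuals = [params[0] * xs[i] + params[1] - ys[i] for i in batch]
            # Backward pass: gradients of the batch MSE with respect to w and b.
            grads = [
                2.0 * sum(r * xs[i] for r, i in zip(residuals, batch)) / len(batch),
                2.0 * sum(residuals) / len(batch),
            ]
            t += 1
            for k in range(2):
                m[k] = beta1 * m[k] + (1.0 - beta1) * grads[k]
                v[k] = beta2 * v[k] + (1.0 - beta2) * grads[k] ** 2
                m_hat = m[k] / (1.0 - beta1 ** t)   # bias-corrected moments
                v_hat = v[k] / (1.0 - beta2 ** t)
                params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params[0], params[1]


xs = [(i - 60) / 60.0 for i in range(120)]   # inputs roughly centered on zero
ys = [3.0 * x - 1.0 for x in xs]             # noise-free synthetic targets
w, b = adam_fit(xs, ys)
```

In a deep learning framework the same loop is usually expressed with a built-in Adam optimizer and automatic differentiation; the explicit version is shown only to make the forward pass, backward pass, and parameter update visible.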
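For reference, the three metrics can be computed as in the sketch below. It uses the standard definitions on flat lists of values; how the project aggregated them over multi-point wire configurations is not stated, so that part is left out.

```python
from typing import List


def mse(y_true: List[float], y_pred: List[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: List[float], y_pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
metrics = (mse(y_true, y_pred), mae(y_true, y_pred), r2_score(y_true, y_pred))
# metrics == (0.125, 0.25, 0.9)
```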
Transfer Learning
Overview of Transfer Learning
Transfer learning is a machine learning technique in which a pre-trained model is used as the starting point for training a new model on a different task, as shown in the figure. The concept behind transfer learning is that the knowledge gained by a pre-trained model on one task can be useful in solving a related task. Instead of starting from scratch, the pre-trained model provides a foundation of knowledge that can be built upon and adapted to the new task.
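The idea can be shown on a toy problem. In this sketch a two-parameter linear model stands in for the network, the "source" and "target" relationships are invented for illustration, and fine-tuning updates only the bias while the pre-trained weight stays frozen. Which layers were actually frozen in the project is not stated, so this choice is an assumption.

```python
from typing import List, Tuple


def fit(xs: List[float], ys: List[float], w: float, b: float,
        lr: float = 0.1, steps: int = 200, train_w: bool = True) -> Tuple[float, float]:
    """Full-batch gradient descent on MSE for y ~ w * x + b; w can be frozen."""
    n = len(xs)
    for _ in range(steps):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        if train_w:
            w -= lr * 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        b -= lr * 2.0 * sum(residuals) / n
    return w, b


def mean_abs_error(w: float, b: float, xs: List[float], ys: List[float]) -> float:
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)


grid = [i / 10.0 for i in range(-20, 21)]

# Source task with plenty of data (hypothetical relationship).
source_y = [2.0 * x + 1.0 for x in grid]
w_pre, b_pre = fit(grid, source_y, 0.0, 0.0)

# Related target task with a single labelled sample.
target_y = [2.1 * x + 1.2 for x in grid]
few_x, few_y = [1.0], [2.1 * 1.0 + 1.2]

w_scratch, b_scratch = fit(few_x, few_y, 0.0, 0.0)               # no prior knowledge
w_ft, b_ft = fit(few_x, few_y, w_pre, b_pre, train_w=False)      # frozen w, tuned b

mae_scratch = mean_abs_error(w_scratch, b_scratch, grid, target_y)
mae_finetuned = mean_abs_error(w_ft, b_ft, grid, target_y)
```

With one sample the from-scratch model cannot separate slope from offset, while the fine-tuned model inherits a nearly correct slope and only has to adjust the offset, so its error over the whole grid is lower.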
Test Station Set
Experimental testing was performed as follows. First, we connected the motor to the Arduino board using a suitable motor driver circuit and digital pins. To measure the motor's current position, we then connected an encoder or potentiometer to the Arduino board and used the analog pins for readings. The RoboClaw board was then programmed to compute the control signal and implement the PID control algorithm that manages the motor's position. For image capture, we set up the Andonstar AD407 digital camera and installed the “usb-cam” ROS package to send real-time images from the camera to the PC. We then used OpenCV for image processing, applying bilateral filtering and chroma key masking to extract useful information from the images. Finally, we implemented the Shape from Silhouette (SfS) algorithm to reconstruct the 3D shape from the 2D silhouettes and calibrated the dual-camera setup to determine the relative positions and orientations of the cameras.
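Chroma key masking can be sketched without OpenCV as a per-pixel test in HSV space. The example assumes a green key color and uses illustrative thresholds; the actual background color and threshold values used in the experiments are not given, and a real pipeline would more likely apply an equivalent vectorized range test to the HSV image.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # 8-bit RGB


def chroma_key_mask(image: List[List[Pixel]],
                    hue_range: Tuple[float, float] = (0.22, 0.45),
                    min_saturation: float = 0.4,
                    min_value: float = 0.2) -> List[List[int]]:
    """Return a binary mask: 0 for key-colored background, 1 for everything else."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_background = (hue_range[0] <= h <= hue_range[1]
                             and s >= min_saturation
                             and v >= min_value)
            mask_row.append(0 if is_background else 1)
        mask.append(mask_row)
    return mask


GREEN = (20, 200, 40)     # assumed backdrop color
WIRE = (180, 180, 185)    # assumed grey, low-saturation wire color
image = [
    [GREEN, WIRE, GREEN],
    [GREEN, GREEN, WIRE],
]
mask = chroma_key_mask(image)
# mask == [[0, 1, 0], [0, 0, 1]]
```

The resulting binary mask is the kind of silhouette image that the Shape from Silhouette step takes as input.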
Different Wire Setting for Transfer Learning and Deep Learning
The experiment involved manufacturing five distinct Nitinol wires and collecting data for each of them. Specifically, we gathered 200 data points for each of the first three wires, totaling 600 data points, while only 5 data points were collected for the fourth wire.
To prepare each wire for experimental data collection, additional processing steps were necessary. First, the radius of the end crossing of the CWM was controlled through a plastic sealing technique. A fixture was then used to secure the metal ring at the end of the wire, ensuring stable attachment to the motor. We also measured and recorded the parameters of each wire; the figure shows the parameters of the four wires used for transfer learning.
Comparison between Learned and Real Data
We present a visualization of the real-world wire shape alongside the predicted wire shape. The red lines represent the predictions, while the blue lines indicate the ground truth. As can be observed, our model closely approximates the ground truth by effectively learning the underlying shape of the real-world data. This demonstrates the success of our approach in accurately predicting the forward kinematics of Nitinol manipulators.
Comparison of MAE for Transfer Learning in Different Fine-tuning Datasets with Measurement Precision in mm
The results of our transfer learning experiments are presented in the table, which compares the errors obtained when training from scratch with those obtained by pre-training followed by fine-tuning, for fine-tuning datasets of different sizes. The table is divided into two main categories: models trained from scratch and models that underwent pre-training and fine-tuning. The columns represent the size of the fine-tuning dataset: one sample, five samples, and seven samples. The table shows that pre-training and fine-tuning significantly reduces the error compared to training from scratch.
For the models trained from scratch, the MAE values are significantly higher, ranging from 122.248 mm with one sample to 61.793 mm with seven samples. This indicates that without pre-training, the models struggle to capture the underlying relationships in the data, especially when the size of the fine-tuning dataset is limited.
On the other hand, the pre-training and fine-tuning approach demonstrates substantially lower MAE values across all fine-tuning dataset sizes. For one sample, the MAE is 27.255 mm, while for five samples, it is 21.143 mm, and for seven samples, it is 20.789 mm. These results highlight the effectiveness of transfer learning in improving model performance, even when the fine-tuning dataset is small.
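The size of the improvement can be read off directly from the MAE values quoted above. The short calculation below uses only the figures stated in the text (the from-scratch value for five samples is not given, so that column is omitted); the helper name is illustrative.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which the error drops relative to the baseline."""
    return (baseline - improved) / baseline


# MAE in mm, keyed by fine-tuning dataset size: (from scratch, pre-trained + fine-tuned)
mae_mm = {
    1: (122.248, 27.255),
    7: (61.793, 20.789),
}

reductions = {size: relative_reduction(scratch, finetuned)
              for size, (scratch, finetuned) in mae_mm.items()}
# reductions[1] is about 0.78 and reductions[7] is about 0.66
```

In other words, pre-training cuts the error by roughly 78% with a single fine-tuning sample and by roughly 66% with seven samples.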
GitHub link for the project: https://github.com/dusevitch/continuum_wire_robot