Visual Tracking of Surgical Tools in Retinal Surgery using Particle Filtering

Last updated: 4/26/2012


We aim to develop a direct visual tracking method for retinal surgical tools using mutual information and particle filtering.

  • Students: David Li, William Yang
  • Mentor(s): Dr. Rogerio Richa

Background, Specific Aims, and Significance


Vitreoretinal surgery treats eye problems involving the retina, macula, and vitreous fluid, and is considered one of the most difficult types of surgery to perform. Conditions treated by vitreoretinal surgery include macular degeneration, retinal detachment, and diabetic retinopathy. During the procedure, small instruments are inserted into the ocular vitreous cavity through small incisions and manipulated while the surgeon views them through a microscope.

Potential complicating factors include difficult visualization of surgical targets, poor ergonomics, lack of tactile feedback, and the requirement for high precision and accuracy. Because the surgery is performed using indirect visualization, the surgeon has a limited field of view and reduced clarity, depth perception, and illumination. This hinders identification and localization of surgical targets and leads to long operating times and a risk of surgical error.

Specific Aims

Many tools are available or in development to help surgeons with hand tremor, such as the microsurgical robot, intraoperative force transduction sensors, and intraoperative optical coherence tomography (OCT) retinal scans. However, there is not yet a method for detecting and tracking surgical tools from the viewpoint of the optical microscope that is robust and precise enough for practical use in a clinical scenario. Our project attempts to resolve this issue using the following components.

  1. Mutual Information (MI) as a similarity measure
  2. Particle Filter (PF) as a stochastic optimization method
  3. GPU / Parallel Processing to greatly improve the processing speed of our PF


  1. To determine the location of the tool during surgery, we will use template-based registration with Mutual Information (MI) as the similarity measure. We choose MI over the sum of squared differences (SSD) or normalized cross-correlation (NCC) because MI is more robust to changes in illumination, rotation, and scale, and to limited texture information.
  2. We will use a particle filter, also known as the condensation algorithm, to track the motion of the tool. Gradient descent methods, which are traditionally used, can become trapped in local minima when the tool position changes by a large displacement between frames. In addition, a particle filter tracks several alternative hypotheses at once, so an incorrect hypothesis is unlikely to become “sticky.”
  3. Finally, we hope to use a GPU to process separate particles in parallel at each time step, greatly improving processing speed and allowing more particles to be evaluated per frame. This will make our filter much more robust while still running in real time.


Deliverables

  • Minimum: (Expected by 3/14)
    1. OpenCV demo of tool tracking using mutual information and particle filtering (offline video)
    2. Well-documented and optimized code
  • Expected: (Expected by 4/11)
    1. All deliverables from Minimum, as well as the following
    2. CISST code running on surgical platform (online)
    3. Poster and paper (Expected by 5/9)
  • Maximum: (Expected by 5/2)
    1. All deliverables from Expected, as well as the following
    2. Refinements to the tracking algorithm
    3. GPU/parallel implementation of particle filter

Technical Approach

Mutual Information

A four-degree-of-freedom model will be used: 2D translation, rotation, and scale. The MI score is calculated by adding the individual entropies of the image (I) and the template (I*), then subtracting their joint entropy:

  h(I) = -\sum_{r} p_I(r) \log p_I(r)
  h(I, I^*) = -\sum_{r,t} p_{II^*}(r, t) \log p_{II^*}(r, t)
  MI(I, I^*) = h(I) + h(I^*) - h(I, I^*)

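Here p_I(r) is the probability that a pixel of I has intensity r, and p_{II*}(r, t) is the joint probability that a pixel has intensity r in I and intensity t in I*; h(I*) is defined in the same way as h(I). MI is effectively a measure of the quantity of information shared between the two images being compared.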
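The MI definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's OpenCV or CISST code: it treats each image as a flat list of 8-bit intensities, estimates the probabilities from histograms, and the names `entropy`, `mutual_information`, `bins`, and `max_value` are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as {bin: count}."""
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def mutual_information(image, template, bins=8, max_value=256):
    """MI(I, I*) = h(I) + h(I*) - h(I, I*) for two equal-length pixel lists."""
    if len(image) != len(template) or not image:
        raise ValueError("inputs must be non-empty and the same length")
    width = max_value / bins
    # Quantize intensities into a small number of histogram bins.
    r = [min(int(v / width), bins - 1) for v in image]      # bins of I
    t = [min(int(v / width), bins - 1) for v in template]   # bins of I*
    n = len(r)
    h_image = entropy(Counter(r), n)
    h_template = entropy(Counter(t), n)
    h_joint = entropy(Counter(zip(r, t)), n)                 # joint histogram
    return h_image + h_template - h_joint
```

In this sketch, comparing an image with an intensity-inverted copy of itself gives the same score as comparing it with itself, because MI depends only on how consistently intensities co-occur, not on their values. A constant template, which shares no information with the image, scores zero.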

Particle Filter

The particle filter is essentially a two-step iterative process. Starting from a constrained random sampling of particles around the last known location of the tool, the MI score is calculated for each particle and used as that particle's weight. The set of particles is then resampled so that every particle has equal weight, while the density of particles in a region is proportional to the MI scores calculated there. The process then repeats, and at each iteration the location of the tool is taken to be the most probable hypothesis currently known.
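The weight-and-resample loop described above can be sketched as follows. This is a toy illustration under stated assumptions rather than the project's tracker: the state is a hypothetical 4-DOF tuple (x, y, angle, scale), the diffusion is Gaussian, and `toy_score` is a synthetic stand-in for the MI score that simply peaks at a fixed target pose. All function and parameter names are invented for the example.

```python
import math
import random


def particle_filter_step(particles, score, sigma, rng):
    """One sample / weight / resample iteration.

    particles: list of state tuples, e.g. (x, y, angle, scale)
    score:     maps a state to a non-negative weight (the MI score in the text)
    sigma:     per-dimension standard deviation of the random diffusion
    """
    # Sample: scatter each particle around its previous state.
    moved = [tuple(v + rng.gauss(0.0, s) for v, s in zip(p, sigma))
             for p in particles]
    # Weight: score every hypothesis.
    weights = [score(p) for p in moved]
    # Estimate: the most probable hypothesis currently known.
    best = moved[max(range(len(moved)), key=weights.__getitem__)]
    # Resample: draw with probability proportional to weight, so survivors
    # have equal weight but cluster where the scores were high.
    if sum(weights) > 0:
        resampled = rng.choices(moved, weights=weights, k=len(moved))
    else:
        resampled = moved  # degenerate case: keep the diffused set
    return resampled, best


def demo(seed=0):
    """Track a static, hypothetical tool pose with a synthetic score."""
    rng = random.Random(seed)
    target = (50.0, 40.0, 0.3, 1.2)
    falloff = (5.0, 5.0, 0.2, 0.2)  # width of the score peak per dimension

    def toy_score(p):
        d2 = sum(((a - b) / s) ** 2 for a, b, s in zip(p, target, falloff))
        return math.exp(-0.5 * d2)

    particles = [(45.0, 45.0, 0.0, 1.0)] * 500  # last known pose
    sigma = (2.0, 2.0, 0.05, 0.05)
    best = particles[0]
    for _ in range(30):
        particles, best = particle_filter_step(particles, toy_score, sigma, rng)
    return best
```

Calling `demo()` returns the best-scoring pose after 30 iterations. In a real tracker the score function would warp the template by the particle's pose and compute MI against the current frame.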


Ideally, the MI score calculation stage of the particle filter would be run in parallel, and we propose to do this on a GPU. This would allow the particle filter to be very robust without sacrificing the performance and frame rate needed during a live surgery. Without massively parallel computational capacity, the number and density of particles would need to be reduced for the tracker to run at an acceptable frame rate, which in turn would reduce its robustness and accuracy.
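The scoring stage parallelizes well because each particle's score depends only on that particle. The sketch below shows this structure on the CPU. It is not CUDA and not the project's implementation: it uses Python's standard `ThreadPoolExecutor` purely as a stand-in for a GPU kernel that would assign particles to threads, and `score_particles_parallel` and `workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def score_particles_parallel(particles, score, workers=4):
    """Evaluate score(p) for every particle concurrently, preserving order.

    Each call is independent of the others, which is the property a GPU
    implementation would exploit by scoring many particles at once.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score, particles))
```

In CPython, threads do not speed up pure-Python arithmetic, so this example shows the shape of the computation rather than a performance gain.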

Error Analysis

  • For error analysis, we will use the first fully annotated and freely available image data set for tool detection in in vivo retinal microsurgery [4]. The analysis will consist of two parts: tool detection using mutual information and tool tracking using particle filtering.
  • To analyze the efficacy of mutual information as a tool detection algorithm for vitreoretinal surgery, the detector's parameters will be chosen by maximizing accuracy on a validation subset of the image data set, and the detector will then be evaluated on the entire data set. A prediction is counted as correct when both endpoints of the tool shaft are within 10 pixels of their true locations. An ROC curve (true positive rate against false positive rate) will be plotted and compared against the curves of other detectors based on SSD or NCC. Ideally, our detection algorithm should reach roughly 90% accuracy with a false positive rate on the order of 10^-8, which is what state-of-the-art face detection algorithms achieve [4].
  • To analyze the efficacy of particle filtering as a tool tracking algorithm for vitreoretinal surgery, the complete algorithm (particle filtering with mutual information) will be evaluated on video sequences. At any frame, tracking is said to have failed when the distance between the tracked and true positions of the terminating end of the tool shaft exceeds some threshold σ. Whenever that happens, a note is made and the tracking algorithm is reinitialized using ground truth to continue the analysis. After the sequence is complete, the number of frames successfully tracked continuously is plotted against the number of events as a histogram and fitted to a geometric distribution. The lower the fitted probability p, the more robust the tracker; current gradient descent tracking algorithms using SSD or MI have a fitted p of around 0.1 [4].
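The two evaluation criteria above can be sketched as small helper functions. This is an illustrative reading of the protocol, not the evaluation framework's own code, and the function names are hypothetical. It makes three assumptions: "within 10 pixels" is inclusive, the geometric distribution counts successful frames before each failure, and the final run is dropped because it did not end in a failure.

```python
import math


def is_correct_detection(predicted, truth, tolerance=10.0):
    """True if both shaft endpoints lie within `tolerance` pixels of truth.

    predicted, truth: pairs of (x, y) endpoints given in the same order.
    """
    return all(math.hypot(p[0] - t[0], p[1] - t[1]) <= tolerance
               for p, t in zip(predicted, truth))


def tracked_run_lengths(errors, threshold):
    """Lengths of runs of consecutive successfully tracked frames.

    errors: per-frame distance between the tracked and true tool tip.
    A frame fails when its error exceeds `threshold`; each failure closes the
    current run, mirroring reinitialization from ground truth. Frames after
    the last failure are not returned (assumption: that run is incomplete).
    """
    runs, current = [], 0
    for e in errors:
        if e > threshold:
            runs.append(current)
            current = 0
        else:
            current += 1
    return runs


def fit_geometric(runs):
    """Maximum-likelihood p for P(k) = (1 - p)**k * p, k = frames before failure."""
    if not runs:
        raise ValueError("need at least one failure event")
    return len(runs) / (len(runs) + sum(runs))
```

For example, the error sequence `[1, 2, 30, 1, 1, 1, 40, 2, 50, 1, 1]` with a threshold of 10 contains three failures after runs of 2, 3, and 1 frames. The fitted p is 3 / (3 + 6), or about 0.33; a more robust tracker would produce longer runs and a smaller p.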


Dependencies

  • Development environment for Milestones 1 and 2
    • Resolved (Visual Studio/OpenCV)
  • Development environment for Milestones 3 and 4
    • Will work with Rogerio (CISST libraries)
  • Access to CUDA-enabled GPU for Milestone 5
    • Resolved for offline development; will work with Rogerio for online
  • J-Card access to robotorium
    • Resolved
  • Use of microretinal surgery workstation
    • Will need to schedule when ready
    • If not accessible, will work on pre-recorded data

Milestones and Status

  1. Milestone 1: Basic Particle Filter
    • Planned Date: 3/7
    • Actual Date: 3/6
    • Status: Complete
  2. Milestone 2: Implement Mutual Information
    • Planned Date: 3/14
    • Actual Date: 4/4
    • Status: Complete
  3. Milestone 3: Port to CISST
    • Planned Date: 4/4
    • Expected Date: 4/18
    • Status: Complete
  4. Milestone 4: Refinements to Algorithm
    • Planned Date: 4/18
    • Expected Date: 5/2
    • Status: Complete
  5. Milestone 5: Parallel Implementation
    • Planned Date: 4/18
    • Planned Date (updated): 5/2
    • Expected Date: 5/2
    • Status: Complete

Reports and presentations

Project Bibliography

  • Balicki, M., Han, J., Iordachita, I., Gehlbach, P., Handa, J., Taylor, R., and Kang, J. (2009). Single Fiber Optical Coherence Tomography Microsurgical Instruments for Computer and Robot-Assisted Retinal Surgery. MICCAI 2009, 108-115.
  • Dame, A. and Marchand, E. (2010). Accurate real-time tracking using mutual information. IEEE Int. Symp. on Mixed and Augmented Reality, ISMAR'10, 47-56.
  • Isard, M. and Blake, A. (1998). Condensation – conditional density propagation for visual tracking. Int. Journal of Computer Vision, 29, 5-28.
  • Richa, R. et al. (2012). An Evaluation Framework for in vivo Microretinal Tool Detection and Tracking. MICCAI.
  • Richa, R. et al. (2012). Hybrid SLAM for Intra-operative Information Augmentation in Retinal Surgery. MICCAI.

Other Resources and Project Files


courses/446/2012/446-2012-14/project14.txt · Last modified: 2012/12/10 17:36 (external edit)