Robust Feature Matching for Endoscopic Reconstruction

CONFIDENTIAL materials. Contact the instructor for related claims.

Last updated: 05/10/2013 10:59 EST

Summary

Feature-matching-based 3D reconstruction is a standard technique in 3D computer vision. A natural extension is to reconstruct dynamic surfaces from videos, such as sinus surfaces from endoscopic videos. However, because the camera is moving and the sinus surfaces are typically deformable and non-planar, feature matching is usually unsatisfactory. We employ a state-of-the-art feature matching strategy from the domain of minimally invasive image analysis: instead of restricting inliers with a single global affine transformation, multiple affine components are hierarchically clustered.

The main goal of this project is to prototype image matching and motion estimation and to analyze their uncertainties. Specific tasks include:

  1. to test the performance of Hierarchical Multi-Affine (HMA) feature matching.
  2. to test the accuracy of motion estimation with HMA feature matching.
  3. to perform empirical uncertainty analysis of the estimated motion in a leave-one-out cross-validation setting.
  4. to perform comparison between HMA matching and SIFT matching for all tasks above.

Figure 1. A full 3D reconstruction of a pediatric airway from video imagery acquired with a tracked endoscope. [Image from an NIH-funded project proposal, used with permission.]

Background, Specific Aims, and Significance

[The following descriptions are cited from an NIH-funded project proposal with permission.]

  1. Background of endoscopic reconstruction. More than 200,000 functional endoscopic sinus surgery (FESS) procedures are estimated to be performed annually in the United States, at a cost of several billion dollars. As the name implies, all of these procedures are performed under endoscopic guidance, and a large fraction employ surgical navigation systems to visualize critical structures that must not be disturbed during the surgery. Although navigation is widely employed for FESS, its capabilities are far from optimal. In particular, the sinuses contain structures that are smaller than a millimeter in size, and yet delineate critical anatomy such as the optic nerve or the carotid artery. However, the accuracy of navigation is 2 mm under near-ideal conditions. As a result, navigation can provide a qualitative sense of location, but final confirmation of anatomic structures ultimately relies on the surgeon's ability to interpret and relate the CT image to the endoscopic view. This process, which is further complicated when the anatomy is distorted or otherwise altered by surgery, requires time, skill and experience and can lead to errors in judgement that adversely affect outcome.
  2. Aims of this course project: To develop methods for surface reconstruction and optionally further shape estimation from endoscopic videos. In detail, we propose to develop algorithms that are able to compute a surface reconstruction from video to an accuracy of 0.5 mm so that anatomic changes and surgical progress can be measured at any point of a procedure.
  3. Significance of endoscopic reconstruction. The significance of our work is the introduction of a paradigm shift in surgical navigation by using a device present in every endoscopic surgery, namely the endoscope, to improve registration and visualization of anatomy. This will have numerous positive impacts. Most importantly, our work will provide an inexpensive, non-invasive, radiation-free method to enhance registration accuracy at any point of the procedure. Enhancements in registration will reduce ambiguity for the surgeon during surgery, enhancing confidence, and improve workflow by reducing the need to re-register or re-image the patient. The endoscope will also be used as a measurement device to update anatomic models during a procedure. This not only will improve the ability of the surgeon to visualize the progress of the surgery, but it will accrue additional benefits to the patient and hospital, as it may reduce the level of radiation exposure and cost by eliminating the need for intraoperative CT imaging.

Deliverables

Technical Approach

The Hierarchical Multi-Affine (HMA) algorithm for fast and accurate feature matching is illustrated in the figure below. Its basic idea is to represent a plane or surface using multiple affine-transformation components.

Figure from G. A. Puerto-Souza and G. L. Mariottini. Hierarchical Multi-Affine (HMA) Algorithm for Fast and Accurate Feature Matching in Minimally-Invasive Surgical Images. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems October 7-12, 2012. Vilamoura, Algarve, Portugal.
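
The multi-affine representation can be illustrated with a minimal NumPy sketch. This is not the HMA toolbox and does not reproduce its hierarchical clustering; it only shows the underlying idea, assuming that matches have already been grouped into regions. The function names (fit_affine, affine_residuals, assign_to_components) and the 2-pixel tolerance are hypothetical and chosen for illustration only.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine fit so that dst ~ src @ A.T + t.

    src, dst: (n, 2) arrays of matched keypoint coordinates, n >= 3.
    Returns a (3, 2) parameter array: rows 0-1 hold A.T, row 2 holds t.
    """
    X = np.hstack([src, np.ones((len(src), 1))])      # homogeneous coordinates
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return params

def affine_residuals(params, src, dst):
    """Euclidean distance between each mapped source point and its match."""
    X = np.hstack([src, np.ones((len(src), 1))])
    return np.linalg.norm(X @ params - dst, axis=1)

def assign_to_components(components, src, dst, tol=2.0):
    """Label each match with its best-fitting affine component.

    Matches whose smallest residual exceeds tol (pixels, an assumed
    threshold) are labeled -1, i.e. treated as outliers.
    """
    res = np.stack([affine_residuals(p, src, dst) for p in components], axis=1)
    labels = res.argmin(axis=1)
    labels[res.min(axis=1) > tol] = -1
    return labels
```

With a single global component, matches on a curved or deforming surface tend to exceed the tolerance and are discarded; with several local components, each match only has to agree with the affine map of its own region.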

The comparison between SIFT raw matching and HMA matching is shown below.

(Patient data, distributed with permission.)

Number of RANSAC-detected outliers vs. total number of ‘matched’ features, shown for both HMA and SIFT for comparison.

The 3D reconstruction pipeline is shown in the following figure.

Figure from [Mirota et al. 2012]: D. Mirota, H. Wang, R. H. Taylor, M. Ishii, G. L. Gallia, and G. D. Hager. A System for Video-Based Navigation for Endoscopic Endonasal Skull Base Surgery. IEEE Trans. Med. Imaging, 31(4), 963-976 (2012).

The basic idea of empirical uncertainty analysis is to compute statistics such as variance and covariance from results in a number of experiments, either by cross validation or Monte Carlo simulation.
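
As a rough illustration of the cross-validation variant, the sketch below reruns an estimator on every subset that leaves out one sample and then computes the sample mean and covariance of the resulting estimates. It is a generic NumPy sketch, not the project code: the names leave_one_out_estimates and empirical_covariance are hypothetical, and the estimator argument stands in for whatever maps a set of matches to a motion parameter vector.

```python
import numpy as np

def leave_one_out_estimates(samples, estimator):
    """Run estimator once per left-out sample.

    samples:   (n, ...) array, one row per observation (e.g. one match).
    estimator: callable mapping an (n-1, ...) array to a parameter vector.
    Returns an (n, p) array with one estimate per row.
    """
    n = len(samples)
    keep = ~np.eye(n, dtype=bool)          # row i is False only at index i
    return np.array([estimator(samples[keep[i]]) for i in range(n)])

def empirical_covariance(estimates):
    """Sample mean and covariance (rows = experiments, columns = parameters)."""
    estimates = np.atleast_2d(estimates)
    return estimates.mean(axis=0), np.cov(estimates, rowvar=False)
```

The leave-one-out estimates share almost all of their data, so their spread is much smaller than the sampling variability of the estimator itself. If an estimate of the latter is wanted, a common choice is the jackknife rescaling, which multiplies this sample covariance by (n-1)^2/n.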

Estimated standard deviation of alpha, beta and gamma vs. number of features, shown for both HMA and SIFT for comparison.

Comparison of the estimated covariance matrix by HMA matching vs. SIFT matching. Images shown for the pair (frame 6, frame 7).

Projection error of the held-out query keypoint with HMA matching.

Dependencies

External libraries.

Camera calibration is performed using the Caltech Matlab calibration toolkit.

SIFT features are extracted using the VLFeat Matlab library.

HMA matching is performed using the HMA Matlab toolbox.

RANSAC-based essential matrix estimation is performed using OpenCV's findFundamentalMat.

Camera motion recovery from the essential matrix is done using the Structure and Motion Matlab toolkit.
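
The last two steps can be sketched in NumPy as follows. This is a textbook-style illustration (see Hartley and Zisserman), not the code of the toolkits named above; the function names are hypothetical. It assumes the fundamental matrix F satisfies x2^T F x1 = 0 for pixel coordinates x1, x2 in the first and second image, and that the intrinsic matrices K1 and K2 are known from calibration.

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """Compute E = K2^T F K1 and project it onto the set of essential matrices.

    A valid essential matrix has singular values (s, s, 0), so the two
    largest singular values are averaged and the smallest is set to zero.
    """
    E = K2.T @ F @ K1
    U, s, Vt = np.linalg.svd(E)
    sigma = (s[0] + s[1]) / 2.0
    return U @ np.diag([sigma, sigma, 0.0]) @ Vt

def decompose_essential(E):
    """Return the four candidate motions (R, t) encoded by E.

    t is a unit vector because the translation scale cannot be recovered
    from two views alone.
    """
    U, _, Vt = np.linalg.svd(E)
    # E is defined up to sign; flip factors so both are proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

Only one of the four candidates places the triangulated points in front of both cameras, so a cheirality check on a few inlier matches is normally used to select it.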

Patient Data

Patient data were collected at Johns Hopkins Hospital on December 19, 2012. The endoscopic video is hours long. A data collection system has been developed to simultaneously capture both the endoscopic video and external motion tracking data. However, data collection is outside the scope of this course project, which focuses on algorithm design and testing.

The CT data remained unavailable through the end of this course project.

Milestones and Status

  1. Milestone 1: Program for robust feature matching by HMA algorithm.
    • Planned Date: 28th February
    • Expected Date: 7th March
    • Status: Done.
  2. Milestone 2: Program for motion estimation by RANSAC and 5 point algorithm.
    • Planned Date: 14th March
    • Expected Date: 14th March
    • Status: Done.
  3. Milestone 3: Empirical quantitative comparison of HMA and SIFT algorithms
    • Planned Date: 2nd May
    • Expected Date: 9th May
    • Status: Done.
  4. Milestone 4: Program for video-CT registration by Trimmed ICP algorithm
    • Planned Date: 9th May
    • Expected Date: 12th May
    • Status: Done.

Reports and Presentations

Project Bibliography

  1. D. Mirota, H. Wang, R. H. Taylor, M. Ishii, G. L. Gallia and G. D. Hager. A System for Video-Based Navigation for Endoscopic Endonasal Skull Base Surgery. IEEE Trans. Med. Imaging, 31(4), 963-976, 2012.
  2. G. Puerto, M. Adibi, J. Cadeddu, and G. L. Mariottini. Adaptive Multi-Affine (AMA) Feature-Matching Algorithm and its Application to Minimally-Invasive Surgery Images. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2371-2376, Sept. 25-30, San Francisco, California, 2011.
  3. G. A. Puerto-Souza and G. L. Mariottini. Hierarchical Multi-Affine (HMA) Algorithm for Fast and Accurate Feature Matching in Minimally-Invasive Surgical Images. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 7-12, Vilamoura, Algarve, Portugal, 2012.
  4. G. Puerto and G. L. Mariottini. A comparative study of correspondence-search algorithms in MIS images. Medical Image Computing and Computer Assisted Interventions (MICCAI 2012), Nice, France, 2012.
  5. G. A. Puerto and G. L. Mariottini. A Fast and Accurate Feature-Matching Algorithm for Minimally Invasive Endoscopic Images. IEEE Transactions on Medical Imaging, 2013.
  6. R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, second edition, 2004.
  7. R. Szeliski. Computer Vision: Algorithms and Applications. Springer, 2010.

Other Resources and Project Files
