
Improved Generalization of Pelvis X-ray Landmark Detection

Last updated: 02/13/2020

Note: This project was changed in mid-semester, in response to the COVID-19 outbreak. The new project is “A County-level Dataset for Informing the United States' Response to COVID-19”, and its web page can be found here.

Summary

Develop a method that uses simulated, readily available X-rays to improve pelvis landmark detection on real, scarce X-rays for intraoperative 2D/3D registration.

Background

Minimally invasive hip surgery is a desirable method for many patients. Although its benefits remain controversial with regard to pain management and recovery time, many patients strongly prefer a smaller incision to more traditional hip surgery. Unfortunately, these cosmetic advantages translate to additional complexity for the surgeon. Minimally invasive hip surgery requires navigation in and manipulation of anatomical structures which are underneath unbroken skin and thus not reliably visible to the surgeon. At the same time, correctly aligning the cup and stem is crucial to the operation's success. In the past, this has been achieved by using the minimal incision as a “mobile window” for identifying anatomical landmarks, but this can result in unreliable outcomes.

Alternatively, fluoroscopic imaging provides intraoperative 2D visualization of the hip anatomy, but it presents its own set of challenges. First and foremost, the mental interpretation of 2D X-ray images places an undesirable burden on the surgeon, at a time when her chief concern should be correctly aligning the hip. Computer-assisted tracking systems overcome the requirement for mental 2D/3D registration of the image with the anatomy. Based on the fluoroscopic image of the hip, they can automatically track desired objects and display their poses in the context of a preoperative plan.

Significance

The systems we investigate here register intraoperative 2D fluoroscopic images with a 3D preoperative model. Improper initialization of traditional registration algorithms, such as Iterative Closest Point (ICP) and its many variants, can lead to large registration errors, because the optimization's cost function may have numerous local minima. A better “first guess” makes it much more likely that ICP converges on the actual minimum, which corresponds to a reliable registration. This first guess typically takes the form of human input identifying anatomical landmarks in the hip anatomy. Yet human input is undesirable, first and foremost because the time required for human landmark annotation is not insignificant. Even a 4-5 second delay interrupts the surgical procedure, resulting in a disjointed alignment process. Sub-second registration, on the other hand, would allow more continuous adjustment of the cup and stem.
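The sensitivity to initialization can be illustrated with a toy example. The sketch below is not ICP and not a registration cost from this project: it assumes a hypothetical one-dimensional cost with two minima and uses plain gradient descent as a stand-in for any local optimizer. The names `registration_cost`, `cost_gradient`, and `refine` are illustrative only.

```python
def registration_cost(x):
    """Hypothetical 1D cost with a global minimum near x = -1.04 and a local one near x = 0.96."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def cost_gradient(x):
    """Analytic derivative of registration_cost."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def refine(x0, step=0.01, iterations=2000):
    """Plain gradient descent from the initial guess x0 (stand-in for a local optimizer)."""
    x = x0
    for _ in range(iterations):
        x -= step * cost_gradient(x)
    return x


# A poor first guess stays in the basin of the local minimum.
poor = refine(1.5)
# A better first guess reaches the global minimum.
good = refine(-0.5)

print(f"poor start -> x = {poor:.3f}, cost = {registration_cost(poor):.3f}")
print(f"good start -> x = {good:.3f}, cost = {registration_cost(good):.3f}")
```

Both runs use the same optimizer and the same cost. Only the starting point differs, which is the role that landmark-based initialization plays for the registration.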

Specific Aims

Prior work has shown that deep learning (DL) based techniques can identify anatomical landmarks quickly and reliably in order to initialize an ICP algorithm. Unlike ICP algorithms, deep neural networks (DNNs) learn generalizable features from labeled training data and use them to interpret previously unseen images. For example, prior work uses simulated images to generate arbitrarily large training sets with perfectly known ground-truth anatomical landmarks. The authors show that a multi-stage DNN trained on these simulated images can generalize well to real-world images but is susceptible to scenarios not seen during training. For instance, a surgical tool which occludes the image can severely compromise the DNN's ability to detect anatomical landmarks. Since the goal of automatic landmark detection is continuous, intraoperative feedback for the surgeon, it would be impractical for the surgeon to withdraw her tools during every registration. Thus, we aim to improve the generalization of DNNs from simulation to real-world scenarios which are not encountered during training.

There are many possible approaches for improving sim-to-real generalization. We propose using a novel patch-normalized convolution (PNC) layer, which constrains feature descriptors to a local region at every scale, described in the Technical Approach of the project plan. Based on preliminary results, PNC shows an improved ability to generalize to unseen types of noise, especially additive noise patterns and contrast adjustments. We anticipate that DNNs which employ PNC will be particularly effective for occlusions by surgical tools due to the high contrast between these tools and typical intensities for an unoccluded X-ray.

Deliverables

---------- ---------------- --------------------------------------------------------------------------------------------------
           Algorithm        DNN for landmark detection.
           Implementation   PyTorch implementation, made public on [GitHub](https://github.com).
 Minimum   Validation       Anatomical landmark detection results on real data, matching prior work.
           Documentation    Inline code documentation.
           Presentation     Final written report, in-class presentation.
---------- ---------------- --------------------------------------------------------------------------------------------------
           Algorithm        DNN for landmark detection **using PNC**.
           Implementation   PyTorch implementation, made public on [GitHub](https://github.com), **ready for academic use**.
 Expected  Validation       Anatomical landmark detection results on real data, **exceeding prior work**.
           Documentation    **Organized and complete code** documentation.
           Presentation     Final written report, in-class presentation.
---------- ---------------- --------------------------------------------------------------------------------------------------
           Algorithm        DNN for landmark detection **using PNC**.
           Implementation   PyTorch implementation, made public on [GitHub](https://github.com), ready for academic use.
 Maximum   Validation       Anatomical landmark detection results on real data **with demonstrable generalization**.
           Documentation    Organized, complete code documentation, final report, **academic publication**.
           Presentation     Final written report, in-class presentation, **academic publication**.
---------- ---------------- --------------------------------------------------------------------------------------------------

Technical Approach

Much of the recent success in computer vision is due to the advent of the deep convolutional neural network, which has the convolutional layer at its core. We propose to apply DNN architectures based on prior work to anatomical landmark detection, incorporating a novel type of convolutional layer, the patch-normalized convolution (PNC). We hypothesize that the spatially local nature of the PNC layer, as well as its robustness to noise, will enable greater generalization to real X-ray data. We refer the reader to the Project Bibliography for discussions of the U-Net and stage-based DNN architectures, which we will employ for landmark detection.

State-of-the-art DNNs, including the aforementioned U-Net and stage-based network, usually pair a convolutional layer with a normalization layer. In our project proposal, we review these concepts briefly in order to lay the groundwork for PNC, which combines a convolution with a kernel-dependent normalization. Please see project_proposal.pdf for a detailed mathematical formulation.
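To make the idea of normalizing under the kernel window concrete, here is a minimal sketch. It is not the formulation from project_proposal.pdf, which may differ. It assumes one plausible reading of "patch-normalized": each patch covered by the kernel is standardized to zero mean and unit variance before the kernel weights are applied. It is written in plain Python rather than PyTorch so that it is self-contained, and the function name `patch_normalized_conv2d` and the parameter `eps` are hypothetical.

```python
import math


def patch_normalized_conv2d(image, kernel, eps=1e-6):
    """Valid-mode 2D correlation in which every patch is standardized before weighting.

    Assumption: normalization is zero mean, unit variance over the kernel window.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    # Flatten the kernel in row-major order to match the patch layout below.
    weights = [w for kernel_row in kernel for w in kernel_row]
    output = []
    for top in range(out_h):
        out_row = []
        for left in range(out_w):
            patch = [image[top + dy][left + dx]
                     for dy in range(kernel_h) for dx in range(kernel_w)]
            mean = sum(patch) / len(patch)
            variance = sum((p - mean) ** 2 for p in patch) / len(patch)
            std = math.sqrt(variance + eps)  # eps guards against flat patches
            out_row.append(sum(w * (p - mean) / std for w, p in zip(weights, patch)))
        output.append(out_row)
    return output


image = [[0.0, 1.0, 2.0, 3.0],
         [4.0, 6.0, 8.0, 1.0],
         [2.0, 9.0, 5.0, 7.0],
         [3.0, 0.0, 4.0, 8.0]]
edge_kernel = [[-1.0, 0.0, 1.0],
               [-2.0, 0.0, 2.0],
               [-1.0, 0.0, 1.0]]

# Simulate a global contrast and brightness change: 3x gain plus an offset of 10.
adjusted = [[3.0 * value + 10.0 for value in row] for row in image]

response = patch_normalized_conv2d(image, edge_kernel)
adjusted_response = patch_normalized_conv2d(adjusted, edge_kernel)
```

Under this assumed normalization, `response` and `adjusted_response` agree up to the small effect of `eps`, since subtracting the patch mean cancels the offset and dividing by the patch standard deviation cancels the gain. An ordinary convolution with the same kernel would instead scale with the gain.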

Dependencies

Our primary dependencies are simulated and real fluoroscopic images of the hip with anatomical landmarks. Fortunately, both are already resolved. The simulated X-ray data, which Cong Gao uses on an ongoing basis, has been transferred to us. The real X-ray data requires some formatting, for which Robb Grupp is our ongoing contact. Additionally, we rely heavily on advanced computational resources for experimentation and ablation studies of any proposed method. The MARCC compute cluster is a reliable high-compute system with multiple redundancies for high-capacity data storage. As an alternative, we have guaranteed access to two personal workstations with high-speed SSD primary drives and high-capacity HDD backup data drives. All code, documentation, and statistical results are version-controlled and backed up using GitHub.

Recently, based on prior work, we realized it might be of academic interest to evaluate how well our method generalizes to images which are occluded by surgical tools in a previously unseen manner. Although this is not core to our aim of improving sim-to-real generalization, the effort to obtain real images with surgical tool occlusions is ongoing.

Dependency                                         Solution                                                     Alternative                              Status
-------------------------------------------------- ------------------------------------------------------------ ---------------------------------------- ------------------------
Anatomical Landmark Detection Software             `Generalizing_Pelvis_Landmark_Detection` Repository Access                                            Solved
DeepDRR Dataset of Simulated Fluoroscopic Images   Transfer from Cong Gao                                       NA                                       On Personal Workstation
Computational Resources (GPU)                      MARCC Cluster Access                                         Personal Workstations (3x total GPUs)    Allocation Granted
Real X-ray Images for Testing                      [Robb Grupp](mailto:grupp@jhu.edu)                           NA                                       On BIGSS Shared Drive
Real X-ray Images with Occlusions (new)            [Authors of the cited work](mailto:unberath@jhu.edu)                                                  In Progress
Efficient PNC PyTorch Implementation               [Xingtong Liu](mailto:XingtongLiu@jhu.edu)                   NA                                       Solved
-------------------------------------------------- ------------------------------------------------------------ ---------------------------------------- ------------------------

Milestones and Status

Milestone                                    Date    Status
------------------------------------------- ------- --------
Obtain simulated X-ray data from Cong Gao    02/15    Done
Obtain real X-ray data from Robb Grupp       02/11    Done
Finalize simulation training pipeline        03/01  
Finalize real X-ray validation pipeline      03/07  
Finalize DNN architecture/algorithm          03/21  
Finish ablation study                        04/14  
Finish statistical analysis                  04/21  
Presentation                                 05/05  
Final report                                 05/15  
Academic publication                          TBD   
------------------------------------------- ------- --------


Reports and presentations

Project Bibliography

[1] H. Roth et al., “A new 2.5 D representation for lymph node detection in CT,” The Cancer Imaging Archive, 2015.

[2] A. Khan, A. Sohail, U. Zahoora, and A. S. Qureshi, “A Survey of the Recent Architectures of Deep Convolutional Neural Networks,” arXiv:1901.06032 [cs], Feb. 2020.

[3] R. Grupp et al., “Automatic Annotation of Hip Anatomy in Fluoroscopy for Robust and Efficient 2D/3D Registration,” arXiv:1911.07042 [cs, eess], Nov. 2019.

[4] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” arXiv:1502.03167 [cs], Mar. 2015.

[5] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

[6] M. Unberath et al., “Enabling machine learning in X-ray-based procedures via realistic simulation of image formation,” Int J CARS, vol. 14, no. 9, pp. 1517–1528, Sep. 2019, doi: 10.1007/s11548-019-02011-2.

[7] Y. Wu and K. He, “Group Normalization,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19.

[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–1105.

[9] D. Ulyanov, A. Vedaldi, and V. Lempitsky, “Instance Normalization: The Missing Ingredient for Fast Stylization,” arXiv:1607.08022 [cs], Nov. 2017.

[10] J. L. Ba, J. R. Kiros, and G. E. Hinton, “Layer Normalization,” arXiv:1607.06450 [cs, stat], Jul. 2016.

[11] B. Bier et al., “Learning to detect anatomical landmarks of the pelvis in X-rays from arbitrary views,” Int J CARS, vol. 14, no. 9, pp. 1463–1473, Sep. 2019, doi: 10.1007/s11548-019-01975-5.

[12] A. Malik and L. D. Dorr, “The Science of Minimally Invasive Total Hip Arthroplasty,” Clinical Orthopaedics and Related Research®, vol. 463, pp. 74–84, Oct. 2007, doi: 10.1097/BLO.0b013e3181468766.

[13] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, Cham, 2015, pp. 234–241, doi: 10.1007/978-3-319-24574-4_28.

[14] M. Woerner et al., “Visual intraoperative estimation of cup and stem position is not reliable in minimally invasive hip arthroplasty,” Acta Orthopaedica, vol. 87, no. 3, pp. 225–230, May 2016, doi: 10.3109/17453674.2015.1137182.

[15] B. Bier et al., “X-ray-transform Invariant Anatomical Landmark Detection for Pelvic Trauma Surgery,” arXiv:1803.08608 [cs], Mar. 2018.

Other Resources and Project Files
