====== Surgical Phase Recognition using Deep Learning ======

====== Summary ======

Surgical phase recognition plays a crucial role in the era of digitized surgery. Deep learning (DL) solutions have seen great success in endoscopic surgeries. Currently, no prior work has investigated their application in skull-base surgery (cortical mastoidectomy). In this project, we will benchmark existing DL solutions and create an innovative DL phase segmentation algorithm for skull-base surgery.

  * **Students:** Xucheng Ma, Xiaorui Zhang, Wenkai Luo
  * **Mentor(s):** Max Li, Danielle Trakimas, Dr. Francis Creighton, Prof. Mathias Unberath, Prof. Russell Taylor

{{:courses:456:2022:projects:456-2022-06:intro.png?400|}}{{:courses:456:2022:projects:456-2022-06:intro2.png?400|}}

====== Background, Specific Aims, and Significance ======

Purely vision-based recognition has proven successful in endoscopic surgeries. Spatial and temporal features have been shown to be crucial for the surgical phase segmentation task, and many DL networks have been proposed to extract those features and segment phases automatically.

Automatic surgical phase recognition has numerous potential medical applications, such as automatic indexing of surgical video databases and real-time optimization of operating room scheduling. It is also a foundation of intelligent context-aware systems, which facilitate surgery monitoring, surgical protocol extraction, and decision support.
====== Deliverables ======

  * **Minimum: Benchmarking Existing DL Solutions** (Expected by April 5th)
    - New dataset from cortical mastoidectomy videos
    - Benchmark three existing methods that are designed for a similar task (cholecystectomy).
    - A well-documented benchmarking code base that can be easily reused to test any model on our dataset.
  * **Expected: Design a DL Solution for Skull-base Surgery** (Expected by End of Spring Semester 22)
    - Analysis of the experiment results to identify the weaknesses of the benchmarked models.
    - Design a new model that achieves 85% accuracy and 80% recall and precision.
  * **Maximum:** (Expected by End of Summer 22)
    - Conference paper

====== Technical Approach ======

Since surgical phase segmentation is a sequential problem rather than a per-frame classification, the proposed deep neural network needs to extract spatiotemporal features.

  * **Spatial feature extraction:**
    * One of the essential parts of the architecture is the spatial feature extractor, which extracts features from the frames and converts them into an abstract representation. Pre-trained convolutional neural network (CNN) models, applied through transfer learning, have proven to be state-of-the-art on many computer vision tasks. Furthermore, fine-tuning the pre-trained model on the mastoidectomy dataset lets it extract more task-specific features.
  * **Spatiotemporal model:**
    * The most crucial part of the architecture is the spatiotemporal model, which captures temporal patterns from the per-frame spatial features extracted from the video. These temporal patterns are essential for a correct prediction because they provide clues about changes in the surgical environment and about instrument motion. Recurrent neural networks such as LSTM, temporal convolutional networks (TCN), and transformers are promising architectures for capturing temporal patterns. However, some research argues that LSTM can only capture short-term information, while long-term context might be beneficial for accurate segmentation. All of them will be implemented on the mastoidectomy dataset in our project.
  * **High-level classifier:**
    * A fully connected neural network (FCNN) will then be trained with the ground truth to make the phase decision based on the extracted spatiotemporal features, since such networks can, in principle, approximate a wide class of functions. One crucial concern is over-fitting, which can be mitigated by training techniques such as dropout and batch normalization.

====== Dependencies ======

^ Main Dependencies ^ Sub Dependencies ^ Contact ^ Expected Date ^ Status ^ Alternative solution ^
| Dataset | Data Generation | Dr. Danielle Trakimas | 04/01 | Complete | N/A |
| ::: | Annotation Protocol | Dr. Danielle Trakimas | 02/18 | Complete | N/A |
| ::: | Data Annotation | Dr. Danielle Trakimas | 03/17 | Ongoing | N/A |
| ::: | IRB Training | Dr. Danielle Trakimas | 02/11 | Complete | N/A |
| ::: | IRB Amendment | Dr. Danielle Trakimas | 02/25 | Complete | Use the safe desktop to preprocess the videos; OneDrive streaming is the fallback if the IRB amendment fails |
| Computational Resources | GPU | Max Li | 02/18 | Complete | Use online GPU resources such as Amazon cloud or Colab (requires budget from mentors) |
| ::: | Server Remote Access | Anton Deguet | 02/18 | Complete | Set up the computer in a physically accessible location and use it directly to finish the project |
| Existing Framework & Public Dataset | Framework | Max Li | 02/11 | Complete | Implement and reproduce the frameworks from the papers ourselves using PyTorch |
| ::: | Laparoscopic Public Dataset (Cholec80) | Max Li | 02/11 | Complete | Find another available public dataset |
| Clinical Advice | Clinical Advice | Dr. Danielle Trakimas | / | Ongoing | Find another expert to provide clinical advice |

====== Milestones and Status ======

  - Milestone name: Proposal and Plan
    * Planned Date: 02/10
    * Expected Date: 02/10
    * Status: 100%
  - Milestone name: Sample Dataset
    * Planned Date: 02/20
    * Expected Date: 02/20
    * Status: 100%
  - Milestone name: Fully Annotated Dataset
    * Planned Date: 03/17
    * Expected Date: 04/17
    * Status: 12/15 (80%)
  - Milestone name: Minimum Deliverables
    * Planned Date: 03/27
    * Expected Date: 03/27
    * Status: 100%
  - Milestone name: Initial Network Design
    * Planned Date: 04/22
    * Expected Date: 04/22
    * Status: 30%
  - Milestone name: Expected Deliverables
    * Planned Date: 04/29
    * Expected Date: 04/29
    * Status: 60%
  - Milestone name: Final Presentation
    * Planned Date: 05/02
    * Expected Date: 05/02
    * Status: 0%

====== Reports and presentations ======

  * Project Plan
    * {{:courses:456:2022:projects:456-2022-05:proposal_plan_presentation.pdf| Project plan presentation}}
    * {{:courses:456:2022:projects:456-2022-05:proposal_plan.pdf|Project plan proposal}}
  * Project Background Reading
    * See Bibliography below for links.
  * Project Checkpoint
    * {{:courses:456:2022:projects:456-2022-05:Checkpoint_Presentation_April_12th.pdf| Project checkpoint presentation}}
  * Paper Seminar Presentations
    * {{ :courses:456:2022:projects:456-2022-05:seminar_presentation_march_8th.pdf | Seminar presentation}}
    * {{:courses:456:2022:projects:456-2022-05:critical_review.pdf|Critical Review}}
    * Papers:
      * {{ :courses:456:2022:projects:456-2022-05:sv-rcnet.pdf |SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network}}
      * {{ :courses:456:2022:projects:456-2022-05:tecno.pdf | TeCNO: Surgical Phase Recognition with Multi-Stage Temporal Convolutional Networks}}
      * {{ :courses:456:2022:projects:456-2022-05:trans-svnet.pdf | Trans-SVNet: Accurate Phase Recognition from Surgical Videos via Hybrid Embedding Aggregation Transformer}}
  * Project Final Presentation
    * {{:courses:456:2022:projects:456-2022-01:final_report.pdf|Final poster}}
  * Project Final Report
    * [[https://livejohnshopkins.sharepoint.com/:b:/s/SurgicalPhaseSegmentation/EZqd8l8uuslJp4PzzWDkfU0BNffgqX5omtBBLvJZKLuQcg?e=Be04yZ|Final Report]]

====== Project Bibliography ======

  - Andru Putra Twinanda, Sherif Shehata, Didier Mutter, Jacques Marescaux, Michel de Mathelin, and Nicolas Padoy. EndoNet: A deep architecture for recognition tasks on laparoscopic videos. CoRR, abs/1602.03012, 2016.
  - Xiaojie Gao, Yueming Jin, Yong-Hao Long, Qi Dou, and Pheng-Ann Heng. Trans-SVNet: Accurate phase recognition from surgical videos via hybrid embedding aggregation transformer. CoRR, abs/2103.09712, 2021.
  - Carly Garrow, Karl-Friedrich Kowalewski, Linhong Li, Martin Wagner, Mona Schmidt, Sandy Engelhardt, Daniel Hashimoto, Hannes Kenngott, Sebastian Bodenstedt, Stefanie Speidel, Beat Müller, and Felix Nickel. Machine learning for surgical phase recognition: a systematic review. Annals of Surgery, Publish Ahead of Print, 11 2020.
  - Henry Lin, Izhak Shafran, David Yuh, and Gregory Hager. Towards automatic skill evaluation: Detection and segmentation of robot-assisted surgical motions. Computer Aided Surgery: official journal of the International Society for Computer Aided Surgery, 11:220–30, 10 2006.
  - Tobias Blum, Hubertus Feussner, and Nassir Navab. Modeling and segmentation of surgical workflow from laparoscopic video. volume 13, pages 400–7, 09 2010.
  - Joonmyeong Choi, Sungman Cho, Jong Chung, and Namkug Kim. Video recognition of simple mastoidectomy using convolutional neural nets: Detection and segmentation of surgical tools and anatomic regions. Computer Methods and Programs in Biomedicine, 208:106251, 06 2021.
  - Colin Lea, Joon Hyuck Choi, Austin Reiter, and G Hager. Surgical phase recognition: from instrumented ORs to hospitals around the world.
  - Manish Sahu, Angelika Szengel, Anirban Mukhopadhyay, and Stefan Zachow. Surgical phase recognition by learning phase transitions. Current Directions in Biomedical Engineering, 6(1):20200037, 2020.
  - Colin Lea, Austin Reiter, René Vidal, and Gregory D. Hager. Segmental spatiotemporal CNNs for fine-grained action segmentation. In European Conference on Computer Vision, pages 36–52. Springer, Cham, 2016.
  - Yueming Jin, Yonghao Long, Cheng Chen, Zixu Zhao, Qi Dou, and Pheng-Ann Heng. Temporal memory relation network for workflow recognition from surgical video. IEEE Transactions on Medical Imaging, 40(7):1911–1923, 2021.

====== Other Resources and Project Files ======

  * [[https://livejohnshopkins.sharepoint.com/:x:/s/SurgicalPhaseSegmentation/EciPWczCjnBElMe6pxA0o9sBbHTNHlemB4Fh1KVxRrj_cQ?e=IIuLc2|Gantt Chart]]
  * [[https://github.com/mxchenggggg/phase_detection|GitHub Repository]]
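
The accuracy, recall, and precision targets listed under Deliverables can be checked against per-frame phase predictions. The following is a minimal sketch, not the project's evaluation code: it assumes the metrics are computed per frame and that precision and recall are macro-averaged over phases (this page does not state the averaging scheme). The function name ''phase_metrics'' and the example labels are hypothetical.

```python
from collections import Counter


def phase_metrics(y_true, y_pred):
    """Frame-wise accuracy plus macro-averaged precision and recall.

    Assumption: each element is the phase label of one video frame, and
    precision/recall are averaged with equal weight over every phase that
    appears in either the ground truth or the predictions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # frame assigned to the correct phase
        else:
            fp[pred] += 1           # predicted phase gets a false positive
            fn[truth] += 1          # true phase gets a false negative

    phases = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for phase in phases:
        predicted = tp[phase] + fp[phase]
        actual = tp[phase] + fn[phase]
        # A phase that is never predicted (or never occurs) scores 0.0.
        precisions.append(tp[phase] / predicted if predicted else 0.0)
        recalls.append(tp[phase] / actual if actual else 0.0)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / len(phases),
        "recall": sum(recalls) / len(phases),
    }


# Hypothetical example: 8 frames, 3 phases, 2 misclassified frames.
truth = [0, 0, 0, 1, 1, 1, 2, 2]
preds = [0, 0, 1, 1, 1, 1, 2, 0]
metrics = phase_metrics(truth, preds)
print(metrics)
```

Macro averaging weights short phases as heavily as long ones, which matters when phase durations are unbalanced; a frame-weighted average would be an equally reasonable choice and would give different numbers.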