Publication
SurgeoNet: Realtime 3D Pose Estimation of Articulated Surgical Instruments from Stereo Images using a Synthetically-trained Network
Ahmed Tawfik Aboukhadra; Nadia Robertini; Jameel Malik; Ahmed Elhayek; Gerd Reis; Didier Stricker
In: Pattern Recognition - 46th DAGM German Conference, DAGM GCPR 2024, Proceedings. German Conference on Pattern Recognition (GCPR-2024), September 10-13, Munich, Germany, Lecture Notes in Computer Science, Vol. 15297, Springer Nature, 2024.
Abstract
Surgery monitoring in Mixed Reality (MR) environments has recently received substantial attention due to its importance for image-based decision making, skill assessment, and robot-assisted surgery. Tracking hands and articulated surgical instruments is crucial for the success of these applications. Due to the lack of annotated datasets and the complexity of the task, only a few works have addressed this problem. In this work, we present SurgeoNet, a real-time neural network pipeline that accurately detects and tracks articulated surgical instruments from a stereo VR view. Our multi-stage approach is inspired by state-of-the-art neural network architectures such as YOLO and Transformers. We demonstrate the generalization capabilities of SurgeoNet in challenging real-world scenarios, achieved solely through training on synthetic data. The approach can be easily extended to any new set of articulated surgical instruments. SurgeoNet's code and data are publicly available.
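The abstract describes recovering 3D instrument pose from a stereo view. The paper's actual pipeline is learned, but the underlying stereo geometry can be illustrated with classic linear (DLT) triangulation: given the two camera projection matrices and a matched 2D keypoint in the left and right images, the 3D point follows from a small least-squares problem. The sketch below is illustrative only, assuming a simple calibrated stereo rig; the camera intrinsics, baseline, and function names are hypothetical and not taken from SurgeoNet.

```python
import numpy as np

def triangulate_point(P_l, P_r, x_l, x_r):
    """Linear (DLT) triangulation of one stereo correspondence.

    P_l, P_r : (3, 4) projection matrices of the left/right cameras.
    x_l, x_r : (u, v) pixel coordinates of the same keypoint in each view.
    Returns the 3D point in the left-camera frame.
    """
    # Each observation contributes two linear constraints on the
    # homogeneous 3D point X: u * (P[2] @ X) = P[0] @ X, etc.
    A = np.stack([
        x_l[0] * P_l[2] - P_l[0],
        x_l[1] * P_l[2] - P_l[1],
        x_r[0] * P_r[2] - P_r[0],
        x_r[1] * P_r[2] - P_r[1],
    ])
    # The solution is the right-singular vector for the smallest
    # singular value of A (least-squares null vector).
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Hypothetical calibrated rig: shared intrinsics K, right camera
# offset by a 65 mm baseline along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
P_l = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_r = K @ np.hstack([np.eye(3), np.array([[-0.065], [0.0], [0.0]])])

# Synthetic ground-truth point, projected into both views.
X_gt = np.array([0.2, 0.1, 2.0])
xh_l = P_l @ np.append(X_gt, 1.0)
xh_r = P_r @ np.append(X_gt, 1.0)
x_l, x_r = xh_l[:2] / xh_l[2], xh_r[:2] / xh_r[2]

X_est = triangulate_point(P_l, P_r, x_l, x_r)
```

In a learned pipeline, the 2D keypoints fed into such a step would come from per-view network predictions rather than synthetic projections, but the geometric relationship between stereo observations and 3D pose is the same.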
