

On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

Ho Minh Duy Nguyen; Tan Pham; Nghiem Tuong Diep; Nghi Phan; Quang Pham; Vinh Tong; Binh T. Nguyen; Ngan Hoang Le; Nhat Ho; Pengtao Xie; Daniel Sonntag; Mathias Niepert
In: The Thirty-Seventh Annual Conference on Neural Information Processing Systems (NeurIPS 2023). Neural Information Processing Systems (NeurIPS), Workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models, December 10-16, Advances in Neural Information Processing Systems, 12/2023.


Constructing a robust model that generalizes effectively to test samples under distribution shifts remains a significant challenge in medical imaging. Foundation models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. They showcase impressive learning abilities across different tasks while requiring only a limited number of annotated samples. While numerous techniques have focused on developing better fine-tuning strategies to adapt these models to specific domains, we instead examine their robustness to domain shifts in the medical image segmentation task. To this end, we compare the generalization performance on unseen domains of various pre-trained models after fine-tuning them on the same in-distribution dataset and show that foundation-based models enjoy better robustness than other architectures. Building on this, we develop a new Bayesian uncertainty estimation for frozen models and use it as an indicator to characterize the model’s performance on out-of-distribution (OOD) data, which proves particularly beneficial for real-world applications. Our experiments not only reveal the limitations of current indicators such as accuracy-on-the-line and agreement-on-the-line, which are commonly used in natural image applications, but also emphasize the promise of the introduced Bayesian uncertainty. Specifically, predictions with lower uncertainty usually tend to correspond to higher OOD performance.
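
The idea of using uncertainty as an indicator of OOD performance can be illustrated with a minimal sketch. This is not the Bayesian estimator introduced in the paper, whose details are not given in the abstract. It assumes a generic stand-in: the entropy of class probabilities averaged over several stochastic forward passes, followed by a rank correlation between per-model uncertainty and an OOD score such as Dice. The function names `mean_predictive_entropy` and `spearman_rank_correlation`, and the array layout, are hypothetical choices made for illustration.

```python
import numpy as np


def mean_predictive_entropy(prob_samples):
    """Average per-pixel entropy of the sample-averaged prediction.

    prob_samples: array of shape (S, N, C) holding class probabilities
    from S stochastic forward passes over N pixels with C classes.
    (Assumed layout; a generic uncertainty proxy, not the paper's estimator.)
    """
    prob_samples = np.asarray(prob_samples, dtype=float)
    mean_probs = prob_samples.mean(axis=0)  # (N, C): average over passes
    eps = 1e-12  # avoids log(0) for fully confident predictions
    entropy = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)  # (N,)
    return float(entropy.mean())


def spearman_rank_correlation(x, y):
    """Spearman rank correlation, assuming no tied values in x or y."""
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rank_x, rank_y)[0, 1])
```

Under these assumptions, one would compute `mean_predictive_entropy` for each fine-tuned model on an unlabeled OOD set and pass the resulting scores, together with the models' OOD segmentation scores, to `spearman_rank_correlation`. A strongly negative value would be consistent with the trend described above, where lower uncertainty tends to accompany higher OOD performance.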