Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick (2023)
The Segment Anything project introduces a foundation model for image segmentation built from three interconnected components: a promptable segmentation task, the Segment Anything Model (SAM), and a data engine used to collect SA-1B, a large-scale segmentation dataset. The promptable task requires SAM to return a valid mask for prompts such as points and boxes, which both speeds annotation inside the data engine and enables zero-shot transfer to new tasks and image distributions. SA-1B contains 1.1 billion masks gathered from 11 million licensed, privacy-respecting images, far exceeding existing segmentation datasets in both scale and diversity. Human evaluations and zero-shot experiments show that SAM produces high-quality masks across a range of tasks, including edge detection, object proposal generation, and text-to-mask segmentation. The model nonetheless has limitations, including difficulty resolving ambiguous prompts and segmenting fine structures, which calls for careful consideration before deployment in real-world settings.
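To make the promptable-task idea concrete, the sketch below shows how a single point prompt could drive SAM through the public `segment-anything` package from facebookresearch. The image path is an illustrative placeholder; `sam_vit_h_4b8939.pth` is the released ViT-H checkpoint name, assumed to be downloaded locally. Setting `multimask_output=True` requests several candidate masks so that an ambiguous prompt (part vs. whole object) can be resolved by the model's own scores.

```python
# Minimal sketch of point-prompted segmentation with SAM, assuming the
# segment-anything package and a local checkpoint; paths are illustrative.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a pretrained SAM backbone (ViT-H variant) from a local checkpoint.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an HxWx3 uint8 RGB array; the image is embedded once,
# after which any number of prompts can be decoded cheaply.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# A single foreground click (label 1) serves as the prompt. One point is
# often ambiguous, so request multiple candidate masks and keep the one
# SAM scores highest.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # boolean HxW array
```

The one-time image embedding followed by cheap per-prompt decoding is what makes the interactive annotation loop in the data engine practical.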
This paper employs the following methods: a promptable segmentation task, the Segment Anything Model (SAM), and a model-in-the-loop data engine for large-scale mask collection.
The following datasets were used in this research: SA-1B, comprising 1.1 billion masks on 11 million licensed, privacy-respecting images.
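SA-1B ships its masks per image as JSON annotations with COCO run-length-encoded (RLE) segmentations, so they can be decoded with pycocotools. The sketch below assumes that layout; the file name is an illustrative placeholder.

```python
# Sketch of decoding SA-1B masks, stored per image as JSON with COCO RLE
# segmentations; requires pycocotools. The path is an illustrative placeholder.
import json
from pycocotools import mask as mask_utils

with open("sa_1.json") as f:
    record = json.load(f)

masks = []
for ann in record["annotations"]:
    rle = ann["segmentation"]           # {"size": [H, W], "counts": "..."}
    if isinstance(rle["counts"], str):  # pycocotools expects byte-string counts
        rle["counts"] = rle["counts"].encode("utf-8")
    masks.append(mask_utils.decode(rle))  # HxW uint8 binary mask
print(f"decoded {len(masks)} masks")
```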
The authors identified the following limitations: difficulty resolving ambiguous prompts, degraded performance on fine structures, and the need for careful validation before applying the model in real-world settings.