ScreenSpot is an evaluation benchmark for GUI grounding, comprising over 1,200 instructions from various environments, including iOS, Android, macOS, Windows, and Web. Each data point includes annotated element types (Text or Icon/Widget). For more details and examples, please refer to our paper.
Each test sample includes an element-type label, which is either "icon" or "text".
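Because every sample carries an element-type label, a quick sanity check is to tally samples per type before evaluating. The snippet below is a minimal sketch, not the benchmark's official tooling: the field name `data_type`, the helper `count_by_element_type`, and the demo records are all assumptions made for illustration and may not match the released files.

```python
from collections import Counter

# The two element-type labels described above.
VALID_TYPES = {"icon", "text"}


def count_by_element_type(samples, key="data_type"):
    """Count samples per element type, rejecting unknown labels.

    `key` is a hypothetical field name; adjust it to the actual schema.
    """
    counts = Counter()
    for i, sample in enumerate(samples):
        label = sample[key]
        if label not in VALID_TYPES:
            raise ValueError(f"sample {i}: unexpected element type {label!r}")
        counts[label] += 1
    return counts


# Made-up records for illustration only, not real ScreenSpot data.
demo = [
    {"instruction": "open settings", "data_type": "icon"},
    {"instruction": "tap the Sign in label", "data_type": "text"},
    {"instruction": "close the window", "data_type": "icon"},
]
counts = count_by_element_type(demo)
print(dict(counts))
```

Raising on an unknown label, rather than silently counting it, makes schema mismatches visible early if the real field uses different values.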