Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam. Google Inc. (2017)
This paper presents MobileNets, a class of efficient models for mobile and embedded vision applications, built on a streamlined architecture that uses depthwise separable convolutions. The authors introduce two global hyperparameters, the width multiplier and the resolution multiplier, which let model builders trade off latency against accuracy to suit the constraints of their application. Extensive experiments show that MobileNets are more resource-efficient than other popular models while maintaining strong accuracy, particularly on the ImageNet classification task. The authors also demonstrate MobileNets' versatility across applications including object detection, fine-grained classification, and geo-localization. Overall, MobileNets are shown to deliver performance comparable to larger models at significantly lower computational cost and model size.
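To make the efficiency argument concrete, the following minimal sketch compares the multiply-add count of a standard convolution with that of a depthwise separable one, and shows how the two multipliers shrink the cost. It assumes the usual cost accounting for a stride-1 convolution with a square kernel and a square output feature map. The function names, the parameter names (alpha for the width multiplier, rho for the resolution multiplier), and the example layer sizes are illustrative choices, not taken from the summary above.

```python
def standard_conv_mult_adds(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a standard convolution (assumed stride 1, square kernel and map)."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_mult_adds(kernel, in_ch, out_ch, fmap, alpha=1.0, rho=1.0):
    """Multiply-adds for a depthwise separable convolution.

    alpha (width multiplier) thins the input and output channels;
    rho (resolution multiplier) shrinks the feature map side length.
    Both names are illustrative assumptions.
    """
    in_ch = int(alpha * in_ch)
    out_ch = int(alpha * out_ch)
    fmap = int(rho * fmap)
    # Depthwise step: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * in_ch * fmap * fmap
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = in_ch * out_ch * fmap * fmap
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
    full = standard_conv_mult_adds(3, 512, 512, 14)
    separable = separable_conv_mult_adds(3, 512, 512, 14)
    thinned = separable_conv_mult_adds(3, 512, 512, 14, alpha=0.5)
    print(f"standard:             {full:,}")
    print(f"separable:            {separable:,}")
    print(f"separable, alpha=0.5: {thinned:,}")
    print(f"separable / standard: {separable / full:.4f}")
```

For this hypothetical layer the separable form needs roughly 8.8 times fewer multiply-adds than the standard convolution, because the cost ratio reduces to 1/out_ch + 1/kernel². Halving the width multiplier cuts the dominant pointwise term by about a factor of four, which is why the two multipliers give a coarse but effective handle on the latency and accuracy trade-off.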
This paper employs the following methods:
The following datasets were used in this research:
The authors identified the following limitations: