Christian Szegedy (Google Inc.), Vincent Vanhoucke (Google Inc.), Sergey Ioffe (Google Inc.), Jonathon Shlens (Google Inc.), Zbigniew Wojna (University College London) (2015)
The paper "Rethinking the Inception Architecture for Computer Vision" by Christian Szegedy et al. discusses advances in convolutional neural networks, focusing on the Inception architecture. It argues that computational efficiency and a low parameter count matter alongside accuracy, particularly in constrained settings such as mobile vision. The authors propose several design principles for scaling up convolutional networks and describe ways to factorize convolutions into smaller or asymmetric ones to reduce computation. They benchmark their approach on the ILSVRC 2012 classification challenge validation set and report substantial gains over the prior state of the art: 21.2% top-1 error and 5.6% top-5 error for single-frame evaluation, using a network with fewer than 25 million parameters and a cost of about 5 billion multiply-adds per inference. They detail the Inception-v3 architecture, which builds upon Inception-v2 by incorporating factorized convolutions, grid size reduction techniques, and auxiliary classifiers, achieving high accuracy at modest computational cost. The paper also introduces label smoothing, a regularization technique that discourages over-confident predictions and further improves accuracy. Overall, the paper offers practical guidance on designing and optimizing deep architectures for computer vision and shows that low computational cost can coexist with high accuracy.
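The savings from factorizing convolutions can be checked with simple weight counting. The sketch below is illustrative only: the helper names and the channel width `C` are assumptions made for this example, biases are ignored, and input and output channel counts are taken to be equal. It compares one 5x5 convolution with two stacked 3x3 convolutions, and one 3x3 convolution with a 1x3 followed by a 3x1.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Weight count of one convolution layer, ignoring biases."""
    return kernel_h * kernel_w * channels_in * channels_out


def relative_saving(original, factorized):
    """Fraction of weights removed by using the factorized form."""
    return 1.0 - factorized / original


# Hypothetical channel width, chosen only for illustration.
C = 64

# One 5x5 convolution versus two stacked 3x3 convolutions,
# which cover the same 5x5 receptive field.
full_5x5 = conv_weights(5, 5, C, C)
two_3x3 = 2 * conv_weights(3, 3, C, C)

# One 3x3 convolution versus an asymmetric pair: 1x3 followed by 3x1.
full_3x3 = conv_weights(3, 3, C, C)
asymmetric = conv_weights(1, 3, C, C) + conv_weights(3, 1, C, C)

saving_5x5 = relative_saving(full_5x5, two_3x3)
saving_3x3 = relative_saving(full_3x3, asymmetric)

print(f"5x5 -> two 3x3:   {saving_5x5:.0%} fewer weights")
print(f"3x3 -> 1x3 + 3x1: {saving_3x3:.0%} fewer weights")
```

Under these assumptions the ratios are 18/25 and 6/9, so the factorized forms use 28% and about 33% fewer weights. When the spatial size of the feature map is unchanged, the multiply-add count per output position scales the same way.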
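Label smoothing replaces the one-hot target with a mixture of the one-hot distribution and a fixed prior over classes. The sketch below assumes a uniform prior, so the target for class k becomes (1 - epsilon) * [k == y] + epsilon / K. The function names and the four-class toy predictions are hypothetical and serve only to show the effect on the cross-entropy loss.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over all classes."""
    uniform_share = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == true_class else 0.0) + uniform_share
        for k in range(num_classes)
    ]


def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


# Toy setup (assumed for illustration): 4 classes, true class index 2.
targets = smooth_labels(true_class=2, num_classes=4, epsilon=0.1)

# Two hypothetical predictions: one extremely confident, one more moderate.
confident = [0.001, 0.001, 0.997, 0.001]
moderate = [0.03, 0.03, 0.91, 0.03]

loss_confident = cross_entropy(targets, confident)
loss_moderate = cross_entropy(targets, moderate)

print("smoothed targets:", targets)
print("loss (confident):", round(loss_confident, 4))
print("loss (moderate): ", round(loss_moderate, 4))
```

With smoothed targets, the extremely confident prediction incurs the higher loss, because it assigns almost no probability to the classes that now carry a small share of the target mass. This is the sense in which the technique regularizes the model's predictions.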
This paper employs the following methods: factorized convolutions, grid size reduction, auxiliary classifiers, and label smoothing.
The following datasets were used in this research: ILSVRC 2012 (classification challenge validation set).