Christian Szegedy (Google Inc), Wei Liu (University of North Carolina, Chapel Hill), Yangqing Jia (Google Inc), Pierre Sermanet (Google Inc), Scott Reed (University of Michigan), Dragomir Anguelov (Google Inc), Dumitru Erhan (Google Inc), Vincent Vanhoucke (Google Inc), Andrew Rabinovich (Google Inc) (2014)
The paper presents Inception, a deep convolutional neural network architecture, and GoogLeNet, the specific model built on it that achieved state-of-the-art results in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The architecture's central aim is better utilization of computing resources inside the network: a careful design increases both depth and width without exceeding the computational budget. Two key ideas make this possible. First, 1x1 convolutions reduce the channel dimension before the more expensive larger convolutions are applied. Second, each module processes its input at multiple scales in parallel and combines the results. The model performs strongly on both classification and detection, outperforming previous architectures while using fewer parameters. The paper also argues that efficient architectural choices matter for mobile and embedded environments, where accuracy must be balanced against computational cost. Overall, the results show that approximating an optimal sparse structure with readily available dense components yields competitive performance in object detection and image classification.
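As a rough illustration of why a 1x1 convolution used for dimensionality reduction saves parameters, the sketch below counts the weights of a 5x5 convolution applied directly and of the same convolution preceded by a 1x1 reduction. The function names and the channel counts (192 input, 16 reduced, 32 output) are assumptions chosen for illustration, not values taken from this summary, and biases are ignored.

```python
def conv_params(kernel_size, c_in, c_out):
    """Number of weights in a kernel_size x kernel_size convolution (biases ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def direct_5x5(c_in, c_out):
    """A 5x5 convolution applied straight to the c_in input channels."""
    return conv_params(5, c_in, c_out)


def reduced_5x5(c_in, c_mid, c_out):
    """A 1x1 convolution down to c_mid channels, followed by the 5x5 convolution."""
    return conv_params(1, c_in, c_mid) + conv_params(5, c_mid, c_out)


# Hypothetical channel counts, chosen only to make the arithmetic concrete.
c_in, c_mid, c_out = 192, 16, 32

direct = direct_5x5(c_in, c_out)            # 5*5*192*32
reduced = reduced_5x5(c_in, c_mid, c_out)   # 1*1*192*16 + 5*5*16*32

print(f"direct 5x5:        {direct} weights")
print(f"1x1 then 5x5:      {reduced} weights")
print(f"reduction factor:  {direct / reduced:.1f}x")
```

Under these assumed sizes the reduced path needs roughly a tenth of the weights of the direct path. Applying this saving in every parallel branch is what allows a module to grow wider and the network deeper within a fixed budget.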
This paper employs the following methods:
The following datasets were used in this research:
The authors identified the following limitations: