Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan and Andrew Zisserman, Visual Geometry Group, University of Oxford (2014)

Paper Information

arXiv ID

1409.1556

Venue

International Conference on Learning Representations

Domain

Computer vision

SOTA Claim

Yes

Reproducibility

8/10

Abstract

In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve the state-of-the-art results. Importantly, we have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

Summary

The authors study how the depth of a convolutional network (ConvNet) affects accuracy in large-scale image recognition, and report a significant improvement over prior configurations when depth is increased to 16-19 weight layers. These findings formed the basis of their ImageNet Challenge 2014 submission, which placed first in the localisation track and second in the classification track. Their two best-performing architectures, listed here as "Net-D" and "Net-E" (the paper's configurations D and E, with 16 and 19 weight layers respectively), are evaluated on image classification using the ILSVRC-2012 dataset. The learned representations also generalise well to other datasets, and both models were released publicly to support further research on deep visual representations. The paper describes the architectures, the training and evaluation procedures, and the implementation, and compares the design with earlier ConvNets.

Methods

This paper employs the following methods:

Convolutional Neural Networks

Models Used

Net-D
Net-E
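The "16-19 weight layers" in the abstract can be checked by counting layers in the two configurations. The sketch below is illustrative only: the layer lists are recalled from Table 1 of the paper rather than stated on this page, and the names CONFIGS, NUM_FC_LAYERS and count_weight_layers are hypothetical helpers, not part of any released code.

```python
# Channel widths of the 3x3 convolutional layers in each configuration.
# "M" marks a max-pooling layer, which has no weights.
# Assumption: these lists are recalled from Table 1 of the paper,
# not taken from this page.
CONFIGS = {
    "D": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
          512, 512, 512, "M", 512, 512, 512, "M"],
    "E": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
          512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}

# Assumption: each configuration ends with three fully connected layers.
NUM_FC_LAYERS = 3


def count_weight_layers(config):
    """Count layers with learnable weights: conv layers plus FC layers."""
    conv_layers = sum(1 for item in config if item != "M")
    return conv_layers + NUM_FC_LAYERS


if __name__ == "__main__":
    for name, config in CONFIGS.items():
        print(f"Net-{name}: {count_weight_layers(config)} weight layers")
```

Under these assumptions the count comes to 16 for Net-D and 19 for Net-E, which matches the depth range given in the abstract.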

Datasets

The following datasets were used in this research:

ImageNet

Evaluation Metrics

top-1 error
top-5 error
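Both metrics are instances of top-k error: the fraction of images whose true label is not among the k classes the model scores highest. A minimal sketch in plain Python follows; the function name top_k_error and the toy scores are made up for illustration and do not come from the paper.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not in the k top-scoring classes.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


if __name__ == "__main__":
    # Toy example: 3 samples, 3 classes (hypothetical values).
    scores = [[0.1, 0.7, 0.2],
              [0.5, 0.3, 0.2],
              [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print(top_k_error(scores, labels, k=1))
    print(top_k_error(scores, labels, k=2))
```

In the toy example only the first sample is classified correctly at k=1, while at k=2 the second sample's true label also falls within the top two, so the error drops from two thirds to one third. Top-5 error is the same computation with k=5 over the 1000 ImageNet classes.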

Results

Improved accuracy with deeper ConvNets on ILSVRC-2012
Achieved state-of-the-art results in ILSVRC classification and localisation tasks

Limitations

The authors identified the following limitations:

Not specified

Technical Requirements

Number of GPUs: 4
GPU Type: NVIDIA Titan Black

Keywords

deep learning
convolutional neural networks
ImageNet
large-scale recognition

External Resources

Funding: ERC grant VisRec no. 228180
References: 43
Influential Citations: 13764
