Feature Pyramid Networks for Object Detection

Tsung-Yi Lin Facebook AI Research (FAIR) Cornell University and Cornell Tech, Piotr Dollár Facebook AI Research (FAIR), Ross Girshick Facebook AI Research (FAIR), Kaiming He Facebook AI Research (FAIR), Bharath Hariharan Facebook AI Research (FAIR), Serge Belongie Cornell University and Cornell Tech (2016)

Paper Information

arXiv ID

1612.03144

Venue

Computer Vision and Pattern Recognition

Domain

Artificial Intelligence

SOTA Claim

Yes

Reproducibility

8/10

Contents

Abstract
Methods
Datasets
Results
Limitations
Related Work
External Resources

Abstract

Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But recent deep learning object detectors have avoided pyramid representations, in part because they are compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using FPN in a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.

Summary

This paper introduces Feature Pyramid Networks (FPN) for object detection, addressing the problem of recognizing objects at different scales. The architecture combines a top-down pathway with lateral connections to build a feature pyramid from the inherent multi-scale hierarchy of a deep convolutional network at marginal extra cost. Used as the feature extractor in a basic Faster R-CNN system, FPN achieves state-of-the-art single-model results on the COCO detection benchmark while running at 5 FPS on a GPU. The paper describes the architecture, evaluates it in several applications, and compares it with existing approaches.

Methods

This paper employs the following methods:

Feature Pyramid Networks
Top-down architecture
Lateral connections
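The interplay of the top-down pathway and the lateral connections can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour 2x upsampling, element-wise addition as the merge step, and a scalar multiply standing in for a learned lateral projection. All function names (upsample2x, lateral, add_maps, build_pyramid) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: single-channel maps, scalar "lateral" weights.
# A real network would use learned convolutions on multi-channel tensors.

def upsample2x(fmap):
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def lateral(fmap, weight):
    """Stand-in for a lateral projection: scale every value by one weight."""
    return [[weight * v for v in row] for row in fmap]


def add_maps(a, b):
    """Element-wise sum of two maps with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_pyramid(bottom_up, weights):
    """Merge bottom-up maps (ordered fine -> coarse, each half the previous size).

    Starts at the coarsest map and walks towards the finest one, adding the
    upsampled coarser result to the laterally projected finer map.
    Returns the merged maps in the same fine -> coarse order.
    """
    merged = [lateral(bottom_up[-1], weights[-1])]
    for fmap, w in zip(reversed(bottom_up[:-1]), reversed(weights[:-1])):
        top_down = upsample2x(merged[0])
        merged.insert(0, add_maps(lateral(fmap, w), top_down))
    return merged


# Toy bottom-up hierarchy: 4x4, 2x2 and 1x1 maps with constant values.
c_fine = [[1.0] * 4 for _ in range(4)]
c_mid = [[2.0] * 2 for _ in range(2)]
c_coarse = [[3.0]]
pyramid = build_pyramid([c_fine, c_mid, c_coarse], [1.0, 1.0, 1.0])
```

In this toy run every level of the output carries information from the coarsest map, which is the point of the top-down pathway: coarse, semantically strong features are propagated to the finer resolutions.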

Models Used

Faster R-CNN
RPN
DeepMask
SharpMask

Datasets

The following datasets were used in this research:

COCO

Evaluation Metrics

Average Recall (AR)
Average Precision (AP)

Results

Achieved state-of-the-art single-model results on the COCO detection benchmark
Improved Average Recall (AR) by 8.0 points for bounding box proposals
Increased COCO-style Average Precision (AP) by 2.3 points and PASCAL-style AP by 3.8 points over a strong baseline
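The difference between COCO-style and PASCAL-style AP comes down to how strictly a predicted box must overlap the ground truth. The sketch below is a simplified illustration, not an evaluation script. It assumes the usual conventions, which this page does not state: PASCAL-style AP uses a single IoU threshold of 0.5, and COCO-style AP averages over thresholds from 0.50 to 0.95 in steps of 0.05. It shows only the thresholding step, not the precision-recall integration, and the names iou and matches_at are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union for boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed threshold conventions (not taken from this page).
PASCAL_THRESHOLDS = [0.5]
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]  # 0.50 .. 0.95


def matches_at(thresholds, pred, gt):
    """For each threshold, report whether the prediction counts as a match."""
    overlap = iou(pred, gt)
    return [overlap >= t for t in thresholds]


# A prediction shifted 2 units to the right of a 10x10 ground-truth box.
gt_box = (0.0, 0.0, 10.0, 10.0)
pred_box = (2.0, 0.0, 12.0, 10.0)
pascal_hits = matches_at(PASCAL_THRESHOLDS, pred_box, gt_box)
coco_hits = matches_at(COCO_THRESHOLDS, pred_box, gt_box)
```

Here the prediction overlaps the ground truth with an IoU of 2/3, so it counts as correct under the single PASCAL-style threshold but under only some of the COCO-style thresholds. This is why COCO-style AP rewards tighter localization.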

Limitations

The authors identified the following limitations:

Not specified

Technical Requirements

Number of GPUs: 8
GPU Type: NVIDIA M40

Keywords

Feature Pyramid Networks
Object detection
Deep learning
Convolutional networks
Multi-scale recognition

External Resources

References: 42
Influential Citations: 2910
