Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks

Jierun Chen (HKUST), Shiu-Hong Kao (HKUST), Hao He (HKUST), Weipeng Zhuo (HKUST), Song Wen (Rutgers University), Chul-Ho Lee (Texas State University), S.-H. Gary Chan (HKUST) (2023)

Paper Information

arXiv ID

2303.03667

Venue

Computer Vision and Pattern Recognition

Domain

Computer vision

Code

Available

Reproducibility

7/10

Contents

Abstract
Methods
Datasets
Results
Limitations
Related Work
External Resources

Abstract

To design fast neural networks, many works have been focusing on reducing the number of floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does not necessarily lead to a similar level of reduction in latency. This mainly stems from inefficiently low floating-point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that such low FLOPS is mainly due to frequent memory access of the operators, especially the depthwise convolution. We hence propose a novel partial convolution (PConv) that extracts spatial features more efficiently, by cutting down redundant computation and memory access simultaneously. Building upon our PConv, we further propose FasterNet, a new family of neural networks, which attains substantially higher running speed than others on a wide range of devices, without compromising on accuracy for various vision tasks. For example, on ImageNet-1k, our tiny FasterNet-T0 is 2.8×, 3.3×, and 2.4× faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively, while being 2.9% more accurate. Our large FasterNet-L achieves impressive 83.5% top-1 accuracy, on par with the emerging Swin-B, while having 36% higher inference throughput on GPU, as well as saving 37% compute time on CPU. Code is available at https://github.com/JierunChen/FasterNet.
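The abstract's argument rests on the relation latency = FLOPs / FLOPS. The sketch below works through that relation with made-up numbers; the two operators and their values are illustrative assumptions, not measurements from the paper.

```python
def latency_seconds(flops: float, achieved_flops_per_sec: float) -> float:
    """Latency = FLOPs / FLOPS (work divided by achieved throughput)."""
    if achieved_flops_per_sec <= 0:
        raise ValueError("achieved FLOPS must be positive")
    return flops / achieved_flops_per_sec


# Hypothetical operators (assumed values): B needs 4x fewer FLOPs than A,
# but because it is memory-bound it sustains 8x lower FLOPS on the same device.
latency_a = latency_seconds(flops=4.0e9, achieved_flops_per_sec=2.0e12)
latency_b = latency_seconds(flops=1.0e9, achieved_flops_per_sec=0.25e12)

print(f"A: {latency_a * 1e3:.1f} ms, B: {latency_b * 1e3:.1f} ms")
```

With these assumed numbers, B takes twice as long as A even though it performs a quarter of the arithmetic, which is the effect the abstract attributes to operators with low FLOPS.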

Summary

This paper introduces FasterNet, a family of neural networks designed to reduce latency without sacrificing accuracy by raising the floating-point operations per second (FLOPS) that a network actually achieves. The authors show that reducing the number of floating-point operations (FLOPs) alone does not reduce latency proportionally, because operators with frequent memory access, depthwise convolution in particular, run at low FLOPS. To address this, they propose a new operator, Partial Convolution (PConv), which cuts redundant computation and memory access at the same time. FasterNet is built on PConv and runs substantially faster than existing models such as MobileViT-XXS on GPUs, CPUs, and ARM processors, without compromising accuracy. The paper reports experiments on classification, detection, and segmentation tasks; on ImageNet-1k, the large FasterNet-L reaches 83.5% top-1 accuracy. The authors also discuss limitations and future directions for enhancing neural network architectures.
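To make the claim about computation and memory access concrete, the following sketch compares rough cost estimates for a regular convolution, a depthwise convolution, and PConv on the same feature map. It assumes the usual counting convention of one multiply-accumulate per kernel tap, equal input and output channel counts, and that PConv convolves only c_p of the c channels; the layer sizes and the 1/4 channel ratio are assumptions chosen for illustration.

```python
def regular_conv_cost(h, w, k, c):
    """Return (FLOPs, memory accesses) of a k x k conv with c in/out channels."""
    flops = h * w * k * k * c * c
    memory = h * w * 2 * c + k * k * c * c  # read input + write output + weights
    return flops, memory


def depthwise_conv_cost(h, w, k, c):
    """Return (FLOPs, memory accesses) of a k x k depthwise conv on c channels."""
    flops = h * w * k * k * c
    memory = h * w * 2 * c + k * k * c
    return flops, memory


def partial_conv_cost(h, w, k, c_p):
    """PConv modelled as a regular conv restricted to c_p channels."""
    return regular_conv_cost(h, w, k, c_p)


# Assumed example layer: 56 x 56 feature map, 3 x 3 kernel, 64 channels,
# of which PConv touches one quarter (c_p = 16).
h, w, k, c = 56, 56, 3, 64
reg_flops, reg_mem = regular_conv_cost(h, w, k, c)
dw_flops, dw_mem = depthwise_conv_cost(h, w, k, c)
p_flops, p_mem = partial_conv_cost(h, w, k, c // 4)

for name, flops, mem in [("regular", reg_flops, reg_mem),
                         ("depthwise", dw_flops, dw_mem),
                         ("partial", p_flops, p_mem)]:
    print(f"{name:>9}: FLOPs={flops:>12,} memory={mem:>9,}")
```

Under these assumptions PConv needs 1/16 of the FLOPs of the regular convolution and roughly a quarter of its memory accesses. It performs more arithmetic than the depthwise convolution but touches far less memory, which is the trade-off the summary describes.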

Methods

This paper employs the following methods:

PConv
FasterNet
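As a rough illustration of the PConv operator listed above, the sketch below applies an ordinary convolution to the first c_p channels of a feature map and passes the remaining channels through unchanged. It is a minimal pure-Python sketch on nested lists, not the authors' implementation; the function names, the zero padding, and the stride of 1 are assumptions made for this example.

```python
def conv2d_same(x, weight):
    """Plain k x k convolution with zero padding and stride 1.

    x:      nested lists shaped [c_in][h][w]
    weight: nested lists shaped [c_out][c_in][k][k]
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(weight[0][0])
    pad = k // 2
    out = []
    for filt in weight:
        plane = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < h and 0 <= jj < w:
                                acc += filt[c][di][dj] * x[c][ii][jj]
                plane[i][j] = acc
        out.append(plane)
    return out


def partial_conv(x, weight, c_p):
    """Convolve the first c_p channels; leave the other channels untouched."""
    if not 0 < c_p <= len(x):
        raise ValueError("c_p must be between 1 and the number of channels")
    convolved = conv2d_same(x[:c_p], weight)  # weight shaped [c_p][c_p][k][k]
    return convolved + x[c_p:]                # concatenate along channels


# Toy input: 4 channels of 3 x 3 ones, with a 3 x 3 all-ones kernel on 1 channel.
x = [[[1.0] * 3 for _ in range(3)] for _ in range(4)]
box = [[[[1.0] * 3 for _ in range(3)]]]       # shape [1][1][3][3]
y = partial_conv(x, box, c_p=1)
```

In this toy case only channel 0 is filtered, so `y[0]` holds neighbour counts (9 at the centre, 4 at the corners) while `y[1:]` is identical to `x[1:]`.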

Models Used

FasterNet
MobileViT-XXS
ResNet50
Swin-B

Datasets

The following datasets were used in this research:

ImageNet-1k
COCO

Evaluation Metrics

Accuracy
Latency
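For readers who want to measure latency themselves, here is a minimal timing sketch using only the Python standard library. It is not the paper's benchmarking protocol: the warm-up count, the number of runs, the use of the median, and the `dummy_model` stand-in are all assumptions. On asynchronous accelerators such as GPUs, the device would also need to be synchronised before each clock reading.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=5, runs=20):
    """Median wall-clock time of fn() in milliseconds, after warm-up calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def throughput_per_second(latency_ms, batch_size):
    """Items processed per second given the latency of one batch."""
    return batch_size / (latency_ms / 1e3)


def dummy_model():
    """Hypothetical stand-in for a forward pass on one batch."""
    return sum(i * i for i in range(10_000))


latency = measure_latency_ms(dummy_model)
print(f"latency: {latency:.3f} ms")
print(f"throughput at batch size 32: {throughput_per_second(latency, 32):.0f} items/s")
```

The median is used here because it is less sensitive to occasional slow runs than the mean; other summaries are equally defensible.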

Results

FasterNet-T0 is 2.8×, 3.3×, and 2.4× faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively, with 2.9% higher accuracy.
FasterNet-L achieves 83.5% top-1 accuracy, on par with Swin-B, while delivering 36% higher inference throughput on GPU and saving 37% compute time on CPU.
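The results above mix three ways of stating speed: "N× faster", "higher throughput", and "saves compute time". The sketch below converts between them, assuming that "N× faster" and "N% higher throughput" both denote the ratio of baseline time to new time; that reading is an interpretation, not something the page states.

```python
def speedup_from_time_saving(saving: float) -> float:
    """Speedup factor implied by saving a fraction of the baseline time."""
    if not 0 <= saving < 1:
        raise ValueError("saving must be in [0, 1)")
    return 1.0 / (1.0 - saving)


def time_saving_from_speedup(speedup: float) -> float:
    """Fraction of baseline time saved by a given speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# FasterNet-L vs Swin-B, figures taken from the results above.
cpu_speedup = speedup_from_time_saving(0.37)   # 37% less compute time on CPU
gpu_saving = time_saving_from_speedup(1.36)    # 36% higher throughput on GPU

# FasterNet-T0 vs MobileViT-XXS on GPU, CPU, and ARM.
t0_savings = [time_saving_from_speedup(s) for s in (2.8, 3.3, 2.4)]

print(f"CPU speedup: {cpu_speedup:.2f}x")
print(f"GPU time saved per image: {gpu_saving:.1%}")
print("T0 time saved:", ", ".join(f"{s:.1%}" for s in t0_savings))
```

Under this reading, saving 37% of compute time corresponds to a speedup of about 1.59×, and 36% higher throughput corresponds to roughly 26% less time per image, so the GPU and CPU figures for FasterNet-L are not directly comparable as quoted.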

Limitations

The authors identified the following limitations:

Not specified

Technical Requirements

Number of GPUs: None specified
GPU Type: None specified

Keywords

FLOPS
latency
speed-up
neural network operators
partial convolution
FasterNet

External Resources

Funding: Not specified
References: 89
Influential Citations: 39
