| ResNet-50 | Benchmarking Neural Network Robustness to Common … | 76.70 | 2019-03-28 |
| FAN-L-Hybrid (IN-22k) | Understanding The Robustness in Vision Transforme… | 73.60 | 2022-04-26 |
| FAN-B-Hybrid (IN-22k) | Understanding The Robustness in Vision Transforme… | 70.50 | 2022-04-26 |
| Group-wise Inhibition (ResNet-50) | Group-wise Inhibition based Feature Regularizatio… | 69.60 | 2021-03-03 |
| ResNet-50 (PushPull-Conv) + PRIME | PushPull-Net: Inhibition-driven ResNet robust to … | 69.40 | 2024-08-07 |
| Stylized ImageNet (ResNet-50) | ImageNet-trained CNNs are biased towards texture;… | 69.30 | 2018-11-29 |
| FAN-L-Hybrid+STL | Fully Attentional Networks with Self-emerging Tok… | 69.20 | 2024-01-08 |
| FAN-L-Hybrid | Understanding The Robustness in Vision Transforme… | 67.70 | 2022-04-26 |
| AugMix (ResNet-50) | AugMix: A Simple Data Processing Method to Improv… | 65.30 | 2019-12-05 |
| APR-SP (ResNet-50) | Amplitude-Phase Recombination: Rethinking Robustn… | 65.00 | 2021-08-19 |
| DeepAugment (ResNet-50) | The Many Faces of Robustness: A Critical Analysis… | 60.40 | 2020-06-29 |
| PRIME + DeepAugment (ResNet-50) | PRIME: A few primitives can boost robustness to c… | 59.90 | 2021-12-27 |
| APR-SP + DeepAugment (ResNet-50) | Amplitude-Phase Recombination: Rethinking Robustn… | 57.50 | 2021-08-19 |
| RVT-Ti* | Towards Robust Vision Transformer | 57.00 | 2021-05-17 |
| ViT-B/16-SAM | When Vision Transformers Outperform ResNets witho… | 56.50 | 2021-06-03 |
| PRIME with JSD (ResNet-50) | PRIME: A few primitives can boost robustness to c… | 56.40 | 2021-12-27 |
| PRIME (ResNet-50) | PRIME: A few primitives can boost robustness to c… | 55.00 | 2021-12-27 |
| ResNet-152x2-SAM | When Vision Transformers Outperform ResNets witho… | 55.00 | 2021-06-03 |
| DINOv2 (ViT-S/14, frozen model, linear eval) | DINOv2: Learning Robust Visual Features without S… | 54.40 | 2023-04-14 |
| GFNet-S | Global Filter Networks for Image Classification | 53.80 | 2021-07-01 |
| RVT-S* | Towards Robust Vision Transformer | 49.40 | 2021-05-17 |
| Sequencer2D-L | Sequencer: Deep LSTM for Image Classification | 48.90 | 2022-05-04 |
| Mixer-B/8-SAM | When Vision Transformers Outperform ResNets witho… | 48.90 | 2021-06-03 |
| RVT-B* | Towards Robust Vision Transformer | 46.80 | 2021-05-17 |
| ConvFormer-B36 | MetaFormer Baselines for Vision | 46.30 | 2022-10-24 |
| DrViT | Discrete Representations Strengthen Vision Transf… | 46.22 | 2021-11-20 |
| DiscreteViT | Discrete Representations Strengthen Vision Transf… | 46.22 | 2021-11-20 |
| DINOv2 (ViT-B/14, frozen model, linear eval) | DINOv2: Learning Robust Visual Features without S… | 42.70 | 2023-04-14 |
| CAFormer-B36 | MetaFormer Baselines for Vision | 42.60 | 2022-10-24 |
| Pyramid Adversarial Training Improves ViT | Pyramid Adversarial Training Improves ViT Perform… | 41.42 | 2021-11-30 |
| GPaCo (ViT-L) | Generalized Parametric Contrastive Learning | 39.00 | 2022-09-26 |
| ConvNeXt-XL (Im21k) (augmentation overlap with ImageNet-C) | A ConvNet for the 2020s | 38.80 | 2022-01-10 |
| DiscreteViT (Im21k) | Discrete Representations Strengthen Vision Transf… | 38.74 | 2021-11-20 |
| VOLO-D5+HAT | Improving Vision Transformers by Revisiting High-… | 38.40 | 2022-04-03 |
| Pyramid Adversarial Training Improves ViT (Im21k) | Pyramid Adversarial Training Improves ViT Perform… | 36.80 | 2021-11-30 |
| ConvFormer-B36 (IN21K) | MetaFormer Baselines for Vision | 35.00 | 2022-10-24 |
| MAE (ViT-H) | Masked Autoencoders Are Scalable Vision Learners | 33.80 | 2021-11-11 |
| CAFormer-B36 (IN21K) | MetaFormer Baselines for Vision | 31.80 | 2022-10-24 |
| DINOv2 (ViT-L/14, frozen model, linear eval) | DINOv2: Learning Robust Visual Features without S… | 31.50 | 2023-04-14 |
| MAE+DAT (ViT-H) | Enhance the Visual Representation via Discrete Ad… | 31.40 | 2022-09-16 |
| CAFormer-B36 (IN21K, 384) | MetaFormer Baselines for Vision | 30.80 | 2022-10-24 |
| DINOv2 (ViT-g/14, frozen model, linear eval) | DINOv2: Learning Robust Visual Features without S… | 28.20 | 2023-04-14 |