| Model soups (BASIC-L) | Model soups: averaging weights of multiple fine-t… | 94.17 | 2022-03-10 |
| Model soups (ViT-G/14) | Model soups: averaging weights of multiple fine-t… | 92.67 | 2022-03-10 |
| µ2Net+ (ViT-L/16) | A Continual Development Methodology for Large-sca… | 84.53 | 2022-09-15 |
| CAR-FT (CLIP, ViT-L/14@336px) | Context-Aware Robust Fine-Tuning | 81.50 | 2022-11-29 |
| CAFormer-B36 (IN-21K, 384) | MetaFormer Baselines for Vision | 79.50 | 2022-10-24 |
| MAE (ViT-H, 448) | Masked Autoencoders Are Scalable Vision Learners | 76.70 | 2021-11-11 |
| FAN-Hybrid-L(IN-21K, 384) | Understanding The Robustness in Vision Transforme… | 74.50 | 2022-04-26 |
| ConvFormer-B36 (IN-21K, 384) | MetaFormer Baselines for Vision | 73.50 | 2022-10-24 |
| CAFormer-B36 (IN-21K) | MetaFormer Baselines for Vision | 69.40 | 2022-10-24 |
| ConvNeXt-XL (Im21k, 384) | A ConvNet for the 2020s | 69.30 | 2022-01-10 |
| MAE+DAT (ViT-H) | Enhance the Visual Representation via Discrete Ad… | 68.92 | 2022-09-16 |
| ConvFormer-B36 (IN-21K) | MetaFormer Baselines for Vision | 63.30 | 2022-10-24 |
| Pyramid Adversarial Training Improves ViT (Im21k) | Pyramid Adversarial Training Improves ViT Perform… | 62.44 | 2021-11-30 |
| CAFormer-B36 (384) | MetaFormer Baselines for Vision | 61.90 | 2022-10-24 |
| TransNeXt-Base (IN-1K supervised, 384) | TransNeXt: Robust Foveal Visual Perception for Vi… | 61.60 | 2023-11-28 |
| TransNeXt-Small (IN-1K supervised, 384) | TransNeXt: Robust Foveal Visual Perception for Vi… | 58.30 | 2023-11-28 |
| ConvFormer-B36 (384) | MetaFormer Baselines for Vision | 55.30 | 2022-10-24 |
| SEER (RegNet10B) | Vision Models Are More Robust And Fair When Pretr… | 52.70 | 2022-02-16 |
| TransNeXt-Base (IN-1K supervised, 224) | TransNeXt: Robust Foveal Visual Perception for Vi… | 50.60 | 2023-11-28 |
| CAFormer-B36 | MetaFormer Baselines for Vision | 48.50 | 2022-10-24 |
| TransNeXt-Small (IN-1K supervised, 224) | TransNeXt: Robust Foveal Visual Perception for Vi… | 47.10 | 2023-11-28 |
| FAN-L-Hybrid+STL | Fully Attentional Networks with Self-emerging Tok… | 46.10 | 2024-01-08 |
| ConvFormer-B36 | MetaFormer Baselines for Vision | 40.10 | 2022-10-24 |
| Pyramid Adversarial Training Improves ViT (384x384) | Pyramid Adversarial Training Improves ViT Perform… | 36.41 | 2021-11-30 |
| Sequencer2D-L | Sequencer: Deep LSTM for Image Classification | 35.50 | 2022-05-04 |
| Discrete Adversarial Distillation (ViT-B/224) | Distilling Out-of-Distribution Robustness from Vi… | 31.80 | 2023-11-02 |
| Diffusion Classifier | Your Diffusion Model is Secretly a Zero-Shot Clas… | 30.20 | 2023-03-28 |
| RVT-B* | Towards Robust Vision Transformer | 28.50 | 2021-05-17 |
| RVT-S* | Towards Robust Vision Transformer | 25.70 | 2021-05-17 |
| RVT-Ti* | Towards Robust Vision Transformer | 14.40 | 2021-05-17 |
| GFNet-S | Global Filter Networks for Image Classification | 14.30 | 2021-07-01 |
| CutMix+MoEx (ResNet-50) | On Feature Normalization and Data Augmentation | 8.40 | 2020-02-25 |
| Discrete Adversarial Distillation (ResNet-50) | Distilling Out-of-Distribution Robustness from Vi… | 7.70 | 2023-11-02 |
| CutMix (ResNet-50) | CutMix: Regularization Strategy to Train Strong C… | 7.30 | 2019-05-13 |
| Mixup (ResNet-50) | mixup: Beyond Empirical Risk Minimization | 6.60 | 2017-10-25 |
| Cutout (ResNet-50) | Improved Regularization of Convolutional Neural N… | 4.40 | 2017-08-15 |
| ResNet-50 (300 Epochs) | Deep Residual Learning for Image Recognition | 4.20 | 2015-12-10 |
| Stylized ImageNet (ResNet-50) | ImageNet-trained CNNs are biased towards texture;… | 2.30 | 2018-11-29 |
| ResNet-50 | Natural Adversarial Examples | 0.00 | 2019-07-16 |