| Model soups (BASIC-L) | Model soups: averaging weights of multiple fine-t… | 77.18 | 2022-03-10 |
| Model soups (ViT-G/14) | Model soups: averaging weights of multiple fine-t… | 74.24 | 2022-03-10 |
| CAR-FT (CLIP, ViT-L/14@336px) | Context-Aware Robust Fine-Tuning | 65.50 | 2022-11-29 |
| ConvNeXt-XL (Im21k, 384) | A ConvNet for the 2020s | 55.00 | 2022-01-10 |
| CAFormer-B36 (IN21K, 384) | MetaFormer Baselines for Vision | 54.50 | 2022-10-24 |
| LLE (ViT-H/14, MAE, Edge Aug) | A Whac-A-Mole Dilemma: Shortcuts Come in Multiple… | 53.39 | 2022-12-09 |
| ConvFormer-B36 (IN21K, 384) | MetaFormer Baselines for Vision | 52.90 | 2022-10-24 |
| CAFormer-B36 (IN21K) | MetaFormer Baselines for Vision | 52.80 | 2022-10-24 |
| ConvFormer-B36 (IN21K) | MetaFormer Baselines for Vision | 52.70 | 2022-10-24 |
| MAE (ViT-H, 448) | Masked Autoencoders Are Scalable Vision Learners | 50.90 | 2021-11-11 |
| MAE+DAT (ViT-H) | Enhance the Visual Representation via Discrete Ad… | 50.03 | 2022-09-16 |
| GPaCo (ViT-L) | Generalized Parametric Contrastive Learning | 48.30 | 2022-09-26 |
| Discrete Adversarial Distillation (ViT-B, 224) | Distilling Out-of-Distribution Robustness from Vi… | 46.10 | 2023-11-02 |
| Pyramid Adversarial Training Improves ViT (Im21k) | Pyramid Adversarial Training Improves ViT Perform… | 46.03 | 2021-11-30 |
| SEER (RegNet10B) | Vision Models Are More Robust And Fair When Pretr… | 45.60 | 2022-02-16 |
| DrViT | Discrete Representations Strengthen Vision Transf… | 44.72 | 2021-11-20 |
| CAFormer-B36 | MetaFormer Baselines for Vision | 42.50 | 2022-10-24 |
| Pyramid Adversarial Training Improves ViT | Pyramid Adversarial Training Improves ViT Perform… | 41.04 | 2021-11-30 |
| ConvFormer-B36 | MetaFormer Baselines for Vision | 39.50 | 2022-10-24 |
| Sequencer2D-L | Sequencer: Deep LSTM for Image Classification | 35.80 | 2022-05-04 |