CIFAR-100

Image Classification Benchmark

Performance Over Time

Showing 197 results | Metric: Percentage correct
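As a rough illustration of the metric, the sketch below computes "percentage correct" for a list of predictions, assuming the leaderboard figure denotes top-1 accuracy on the CIFAR-100 test set expressed as a percentage. The function name `percentage_correct` and the toy labels are hypothetical and are not taken from any listed paper or repository.

```python
def percentage_correct(predicted, actual):
    """Return the share of exactly matching labels as a percentage (2 decimals)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one label")
    # Count positions where the predicted class equals the true class.
    hits = sum(p == a for p, a in zip(predicted, actual))
    return round(100.0 * hits / len(actual), 2)


# Toy example with hypothetical class indices in the range 0..99.
preds = [3, 17, 42, 99, 5]
labels = [3, 17, 41, 99, 5]
print(percentage_correct(preds, labels))  # 4 of 5 match, so this prints 80.0
```

On the real benchmark the same ratio would be taken over all test images rather than five toy labels.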

Top Performing Models

Rank	Model	Paper	Percentage correct	Date	Code
1	efficient adaptive ensembling	Efficient Adaptive Ensembling for Image Classification	96.81	2022-06-15	-
2	EffNet-L2 (SAM)	Sharpness-Aware Minimization for Efficiently Improving Generalization	96.08	2020-10-03	davda54/sam google-research/sam moskomule/sam.pytorch
3	Swin-L + ML-Decoder	ML-Decoder: Scalable and Versatile Classification Head	95.10	2021-11-25	alibaba-miil/ml_decoder
4	µ2Net (ViT-L/16)	An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems	94.95	2022-05-25	google-research/google-research
5	ViT-B-16 (ImageNet-21K-P pretrain)	ImageNet-21K Pretraining for the Masses	94.20	2021-04-22	Alibaba-MIIL/ImageNet21K YutingLi0606/SURE encounter1997/fp-detr gregorbachmann/scaling_mlps MS-Mind/MS-Code-01
6	CvT-W24	CvT: Introducing Convolutions to Vision Transformers	94.09	2021-03-29	huggingface/transformers BR-IDL/PaddleViT microsoft/CvT
7	ViT-B/16 (PUGD)	Perturbated Gradients Updating within Unit Space for Deep Learning	93.95	2021-10-01	hanktseng131415go/pugd
8	Heinsen Routing + BEiT-large 16 224	An Algorithm for Routing Vectors in Sequences	93.80	2022-11-20	glassroom/heinsen_routing
9	BiT-L (ResNet)	Big Transfer (BiT): General Visual Representation Learning	93.51	2019-12-24	google-research/big_transfer sayakpaul/FunMatch-Distillation bethgelab/InDomainGeneralizationBenchmark
10	VIT-L/16 (Spinal FC, Background)	Reduction of Class Activation Uncertainty with Background Information	93.31	2023-05-05	dipuk0506/SpinalNet dipuk0506/uq

All Papers (197)

Efficient Adaptive Ensembling for Image Classification

2022

efficient adaptive ensembling

Sharpness-Aware Minimization for Efficiently Improving Generalization

2020

EffNet-L2 (SAM)

davda54/sam google-research/sam

ML-Decoder: Scalable and Versatile Classification Head

2021

Swin-L + ML-Decoder

alibaba-miil/ml_decoder

An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems

2022

µ2Net (ViT-L/16)

google-research/google-research

ImageNet-21K Pretraining for the Masses

2021

ViT-B-16 (ImageNet-21K-P pretrain)

Alibaba-MIIL/ImageNet21K YutingLi0606/SURE

CvT: Introducing Convolutions to Vision Transformers

2021

CvT-W24

huggingface/transformers BR-IDL/PaddleViT

Perturbated Gradients Updating within Unit Space for Deep Learning

2021

ViT-B/16 (PUGD)

hanktseng131415go/pugd

An Algorithm for Routing Vectors in Sequences

2022

Heinsen Routing + BEiT-large 16 224

glassroom/heinsen_routing

Big Transfer (BiT): General Visual Representation Learning

2019

BiT-L (ResNet)

google-research/big_transfer sayakpaul/FunMatch-Distillation

Reduction of Class Activation Uncertainty with Background Information

2023

VIT-L/16 (Spinal FC, Background)

dipuk0506/SpinalNet dipuk0506/uq

Going deeper with Image Transformers

2021

CaiT-M-36 U 224

rwightman/pytorch-image-models lucidrains/vit-pytorch

Three things everyone should know about Vision Transformers

2022

ViT-L (attn fine-tune)

rwightman/pytorch-image-models lucidrains/vit-pytorch

TResNet: High Performance GPU-Dedicated Architecture

2020

TResNet-L-V2

rwightman/pytorch-image-models mrT23/TResNet Alibaba-MIIL/TResNet

EfficientNetV2: Smaller Models and Faster Training

2021

EfficientNetV2-L

rwightman/pytorch-image-models pytorch/vision

EfficientNetV2: Smaller Models and Faster Training

2021

EfficientNetV2-M

rwightman/pytorch-image-models pytorch/vision

Big Transfer (BiT): General Visual Representation Learning

2019

BiT-M (ResNet)

google-research/big_transfer sayakpaul/FunMatch-Distillation

Incorporating Convolution Designs into Visual Transformers

2021

CeiT-S

rishikksh20/CeiT-pytorch coeusguo/ceit mindspore-courses/External-Attention-MindSpore

Incorporating Convolution Designs into Visual Transformers

2021

CeiT-S (384 finetune resolution)

rishikksh20/CeiT-pytorch coeusguo/ceit mindspore-courses/External-Attention-MindSpore

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

2019

EfficientNet-B7

ultralytics/yolov5 rwightman/pytorch-image-models

EfficientNetV2: Smaller Models and Faster Training

2021

EfficientNetV2-S

rwightman/pytorch-image-models pytorch/vision

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

2018

GPIPE

tensorflow/lingvo qubvel/efficientnet

Performance of Gaussian Mixture Model Classifiers on Embedded Feature Spaces

2024

DGMMC-S

cvmlmu/dgmmc

Transformer in Transformer

2021

TNT-B

rwightman/pytorch-image-models PaddlePaddle/PaddleClas

Training data-efficient image transformers & distillation through attention

2020

DeiT-B

huggingface/transformers rwightman/pytorch-image-models

Global Filter Networks for Image Classification

2021

GFNet-H-B

raoyongming/GFNet liuruiyang98/Jittor-MLP

Rethinking Recurrent Neural Networks and Other Improvements for Image Classification

2020

E2E-3M

leonlha/e2e-3m

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy

2022

Bamboo (ViT-B/16)

zhangyuanhan-ai/bamboo davidzhangyuanhan/bamboo

ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks

2021

PyramidNet-272 (ASAM)

davda54/sam borealisai/perturbed-forgetting

Sharpness-Aware Minimization for Efficiently Improving Generalization

2020

PyramidNet (SAM)

davda54/sam google-research/sam

Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition

2021

DVT (T2T-ViT-24)

blackfeather-wang/Dynamic-Vision-Transformer blackfeather-wang/dynamic-vision-transformer-mindspore

ResMLP: Feedforward networks for image classification with data-efficient training

2021

ResMLP-24

rwightman/pytorch-image-models xmu-xiaoma666/External-Attention-pytorch

Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

2020

PyramidNet-272, S=4

freeformrobotics/divide-and-co-training mzhaoshuai/Divide-and-Co-training

Incorporating Convolution Designs into Visual Transformers

2021

CeiT-T

rishikksh20/CeiT-pytorch coeusguo/ceit mindspore-courses/External-Attention-MindSpore

AutoAugment: Learning Augmentation Policies from Data

2018

PyramidNet+ShakeDrop

tensorflow/models

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

2021

ViT-B/16- SAM

google-research/vision_transformer ttt496/VisionTransformer

ConvMLP: Hierarchical Convolutional MLPs for Vision

2021

ConvMLP-M

BR-IDL/PaddleViT shinya7y/UniverseNet

ConvMLP: Hierarchical Convolutional MLPs for Vision

2021

ConvMLP-L

BR-IDL/PaddleViT shinya7y/UniverseNet

Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images

2021

ResNet-152x4-AGC (ImageNet-21K)

SLAMPAI/large-scale-pretraining-transfer

ColorNet: Investigating the importance of color spaces for image classification

2019

ColorNet

kini5gowda/ColorNet

Fast AutoAugment

2019

PyramidNet+ShakeDrop (Fast AA)

kakaobrain/fast-autoaugment ildoonet/pytorch-randaugment

Neural Architecture Transfer

2020

NAT-M4

human-analysis/neural-architecture-transfer awesomelemon/encas

Incorporating Convolution Designs into Visual Transformers

2021

CeiT-T (384 finetune resolution)

rishikksh20/CeiT-pytorch coeusguo/ceit mindspore-courses/External-Attention-MindSpore

Neural Architecture Transfer

2020

NAT-M3

human-analysis/neural-architecture-transfer awesomelemon/encas

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

2021

ViT-S/16- SAM

google-research/vision_transformer ttt496/VisionTransformer

Neural Architecture Transfer

2020

NAT-M2

human-analysis/neural-architecture-transfer awesomelemon/encas

PSO-Convolutional Neural Networks with Heterogeneous Learning Rate

2022

Dynamics 1

leonlha/pso-convnet-dynamics

Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

2020

DenseNet-BC-190, S=4

freeformrobotics/divide-and-co-training mzhaoshuai/Divide-and-Co-training

ConvMLP: Hierarchical Convolutional MLPs for Vision

2021

ConvMLP-S

BR-IDL/PaddleViT shinya7y/UniverseNet

ResMLP: Feedforward networks for image classification with data-efficient training

2021

ResMLP-12

rwightman/pytorch-image-models xmu-xiaoma666/External-Attention-pytorch

Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

2020

WRN-40-10, S=4

freeformrobotics/divide-and-co-training mzhaoshuai/Divide-and-Co-training

ResNet strikes back: An improved training procedure in timm

2021

ResNet50 (A1)

rwightman/pytorch-image-models open-mmlab/mmdetection

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

2021

WRN-28-10 * 3

alexrame/mixmo-pytorch

Regularizing Neural Networks via Adversarial Model Perturbation

2020

PyramidNet + AA (AMP)

hiyouga/AMP-Regularizer

Self-Knowledge Distillation with Progressive Refinement of Targets

2020

PyramidNet-200 + Shakedrop + Cutmix + PS-KD

lgcnsai/ps-kd-pytorch

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

2021

Mixer-B/16- SAM

google-research/vision_transformer ttt496/VisionTransformer

Deep Feature Response Discriminative Calibration

2024

ResCNet-50

tcmyxc/rescnet

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

2019

PyramidNet-200 + Shakedrop + Cutmix

rwightman/pytorch-image-models pytorch/vision

MUXConv: Information Multiplexing in Convolutional Neural Networks

2020

MUXNet-m

human-analysis/MUXConv

Neural Architecture Transfer

2020

NAT-M1

human-analysis/neural-architecture-transfer awesomelemon/encas

MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

2021

WRN-28-10

alexrame/mixmo-pytorch

Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

2020

WRN-28-10, S=4

freeformrobotics/divide-and-co-training mzhaoshuai/Divide-and-Co-training

Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup

2021

WRN-28-8 +SAMix

Westlake-AI/openmixup

Improving Neural Architecture Search Image Classifiers via Ensemble Learning

2019

ASANas

tensorflow/adanet

SparseSwin: Swin Transformer with Sparse Transformer Block

2023

SparseSwin

krisnapinasthika/sparseswin

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

2021

ResNet-50-SAM

google-research/vision_transformer ttt496/VisionTransformer

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

2021

WRN-28-8 +AutoMix

Westlake-AI/openmixup zeyuanyin/tiny-imagenet Westlake-AI/AutoMix

WaveMix: A Resource-efficient Neural Network for Image Analysis

2022

WaveMixLite-256/7

pranavphoenix/WaveMix

Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics

2025

MANO-tiny

AlexColagrande/MANO

Neural networks with late-phase weights

2020

WRN 28-14

google/uncertainty-baselines seijin-kobayashi/late-phase-weights

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

R-Mix (WideResNet 28-10)

minhlong94/random-mixup

EEEA-Net: An Early Exit Evolutionary Neural Architecture Search

2021

EEEA-Net-C (b=5)+ CO

chakkritte/eeea-net

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

RL-Mix (WideResNet 28-10)

minhlong94/random-mixup

Automatic Data Augmentation via Invariance-Constrained Learning

2022

Wide-ResNet-28-10

ihounie/daug

Squeeze-and-Excitation Networks

2017

SENet + ShakeEven + Cutout

PaddlePaddle/PaddleOCR xmu-xiaoma666/External-Attention-pytorch

Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup

2021

ResNeXt-50(32x4d) + SAMix

Westlake-AI/openmixup

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

2020

WRN-28-10 with reSGHMC

gaoliyao/Replica_Exchange_Stochastic_Gradient_MCMC WayneDW/Variance_Reduced_Replica_Exchange_SGMCMC

Averaging Weights Leads to Wider Optima and Better Generalization

2018

PyramidNet-272 + SWA

timgaripov/swa wjmaddox/swa_gaussian

Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup

2020

WRN28-10

snu-mllab/PuzzleMix

Gated Convolutional Networks with Hybrid Connectivity for Image Classification

2019

HCGNet-A3

winycg/HCGNet

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

WideResNet 28-10 + CutMix (OneCycleLR scheduler)

minhlong94/random-mixup

FMix: Enhancing Mixed Sample Data Augmentation

2020

DenseNet-BC-190 + FMix

PaddlePaddle/PaddleClas Westlake-AI/openmixup

Oriented Response Networks

2017

ORN

ZhouYanzhao/ORN

Grafit: Learning fine-grained image representations with coarse labels

2020

Grafit (ResNet-50)

AutoMix: Unveiling the Power of Mixup for Stronger Classifiers

2021

ResNeXt-50(32x4d) + AutoMix

Westlake-AI/openmixup zeyuanyin/tiny-imagenet Westlake-AI/AutoMix

TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

2022

CCT-7/3x1+HTM+VTM

mlvlab/tokenmixup

Gated Convolutional Networks with Hybrid Connectivity for Image Classification

2019

HCGNet-A2

winycg/HCGNet

Res2Net: A New Multi-scale Backbone Architecture

2019

Res2NeXt-29

rwightman/pytorch-image-models open-mmlab/mmdetection

mixup: Beyond Empirical Risk Minimization

2017

DenseNet-BC-190 + Mixup

rwightman/pytorch-image-models pytorch/vision

Contextual Classification Using Self-Supervised Auxiliary Models for Deep Neural Networks

2021

SSAL-DenseNet 190-40

Engler93/Self-Supervised-Autogenous-Learning

EnAET: A Self-Trained framework for Semi-Supervised and Supervised Learning with Ensemble Transformations

2019

EnAET

maple-research-lab/EnAET wang3702/EnAET

Neural networks with late-phase weights

2020

WRN 28-10

google/uncertainty-baselines seijin-kobayashi/late-phase-weights

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

R-Mix (ResNeXt 29-4-24)

minhlong94/random-mixup

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems

2019

Wide ResNet+Cutout+no BN scale/offset learning

McDonnell-Lab/1-bit-per-weight

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

2020

WRN-16-8 with reSGHMC

gaoliyao/Replica_Exchange_Stochastic_Gradient_MCMC WayneDW/Variance_Reduced_Replica_Exchange_SGMCMC

Densely Connected Convolutional Networks

2016

DenseNet-BC

pytorch/vision

ANDHRA Bandersnatch: Training Neural Networks to Predict Parallel Realities

2024

ABNet-2G-R3-Combined

dvssajay/New_World

Escaping the Big Data Paradigm with Compact Transformers

2021

CCT-7/3x1*

keras-team/keras-io SHI-Labs/Compact-Transformers

EXACT: How to Train Your Accuracy

2022

EXACT (WRN-28-10)

tinkoff-ai/exact ivan-chai/exact

Selective Kernel Networks

2019

SKNet-29 (ResNeXt-29, 16×32d)

rwightman/pytorch-image-models xmu-xiaoma666/External-Attention-pytorch

Densely Connected Convolutional Networks

2016

DenseNet

pytorch/vision

Learning Implicitly Recurrent CNNs Through Parameter Sharing

2019

Shared WRN

lolemacs/soft-sharing

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

2021

Transformer local-attention (NesT-B)

rwightman/pytorch-image-models google-research/nested-transformer

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

RL-Mix (ResNeXt 29-4-24)

minhlong94/random-mixup

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

2021

Mixer-S/16- SAM

google-research/vision_transformer ttt496/VisionTransformer

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

R-Mix (WideResNet 16-8)

minhlong94/random-mixup

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

ResNeXt 29-4-24 + CutMix (OneCycleLR scheduler)

minhlong94/random-mixup

Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery

2018

WARN

prlz77/attend-and-rectify

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

RL-Mix (WideResNet 16-8)

minhlong94/random-mixup

Averaging Weights Leads to Wider Optima and Better Generalization

2018

WRN+SWA

timgaripov/swa wjmaddox/swa_gaussian

Manifold Mixup: Better Representations by Interpolating Hidden States

2018

Manifold Mixup

Westlake-AI/openmixup vikasverma1077/manifold_mixup

Gated Convolutional Networks with Hybrid Connectivity for Image Classification

2019

HCGNet-A1

winycg/HCGNet

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

WideResNet 16-8 + CutMix (OneCycleLR scheduler)

minhlong94/random-mixup

Learning Identity Mappings with Residual Gates

2016

Residual Gates + WRN

Revisiting a kNN-based Image Classification System with High-capacity Storage

2022

kNN-CLIP

Attention Augmented Convolutional Networks

2019

AA-Wide-ResNet

leaderj1001/Attention-Augmented-Conv2d leaderj1001/Stand-Alone-Self-Attention

PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions

2020

PDO-eConv (p8, 4.6M)

shenzy08/PDO-eConvs Roderickzzc/Pdo-econv-pytorch ejnnr/steerable_pdo_experiments

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

2022

SEER (RegNet10B)

facebookresearch/vissl

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

R-Mix (PreActResNet-18)

minhlong94/random-mixup

On the Performance Analysis of Momentum Method: A Frequency Domain Perspective

2024

ResNet50 (FSGDM)

yinleung/FSGDM

Automatic Data Augmentation via Invariance-Constrained Learning

2022

Wide-ResNet-40-2

ihounie/daug

Wide Residual Networks

2016

Wide ResNet

tensorflow/models osmr/imgclsmob

Deep Competitive Pathway Networks

2017

CoPaNet-R-164

JiaRenChang/CoPaNet

ANDHRA Bandersnatch: Training Neural Networks to Predict Parallel Realities

2024

ABNet-2G-R3

dvssajay/New_World

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

RL-Mix (PreActResNet-18)

minhlong94/random-mixup

Expeditious Saliency-guided Mix-up through Random Gradient Thresholding

2022

PreActResNet-18 + CutMix (OneCycleLR scheduler)

minhlong94/random-mixup

Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks

2023

GAC-SNN

bollossom/GAC

ANDHRA Bandersnatch: Training Neural Networks to Predict Parallel Realities

2024

ABNet-2G-R2

dvssajay/New_World

Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet

2018

SimpleNetv2

Coderx7/SimpNet

UPANets: Learning from the Universal Pixel Attention Networks

2021

UPANets

hanktseng131415go/UPANets

SageMix: Saliency-Guided Mixup for Point Clouds

2022

PreActResNet-18 + SageMix

mlvlab/SageMix

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

2020

ResNet56 with reSGHMC

gaoliyao/Replica_Exchange_Stochastic_Gradient_MCMC WayneDW/Variance_Reduced_Replica_Exchange_SGMCMC

PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions

2020

PDO-eConv (p8, 2.62M)

shenzy08/PDO-eConvs Roderickzzc/Pdo-econv-pytorch ejnnr/steerable_pdo_experiments

Training Neural Networks with Local Error Signals

2019

VGG11B(3x) + LocalLearning

anokland/local-loss ai-tech-research-lab/nitro-d

With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations

2021

NNCLR

lightly-ai/lightly keras-team/keras-io

ANDHRA Bandersnatch: Training Neural Networks to Predict Parallel Realities

2024

ABNet-2G-R1

dvssajay/New_World

Regularizing Neural Networks via Adversarial Model Perturbation

2020

PreActResNet18 (AMP)

hiyouga/AMP-Regularizer

Lets keep it simple, Using simple architectures to outperform deeper and more complex architectures

2016

SimpleNetv1

Coderx7/SimpleNet Coderx7/SimpleNet_Pytorch

Pre-training of Lightweight Vision Transformers on Small Datasets with Minimally Scaled Images

2024

ViT (lightweight, MAE pre-trained)

Augmenting Deep Classifiers with Polynomial Neural Networks

2021

PDC

grigorisg9gr/polynomials-for-augmenting-nns jesperhauch/polynomial_deep_learning

Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets

2020

MobileNetV3-large x1.0 (BSConv-U)

zeiss-microscopy/BSConv

Escaping the Big Data Paradigm with Compact Transformers

2021

CCT-6/3x1

keras-team/keras-io SHI-Labs/Compact-Transformers

Identity Mappings in Deep Residual Networks

2016

ResNet-1001

tensorflow/models

Large-Scale Evolution of Image Classifiers

2017

Evolution

marijnvk/LargeScaleEvolution StevenGerrad/large-scale-Evolution-Net

DIANet: Dense-and-Implicit Attention Network

2019

DIANet

osmr/imgclsmob gbup-group/DIANet gbup-group/EAN-efficient-attention-network

Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification

2020

LP-BNN (ours) + cutout

ENSTA-U2IS-AI/torch-uncertainty giannifranchi/LP_BNN

Learning Class Unique Features in Fine-Grained Visual Classification

2020

ResNet-18+MM+FRL

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

2020

ResNet32 with reSGHMC

gaoliyao/Replica_Exchange_Stochastic_Gradient_MCMC WayneDW/Variance_Reduced_Replica_Exchange_SGMCMC

Momentum Residual Neural Networks

2021

MomentumNet

michaelsdr/momentumnet

Spatially-sparse convolutional neural networks

2014

SSCNN

facebookresearch/SparseConvNet justinessert/hierarchical-deep-cnn simpleintel/jupyter

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)

2015

Exponential Linear Units

MaximeVandegar/Papers-in-100-Lines-of-Code hughperkins/DeepCL

CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters

2022

ResNet-9

paulgavrikov/cnn-filter-db

Deep Networks with Stochastic Depth

2016

Stochastic Depth

rwightman/pytorch-image-models pytorch/vision

Mish: A Self Regularized Non-Monotonic Activation Function

2019

ResNet v2-110 (Mish activation)

tensorflow/addons digantamisra98/Mish

Non-convex Learning via Replica Exchange Stochastic Gradient MCMC

2020

ResNet20 with reSGHMC

gaoliyao/Replica_Exchange_Stochastic_Gradient_MCMC WayneDW/Variance_Reduced_Replica_Exchange_SGMCMC

MixMatch: A Holistic Approach to Semi-Supervised Learning

2019

MixMatch

google-research/mixmatch YU1ut/MixMatch-pytorch

Beta-Rank: A Robust Convolutional Filter Pruning Method For Imbalanced Medical Image Analysis

2023

Beta-Rank

mohofar/beta-rank

How to Use Dropout Correctly on Residual Networks with Batch Normalization

2023

PreResNet-110

kmbmjn/DropoutCorrectly

ANDHRA Bandersnatch: Training Neural Networks to Predict Parallel Realities

2024

ABNet-2G-R0

dvssajay/New_World

Fractional Max-Pooling

2014

Fractional MP

facebookresearch/SparseConvNet laplacetw/vgg-like-cifar10

Deep Residual Networks with Exponential Linear Unit

2016

ResNet+ELU

Amihaeseisergiu/Cifar-10-ResNet-ELU-Cutout

PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions

2020

PDO-eConv (p6m,0.37M)

shenzy08/PDO-eConvs Roderickzzc/Pdo-econv-pytorch ejnnr/steerable_pdo_experiments

Stochastic Optimization of Plain Convolutional Neural Networks with Simple methods

2020

SOPCNN

junaidaliop/MNIST-SOPCNN

PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions

2020

PDO-eConv (p6,0.36M)

shenzy08/PDO-eConvs Roderickzzc/Pdo-econv-pytorch ejnnr/steerable_pdo_experiments

Scalable Bayesian Optimization Using Deep Neural Networks

2015

Tuned CNN

automl/pybnn pipilurj/BONAS

Stochastic Subsampling With Average Pooling

2024

ResNet-110 (SAP)

Competitive Multi-scale Convolution

2015

CMsC

All you need is a good init

2015

Fitnet4-LSUV

ducha-aiki/LSUVinit ducha-aiki/LSUV-keras

Batch-normalized Maxout Network in Network

2015

BNM NiN

JohnBensen1000/machine_learning

Online Training Through Time for Spiking Neural Networks

2022

OTTT

pkuxmq/ottt-snn

On the Importance of Normalisation Layers in Deep Learning with Piecewise Linear Activation Units

2015

MIM

WaveMix: A Resource-efficient Neural Network for Image Analysis

2022

WaveMix-Lite-256/7

pranavphoenix/WaveMix

Learning Activation Functions to Improve Deep Neural Networks

2014

NiN+APL

ForestAgostinelli/Learned-Activation-Functions-Source DivyanshRoy/Learning-Activation-Function-Using-Tensorflow pavaichandru93/Neural-network-for-data-science

Stacked What-Where Auto-encoders

2015

SWWAE

isaacgerg/keras_odds_and_ends zhangqinghao0811/unpool

Deep Convolutional Decision Jungle for Image Classification

2017

NiN+Superclass+CDJ

Spectral Representations for Convolutional Neural Networks

2015

Spectral Representations for Convolutional Neural Networks

"BNN - BN = ?": Training Binary Neural Networks without Batch Normalization

2021

ReActNet-18

VITA-Group/BNN_NoBN

Training Very Deep Networks

2015

VDN

LiyuanLucasLiu/LM-LSTM-CRF yoonkim/lstm-char-cnn flukeskywalker/highway-networks

Deep Convolutional Neural Networks as Generic Feature Extractors

2017

DCNN+GFE

Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree

2015

Tree+Max-Avg pooling

cypw/DPNs BeanGreen247/Python-AI-Arts

HD-CNN: Hierarchical Deep Convolutional Neural Network for Large Scale Visual Recognition

2014

HD-CNN

justinessert/hierarchical-deep-cnn rarriaza/ATPRO_HCNN

Universum Prescription: Regularization using Unlabeled Data

2015

Universum Prescription

Striving for Simplicity: The All Convolutional Net

2014

ACN

pytorch/captum MisaOgura/flashtorch

DLME: Deep Local-flatness Manifold Embedding

2022

DLME (ResNet-18, linear)

Westlake-AI/openmixup zangzelin/code_ECCV2022_DLME

FatNet: High Resolution Kernels for Classification Using Fully Convolutional Optical Neural Networks

2022

ResNet-18 (modified)

riadibadulla/simulator

Deeply-Supervised Nets

2014

DSN

ellisdg/3DUnetCNN

Network In Network

2013

NiN

MaximeVandegar/Papers-in-100-Lines-of-Code nagadomi/kaggle-cifar10-torch7

Improving Deep Neural Networks with Probabilistic Maxout Units

2013

DNN+Probabilistic Maxout

Maxout Networks

2013

Maxout Network (k=2)

MaximeVandegar/Papers-in-100-Lines-of-Code philipperemy/tensorflow-maxout

Convolutional Xformers for Vision

2022

Convolutional Linear Transformer for Vision (CLTV)

pranavphoenix/cxv

FatNet: High Resolution Kernels for Classification Using Fully Convolutional Optical Neural Networks

2022

FatNet of ResNet-18

riadibadulla/simulator

FatNet: High Resolution Kernels for Classification Using Fully Convolutional Optical Neural Networks

2022

Optical Simulation of FatNet

riadibadulla/simulator

Empirical Evaluation of Rectified Activations in Convolutional Network

2015

RReLU

OsvaldN/APS360_Project spinterRu/fashion_mnist

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks

2013

Stochastic Pooling

szagoruyko/imagine-nn

How Important is Weight Symmetry in Backpropagation?

2015

Sign-symmetry

jsalbert/biotorch willwx/sign-symmetry

Sharpness-Aware Minimization for Efficiently Improving Generalization

2020

CNN39

davda54/sam google-research/sam

Sharpness-Aware Minimization for Efficiently Improving Generalization

2020

CNN36

davda54/sam google-research/sam

Sharpness-aware Quantization for Deep Neural Networks

2021

CNN37

ziplab/saq zip-group/saq

Model	Paper	Percentage correct	Date
efficient adaptive ensembling	Efficient Adaptive Ensembling for Image Classific…	96.81	2022-06-15
EffNet-L2 (SAM)	Sharpness-Aware Minimization for Efficiently Impr…	96.08	2020-10-03
Swin-L + ML-Decoder	ML-Decoder: Scalable and Versatile Classification…	95.10	2021-11-25
µ2Net (ViT-L/16)	An Evolutionary Approach to Dynamic Introduction …	94.95	2022-05-25
ViT-B-16 (ImageNet-21K-P pretrain)	ImageNet-21K Pretraining for the Masses	94.20	2021-04-22
CvT-W24	CvT: Introducing Convolutions to Vision Transform…	94.09	2021-03-29
ViT-B/16 (PUGD)	Perturbated Gradients Updating within Unit Space …	93.95	2021-10-01
Heinsen Routing + BEiT-large 16 224	An Algorithm for Routing Vectors in Sequences	93.80	2022-11-20
BiT-L (ResNet)	Big Transfer (BiT): General Visual Representation…	93.51	2019-12-24
VIT-L/16 (Spinal FC, Background)	Reduction of Class Activation Uncertainty with Ba…	93.31	2023-05-05
CaiT-M-36 U 224	Going deeper with Image Transformers	93.10	2021-03-31
ViT-L (attn fine-tune)	Three things everyone should know about Vision Tr…	93.00	2022-03-18
TResNet-L-V2	TResNet: High Performance GPU-Dedicated Architect…	92.60	2020-03-30
EfficientNetV2-L	EfficientNetV2: Smaller Models and Faster Training	92.30	2021-04-01
EfficientNetV2-M	EfficientNetV2: Smaller Models and Faster Training	92.20	2021-04-01
BiT-M (ResNet)	Big Transfer (BiT): General Visual Representation…	92.17	2019-12-24
CeiT-S	Incorporating Convolution Designs into Visual Tra…	91.80	2021-03-22
CeiT-S (384 finetune resolution)	Incorporating Convolution Designs into Visual Tra…	91.80	2021-03-22
EfficientNet-B7	EfficientNet: Rethinking Model Scaling for Convol…	91.70	2019-05-28
EfficientNetV2-S	EfficientNetV2: Smaller Models and Faster Training	91.50	2021-04-01
GPIPE	GPipe: Efficient Training of Giant Neural Network…	91.30	2018-11-16
DGMMC-S	Performance of Gaussian Mixture Model Classifiers…	91.20	2024-10-17
TNT-B	Transformer in Transformer	91.10	2021-02-27
DeiT-B	Training data-efficient image transformers & dist…	90.80	2020-12-23
GFNet-H-B	Global Filter Networks for Image Classification	90.30	2021-07-01
E2E-3M	Rethinking Recurrent Neural Networks and Other Im…	90.27	2020-07-30
Bamboo (ViT-B/16)	Bamboo: Building Mega-Scale Vision Dataset Contin…	90.20	2022-03-15
PyramidNet-272 (ASAM)	ASAM: Adaptive Sharpness-Aware Minimization for S…	89.90	2021-02-23
PyramidNet (SAM)	Sharpness-Aware Minimization for Efficiently Impr…	89.70	2020-10-03
DVT (T2T-ViT-24)	Not All Images are Worth 16x16 Words: Dynamic Tra…	89.63	2021-05-31
ResMLP-24	ResMLP: Feedforward networks for image classifica…	89.50	2021-05-07
PyramidNet-272, S=4	Towards Better Accuracy-efficiency Trade-offs: Di…	89.46	2020-11-30
CeiT-T	Incorporating Convolution Designs into Visual Tra…	89.40	2021-03-22
PyramidNet+ShakeDrop	AutoAugment: Learning Augmentation Policies from …	89.30	2018-05-24
ViT-B/16- SAM	When Vision Transformers Outperform ResNets witho…	89.10	2021-06-03
ConvMLP-M	ConvMLP: Hierarchical Convolutional MLPs for Visi…	89.10	2021-09-09
ConvMLP-L	ConvMLP: Hierarchical Convolutional MLPs for Visi…	88.60	2021-09-09
ResNet-152x4-AGC (ImageNet-21K)	Effect of Pre-Training Scale on Intra- and Inter-…	88.54	2021-05-31
ColorNet	ColorNet: Investigating the importance of color s…	88.40	2019-02-01
PyramidNet+ShakeDrop (Fast AA)	Fast AutoAugment	88.30	2019-05-01
NAT-M4	Neural Architecture Transfer	88.30	2020-05-12
CeiT-T (384 finetune resolution)	Incorporating Convolution Designs into Visual Tra…	88.00	2021-03-22
NAT-M3	Neural Architecture Transfer	87.70	2020-05-12
ViT-S/16- SAM	When Vision Transformers Outperform ResNets witho…	87.60	2021-06-03
NAT-M2	Neural Architecture Transfer	87.50	2020-05-12
Dynamics 1	PSO-Convolutional Neural Networks with Heterogene…	87.48	2022-05-20
DenseNet-BC-190, S=4	Towards Better Accuracy-efficiency Trade-offs: Di…	87.44	2020-11-30
ConvMLP-S	ConvMLP: Hierarchical Convolutional MLPs for Visi…	87.40	2021-09-09
ResMLP-12	ResMLP: Feedforward networks for image classifica…	87.00	2021-05-07
WRN-40-10, S=4	Towards Better Accuracy-efficiency Trade-offs: Di…	86.90	2020-11-30
ResNet50 (A1)	ResNet strikes back: An improved training procedu…	86.90	2021-10-01
WRN-28-10 * 3	MixMo: Mixing Multiple Inputs for Multiple Output…	86.81	2021-03-10
PyramidNet + AA (AMP)	Regularizing Neural Networks via Adversarial Mode…	86.64	2020-10-10
PyramidNet-200 + Shakedrop + Cutmix + PS-KD	Self-Knowledge Distillation with Progressive Refi…	86.41	2020-06-22
Mixer-B/16-SAM	When Vision Transformers Outperform ResNets witho…	86.40	2021-06-03
ResCNet-50	Deep Feature Response Discriminative Calibration	86.31	2024-11-16
PyramidNet-200 + Shakedrop + Cutmix	CutMix: Regularization Strategy to Train Strong C…	86.19	2019-05-13
MUXNet-m	MUXConv: Information Multiplexing in Convolutiona…	86.10	2020-03-31
NAT-M1	Neural Architecture Transfer	86.00	2020-05-12
WRN-28-10	MixMo: Mixing Multiple Inputs for Multiple Output…	85.77	2021-03-10
WRN-28-10, S=4	Towards Better Accuracy-efficiency Trade-offs: Di…	85.74	2020-11-30
WRN-28-8 +SAMix	Boosting Discriminative Visual Representation Lea…	85.50	2021-11-30
ASANas	Improving Neural Architecture Search Image Classi…	85.42	2019-03-14
SparseSwin	SparseSwin: Swin Transformer with Sparse Transfor…	85.35	2023-09-11
ResNet-50-SAM	When Vision Transformers Outperform ResNets witho…	85.20	2021-06-03
WRN-28-8 +AutoMix	AutoMix: Unveiling the Power of Mixup for Stronge…	85.16	2021-03-24
WaveMix-Lite-256/7	WaveMix: A Resource-efficient Neural Network for …	85.09	2022-05-28
MANO-tiny	Linear Attention with Global Context: A Multipole…	85.08	2025-07-03
WRN 28-14	Neural networks with late-phase weights	85.00	2020-07-25
R-Mix (WideResNet 28-10)	Expeditious Saliency-guided Mix-up through Random…	85.00	2022-12-09
EEEA-Net-C (b=5)+ CO	EEEA-Net: An Early Exit Evolutionary Neural Archi…	84.98	2021-08-13
RL-Mix (WideResNet 28-10)	Expeditious Saliency-guided Mix-up through Random…	84.90	2022-12-09
Wide-ResNet-28-10	Automatic Data Augmentation via Invariance-Constr…	84.89	2022-09-29
SENet + ShakeEven + Cutout	Squeeze-and-Excitation Networks	84.59	2017-09-05
ResNeXt-50(32x4d) + SAMix	Boosting Discriminative Visual Representation Lea…	84.42	2021-11-30
WRN-28-10 with reSGHMC	Non-convex Learning via Replica Exchange Stochast…	84.38	2020-08-12
PyramidNet-272 + SWA	Averaging Weights Leads to Wider Optima and Bette…	84.16	2018-03-14
WRN28-10	Puzzle Mix: Exploiting Saliency and Local Statist…	84.05	2020-09-15
HCGNet-A3	Gated Convolutional Networks with Hybrid Connecti…	84.04	2019-08-26
WideResNet 28-10 + CutMix (OneCycleLR scheduler)	Expeditious Saliency-guided Mix-up through Random…	83.97	2022-12-09
DenseNet-BC-190 + FMix	FMix: Enhancing Mixed Sample Data Augmentation	83.95	2020-02-27
ORN	Oriented Response Networks	83.85	2017-01-07
Grafit (ResNet-50)	Grafit: Learning fine-grained image representatio…	83.70	2020-11-25
ResNeXt-50(32x4d) + AutoMix	AutoMix: Unveiling the Power of Mixup for Stronge…	83.64	2021-03-24
CCT-7/3x1+HTM+VTM	TokenMixup: Efficient Attention-guided Token-leve…	83.57	2022-10-14
HCGNet-A2	Gated Convolutional Networks with Hybrid Connecti…	83.46	2019-08-26
Res2NeXt-29	Res2Net: A New Multi-scale Backbone Architecture	83.44	2019-04-02
DenseNet-BC-190 + Mixup	mixup: Beyond Empirical Risk Minimization	83.20	2017-10-25
SSAL-DenseNet 190-40	Contextual Classification Using Self-Supervised A…	83.20	2021-01-07
EnAET	EnAET: A Self-Trained framework for Semi-Supervis…	83.13	2019-11-21
WRN 28-10	Neural networks with late-phase weights	83.06	2020-07-25
R-Mix (ResNeXt 29-4-24)	Expeditious Saliency-guided Mix-up through Random…	83.02	2022-12-09
Wide ResNet+Cutout+no BN scale/offset learning	Single-bit-per-weight deep convolutional neural n…	82.95	2019-07-16
WRN-16-8 with reSGHMC	Non-convex Learning via Replica Exchange Stochast…	82.95	2020-08-12
DenseNet-BC	Densely Connected Convolutional Networks	82.82	2016-08-25
ABNet-2G-R3-Combined	ANDHRA Bandersnatch: Training Neural Networks to …	82.78	2024-11-28
CCT-7/3x1*	Escaping the Big Data Paradigm with Compact Trans…	82.72	2021-04-12
EXACT (WRN-28-10)	EXACT: How to Train Your Accuracy	82.68	2022-05-19
SKNet-29 (ResNeXt-29, 16×32d)	Selective Kernel Networks	82.67	2019-03-15
DenseNet	Densely Connected Convolutional Networks	82.62	2016-08-25
Shared WRN	Learning Implicitly Recurrent CNNs Through Parame…	82.57	2019-02-26
Transformer local-attention (NesT-B)	Nested Hierarchical Transformer: Towards Accurate…	82.56	2021-05-26
RL-Mix (ResNeXt 29-4-24)	Expeditious Saliency-guided Mix-up through Random…	82.43	2022-12-09
Mixer-S/16-SAM	When Vision Transformers Outperform ResNets witho…	82.40	2021-06-03
R-Mix (WideResNet 16-8)	Expeditious Saliency-guided Mix-up through Random…	82.32	2022-12-09
ResNeXt 29-4-24 + CutMix (OneCycleLR scheduler)	Expeditious Saliency-guided Mix-up through Random…	82.30	2022-12-09
WARN	Attend and Rectify: a Gated Attention Mechanism f…	82.18	2018-07-19
RL-Mix (WideResNet 16-8)	Expeditious Saliency-guided Mix-up through Random…	82.16	2022-12-09
WRN+SWA	Averaging Weights Leads to Wider Optima and Bette…	82.15	2018-03-14
Manifold Mixup	Manifold Mixup: Better Representations by Interpo…	81.96	2018-06-13
HCGNet-A1	Gated Convolutional Networks with Hybrid Connecti…	81.87	2019-08-26
WideResNet 16-8 + CutMix (OneCycleLR scheduler)	Expeditious Saliency-guided Mix-up through Random…	81.79	2022-12-09
Residual Gates + WRN	Learning Identity Mappings with Residual Gates	81.73	2016-11-04
kNN-CLIP	Revisiting a kNN-based Image Classification Syste…	81.70	2022-04-03
AA-Wide-ResNet	Attention Augmented Convolutional Networks	81.60	2019-04-22
PDO-eConv (p8, 4.6M)	PDO-eConvs: Partial Differential Operator Based E…	81.60	2020-07-20
SEER (RegNet10B)	Vision Models Are More Robust And Fair When Pretr…	81.53	2022-02-16
R-Mix (PreActResNet-18)	Expeditious Saliency-guided Mix-up through Random…	81.49	2022-12-09
ResNet50 (FSGDM)	On the Performance Analysis of Momentum Method: A…	81.44	2024-11-29
Wide-ResNet-40-2	Automatic Data Augmentation via Invariance-Constr…	81.19	2022-09-29
Wide ResNet	Wide Residual Networks	81.15	2016-05-23
CoPaNet-R-164	Deep Competitive Pathway Networks	81.10	2017-09-29
ABNet-2G-R3	ANDHRA Bandersnatch: Training Neural Networks to …	80.83	2024-11-28
RL-Mix (PreActResNet-18)	Expeditious Saliency-guided Mix-up through Random…	80.75	2022-12-09
PreActResNet-18 + CutMix (OneCycleLR scheduler)	Expeditious Saliency-guided Mix-up through Random…	80.60	2022-12-09
GAC-SNN	Gated Attention Coding for Training High-performa…	80.45	2023-08-12
ABNet-2G-R2	ANDHRA Bandersnatch: Training Neural Networks to …	80.35	2024-11-28
SimpleNetv2	Towards Principled Design of Deep Convolutional N…	80.29	2018-02-17
UPANets	UPANets: Learning from the Universal Pixel Attent…	80.29	2021-03-15
PreActResNet-18 + SageMix	SageMix: Saliency-Guided Mixup for Point Clouds	80.16	2022-10-13
ResNet56 with reSGHMC	Non-convex Learning via Replica Exchange Stochast…	80.14	2020-08-12
PDO-eConv (p8, 2.62M)	PDO-eConvs: Partial Differential Operator Based E…	79.99	2020-07-20
VGG11B(3x) + LocalLearning	Training Neural Networks with Local Error Signals	79.90	2019-01-20
NNCLR	With a Little Help from My Friends: Nearest-Neigh…	79.00	2021-04-29
ABNet-2G-R1	ANDHRA Bandersnatch: Training Neural Networks to …	78.79	2024-11-28
PreActResNet18 (AMP)	Regularizing Neural Networks via Adversarial Mode…	78.49	2020-10-10
SimpleNetv1	Lets keep it simple, Using simple architectures t…	78.37	2016-08-22
ViT (lightweight, MAE pre-trained)	Pre-training of Lightweight Vision Transformers o…	78.27	2024-02-06
PDC	Augmenting Deep Classifiers with Polynomial Neura…	77.90	2021-04-16
MobileNetV3-large x1.0 (BSConv-U)	Rethinking Depthwise Separable Convolutions: How …	77.70	2020-03-30
CCT-6/3x1	Escaping the Big Data Paradigm with Compact Trans…	77.31	2021-04-12
ResNet-1001	Identity Mappings in Deep Residual Networks	77.30	2016-03-16
Evolution	Large-Scale Evolution of Image Classifiers	77.00	2017-03-03
DIANet	DIANet: Dense-and-Implicit Attention Network	76.98	2019-05-25
LP-BNN (ours) + cutout	Encoding the latent posterior of Bayesian Neural …	76.85	2020-12-04
ResNet-18+MM+FRL	Learning Class Unique Features in Fine-Grained Vi…	76.64	2020-11-22
ResNet32 with reSGHMC	Non-convex Learning via Replica Exchange Stochast…	76.55	2020-08-12
MomentumNet	Momentum Residual Neural Networks	76.38	2021-02-15
SSCNN	Spatially-sparse convolutional neural networks	75.70	2014-09-22
Exponential Linear Units	Fast and Accurate Deep Network Learning by Expone…	75.70	2015-11-23
ResNet-9	CNN Filter DB: An Empirical Investigation of Trai…	75.59	2022-03-29
Stochastic Depth	Deep Networks with Stochastic Depth	75.42	2016-03-30
ResNet v2-110 (Mish activation)	Mish: A Self Regularized Non-Monotonic Activation…	74.41	2019-08-23
ResNet20 with reSGHMC	Non-convex Learning via Replica Exchange Stochast…	74.14	2020-08-12
MixMatch	MixMatch: A Holistic Approach to Semi-Supervised …	74.10	2019-05-06
Beta-Rank	Beta-Rank: A Robust Convolutional Filter Pruning …	74.01	2023-04-15
PreResNet-110	How to Use Dropout Correctly on Residual Networks…	73.98	2023-02-13
ABNet-2G-R0	ANDHRA Bandersnatch: Training Neural Networks to …	73.93	2024-11-28
Fractional MP	Fractional Max-Pooling	73.60	2014-12-18
ResNet+ELU	Deep Residual Networks with Exponential Linear Un…	73.50	2016-04-14
PDO-eConv (p6m,0.37M)	PDO-eConvs: Partial Differential Operator Based E…	73.00	2020-07-20
SOPCNN	Stochastic Optimization of Plain Convolutional Ne…	72.96	2020-01-24
PDO-eConv (p6,0.36M)	PDO-eConvs: Partial Differential Operator Based E…	72.87	2020-07-20
Tuned CNN	Scalable Bayesian Optimization Using Deep Neural …	72.60	2015-02-19
ResNet-110 (SAP)	Stochastic Subsampling With Average Pooling	72.54	2024-09-25
CMsC	Competitive Multi-scale Convolution	72.40	2015-11-18
Fitnet4-LSUV	All you need is a good init	72.30	2015-11-19
BNM NiN	Batch-normalized Maxout Network in Network	71.10	2015-11-09
OTTT	Online Training Through Time for Spiking Neural N…	71.05	2022-10-09
MIM	On the Importance of Normalisation Layers in Deep…	70.80	2015-08-03
WaveMix-Lite-256/7	WaveMix: A Resource-efficient Neural Network for …	70.20	2022-05-28
NiN+APL	Learning Activation Functions to Improve Deep Neu…	69.20	2014-12-21
SWWAE	Stacked What-Where Auto-encoders	69.10	2015-06-08
NiN+Superclass+CDJ	Deep Convolutional Decision Jungle for Image Clas…	69.00	2017-06-06
Spectral Representations for Convolutional Neural Networks	Spectral Representations for Convolutional Neural…	68.40	2015-06-11
ReActNet-18	"BNN - BN = ?": Training Binary Neural Networks w…	68.34	2021-04-16
VDN	Training Very Deep Networks	67.80	2015-07-22
DCNN+GFE	Deep Convolutional Neural Networks as Generic Fea…	67.70	2017-10-06
Tree+Max-Avg pooling	Generalizing Pooling Functions in Convolutional N…	67.60	2015-09-30
HD-CNN	HD-CNN: Hierarchical Deep Convolutional Neural Ne…	67.40	2014-10-03
Universum Prescription	Universum Prescription: Regularization using Unla…	67.20	2015-11-11
ACN	Striving for Simplicity: The All Convolutional Net	66.30	2014-12-21
DLME (ResNet-18, linear)	DLME: Deep Local-flatness Manifold Embedding	66.10	2022-07-07
ResNet-18 (modified)	FatNet: High Resolution Kernels for Classificatio…	66.00	2022-10-30
DSN	Deeply-Supervised Nets	65.40	2014-09-18
NiN	Network In Network	64.30	2013-12-16
DNN+Probabilistic Maxout	Improving Deep Neural Networks with Probabilistic…	61.90	2013-12-20
Maxout Network (k=2)	Maxout Networks	61.43	2013-02-18
Convolutional Linear Transformer for Vision (CLTV)	Convolutional Xformers for Vision	60.11	2022-01-25
FatNet of ResNet-18	FatNet: High Resolution Kernels for Classificatio…	60.00	2022-10-30
Optical Simulation of FatNet	FatNet: High Resolution Kernels for Classificatio…	60.00	2022-10-30
RReLU	Empirical Evaluation of Rectified Activations in …	59.80	2015-05-05
Stochastic Pooling	Stochastic Pooling for Regularization of Deep Con…	57.50	2013-01-16
Sign-symmetry	How Important is Weight Symmetry in Backpropagati…	48.75	2015-10-17
CNN39	Sharpness-Aware Minimization for Efficiently Impr…	42.64	2020-10-03
CNN36	Sharpness-Aware Minimization for Efficiently Impr…	36.07	2020-10-03
CNN37	Sharpness-aware Quantization for Deep Neural Netw…	35.05	2021-11-24
