ImageNet

Image Reconstruction Benchmark

Showing 15 results | Metric: FID (lower is better)

Top Performing Models

Rank	Model	Paper	FID	Date	Code
1	MGVQ (16x16x8)	MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization	0.49	2025-07-14	📦 MKJia/MGVQ
2	MGVQ (16x16x4)	MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization	0.64	2025-07-14	📦 MKJia/MGVQ
3	GigaTok-XL-XXL	GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation	0.79	2025-04-11	📦 SilentView/GigaTok
4	OptVQ (16x16x8)	Preventing Local Pitfalls in Vector Quantization via Optimal Transport	0.91	2024-12-19	📦 zbr17/OptVQ
5	OptVQ (16x16x4)	Preventing Local Pitfalls in Vector Quantization via Optimal Transport	1.00	2024-12-19	📦 zbr17/OptVQ
6	IBQ (16x16)	Taming Scalable Visual Tokenizer for Autoregressive Image Generation	1.00	2024-12-03	📦 tencentarc/seed-voken 📦 tencentarc/open-magvit2
7	Mo-VQGAN (16x16x4)	MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation	1.12	2022-09-19	📦 ai-forever/Kandinsky-2 📦 ai-forever/movqgan
8	Open-Magvit2 (16x16)	Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation	1.17	2024-09-06	📦 tencentarc/open-magvit2 📦 tencentarc/seed-voken
9	ViT-VQGAN (16x16)	Vector-quantized Image Modeling with Improved VQGAN	1.28	2021-10-09	📦 lucidrains/DALLE2-pytorch 📦 thuanz123/enhancing-transformers 📦 thuangb/enhancing-transformers 📦 ai-forever/movqgan 📦 CuddleSabe/VQGAN
10	MaskBit (16x16)	MaskBit: Embedding-free Image Generation via Bit Tokens	1.66	2024-09-24	📦 markweberdev/maskbit

All Papers (15)

Paper	Year	Model	Code
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization	2025	MGVQ (16x16x8)	MKJia/MGVQ
MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization	2025	MGVQ (16x16x4)	MKJia/MGVQ
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation	2025	GigaTok-XL-XXL	SilentView/GigaTok
Preventing Local Pitfalls in Vector Quantization via Optimal Transport	2024	OptVQ (16x16x8)	zbr17/OptVQ
Preventing Local Pitfalls in Vector Quantization via Optimal Transport	2024	OptVQ (16x16x4)	zbr17/OptVQ
Taming Scalable Visual Tokenizer for Autoregressive Image Generation	2024	IBQ (16x16)	tencentarc/seed-voken tencentarc/open-magvit2
MoVQ: Modulating Quantized Vectors for High-Fidelity Image Generation	2022	Mo-VQGAN (16x16x4)	ai-forever/Kandinsky-2 ai-forever/movqgan
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation	2024	Open-Magvit2 (16x16)	tencentarc/open-magvit2 tencentarc/seed-voken
Vector-quantized Image Modeling with Improved VQGAN	2021	ViT-VQGAN (16x16)	lucidrains/DALLE2-pytorch thuanz123/enhancing-transformers
MaskBit: Embedding-free Image Generation via Bit Tokens	2024	MaskBit (16x16)	markweberdev/maskbit
An Image is Worth 32 Tokens for Reconstruction and Generation	2024	TiTok-S-128	bytedance/1d-tokenizer lukaslaobeyer/token-opt
Autoregressive Image Generation using Residual Quantization	2022	RQ-VAE (8x8x16)	kakaobrain/rq-vae-transformer lucidrains/magvit2-pytorch
MaskGIT: Masked Generative Image Transformer	2022	MaskGIT-VQGAN (16x16)	lucidrains/soundstorm-pytorch HKUNLP/Dream
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%	2024	VQGAN-LC (16x16)	zh460045050/vqgan-lc
Taming Transformers for High-Resolution Image Synthesis	2020	Taming-VQGAN (16x16)	CompVis/taming-transformers alibaba/EasyNLP

Model	Paper	FID	Date
MGVQ (16x16x8)	MGVQ: Could VQ-VAE Beat VAE? A Generalizable Toke…	0.49	2025-07-14
MGVQ (16x16x4)	MGVQ: Could VQ-VAE Beat VAE? A Generalizable Toke…	0.64	2025-07-14
GigaTok-XL-XXL	GigaTok: Scaling Visual Tokenizers to 3 Billion P…	0.79	2025-04-11
OptVQ (16x16x8)	Preventing Local Pitfalls in Vector Quantization …	0.91	2024-12-19
OptVQ (16x16x4)	Preventing Local Pitfalls in Vector Quantization …	1.00	2024-12-19
IBQ (16x16)	Taming Scalable Visual Tokenizer for Autoregressi…	1.00	2024-12-03
Mo-VQGAN (16x16x4)	MoVQ: Modulating Quantized Vectors for High-Fidel…	1.12	2022-09-19
Open-Magvit2 (16x16)	Open-MAGVIT2: An Open-Source Project Toward Democ…	1.17	2024-09-06
ViT-VQGAN (16x16)	Vector-quantized Image Modeling with Improved VQG…	1.28	2021-10-09
MaskBit (16x16)	MaskBit: Embedding-free Image Generation via Bit …	1.66	2024-09-24
TiTok-S-128	An Image is Worth 32 Tokens for Reconstruction an…	1.71	2024-06-11
RQ-VAE (8x8x16)	Autoregressive Image Generation using Residual Qu…	1.83	2022-03-03
MaskGIT-VQGAN (16x16)	MaskGIT: Masked Generative Image Transformer	2.28	2022-02-08
VQGAN-LC (16x16)	Scaling the Codebook Size of VQGAN to 100,000 wit…	2.62	2024-06-17
Taming-VQGAN (16x16)	Taming Transformers for High-Resolution Image Syn…	3.64	2020-12-17
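
For readers unfamiliar with the metric, the FID values above are Fréchet distances between two Gaussians fitted to image features (conventionally Inception-v3 pooling features), here comparing reconstructed images against the originals. The page does not state the exact evaluation protocol used for each entry, so the following is only a minimal sketch of the distance itself, assuming feature means and covariances have already been computed. The function names `frechet_distance` and `feature_stats` are hypothetical and do not come from any of the listed repositories.

```python
import numpy as np


def feature_stats(features):
    """Mean vector and covariance matrix of an (n_samples, dim) feature array."""
    features = np.asarray(features, dtype=float)
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = ||mu1 - mu2||^2 + Tr(sigma1) + Tr(sigma2) - 2 Tr((sigma1 sigma2)^(1/2))
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    sigma1 = np.asarray(sigma1, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)

    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative when
    # both inputs are positive semi-definite.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Drop imaginary parts and clip small negatives caused by round-off.
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)
```

In practice one would call `feature_stats` once on features of the original images and once on features of their reconstructions, then pass both results to `frechet_distance`. Scores are only comparable when the feature extractor, image resolution, and evaluation set match, which is worth checking before comparing rows from different papers.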