| SD3.5-Medium+Flow-GRPO | Flow-GRPO: Training Flow Matching Models via Onli… | 0.95 | 2025-05-08 |
| UniWorld-V1 (Rewrite) | UniWorld-V1: High-Resolution Semantic Encoders fo… | 0.84 | 2025-06-03 |
| MindOmni | MindOmni: Unleashing Reasoning Generation in Visi… | 0.83 | 2025-05-19 |
| UniWorld-V1 | UniWorld-V1: High-Resolution Semantic Encoders fo… | 0.80 | 2025-06-03 |
| SANA-1.5 4.8B (+ Inference Scaling) | SANA 1.5: Efficient Scaling of Training-Time and … | 0.80 | 2025-01-30 |
| Janus-Pro-7B | Janus-Pro: Unified Multimodal Understanding and G… | 0.80 | 2025-01-29 |
| MetaQuery-XL (Rewrite) | Transfer between Modalities with MetaQueries | 0.80 | 2025-04-08 |
| Show-o [xie2024show] PARM It. DPO PARM | Can We Generate Images with CoT? Let's Verify and… | 0.77 | 2025-01-23 |
| Show-o [xie2024show] Ft. ORM It. DPO Ft. ORM | Can We Generate Images with CoT? Let's Verify and… | 0.75 | 2025-01-23 |
| Janus-Pro-1B | Janus-Pro: Unified Multimodal Understanding and G… | 0.73 | 2025-01-29 |
| Lumina-Image 2.0 | Lumina-Image 2.0: A Unified and Efficient Image G… | 0.73 | 2025-03-27 |
| SANA-1.5 4.8B | SANA 1.5: Efficient Scaling of Training-Time and … | 0.72 | 2025-01-30 |
| Fluid (10.5B) | Fluid: Scaling Autoregressive Text-to-image Gener… | 0.69 | 2024-10-17 |
| Und. and Gen. Show-o (Ours) | Show-o: One Single Transformer to Unify Multimoda… | 0.68 | 2024-08-22 |
| Emu3 | Emu3: Next-Token Prediction is All You Need | 0.66 | 2024-09-27 |
| SnapGen | SnapGen: Taming High-Resolution Text-to-Image Mod… | 0.66 | 2024-12-12 |
| JanusFlow | JanusFlow: Harmonizing Autoregression and Rectifi… | 0.63 | 2024-11-12 |
| PixArt-Σ | PixArt-Σ: Weak-to-Strong Training of Diffusion Tr… | 0.53 | 2024-03-07 |
| DiffMoE-E16-T2I-Flow (w SFT) | DiffMoE: Dynamic Token Selection for Scalable Dif… | 0.51 | 2025-03-18 |
| PIXART-δ | PIXART-δ: Fast and Controllable Image Generation … | 0.00 | 2024-01-10 |