Visual Question Answering | Janus-Pro-7B | Janus-Pro: Unified Multimodal Understanding and … | 2025-01-29 |
Visual Question Answering | Janus-Pro-1B | Janus-Pro: Unified Multimodal Understanding and … | 2025-01-29 |
Visual Question Answering | PIIP-LLaVA (Vicuna-7B, ConvNeXt-L, CLIP-L) | Parameter-Inverted Image Pyramid Networks for … | 2025-01-14 |
Visual Question Answering | Lyra-Mini | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
Visual Question Answering (VQA) | Lyra-Pro | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
Visual Question Answering | Lyra-Base | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
Visual Question Answering | Lyra-Pro | Lyra: An Efficient and Speech-Centric … | 2024-12-12 |
Visual Question Answering | ILLUME | ILLUME: Illuminating Your LLMs to … | 2024-12-09 |
Visual Question Answering | LLaVA-1.5-7B (VG-S) | ProVision: Programmatically Scaling Vision-centric Instruction … | 2024-12-09 |
Visual Question Answering | LLaVA-1.5-7B (DC-S) | ProVision: Programmatically Scaling Vision-centric Instruction … | 2024-12-09 |
Visual Question Answering | TACO (LLaMA3-8B / CLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
Visual Question Answering | TACO (Qwen2-7B / SigLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
Visual Question Answering | TACO (LLaMA3-8B / SigLIP) | TACO: Learning Multi-modal Action Models … | 2024-12-07 |
Visual Question Answering | InternVL2.5-38B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
Visual Question Answering | InternVL2.5-2B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
Visual Question Answering | InternVL2.5-4B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
Visual Question Answering | InternVL2.5-1B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
Visual Question Answering | InternVL2.5-8B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |
Visual Question Answering | MAmmoTH-VL-8B | MAmmoTH-VL: Eliciting Multimodal Reasoning with … | 2024-12-06 |
Visual Question Answering | InternVL2.5-26B | Expanding Performance Boundaries of Open-Source … | 2024-12-06 |