| PPLLaVA-7B | PPLLaVA: Varied Video Sequence Understanding With… | 3.85 | 2024-11-04 |
| PLLaVA-34B | PLLaVA : Parameter-free LLaVA Extension from Imag… | 3.60 | 2024-04-25 |
| TS-LLaVA-34B | TS-LLaVA: Constructing Visual Tokens through Thum… | 3.55 | 2024-11-17 |
| SlowFast-LLaVA-34B | SlowFast-LLaVA: A Strong Training-Free Baseline f… | 3.48 | 2024-07-22 |
| VideoChat2_HD_mistral | MVBench: A Comprehensive Multi-modal Video Unders… | 3.40 | 2023-11-28 |
| VideoGPT+ | VideoGPT+: Integrating Image and Video Encoders f… | 3.27 | 2024-06-13 |
| ST-LLM | ST-LLM: Large Language Models Are Effective Tempo… | 3.23 | 2024-03-30 |
| MiniGPT4-video-7B | MiniGPT4-Video: Advancing Multimodal LLMs for Vid… | 3.08 | 2024-04-04 |
| VideoChat2 | MVBench: A Comprehensive Multi-modal Video Unders… | 3.02 | 2023-11-28 |
| Chat-UniVi | Chat-UniVi: Unified Visual Representation Empower… | 2.89 | 2023-11-14 |
| VTimeLLM | VTimeLLM: Empower LLM to Grasp Video Moments | 2.78 | 2023-11-30 |
| MovieChat | MovieChat: From Dense Token to Sparse Memory for … | 2.76 | 2023-07-31 |
| BT-Adapter | BT-Adapter: Video Conversation is Feasible Withou… | 2.68 | 2023-09-27 |
| Video-ChatGPT | Video-ChatGPT: Towards Detailed Video Understandi… | 2.40 | 2023-06-08 |
| Video Chat | VideoChat: Chat-Centric Video Understanding | 2.32 | 2023-05-10 |
| BT-Adapter (zero-shot) | BT-Adapter: Video Conversation is Feasible Withou… | 2.16 | 2023-09-27 |
| LLaMA Adapter | LLaMA-Adapter V2: Parameter-Efficient Visual Inst… | 2.03 | 2023-04-28 |
| Video LLaMA | Video-LLaMA: An Instruction-tuned Audio-Visual La… | 1.96 | 2023-06-05 |