📊 Showing 3 results | 📏 Metric: Accuracy
| Rank | Model | Paper | Accuracy (%) | Date | Code |
|---|---|---|---|---|---|
| 1 | Mirasol3B | Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities | 78.20 | 2023-11-09 | - |
| 2 | CA2ST(B/16) | CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition | 61.00 | 2025-03-30 | - |
| 3 | CAVA(B/16) | CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition | 60.30 | 2025-03-30 | - |