2024 Papers
Machine learning and AI research papers from 2024
Intelligent Clinical Documentation: Harnessing Generative AI for Patient-Centric Clinical Note Generation
2024
•
938 citations
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
2024
•
876 citations
Mixtral of Experts
2024
•
867 citations
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
2024
•
861 citations
Qwen2 Technical Report
2024
•
666 citations
YOLOv10: Real-Time End-to-End Object Detection
2024
•
632 citations
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
2024
•
608 citations
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
2024
•
579 citations
DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence
2024
•
551 citations
SAM 2: Segment Anything in Images and Videos
2024
•
529 citations
VMamba: Visual State Space Model
2024
•
517 citations
Gemma 2: Improving Open Language Models at a Practical Size
2024
•
503 citations
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
2024
•
448 citations
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
2024
•
447 citations
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
2024
•
433 citations
LLaVA-OneVision: Easy Visual Task Transfer
2024
•
420 citations
KTO: Model Alignment as Prospect Theoretic Optimization
2024
•
393 citations
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
2024
•
386 citations
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
2024
•
383 citations
KAN: Kolmogorov-Arnold Networks
2024
•
380 citations
Gemma: Open Models Based on Gemini Research and Technology
2024
•
366 citations
GPT-4o System Card
2024
•
354 citations
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
2024
•
349 citations
2D Gaussian Splatting for Geometrically Accurate Radiance Fields
2024
•
341 citations
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
2024
•
321 citations
TinyLlama: An Open-Source Small Language Model
2024
•
320 citations
OLMo: Accelerating the Science of Language Models
2024
•
317 citations
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
2024
•
306 citations
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
2024
•
303 citations
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
2024
•
300 citations
Improving Online Algorithms via ML Predictions
2024
•
299 citations
WorldSimBench: Towards Video Generation Models as World Simulators
2024
•
299 citations
Large Language Models: A Survey
2024
•
296 citations
SimPO: Simple Preference Optimization with a Reference-Free Reward
2024
•
287 citations
DoRA: Weight-Decomposed Low-Rank Adaptation
2024
•
285 citations
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
2024
•
284 citations
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
2024
•
282 citations
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
2024
•
275 citations
Self-Rewarding Language Models
2024
•
268 citations
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
2024
•
256 citations
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
2024
•
253 citations
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
2024
•
251 citations
UAV Networks Surveillance Implementing an Effective Load-Aware Multipath Routing Protocol (ELAMRP)
2024
•
249 citations
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
2024
•
248 citations
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
2024
•
243 citations
DeepSeek-VL: Towards Real-World Vision-Language Understanding
2024
•
240 citations
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
2024
•
239 citations
Depth Anything V2
2024
•
238 citations
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
2024
•
236 citations
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
2024
•
235 citations
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
2024
•
229 citations
BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains
2024
•
228 citations
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
2024
•
227 citations
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
2024
•
227 citations
Chameleon: Mixed-Modal Early-Fusion Foundation Models
2024
•
222 citations
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Models
2024
•
219 citations
MEDUSA: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
2024
•
219 citations
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
2024
•
217 citations
A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
2024
•
216 citations
VideoMamba: State Space Model for Efficient Video Understanding
2024
•
214 citations
VM-UNet: Vision Mamba UNet for Medical Image Segmentation
2024
•
210 citations
Zero-shot Identity-Preserving Generation in Seconds
2024
•
208 citations
YOLO-World: Real-Time Open-Vocabulary Object Detection
2024
•
206 citations
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
2024
•
204 citations
Latte: Latent Diffusion Transformer for Video Generation
2024
•
201 citations
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
2024
•
199 citations
Large Language Model based Multi-Agents: A Survey of Progress and Challenges
2024
•
197 citations
RULER: What's the Real Context Size of Your Long-Context Language Models?
2024
•
195 citations
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
2024
•
190 citations
Are We on the Right Way for Evaluating Large Vision-Language Models?
2024
•
189 citations
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
2024
•
189 citations
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
2024
•
182 citations
Jamba: A Hybrid Transformer-Mamba Language Model
2024
•
180 citations
GPT-4V(ision) is a Generalist Web Agent, if Grounded
2024
•
178 citations
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
2024
•
177 citations
Hallucination is Inevitable: An Innate Limitation of Large Language Models
2024
•
176 citations
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
2024
•
175 citations
ORPO: Monolithic Preference Optimization without Reference Model
2024
•
173 citations
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
2024
•
172 citations
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
2024
•
170 citations
Retrieval-Augmented Generation for AI-Generated Content: A Survey
2024
•
166 citations
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
2024
•
165 citations
MambaIR: A Simple Baseline for Image Restoration with State-Space Model
2024
•
162 citations
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
2024
•
161 citations
STAR: A Benchmark for Situated Reasoning in Real-World Videos
2024
•
161 citations
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
2024
•
160 citations
LocalMamba: Visual State Space Model with Windowed Selective Scan
2024
•
158 citations
A Survey on 3D Gaussian Splatting
2024
•
157 citations
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
2024
•
157 citations
Quantum computing with Qiskit
2024
•
156 citations
LLaVA-CoT: Let Vision Language Models Reason Step-by-Step
2024
•
153 citations
MM-LLMs: Recent Advances in MultiModal Large Language Models
2024
•
153 citations
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
2024
•
153 citations
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
2024
•
151 citations
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
2024
•
150 citations
RAFT: Adapting Language Model to Domain Specific RAG
2024
•
150 citations
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
2024
•
149 citations
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
2024
•
148 citations
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
2024
•
147 citations
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
2024
•
147 citations