2024 Papers

LESS: Selecting Influential Data for Targeted Instruction Tuning

2024 • 146 citations

Quantum error correction below the surface code threshold (Google Quantum AI and Collaborators)

2024 • 145 citations

SWE-AGENT: AGENT-COMPUTER INTERFACES ENABLE AUTOMATED SOFTWARE ENGINEERING

2024 • 143 citations

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

2024 • 142 citations

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

2024 • 142 citations

Chronos: Learning the Language of Time Series

2024 • 140 citations

PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

2024 • 140 citations

MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images

2024 • 140 citations

JAILBREAKING LEADING SAFETY-ALIGNED LLMS WITH SIMPLE ADAPTIVE ATTACKS

2024 • 139 citations

Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs

2024 • 139 citations

A Survey of Imitation Learning Methods, Environments and Metrics

2024 • 138 citations

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

2024 • 138 citations

What matters when building vision-language models?

2024 • 138 citations

Autoregressive Image Generation without Vector Quantization

2024 • 138 citations

MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

2024 • 138 citations

PROMETHEUS 2: An Open Source Language Model Specialized in Evaluating Other Language Models

2024 • 136 citations

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

2024 • 136 citations

A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

2024 • 134 citations

SHOW-O: ONE SINGLE TRANSFORMER TO UNIFY MULTIMODAL UNDERSTANDING AND GENERATION

2024 • 132 citations

TO COT OR NOT TO COT? CHAIN-OF-THOUGHT HELPS MAINLY ON MATH AND SYMBOLIC REASONING

2024 • 132 citations

Unified Training of Universal Time Series Forecasting Transformers

2024 • 132 citations

SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

2024 • 131 citations

LongICLBench: Long-context LLMs Struggle with Long In-context Learning

2024 • 131 citations

QWEN2.5-MATH TECHNICAL REPORT: TOWARD MATHEMATICAL EXPERT MODEL VIA SELF-IMPROVEMENT

2024 • 131 citations

SLICEGPT: COMPRESS LARGE LANGUAGE MODELS BY DELETING ROWS AND COLUMNS

2024 • 131 citations

SnapKV: LLM Knows What You are Looking for Before Generation

2024 • 129 citations

Large Language Models (LLMs) as Agents for Augmented Democracy

2024 • 128 citations

GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

2024 • 127 citations

Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

2024 • 124 citations

Benchmarking Retrieval-Augmented Generation for Medicine

2024 • 124 citations

The Power of Noise: Redefining Retrieval for RAG Systems

2024 • 123 citations

∞BENCH: Extending Long Context Evaluation Beyond 100K Tokens

2024 • 122 citations

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

2024 • 122 citations

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

2024 • 121 citations

LoRA+: Efficient Low Rank Adaptation of Large Models

2024 • 121 citations

LLM Evaluators Recognize and Favor Their Own Generations

2024 • 121 citations

THE FAISS LIBRARY

2024 • 120 citations

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

2024 • 120 citations

A Survey on Large Language Models for Code Generation

2024 • 120 citations

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

2024 • 120 citations

Emu3: Next-Token Prediction is All You Need

2024 • 119 citations

CodeS: Towards Building Open-source Language Models for Text-to-SQL

2024 • 119 citations

From r to Q*: Your Language Model is Secretly a Q-Function

2024 • 119 citations

DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving

2024 • 119 citations

Genie: Generative Interactive Environments

2024 • 119 citations

AI and Memory Wall

2024 • 118 citations

Break the Sequential Dependency of LLM Inference Using LOOKAHEAD DECODING

2024 • 118 citations

TOFU: A Task of Fictitious Unlearning for LLMs

2024 • 118 citations

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

2024 • 117 citations

ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models

2024 • 116 citations

Refusal in Language Models Is Mediated by a Single Direction

2024 • 115 citations

DataComp-LM: In search of the next generation of training sets for language models (NeurIPS)

2024 • 115 citations

Long Context Transfer from Language to Vision

2024 • 115 citations

Improve Mathematical Reasoning in Language Models by Automated Process Supervision

2024 • 114 citations

Hallucination of Multimodal Large Language Models: A Survey

2024 • 113 citations

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

2024 • 113 citations

DRIVEVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

2024 • 113 citations

Direct Language Model Alignment from Online AI Feedback

2024 • 113 citations

TOWER: An Open Multilingual Large Language Model for Translation-Related Tasks

2024 • 112 citations

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

2024 • 112 citations

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

2024 • 112 citations

Moving horizon partition-based state estimation of large-scale systems (revised version)

2024 • 110 citations

NV-EMBED: IMPROVED TECHNIQUES FOR TRAINING LLMS AS GENERALIST EMBEDDING MODELS

2024 • 110 citations

Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity

2024 • 109 citations

Data Engineering for Scaling Language Models to 128K Context

2024 • 109 citations

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search

2024 • 109 citations

SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

2024 • 109 citations

TripoSR: Fast 3D Object Reconstruction from a Single Image

2024 • 109 citations

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve

2024 • 108 citations

CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

2024 • 108 citations

UniDepth: Universal Monocular Metric Depth Estimation

2024 • 108 citations

We introduce negative-quality prompts to further improve perceptual quality. We also develop a restoration-guided sampling method to suppress the fidelity issue encountered in generative-based restoration. Experiments demonstrate SUPIR's exceptional restoration effects and its novel capacity to manipulate restoration through textual prompts.

2024 • 107 citations

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

2024 • 107 citations

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

2024 • 107 citations

YOLOV11: AN OVERVIEW OF THE KEY ARCHITECTURAL ENHANCEMENTS

2024 • 105 citations

Generative Verifiers: Reward Modeling as Next-Token Prediction

2024 • 104 citations

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

2024 • 104 citations

SDXL-Lightning: Progressive Adversarial Diffusion Distillation

2024 • 104 citations

VIDEO INSTRUCTION TUNING WITH SYNTHETIC DATA

2024 • 104 citations

Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning

2024 • 104 citations

Gymnasium: A Standard Interface for Reinforcement Learning Environments

2024 • 104 citations

RAPTOR: RECURSIVE ABSTRACTIVE PROCESSING FOR TREE-ORGANIZED RETRIEVAL

2024 • 102 citations

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

2024 • 102 citations

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

2024 • 102 citations

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

2024 • 102 citations

Evaluating Text-to-Visual Generation with Image-to-Text Generation

2024 • 101 citations

Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation

2024 • 101 citations

Understanding the planning of LLM agents: A survey

2024 • 101 citations

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

2024 • 101 citations

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Acceleration for VLLM Inference

2024 • 101 citations

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

2024 • 101 citations

Executable Code Actions Elicit Better LLM Agents

2024 • 100 citations

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

2024 • 100 citations

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

2024 • 100 citations

HunyuanVideo: A Systematic Framework For Large Video Generative Models (Hunyuan Foundation Model Team)

2024 • 99 citations

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

2024 • 99 citations

WHEN SCALING MEETS LLM FINETUNING: THE EFFECT OF DATA, MODEL AND FINETUNING METHOD

2024 • 99 citations

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

2024 • 99 citations

CodeGemma: Open Code Models Based on Gemma (CodeGemma Team, Google LLC)

2024 • 98 citations

FROM CROWDSOURCED DATA TO HIGH-QUALITY BENCHMARKS: ARENA-HARD AND BENCHBUILDER PIPELINE

2024 • 98 citations
