ScoutML
2021 Papers
Machine learning and AI research papers from 2021
Generative Adversarial Networks (2021) • 30040 citations
Learning Transferable Visual Models From Natural Language Supervision (2021) • 25508 citations
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (2021) • 19276 citations
High-Resolution Image Synthesis with Latent Diffusion Models (2021) • 13656 citations
LoRA: Low-Rank Adaptation of Large Language Models (2021) • 8539 citations
Masked Autoencoders Are Scalable Vision Learners (2021) • 6999 citations
Diffusion Models Beat GANs on Image Synthesis (2021) • 6824 citations
Emerging Properties in Self-Supervised Vision Transformers (2021) • 5429 citations
Evaluating Large Language Models Trained on Code (2021) • 4608 citations
Zero-Shot Text-to-Image Generation (2021) • 4544 citations
Prefix-Tuning: Optimizing Continuous Prompts for Generation (2021) • 3849 citations
YOLOX: Exceeding YOLO Series in 2021 (2021) • 3666 citations
The Power of Scale for Parameter-Efficient Prompt Tuning (2021) • 3655 citations
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (2021) • 3633 citations
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision (2021) • 3509 citations
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (2021) • 3448 citations
Training Verifiers to Solve Math Word Problems (2021) • 3408 citations
Finetuned Language Models Are Zero-Shot Learners (2021, published at ICLR 2022) • 3403 citations
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models (2021) • 3262 citations
Improved Denoising Diffusion Probabilistic Models (2021) • 3220 citations
SimCSE: Simple Contrastive Learning of Sentence Embeddings (2021) • 3111 citations
BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes (2021) • 3105 citations
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation (2021) • 3069 citations
Coordinate Attention for Efficient Mobile Network Design (2021) • 2633 citations
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units (2021) • 2624 citations
BEiT: BERT Pre-Training of Image Transformers (2021) • 2621 citations
SwinIR: Image Restoration Using Swin Transformer (2021) • 2590 citations
Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation (2021) • 2524 citations
MLP-Mixer: An all-MLP Architecture for Vision (2021) • 2490 citations
EfficientNetV2: Smaller Models and Faster Training (2021) • 2417 citations
Transformers in Vision: A Survey (2021) • 2308 citations
Barlow Twins: Self-Supervised Learning via Redundancy Reduction (2021) • 2202 citations
Learning to Prompt for Vision-Language Models (2021) • 2109 citations
Masked-attention Mask Transformer for Universal Image Segmentation (2021) • 2081 citations
ViViT: A Video Vision Transformer (2021) • 1986 citations
Restormer: Efficient Transformer for High-Resolution Image Restoration (2021) • 1946 citations
RoFormer: Enhanced Transformer with Rotary Position Embedding (2021) • 1927 citations
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (2021) • 1919 citations
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting (2021) • 1886 citations
Is Space-Time Attention All You Need for Video Understanding? (2021) • 1868 citations
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet (2021) • 1839 citations
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields (2021) • 1804 citations
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation (2021) • 1798 citations
CvT: Introducing Convolutions to Vision Transformers (2021) • 1787 citations
An Empirical Study of Training Self-Supervised Vision Transformers (2021) • 1721 citations
Integrated Sensing and Communications: Towards Dual-functional Wireless Networks for 6G and Beyond (2021) • 1709 citations
Swin Transformer V2: Scaling Up Capacity and Resolution (2021) • 1636 citations
NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction (2021) • 1596 citations
Image Super-Resolution via Iterative Refinement (2021) • 1594 citations
TruthfulQA: Measuring How Models Mimic Human Falsehoods (2021) • 1587 citations
Measuring Mathematical Problem Solving With the MATH Dataset (2021) • 1586 citations
Program Synthesis with Large Language Models (2021) • 1585 citations
Vision Transformers for Dense Prediction (2021) • 1567 citations
Attention Mechanisms in Computer Vision: A Survey (2021) • 1543 citations
Plenoxels: Radiance Fields without Neural Networks (2021) • 1525 citations
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields (2021) • 1522 citations
Palette: Image-to-Image Diffusion Models (2021) • 1504 citations
Alias-Free Generative Adversarial Networks (2021) • 1502 citations
Decision Transformer: Reinforcement Learning via Sequence Modeling (2021) • 1481 citations
Efficiently Modeling Long Sequences with Structured State Spaces (2021) • 1469 citations
Transformer in Transformer (2021) • 1453 citations
RepVGG: Making VGG-style ConvNets Great Again (2021) • 1436 citations
Calibrate Before Use: Improving Few-Shot Performance of Language Models (2021) • 1432 citations
Per-Pixel Classification is Not All You Need for Semantic Segmentation (2021) • 1427 citations
UNETR: Transformers for 3D Medical Image Segmentation (2021) • 1404 citations
GLM: General Language Model Pretraining with Autoregressive Blank Infilling (2021) • 1396 citations
Video Swin Transformer (2021) • 1387 citations
Multi-Stage Progressive Image Restoration (2021) • 1386 citations
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation (2021) • 1386 citations
Segmenter: Transformer for Semantic Segmentation (2021) • 1349 citations
CLIPScore: A Reference-free Evaluation Metric for Image Captioning (2021) • 1330 citations
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification (2021) • 1328 citations
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations (2021) • 1314 citations
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs (2021) • 1303 citations
Uformer: A General U-Shaped Transformer for Image Restoration (2021) • 1260 citations
What Makes Good In-Context Examples for GPT-3? (2021) • 1252 citations
SimMIM: a Simple Framework for Masked Image Modeling (2021) • 1245 citations
ByteTrack: Multi-Object Tracking by Associating Every Detection Box (2021) • 1235 citations
Ensemble deep learning: A review (2021) • 1208 citations
Multiscale Vision Transformers (2021) • 1175 citations
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (2021) • 1155 citations
E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials (2021) • 1142 citations
Efficient Geometry-aware 3D Generative Adversarial Networks (2021) • 1135 citations
WebGPT: Browser-assisted question-answering with human feedback (2021) • 1128 citations
Tabular Data: Deep Learning Is Not All You Need (2021) • 1118 citations
Generalizing to Unseen Domains: A Survey on Domain Generalization (2021, IEEE Transactions on Knowledge and Data Engineering) • 1115 citations
Cascaded Diffusion Models for High Fidelity Image Generation (2021) • 1114 citations
CoAtNet: Marrying Convolution and Attention for All Data Sizes (2021) • 1113 citations
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer (2021) • 1100 citations
Focal and Efficient IOU Loss for Accurate Bounding Box Regression (2021) • 1099 citations
GPT Understands, Too (2021) • 1099 citations
TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios (2021) • 1093 citations
Machine learning and deep learning (2021) • 1091 citations
The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games (2021) • 1088 citations
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models (2021) • 1088 citations
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval (2021) • 1074 citations
LoFTR: Detector-Free Local Feature Matching with Transformers (2021) • 1067 citations
Physics-informed neural networks (PINNs) for fluid mechanics: A review (2021) • 1051 citations
A Survey of Uncertainty in Deep Neural Networks (2021) • 1040 citations
DeBERTaV3: Improving DeBERTa Using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing (2021) • 1038 citations
Page 1 of 819