Papers
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
- ItemSage: Learning Product Embeddings for Shopping Recommendations at Pinterest
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
- MultiBiSage: A Web-Scale Recommendation System Using Multiple Bipartite Graphs at Pinterest
- M6-Rec: Generative Pretrained Language Models are Open-Ended Recommender Systems
- Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
- PinnerFormer: Sequence Modeling for User Representation at Pinterest
- Hierarchical Text-Conditional Image Generation with CLIP Latents
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Training Compute-Optimal Large Language Models
- CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
- Learning When to Translate for Streaming Speech
- Training a Tokenizer for Free with Private Federated Learning
- Training Language Models to Follow Instructions with Human Feedback
- FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators
- Competition-Level Code Generation with AlphaCode
- Low-Overhead Fault-Tolerant Quantum Error Correction with the Surface-GKP Code
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
- ML-Decoder: Scalable and Versatile Classification Head
- A Mathematical Framework for Transformer Circuits
- Training Verifiers to Solve Math Word Problems
- Improving language models by retrieving from trillions of tokens
- On-device Panoptic Segmentation for Camera Using Transformers
- Merlion: A Machine Learning Library for Time Series
- Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- AGENT: A Benchmark for Core Psychological Reasoning
- Highly accurate protein structure prediction with AlphaFold
- LoRA: Low-Rank Adaptation of Large Language Models
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- M6: A Chinese Multimodal Pretrainer
- Holographic dynamics simulations with a trapped ion quantum computer
- An autonomous debating system (Project Debater)
- Learning Transferable Visual Models From Natural Language Supervision
- Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- PinnerSage: Multi-Modal User Embedding Framework for Recommendations at Pinterest
- Denoising Diffusion Probabilistic Models
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Scaling Laws for Neural Language Models
- Dota 2 with Large Scale Deep Reinforcement Learning
- PyTorch: An Imperative Style, High-Performance Deep Learning Library
- Overton: A Data System for Monitoring and Improving Machine-Learned Products
- StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
- D2-Net: A Trainable CNN for Joint Description and Detection of Local Features
