Papers
-
Cosmos World Foundation Model Platform for Physical AI
-
Titans: Learning to Memorize at Test Time
-
Generative Video Propagation
-
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
-
Qwen2.5 Technical Report
-
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
-
Alignment faking in large language models
-
How Often are Fingerprints Repeated in the Population? Expanding on Evidence from AI With the Birthday Paradox
-
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
-
VDB-GPDF: Online Gaussian Process Distance Field with VDB Structure
-
pfl-research: simulation framework for accelerating research in Private Federated Learning
-
Frontier AI systems have surpassed the self-replicating red line
-
InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention
-
Best-of-N Jailbreaking
-
Creating realistic 3D shapes using generative AI (Massachusetts Institute of Technology)
-
Commit0: Library Generation from Scratch
-
ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models and Large Language Models
-
Controlling Language and Diffusion Models by Transporting Activations
-
The Rise and Potential of Large Language Model Based Agents: A Survey (MIT)
-
Evaluating Cultural and Social Awareness of LLM Web Agents
-
SF-V: Single Forward Video Generation Model
-
The Llama 3 Herd of Models
-
Improving Pinterest Search Relevance Using Large Language Models
-
NVLM: Open Frontier-Class Multimodal LLMs
-
HyQE: Ranking Contexts with Hypothetical Query Embeddings
-
RedPajama: an Open Dataset for Training Large Language Models
-
Understanding Chain-of-Thought in LLMs through Information Theory
-
Survival of the Safest: Towards Secure Prompt Optimization through Interleaved Multi-Objective Evolution
-
Nemotron-4-340B-Instruct
-
Pixtral 12B
-
Data-Driven Discovery of Conservation Laws from Trajectories via Neural Deflation
-
Chronos: Learning the Language of Time Series
-
Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution
-
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
-
HM3: Heterogeneous Multi-Class Model Merging
-
Tarsier: Recipes for Training and Evaluating Large Video Description Models
-
OpenVLA: An Open-Source Vision-Language-Action Model
-
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
-
Arctic-TILT. Business Document Understanding at Sub-Billion Scale
-
General-Purpose User Modeling with Behavioral Logs
-
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
-
Qwen2-Audio Technical Report
-
Qwen2 Technical Report
-
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
-
Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments
-
Claude 3.5 Sonnet Model Card Addendum
-
Abliteration
-
Multi-Agent Software Development through Cross-Team Collaboration
-
Efficient Large Language Model Inference with Limited Memory
-
AgentBoard: An Evaluation Platform for LLM-Based Autonomous Agents
