Papers
-
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
-
Kimi K2: Open Agentic Intelligence
-
Scaling Data-Constrained Language Models
-
Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice
-
Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them
-
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
-
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition
-
Voxtral
-
Apple Intelligence Foundation Language Models: Tech Report 2025
-
Non-preemptive Throughput Maximization under Time-varying Capacity
-
MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning
-
A Survey of Automatic Prompt Optimization with Instruction-focused Heuristic-based Search Algorithm
-
SEE: Strategic Exploration and Exploitation for Cohesive In-Context Prompt Optimization
-
Skywork-R1V3 Technical Report
-
SHADE-Arena: Evaluating Sabotage and Monitoring in LLM Agents
-
Unconditional Diffusion for Generative Sequential Recommendation
-
Evaluating the Critical Risks of Amazon’s Nova Premier under the Frontier Model Safety Framework
-
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
-
'For Argument's Sake, Show Me How to Harm Myself!': Jailbreaking LLMs in Suicide and Self-Harm Contexts
-
UMA: A Family of Universal Models for Atoms
-
Hierarchical Reasoning Model
-
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
-
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
-
Kimi-VL Technical Report
-
TransAct V2: Lifelong User Action Sequence Modeling on Pinterest Recommendation
-
Next-User Retrieval: Enhancing Cold-Start Recommendations via Generative Next-User Modeling
-
Agent Laboratory: Using LLM Agents as Research Assistants
-
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs
-
AlphaEvolve: A coding agent for scientific and algorithmic discovery
-
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences
-
Transformers without Normalization
-
T1: Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
-
Ministral 3
-
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
-
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce
-
Magistral
-
Adobe Researchers present a powerful, unified approach to generative video editing at CVPR 2025
-
Multi-Token Attention
-
Transaction Categorization with Relational Deep Learning in QuickBooks
-
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought
-
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
-
Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
-
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
-
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
-
Splat and Replace: 3D Reconstruction with Repetitive Elements
-
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
-
HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases
-
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
-
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
-
M+: Extending MemoryLLM with Scalable Long-Term Memory
