Papers

MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production

AMD / Peking University

1 author
The $qs$ Inequality: Quantifying the Double Penalty of Mixture-of-Experts at Inference

AMD

Published on: 2026-03-09 2 authors
Dynamic Chunking Diffusion Transformer

AMD

Published on: 2026-03-06 6 authors
AI+HW 2035: Shaping the Next Decade

NVIDIA, Google, AMD, IBM, Together AI, OpenAI, SEMRON, EnCharge AI, SambaNova, SK Hynix, Oracle / Agentrys, Brown University, California Institute of Technology, Carnegie Mellon University, Hewlett Packard Labs, New York University, Princeton University, Stanford University, University at Buffalo, University of California, University of Illinois Urbana-Champaign, University of Pennsylvania, University of Texas

Published on: 2026-03-05 30 authors
Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks

AMD / The Pennsylvania State University

Published on: 2026-03-05 5 authors
GUI-GENESIS: Automated Synthesis of Efficient Environments with Verifiable Rewards for GUI Agent Post-Training

Tencent, AMD / Peking University

Published on: 2026-02-15 1 author
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

AMD

Published on: 2026-02-14 1 author
AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection

AMD / Indian Institute of Technology Kharagpur

Published on: 2026-02-12 1 author
Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers

AMD

Published on: 2026-02-04 1 author
M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization

AMD / Shanghai Jiao Tong University

Published on: 2026-01-28 1 author
Zebra-Llama: Towards Extremely Efficient Hybrid Models

AMD

Published on: 2026-01-20 1 author
Power Aware Dynamic Reallocation For Inference

AMD

Published on: 2026-01-18 1 author
AIE4ML: An End-to-End Framework for Compiling Neural Networks for the Next Generation of AMD AI Engines

AMD / Institute of Physics Belgrade

Published on: 2026-01-09 1 author
CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models

AMD / Princeton University

Published on: 2026-01-05 1 author
CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving

AMD / Columbia University, Yale University

Published on: 2025-12-11 2 authors
SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning

AMD / University of Illinois Urbana-Champaign

Published on: 2025-10-12 1 author
Efficient and Adaptable Overlapping for Computation and Communication via Signaling and Reordering

AMD / Tsinghua University

Published on: 2025-10-09 1 author
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation

AMD / Carnegie Mellon University

Published on: 2025-09-26 1 author
Geak: Introducing Triton Kernel AI Agent & Evaluation Benchmarks

AMD

Published on: 2025-07-31 1 author
Agent Laboratory: Using LLM Agents as Research Assistants

AMD / ETH Zurich

Published on: 2025-06-17 1 author
Agent Laboratory: Using LLM Agents as Research Assistants

AMD / Johns Hopkins University, Swiss Federal Institute of Technology in Zurich

Published on: 2025-01-08 10 authors
