Orchestrator-8B

Released: November 26, 2025

Overview

ToolOrchestra is NVIDIA’s method for training small “orchestrator” models that decide when and how to call tools and other LLMs. Its flagship model, Orchestrator-8B, outperforms GPT-5 on hard agent benchmarks such as Humanity’s Last Exam at about 30% of the cost and roughly 2.5× the speed.

Description

ToolOrchestra trains an 8B-parameter controller that alternates between reasoning and tool calls over multiple turns. At each turn it chooses among web search, code interpreters, specialized math or coding models, and generalist LLMs such as GPT-5, Llama-Nemotron-Ultra, and Claude Opus 4.1. Training uses reinforcement learning with three reward signals (outcome, efficiency, and preference), so the orchestrator learns not only to solve tasks but to solve them cheaply and in line with the user’s tool preferences. On Humanity’s Last Exam, τ²-Bench, and FRAMES, Orchestrator-8B reaches higher accuracy than GPT-5 at around a third of the cost, and it generalizes robustly to tools it did not see during training. These results suggest that a lightweight tool router can be more cost-effective than a single giant model.
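
The three reward signals described above can be illustrated with a small sketch. This is not NVIDIA's reward function: the weights, budgets, linear penalties, and the Trajectory and combined_reward names are all hypothetical, chosen only to show how outcome, efficiency, and preference terms might be folded into one scalar for reinforcement learning.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One finished multi-turn episode (hypothetical structure)."""
    solved: bool           # did the final answer pass the task check?
    cost_usd: float        # total spend on tool and model calls
    latency_s: float       # wall-clock time for the whole episode
    tools_used: list[str]  # tool names in call order


def combined_reward(
    traj: Trajectory,
    preferred_tools: set[str],
    w_outcome: float = 1.0,          # assumed weight, not from the source
    w_efficiency: float = 0.5,       # assumed weight, not from the source
    w_preference: float = 0.25,      # assumed weight, not from the source
    cost_budget_usd: float = 1.0,    # assumed budget
    latency_budget_s: float = 60.0,  # assumed budget
) -> float:
    # Outcome: 1 if the task was solved, else 0.
    outcome = 1.0 if traj.solved else 0.0

    # Efficiency: 1 when free and instant, falling linearly to 0 at budget.
    cost_term = max(0.0, 1.0 - traj.cost_usd / cost_budget_usd)
    latency_term = max(0.0, 1.0 - traj.latency_s / latency_budget_s)
    efficiency = 0.5 * (cost_term + latency_term)

    # Preference: fraction of calls that went to tools the user prefers.
    if traj.tools_used:
        hits = sum(tool in preferred_tools for tool in traj.tools_used)
        preference = hits / len(traj.tools_used)
    else:
        preference = 0.0

    return (w_outcome * outcome
            + w_efficiency * efficiency
            + w_preference * preference)


preferred = {"web_search", "code_interpreter"}
cheap = Trajectory(True, 0.25, 15.0, ["web_search", "code_interpreter"])
costly = Trajectory(True, 1.0, 60.0, ["large_llm"])

print(combined_reward(cheap, preferred))   # 1.625
print(combined_reward(costly, preferred))  # 1.0
```

Both example trajectories solve the task, but the one that stays within budget and uses the preferred tools scores higher, which is the kind of pressure that would push a policy toward cheaper routing decisions.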

About NVIDIA Corporation

Industry: Computer Hardware Manufacturing
Company Size: 36,000
Location: Santa Clara, California, US
Website: nvidia.com

Last updated: December 2, 2025