Orchestrator-8B | AI Model

Overview

ToolOrchestra is NVIDIA’s method for training small “orchestrator” models that decide when and how to call tools and other LLMs. The main model, Orchestrator-8B, outperforms GPT-5 on hard agent benchmarks such as Humanity’s Last Exam at about 30% of the cost while running roughly 2.5× faster.

Description

ToolOrchestra trains an 8B controller that alternates between reasoning and tool calls over multiple turns. At each turn it chooses among web search, code interpreters, specialized math or coding models, and generalist LLMs such as GPT-5, Llama-Nemotron-Ultra, and Claude Opus 4.1. Training uses reinforcement learning with outcome, efficiency, and preference rewards, so the orchestrator learns not only to solve tasks but to do so cheaply and in line with user tool preferences. On benchmarks like Humanity’s Last Exam, τ²-Bench, and FRAMES, Orchestrator-8B reaches higher accuracy than GPT-5 at around a third of the cost and generalizes robustly to unseen tools. These results suggest that a lightweight tool router can be more cost-effective than a single giant model.
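
To make the three reward signals concrete, here is a minimal Python sketch of how an outcome term, an efficiency term, and a preference term could be combined into one scalar for a multi-turn trajectory. It is an illustration under stated assumptions, not NVIDIA's actual reward: the `Step` record, the `composite_reward` function, the linear combination, and all weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One turn of a trajectory (hypothetical record, not from the source)."""
    tool: str       # name of the tool or model called at this turn
    cost: float     # assumed monetary cost of the call
    latency: float  # assumed wall-clock seconds for the call


def composite_reward(steps, solved, preferred_tools,
                     w_cost=0.1, w_latency=0.01, w_pref=0.5):
    """Combine outcome, efficiency, and preference terms into one scalar.

    The linear form and the default weights are assumptions chosen for
    illustration only.
    """
    # Outcome: 1 if the task was solved, else 0.
    outcome = 1.0 if solved else 0.0

    # Efficiency: penalize total cost and total latency across all calls.
    total_cost = sum(s.cost for s in steps)
    total_latency = sum(s.latency for s in steps)
    efficiency = -(w_cost * total_cost + w_latency * total_latency)

    # Preference: fraction of calls that used a tool the user prefers.
    if steps:
        preference = sum(s.tool in preferred_tools for s in steps) / len(steps)
    else:
        preference = 0.0

    return outcome + efficiency + w_pref * preference


# Two trajectories that both solve the task; the numbers are made up.
cheap = [Step("web_search", cost=0.5, latency=2.0)]
expensive = [Step("gpt-5", cost=5.0, latency=10.0)]
preferred = {"web_search"}

cheap_reward = composite_reward(cheap, True, preferred)
expensive_reward = composite_reward(expensive, True, preferred)
```

Under these assumed weights, the cheaper trajectory that respects the user's tool preference scores higher than the expensive one even though both solve the task, which is the kind of pressure the efficiency and preference rewards are described as providing.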

About NVIDIA Corporation


Industry: Computer Hardware Manufacturing

Company Size: 36000

Location: Santa Clara, California, US

Website: nvidia.com


Related Models

Last updated: December 2, 2025

GPT 4.1 mini

Gemma Scope 2

GPT 5 Pro
