TAAFT
Free mode
100% free
Freemium
Free Trial
Deals

Ornith 1.0 397B

Model family: Qwen
Ornith-1.0-397B is the flagship model of the Ornith-1.0 family, built on Qwen 3.5 MoE and trained with a self-improving reinforcement learning framework. Unlike standard RL pipelines that rely on fixed human-designed harnesses, Ornith-1.0 learns to generate both solution rollouts and the task-specific scaffolds that guide them simultaneously, co-evolving scaffold and solution across training. The model achieves 82.4 on SWE-Bench Verified, 62.2 on SWE-Bench Pro, 78.9 on SWE-Bench Multilingual, and 77.5 on Terminal-Bench 2.1, matching or exceeding Claude Opus 4.7 on agentic coding benchmarks. It supports context windows up to 400K tokens, native tool calling via the tool_calls interface, and chain-of-thought reasoning through embedded think blocks. Deployable via vLLM, SGLang, or Hugging Face Transformers with OpenAI-compatible API endpoints.
New Multimodal Gen 3
Released: June 27, 2026

Overview

Ornith-1.0-397B is a 397B MoE open-source reasoning model for agentic coding, post-trained on Qwen 3.5 MoE via a self-improving RL framework that jointly learns task solutions and the scaffolds guiding them. Achieves 82.4 on SWE-Bench Verified and 77.5 on Terminal-Bench 2.1. MIT licensed with tool-calling and 256K context support.

About DeepReinforce

DeepReinforce is an AI research startup founded by Dr. Jiwei Li, focused on using reinforcement learning to build agentic AI systems for coding and system optimization. They developed GrandCode (ranked #1 in Codeforces live competitions, beating all human grandmasters), Ornith-1.0 (open-source LLMs for agentic coding, 9Bโ€“397B parameters), and IterX (agentic code optimizer surpassing NVIDIA's cuBLAS).

Industry: Artificial Intelligence
Location: US
View Company Profile
Last updated: June 30, 2026
0 AIs selected
Clear selection
#
Name
Task