TAAFT
Free mode
100% free
Freemium
Free Trial
Deals

Ornith 1.0 35B

Model family: Qwen
Ornith-1.0-35B is the mid-tier model in the Ornith-1.0 family, a 35B MoE architecture post-trained on Qwen 3.5. It is a reasoning model that produces explicit think traces before final answers and emits well-formed function calls via the tool_calls field. The core innovation is a self-improving RL training framework: rather than relying on fixed human-designed harnesses, the model co-learns solution rollouts and the task-specific scaffolds that guide them. Reward is propagated to both stages, enabling per-task orchestration strategies to emerge automatically. The context window is 262,144 tokens. On agentic coding benchmarks, it scores 75.6 on SWE-Bench Verified, 50.4 on SWE-Bench Pro, 69.3 on SWE-Bench Multilingual, and 64.2 on Terminal-Bench 2.1, outperforming Qwen 3.5-397B on Terminal-Bench 2.1 despite being 10x smaller. Compatible with vLLM, SGLang, and HuggingFace Transformers. MIT licensed.
New Coding Gen 2
Released: June 27, 2026

Overview

Ornith-1.0-35B is a 35B Mixture-of-Experts reasoning model for agentic coding, post-trained on Qwen 3.5 using a self-improving RL framework that jointly learns solution rollouts and the task-specific scaffolds guiding them. Supports native function calling and 262K context. Scores 75.6 on SWE-Bench Verified and 64.2 on Terminal-Bench 2.1. MIT licensed.

About DeepReinforce

DeepReinforce is an AI research startup founded by Dr. Jiwei Li, focused on using reinforcement learning to build agentic AI systems for coding and system optimization. They developed GrandCode (ranked #1 in Codeforces live competitions, beating all human grandmasters), Ornith-1.0 (open-source LLMs for agentic coding, 9Bโ€“397B parameters), and IterX (agentic code optimizer surpassing NVIDIA's cuBLAS).

Industry: Artificial Intelligence
Location: US
View Company Profile
Last updated: June 30, 2026
0 AIs selected
Clear selection
#
Name
Task