Qwen3-30B-A3B

By Alibaba
Text generation
Released: July 30, 2025

Overview

Qwen3-30B-A3B is Alibaba’s open-weight MoE LLM (Apache-2.0) with 30.5B total parameters and ~3.3B activated per token. It supports a toggleable “thinking” vs. non-thinking mode, strong agent/tool calling, 100+ languages, and a native 32K context (≈131K with YaRN). The newer 2507 variants raise native context to 256K (262,144 tokens).
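
A minimal sketch of the mode toggle using Hugging Face Transformers, following the pattern on the public Qwen3 model card; the prompt and generation settings here are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are below 100?"}]

# enable_thinking=True makes the model emit a <think>...</think> trace
# before the answer; set it to False for lower-latency, non-thinking replies.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```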

Description

Qwen3-30B-A3B is a mixture-of-experts model in Alibaba’s Qwen3 family, released with open weights under Apache-2.0. It has 30.5B total parameters with ~3.3B active per token (128 experts, 8 active), grouped-query attention (GQA), and a native 32,768-token context window extendable to ~131,072 tokens via YaRN.

A hallmark feature is the switch between “thinking” mode, which emits <think>…</think> traces for deeper reasoning, and non-thinking mode, letting you trade latency for reasoning depth; the switch can be hard (a chat-template flag) or soft (in-prompt directives). The model is optimized for multilingual use (100+ languages) and agentic tool calling, and Qwen provides the Qwen-Agent toolkit for function/tool use (see the sketch below).

For long-context tasks, the 2507 “Thinking” and “Instruct” releases raise native context to 262,144 tokens, with guides for reaching ~1M tokens using dual-chunk attention and sparse attention in supported runtimes (vLLM, SGLang); a YaRN configuration sketch also follows below. Deployment goes through Hugging Face Transformers, with local options including Ollama, LM Studio, MLX-LM, and llama.cpp.
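
For the agentic tool-calling path, the Qwen-Agent README documents roughly the pattern below; the endpoint URL is an assumed local OpenAI-compatible server (e.g. one started with vLLM or Ollama), not a fixed address:

```python
from qwen_agent.agents import Assistant

# Assumed: an OpenAI-compatible server hosting the model locally.
llm_cfg = {
    "model": "Qwen3-30B-A3B",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# "code_interpreter" is one of Qwen-Agent's built-in tools.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Plot y = x^2 for x in [0, 10]."}]
responses = []
for responses in bot.run(messages=messages):  # yields incremental responses
    pass
print(responses[-1])
```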
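
One way to apply the YaRN extension mentioned above is to set the rope_scaling entry that the Qwen3 model card documents for config.json; the snippet below does the same override through Transformers and is a sketch of one supported path (vLLM and SGLang accept equivalent flags):

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "Qwen/Qwen3-30B-A3B"

# YaRN scaling: factor 4.0 stretches the native 32,768-token window to ~131,072.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}
config.max_position_embeddings = 131072

model = AutoModelForCausalLM.from_pretrained(
    model_id, config=config, torch_dtype="auto", device_map="auto"
)
```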

About Alibaba

Chinese e-commerce and cloud leader behind Taobao, Tmall, and Alipay.

Website: alibaba.com

Last updated: September 17, 2025