| Model | Provider | Input | Cached input | Output | Unit |
|---|---|---|---|---|---|
| Deepseek V4 Pro | DeepSeek | $1.74 | $0.2 | $3.48 | per 1M tokens |
| Gemma 4 31B IT NVFP4 (this model) | NVIDIA | $0.28 | - | $0.86 | per 1M tokens |
| GLM 5 | Z.ai | $1 | - | $3.2 | per 1M tokens |
| GLM 5.1 | Z.ai | $1.4 | $0.26 | $4.4 | per 1M tokens |
| Kimi K2.6 | Moonshot AI | $1.2 | $0.2 | $4.5 | per 1M tokens |
| Kimi K2.7 Code | Moonshot AI | $0.95 | $0.19 | $4 | per 1M tokens |
| LFM2 24B A2B | Liquid AI | $0.03 | - | $0.12 | per 1M tokens |
| Llama 3.3 | Meta Platforms | $1.04 | - | $1.04 | per 1M tokens |
| MiniMax M2.5 | MiniMax | $0.3 | $0.06 | $1.2 | per 1M tokens |
| MiniMax M2.7 | MiniMax | $0.3 | $0.06 | $1.2 | per 1M tokens |
| MiniMax M3 | MiniMax | $0.3 | $0.06 | $1.2 | per 1M tokens |
|  |  | $0.6 | $0.2 | $3.6 | per 1M tokens |
| Qwen 3.5 9B | Alibaba | $0.17 | - | $0.25 | per 1M tokens |
| Qwen 3.6 Plus | Alibaba | $0.5 | - | $3 | per 1M tokens |
| Qwen 3.7 Max | Alibaba | $1.25 | $0.13 | $3.75 | per 1M tokens |
| Qwen3 235B A22B | Alibaba | $0.2 | - | $0.6 | per 1M tokens |
| Qwen3.5 397B A17B | Alibaba | $0.6 | $0.35 | $3.6 | per 1M tokens |
| Qwen3.7-Plus | Alibaba | $0.32 | - | $1.28 | per 1M tokens |
Gemma 4 31B IT NVFP4
Overview
Gemma-4-31B-IT-NVFP4 is NVIDIA’s inference-optimized, NVFP4-quantized version of Gemma 4 31B IT. It is a commercial-ready multimodal model that accepts text, image, and video input and produces text output. It is built for reasoning, coding, chat, and agentic workflows, and it preserves the original model’s 256K context window.
Pricing
Compare the pricing of Gemma 4 31B IT NVFP4 with that of other listed models across the same pricing tiers and context lengths.
Standard
Batch
| Model | Provider | Input | Cached input | Output | Unit |
|---|---|---|---|---|---|
| Deepseek V4 Pro | DeepSeek | $0.87 | $0.2 | $1.74 | per 1M tokens |
| Gemma 4 31B IT NVFP4 (this model) | NVIDIA | $0.28 | - | $0.86 | per 1M tokens |
| GLM 5 | Z.ai | $1 | - | $3.2 | per 1M tokens |
| GLM 5.1 | Z.ai | $0.7 | $0.26 | $2.2 | per 1M tokens |
| Kimi K2.6 | Moonshot AI | $1.2 | $0.2 | $4.5 | per 1M tokens |
| Kimi K2.7 Code | Moonshot AI | $0.95 | $0.19 | $4 | per 1M tokens |
| LFM2 24B A2B | Liquid AI | $0.01 | - | $0.06 | per 1M tokens |
| Llama 3.3 | Meta Platforms | $0.52 | - | $0.52 | per 1M tokens |
| MiniMax M2.5 | MiniMax | $0.3 | $0.06 | $1.2 | per 1M tokens |
| MiniMax M2.7 | MiniMax | $0.15 | $0.06 | $0.6 | per 1M tokens |
| MiniMax M3 | MiniMax | $0.3 | $0.06 | $1.2 | per 1M tokens |
|  |  | $0.6 | $0.2 | $3.6 | per 1M tokens |
| Qwen 3.5 35B A3B | Alibaba | $0.6 | $0.35 | $3.6 | per 1M tokens |
| Qwen 3.5 9B | Alibaba | $0.17 | - | $0.25 | per 1M tokens |
| Qwen 3.6 Plus | Alibaba | $0.5 | - | $3 | per 1M tokens |
| Qwen3 235B A22B | Alibaba | $0.1 | - | $0.3 | per 1M tokens |
| Qwen3.7-Plus | Alibaba | $1.25 | $0.13 | $3.75 | per 1M tokens |
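
Because every price is quoted per 1M tokens, the cost of a single request follows from simple arithmetic. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any vendor SDK, and it assumes that cached tokens are a subset of the input tokens and that a model with no cached-input price ("-" in the table) bills them at the normal input rate. Actual billing rules may differ.

```python
from decimal import Decimal

# Prices in the table are quoted per 1M tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price, output_price,
                  cached_tokens=0, cached_price=None):
    """Estimate the cost of one request in USD (hypothetical helper).

    Prices are passed as strings (e.g. "0.28") so Decimal keeps them exact.
    Assumption: cached_tokens is the part of input_tokens billed at the
    cached-input rate; if cached_price is None, that part is billed at the
    normal input rate.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    in_rate = Decimal(input_price)
    out_rate = Decimal(output_price)
    cache_rate = in_rate if cached_price is None else Decimal(cached_price)

    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * in_rate
             + cached_tokens * cache_rate
             + output_tokens * out_rate)
    return total / TOKENS_PER_UNIT


# Gemma 4 31B IT NVFP4 as listed: $0.28 input, $0.86 output, no cached rate.
cost = estimate_cost(200_000, 50_000, "0.28", "0.86")
print(f"Estimated cost: ${cost:.4f}")
```

For 200,000 input tokens and 50,000 output tokens at the listed Gemma 4 31B IT NVFP4 rates, this works out to 0.2 × $0.28 + 0.05 × $0.86, or about $0.099.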