Gemma 4 31B IT NVFP4

By NVIDIA
Gemma-4-31B-IT-NVFP4 is an NVIDIA-optimized quantized release of Google DeepMind’s Gemma 4 31B IT, prepared for inference with NVIDIA Model Optimizer and vLLM. The model card says it handles text and image inputs, can process video as frame sequences, and generates text output for chat, summarization, reasoning, coding, multimodal understanding, and function calling. It uses Gemma 4’s hybrid attention design for long-context performance, keeps a 256K token context window, supports more than 140 languages, and reports only small benchmark drops versus the baseline after NVFP4 quantization.
New Multimodal Gen 3
Released: April 6, 2026

Overview

Gemma-4-31B-IT-NVFP4 is NVIDIA’s inference-optimized NVFP4 quantized version of Gemma 4 31B IT. It is a commercial-ready multimodal model for text, image, and video understanding with text output, built for reasoning, coding, chat, and agentic workflows while preserving the original model’s long 256K context window.

Pricing

Compare Gemma 4 31B IT NVFP4 with other models listed in the same vendor pricing tiers and context lengths.

Tier: Standard

Model                              Vendor          Input   Cached input  Output  Unit
Deepseek V4 Pro                    DeepSeek        $1.74   $0.2          $3.48   per 1M tokens
Gemma 4 31B IT NVFP4 (this model)  NVIDIA          $0.28   -             $0.86   per 1M tokens
GLM 5                              Z.ai            $1      -             $3.2    per 1M tokens
GLM 5.1                            Z.ai            $1.4    $0.26         $4.4    per 1M tokens
Kimi K2.6                          Moonshot AI     $1.2    $0.2          $4.5    per 1M tokens
Kimi K2.7 Code                     Moonshot AI     $0.95   $0.19         $4      per 1M tokens
LFM2 24B A2B                       Liquid AI       $0.03   -             $0.12   per 1M tokens
Llama 3.3                          Meta Platforms  $1.04   -             $1.04   per 1M tokens
MiniMax M2.5                       MiniMax         $0.3    $0.06         $1.2    per 1M tokens
MiniMax M2.7                       MiniMax         $0.3    $0.06         $1.2    per 1M tokens
MiniMax M3                         MiniMax         $0.3    $0.06         $1.2    per 1M tokens
(name missing in source)           -               $0.6    $0.2          $3.6    per 1M tokens
Qwen 3.5 9B                        Alibaba         $0.17   -             $0.25   per 1M tokens
Qwen 3.6 Plus                      Alibaba         $0.5    -             $3      per 1M tokens
Qwen 3.7 Max                       Alibaba         $1.25   $0.13         $3.75   per 1M tokens
Qwen3 235B A22B                    Alibaba         $0.2    -             $0.6    per 1M tokens
(name missing in source)           -               $0.6    $0.35         $3.6    per 1M tokens
Qwen3.7-Plus                       Alibaba         $0.32   -             $1.28   per 1M tokens
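
To make the per-1M-token prices concrete, here is a minimal sketch of how a single request's cost could be estimated from the Input and Output columns. The function name `request_cost`, the `PRICES` dictionary, and the example workload (200,000 input tokens, 4,000 output tokens) are illustrative assumptions, not part of the listing. Cached-input pricing is ignored because no cached price is listed for this model.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the Standard tier listing.
# Only two rows are included here for illustration.
PRICES = {
    "Gemma 4 31B IT NVFP4": {"input": Decimal("0.28"), "output": Decimal("0.86")},
    "Deepseek V4 Pro": {"input": Decimal("1.74"), "output": Decimal("3.48")},
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # the listing's unit is "per 1M tokens"


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, ignoring cached-input discounts."""
    price = PRICES[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Hypothetical workload: a long prompt with a short answer.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 4_000)}")
```

`Decimal` is used instead of `float` so that small currency amounts are not distorted by binary rounding. Actual billing may differ from this estimate, for example if a provider applies cached-input rates or minimum charges.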

About NVIDIA

Industry: Computer Hardware Manufacturing
Company Size: 42000
Location: Santa Clara, California, US
Website: nvidia.com

Last updated: April 6, 2026