Apertus v1.1 4B Instruct

Apertus-v1.1-4B-Instruct is a 4B-parameter instruction-tuned language model produced by pre-training distillation (PD) from the Apertus-8B-2509 teacher, with a training loss that mixes KL divergence and label cross-entropy at 90% and 10% respectively. The architecture is a dense transformer decoder with 24 layers, a model dimension of 3072, an MLP dimension of 16384, 24 query heads sharing 8 key/value heads, and xIELU activations. It was trained on 1.7T tokens from Phase 5 of the Apertus data pipeline, covering filtered documents, code, and instruction samples. Post-training included supervised fine-tuning and alignment. The model natively supports 1811 languages and is optimized for memory-constrained environments. Quantized checkpoints are available in FP8, NVFP4A16, and INT3-6 formats for vLLM and Apple MLX inference. Weights, training data, and training recipes are fully open under the Apache-2.0 license. Training respects data-owner opt-out consent and avoids memorization of training data.
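
As a rough illustration of the 90%/10% loss mix described above, the per-token objective might be sketched as follows. This is not the Apertus training code: the function names, the forward KL direction (teacher relative to student), a softmax temperature of 1, and the per-token formulation are all assumptions made for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def distillation_loss(student_logits, teacher_logits, label, kl_weight=0.9):
    """Weighted sum of KL(teacher || student) and label cross-entropy.

    kl_weight=0.9 mirrors the 90%/10% mix in the description; the KL
    direction and the absence of a temperature are assumptions.
    """
    s_logp = log_softmax(student_logits)
    t_logp = log_softmax(teacher_logits)
    # Forward KL: expectation under the teacher of log(teacher / student).
    kl = sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))
    # Cross-entropy of the student against the ground-truth token id.
    ce = -s_logp[label]
    return kl_weight * kl + (1.0 - kl_weight) * ce


if __name__ == "__main__":
    # Toy vocabulary of three tokens; the correct next token is index 2.
    print(distillation_loss([0.5, 1.0, 2.0], [0.2, 0.9, 2.5], label=2))
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the 10% cross-entropy term remains, which is one way to see that the teacher's distribution dominates the training signal under this weighting.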
Released: June 29, 2026

Overview

A 4B-parameter instruction-tuned language model created via pre-training distillation from the Apertus-8B teacher model. It uses a dense transformer decoder with grouped-query attention and xIELU activations and was trained on 1.7T tokens. It natively supports 1811 languages, with quantized checkpoints available for mobile and edge deployment. Apache-2.0 licensed.
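
To give a feel for why grouped-query attention matters on memory-constrained devices, the sketch below estimates the key/value cache size from the listed figures (24 layers, 24 query heads, 8 key/value heads, model dimension 3072). The head dimension of 128 (3072 divided by 24) and a 16-bit cache are assumptions, not published values, and `kv_cache_bytes` is a hypothetical helper written for this example.

```python
def kv_cache_bytes(seq_len, n_layers=24, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2):
    """Estimate KV cache size in bytes for one sequence.

    head_dim=128 assumes model_dim / query_heads = 3072 / 24;
    bytes_per_value=2 assumes a 16-bit cache. Both are assumptions.
    """
    # Factor of 2: one key vector and one value vector per head per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * seq_len


if __name__ == "__main__":
    per_token = kv_cache_bytes(1)
    print(per_token // 1024, "KiB per token")  # 96 KiB per token
    print(kv_cache_bytes(4096) // (1024 * 1024), "MiB at 4096 tokens")  # 384 MiB
    # Hypothetical comparison: 24 KV heads, i.e. no key/value sharing.
    print(kv_cache_bytes(1, n_kv_heads=24) // per_token, "x larger without sharing")  # 3 x
```

Under these assumptions, sharing 8 key/value heads among 24 query heads cuts the cache to one third of what full multi-head attention would need.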

About Swiss AI Initiative

The Swiss AI Initiative, started in December 2023, is the world's largest open-science and open-source effort for AI foundation models. Seeded with over 10M GPU hours on the Alps supercomputer and a 20M CHF grant from the ETH Domain, it is the first initiative of the Swiss National AI Institute (a partnership between the ETH AI Center and the EPFL AI Center) and draws on 800+ researchers, including 70 AI-focused professors, from 10+ Swiss academic institutions.

Industry: Research
Location: Zürich, CH
Last updated: July 1, 2026