Apertus v1.1 4B

Apertus-v1.1-4B is a 4B-parameter open-source base language model produced through pre-training distillation (PD) from the Apertus-8B-2509 teacher. The training loss mixes 90% KL divergence against the teacher with 10% label cross-entropy, which reduces the compute needed for pre-training. Architecture: dense transformer decoder with 24 layers, a model dimension of 3072, an MLP dimension of 16384, 24 query heads and 8 key/value heads, xIELU activations, and untied embeddings. It was trained on 1.7T tokens (Phase 5 of the Apertus data pipeline: filtered documents, code, and instruction samples) using 2.0 × 10^22 FLOPs on 64 GH200 GPUs. It natively supports 1,811 languages. The release is fully open: open weights, open data, and the full training recipe. Training respects data owners' opt-out consent and is designed to avoid memorization of training data. This is a base model with no SFT or alignment, intended for downstream fine-tuning. Quantized variants and an Instruct version are available. License: Apache 2.0.
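The 90%/10% loss mix described above can be illustrated for a single token position. This is a minimal sketch, not the Apertus training code: the function names are hypothetical, the KL term is assumed to be the forward divergence KL(teacher || student), and no softmax temperature is applied, since the listing does not specify either.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def distillation_loss(student_logits, teacher_logits, label, kl_weight=0.9):
    """Weighted sum of a KL term and label cross-entropy for one token.

    Assumptions (not stated in the listing): forward KL(teacher || student),
    no temperature, and weights that sum to 1 (0.9 KL + 0.1 cross-entropy).
    """
    s_logp = log_softmax(student_logits)
    t_logp = log_softmax(teacher_logits)
    # KL(teacher || student) = sum_i p_t(i) * (log p_t(i) - log p_s(i))
    kl = sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))
    # Cross-entropy of the student against the ground-truth next token
    ce = -s_logp[label]
    return kl_weight * kl + (1.0 - kl_weight) * ce
```

When the student matches the teacher exactly, the KL term vanishes and only the 10% cross-entropy share remains, which is a quick sanity check on any implementation of this kind of mix.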
Released: June 29, 2026

Overview

Apertus-v1.1-4B is a 4B-parameter open-source base language model created via pre-training distillation from the Apertus-8B teacher model. It uses a dense transformer with grouped-query attention and xIELU activations, natively supports 1,811 languages, and was trained on 1.7T tokens. It is designed for constrained hardware and as a starting point for further fine-tuning.
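To see how grouped-query attention and the quoted dimensions relate to the 4B figure, the weight matrices can be counted directly. The sketch below is a rough estimate under stated assumptions, not an official breakdown: the head dimension is taken to be the model dimension divided by the number of query heads, the MLP is assumed to be a plain two-matrix block without gating, norms and biases are ignored, and the vocabulary size is left as a parameter because the listing does not state it. The function name is hypothetical.

```python
def estimate_parameters(
    vocab_size,
    n_layers=24,
    d_model=3072,
    d_mlp=16384,
    n_q_heads=24,
    n_kv_heads=8,
):
    """Rough weight count for a dense decoder with grouped-query attention.

    Assumptions: head_dim = d_model // n_q_heads, a two-matrix MLP
    (up and down projections only), untied input/output embeddings,
    and no norm or bias parameters.
    """
    head_dim = d_model // n_q_heads
    q_width = n_q_heads * head_dim    # query projection width
    kv_width = n_kv_heads * head_dim  # shared key/value width (smaller under GQA)

    attention = (
        d_model * q_width         # Q projection
        + 2 * d_model * kv_width  # K and V projections
        + q_width * d_model       # output projection
    )
    mlp = 2 * d_model * d_mlp     # up and down projections
    per_layer = attention + mlp

    embeddings = 2 * vocab_size * d_model  # untied: input table plus output head
    return {
        "attention_per_layer": attention,
        "mlp_per_layer": mlp,
        "all_layers": n_layers * per_layer,
        "embeddings": embeddings,
        "total": n_layers * per_layer + embeddings,
    }
```

Under these assumptions the 24 transformer layers account for roughly 3.0B weights, so the untied embedding tables make up most of the remainder; the exact total depends on the tokenizer's vocabulary size.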

About Swiss AI Initiative

The Swiss AI Initiative, started in December 2023, describes itself as the world's largest open science and open source effort for AI foundation models. It was seeded with over 10M GPU hours on the Alps supercomputer and a 20M CHF grant from the ETH Domain. It is the first initiative of the Swiss National AI Institute, a partnership between the ETH AI Center and the EPFL AI Center, and draws on more than 800 researchers, including 70 AI-focused professors, from over 10 Swiss academic institutions.

Industry: Research
Location: Zürich, CH
Last updated: July 1, 2026