Apertus v1.1 4B

Apertus-v1.1-4B is a 4B-parameter open-source base language model produced through pre-training distillation (PD) from the Apertus-8B-2509 teacher, using a 90%/10% mix of KL-divergence and label cross-entropy losses, which significantly reduces compute requirements. Architecture: dense transformer decoder with 24 layers, a model dimension of 3072, an MLP dimension of 16384, 24 query heads and 8 key/value heads, xIELU activations, and untied embeddings. Trained on 1.7T tokens (Phase 5 of the Apertus data pipeline: filtered documents, code, and instruction samples) using 2.0E22 FLOPs on 64 GH200 GPUs. Natively supports 1,811 languages. Fully open: open weights, open data, and the full training recipe. Respects data-owner opt-out consent and avoids memorizing training data. This is a base model with no SFT or alignment, intended for downstream fine-tuning. Quantized variants and an Instruct version are available. License: Apache 2.0.
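To make the 90%/10% loss mix concrete, here is a minimal per-token sketch in plain Python. It is an illustration under stated assumptions, not the Apertus training code: the KL direction (teacher to student), the temperature of 1, and the names `log_softmax` and `distillation_loss` are all assumptions introduced here.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def distillation_loss(student_logits, teacher_logits, label,
                      kl_weight=0.9, ce_weight=0.1):
    """Weighted sum of KL(teacher || student) and label cross-entropy.

    Assumptions (not stated in the model card): forward KL with the
    teacher as the reference distribution, and no temperature scaling.
    """
    s_logp = log_softmax(student_logits)
    t_logp = log_softmax(teacher_logits)
    # KL(teacher || student) = sum_i p_t(i) * (log p_t(i) - log p_s(i))
    kl = sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t_logp, s_logp))
    # Cross-entropy against the ground-truth next token
    ce = -s_logp[label]
    return kl_weight * kl + ce_weight * ce


# Toy example over a 4-token vocabulary
student = [2.0, 0.5, -1.0, 0.1]
teacher = [2.5, 0.2, -0.8, 0.0]
loss = distillation_loss(student, teacher, label=0)
```

When the student matches the teacher exactly, the KL term vanishes and only the 10% cross-entropy term remains. In a real training loop the loss would be averaged over all token positions in a batch.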

Overview

Apertus-v1.1-4B is a 4B-parameter open-source base language model created via pre-training distillation from the Apertus-8B teacher model. It uses a dense transformer with grouped-query attention and xIELU activations, natively supports 1,811 languages, and is trained on 1.7T high-quality tokens. It is designed for constrained hardware and further fine-tuning.
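As a rough sanity check on the "4B" figure, the listed dimensions can be turned into an approximate weight count. This is a back-of-the-envelope sketch, not an official breakdown. It makes several assumptions that the listing does not state: a head dimension of `d_model / n_q_heads`, a non-gated MLP with two weight matrices, no biases, and a placeholder vocabulary size of 131,072. Normalization weights and activation parameters are ignored.

```python
def count_parameters(n_layers=24, d_model=3072, d_mlp=16384,
                     n_q_heads=24, n_kv_heads=8,
                     vocab_size=131_072,      # assumption: placeholder value
                     tied_embeddings=False):
    """Approximate weight count for a dense decoder with grouped-query attention."""
    head_dim = d_model // n_q_heads                          # assumed: 128
    q_and_o = 2 * d_model * (n_q_heads * head_dim)           # query + output projections
    k_and_v = 2 * d_model * (n_kv_heads * head_dim)          # smaller under GQA
    attention = q_and_o + k_and_v
    mlp = 2 * d_model * d_mlp                                # assumed: up + down, no gate
    per_layer = attention + mlp
    # Untied embeddings: separate input embedding and output projection
    embeddings = vocab_size * d_model * (1 if tied_embeddings else 2)
    blocks = n_layers * per_layer
    return {
        "per_layer": per_layer,
        "blocks": blocks,
        "embeddings": embeddings,
        "total": blocks + embeddings,
    }


counts = count_parameters()
print(f"{counts['total'] / 1e9:.2f}B parameters")  # prints "3.83B parameters"
```

Under these assumptions the transformer blocks account for about 3.0B weights and the untied embeddings for about 0.8B. A different vocabulary size or a gated MLP would shift the total.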

📚Large Language Models 🔍Research 🌐Multilingual communication 🧠Model training

About Swiss AI Initiative

The Swiss AI Initiative is the world's largest open science/open source effort for AI foundation models, started in December 2023. Seeded with over 10M GPU hours on the Alps supercomputer and a 20M CHF grant from the ETH Domain, it is the first initiative of the Swiss National AI Institute (a partnership between the ETH AI Center and the EPFL AI Center) and draws on 800+ researchers, including 70 AI-focused professors, from 10+ Swiss academic institutions.

Industry: Research

Location: Zürich, CH

Website: swiss-ai.org

Last updated: July 1, 2026
