TAAFT
Free mode
100% free
Freemium
Free Trial
Create tool

Llama 3.2 Lightweight

By Meta
New Text Gen
Released: September 25, 2024

Overview

Llama 3.2 Lightweight is Meta’s small-footprint Llama 3.2 line tuned for low-latency, low-cost deployment on a single GPU or edge device. It follows instructions well, writes code, supports tool/function calling and reliable JSON output, and ships as open weights under the Llama license.

Description

Llama 3.2 Lightweight brings the 3.2 generation’s reasoning and coding ability to compact models that run fast with modest memory. It’s designed for on-device and single-GPU serving, so you can embed assistants in apps, browsers, and edge boxes without a heavy backend. The models are instruction-tuned for clean, controllable behavior, produce structured responses when you need JSON, and integrate smoothly with function calling for agent workflows and RAG. Quantization keeps footprints small while preserving quality, and lightweight adapters such as LoRA make domain fine-tuning straightforward. Compared with the larger 3.2 models, Lightweight trades some raw capability for responsiveness and cost efficiency; paired with retrieval, it delivers crisp chat, summarization, and coding help with the immediacy required for real-time products.

About Meta

We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state-of-the-art through open research and accessible tooling.

Location: California, US
Website: ai.meta.com
View Company Profile

Related Models

Last updated: September 22, 2025