TAAFT
Free mode
100% free
Freemium
Free Trial
Deals

Llama 3.2 Lightweight

Model family: LLaMA
Llama 3.2 Lightweight brings the 3.2 generation’s reasoning and coding ability to compact models that run fast with modest memory. It’s designed for on-device and single-GPU serving, so you can embed assistants in apps, browsers, and edge boxes without a heavy backend. The models are instruction-tuned for clean, controllable behavior, produce structured responses when you need JSON, and integrate smoothly with function calling for agent workflows and RAG. Quantization keeps footprints small while preserving quality, and lightweight adapters such as LoRA make domain fine-tuning straightforward. Compared with the larger 3.2 models, Lightweight trades some raw capability for responsiveness and cost efficiency; paired with retrieval, it delivers crisp chat, summarization, and coding help with the immediacy required for real-time products.
New Text Gen 7
Released: September 25, 2024

Overview

Llama 3.2 Lightweight is Meta’s small-footprint Llama 3.2 line tuned for low-latency, low-cost deployment on a single GPU or edge device. It follows instructions well, writes code, supports tool/function calling and reliable JSON output, and ships as open weights under the Llama license.

About Meta Platforms

We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state-of-the-art through open research and accessible tooling.

Industry: Technology, Information and Media
Company Size: 78.000-79.000
Location: Menlo Park, California, US
Website: ai.meta.com
View Company Profile

Tools using Llama 3.2 Lightweight

No tools found for this model yet.

Last updated: February 25, 2026
0 AIs selected
Clear selection
#
Name
Task