
pplx-embed-context-v1-0.6b

pplx-embed-context-v1-0.6b is a contextual text embedding model built on a Qwen3 backbone that underwent diffusion continued pre-training. It is designed for RAG systems in which the surrounding document context should influence each chunk's representation: it accepts lists of document chunks and uses a late chunking strategy that lets bidirectional context flow across chunk boundaries. The model has a 32K-token context window and produces 1024-dimensional, unnormalized, int8-quantized embeddings; binary quantization is also supported. Matryoshka Representation Learning (MRL) allows embeddings to be truncated to smaller dimensions. No instruction prefixes are required. It achieves state-of-the-art results on the ConTEB contextual embedding benchmark and is optimized for large-scale web retrieval tasks. It is compatible with the Transformers and ONNX runtimes and is MIT licensed.
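The late chunking idea described above can be illustrated with a minimal sketch. It assumes the encoder has already produced one vector per token for the whole document and that each chunk is pooled by a simple mean over its token span; the pooling rule, the function name `late_chunk_pool`, and the random stand-in data are illustrative assumptions, not the model's documented implementation.

```python
import numpy as np

def late_chunk_pool(token_embeddings, chunk_spans):
    """Mean-pool token vectors inside each [start, end) chunk span.

    The token vectors are assumed to come from a single encoder pass over
    the whole document, so each one already reflects context from
    neighbouring chunks before pooling happens.
    """
    pooled = []
    for start, end in chunk_spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid chunk span: ({start}, {end})")
        pooled.append(token_embeddings[start:end].mean(axis=0))
    return np.stack(pooled)

# Stand-in for encoder output: 12 tokens, 8 dimensions (hypothetical sizes).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(12, 8))

# Three consecutive chunks covering the document's tokens.
chunk_spans = [(0, 5), (5, 9), (9, 12)]
chunk_vectors = late_chunk_pool(token_embeddings, chunk_spans)
print(chunk_vectors.shape)
```

The point of the design is ordering: encoding happens before splitting, so a chunk's vector can carry information from text outside its own boundaries.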
Released: February 11, 2026

Overview

A 0.6B-parameter contextual text embedding model for RAG pipelines. Unlike standard embedding models, which embed each chunk in isolation, it takes all of a document's chunks together so that each chunk's embedding reflects the surrounding context. It produces 1024-dimensional int8-quantized vectors and has a 32K-token context window. It supports binary quantization and MRL, and is multilingual and instruction-free. Available via the Perplexity API and as open weights.
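As a rough sketch of how int8 vectors, MRL truncation, and binary quantization might be handled downstream, the NumPy example below works on random stand-in data rather than real model output. The helper names, the 256-dimension truncation target, and the sign-based binarization rule are assumptions for illustration; the model's actual binary format may differ.

```python
import numpy as np

def truncate_mrl(embeddings, dim):
    """Keep only the first `dim` dimensions, as MRL-trained vectors allow."""
    return embeddings[:, :dim]

def cosine_similarity(a, b):
    """Cosine similarity; needed because the vectors are unnormalized."""
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def binarize(embeddings):
    """Assumed rule: one bit per dimension (1 if value > 0), packed into bytes."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)

def hamming_distance(a, b):
    """Number of differing bits between two packed binary vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

# Random stand-in for three 1024-dimensional int8 embeddings.
rng = np.random.default_rng(0)
embeddings = rng.integers(-127, 128, size=(3, 1024)).astype(np.int8)

small = truncate_mrl(embeddings, 256)   # shape (3, 256)
packed = binarize(embeddings)           # shape (3, 128): 1024 bits = 128 bytes

print(small.shape, packed.shape)
print(cosine_similarity(embeddings[0], embeddings[1]))
print(hamming_distance(packed[0], packed[1]))
```

Truncation and binarization both trade retrieval accuracy for storage: 1024 int8 values take 1024 bytes per vector, while the packed binary form takes 128 bytes.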


Embeddings

About Perplexity

Perplexity is a technology company that specializes in artificial intelligence and machine learning solutions.

Industry: Artificial Intelligence
Company Size: 247
Location: San Francisco, California, US
Last updated: June 25, 2026