TAAFT

pplx embed v1 4b

pplx-embed-v1-4b is a 4-billion-parameter dense text embedding model built on a Qwen3 backbone that was continued-pretrained with diffusion. It produces 2560-dimensional embeddings that are natively quantized to INT8 or binary precision, which keeps storage and comparison costs low for large-scale retrieval. The embeddings are unnormalized, so they should be compared with cosine similarity rather than a raw dot product. The model uses mean pooling, supports a 32K-token context window, and supports Matryoshka Representation Learning (MRL), so output vectors can be reduced to fewer dimensions. It is instruction-aware, and for most use cases text can be embedded directly without instruction-prefix tuning. It supports multiple languages, is deployable via the Perplexity API, SentenceTransformers, ONNX Runtime, and Text Embeddings Inference (v1.9.2+), and is released under the MIT license.
Released: February 11, 2026
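
Because the description says the embeddings are unnormalized and support MRL dimensionality reduction, a small sketch of both steps may help. It uses only the Python standard library and random stand-in vectors, not real model output. The helper names `cosine_similarity` and `truncate`, and the 256-dimension cut, are illustrative assumptions; truncating to the leading dimensions is the usual MRL convention, and this listing does not state which sizes the model supports.

```python
import math
import random

FULL_DIM = 2560  # native embedding size stated in the description


def cosine_similarity(a, b):
    """Cosine similarity; safe for unnormalized vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for a zero vector (assumption)
    return dot / (norm_a * norm_b)


def truncate(embedding, dim):
    """MRL-style reduction: keep only the first `dim` components."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    return embedding[:dim]


# Stand-in vectors (NOT real model output): a document, a noisy copy
# acting as a related query, and an independent unrelated vector.
rng = random.Random(0)
doc = [rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)]
query = [d + rng.gauss(0.0, 0.5) for d in doc]
unrelated = [rng.gauss(0.0, 1.0) for _ in range(FULL_DIM)]

full_score = cosine_similarity(query, doc)
short_score = cosine_similarity(truncate(query, 256), truncate(doc, 256))
```

Cosine similarity divides out vector length, which is why it suits unnormalized output: scaling either vector leaves the score unchanged. With random stand-ins the truncated score only demonstrates the mechanics; how much retrieval quality a real MRL model retains at a given size has to be measured on real embeddings.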

Overview

A 4B-parameter text embedding model built on diffusion-pretrained Qwen3. Produces 2560-dimensional dense embeddings with INT8 and binary quantization, 32K context window, and Matryoshka Representation Learning (MRL) support. Designed for semantic search and web-scale retrieval. Multilingual. Available via API, SentenceTransformers, ONNX, and Text Embeddings Inference.
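
To make the INT8 and binary options concrete, the sketch below works out per-vector storage at 2560 dimensions and shows one common way binary embeddings are compared, by Hamming distance over packed bits. The model is described as emitting quantized embeddings itself, so the sign-threshold `binarize` helper is only an illustrative assumption for producing test inputs, and all function names here are hypothetical.

```python
DIMS = 2560  # embedding size stated in the overview


def storage_bytes(dims, bits_per_dim):
    """Bytes needed to store one vector at the given precision."""
    return dims * bits_per_dim // 8


def binarize(embedding):
    """Pack one bit per component into an int (bit i set when value > 0).

    Sign thresholding is an assumption for illustration, not the
    model's documented quantization scheme.
    """
    bits = 0
    for i, value in enumerate(embedding):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a, b):
    """Number of bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


float32_size = storage_bytes(DIMS, 32)  # 10240 bytes, for comparison
int8_size = storage_bytes(DIMS, 8)      # 2560 bytes
binary_size = storage_bytes(DIMS, 1)    # 320 bytes

a = binarize([0.5, -1.0, 2.0])   # bits 0 and 2 set
b = binarize([0.5, 1.0, -2.0])   # bits 0 and 1 set
distance = hamming_distance(a, b)  # differs at positions 1 and 2
```

At 2560 dimensions, INT8 takes a quarter of the space of float32 and binary takes 1/32, which is the main reason these formats matter for web-scale indexes.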

About Perplexity

Perplexity is a technology company that specializes in artificial intelligence and machine learning solutions.

Industry: Artificial Intelligence
Company Size: 247
Location: San Francisco, California, US
Last updated: June 25, 2026