Phi Silica

New Text Gen
Released: May 20, 2024

Overview

Phi Silica is Microsoft’s on-device small language model (~3.3B parameters), built to run locally on the NPUs of Copilot+ PCs. It is pre-tuned and 4-bit quantized, with a reported time to first token of about 230 ms and throughput of up to ~20 tokens/sec. It currently offers a 2K-token context window (with 4K coming) and is available to developers through the Windows App SDK’s Phi Silica APIs.

Description

Microsoft’s Phi Silica is a sister line to the Phi family, designed specifically for on-device use on Copilot+ PCs (Snapdragon X-series NPUs). It ships as a pre-tuned SLM that apps call locally through new Windows App SDK APIs, enabling chat, math, code help, and reasoning without cloud calls. Developers obtain access through a Limited Access Feature flow.
Under the hood, Phi Silica targets NPU efficiency: 4-bit weight quantization, low idle memory, and NPU-based context processing. Microsoft reports ~230 ms time-to-first-token for short prompts, throughput up to ~20 tokens/sec, a 2K context window (with 4K “coming shortly”), and significantly reduced power draw on the NPU versus CPU.
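To make the latency figures above concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple model that is not stated by Microsoft: the first token arrives after the reported ~230 ms, and every later token streams at a constant ~20 tokens/sec (the reported upper bound). The function name and defaults are hypothetical, chosen only for illustration, and real timings will vary with prompt length and device.

```python
def estimated_response_seconds(output_tokens: int,
                               first_token_s: float = 0.230,
                               tokens_per_s: float = 20.0) -> float:
    """Rough time to stream a reply of `output_tokens` tokens.

    Assumption (ours, not Microsoft's): first-token latency plus
    constant-rate decoding for the remaining tokens.
    """
    if output_tokens <= 0:
        return 0.0
    # The first token costs first_token_s; the other (n - 1) tokens
    # are assumed to arrive at a steady tokens_per_s.
    return first_token_s + (output_tokens - 1) / tokens_per_s


if __name__ == "__main__":
    for n in (1, 50, 200):
        print(f"{n:>4} tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under these assumptions a 50-token reply takes roughly 2.7 s and a 200-token reply roughly 10 s, so the best case for long outputs is bounded by the decoding rate rather than by the first-token latency.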
At about 3.3B parameters, the model sits in the “small but capable” range and is optimized for Windows distribution and real-time interactivity on device. Media coverage and Microsoft materials position it as the SLM foundation for Copilot+ features on PCs.
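As a rough illustration of why 4-bit quantization matters at this scale, the sketch below estimates the storage needed for the weights alone at a few precisions. This is simple arithmetic (parameters × bits ÷ 8), not a figure published by Microsoft. It ignores quantization metadata such as scales, any layers kept at higher precision, and runtime memory such as the KV cache, so actual footprints will differ. The helper name is hypothetical.

```python
def weight_bytes(params: float, bits_per_weight: int) -> float:
    """Bytes needed to store `params` weights at the given bit width.

    Counts raw weights only; quantization scales, higher-precision
    layers, and runtime buffers are deliberately left out.
    """
    return params * bits_per_weight / 8


PARAMS = 3.3e9  # approximate parameter count cited for Phi Silica

if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_bytes(PARAMS, bits) / 1e9  # decimal gigabytes
        print(f"{bits:>2}-bit weights: ~{gb:.2f} GB")
```

By this estimate the raw weights drop from about 6.6 GB at 16-bit to about 1.65 GB at 4-bit, which is the kind of reduction that makes a resident on-device model practical.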
As of April 2025, Microsoft has demonstrated vision-based multimodal extensions for Phi Silica (image+text), broadening local use cases like document and UI understanding entirely on device.

About Microsoft

Location: Redmond, WA, US

Last updated: September 22, 2025