Overview
Phi Silica is Microsoft’s on-device small language model (SLM, ~3.3B parameters), built to run locally on the NPUs of Copilot+ PCs. It is pre-tuned and 4-bit quantized, with a reported time to first token of about 230 ms and throughput of up to ~20 tokens/sec. It currently offers a 2K context window (4K is planned) and is available to developers through the Windows App SDK’s Phi Silica APIs.
Description
Microsoft’s Phi Silica is a sister line to the Phi family, designed specifically for on-device use on Copilot+ PCs (Snapdragon X-series NPUs). It ships as a pre-tuned SLM that apps call locally through new Windows App SDK APIs, enabling chat, math, code help, and reasoning without cloud calls. Developers obtain access through a Limited Access Feature flow.
Under the hood, Phi Silica targets NPU efficiency: 4-bit weight quantization, low idle memory, and NPU-based context processing. Microsoft reports ~230 ms time-to-first-token for short prompts, throughput up to ~20 tokens/sec, a 2K context window (with 4K “coming shortly”), and significantly reduced power draw on the NPU versus CPU.
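The reported latency figures can be turned into a rough response-time estimate. The sketch below is only back-of-the-envelope arithmetic, not a call into the Phi Silica APIs. It assumes that "2K" means 2048 tokens, that prompt and generated tokens share one context window, and that decoding runs at the reported peak rate; the function names are hypothetical.

```python
# Figures reported in the text above; real values vary with prompt length and hardware.
TTFT_SECONDS = 0.230       # time to first token for short prompts
TOKENS_PER_SECOND = 20.0   # reported upper-bound throughput
CONTEXT_TOKENS = 2048      # assumption: "2K" means 2048 tokens


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough best-case time to stream a response of `output_tokens` tokens."""
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    # The first token arrives after TTFT; the remaining tokens stream at the steady rate.
    return TTFT_SECONDS + (output_tokens - 1) / TOKENS_PER_SECOND


def fits_context(prompt_tokens: int, output_tokens: int) -> bool:
    """Assumption: prompt and generated tokens share a single context window."""
    return prompt_tokens + output_tokens <= CONTEXT_TOKENS


if __name__ == "__main__":
    for n in (50, 200, 500):
        print(f"{n} tokens: ~{estimate_response_seconds(n):.1f} s")
```

Because 20 tokens/sec is quoted as an upper bound, these estimates are best-case times; a 200-token reply would take roughly 10 seconds or more under these assumptions.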
At about 3.3B parameters, the model sits in the “small but capable” range and is optimized for Windows distribution and real-time interactivity on device. Media coverage and Microsoft materials position it as the SLM foundation for Copilot+ features on PCs.
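To see why 4-bit quantization matters at this scale, the weight storage can be estimated directly from the parameter count. This is a minimal sketch that assumes every weight is stored at the stated bit width. It ignores quantization scales, any layers kept at higher precision, the KV cache, and activations, so the actual footprint will differ.

```python
PARAMETERS = 3_300_000_000  # about 3.3B, as stated above


def weight_bytes(parameters: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone at a uniform bit width."""
    return parameters * bits_per_weight // 8


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{to_gb(weight_bytes(PARAMETERS, bits)):.2f} GB")
```

Under these assumptions, 4-bit weights need about 1.65 GB, compared with about 6.6 GB at 16 bits.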
As of April 2025, Microsoft has demonstrated vision-based multimodal extensions for Phi Silica (image+text), broadening local use cases like document and UI understanding entirely on device.
About Microsoft
Location:
Redmond, WA, US
Website:
news.microsoft.com
Last updated: September 22, 2025