CogView

New Image Gen 1
Released: July 1, 2021

Overview

CogView is THUDM’s text-to-image system, built on a transformer that operates over discrete image tokens. It handles both Chinese and English prompts, can render readable Chinese text, and, in later versions, supports image generation, captioning, and simple edits.

Description

CogView represents each image as a sequence of codebook tokens produced by a vector-quantized autoencoder, then trains a large autoregressive transformer to predict those image tokens from the prompt. This lets the system compose complex scenes, handle bilingual prompts, and render on-brand Chinese typography more reliably than many of its early peers. The pipeline supports text-to-image generation from scratch, image-to-image refinement through partial token replacement, and captioning by reversing the mapping so that text is predicted from image tokens. Later releases improve speed and resolution through staged decoding and better tokenizers, so drafts are produced quickly and can be upscaled for final delivery. In practice, teams use CogView for concept art, posters with Chinese copy, product visuals, and multilingual content where prompt fidelity and typography matter.
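The token-based pipeline described above can be illustrated with a toy sketch. This is not THUDM's code: the function names (`quantize`, `sample_image_tokens`, `toy_next_token`) are hypothetical, the codebook is hand-written rather than learned, and `toy_next_token` is a deterministic stand-in for the trained transformer. It only shows the shape of the idea: patches become codebook indices, image tokens are filled in one at a time conditioned on the prompt, and fixing some positions in advance gives a simple form of partial token replacement.

```python
def quantize(patches, codebook):
    """Map each patch vector to the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [
        min(range(len(codebook)), key=lambda k: sq_dist(patch, codebook[k]))
        for patch in patches
    ]


def sample_image_tokens(next_token, text_tokens, n_image_tokens, known=None):
    """Fill image-token positions left to right.

    `next_token` stands in for the autoregressive model: it sees the prompt
    tokens and the image tokens produced so far, and returns the next one.
    Positions listed in `known` are kept as given, which mimics
    image-to-image refinement through partial token replacement.
    """
    known = known or {}
    image_tokens = []
    for pos in range(n_image_tokens):
        if pos in known:
            image_tokens.append(known[pos])
        else:
            image_tokens.append(next_token(text_tokens, image_tokens))
    return image_tokens


def toy_next_token(text_tokens, image_so_far, codebook_size=3):
    """Hypothetical placeholder for a trained transformer (assumption)."""
    return (sum(text_tokens) + len(image_so_far)) % codebook_size


# Toy 2-D "patches" and a three-entry codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
patches = [(0.1, 0.1), (0.9, 0.8), (0.2, 0.9)]
source_tokens = quantize(patches, codebook)

prompt = [5, 7]  # pretend these are text token ids
from_scratch = sample_image_tokens(toy_next_token, prompt, 4)
refined = sample_image_tokens(toy_next_token, prompt, 4, known={0: 2, 1: 2})
```

In a real system the codebook and the next-token predictor are both learned, and sampling is stochastic rather than deterministic; the control flow, however, follows the same pattern.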

Last updated: October 15, 2025