CompVis
Models
- Latent Diffusion by CompVis is a text-to-image method that runs the diffusion process in a compressed latent space rather than in pixel space, which lowers compute cost while preserving image quality. A VAE encodes images into latents and decodes them back, a U-Net denoiser operates on the latents, and a text encoder conditions generation. It supports fast synthesis, image-to-image, and inpainting on modest hardware.
  Image · Released 3y ago
- Taming Transformers is a method for high-resolution image synthesis that compresses images into discrete tokens with a VQ-style autoencoder, then trains a Transformer to model those tokens. This makes large images practical to generate with good fidelity and control.
  Image · Released 5y ago
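
To make the "compressed latent space" point in the Latent Diffusion entry concrete, here is a rough sketch of the size arithmetic. It assumes a spatial downsampling factor of 8 and 4 latent channels, values commonly associated with Stable Diffusion v1 style autoencoders; other configurations exist, and nothing here is read from an actual model config. The function names `latent_shape` and `compression_ratio` are hypothetical.

```python
def latent_shape(height, width, factor=8, latent_channels=4):
    """Return (channels, h, w) of the latent for an image of the given size.

    `factor` and `latent_channels` are assumed defaults, not values
    loaded from any real checkpoint.
    """
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (latent_channels, height // factor, width // factor)


def compression_ratio(height, width, image_channels=3, factor=8, latent_channels=4):
    """Ratio of the number of pixel-space values to latent-space values."""
    c, h, w = latent_shape(height, width, factor, latent_channels)
    return (image_channels * height * width) / (c * h * w)


shape = latent_shape(512, 512)
ratio = compression_ratio(512, 512)
print(shape, ratio)
```

Under these assumptions, the denoiser works on far fewer values per step than it would in pixel space, which is where most of the speed benefit comes from.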

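The "discrete tokens" step in the Taming Transformers entry can be illustrated with a minimal sketch of nearest-neighbour vector quantization. This is a toy, pure-Python illustration with a hand-written 2-D codebook, not the project's actual code: in a real VQ-style autoencoder the codebook is learned and the vectors come from a convolutional encoder. The names `quantize` and `dequantize` are hypothetical.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2)."""
    tokens = []
    for v in vectors:
        distances = [sum((a - b) ** 2 for a, b in zip(v, code)) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


def dequantize(tokens, codebook):
    """Look up the codebook vector for each token index."""
    return [codebook[t] for t in tokens]


# Toy codebook and stand-in "encoder outputs" (assumed values for illustration).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
encoder_outputs = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]

tokens = quantize(encoder_outputs, codebook)
reconstructed = dequantize(tokens, codebook)
print(tokens)
```

The resulting integer sequence is what a Transformer would be trained to model; decoding the looked-up codebook vectors gives back an approximation of the image.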