AI Glossary

Text-to-Image Generation

AI models that create images from natural language text descriptions (prompts).

Overview

Text-to-image generation uses AI models to create visual content from natural language descriptions. Modern systems like Stable Diffusion, DALL-E, and Midjourney produce photorealistic or artistic images based on text prompts, leveraging diffusion models or transformer architectures.

Key Details

These systems are trained on billions of image-text pairs and often use CLIP-like models to align visual and textual representations. Two techniques give users control over the output: classifier-free guidance sets how strongly the image follows the prompt, and negative prompts name content the image should steer away from. Applications include graphic design, concept art, advertising, and creative exploration, though the technology raises concerns about copyright and deepfakes.
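To make classifier-free guidance concrete, here is a minimal sketch of the step that combines two noise predictions in a diffusion sampler. It assumes the model has already produced one prediction for a baseline prompt and one for the user's prompt; the function name apply_guidance and the toy numbers are illustrative, not taken from any particular library. In common implementations, a negative prompt works by using its prediction as the baseline in place of the empty-prompt prediction.

```python
from typing import List, Sequence


def apply_guidance(
    baseline: Sequence[float],
    conditioned: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    baseline:    prediction for the empty prompt (or for a negative prompt).
    conditioned: prediction for the user's prompt.
    The result starts at the baseline and moves toward the conditioned
    prediction, scaled by guidance_scale (hypothetical helper, for illustration).
    """
    if len(baseline) != len(conditioned):
        raise ValueError("predictions must have the same length")
    return [b + guidance_scale * (c - b) for b, c in zip(baseline, conditioned)]


# Toy values standing in for a real model's noise predictions.
eps_empty = [0.0, 1.0, -1.0]
eps_prompt = [1.0, 1.0, 0.0]

# A scale of 1.0 reproduces the prompt-conditioned prediction unchanged.
print(apply_guidance(eps_empty, eps_prompt, 1.0))  # [1.0, 1.0, 0.0]

# Larger scales push further away from the baseline, toward the prompt.
print(apply_guidance(eps_empty, eps_prompt, 7.5))  # [7.5, 1.0, 6.5]
```

A scale of 0 ignores the prompt entirely, and values above 1 exaggerate the difference between the two predictions, which tends to improve prompt adherence at the cost of variety.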

Related Concepts

Stable Diffusion, DALL-E, diffusion model


Last updated: March 5, 2026