Text-to-Image Generation
AI models that create images from natural language text descriptions (prompts).
Overview
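Text-to-image generation turns a natural language description into an image. Modern systems such as Stable Diffusion, DALL-E, and Midjourney can produce photorealistic or stylized images from a text prompt. Most are built on diffusion models, which start from random noise and iteratively denoise it toward an image that matches the text; some, such as the original DALL-E, instead use autoregressive transformers that generate the image as a sequence of tokens.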
Key Details
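These systems are trained on very large collections of image-text pairs, ranging from hundreds of millions to billions of examples. Many rely on a CLIP-like encoder, which is trained to place matching images and captions close together in a shared embedding space, so that the text prompt can condition image generation. Classifier-free guidance controls how strongly the output follows the prompt, and negative prompts steer generation away from unwanted content. Applications include graphic design, concept art, advertising, and creative exploration. The technology also raises concerns about copyright, both in the training data and in generated outputs, and about deepfakes.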
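As a rough illustration of how classifier-free guidance and negative prompts combine, the sketch below applies the standard guidance formula to two noise predictions from a diffusion model. The function name guided_noise, the plain-list representation, and the default scale of 7.5 are assumptions made for this example, not part of any particular library; in a real system the predictions are large tensors produced by the denoising network at every step.

```python
from typing import List, Sequence


def guided_noise(
    eps_prompt: Sequence[float],
    eps_base: Sequence[float],
    guidance_scale: float = 7.5,  # assumed default, chosen for illustration
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    eps_prompt: prediction conditioned on the user's prompt.
    eps_base:   prediction conditioned on an empty prompt (plain guidance)
                or on a negative prompt (to push away from its content).
    """
    if len(eps_prompt) != len(eps_base):
        raise ValueError("predictions must have the same length")
    # Start at the baseline and move toward (and past) the prompt prediction.
    return [
        base + guidance_scale * (cond - base)
        for cond, base in zip(eps_prompt, eps_base)
    ]


if __name__ == "__main__":
    # Toy two-element "predictions" standing in for full noise tensors.
    prompt_pred = [1.0, 2.0]
    base_pred = [0.0, 1.0]
    print(guided_noise(prompt_pred, base_pred, guidance_scale=2.0))
```

A scale of 0 ignores the prompt and returns the baseline prediction, a scale of 1 returns the prompt-conditioned prediction unchanged, and larger values extrapolate further away from the baseline. Supplying a negative prompt simply swaps the empty-prompt baseline for one conditioned on the content to avoid.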