AI Guardrails
Programmatic safety constraints that filter, validate, or modify LLM inputs and outputs.
Overview
AI guardrails are programmatic safety mechanisms that monitor, filter, or constrain LLM inputs and outputs to prevent harmful, incorrect, or off-topic behavior. They act as a safety layer between the model and the user, catching issues that the model's training alone cannot prevent.
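The "safety layer" idea can be sketched as a wrapper around the model call that checks the input before it reaches the model and the output before it reaches the user. This is a minimal illustration with hypothetical names (`call_model`, `guarded_call`, `BLOCKED_PATTERNS`), not any particular framework's API:

```python
import re

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call; returns a canned reply.
    return f"Echo: {prompt}"

# Input guardrail: naive prompt-injection patterns (illustrative only).
BLOCKED_PATTERNS = [re.compile(r"ignore (all )?previous instructions", re.I)]

def guarded_call(prompt: str) -> str:
    # Check the input before it reaches the model.
    for pat in BLOCKED_PATTERNS:
        if pat.search(prompt):
            return "Request blocked by input guardrail."
    reply = call_model(prompt)
    # Check the output before it reaches the user (length limit here).
    if len(reply) > 500:
        reply = reply[:500] + " [truncated by guardrail]"
    return reply
```

Because the checks live outside the model, they apply deterministically regardless of how the model was trained or prompted.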
Key Details
Guardrails fall into three broad classes: input filters (blocking prompt injection, detecting PII), output validators (checking for harmful content, factual consistency, or format compliance), and behavioral constraints (keeping responses on-topic, enforcing length limits). Tools such as Guardrails AI, NVIDIA's NeMo Guardrails, and custom classifiers provide frameworks for implementing these controls. Guardrails are essential for production LLM deployments, especially in regulated industries such as healthcare and finance.
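Two of these classes can be sketched concretely: a regex-based PII input filter and an output validator that enforces format compliance. This is a hedged, minimal sketch (the regexes, function names, and the required `"answer"` key are illustrative assumptions, not part of any named framework):

```python
import json
import re

# Input filter: mask emails and US SSNs before text reaches the model.
# These patterns are deliberately simple; production filters use vetted
# PII detectors, not hand-rolled regexes.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_pii(text: str) -> str:
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text

def validate_json_output(raw: str) -> dict:
    # Output validator: require valid JSON with a (hypothetical) required
    # "answer" field; raising lets the caller retry or fall back.
    obj = json.loads(raw)
    if "answer" not in obj:
        raise ValueError("missing required 'answer' field")
    return obj
```

In practice the validator would run on every model response, with a retry or fallback path when it raises, which is the pattern frameworks like Guardrails AI formalize.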