AI Glossary

RoBERTa

An optimized version of BERT that achieves better performance through improved training methodology.

Overview

RoBERTa (Robustly Optimized BERT Pretraining Approach), developed by Facebook AI in 2019, improves upon BERT by revisiting key training decisions: training longer with larger batches, removing the next sentence prediction objective, using dynamic masking instead of a fixed masking pattern, and training on roughly ten times more data (160GB of text vs. BERT's 16GB).
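The dynamic masking change can be illustrated with a minimal sketch. BERT masked each training sentence once during preprocessing, so the model saw the same masked positions every epoch; RoBERTa re-samples the mask each time the sentence is seen. The function and token names below are illustrative, not from any specific library:

```python
import random

# Illustrative sketch: whitespace tokens and a "<mask>" placeholder
# stand in for a real subword tokenizer's output.
def mask_tokens(tokens, mask_token="<mask>", prob=0.15, rng=None):
    """Replace roughly `prob` of the tokens with the mask token.

    Calling this fresh on every epoch gives RoBERTa-style dynamic
    masking; calling it once and reusing the result reproduces
    BERT-style static masking.
    """
    rng = rng or random.Random()
    return [mask_token if rng.random() < prob else t for t in tokens]

tokens = "the quick brown fox jumps over the lazy dog".split()
rng = random.Random(0)

# Dynamic masking: the model sees a different mask pattern each epoch.
epoch1 = mask_tokens(tokens, rng=rng)
epoch2 = mask_tokens(tokens, rng=rng)
```

Because the masked positions vary across epochs, the model is trained to predict many more token/context combinations from the same corpus, which is one reason the extra training compute pays off.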

Key Details

Without any architectural changes, these training improvements produced significant gains on GLUE, SQuAD, and RACE. RoBERTa demonstrated that BERT was substantially undertrained and that careful choices about training procedure and hyperparameters can matter as much as architectural innovation. It remains a strong baseline for NLP research.

Related Concepts

BERT, Transformer, Pre-training

Last updated: March 5, 2026