Encoder-Decoder Model
A transformer architecture with separate encoder and decoder components, where the encoder processes input and the decoder generates output conditioned on the encoder's representations.
Architecture
The encoder processes the input bidirectionally, so every input token can attend to every other. The decoder generates output autoregressively, using cross-attention at each layer to attend to the encoder's output. This design suits sequence-to-sequence tasks, where one sequence is transformed into another.
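The interaction described above can be sketched with PyTorch's `nn.Transformer`, which bundles a bidirectional encoder and an autoregressive decoder whose cross-attention layers read the encoder's output. The vocabulary size, model width, and tensor shapes below are arbitrary toy values, not from any real model:

```python
import torch
import torch.nn as nn

# Toy dimensions (hypothetical, for illustration only).
vocab_size, d_model = 100, 32
embed = nn.Embedding(vocab_size, d_model)
model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)
lm_head = nn.Linear(d_model, vocab_size)

src = torch.randint(0, vocab_size, (1, 10))  # input sequence (batch of 1)
tgt = torch.randint(0, vocab_size, (1, 6))   # output tokens generated so far

# Causal mask keeps the decoder autoregressive: position i attends only
# to positions <= i. The encoder side uses no mask (fully bidirectional),
# and the decoder's cross-attention reads the encoder output ("memory").
causal = model.generate_square_subsequent_mask(tgt.size(1))
out = model(embed(src), embed(tgt), tgt_mask=causal)
logits = lm_head(out)  # shape: (1, 6, vocab_size), next-token scores
```

At inference time this forward pass would run in a loop: sample the next token from `logits`, append it to `tgt`, and repeat until an end-of-sequence token is produced.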
Examples
Prominent examples include T5 (Text-to-Text Transfer Transformer, from Google), BART (Bidirectional and Auto-Regressive Transformers), and mBART, a multilingual variant of BART. These models excel at summarization, translation, and question answering.