Encoder-Only Model
A transformer architecture that processes the entire input bidirectionally to produce rich contextual representations, used for understanding tasks such as classification and named entity recognition (NER).
How It Works
The encoder processes the full input sequence with bidirectional self-attention: each token attends to every other token, with no causal mask restricting attention to earlier positions. It outputs a contextualized representation for each token. BERT is the canonical encoder-only model, pretrained with masked language modeling so that its representations draw on context from both directions.
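A minimal numpy sketch of the bidirectional attention step described above. The dimensions and random projection matrices are stand-ins for illustration, not a real trained model; the point is that the attention matrix is full (no causal mask), so every row mixes information from all positions.

```python
import numpy as np

# Illustrative sketch of one bidirectional self-attention layer.
# Sizes and random weights below are made up for demonstration.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8

x = rng.normal(size=(seq_len, d_model))                  # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d_model)                      # (seq_len, seq_len)

# No causal mask: every token attends to every other token.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

contextual = weights @ v          # one contextualized vector per token
print(weights.shape)              # (4, 4): the full attention matrix
```

In a decoder, `scores` would be masked so position i could only attend to positions <= i; the absence of that mask is what makes the representations bidirectional.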
Use Cases
Text classification, named entity recognition, sentence similarity, semantic search (embedding generation), and any task that requires understanding existing text rather than generating new text.
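The semantic-search use case above can be sketched as mean-pooling token representations into a sentence embedding and ranking by cosine similarity. In practice the token vectors would come from an encoder such as BERT; here random stand-ins are used purely to show the pooling and ranking mechanics, and the query is deliberately built from document 0's tokens so it matches that document exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16  # embedding dimension (illustrative)

def mean_pool(token_embeddings):
    # Collapse per-token vectors (num_tokens, d) into one sentence embedding.
    return token_embeddings.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Three "documents", each a (num_tokens, d) array of token representations
# (random stand-ins for encoder outputs).
docs = [rng.normal(size=(n, d)) for n in (5, 7, 3)]
doc_vecs = [mean_pool(t) for t in docs]

# Query built from document 0's tokens, so it should rank that document first.
query_vec = mean_pool(docs[0])

scores = [cosine(query_vec, v) for v in doc_vecs]
best = int(np.argmax(scores))
print(best)  # → 0
```

Real systems precompute the document embeddings once and store them in a vector index; only the query is encoded at search time.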