Encoder-only transformers

Encoder-only transformers use only the encoder stack of the Transformer architecture and are characterized by bidirectional self-attention. Unlike decoder-only models, which apply a causal mask so that each token attends only to preceding tokens, they leave attention unmasked: every token can attend to both preceding and following tokens in the context.
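
The contrast between bidirectional and causal attention can be illustrated with a minimal sketch. It is not taken from any particular model: the function names (`softmax`, `attention_weights`) and the uniform toy scores are assumptions chosen for illustration, and real implementations compute scores from learned query and key projections.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(scores, causal=False):
    """Row i holds the attention weights of token i over all tokens.

    With causal=True, positions j > i are masked out (decoder-style).
    With causal=False, every position is visible (encoder-style).
    """
    n = len(scores)
    weights = []
    for i in range(n):
        row = [
            scores[i][j] if (not causal or j <= i) else float("-inf")
            for j in range(n)
        ]
        weights.append(softmax(row))
    return weights

# Hypothetical uniform scores for a 4-token sequence.
scores = [[0.0] * 4 for _ in range(4)]

bidirectional = attention_weights(scores, causal=False)
causal = attention_weights(scores, causal=True)

print(bidirectional[0])  # [0.25, 0.25, 0.25, 0.25]: first token sees all four
print(causal[0])         # [1.0, 0.0, 0.0, 0.0]: first token sees only itself
```

In the bidirectional case the first token already distributes weight over later tokens, which is what lets an encoder build each token's representation from the full context.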

Core Functionality & Use Cases

Primary applications are discriminative, extractive, and sequence-labeling tasks in natural language processing (NLP), such as text classification, extractive question answering, and named-entity recognition.
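
As a rough sketch of how these three task families typically consume encoder outputs, the toy example below applies simple heads to per-token vectors. Everything here is hypothetical: the hidden vectors, the weight matrices, and the helper names (`linear`, `argmax`) are made up for illustration, and a real model would learn these weights.

```python
# Toy per-token encoder outputs: one 2-dimensional vector per token (made-up values).
tokens = ["[CLS]", "Paris", "is", "nice"]
hidden = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.6, 0.4]]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def linear(vec, weight):
    """Multiply a vector by a weight matrix given as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

# Discriminative: classify the whole sequence from the first token's vector.
cls_weight = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2-class head
sequence_label = argmax(linear(hidden[0], cls_weight))

# Sequence labeling: apply one head to every token independently.
tag_weight = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical tags: 0 = O, 1 = ENTITY
token_tags = [argmax(linear(h, tag_weight)) for h in hidden]

# Extractive: score each token as a span start and a span end,
# then pick the best start and the best end at or after it.
start_vec = [[0.0, 1.0]]  # hypothetical start scorer (one output)
end_vec = [[0.0, 1.0]]    # hypothetical end scorer (one output)
start_scores = [linear(h, start_vec)[0] for h in hidden]
end_scores = [linear(h, end_vec)[0] for h in hidden]
start = argmax(start_scores)
end = start + argmax(end_scores[start:])
answer = tokens[start:end + 1]

print(sequence_label)  # 0
print(token_tags)      # [0, 1, 0, 0]
print(answer)          # ['Paris']
```

All three heads read the same contextual token vectors; they differ only in whether they use one pooled vector, every vector, or a pair of positions.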

Comparative Context

