Conditional Random Fields
Conditional Random Fields (CRFs) are a class of probabilistic graphical models designed for structured prediction tasks where the output consists of sequences or other interdependent elements. Unlike generative models, which learn the joint distribution of inputs and outputs, CRFs are discriminative models that directly estimate the conditional probability of an output sequence given the input. Because the model never has to represent the distribution of the inputs, it can use rich, overlapping features of the observations without making independence assumptions about them.
CRFs are widely used in natural language processing for sequence labeling tasks such as named entity recognition and part-of-speech tagging. The model captures dependencies between labels in an output sequence through potential functions defined over pairs of adjacent labels and the observations. This ability to model label interactions while conditioning on observed data distinguishes CRFs from simpler classifiers such as logistic regression, which predict the label of each element independently.
Model Structure
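A CRF defines the conditional probability of a label sequence given observations as a log-linear (exponential family) distribution. The unnormalized score of a label sequence is a weighted sum of feature functions, each of which combines the input observations with label information; the weights are the learned parameters. The score is exponentiated and divided by a normalization constant, the partition function, which sums over all possible output sequences. The number of such sequences grows exponentially with sequence length, so the sum is not enumerated directly. In chain-structured models, the forward-backward algorithm computes the partition function and the marginal probabilities of labels by dynamic programming, while the Viterbi algorithm finds the most likely sequence.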
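The normalization over all label sequences can be made concrete with a small sketch. It assumes a linear-chain model whose feature functions and weights have already been collapsed into two score tables: `emissions[t, k]` (score of label k at position t) and `transitions[i, j]` (score of label i followed by label j). These names, the table layout, and the random test values are illustrative assumptions, not part of any particular library.

```python
import itertools
import numpy as np

def sequence_score(emissions, transitions, labels):
    """Unnormalized log-score of one label sequence.

    emissions:   (T, K) array, emissions[t, k] = score of label k at position t
    transitions: (K, K) array, transitions[i, j] = score of label i followed by j
    labels:      length-T sequence of label indices
    """
    score = emissions[0, labels[0]]
    for t in range(1, len(labels)):
        score += transitions[labels[t - 1], labels[t]] + emissions[t, labels[t]]
    return score

def log_partition(emissions, transitions):
    """log of the normalization constant, via the forward recursion.

    Runs in O(T * K^2) time instead of enumerating all K^T sequences.
    """
    alpha = emissions[0].copy()  # alpha[k]: log-sum of scores of prefixes ending in label k
    for t in range(1, emissions.shape[0]):
        # scores[i, j] = alpha[i] + transitions[i, j] + emissions[t, j]
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        alpha = np.logaddexp.reduce(scores, axis=0)  # log-sum-exp over previous label i
    return np.logaddexp.reduce(alpha)

def log_prob(emissions, transitions, labels):
    """Conditional log-probability of a label sequence."""
    return sequence_score(emissions, transitions, labels) - log_partition(emissions, transitions)

# Sanity check on a tiny problem: probabilities of all K^T sequences sum to 1
# (up to floating-point error). The score tables here are random placeholders.
rng = np.random.default_rng(0)
T, K = 4, 3
emissions = rng.normal(size=(T, K))
transitions = rng.normal(size=(K, K))
total = sum(np.exp(log_prob(emissions, transitions, y))
            for y in itertools.product(range(K), repeat=T))
print(total)
```

Working in log space with `np.logaddexp` keeps the recursion numerically stable for long sequences. The negative of `log_prob`, summed over the training sequences, is the objective minimized in conditional maximum likelihood training.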
Training and Inference
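CRFs are typically trained by maximum likelihood estimation: the parameters are chosen to maximize the conditional likelihood of the label sequences in the training data given their observations. Inference involves computing the most probable label sequence for new observations. Linear-chain CRFs, in which each label interacts directly only with its neighboring labels, are the most common variant. For these models, exact training and decoding take time linear in the sequence length and quadratic in the number of labels, so they scale efficiently to long sequences.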
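Finding the most probable label sequence in a linear-chain model can be sketched with the Viterbi recursion. As an assumption for illustration, the model is again represented by two hypothetical score tables, `emissions[t, k]` for label k at position t and `transitions[i, j]` for label i followed by label j; the function name and the random inputs are placeholders rather than any library's API.

```python
import numpy as np

def viterbi(emissions, transitions):
    """Highest-scoring label sequence of a linear-chain model.

    emissions:   (T, K) array of per-position label scores
    transitions: (K, K) array, transitions[i, j] = score of label i followed by j
    Returns (path, score): a list of T label indices and its unnormalized log-score.
    """
    T, K = emissions.shape
    delta = emissions[0].copy()            # delta[k]: best score of a prefix ending in label k
    backptr = np.zeros((T, K), dtype=int)  # backptr[t, j]: best previous label for j at position t
    for t in range(1, T):
        scores = delta[:, None] + transitions   # scores[i, j]: best prefix ending in i, then i -> j
        backptr[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + emissions[t]
    # Follow the back-pointers from the best final label to recover the path.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    path.reverse()
    return path, float(delta.max())

# Example with random placeholder scores for 6 positions and 4 labels.
rng = np.random.default_rng(0)
emissions = rng.normal(size=(6, 4))
transitions = rng.normal(size=(4, 4))
best_path, best_score = viterbi(emissions, transitions)
print(best_path, best_score)
```

Each step considers every pair of previous and current labels, so the cost is proportional to the sequence length times the square of the number of labels. The partition function is not needed for decoding, because it is the same for every candidate sequence.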