Object tracking

The computer vision task of identifying and following objects across consecutive video frames while maintaining their identity and spatial relationships.

Core techniques:

  • Feature-based tracking: Using SIFT, ORB, or deep features for frame-to-frame matching
  • Deep learning trackers: Siamese networks (e.g., SiamRPN) and correlation filter-based methods that use deep features
  • Multi-object tracking (MOT): Tracking-by-detection methods (e.g., SORT, DeepSORT) that must handle occlusions, identity switches, and scale changes
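
To make the data-association step behind multi-object tracking concrete, here is a minimal sketch in Python. It is not SORT or DeepSORT: it assumes detections arrive as (x1, y1, x2, y2) boxes, matches them to existing tracks greedily by intersection over union (IoU), and leaves out the Kalman-filter motion model and Hungarian assignment that SORT uses. The names GreedyIoUTracker, Track, iou_threshold, and max_missed are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    track_id: int
    box: Box
    missed: int = 0  # consecutive frames without a matching detection


class GreedyIoUTracker:
    """Hypothetical tracker: greedy IoU matching, no motion model."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 2):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed
        self.tracks: list[Track] = []
        self._ids = count(1)

    def update(self, detections: list[Box]) -> dict[int, Box]:
        # Score every (track, detection) pair, highest overlap first.
        pairs = sorted(
            ((iou(t.box, d), ti, di)
             for ti, t in enumerate(self.tracks)
             for di, d in enumerate(detections)),
            reverse=True,
        )
        matched_tracks, matched_dets = set(), set()
        for score, ti, di in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to match
            if ti in matched_tracks or di in matched_dets:
                continue
            self.tracks[ti].box = detections[di]
            self.tracks[ti].missed = 0
            matched_tracks.add(ti)
            matched_dets.add(di)

        # Unmatched tracks age (e.g., during occlusion) and are dropped
        # once they have been missed for more than max_missed frames.
        for ti, t in enumerate(self.tracks):
            if ti not in matched_tracks:
                t.missed += 1
        self.tracks = [t for t in self.tracks if t.missed <= self.max_missed]

        # Unmatched detections start new tracks with fresh identities.
        for di, d in enumerate(detections):
            if di not in matched_dets:
                self.tracks.append(Track(next(self._ids), d))

        # Report only the tracks that were seen in this frame.
        return {t.track_id: t.box for t in self.tracks if t.missed == 0}


tracker = GreedyIoUTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
# Detections arrive in a different order; identities should follow the boxes.
frame2 = tracker.update([(52, 51, 62, 61), (1, 0, 11, 10)])
```

Greedy matching keeps the sketch short. A closer approximation of SORT would predict each track's next box with a motion model before matching and solve the assignment optimally, and DeepSORT additionally compares appearance features to reduce identity switches after occlusion.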

Recent advancements focus on integrating spatial-temporal understanding with language models.

Related concepts: Video understanding, Multi-object tracking, Spatial-temporal modeling, Large language models