Alibaba’s VideoRefer Suite is an open-source video model (Apache 2.0 license) designed to give large language models fine-grained spatial-temporal object understanding. It lets an LLM track and reason about specific objects in a video across time and space.

Backlink: 2026 04 14 Fahd Mirza Videorefer model running locally