# Track-On-R: Real-World Point Tracking with Verifier-Guided Pseudo-Labeling
Track-On-R is an online point tracking model that improves real-world performance through verifier-guided pseudo-label fine-tuning. It processes videos frame-by-frame using a compact transformer memory.
This model was introduced in the paper Real-World Point Tracking with Verifier-Guided Pseudo-Labeling.
- Project Page: kuis-ai.github.io/track_on_r
- Repository: github.com/gorkaydemir/track_on
## Model Description
Models for long-term point tracking are typically trained on synthetic datasets, and their performance often degrades in real-world videos. Track-On-R addresses this by introducing a verifier, a meta-model that learns to assess the reliability of tracker predictions and guide pseudo-label generation. By selecting the most trustworthy predictions from an ensemble, it enables data-efficient adaptation to unlabeled real-world videos.
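To make the selection step concrete, here is a minimal sketch of verifier-guided pseudo-label selection. This is not the paper's implementation: `toy_verifier` is a hypothetical stand-in for the learned meta-model, scoring each ensemble member's trajectory per point so the most reliable prediction can be kept as a pseudo-label.

```python
import torch

def toy_verifier(traj: torch.Tensor) -> torch.Tensor:
    # Hypothetical reliability score per ensemble member and point.
    # Here: penalize jittery trajectories (large frame-to-frame motion);
    # the actual verifier is a learned model, not this heuristic.
    motion = (traj[:, 1:] - traj[:, :-1]).norm(dim=-1)  # (E, T-1, N)
    return -motion.mean(dim=1)                          # (E, N)

E, T, N = 4, 16, 8                      # ensemble size, frames, points
ensemble = torch.randn(E, T, N, 2)      # per-member (x, y) trajectories
scores = toy_verifier(ensemble)         # (E, N) reliability scores
best = scores.argmax(dim=0)             # most reliable member per point
pseudo_labels = ensemble[best, :, torch.arange(N)]  # (N, T, 2)
print(pseudo_labels.shape)              # torch.Size([8, 16, 2])
```

The selected trajectories would then serve as supervision targets when fine-tuning the tracker on unlabeled real-world videos.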
## Sample Usage
You can track points on a video using the `Predictor` class from the official repository. Ensure the repository is cloned and its dependencies are installed.
### Minimal Example
```python
import torch

from model.trackon_predictor import Predictor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize
model = Predictor(checkpoint_path="path/to/checkpoint.pth").to(device).eval()

# Inputs
# video:   (1, T, 3, H, W) in range 0-255
# queries: (1, N, 3) with rows = (t, x, y) in pixel coordinates,
#          or None to enable the model's uniform grid querying
video = ...    # e.g., torchvision.io.read_video -> (T, H, W, 3) -> (T, 3, H, W) -> add batch dim
queries = ...  # e.g., torch.tensor([[0, 190, 190], [0, 200, 190], ...]).unsqueeze(0).to(device)

# Inference
traj, vis = model(video, queries)

# Outputs
# traj: (1, T, N, 2) -> per-point (x, y) in pixels
# vis:  (1, T, N)    -> per-point visibility in {0, 1}
```
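The input preparation hinted at in the comments above can be sketched as follows. A dummy `uint8` clip stands in for the output of `torchvision.io.read_video`, which returns frames shaped `(T, H, W, 3)`; the query coordinates are illustrative values, not from the paper.

```python
import torch

# Dummy clip standing in for torchvision.io.read_video output: (T, H, W, 3), 0-255
frames = torch.randint(0, 256, (24, 256, 320, 3), dtype=torch.uint8)

# Rearrange to channels-first and add a batch dimension -> (1, T, 3, H, W)
video = frames.permute(0, 3, 1, 2).float().unsqueeze(0)

# Queries as (t, x, y) rows in pixel coordinates -> (1, N, 3)
queries = torch.tensor([[0, 190.0, 190.0],
                        [0, 200.0, 190.0]]).unsqueeze(0)

print(video.shape, queries.shape)  # torch.Size([1, 24, 3, 256, 320]) torch.Size([1, 2, 3])
```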
**Note:** Track-On checkpoints do not include the DINOv3 backbone weights due to licensing restrictions. You must request access to the official pretrained `dinov3-vits16plus` weights on Hugging Face. Once access is granted and you are logged in (`huggingface-cli login`), the weights are downloaded and cached locally on the first run.
## Citation
```bibtex
@inproceedings{aydemir2026trackonr,
  title     = {Real-World Point Tracking with Verifier-Guided Pseudo-Labeling},
  author    = {Aydemir, G\"orkay and G\"uney, Fatma and Xie, Weidi},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}
```