
SpatialTrackerV2: 3D Point Tracking Made Easy

CAD&CG, Zhejiang University; University of Oxford; Ant Research; Pixelwise AI; Bytedance Seed

Yuxi Xiao, Jianyuan Wang, Nan Xue, Nikita Karaev, Iurii Makarov, Bingyi Kang, Xin Zhu, Hujun Bao, Yujun Shen, Xiaowei Zhou

Project Page | BibTeX | Google Drive


📰 Latest Updates & News

  • [July 17, 2025]: Our paper is now available on arXiv
  • [June 27, 2025]: SpatialTrackerV2 was accepted to ICCV 2025
  • [June 23, 2025]: Huggingface Space Demo launched! Try it out: 🤗 Huggingface Space

TODO List

  • Release quick start of SpaTrack2-offline
  • Final version of Paper at PAPER.md
  • Release SpaTrack2-online
  • Training & evaluation code.
  • Support for more depth models, e.g., DepthAnything, StereoFoundation, UniDepth, Metric3D.
  • Ceres Python Bindings designed for SpatialTracker and Dynamic Reconstruction.

Set up the environment

To set up the environment for running the SpaTrack model, follow these steps:

  1. Clone the Repository:

    # clone the code
    git clone https://github.com/henry123-boy/SpaTrackerV2.git
    cd SpaTrackerV2
    
    # optionally download the example data to run the examples
    # Note: This will slow down the cloning process, as it includes large files.
    git submodule update --init --recursive
    
  2. Create a Virtual Environment: It's recommended to use a virtual environment to manage dependencies.

    conda create -n SpaTrack2 python=3.11
    conda activate SpaTrack2
    
  3. Install Dependencies:

    Install the PyTorch dependencies with pip (tested with torch 2.4).

    python -m pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
    

    Install the required Python packages using pip.

    python -m pip install -r requirements.txt
    

By following these steps, you should have a working environment ready to run the SpaTrack model.
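
As a quick sanity check (not part of the official instructions), you can verify that PyTorch was installed with CUDA support before moving on:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"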

Quick Start

Here are two examples of how to use SpaTrack2.

Type 1: Monocular video as input (Example 0)

python inference.py --data_type="RGB" --data_dir="examples" --video_name="protein" --fps=3

Type 2: RGBD video + camera poses as input (Example 1)

We provide an example with depth and camera poses from MegaSAM. First, download the example data via:

sh scripts/download.sh

Then run inference with the command below:

python inference.py --data_type="RGBD" --data_dir="assets/example1" --video_name="snowboard" --fps=1

Visualize your results

Guidance will be displayed in the terminal after running inference.py.

Please follow the instructions in the app_3rd README to configure the dependencies. Then, install the required packages:

python -m pip install gradio==5.31.0 pako

Our Gradio demo enables users to easily track points on the target object. Just try:

python app.py
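
Once the app starts, Gradio prints a local URL in the terminal (typically http://127.0.0.1:7860 with default settings); open it in a browser to load a video and pick the points you want to track.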