MV²

Multi-View Multi-Vehicle Driving Dataset for Novel View Synthesis

A real-world benchmark for evaluating novel view synthesis under large viewpoint changes in dynamic driving scenes. MV² records the same outdoor scene from a car, a two-wheeler, and a drone, enabling train-on-one-platform and test-on-another evaluation.

Sanjay Bhargav Dharavath · Hanvitha Saraswathi Mukkamala · Faizan Farooq Khan · Ioannis Kakogeorgiou · Aditya Arun · Zakaria Laskar · C.V. Jawahar

Centre for Visual Information Technology, IIIT Hyderabad


50

Pose-verified scenes

retained from 200 recorded sequences

12K

Calibrated images

1080×1920 · 60 FPS · sampled to 2 FPS

3

Platforms

car · two-wheeler · drone

2

Evaluation setups

Eval-Car-Train and Eval-Drone-Train

Dataset samples

Synchronized frame triplets from the same timestamp across the car, two-wheeler, and drone capture platforms.

seq10 · V_C · car front · frame 000010 · training stream

seq10 · V_S · scooty front · frame 000010 · cross-vehicle view

seq10 · V_D · drone · frame 000010 · aerial stream

Why MV²?

Most driving NVS benchmarks test interpolation along the same vehicle trajectory. MV² instead measures cross-platform extrapolation: train on one moving platform and render from another.

Real multi-platform capture

A car, a two-wheeler, and a drone observe the same dynamic outdoor scene from independent synchronized trajectories.

real-world · dynamic

Cross-view evaluation

Models are trained on one platform and evaluated on another, exposing failures hidden by same-trajectory test splits.

T_C→S · T_D→C

Pose-verified benchmark

COLMAP poses are filtered with manual region annotations, dense RoMA correspondences, and epipolar consistency checks.

SfM · e_m ≤ 30 px

Dataset construction

A short version of the capture and filtering protocol used in the paper.

Capture

GoPro 10 cameras mounted on a car, two-wheeler, and drone capture synchronized 1080×1920 videos at 60 FPS.

car · two-wheeler · drone

Segment

Videos are sampled at 2 FPS and split into 100-frame segments. Congested, tunnel, red-light, and poorly aligned segments are removed.

200 candidates
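The sampling and segmentation step can be sketched as follows. This is a minimal illustration, not the release code: the function name `sample_and_segment` and the choice to drop a trailing partial segment are assumptions. Only the 60 FPS source rate, the 2 FPS target rate, and the 100-frame segment length come from the text above.

```python
def sample_and_segment(num_frames, src_fps=60, target_fps=2, segment_len=100):
    """Return a list of segments, each a list of source-frame indices."""
    if src_fps % target_fps != 0:
        raise ValueError("src_fps must be a multiple of target_fps")
    stride = src_fps // target_fps  # 60 / 2 = 30: keep every 30th frame
    sampled = list(range(0, num_frames, stride))
    # Assumption: only full segments are kept; a trailing partial one is dropped.
    n_full = len(sampled) // segment_len
    return [sampled[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


# A hypothetical 2-minute clip recorded at 60 FPS (7200 frames).
segments = sample_and_segment(num_frames=60 * 120)
print(len(segments), segments[0][:3], segments[1][0])  # 2 [0, 30, 60] 3000
```

At 2 FPS, each 100-frame segment covers 50 seconds of driving.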

Register

Training sequences are reconstructed with COLMAP. Test images are localized using the nearest training images.

single world frame

Verify

Relative poses are accepted only when the maximum epipolar error e_m over annotated correspondences is at most 30 pixels.

50 final scenes
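The acceptance test can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the paper's implementation: the pose convention (x2 = R·x1 + t), the use of the larger of the two point-to-epipolar-line distances per correspondence, the helper names, and the toy intrinsics are all hypothetical. Only the 30-pixel threshold on the maximum error comes from the text above.

```python
import numpy as np


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])


def fundamental_from_pose(K1, K2, R, t):
    """F for a relative pose mapping camera-1 points to camera 2 (x2 = R x1 + t)."""
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)


def max_epipolar_error(F, pts1, pts2):
    """Largest point-to-epipolar-line distance (pixels) over all correspondences."""
    p1 = np.hstack([pts1, np.ones((len(pts1), 1))])
    p2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    l2 = p1 @ F.T  # row i is F @ p1[i]: epipolar line in image 2
    l1 = p2 @ F    # row i is F.T @ p2[i]: epipolar line in image 1
    residual = np.abs(np.sum(p2 * l2, axis=1))  # |p2^T F p1|
    d2 = residual / np.linalg.norm(l2[:, :2], axis=1)
    d1 = residual / np.linalg.norm(l1[:, :2], axis=1)
    # Assumption: score each pair by the worse of its two distances.
    return float(np.max(np.maximum(d1, d2)))


def pose_is_verified(F, pts1, pts2, max_px=30.0):
    return max_epipolar_error(F, pts1, pts2) <= max_px


def project(K, X):
    x = X @ K.T
    return x[:, :2] / x[:, 2:3]


# Toy check with made-up intrinsics and a pure sideways translation.
K = np.array([[1000.0, 0.0, 540.0], [0.0, 1000.0, 960.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([1.0, 0.0, 0.0])
X = np.array([[0.5, -0.2, 8.0], [-1.0, 0.3, 12.0], [2.0, 1.0, 20.0]])
pts1, pts2 = project(K, X), project(K, X @ R.T + t)
F = fundamental_from_pose(K, K, R, t)

clean = max_epipolar_error(F, pts1, pts2)
noisy = max_epipolar_error(F, pts1, pts2 + np.array([0.0, 50.0]))
print(pose_is_verified(F, pts1, pts2))                           # True
print(pose_is_verified(F, pts1, pts2 + np.array([0.0, 50.0])))   # False
```

In the toy setup the epipolar lines are horizontal, so shifting the second image's points down by 50 px produces an error of 50 px and the pose is rejected.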

Evaluation protocol

A split name T_X→Y means the model is trained on platform X and then rendered and evaluated from the viewpoints of platform Y.

Eval-Car-Train · train: V_C

T_C→C: same car trajectory; closest to conventional NVS interpolation

T_C→L: left car-mounted camera; small lateral baseline

T_C→S: two-wheeler trajectory; cross-vehicle generalization

Eval-Drone-Train · train: V_D

T_D→D: same drone trajectory; aerial interpolation

T_D→C: drone-to-car rendering; aerial-to-ground gap

T_D→S: drone-to-two-wheeler rendering; cross-platform gap
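A small reader for this naming scheme might look like the sketch below. The `Split` class, the `parse_split` function, and the `VIEWS` table are hypothetical helpers for illustration and are not part of the released code; only the split names and their meanings come from the list above.

```python
from dataclasses import dataclass

# View keys used in the split names above.
VIEWS = {
    "C": "car",
    "L": "left car-mounted camera",
    "S": "two-wheeler",
    "D": "drone",
}


@dataclass(frozen=True)
class Split:
    train: str  # key of the training view, e.g. "C"
    test: str   # key of the rendered/evaluated view, e.g. "S"

    @property
    def is_cross_view(self):
        """True when the test view differs from the training view."""
        return self.train != self.test

    def describe(self):
        return f"train on {VIEWS[self.train]}, evaluate on {VIEWS[self.test]}"


def parse_split(name):
    """Parse a name such as 'T_C→S' into a Split."""
    prefix, sep, pair = name.partition("_")
    train, arrow, test = pair.partition("→")
    if prefix != "T" or not sep or not arrow:
        raise ValueError(f"not a split name: {name!r}")
    if train not in VIEWS or test not in VIEWS:
        raise ValueError(f"unknown view key in {name!r}")
    return Split(train, test)


for name in ["T_C→C", "T_C→L", "T_C→S", "T_D→D", "T_D→C", "T_D→S"]:
    split = parse_split(name)
    print(name, "|", split.describe(), "| cross-view:", split.is_cross_view)
```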

Key findings

The main conclusions from the paper's experiments.

Viewpoint gap hurts NVS

Performance consistently drops as the test viewpoint moves away from the training trajectory, especially from T_C→C to T_C→S.

One extra car camera is not enough

T_C→L introduces only a small baseline. T_C→S is the more realistic cross-vehicle test.

Aerial-to-ground is difficult

Training on drone images and rendering ground views causes a large degradation for both static and dynamic NVS methods.

Pose estimation remains open

Feed-forward pose estimators trail COLMAP under wide-baseline cross-platform localization.

Benchmark

Interactive table of the paper's main NVS results. Use the tabs to switch training platforms. Higher is better for PSNR and SSIM; lower is better for LPIPS.

Table values are copied from the main paper. Feed-forward methods use 12 context views; 2-view and 6-view ablations are not included here.
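For readers unfamiliar with the metrics, PSNR is the simplest of the three to state. The sketch below uses the standard definition, 10·log10(MAX² / MSE), in NumPy. The function name and the assumption that images are scaled to [0, 1] are illustrative and may differ from the benchmark's evaluation scripts.

```python
import numpy as np


def psnr(rendered, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    rendered = np.asarray(rendered, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if rendered.shape != target.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((rendered - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


# Toy example: a uniform error of 0.1 on a [0, 1] image gives about 20 dB.
target = np.zeros((4, 4, 3))
rendered = np.full((4, 4, 3), 0.1)
print(round(psnr(rendered, target), 2))
```

Halving the per-pixel error raises PSNR by about 6 dB, which is why higher values indicate better renderings.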

Demo

MV² Viewer

Interactive reconstruction preview

The interactive demo will land here once the hosted viewer is ready. The scene toggle already tracks the showcase sequences on this page.


Hosted interactive viewer link: https://example.com/mv2-viewer

Resources

Dataset, code, and paper entry points for the MV² release.

Dataset

Train/test lists, calibrated images, camera poses, and correspondence annotations for the benchmark release.

Dataset link

Code

Evaluation scripts, split readers, metric computation, and baseline configuration files.

GitHub repo

Paper

Paper PDF and supplementary material for the ECCV release.

Paper link