t-SNE projection of learned scenario embeddings (contours indicate density). Dataset proximity (e.g., ETH/inD, INTERACTION/WOMD) indicates potential for cross-dataset pretraining or knowledge transfer.
The growing availability of trajectory datasets has fueled major advances in data-driven motion prediction. Yet models trained on one dataset often fail to generalize beyond their training domain because of differences in scene layouts, agent behaviors, and sensing conditions. This work presents a framework that learns latent representations of datasets and quantifies their similarity using distributional metrics. The large-scale study covers 24 major datasets, including the most widely used motion-prediction benchmarks, and shows that the resulting transferability scores correlate strongly with cross-dataset model performance. The results provide practical guidance for dataset selection, pretraining, and large-scale foundation models for motion prediction, paving the way toward more generalizable and robust predictive systems.
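To make the idea of comparing datasets through distributional metrics on latent embeddings concrete, here is a minimal sketch. It is not the method from the paper: the abstract does not say which metrics are used, so the squared maximum mean discrepancy (MMD) with an RBF kernel stands in as one common choice. The function name `rbf_mmd2`, the median-heuristic bandwidth, and the random arrays standing in for learned scenario embeddings are all assumptions made for illustration.

```python
from typing import Optional

import numpy as np


def rbf_mmd2(x: np.ndarray, y: np.ndarray, bandwidth: Optional[float] = None) -> float:
    """Biased (V-statistic) estimate of squared MMD between two embedding sets.

    x has shape (n, d) and y has shape (m, d). Hypothetical helper; MMD is an
    assumed stand-in for the unspecified distributional metric.
    """
    z = np.vstack([x, y])
    # Pairwise squared Euclidean distances over the pooled sample.
    sq_norms = np.sum(z**2, axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T, 0.0)
    if bandwidth is None:
        # Median heuristic over off-diagonal distances (an assumption).
        off_diag = d2[~np.eye(len(z), dtype=bool)]
        bandwidth = float(np.sqrt(0.5 * np.median(off_diag)))
        if bandwidth == 0.0:
            bandwidth = 1.0  # degenerate case: all points identical
    k = np.exp(-d2 / (2.0 * bandwidth**2))
    n = len(x)
    k_xx, k_yy, k_xy = k[:n, :n], k[n:, n:], k[:n, n:]
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())


# Placeholder embeddings: random vectors, NOT real dataset embeddings.
rng = np.random.default_rng(0)
emb_a = rng.normal(0.0, 1.0, size=(300, 16))  # hypothetical dataset A
emb_b = rng.normal(0.2, 1.0, size=(300, 16))  # hypothetical dataset close to A
emb_c = rng.normal(2.0, 1.0, size=(300, 16))  # hypothetical dataset far from A

# A shared bandwidth keeps the two distances comparable.
print("A vs B:", rbf_mmd2(emb_a, emb_b, bandwidth=4.0))
print("A vs C:", rbf_mmd2(emb_a, emb_c, bandwidth=4.0))
```

Under this sketch, a smaller value means the two embedding distributions are closer, which is the kind of signal one might read as higher potential for transfer. How such a distance maps to the transferability scores reported in the study is not specified here.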
The code will be released here soon. Stay tuned!
@inproceedings{westny2026unveiling,
title = {Unveiling Transferability in Trajectory Prediction via Latent Scene Embeddings},
author = {Westny, Theodor and Axelsson, David and Olofsson, Bj{\"o}rn and Frisk, Erik},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2026}
}