Fast Large-Scale Trajectory Clustering

Abstract

In this paper, we study the problem of large-scale trajectory data clustering, k-paths, which aims to efficiently identify k representative paths in a road network. Unlike traditional clustering approaches that require multiple data-dependent hyperparameters, k-paths can be used for visual exploration in applications such as traffic monitoring, public transit planning, and site selection. By combining map matching with an efficient intermediate representation of trajectories and a novel edge-based distance (EBD) measure, we present a scalable clustering method to solve k-paths. Experiments verify that we can cluster millions of taxi trajectories in less than one minute, achieving improvements of up to two orders of magnitude over state-of-the-art solutions that solve similar trajectory clustering problems.

Publication
In 46th International Conference on Very Large Data Bases (VLDB)