WSDM2021

Predicting Crowd Flows via Pyramid Dilated Deeper Spatial-temporal Network

Congcong Miao 1 Jiajun Fu 1 Jilong Wang 1 Heng Yu 1 Botao Yao 1 Anqi Zhong 1 Jie Chen 2 Zekun He 2
1Tsinghua University, China
2Tencent, China

Predicting crowd flows is crucial for urban planning, traffic management and public safety. However, predicting crowd flows is not trivial because of three challenges: 1) highly heterogeneous mobility data collected by various services; 2) complex spatio-temporal correlations of crowd flows, including multi-scale spatial correlations along with non-linear temporal correlations. 3) diversity in long-term temporal patterns. To tackle these challenges, we proposed an end-to-end architecture, called pyramid dilated spatial-temporal network (PDSTN), to effectively learn spatial-temporal representations of crowd flows with a novel attention mechanism. Specifically, PDSTN employs the ConvLSTM structure to identify complex features that capture spatial-temporal correlations simultaneously and then stacks multiple ConvLSTM units for deeper feature extraction. For further improving the spatial learning ability, a pyramid dilated residual network is introduced by adopting several dilated residual ConvLSTM networks to extract multi-scale spatial information. In addition, a novel attention mechanism, which considers both long-term periodicity and the shift in periodicity, is designed to study diverse temporal patterns. Extensive experiments were conducted on three highly heterogeneous real-world mobility datasets to illustrate the effectiveness of PDSTN beyond the state-of-the-art methods. Moreover, PDSTN provides intuitive interpretation into the prediction.