Contact: Institute of Computing Technology, Chinese Academy of Sciences, No.6 Kexueyuan South Road Zhongguancun, Haidian District Beijing, 100190, China
I am an assistant professor with Institute of Computing Technology, Chinese Academy of Sciences. I received my PhD degree from University of Chinese Academy of Sciences in 2019 under guidance of Prof. Zhaoqi Wang. I was a visiting professor in Prof. Zhigang Deng lab in University of Houston from Oct. 2015 to Dec. 2018.
My research primarily centers on computer vision and computer graphics, especially visual traffic simulation, trajectory prediction with deep learning and autonomous driving.
This work presents a novel First-person View based Trajectory predicting model (FvTraj) to estimate the future trajectories of pedestrians in a scene given their observed trajectories and the corresponding first-person view images. First, we render first-person view images using our in-house built First-person View Simulator (FvSim), given the ground-level 2D trajectories. Then, based on multi-head attention mechanisms, we design a social-aware attention module to model social interactions between pedestrians, and a view-aware attention module to capture the relations between historical motion states and visual features from the first-person view images. Our results show the dynamic scene contexts with ego-motions captured by first-person view images via FvSim are valuable and effective for trajectory prediction. Using this simulated first-person view images, our well structured FvTraj model achieves state-of-the-art performance.
Most of existing traffic simulation methods have been focused on simulating vehicles on freeways or city-scale urban networks. However, relatively little research has been done to simulate intersectional traffic to date despite its obvious importance in real-world traffic phenomena. In this paper we propose a novel deep learning-based framework to simulate and edit intersectional traffic. Specifically, based on an in-house collected intersectional traffic dataset, we employ the combination of convolution network (CNN) and recurrent network (RNN) to learn the patterns of vehicle trajectories in intersectional traffic. Besides simulating novel intersectional traffic, our method can be used to edit existing intersectional traffic. Through many experiments as well as comparison user studies, we demonstrate that the results by our method are visually indistinguishable from ground truth and perform better than other methods.
Accurate vehicle trajectory prediction can benefit many Intelligent Transportation System (ITS) applications such as traffic simulation and advanced driver assistance system. This ability is pronounced with the emergence of autonomous vehicles, as they require the prediction of nearby agents' trajectories to navigate safely and efficiently. Recent studies based on deep learning have greatly improved prediction accuracy. However, one prominent issue is that these models often lack explainability. We alleviate this issue by proposing STA-LSTM, an LSTM model with spatial-temporal attention mechanisms. STA-LSTM not only outperforms other state-of-the-art models in prediction accuracy but also identifies the influence of historical trajectories and neighboring vehicles on the target vehicle via spatial-temporal attention weights. We provide analyses of the learned attention weights in various traffic scenarios based on target vehicle class, target vehicle location, and traffic density. An analysis showing that STA-LSTM can capture fine-grained lane-changing behaviors is also provided.
Virtualized traffic via various simulation models and real-world traffic data are promising approaches to reconstruct detailed traffic flows. A variety of applications can benefit from the virtual traffic, including, but not limited to, video games, virtual reality, traffic engineering, and autonomous driving. In this survey, we provide a comprehensive review on the state-of-the-art techniques for traffic simulation and animation. We start with a discussion on three classes of traffic simulation models applied at different levels of detail. Then, we introduce various data-driven animation techniques, including existing data collection methods, and the validation and evaluation of simulated traffic flows. Next, We discuss how traffic simulations can benefit the training and testing of autonomous vehicles. Finally, we discuss the current states of traffic simulation and animation and suggest future research directions.
Trajectory prediction for objects is challenging and critical for various applications (e.g., autonomous driving, and anomaly detection). Most of the existing methods focus on homogeneous pedestrian trajectories prediction, where pedestrians are treated as particles without size. However, they fall short of handling crowded vehicle-pedestrian-mixed scenes directly since vehicles, limited with kinematics in reality, should be treated as rigid, non-particle objects ideally. In this paper, we tackle this problem using separate LSTMs for heterogeneous vehicles and pedestrians. Specifically, we use an oriented bounding box to represent each vehicle, calculated based on its position and orientation, to denote its kinematic trajectories. We then propose a framework called VP-LSTM to predict the kinematic trajectories of both vehicles and pedestrians simultaneously. In order to evaluate our model, a large dataset containing the trajectories of both vehicles and pedestrians in vehicle-pedestrian-mixed scenes is specially built. Through comparisons between our method with state-of-the-art approaches, we show the effectiveness and advantages of our method on kinematic trajectories prediction in vehicle-pedestrian-mixed scenes.
Human trajectory prediction is challenging and critical in various applications (e.g., autonomous vehicles and social robots). Because of the continuity and foresight of the pedestrian movements, the moving pedestrians in crowded spaces will consider both spatial and temporal interactions to avoid future collisions. However, most of the existing methods ignore the temporal correlations of interactions with other pedestrians involved in a scene. In this work, we propose a Spatial-Temporal Graph Attention network (STGAT), based on a sequence-to-sequence architecture to predict future trajectories of pedestrians. Besides the spatial interactions captured by the graph attention mechanism at each time-step, we adopt an extra LSTM to encode the temporal correlations of interactions. Through comparisons with state-of-the-art methods, our model achieves superior performance on two publicly available crowd datasets (ETH and UCY) and produces more "socially" plausible trajectories for pedestrians.
In this paper, we propose a new data-driven model to simulate the process of lane-changing in traffic simulation. Specifically, we first extract the features from surrounding vehicles that are relevant to the lane-changing of the subject vehicle. Then, we learn the lane-changing characteristics from the ground-truth vehicle trajectory data using randomized forest and back-propagation neural network algorithms. Our method can make the subject vehicle to take account of more gap options on the target lane to cut in as well as achieve more realistic lane-changing trajectories for the subject vehicle and the follower vehicle. Through many experiments and comparisons with selected state-of-the-art methods, we demonstrate that our approach can soundly outperform them in terms of the accuracy and quality of lane-changing simulation. Our model can be flexibly used together with a variety of existing car-following models to produce natural traffic animations in various virtual environments.