Optimizing traffic with CNN-based visual object analysis

5 June 2024

Matthijs Zwemer's PhD research focuses on the visual analysis of road vehicle movements in urban and highway traffic. His work aims to optimize traffic flow and ensure safety through real-time traffic surveillance, which is crucial for Smart City applications. Zwemer's thesis demonstrates that CNN models are essential for advanced visual object analysis. Recent advancements in CNN-based techniques and embedded GPU hardware have enabled real-time, low-power systems for traffic surveillance. These systems enhance traffic enforcement, safety, and flow efficiency. Future large-scale deployments are expected to improve transportation in Smart Cities, offering better control during events and emergencies with minimal impact.

The thesis sets out to design and develop the fundamental processing stages (object detection, localization, classification, tracking and re-identification) based on deep learning with CNN models. With the developed algorithm stages and network models, three industrial applications are constructed and evaluated, each with a system architecture matching the requirements of its case study. The first application introduces a Make-Model Recognition (MMR) system that detects vehicles and classifies their make and model. The second application focuses on vessel-speed enforcement, measuring vessel speeds between distant camera locations. The third application presents a versatile detection model with hierarchical classification for various industrial uses, including surveillance, abnormal behavior detection, and traffic flow improvement at urban traffic lights by prioritizing groups of cyclists.

All studied traffic surveillance systems start object analysis with an object detection and localization stage. The initial detection model utilizes manually crafted features and a sliding window-based linear classifier for object localization, suitable for fixed camera viewpoints and constrained object appearances. However, for unconstrained viewpoints and objects with varied visual appearances, a CNN-based SSD model is employed. This model is trained for vessel and traffic detection, achieving high detection performance.

Simultaneous detection and classification

The second stage focuses on object classification, initially using the AlexNet network for detailed vehicle make and model identification. As the study progresses, hierarchical classification is added to the SSD detection model, enabling simultaneous detection and classification. This enhancement sorts objects with greater granularity, scoring 74% mAP for challenging cases and 82.2% mAP for easier ones when tested on traffic surveillance datasets.
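One common way to read a hierarchical classification output is sketched below. It is an illustration under stated assumptions, not the formulation used in the thesis: the label tree, the per-node conditional probabilities and the helper names (`PARENT`, `path_probability`, `most_specific_label`) are all hypothetical. The idea is that each label's confidence is the product of the conditional probabilities along its path to the root, so the system can report the most detailed label it is still confident about and fall back to a coarser one otherwise.

```python
from typing import Dict, Optional

# Hypothetical label tree: each label maps to its parent (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "vehicle": None,
    "car": "vehicle",
    "truck": "vehicle",
    "articulated_truck": "truck",
    "single_unit_truck": "truck",
}


def depth(label: str) -> int:
    """Number of steps from the label up to the root of the tree."""
    steps = 0
    while PARENT[label] is not None:
        label = PARENT[label]
        steps += 1
    return steps


def path_probability(label: str, conditional: Dict[str, float]) -> float:
    """Multiply the conditional probabilities from the label up to the root."""
    probability = 1.0
    node: Optional[str] = label
    while node is not None:
        probability *= conditional[node]
        node = PARENT[node]
    return probability


def most_specific_label(conditional: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the deepest label whose path probability reaches the threshold."""
    confident = []
    for label in PARENT:
        probability = path_probability(label, conditional)
        if probability >= threshold:
            confident.append((depth(label), probability, label))
    if not confident:
        return None
    # Prefer the deepest label; break ties by the higher probability.
    return max(confident)[2]


# Assumed example scores for one detected object (not taken from the thesis).
scores = {
    "vehicle": 0.9,
    "car": 0.2,
    "truck": 0.8,
    "articulated_truck": 0.7,
    "single_unit_truck": 0.3,
}
print(most_specific_label(scores, threshold=0.5))
print(most_specific_label(scores, threshold=0.6))
```

With a lower threshold the sketch reports the fine-grained label, and with a stricter one it falls back to the coarser parent class, which is one way a single model can serve applications that need different levels of detail.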

Object tracking relies on a deliberately simple method that runs in real time. It complements the SSD detection model, which cannot keep up with typical surveillance camera frame rates.
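The interplay between a slow detector and a fast tracker can be sketched as follows. This is a minimal illustration, assuming a single object, a detector that runs only on every `detect_every`-th frame, and constant-velocity extrapolation in between; the thesis does not specify its tracking method at this level of detail, and the names `track_single_object` and `shift` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height


def shift(box: Box, velocity: Tuple[float, float]) -> Box:
    """Move a box by one frame's worth of estimated motion."""
    x, y, w, h = box
    return (x + velocity[0], y + velocity[1], w, h)


def track_single_object(
    frames: Iterable[object],
    detect: Callable[[object], Box],
    detect_every: int,
) -> List[Box]:
    """Run the detector on every detect_every-th frame and extrapolate otherwise."""
    boxes: List[Box] = []
    last_detection = None
    last_detection_index = 0
    velocity = (0.0, 0.0)
    for index, frame in enumerate(frames):
        if index % detect_every == 0:
            # Expensive step: call the detector and refresh the motion estimate.
            box = detect(frame)
            if last_detection is not None:
                gap = index - last_detection_index
                velocity = (
                    (box[0] - last_detection[0]) / gap,
                    (box[1] - last_detection[1]) / gap,
                )
            last_detection, last_detection_index = box, index
        else:
            # Cheap step: assume the object keeps moving at the same velocity.
            box = shift(boxes[-1], velocity)
        boxes.append(box)
    return boxes


# Stand-in detector for illustration: the object moves 10 pixels per frame.
def fake_detect(frame_number: int) -> Box:
    return (10.0 * frame_number, 0.0, 5.0, 5.0)


for box in track_single_object(range(7), fake_detect, detect_every=3):
    print(box)
```

Raising `detect_every` lowers the compute load per frame at the cost of more extrapolation error between detections, which is the trade-off a real system has to tune. A multi-object system would additionally need to associate detections with existing tracks, which this sketch leaves out.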

The object re-identification stage aims to track unique vessels across distant camera locations. The proposed tracklet-based querying alone elevates the score from 68.9% to 74.5%, while additional refinements in tracklet size and pre-filtering based on travel time and direction increase the performance to 88.9% for final re-identification.
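Pre-filtering on travel time and direction can be illustrated with a short sketch. It assumes that the distance between the two cameras is known and that plausible vessel speeds lie within a fixed range; the `Sighting` fields, the function name `plausible_candidates` and all numeric values are hypothetical and do not come from the thesis. Only the candidates that survive this filter would then be compared visually.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sighting:
    tracklet_id: str    # hypothetical identifier of a tracklet at one camera
    timestamp_s: float  # time of the sighting, in seconds
    heading: str        # e.g. "upstream" or "downstream"


def plausible_candidates(
    query: Sighting,
    gallery: List[Sighting],
    distance_m: float,
    min_speed_mps: float,
    max_speed_mps: float,
) -> List[Sighting]:
    """Keep sightings whose travel time and direction are consistent with the query."""
    shortest_time = distance_m / max_speed_mps  # fastest plausible vessel
    longest_time = distance_m / min_speed_mps   # slowest plausible vessel
    kept = []
    for sighting in gallery:
        travel_time = sighting.timestamp_s - query.timestamp_s
        same_direction = sighting.heading == query.heading
        if same_direction and shortest_time <= travel_time <= longest_time:
            kept.append(sighting)
    return kept


# Assumed example: cameras 3 km apart, speeds between 1 and 10 m/s.
query = Sighting("q1", 0.0, "downstream")
gallery = [
    Sighting("a", 600.0, "downstream"),    # plausible travel time and direction
    Sighting("b", 600.0, "upstream"),      # wrong direction
    Sighting("c", 50.0, "downstream"),     # arrived implausibly early
    Sighting("d", 10000.0, "downstream"),  # arrived implausibly late
]
for candidate in plausible_candidates(query, gallery, 3000.0, 1.0, 10.0):
    print(candidate.tracklet_id)
```

Shrinking the gallery this way removes candidates that cannot physically be the same vessel, so the appearance-based matching has fewer opportunities to pick a look-alike.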

For the above research, several semi-automated methods are proposed for collecting relevant data for training and evaluation of the systems. This has resulted in datasets for evaluating a vehicle make and model recognition system (650k images), a vessel detection system (48k images) and a vessel re-identification system (136k images). The last part of the research investigates using different combinations of existing datasets for training, without the requirement to align the labels to the same detail level of classification.

Innovative traffic detection and tracking

In conclusion, the thesis makes three contributions. First, the research offers a thorough evaluation of multiple one-shot detection solutions for several specific traffic surveillance applications in enforcement and safety. Second, fine-grained classification is learned and evaluated for vehicle make and model recognition, and multiple levels of detail are integrated by means of a hierarchical classification structure in the SSD detection model, using semi-supervised learning. Third, object tracking is exploited to facilitate real-time throughput of the overall industrial/professional application, mainly by efficiently trading off tracking effort against renewed detection. The fundamental processing stages and the designed models for detection, localization and recognition are suitable for reuse in similar alternative applications.

Media Contact

Rianne Sanders

(Communications Advisor ME/EE)

J.J.M.Sanders@tue.nl
