MATLAB implementation of ST-Matching Algorithm for map matching problem.
Map-matching is the process of aligning a sequence of observed user positions with the road network on a digital map. It is a fundamental pre-processing step for many applications, such as moving object management, traffic flow analysis, and driving directions. In practice there exists huge amount of low-sampling-rate (e.g., one point every 2-5 minutes) GPS trajectories. Unfortunately, most current map-matching approaches only deal with high-sampling-rate (typically one point every 10-30s) GPS data, and become less effective for low-sampling-rate points as the uncertainty in data increases. In this paper, we propose a novel global map-matching algorithm called ST-Matching for low-sampling-rate GPS trajectories.ST-Matching considers (1) the spatial geometric and topological structures of the road network and (2) the temporal/speed constraints of the trajectories. Based on spatio-temporal analysis, a candidate graph is constructed from which the best matching path sequence is identified. We compare ST-Matching with the incremental algorithm and Average-Fréchet-Distance (AFD) based global map-matching algorithm. The experiments are performed both on synthetic and real dataset. The results show that our ST-matching algorithm significantly outperform incremental algorithm in terms of matching accuracy for low-sampling trajectories. Meanwhile, when compared with AFD-based global algorithm, ST-Matching also improves accuracy as well as running time.
st-matching.m
There are two sets of input files:
- road network file (containing information of edges and nodes, usually from Open Street Map)
- gps trajectory files
Format of road network file:
EdgeID | Node1ID | Node2ID | Longitude of Node1 | Latitude of Node1 | Longitude of Node2 | Latitude of Node2 | Road Type |
---|
Format of gps trajactory files:
GPS Device ID | Timestamp | Longitude | Latitude | Speed (km/h) | Direction |
---|
define length of each cell(in km),then call splitRoadtoCell
code sample:
roadnetworkfilename = 'RoadNetwork_Beijing.txt';
cell_size = 0.1;
[road_network,speed_limits,road_cells,road_ids,grid_size] = splitRoad2Cell(roadnetworkfilename, cell_size);
output of this code include
- a matrix of raw road network road_network (m x 7,same as input road network file except last column)
- a vector of speed limist of each roads speed_limits translated from road type (last column of input road network)
- information of each cell road_cells (m x 5)
cellID | startLon | endLon | startLat | endLat |
---|
- size of grid grid_size (m x n cells from entire road network)
function: splitGPS2line
Splitting criteria:
- assume a standard sampling interval
$\Deltat_n$ and a standard length for each trajectory segment l - the length of a trajectory should be less or equal than l
$sum \Delta t <= 5\Deltat_n*l$ - cut as few trajectory as possible
code sample:
gpsfilename = 'GPS_Beijing.txt';
raw_gps_points = splitGPS2line(gpsfilename, 6, 5);
output result will be added as a tag additional to the original gps trajectory file, so that the format of raw_gps_points (m x 7) will be
GPS Device ID | Timestamp | Longitude | Latitude | Speed (km/h) | Direction | Tag |
---|
With the cutted road network, we first find candidate edges around raw sample points, then project sample points to that edge. (for speed consideration, only the top 5 closest candidate edges are considered)
- Script for this step can be seen in function getCandidatePoints.
- Line projection is vector based and can be seen in the sub-function project2Line
The entire road_network obtained in step 1 is considered too big for an individual trajectory, hence we implement a cutting on the original big road network.
- script for this step can be seen in function cutGridforTrajactory
- The cutted network contains all the candidate edges of trajectory points.
- a graph G is constructed for path search in the next step
- a table node_table recording relationship between original NodeID and actual node ID used in graph G is generated
see findMatchedSequence above for general procedure of this step. both spatial(transmission * observation) and temporal probability are calculated between candidate points.
In our experiment, we find some issue particularly at crossing roads. This is due to the imbalance of two probabilities. For example, when transmission probability is very close, observation probability become the dominant factor. We did a potential fix for simple crossing error.
see validation.m and picture in the result folder for detail. Sample points are plotted in blue points, while the matched points are plotted in red cross. The path between matched points are also plotted.