We implemented a map-matching module that takes position traces and a digital map, then finds the most likely sequence of segments taken by the vehicle, along with the points in the trace where the vehicle transitioned from one segment to another. The module uses a modified shortest path algorithm to find a minimum-error path from the nearest starting intersection to the nearest ending intersection. The error at each intersection in the path is the closest distance between the intersection and the position trace. Since the intersection locations and the position traces are inaccurate, the map matcher will not necessarily give correct results, but in practice it met our needs.
The map matcher lets the system focus on a single segment at a time, but the problem of modeling the lane structure for that segment remains. If we assume that vehicles in the center of different lanes will always be a certain distance apart, we can use a clustering technique to separate positions into lanes. Spatially clustering the positions using an algorithm such as k-means will not work, because two points in the same lane may be spatially distant. However, if two points are less than half a lane apart, they are probably in the same lane. Similarly, if a point is within half a lane of any point in a lane cluster, it is probably in that lane.
We could use this intuition to ``grow'' lanes by initializing each point to be its own lane, then merging lanes where a point from one lane is within half a lane of a point from the other. This algorithm is similar to hierarchical agglomerative clustering [6], but it represents the clusters by their constituent points instead of a statistical average. However, this algorithm does not take advantage of the knowledge that lanes are constrained to be parallel to each other and the segment centerline. Figure 2 shows a representative road segment and its parallel lanes. If we had an accurate representation of the segment centerline, the perpendicular distance, or offset, between a lane and the centerline should be constant. This allows us to represent a lane with a single value, substantially reducing the dimensionality of the space of lane models.
![]() |
The challenge in this approach lies in finding a sufficiently accurate segment centerline. The Navigation Technologies digital map includes shape information on segments, represented as a sequence of two or more points consisting of latitude/longitude pairs. If we use the piecewise linear curve connecting these points as the road centerline, we can compute offsets and cluster them. Unfortunately, experimental evidence shows that lanes are far from parallel to this curve, because the digital map is not sufficiently accurate. For example, an analysis of a sample trace with no lane-changing shows offsets from -20 to 10 meters from the digital map centerline.
It may also be possible to use one of the position traces itself as a sufficiently accurate approximation of the segment centerline. For our purposes, the centerline does not need to follow the center of the pavement. Instead, any curve parallel to all the lanes suffices. To select the best trace, we could try all available traces and select the one that scores best on an evaluation metric of the resulting clusters. The standard deviation of each cluster is a plausible metric. The standard deviation for a correct clustering should be near the position accuracy, because all the points, except for points during lane changing, center around a lane. Clustering on uniformly distributed points produces the highest possible standard deviation, because points are just as likely to be far from as cluster as near. However, even the best trace for clustering according to this metric may not be suitable, because all traces may change lanes at some point, or all traces may be noisy. Clearly, a superior approach will take advantage of the parallel structure of lanes without relying exclusively on an inaccurate digital map or a single inaccurate position trace.