Next: Supervised Neural Network Classifier Up: Control Identification on Individual Previous: Collection of Driver Behavior

   
Computation and Preprocessing of Segment Statistics

We took the following steps to preprocess segment data and compute segment statistics:


  
Figure 2: This graph of driver speed along a road segment shows that speed estimates derived from GPS data exhibit noise. The system records a 'stop' when speed drops below a threshold, rather than when it reaches zero, both to compensate for this noise and to detect stops that are shorter than the 1-second GPS sampling interval (1 Hz sampling rate).
\includegraphics[width=5in]{chris-metric.eps}

1.
Ensure consistency of data: The data preprocessor, composed of several UNIX shell scripts and C programs, excluded data from traversals in which the driver turned at a four-way intersection. Drivers can behave differently when turning than when continuing straight through an intersection. For example, at some stoplights in California it is legal to turn right on a red light after making a stop.

2.
Detect stops: The data preprocessor detected a stop whenever a driver's speed dropped below a threshold. It used a threshold greater than zero to compensate for variability in the GPS position estimate and to allow detection of stops that were shorter than the 1-second position sampling interval (1 Hz sampling rate).

3.
Record stop data: The data preprocessor counted the stops in each traversal and recorded their number and duration. It also recorded the stop times of the three stops closest to the end of the segment.

4.
Match data to ground truth: Drivers drove on some roads for which we did not possess ground truth; we did not use this data. The preprocessor selected those segments for which we had ground truth from files containing segment traversal data.

5.
Average traversal data: The preprocessor computed the mean and standard deviation of each measurement, along with the percentage of traversals in which at least one stop occurred. If there were 5 or more samples, it created an input instance from these statistics. That is, a single input instance comprises the statistics computed from all the different traversals of a given road segment in a particular direction.

6.
Reject known ambiguous instances: The preprocessor filtered out instances that exhibited infrequent stops. It removed cases where the percentage of traversals with a stop was greater than zero but less than 30%. Such instances are ambiguous. We explain why, and discuss how to reduce this ambiguity, in §4.

7.
Exclude intersections with short segments from training: Some intersections included very short segments on the digital road map. It is difficult to correctly assign GPS data to road segments in intersections with short segments. We excluded this data from training, but allowed it during testing.
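
The stop-detection, aggregation, and filtering steps above (steps 2, 3, 5, and 6) can be illustrated with a minimal Python sketch. This is not the original preprocessor, which consisted of shell scripts and C programs. The function names, the dictionary layout, and the 2.0 m/s threshold are hypothetical, since the text does not state the threshold used. Two further assumptions: stop duration is approximated as the number of consecutive below-threshold samples times the 1-second sampling interval, and the sample (n - 1) standard deviation is used. Only stop counts are aggregated here, as a stand-in for "each measurement".

```python
from statistics import mean, stdev

STOP_SPEED_THRESHOLD = 2.0  # m/s; hypothetical, the source gives no value
SAMPLE_PERIOD_S = 1.0       # 1 Hz GPS sampling, as stated in the text
MIN_SAMPLES = 5             # step 5: at least 5 samples per instance
AMBIGUOUS_MAX_PCT = 30.0    # step 6: reject when 0% < stop percentage < 30%


def detect_stops(speeds, threshold=STOP_SPEED_THRESHOLD, period=SAMPLE_PERIOD_S):
    """Return (start_time_s, duration_s) for each run of below-threshold samples."""
    stops = []
    run_start = None
    for i, speed in enumerate(speeds):
        if speed < threshold:
            if run_start is None:
                run_start = i  # a new stop begins at this sample
        elif run_start is not None:
            stops.append((run_start * period, (i - run_start) * period))
            run_start = None
    if run_start is not None:  # trace ended while still below the threshold
        stops.append((run_start * period, (len(speeds) - run_start) * period))
    return stops


def last_three_stop_times(stops):
    """Start times of the (up to) three stops closest to the end of the segment."""
    return [start for start, _ in stops[-3:]]


def build_instance(traversals, min_samples=MIN_SAMPLES):
    """Aggregate all speed traces of one segment in one direction.

    Returns None when there are too few samples to form an input instance.
    """
    if len(traversals) < min_samples:
        return None
    stop_counts = [len(detect_stops(trace)) for trace in traversals]
    with_stop = sum(1 for count in stop_counts if count > 0)
    return {
        "n": len(traversals),
        "mean_stops": mean(stop_counts),
        "std_stops": stdev(stop_counts),  # assumption: sample standard deviation
        "pct_with_stop": 100.0 * with_stop / len(traversals),
    }


def is_ambiguous(instance, max_pct=AMBIGUOUS_MAX_PCT):
    """True when stops occur in some, but fewer than max_pct percent of, traversals."""
    return 0.0 < instance["pct_with_stop"] < max_pct


if __name__ == "__main__":
    moving = [10.0, 9.0, 10.0, 11.0]
    stopping = [10.0, 1.0, 0.5, 8.0]
    instance = build_instance([stopping] + [moving] * 4)
    print(instance, is_ambiguous(instance))
```

In this sketch an instance built from one stopping traversal and four non-stopping ones has a stop percentage of 20%, so `is_ambiguous` flags it for removal, whereas an instance with no stops at all or with stops on every traversal is kept.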


Seth Rogers
1998-11-20