5. PRACTICAL EXPERIENCE

5.4. Factors that affect the accuracy of the measurements taken from video

5.4.1. Video recording quality

Scale

Scale is an important property of the image: it describes how large an area in the real world is covered by one pixel. This value is usually not constant, since more remote objects appear smaller in the image and thus occupy fewer pixels. If the number of pixels representing an object is too low, the object may be classified as “noise” and therefore go undetected. This becomes crucial when an object is partly occluded and occupies even fewer pixels. The scale can be increased by moving the camera closer to the scene, but this decreases the total area covered by the image. Another option is to increase the resolution of the images, but this also means more data to store and analyse.
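As a rough illustration of this trade-off, the following sketch (not from the source; the camera height, lens and sensor values are hypothetical) estimates the ground area covered by one pixel for a simplified camera looking straight down, and how many pixels a road user then occupies:

def metres_per_pixel(height_m, focal_mm, sensor_width_mm, image_width_px):
    """Ground distance covered by one pixel for a nadir-looking camera."""
    ground_width_m = height_m * sensor_width_mm / focal_mm  # similar triangles
    return ground_width_m / image_width_px

# Hypothetical set-up: camera 20 m above the road, 8 mm lens,
# 6.4 mm wide sensor, 640 pixels image width.
scale = metres_per_pixel(20, 8, 6.4, 640)
print(f"scale: {scale:.3f} m/pixel")             # 0.025 m/pixel
# A pedestrian about 0.5 m wide then spans some 20 pixels; doubling the
# camera height halves that, and a partly occluded pedestrian may fall
# below a blob-size threshold and be rejected as noise.
print(f"pedestrian width: {0.5 / scale:.0f} px")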

Frame rate and exposure time

A digital video is a sequence of images taken one by one at a high frequency. The frame rate characterises how many images (frames) are recorded per unit of time (usually 1 second). Again, a higher frame rate means more data, but it also ensures more stable detections. If the frame rate is low, a fast-moving object can cover a significant distance between frames, and tracking errors, such as losing an object, detecting a new one in the middle of the scene, or mixing up two objects, become more probable.

The maximum frame rate is limited by the exposure time required for one frame to be recorded. When the ambient lighting is low (in twilight or at night), the required exposure time is longer, and the frame rate is normally set lower than what is possible in the daytime.
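The relationship between speed, frame rate and exposure can be shown with some back-of-the-envelope arithmetic (the numbers below are illustrative, not from the source):

def displacement_per_frame(speed_kmh, fps):
    """Metres travelled by a road user between two consecutive frames."""
    return (speed_kmh / 3.6) / fps

def max_frame_rate(exposure_time_s):
    """Each frame must finish exposing before the next one starts, so the
    exposure time puts an upper bound on the frame rate."""
    return 1.0 / exposure_time_s

print(displacement_per_frame(50, 25))  # ~0.56 m between frames at 25 fps
print(displacement_per_frame(50, 5))   # ~2.8 m at 5 fps: gaps this large
                                       # make tracking errors more probable
print(max_frame_rate(0.04))            # a 40 ms night-time exposure caps
                                       # the frame rate at 25 fps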

Colour vs. grey-scale imagery

Modern camera equipment most often records video in colour. Colour is important if the recordings are meant to be checked visually by an observer, since it gives a better representation of the scene. However, video analysis performs quite well on grey-scale video, and the extra information provided by colour does not improve the quality of the results much, at least for traffic scenes where grey tones dominate.
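A common consequence of this observation is to convert frames to grey-scale before analysis, which also reduces the data volume. A minimal OpenCV sketch, assuming a hypothetical recording traffic.avi:

import cv2

cap = cv2.VideoCapture("traffic.avi")   # hypothetical recording
while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # drop the colour channels
    # ... detection and tracking would operate on `grey` from here on ...
cap.release()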

Atmospheric and lighting conditions

Atmospheric conditions like rain, snow or mist have a considerable effect on the quality of the recorded video, as they usually make the image less clear and thus the detection more difficult. When installing the camera, it is also important to consider how the light will change during the day. Sun glare in the morning and evening can make the video data completely unusable, while low contrast in dark shadows may affect the quality of the detections. Analysis of video recorded in the dark requires modification of the algorithms: road users are then often poorly lit, while the light patches on the asphalt in front of vehicles with their headlamps on may be detected as moving objects.

5.4.2. Video processing algorithms

Detection and tracking of road users

Video analysis algorithms require quite a few parameters to be set, e.g. the thresholds that separate the detection of a road user from noise, size parameters for the separation of individual road users and for their classification by type, etc. Parameters set for certain conditions might no longer be optimal if the conditions change.
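The sketch below illustrates the kind of parameters meant here (it is not the authors' implementation): a background-subtraction detector in OpenCV where one threshold rejects small blobs as noise and another gives a crude classification by size. All threshold values are hypothetical and would need re-tuning when conditions change.

import cv2

MIN_AREA_PX = 50    # blobs smaller than this are treated as noise
CAR_AREA_PX = 800   # blobs larger than this are classified as vehicles

subtractor = cv2.createBackgroundSubtractorMOG2()
cap = cv2.VideoCapture("traffic.avi")   # hypothetical recording
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)      # foreground (moving) pixels
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        area = cv2.contourArea(c)
        if area < MIN_AREA_PX:
            continue                    # rejected as noise
        label = "vehicle" if area > CAR_AREA_PX else "pedestrian/cyclist"
        x, y, w, h = cv2.boundingRect(c)
        # (x, y, w, h, label) would be handed over to the tracker
cap.release()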

When a road user is occluded by some element of road furniture or by another road user, most tracking algorithms break the trajectory and start a new one when the road user becomes visible again. Some techniques (for example, the Kalman filter) allow the separate pieces of a trajectory to be connected, but it is still not unusual for the same road user to be lost and detected again, each time with a new identity.
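A minimal sketch of the idea, assuming a constant-velocity motion model and illustrative noise levels: the filter keeps predicting the position while the road user is occluded, and a detection close to the prediction can then be attached to the old track instead of starting a new identity.

import numpy as np

dt = 0.04                            # frame interval, assuming 25 fps
F = np.array([[1, 0, dt, 0],         # constant-velocity model,
              [0, 1, 0, dt],         # state = [x, y, vx, vy]
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)
H = np.array([[1, 0, 0, 0],          # only the position is observed
              [0, 1, 0, 0]], float)
Q = np.eye(4) * 0.01                 # process noise (illustrative)
R = np.eye(2) * 0.25                 # measurement noise (illustrative)

x = np.array([0.0, 0.0, 10.0, 0.0])  # road user at origin, 10 m/s along x
P = np.eye(4)

def predict():
    global x, P
    x = F @ x
    P = F @ P @ F.T + Q

def update(z):
    global x, P
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P

for _ in range(10):                  # road user occluded for 10 frames:
    predict()                        # keep extrapolating the track

detection = np.array([4.1, 0.05])    # road user reappears near the prediction
if np.linalg.norm(detection - H @ x) < 1.0:  # gating threshold (illustrative)
    update(detection)                # same identity, trajectory reconnected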

Accuracy of the position and speed estimates

The accuracy of position and speed estimates is a very important issue for studying interactions between road users. As discussed in chapter 4, a description of an interaction requires complex indicators that are seldom measurable directly but are calculated from the speed and position of one or both road users. Inaccuracy in the “raw” data may have a substantial effect on the values of the indicators calculated from it. This is illustrated by the example in Figure 32, which shows calculations of Time-to-Collision (TTC) for an encounter between a pedestrian and a car. The calculations are performed for the “true” pedestrian trajectory (a in Figure 32a) and for several trajectories with introduced position errors, generated by shifting the “true” trajectory by ±1 and ±2 metres (b-e in Figure 32a). Figure 32b shows the TTC curves calculated for the different pedestrian trajectories. The error in position affects both the TTC values and how long the road users are considered to be on a collision course.

[Figure 32: a) plan view of the trajectories (axes X, m and Y, m); legend: a = “true” position; b, c = ±1 m error; d, e = ±2 m error. b) TTC (sec.) plotted against time (sec.) for trajectories a-e.]

Figure 32. The effects of position accuracy on Time-to-Collision (TTC) estimates: a) vehicle and pedestrian trajectories; b) TTC profiles calculated for different pedestrian trajectories.
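The calculation behind Figure 32 can be sketched in a simplified form. The code below is a reconstruction under simplifying assumptions, not the authors' method: it treats both road users as points moving at constant velocity and defines TTC as the time until they come within a given collision distance. Shifting the pedestrian's position by 1 m is enough to change the TTC value or to remove the collision course altogether.

import numpy as np

def ttc(p1, v1, p2, v2, collision_dist=1.0):
    """Time until the two road users come within collision_dist of each
    other, assuming constant speed and course; None if no collision course."""
    dp = np.asarray(p2, float) - np.asarray(p1, float)
    dv = np.asarray(v2, float) - np.asarray(v1, float)
    a = dv @ dv                      # solve |dp + t*dv| = collision_dist
    b = 2 * (dp @ dv)
    c = dp @ dp - collision_dist ** 2
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return None                  # not on a collision course
    t = (-b - np.sqrt(disc)) / (2 * a)
    return t if t >= 0 else None

car = ([0.0, 0.0], [10.0, 0.0])      # car driving at 10 m/s along x
ped = ([30.0, -4.0], [0.0, 1.4])     # pedestrian crossing at 1.4 m/s

print(ttc(*car, *ped))               # "true" trajectory: TTC ~ 2.9 s
print(ttc(*car, [30.0, -3.0], [0.0, 1.4]))  # 1 m position error: None,
                                            # no collision course detected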


Obtaining a correct position requires estimation of the “footprint” of a road user on the ground. This in turn requires restoring the road user’s 3-dimensional shape from its 2-dimensional representation in the video images, which is not possible without making certain assumptions. The simplest assumption is that road users are “flat” and lie on the road plane. In this case, the position of a road user is estimated simply as the middle point of the pixels representing it in the image and is transferred to real-world co-ordinates as shown in Figure 5. The cost of this simplicity, however, is a systematic error in position (Figure 33). The size of the error depends on factors such as the camera height above the scene, the height and orientation of the road user, and its distance from the camera. Generally, the error is greater for large vehicles (e.g. buses, lorries) than for small ones and increases as the road user moves away from the camera, i.e. it is not constant even for the same road user during one passage. Nonetheless, the error size does not change significantly between two adjacent frames, and therefore the speed estimates are not affected much.

[Figure 33: diagram marking the true position and the estimated position of a vehicle.]

Figure 33. The systematic error introduced by the assumption of “flat” road users.

Table 6. The errors in position estimation (m) caused by the “flat” road user assumption (vehicle seen strictly from the side).

                                                Camera height, m
Road user               Camera distance, m     15      20      25
Car                             20             1.3     0.9     0.7
(width 2.2 m,                   40             2.5     1.8     1.4
height 1.6 m)                   60             3.6     2.7     2.1
Bus                             20             2.1     1.5     1.2
(width 2.7 m,                   40             4.1     3.0     2.3
height 2.5 m)                   60             6.1     4.4     3.4

Table 6 shows the results of some simple calculations of the error size for a car and a bus, depending on the camera height and the distance to the vehicle. Here, it is assumed that the vehicle is seen strictly from the side, i.e. the vehicle's length has no effect on the error size.
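The numbers in Table 6 can be reproduced with a simple geometric model: a box-shaped vehicle seen strictly from the side, with the estimated position taken as the midpoint of the silhouette's projection onto the ground plane. The source does not state its projection model, so the sketch below is a reconstruction; under these assumptions it matches Table 6 to within rounding.

def flat_error(distance_m, cam_height_m, veh_width_m, veh_height_m):
    """Systematic position error (m) from treating a 3-D vehicle as flat."""
    near = distance_m - veh_width_m / 2      # nearest ground contact point
    far_top = distance_m + veh_width_m / 2   # farthest top corner: its line
    # of sight hits the ground plane beyond the vehicle
    far_on_ground = far_top * cam_height_m / (cam_height_m - veh_height_m)
    estimated = (near + far_on_ground) / 2   # midpoint of the silhouette's
    return estimated - distance_m            # ground projection

for name, width, height in (("car", 2.2, 1.6), ("bus", 2.7, 2.5)):
    for dist in (20, 40, 60):
        errors = ["%.1f" % flat_error(dist, h, width, height)
                  for h in (15, 20, 25)]
        print(name, "at", dist, "m:", errors)  # matches the rows of Table 6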

Algorithms that use pre-defined shapes of road users avoid this type of error. However, some inaccuracy is still possible if the real size of the road user differs from the size used in the model.