Improving the evidence base on journey time reliability on the Trunk Road Network in Scotland

4. Theoretical Background and Current Best Practice

4.1 The current appraisal framework

Section 9.2.2.9 of Transport Scotland's "STAG Technical Database Section 9" (Transport Scotland, 2012) expresses reliability for private road vehicles by means of the reliability ratio (RR), defined as:

Reliability Ratio = Value of SD of travel time / Value of travel time

We prefer to write that as:

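RR = Value of ΔSD of travel time / Value of ΔT

where:
ΔSD: a change in the standard deviation of travel time;
ΔT: an identical sized change in travel time.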

For example, we might have an estimate of the value of a travel time saving of £6/hour for some group. That means that the value of ΔT = -10 minutes is estimated at £1. We now need to know the value of reducing the standard deviation of travel times by the same 10 minutes. The consensus in the literature is that the RR is often around 0.8, in which case the value of reducing the standard deviation of travel times would be about £4.80 per hour, and so the value of reducing the SD by 10 minutes would be £0.80.
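The arithmetic of this example can be sketched in a few lines of Python. The function name and its arguments are illustrative assumptions and not part of STAG; the numbers are the example values used in the text.

```python
def value_of_sd_change(vot_per_hour, reliability_ratio, delta_sd_minutes):
    """Monetary value of a change in the SD of travel time.

    Illustrative helper (not a STAG function): the value of SD per hour
    is taken as RR times the value of travel time per hour.
    """
    value_of_sd_per_hour = reliability_ratio * vot_per_hour
    return value_of_sd_per_hour * delta_sd_minutes / 60.0


vot = 6.0  # value of travel time in GBP per hour (example value from the text)
rr = 0.8   # reliability ratio (example value from the text)

# Value of a 10-minute travel time saving
time_saving_value = vot * 10 / 60.0
# Value of a 10-minute reduction in the SD of travel time
sd_saving_value = value_of_sd_change(vot, rr, 10)

print(round(time_saving_value, 2), round(sd_saving_value, 2))
```

With these inputs the sketch reproduces the £1 and £0.80 figures quoted above.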

For public transport, however, the RR is defined differently in STAG. The justification given for this is the existence of a timetable. STAG states that in "the general case one minute of average lateness is valued by passengers as being equivalent to three minutes of scheduled journey time". This value of 3 is referred to as a Lateness Factor. Passengers are said to be concerned less about journey time variability per se, and more about lateness relative to the timetable. STAG states that, broadly, "the value of average lateness for public transport is expected to be the same as the value of time spent waiting for public transport, that is, at 2.5 times the value of in-vehicle time". The reliability ratio for public transport is defined as:

Reliability Ratio = Value of SD of lateness / Value of lateness

We prefer to write that as:

RR = Value of ΔSD of lateness / Value of ΔL

where:
ΔSD: a change in the standard deviation of lateness;
ΔL: an identical sized change in lateness (= A_i - A^S, where A_i is the actual arrival time of trip i and A^S is the timetabled arrival time referring to that trip; with A_i - A^S ≥ 0, i.e. early arrivals are treated as being on time).

For example, we might have an estimate of the value of a travel time saving of £6/hour for some group. That means that the value of an hour's lateness is £15 (using the recommended lateness factor of 2.5). The recommended RR value for public transport is given as 1.4. In that case the value of a change in the standard deviation of lateness would be about £21 per hour.
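The public transport calculation can be sketched in the same way. The function names are hypothetical; the lateness factor of 2.5 and the RR of 1.4 are the example values from the text, and early arrivals are treated as on time, as in the definition of lateness above.

```python
def lateness_minutes(actual_arrival, scheduled_arrival):
    """Lateness of one trip in minutes; early arrivals count as on time."""
    return max(0.0, actual_arrival - scheduled_arrival)


def value_of_lateness_per_hour(vot_per_hour, lateness_factor):
    """Value of an hour of lateness: lateness factor times value of time."""
    return lateness_factor * vot_per_hour


def value_of_sd_lateness_per_hour(vot_per_hour, lateness_factor, reliability_ratio):
    """Value of an hour of SD of lateness: RR times value of lateness."""
    return reliability_ratio * value_of_lateness_per_hour(vot_per_hour, lateness_factor)


vot = 6.0              # GBP per hour (example value from the text)
lateness_factor = 2.5  # example value from the text
rr_public = 1.4        # example value from the text

vol = value_of_lateness_per_hour(vot, lateness_factor)
vsd = value_of_sd_lateness_per_hour(vot, lateness_factor, rr_public)
print(round(vol, 2), round(vsd, 2))
```

With these inputs the sketch gives £15 per hour of lateness and £21 per hour of standard deviation of lateness, matching the example.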

4.2 The problem

The question now is what would be the most appropriate definition of reliability:

the reliability ratio based on the standard deviation of travel time;
the reliability ratio based on the standard deviation of lateness; or
another definition of reliability one might think of, notably involving the expected value of schedule delay early (SDE) and schedule delay late (SDL), following a classic Vickrey-Small scheduling model to which uncertainty in travel time has been added (see Bates et al., 2001), where:
o SDE: the number of minutes one arrives earlier than the preferred arrival time PAT (for early arrivals);
o SDL: the number of minutes one arrives later than the preferred arrival time PAT (for late arrivals).
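As a minimal sketch of the third option, the expected SDE and SDL can be computed from a set of equally likely arrival times and a preferred arrival time. The arrival times and PAT below are hypothetical, and the function names are illustrative only; this is not the estimation procedure of Bates et al. (2001).

```python
def schedule_delays(arrival_time, preferred_arrival_time):
    """Return (SDE, SDL) in minutes for a single trip."""
    sde = max(0.0, preferred_arrival_time - arrival_time)  # minutes early
    sdl = max(0.0, arrival_time - preferred_arrival_time)  # minutes late
    return sde, sdl


def expected_schedule_delays(arrival_times, preferred_arrival_time):
    """Average SDE and SDL over equally likely arrival times."""
    pairs = [schedule_delays(a, preferred_arrival_time) for a in arrival_times]
    n = len(pairs)
    mean_sde = sum(sde for sde, _ in pairs) / n
    mean_sdl = sum(sdl for _, sdl in pairs) / n
    return mean_sde, mean_sdl


# Hypothetical arrival times (minutes after departure) and PAT
arrivals = [25.0, 30.0, 40.0]
pat = 32.0
mean_sde, mean_sdl = expected_schedule_delays(arrivals, pat)
print(mean_sde, mean_sdl)
```

Note that any single trip contributes to either SDE or SDL, never both, which is why this definition needs the PAT of each traveller.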

Some observations:

Lateness in the sense of actual arrival time minus scheduled arrival time exactly corresponds to delay in the sense of actual travel time minus scheduled travel time (assuming that there are no other delays in the actual departure time):

A_i - A^S = T_i - T^S

where:
T_i: the actual travel time of trip i;
T^S: the scheduled travel time.

Consequently, the standard deviation of lateness is equal to the standard deviation of transport time delays.

One can also substitute "free-flow" or "expected" for "scheduled" in the text above, which gives a formulation that also holds for private (e.g. road) transport. This result simply follows from the fact that the only delays considered are those in the travel time of the mode studied.

For road transport, lateness might be defined with respect to free flow time:

L_i = T_i - T^F

where:

T^F: free flow travel time

If the free-flow time T^F is constant (e.g. for all trips on different days on a given route), then the value of lateness (VOL) will be equal to the value of time (VOT), and the standard deviation of T_i will equal the standard deviation of T_i - T^F:

SD(T_i) = SD(T_i - T^F) = SD(L_i)

In this case, the first two options listed at the beginning of this section give the same result. This result hinges on the constancy of the free-flow travel time: subtracting a constant from the actual travel time affects the mean, but does not affect the standard deviation. As soon as one compares trips over different routes, the free-flow travel time will vary, and the equality no longer holds.
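A small numerical check illustrates both cases. The travel times and free-flow times are hypothetical, and the population standard deviation from Python's `statistics` module is used purely for illustration.

```python
from statistics import pstdev

# Hypothetical travel times in minutes for five trips
travel_times = [22.0, 25.0, 31.0, 24.0, 28.0]

# Case 1: all trips on one route, so the free-flow time is constant
free_flow = 20.0
lateness_same_route = [t - free_flow for t in travel_times]
same_route_equal = abs(pstdev(travel_times) - pstdev(lateness_same_route)) < 1e-9

# Case 2: trips on different routes, so the free-flow time varies by trip
free_flow_by_trip = [20.0, 20.0, 27.0, 20.0, 27.0]
lateness_mixed_routes = [t - f for t, f in zip(travel_times, free_flow_by_trip)]
mixed_routes_equal = abs(pstdev(travel_times) - pstdev(lateness_mixed_routes)) < 1e-9

print(same_route_equal, mixed_routes_equal)
```

In the first case the two standard deviations coincide; in the second they differ, because the subtracted free-flow time is no longer a constant.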

4.3 The current view by experts in the field

In a project for the German Federal Ministry of Transport, Building and Urban Development, international experts^[1] on travel and transport time reliability were interviewed (Significance et al., 2012) on a number of issues related to, but somewhat broader than, the question posed by Transport Scotland. One of the questions was which operational definition of reliability they would recommend for including reliability in the CBA in the next 2-3 years (for Germany, Scotland, and almost every other national or regional transport model used for appraisal across the world that does not include an explicit Vickrey-Small scheduling model). Figure 4.1 shows the frequency distribution of the experts' answers. It is very clear from Figure 4.1 that the standard deviation has the most support among the experts as a measure of reliability that can be included in the CBA within 2-3 years. This is the standard deviation of travel time, not of lateness. Some experts, however, expressed a preference for using lateness relative to the timetable (expressed in Figure 4.1 as "punctuality"), but only for modes that use a published timetable.

Significance et al. (2012) also reviewed the literature on arguments for and against different operational definitions of reliability and asked the experts to give their arguments for and against. With respect to the standard deviation the following arguments were obtained.^[2]

Arguments for using the standard deviation (again referring to travel time) are:

(i) It has an indirect base in theory, since Fosgerau and Karlström (2010) showed the formal equivalence with the scheduling model (at least for modes without timetables, such as the car; for public transport this argument does not hold).

(ii) It can be empirically measured.

(iii) It is relatively easy to include in standard transport models (since it does not require adding a scheduling model to the transport model, but only an extra reliability term in choices like mode and route choice).

(iv) Related to the previous point, since it requires no formal scheduling model, it also does not require preferred arrival times (PATs), which would have to be obtained through specific survey interviews or reverse engineering (Kristoffersson, 2011).

(v) It often provides a good fit to stated preference (SP) data (choices between alternatives that differ in terms of reliability are often well explained by a model that includes the standard deviation).

(vi) It can capture a residual (non-scheduling-related) value (e.g. anxiety).

(vii) It is a natural way to summarise a distribution (together with the mean).

Figure 4.1: Most appropriate definition of reliability for use in CBA: Frequency distribution of answers of the experts

Arguments against using the standard deviation are:

(i) It is rather sensitive to outliers.

(ii) It does not properly pick up the form of the tail and skew (i.e. it ignores the higher 'moments' of the distribution).

(iii) It is not additive over links: even when link travel times are independent of each other, simple summation of standard deviations (unlike the variance) over links will not give the standard deviation of the route that uses these links. Choosing to use the variance would resolve the latter argument for independent link travel times. However, in a congested network, congestion spreads backwards from the original bottleneck, creating dependence among the travel times of adjacent links, so that the variance is not additive over links either.
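The additivity argument can be illustrated with a small sketch. The link standard deviations are hypothetical, and a single common pairwise correlation between links is assumed purely for simplicity; real networks will have a more complex dependence structure.

```python
import math


def route_sd(link_sds, correlation=0.0):
    """SD of the sum of link travel times.

    Assumes (for illustration only) the same pairwise correlation
    between every pair of links; correlation=0.0 means independence.
    """
    variance = sum(s * s for s in link_sds)
    for i in range(len(link_sds)):
        for j in range(i + 1, len(link_sds)):
            variance += 2.0 * correlation * link_sds[i] * link_sds[j]
    return math.sqrt(variance)


link_sds = [3.0, 4.0]  # hypothetical link SDs in minutes

sum_of_sds = sum(link_sds)                            # naive summation of SDs
sum_of_variances = sum(s * s for s in link_sds)       # summation of variances
sd_independent = route_sd(link_sds)                   # independent links
sd_correlated = route_sd(link_sds, correlation=0.5)   # positively correlated links

print(sum_of_sds, sd_independent, round(sd_correlated, 3))
```

For independent links the route variance equals the sum of link variances, but the route standard deviation is below the sum of link standard deviations. With positive correlation the route variance exceeds the sum of link variances, so neither measure is additive.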

Given that most countries will not have a national departure time choice model in the next 2-3 years, there is really no alternative to using a dispersion measure instead of schedule delay. This also applies to Scotland. So the option mentioned in section 4.2 of using schedule delay early and late is not feasible in the short run: it is possible to obtain a monetary value, but there is no forecasting model that can support this definition of reliability.

Significance et al. (2012) recommended using the standard deviation of travel time (in the long run, it might be possible to switch to the scheduling model), both for private transport (passenger and freight) and for public transport. In the recent Dutch VOTVOR study (Significance et al., 2013) the standard deviation of travel time was also chosen as the measure of reliability for all modes (including public transport) in passenger and freight transport. This study could have chosen to use the standard deviation of travel time for non-scheduled modes and the standard deviation of lateness for scheduled modes, but it preferred using the standard deviation of travel time for all modes because:

consistency of definition across modes is an advantage; and
when one uses travel time, both early and late arrivals are included, whereas lateness only looks at late arrivals.

In summary, for the appraisal of trunk road schemes there are many, mainly practical, arguments for using the standard deviation of travel time (valued as part of an RR). Most experts support this choice, at least for the short to medium run. For public transport, where there are scheduled services, the selection of the best definition is less straightforward. A measure based on lateness relative to the timetable is a serious contender to the standard deviation of travel time. The theoretical argument for the standard deviation does not hold here, because the Fosgerau-Karlström model assumes a free choice of departure times. Nevertheless some of the most recent studies (Germany, The Netherlands) have selected the standard deviation of travel time even for public transport.

4.4 Some Numerical Results for the RR

The following overview of numerical results for the RR in passenger transport is taken from Significance et al. (2013). The results are summarized in Table 4.4.1, which is mostly concerned with passenger trips. All results use the RR definition based on the standard deviation of travel time. It should be noted that the table includes the authors' own results, from Stated Preference surveys carried out in 2009 and 2011 in The Netherlands. The findings of an Expert Workshop, held in 2004, are also shown. The consensus for the RR of car travellers is around 0.8. For public transport the position is less clear, but the value of 0.8 again looks reasonable.

The same table also presents results for road freight transport (all from SP studies). The RR using the standard deviation of road freight transport time in Significance et al. (2013) is around 0.4. This is substantially lower than the preliminary (highly provisional) value of 1.2 (for road transport) from de Jong et al. (2009). In the new Dutch VOTVOR survey, unreliability, its context and its consequences were made much more explicit, and the presentation format is much more suitable for measuring unreliability in terms of the standard deviation of transport time (or scheduling terms), so the 2013 values are to be preferred. Other recent empirical studies, notably Halse et al. (2010) and Fowkes (2006), also found similarly low RRs in freight (when including the valuation of transport staff time and vehicles from the carriers in the values of reliability and time).

**Table 4.4.1. Summary of the empirical findings on the reliability ratio in passenger and freight transport (the value of the standard deviation of travel time versus the value of travel time)**
| Mode | Study | Country | RR |
|---|---|---|---|
| Car | MVA (1996) | UK | 0.36 – 0.78 |
| Car | Copley et al. (2002) | UK | Pilot survey: 1.3 |
| Car | Hensher (2007) | Australia | 0.3 – 0.4 |
| Car | Eliasson (2004) | Sweden | 0.30 – 0.95 |
| Car | Mahmassani (2011) | USA | NCHRP 431: 0.80 – 1.10; SHRP 2 CO4: 0.40 – 0.90 |
| Car | *Expert workshop of 2004* | The Netherlands | 0.8 |
| Car | Significance et al. (2013) | The Netherlands | Commuting: 0.4; Business: 1.1; Other: 0.6 |
| Train | ATOC (2002) | UK | 0.6 – 1.5 |
| Train | Ramjerdi et al. (2010) | Norway | Short trips: 0.69; Long trips: 0.54 |
| Train | *Expert workshop of 2004* | The Netherlands | 1.4 |
| Train | Significance et al. (2013) | The Netherlands | Commuting: 0.4; Business: 1.1; Other: 0.6 |
| Bus/tram/metro | MVA (2000) | France | 0.24 |
| Bus/tram/metro | Ramjerdi et al. (2010) | Norway | Short trips: 0.69; Long trips: 0.42 |
| Bus/tram/metro | *Expert workshop of 2004* | The Netherlands | 1.4 |
| Bus/tram/metro | Significance et al. (2013) | The Netherlands | Commuting: 0.4; Business: 1.1; Other: 0.6 |
| Air | Ramjerdi et al. (2010) | Norway | 0.20 |
| Air | Significance et al. (2013) | The Netherlands | Business: 0.7; Other: 0.7 |
| Road freight | Fowkes (2006) | UK | Shippers: 0.38; Own-account: 0.19 |
| Road freight | Halse et al. (2010) | Norway | Shippers: 1.2; Carriers: 0; Overall: 0.11 |
| Road freight | Significance et al. (2013) | The Netherlands | Shippers: 0.9; Carriers: 0.28; Overall: 0.37 |