Data and Sampling

  1. Obtaining new telematics and scheduling data for existing diesel HGV routes.
  2. Creating synthetic data for instances where route data is missing.

Note on Data Quality

The project considered fleet operations covering a mix of freight, including retail, food, and general logistics. It also considered synthetic data for the forestry sector.

The data is of high quality and combines raw telematics data with scheduling data. It represents the movements and activities of individual HGVs over a one-year period and covers the majority of mainland Scotland.

Synthetic data, which is designed to mirror the statistical properties of real-world data, is used for the forestry haulage sector. In this case, it was based on a network of routes identified in a report commissioned by the Timber Transport Forum covering the sector in Argyll and the Scottish Borders.
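
As a rough illustration of what mirroring the statistical properties of real-world data can mean in practice, the sketch below draws synthetic trip distances whose centre and spread match a set of observed trips. It is a minimal sketch and not the method used in this project: the function name, the log-normal assumption, and the example distances are all hypothetical.

```python
import math
import random
import statistics


def synthesise_trip_distances(observed_km, n, seed=0):
    """Draw n synthetic trip distances (km) from a log-normal
    distribution fitted to the observed distances.

    Assumption: trip distances are roughly log-normal. This is an
    illustrative choice, not a finding of the project.
    """
    # Fit the distribution on the log scale.
    logs = [math.log(d) for d in observed_km]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)
    # A seeded generator keeps the synthetic data reproducible.
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


# Placeholder distances (km), invented for illustration only.
observed = [42.0, 55.5, 61.0, 38.5, 70.0, 48.0, 52.5, 66.0]
synthetic = synthesise_trip_distances(observed, n=5000)
```

A real exercise would fit more than one variable (for example payload, gradient, and dwell time) and would check the synthetic routes against the source route network before use.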

Note on Sampling

The data underlying this project covers a broadly representative sample of HGV fleets in Scotland, equating to approximately 2% of the total.

Pie chart showing the difference between the population and the sample.
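
One simple way to check how closely a sample tracks the population is to compare each sector's share in the two groups. The sketch below is illustrative only: the function name is hypothetical and the vehicle counts are placeholders, not figures from this project.

```python
def share_gaps(population_counts, sample_counts):
    """Return, per sector, the sample share minus the population
    share, in percentage points. Values near zero suggest the sample
    is broadly representative for that sector."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    gaps = {}
    for sector, pop_n in population_counts.items():
        pop_share = 100 * pop_n / pop_total
        # Sectors absent from the sample count as a zero share.
        sample_share = 100 * sample_counts.get(sector, 0) / sample_total
        gaps[sector] = sample_share - pop_share
    return gaps


# Placeholder vehicle counts, invented for illustration only.
population = {"retail": 400, "food": 300, "general logistics": 300}
sample = {"retail": 9, "food": 5, "general logistics": 6}
gaps = share_gaps(population, sample)
```

With these placeholder counts, retail is over-represented by 5 percentage points, food is under-represented by 5, and general logistics matches the population share.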

To provide greater confidence and nuance to drive investment decisions, there is an ongoing need to:

  • Increase the quantity of data to reflect an even wider variety of HGV routes.
  • Increase the quantity of data from specific sectors such as mining and quarrying.

Despite these caveats, the research is a significant step forward in our understanding.