Exploring road incident data with heat maps

Iain Dillingham, MSc Geographic Information Systems, City University London
iain.dillingham.1@city.ac.uk

Thank you for agreeing to comment upon the four prototype applications presented here. Each is based on the idea that road incident data can be explored using heat maps -- grids of cells, with each cell coloured by value -- ultimately to generate insight into road incidents.

About the applications

You will find four short video presentations below, one for each application. CountHeatMap and SumHeatMap are 'extended' prototypes, meaning they support a greater degree of interaction and allow the user to explore significantly more data. ChiHeatMap and TreeMap are 'limited' prototypes, in terms of both interactivity and data. They are included to demonstrate some of the possibilities afforded by the visual exploration of road incident data.

Feedback

Based on the video presentations, I would greatly appreciate your feedback on the applications, specifically:

Any additional comments or suggestions would be welcomed. Please email your responses to iain.dillingham.1@city.ac.uk.

Please do not hesitate to contact me should you have any questions about the project. Thank you once again: your feedback is invaluable.

Iain Dillingham

Video presentations

CountHeatMap

Read the transcript

SumHeatMap

Read the transcript

ChiHeatMap

Read the transcript

TreeMap

Read the transcript

Transcripts

CountHeatMap

The count heat map displays the number of road incidents at six spatial resolutions, at five one-hour time periods, on one of three dates.

Dates and time periods reflect the hour before, the hour of and the three hours after a severe incident on the M25, just north of the Dartford Crossing. The incident is located within the grey cell, which can be toggled with the 'i' key.

Pressing the 'h' key toggles the help function: as you can see, zooming and panning are also supported.

First, we'll remove the explanatory text ('s') for clarity.

We'll zoom into the area of interest at 100km resolution and toggle the grey cell ('i'). The date is the 11th of February 2008, between seven and eight in the morning.

We'll gradually change the resolution, moving from 100km to 50, 25, 10, 5 and 2km ('↑').
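
As a rough illustration of what changing the resolution means for the counts, here is a minimal Python sketch. It assumes each incident is a pair of Ordnance Survey-style coordinates in metres and that cells are squares aligned to the grid origin; the function name `count_by_cell` and the sample coordinates are hypothetical, and this is not the application's own code.

```python
from collections import Counter

def count_by_cell(incidents, cell_size_m):
    """Count incidents per square grid cell.

    incidents: iterable of (easting, northing) pairs in metres.
    Returns a Counter keyed by (column, row) cell indices.
    """
    counts = Counter()
    for easting, northing in incidents:
        cell = (int(easting // cell_size_m), int(northing // cell_size_m))
        counts[cell] += 1
    return counts

# Hypothetical incident locations (easting, northing), in metres.
incidents = [(556_000, 176_000), (557_500, 176_200), (530_000, 180_000)]

# The six resolutions used in the presentation, coarsest first.
for size_km in (100, 50, 25, 10, 5, 2):
    counts = count_by_cell(incidents, size_km * 1000)
    print(size_km, dict(counts))
```

At 100km all three sample incidents fall in one cell; at finer resolutions they separate into different cells.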

The current view is the hour before the incident. We'll step through the hour of the incident and the three hours following the incident ('→').

Note that cells are coloured 'globally', that is by value for the whole of the UK at this resolution, date and time period.
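
To make 'global' colouring concrete, here is a small Python sketch. It assumes a linear scale from zero to the largest cell value in the whole grid; the scale actually used by the application may differ, and `global_colour_scale` is a hypothetical name.

```python
def global_colour_scale(values_by_cell):
    """Map each cell's value to [0, 1] relative to the maximum over ALL cells.

    Assumption: a linear scale starting at zero. Because the maximum is
    taken over every cell, zooming or panning does not change a cell's colour.
    """
    highest = max(values_by_cell.values(), default=0)
    if highest == 0:
        return {cell: 0.0 for cell in values_by_cell}
    return {cell: value / highest for cell, value in values_by_cell.items()}

# Hypothetical counts for three cells at one resolution, date and time period.
intensities = global_colour_scale({(5, 1): 4, (5, 2): 1, (0, 0): 0})
```

The alternative would be to scale by the cells currently on screen, in which case colours would shift as the view changes.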

We'll now return to the start of the sequence ('←') and load the second dataset ('2'). The date is now the 1st of March 2008, between seven and eight in the morning. The resolution is unchanged.

As before, the current view is the hour before the incident. We'll step through the hour of the incident and the three hours following the incident ('→').

Finally, we'll do the same for the third dataset: return to the start of the sequence ('←') and load ('3'). The date is now the 28th of August 2008, between four and five in the morning.

Stepping through the incident, I was struck by this pattern. This is incident plus one, incident plus two, and incident plus three. The pattern is more regular than the patterns for the previous two incidents at the same relative time periods.

We'll go back to the beginning of the sequence: incident plus one, on the 28th of August 2008.

Compare this to the second incident at incident plus one ('2'); and the first incident at incident plus one ('1').

We'll investigate this pattern further using the sum heat map.

Watch the video presentation

SumHeatMap

As with the count heat map, the sum heat map displays data at six spatial resolutions, at five one-hour time periods, on one of three dates. However, here we are looking at the total squared road incident severity.
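
One reading of 'total squared road incident severity' is the sum, per cell, of each incident's severity squared. The Python sketch below follows that reading, which is an assumption, as are the function name `sum_squared_severity`, the severity values and the coordinates.

```python
from collections import defaultdict

def sum_squared_severity(incidents, cell_size_m):
    """Sum squared severity per square grid cell.

    incidents: iterable of (easting, northing, severity) tuples,
    with coordinates in metres. Squaring gives severe incidents
    more weight than several minor ones.
    """
    totals = defaultdict(float)
    for easting, northing, severity in incidents:
        cell = (int(easting // cell_size_m), int(northing // cell_size_m))
        totals[cell] += severity ** 2
    return dict(totals)

# Hypothetical incidents: (easting, northing, severity).
severity_incidents = [
    (556_000, 176_000, 3),
    (557_500, 176_200, 1),
    (530_000, 180_000, 2),
]
totals_10km = sum_squared_severity(severity_incidents, 10_000)
```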

Dates and time periods reflect the hour before, the hour of and the three hours after a severe incident on the M25, just north of the Dartford Crossing. The incident is located within the grey cell, which can be toggled with the 'i' key.

First, we'll remove the explanatory text ('s') for clarity.

We'll zoom into the area of interest at 100km resolution and toggle the grey cell ('i'). The date is the 11th of February 2008, between seven and eight in the morning.

We'll gradually change the resolution, moving from 100km to 50, 25, 10, 5 and 2km ('↑').

The current view is the hour before the incident. We'll step through the hour of the incident and the three hours following the incident ('→').

Note that cells are coloured 'globally', that is by value for the whole of the UK at this resolution, date and time period.

We'll now return to the start of the sequence ('←') and load the second dataset ('2'). The date is now the 1st of March 2008, between seven and eight in the morning. The resolution is unchanged.

As before, the current view is the hour before the incident. We'll step through the hour of the incident and the three hours following the incident ('→').

Finally, we'll do the same for the third dataset: return to the start of the sequence ('←') and load ('3'). The date is now the 28th of August 2008, between four and five in the morning.

As with the count heat map, when stepping through the incident I was struck by this pattern. This is incident plus one, incident plus two, and incident plus three. The pattern is more regular than the patterns for the previous two incidents at the same relative time periods.

Let's return to the hour of the incident and double-check that the cell we are looking at lies within our area of interest ('i').

Just! We would need to do some further work to confirm this cell contains the incident we are interested in. We could do this by checking all incidents in the database within this cell. Here are the Ordnance Survey coordinates of the cell, expressed as a well-known text (WKT) string ('c').
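
For illustration, a grid cell can be written as a well-known text polygon as in the Python sketch below. It assumes the cell is identified by column and row indices at a given cell size, with coordinates in metres; the function name `cell_to_wkt` and the example indices are hypothetical, and the exact string the application produces may be formatted differently.

```python
def cell_to_wkt(column, row, cell_size_m):
    """Return a square grid cell as a WKT POLYGON string.

    The ring is listed counter-clockwise and closed by repeating
    the first corner, as WKT requires.
    """
    min_x = column * cell_size_m
    min_y = row * cell_size_m
    max_x = min_x + cell_size_m
    max_y = min_y + cell_size_m
    corners = [
        (min_x, min_y),
        (max_x, min_y),
        (max_x, max_y),
        (min_x, max_y),
        (min_x, min_y),
    ]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON (({ring}))"

# Hypothetical 2km cell.
wkt = cell_to_wkt(278, 88, 2000)
```

A string like this could then be passed to a spatial database query to list the incidents that fall within the cell.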

Starting at the hour of the incident and stepping through the sequence again, we observe:

  1. high incident severity in the cell of interest
  2. a pattern around the cell of interest, which is present for at least the following three time periods
  3. higher incident severity in the cell of interest than neighbouring cells

Whilst this does not imply causation, I believe it to be an interesting pattern which warrants further investigation.

Watch the video presentation

ChiHeatMap

The chi heat map application displays two maps: both are at 100km resolution.

The map on the left shows the total number of road incidents on the 11th of February 2008, between eight and nine in the morning.

London and South East England clearly dominate, with the largest number of incidents.

The map on the right shows the degree to which the observed total differs from the expected total, over the same time period. Here, the model assumes incidents are evenly distributed across the road network.

Again, London and South East England dominate: there are more incidents than expected in this area, at this time, based on the simple model.
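
The comparison of observed and expected totals can be sketched as follows in Python. This assumes the expected total for a cell is the overall incident total multiplied by the cell's share of road length, and that the difference is expressed as a signed chi-statistic, (observed - expected) / sqrt(expected). Both the formula and the names used (`chi_by_cell`, the cell labels and the figures) are assumptions for illustration.

```python
import math

def chi_by_cell(observed, road_length):
    """Signed chi-statistic per cell under an 'evenly distributed' model.

    observed: dict of cell -> number of incidents.
    road_length: dict of cell -> length of road network in the cell.
    Positive values mean more incidents than expected, negative fewer.
    """
    total_incidents = sum(observed.values())
    total_length = sum(road_length.values())
    if total_length == 0:
        return {}
    chi = {}
    for cell, length in road_length.items():
        expected = total_incidents * length / total_length
        if expected == 0:
            continue  # no roads in the cell, so the statistic is undefined
        chi[cell] = (observed.get(cell, 0) - expected) / math.sqrt(expected)
    return chi

# Hypothetical figures for two 100km cells.
chi = chi_by_cell({"TQ": 60, "SD": 40}, {"TQ": 25.0, "SD": 75.0})
```

In this made-up example the first cell has a quarter of the road network but well over a quarter of the incidents, so its value is positive.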

In both maps, the white cell indicates the removal of data: in this case, for Northern Ireland.

Watch the video presentation

TreeMap

The tree map application displays a series of cells, each cell representing a 100km by 100km area of the United Kingdom. However, these cells are sized by length of road network and coloured by number of road incidents.

The period in question is the 11th of February 2008, between eight and nine in the morning.

This cell represents the area which covers London and South East England. Its size tells us that it has the greatest length of road network; its colour that it has the greatest number of road incidents.

Currently the tree map uses a squarified layout. However, a spatial layout might be more appropriate ('3'). This layout attempts to place the cell in its approximate geographic location. Here is the cell covering London and South East England.

Several alternative layouts are possible, such as ordered squarified ('2'), slice and dice ('4') and strip ('5').
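
To show the idea of sizing cells by road length, here is a Python sketch of the simplest case: a single slice along one axis, which is what a slice and dice layout reduces to when there is only one level of hierarchy. The squarified and spatial layouts are more involved and are not attempted here. The function name `slice_layout`, the cell labels and the lengths are hypothetical.

```python
def slice_layout(road_length, x, y, width, height):
    """Lay cells out left to right inside a rectangle.

    Each cell gets the full height and a width proportional to its
    share of the total road length. Returns cell -> (x, y, width, height).
    """
    total = sum(road_length.values())
    rects = {}
    used = 0
    for cell, length in road_length.items():
        rects[cell] = (x + width * used / total, y, width * length / total, height)
        used += length
    return rects

# Hypothetical road lengths for three cells, laid out in a 100 by 10 rectangle.
rects = slice_layout({"a": 50, "b": 30, "c": 20}, 0, 0, 100, 10)
```

Each rectangle would then be coloured by the number of incidents in that cell, using a colour scale of the kind described for the heat maps.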

Let's return to the spatial layout ('3').

At present, the tree map uses all roads in the road network and all incidents which took place in the time period.

It is also possible to size cells by the total length of 'A' roads and colour by the number of incidents on 'A' roads ('a').

Or size cells by the total length of 'B' roads and colour by the number of incidents on 'B' roads ('b').

Or size cells by the total length of motorways and colour by the number of incidents on motorways ('m').

This is an interesting comparison.

Whilst London and South East England clearly have the greatest number of incidents on 'A' roads ('a') and to a lesser extent on 'B' roads ('b'), the situation is not the same for motorways ('m').

Watch the video presentation