What Insights Can be Derived From The Publicly Available Data on the Government’s Major Project Portfolio?


Since 2012 the UK’s Infrastructure and Projects Authority (IPA) has published data on the performance of the Government Major Projects Portfolio (GMPP). In 2017 this included 143 projects with a whole life cost of £455bn.

Data timeliness

The spreadsheets are published on an annual basis with a lead time of approximately 10 months (the data was sampled in Sept 2016 and published in July 2017). Because the data has to be reviewed at department level prior to release, it may actually have been extracted from schedules and spreadsheets up to 2 months before the sample date.

International comparators

In comparison, the US and Australian governments publish near real time updates of data using interactive tools. The Western Australian Government has adopted the following five principles for open data within its Whole-of-Government Open Data Policy: data should be open by default; easily discoverable and subject to public input; usable; protected where required; and timely. Both governments view public accountability as a key factor in helping to drive culture change and improve project performance. See the IT dashboard for the US version and the ICT dashboard for the Australian version.

Noting the depth of insights provided by other governments, why is the UK government not following a similar approach? Is it time that the entire GMPP and wider portfolio of projects is subject to wider public scrutiny? The costs associated with developing such a capability are not great; our website and embedded GMPP analysis cost approximately £19,000 to produce. I agree that there is a burden associated with reporting the data, but by importing data from the online cost and schedule records that need to be produced anyway as part of robust project management, it can be delivered cost effectively. Are there any real obstacles to prevent the IPA from compiling live data downloads, rather than developing offline reports which are often subject to massaging? Most of this could be automated, so once set up the burden should be minimal.

Our Quest

In January 2018 we compiled all available GMPP data and imported it into Power BI. This provided us with an excellent opportunity to interrogate the data and visualise it. Our primary focus was to test 2 hypotheses:

  1. Do the delivery confidence assessment (RAG) ratings align with the delivered performance in the following year? If not, is there any evidence of under or over-marking?
  2. Is the overall delivery performance improving or worsening?

We also wanted to demonstrate what could be achieved in a short period of time with a relatively small amount of resource.

We will be rolling out additional insights over the coming weeks.


The results are fascinating (please see the interactive charts on the homepage):

  • The Infrastructure and Projects Authority defines amber as “Successful delivery appears feasible but significant issues already exist, requiring management attention. These appear resolvable at this stage and, if addressed promptly, should not present a cost/schedule overrun.” If a project is graded as amber, green/amber or green it would be reasonable to expect the cost or schedule not to increase in the following year, or to assume that any increase would be negligible.
    • In 2013, from a sample of 132 projects, 11 had only one data point (i.e. we were unable to calculate variance). Of the remaining 121 projects, 17% of the projects had an in year increase in cost or schedule of over 20%. 84 (69%) of the 121 projects were classified as amber, amber/green or green (A/AG/G), with 19% having a variance of >20% and 32% having a variance of >1%.
    • If we look at 2016 data, from a sample of 84 projects, 28 have only one data point. Of the remaining 56 projects, 34% have an in year variance in cost or schedule of >20%, of which 23% have a variance of >50%. 41 of these projects are classified as A/AG/G, with 31% having an in year variance of >20% and 56% having a variance of >1%.
    • If we sample the 2013, 2014, 2015 and 2016 data, with more than 1 data point, we return 360 entries (note that these entries may include multiple years of the same project). 259 (72%) of these are classified as A/AG/G. 22% of these have an in year variance of >20% and 39% have a variance of >1%.
If we calculate the variance against the first recorded data point instead of in year variance we derive the following:
  • The proportion of entries with a variance of >50% was 7.5% in 2013, rising to 41.27% in 2016. Although some of this increase is due to cumulative increases across multiple samples of the same project, it still represents a significant proportion of the data.
  • Of the projects classified as A/AG/G in 2016 66% have a cumulative variance of >1%, with 34% of these being >50%.
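
The two measures used above, in year variance (against the previous data point) and cumulative variance (against the first recorded data point), can be illustrated with a minimal Python sketch. This is not the Power BI model used for the analysis; the records, field layout and function names are hypothetical, only cost is considered (schedule would be handled the same way), and it assumes one non-zero cost figure per project per year.

```python
from collections import defaultdict

# Ratings treated as amber, amber/green or green (A/AG/G).
A_AG_G = {"Amber", "Amber/Green", "Green"}

# Hypothetical records: (project, report year, whole life cost in £m, RAG rating).
records = [
    ("Project A", 2013, 100.0, "Amber"),
    ("Project A", 2014, 125.0, "Amber/Green"),
    ("Project A", 2015, 130.0, "Amber"),
    ("Project B", 2013, 200.0, "Green"),
    ("Project B", 2014, 201.0, "Green"),
    ("Project C", 2014, 50.0, "Red"),
    ("Project C", 2015, 80.0, "Red"),
    ("Project D", 2015, 75.0, "Amber"),  # single data point: no variance possible
]

def variances(records):
    """Return one row per data point that has a predecessor.

    in_year    = change against the previous data point
    cumulative = change against the first recorded data point
    Both are fractions of the comparison value (0.25 means +25%).
    """
    by_project = defaultdict(list)
    for name, year, cost, rag in records:
        by_project[name].append((year, cost, rag))

    rows = []
    for name, points in by_project.items():
        points.sort()  # order by year
        first_cost = points[0][1]
        for prev, cur in zip(points, points[1:]):
            year, cost, rag = cur
            rows.append({
                "project": name,
                "year": year,
                "rag": rag,
                "in_year": (cost - prev[1]) / prev[1],
                "cumulative": (cost - first_cost) / first_cost,
            })
    return rows

def share_above(rows, key, threshold, rags=None):
    """Share of rows whose variance exceeds threshold, optionally for given RAG ratings."""
    selected = [r for r in rows if rags is None or r["rag"] in rags]
    if not selected:
        return None
    return sum(1 for r in selected if r[key] > threshold) / len(selected)

rows = variances(records)
print(share_above(rows, "in_year", 0.20, A_AG_G))
print(share_above(rows, "cumulative", 0.50))
```

Projects with a single data point, such as the hypothetical Project D, produce no row at all, which mirrors the way such projects were excluded from the counts above.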

Testing the Hypotheses

Is the portfolio overmarked? Whilst we accept that the assessment of RAG is an imprecise science, this analysis suggests that the portfolio overall may be overmarked. It should be noted that our variance analysis does not take account of performance variance, i.e. where scope or capability has been traded to maintain cost or schedule, which would compound the challenge.

Is the situation improving? The IPA’s own analysis illustrates that the projects joining the portfolio often suffer from a challenging birth. The analysis above indicates that in year variance is getting worse, year on year. We accept that our analysis is not perfect because of the quality and completeness of the source data, but it indicates that there is cause for concern and a need for further analysis. It should also be noted that the GMPP accounts for the tip of a very large and expensive iceberg.

Data Quality, Accuracy and Completeness

In terms of details:

  • Some projects have benefitted from a significant reduction in cost, yet the schedule remains the same. If projects are delivered in a similar timescale the project management costs begin to dominate. It would be helpful if the projects could provide an indication of the cost of the lost capability. Using the example of the Edinburgh tram, although project costs increased significantly, there was also a reduction in the length of the track from that originally planned. If project scope is constrained to fit the available budget, then it should be reported.
  • You may note that some projects have a significant negative schedule variance. This would appear to be because some projects reported out of service dates and others reported the in service date, i.e. when the capability was delivered. These were subsequently modified, which results in a negative schedule variance of up to 30 years. Common data definitions are required to minimise the risk of apples and pears comparisons.
  • We can only measure schedule and cost variance from the previous data point because we do not have access to the baseline or sanctioned values. Therefore, any project with only 1 data point shows a variance of zero, which does not necessarily reflect reality.
  • When departments are closed or renamed their projects get reassigned, which creates a problem with data cleansing and correlation. It would be helpful if the data contained a summary of these correlations.
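
As a rough illustration of the two schedule issues above, the Python sketch below reports "insufficient data" for a project with a single data point instead of a misleading zero, and flags very large negative movements as a likely change of date definition. The function names and the five-year threshold are assumptions made for illustration, not part of the published data or of our analysis.

```python
from datetime import date

# Hypothetical threshold: an end date moving earlier by more than about five
# years is more likely a change of definition (out of service date versus
# in service date) than a genuine acceleration.
SUSPECT_SLIP_DAYS = -5 * 365

def schedule_slip_days(end_dates):
    """Days between the first and last reported end date.

    Returns None when fewer than two data points exist, so that a missing
    comparison is never reported as a variance of zero.
    """
    if len(end_dates) < 2:
        return None
    return (end_dates[-1] - end_dates[0]).days

def classify(end_dates):
    """Label a project's schedule history for review."""
    slip = schedule_slip_days(end_dates)
    if slip is None:
        return "insufficient data"
    if slip < SUSPECT_SLIP_DAYS:
        return "check date definition"
    return "ok"

print(classify([date(2020, 3, 31)]))
print(classify([date(2045, 3, 31), date(2016, 3, 31)]))
print(classify([date(2018, 3, 31), date(2019, 3, 31)]))
```

Keeping "no comparison possible" distinct from "no change" stops single data point projects from flattering the portfolio totals.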

Exploitation of the data

The data that the government releases is generally difficult to analyse and, when compared with international comparators, there is an opportunity for the UK government to embrace new MI and BI tools.

  • The original approved and baseline costs are not provided, therefore it is difficult to perform variance analysis. In the absence of this information we have compared the latest data against the forecast at the original data point. This means that if the information only has one data point then we are unable to conduct a variance assessment.
  • The procurement costs are not separated from the whole life costs so it is very difficult to differentiate between the costs of buying a capability and running it. The in year costs are of limited value. It would be helpful if government could revisit the scope of the data reported, otherwise, they are giving the illusion of transparency whilst providing very limited insights.
  • There are a number of exempt projects or projects with missing data, which has affected the analysis. We have filtered out any projects which have incomplete data.
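
The filtering step described in the last bullet can be sketched as follows. The field names and the convention that exempt values begin with the word "Exempt" are assumptions for illustration; the published spreadsheets would need to be checked for the actual markers. Returning the excluded rows alongside the complete ones makes it easy to report how much of the portfolio was left out.

```python
# Hypothetical field names for the values needed by the variance analysis.
REQUIRED_FIELDS = ("whole_life_cost", "end_date", "rag")

def is_missing(value):
    """Treat None, blank strings and exemption markers as missing (assumed convention)."""
    if value is None:
        return True
    text = str(value).strip().lower()
    return text == "" or text.startswith("exempt")

def split_complete(rows):
    """Split rows into (complete, excluded) based on the required fields."""
    complete, excluded = [], []
    for row in rows:
        incomplete = any(is_missing(row.get(field)) for field in REQUIRED_FIELDS)
        (excluded if incomplete else complete).append(row)
    return complete, excluded

# Hypothetical rows for illustration only.
rows = [
    {"project": "Project A", "whole_life_cost": 130.0, "end_date": "2020-03-31", "rag": "Amber"},
    {"project": "Project B", "whole_life_cost": "Exempt", "end_date": "2021-03-31", "rag": "Green"},
    {"project": "Project C", "whole_life_cost": 80.0, "end_date": "", "rag": "Red"},
]

complete, excluded = split_complete(rows)
print(len(complete), "complete;", len(excluded), "excluded")
```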