The Graph of Temperature vs. Number of Stations
This page explains the origin of a graph comparing the number of weather stations
around the world with the simple mean of the temperature data. I have shown this graph in
some of my own publications and it has been reproduced in a new book by Marlo Lewis.
As I have been asked about its origins a number of times I thought it would be simplest
to post a web page about it.
This is the graph:
You can get the Excel spreadsheet that created it below.
- I first saw the data behind the graph at the Intellicast web site in an
essay by Dr Dewpoint, aka Joe D'Aleo, their
former Chief Meteorologist. The essay is also available in pdf format
- I wrote to Joe, asking him for the data, which he sent to me. He explained that he
obtained it from http://www.ncdc.noaa.gov/cgi-bin/res40.pl?page=ghcn.html, using the
search criterion 'All stations'.
- The data Joe obtained were put into 3 categories: urban, suburban and rural. In the
spreadsheet I used to generate the graph, I construct the average temperature as weighted by the
number of stations in each group.
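The weighting described above can be sketched as follows. The station counts and group means here are made-up placeholders, purely for illustration; the real numbers come from the GHCN data in the spreadsheet.

```python
# Hypothetical station counts and group-mean temperatures for a single year.
# These numbers are invented for illustration, not taken from the GHCN data.
groups = {
    "urban":    {"stations": 1200, "mean_temp": 12.4},
    "suburban": {"stations": 2300, "mean_temp": 10.1},
    "rural":    {"stations": 3800, "mean_temp": 8.7},
}

total_stations = sum(g["stations"] for g in groups.values())

# Average temperature weighted by the number of stations in each group
weighted_mean = sum(
    g["stations"] * g["mean_temp"] for g in groups.values()
) / total_stations

print(f"station-weighted mean: {weighted_mean:.2f} C")
```

In the spreadsheet the same calculation is repeated for every year, so the weights shift as the mix of urban, suburban and rural stations changes over time.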
- A very similar station count graph is posted at the GISS website.
It is taken, in turn, from a 1997 paper by Peterson and Vose. However, that station count is smaller by about two-thirds, because the GISS (or GHCN)
processing removes duplicates, records with insufficient continuity, etc. The resulting station count graph nonetheless looks almost the same, with the same rapid fall around 1990.
- The graph clearly shows that there is a step up in the mean coincident with the sudden loss of over half the sampling sites around 1990.
- The temperature average in the above graph is unprocessed. Graphs of the 'Global Temperature' from places like GISS and CRU
reflect attempts to
correct for, among other things, the loss of stations within grid cells, so they don't show the same jump at 1990.
- The graph mainly serves to illustrate one of the challenges facing people who try to use land-based station data to construct a continuous
index of the global average temperature across the 1990 boundary.
Gridded data reflect processing intended to
(hopefully) remove the influence of problems such as the loss of
stations within grid cells. The point of the graph above
is that a change in the raw mean occurred coincident with the big
loss of stations in the early 1990s, which creates a problem of confounding. After the early
1990s the gridded series started behaving differently, i.e. going upwards, so that the 1990s became the warmest decade, etc.
Maybe the anomaly series are fully corrected for the problem of station
closure and the shift in the 1990s was climatic. Or maybe the anomaly
series are not fully corrected for the problem of station closure,
implying not all the shift in the 1990s data was climatic. To accept
the claims that the post-1990 anomaly index is continuous with the pre-1990 data, and only reflects a climatic change,
requires the assumption, as a maintained hypothesis, that any effects of the sudden
sample change around 1990 have been removed. It has puzzled me why this
assumption is not more rigorously tested by people whose research
depends on the optimistic interpretation of the gridded data.
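The confounding problem described above can be illustrated with a toy simulation. The numbers here are entirely artificial, as an assumption for illustration: stations are given fixed baseline temperatures and no warming trend at all, and the closures are assumed to fall disproportionately on the colder (e.g. high-latitude) stations.

```python
import random

random.seed(0)

# Invented station climatologies (deg C): no station warms at any point.
n = 1000
baselines = [random.gauss(10.0, 8.0) for _ in range(n)]

mean_before = sum(baselines) / len(baselines)

# Assume the closures hit the colder half of the network; the warmer
# half keeps reporting.
survivors = sorted(baselines)[n // 2:]
mean_after = sum(survivors) / len(survivors)

print(f"raw mean before closures: {mean_before:.2f} C")
print(f"raw mean after closures:  {mean_after:.2f} C")
# The step up in the raw mean is purely a sampling artifact: no station
# in this toy network ever warmed.
```

The point is only that a non-uniform change in the sample can, by itself, produce a step in an unprocessed mean; whether the gridded anomaly series fully remove such effects is the question raised above.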
- The loss of stations was not uniform around the world. Most stations were lost in the former Soviet Union, China, Africa and South America.
To see this visually, go to the
University of Delaware global temperature archive.
Click Available Climate Data; log in; under Global Climate Data select Time Series 1950
to 1999; then select Station Locations (MPEG file for downloading). Then sit and watch
the movie. The remarkable things are, first, how bad the spatial coverage is outside the
US and Europe, and second, what happens at 1990.
- As early as 1991, there was evidence that station closure beginning in the 1970s had added a permanent upward bias to the
global average temperature. Willmott, Robeson and Feddema ("Influence of Spatially Variable Instrument Networks on Climatic
Averages," Geophysical Research Letters, vol. 18, no. 12, pp. 2249-2251, Dec. 1991) calculated a +0.2C bias in the global average
due to pre-1990 station closures.
- Researchers doing trend regressions on globally-averaged temperature data should consider including an intercept/slope break
point at around 1990.
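One way to include such a break, sketched on synthetic data (the series and its step are invented here for illustration; a real application would use a gridded anomaly index):

```python
import numpy as np

# Synthetic annual series: a small trend plus noise, with an artificial
# 0.2 C step at 1990 added purely for illustration.
years = np.arange(1950, 2010)
rng = np.random.default_rng(42)
temps = 0.005 * (years - 1950) + rng.normal(0, 0.1, years.size)
temps[years >= 1990] += 0.2

post = (years >= 1990).astype(float)   # dummy = 1 from 1990 onward
t = years - years[0]

# Columns: constant, linear trend, post-1990 intercept shift,
# post-1990 change in slope
X = np.column_stack([np.ones_like(t), t, post, post * (years - 1990)])
beta, *_ = np.linalg.lstsq(X, temps, rcond=None)

print(f"estimated intercept shift at 1990: {beta[2]:.3f} C")
print(f"estimated slope change at 1990:    {beta[3]:.4f} C/yr")
```

Testing whether the break coefficients differ from zero gives a direct check on whether the series behaves differently across the 1990 boundary, rather than assuming continuity.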
- Pat Michaels and I published a paper that tests
whether homogeneity corrections in gridded data are adequate to remove non-climatic influences. We find they are not, and that
the nonclimatic effects add up to a net warm bias for the world as a whole.
- I have not found any discussion of the sudden loss of stations around 1990 in the most recent IPCC report. In the TAR there was a brief mention
in the Technical Summary, to the effect that if this rate of station closure keeps up it will make it difficult to continue detecting global warming.
In other words, the underlying assumption that the increase in average temperature is due to global climate change is not itself
subject to question; the problem created by station closure is only that it makes it hard to measure the phenomenon they know must be there.
- Weather satellites provide complete spatial coverage from 1979 to the present.
After seeing the Delaware video, an interesting question to ask would be: in the regions where the most surface data were lost (e.g. Russia),
were the temperature trends measured by satellites above or below the global average? That might give some indication of whether the regions
that are still well-sampled tend to have higher-than-average warming trends. This is not a study I plan to do, but hopefully someone will.
- I am using terms like 'global temperature' and 'average temperature' for shorthand. They are intrinsically
Go up to Publications and Papers Page
Return to Ross McKitrick's home page