# A SIMPLE REGRESSION MODEL - WITH AND WITHOUT CO2 EQUIVALENT

All scientists are well aware of the mantra “Correlation does not imply causation” and people have identified examples to illustrate this. Two spurious correlations I remember were one between the number of missionaries sent to Jamaica and the number of illegitimate babies born the following year, and another between the number of nesting storks and the number of babies born in some German towns. That said, correlation can indicate possible links; cancer and smoking being a case in point. When a physical explanation for a phenomenon is known then regression can also indicate the relative importance of different factors.

Scafetta’s models discussed at scepticalscience.com are examples of regression models which assume that global temperature can be explained by astronomical and climatic cycles superimposed on an underlying but unexplained trend. With some justification they have been dubbed ‘climastrology’.

To look at this in a bit more detail I’ve been playing around with two simple regression models. The first used three parameters: sunspots (SS - as a proxy for solar radiation), optical thickness (OT - to represent aerosols from volcanoes and elsewhere) and Atlantic Multidecadal Oscillation (AMO, as a representative climatic cycle). This is similar to the model Foster and Rahmstorf used to determine the underlying temperature trend from 1979 to 2010 with AMO replacing the El Nino index. The most questionable parameter in both cases is the use of natural climatic oscillations as independent variables. The regression parameters were calculated using the HadCRU3 temperature base and the LINEST function in Excel.

The resulting equation was:

Temperature = 0.62 * AMO + 0.00098*SS + 0.092*OT – 0.44

The equation has an r2 value of 0.19 and a standard error estimate of 0.24 °C. One anomaly is that the coefficient for the Optical Thickness is positive – implying that volcanoes would increase temperature! From these statistics you would not expect the agreement to be very good and it is not.
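For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the kind of fit LINEST performs: ordinary least squares with an intercept, followed by the r2 value and the standard error of estimate. The function name `fit_linear_model` and the synthetic inputs are hypothetical placeholders of mine; they are not the AMO, sunspot or optical thickness data used above.

```python
import numpy as np

def fit_linear_model(predictors, temperature):
    """Ordinary least squares with an intercept, similar in spirit to LINEST.

    predictors: array of shape (n_years, n_predictors), e.g. columns AMO, SS, OT
    temperature: array of shape (n_years,)
    Returns (coefficients, intercept, r_squared, standard_error).
    """
    # Append a column of ones so the last fitted value is the intercept
    X = np.column_stack([predictors, np.ones(len(temperature))])
    beta, *_ = np.linalg.lstsq(X, temperature, rcond=None)
    residuals = temperature - X @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((temperature - temperature.mean()) ** 2).sum())
    # Degrees of freedom: observations minus fitted parameters
    dof = len(temperature) - X.shape[1]
    r_squared = 1.0 - ss_res / ss_tot
    standard_error = (ss_res / dof) ** 0.5
    return beta[:-1], beta[-1], r_squared, standard_error

# Synthetic stand-in data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
demo_predictors = rng.normal(size=(150, 3))
demo_temperature = (demo_predictors @ np.array([0.5, 0.1, -0.2]) - 0.4
                    + rng.normal(scale=0.1, size=150))
coefs, intercept, r2, se = fit_linear_model(demo_predictors, demo_temperature)
```

Adding a further column to `predictors` is all that is needed to move from a three parameter to a four parameter regression.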

I then tried a four parameter regression, adding CO2. This gave the following equation.

Temperature = 0.49 * AMO + 0.00035 * SS - 0.15 * OT + 0.0082 * CO2 - 2.98

In this case the r2 value is 0.89 and the standard error of estimate is 0.091 °C. The coefficients are now in the right direction. For comparison I have also plotted the ensemble of IPCC climate models; in that case the standard error of estimate is 0.14 °C, rather worse than the regression model.

The regression model does as well as the IPCC ensemble in places where the IPCC ensemble performs well (the increase from 1970 to 2000) and does better in places where the IPCC ensemble is known to underperform (the increase from 1910 to 1945, the slight decline from 1945 to 1970 and the levelling off from 2000 to 2011). It is clear that the IPCC models will improve dramatically when they are able to simulate climate oscillations.

After I had developed this model I remembered that CO2 is not the only greenhouse gas, so I modified it to use CO2 equivalent (CO2Eqv). The data were taken mainly from the GISS site (http://data.giss.nasa.gov/modelforce/ghgases//GHGs.1850-2000.txt) updated to 2011. The new model gave the following regression:

Temperature = 0.55 * AMO + 0.000049 * SS -0.35 * OT + 0.0177 * CO2Eqv - 2.98

In this case the r2 value is 0.90 and the standard error of estimate is 0.087 °C - a slight improvement. One interesting difference between the models is that in this model the influence of sunspots is considerably reduced and that of optical thickness (aerosols) is increased.

A final point – which may or may not be significant but I throw it in for fun. The coefficient for CO2Eqv is 0.0177. During that period CO2Eqv increased from 537.4 to 956.8 ppm and accounted for 0.74 °C of warming. The ratio of the increase of CO2Eqv was 1.78. This implies a CO2Eqv sensitivity of 0.89 °C for a CO2Eqv doubling (0.74 × log(2) / log(1.78)), a figure close to most estimates. Given the warning about not reading too much into correlations with which I started this piece, it should be treated with caution.
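The arithmetic behind that sensitivity figure can be checked with a few lines. This is only a sketch: it takes the 0.74 °C and the two concentrations straight from the paragraph above and assumes, as the formula does, that warming scales with the logarithm of concentration. The variable names are mine.

```python
import math

co2eqv_start = 537.4   # ppm, from the text
co2eqv_end = 956.8     # ppm, from the text
warming = 0.74         # degrees C attributed to CO2Eqv in the text

ratio = co2eqv_end / co2eqv_start
# Assumption: the temperature response is proportional to log(concentration),
# so the observed rise corresponds to this fraction of one doubling
doublings = math.log(ratio) / math.log(2)
sensitivity_per_doubling = warming / doublings

print(round(ratio, 2), round(sensitivity_per_doubling, 2))  # prints 1.78 0.89
```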

The modelling using CO2 equivalent was added after the initial post. The final figure with corrected labelling was replaced on 4 March.

# SMOOTH OPERATORS

In the IPCC Technical Assessment Report of 2007 many of the graphs used smoothed values of data series. This is perfectly valid as it facilitates seeing trends among the year-to-year fluctuations. The methods used are described in 'Appendix 3.A: Low-Pass Filters and Linear Trends'. For annual data they use a filter with 13 weights: 1/576 × [1, 6, 19, 42, 71, 96, 106, 96, 71, 42, 19, 6, 1]. An example of this is in figure 3.8. It is interesting to note that they have used an algorithm which allows smoothing right to the end of the data series.

We give below the three main global temperature series using the 13 point filter. At the end of the series we used only the part of the filter which applies retrospectively. This curve appears to show that the rate of temperature increase has fallen off in recent years.
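A minimal sketch of this smoothing is given below. Near either end of the series it uses only those weights that fall on available data; dividing by the sum of the weights actually used (rather than by 576) is my assumption about how to keep the truncated filter properly scaled, and the function name `smooth_13_point` is hypothetical.

```python
# The 13 weights from the IPCC appendix; they sum to 576
WEIGHTS = [1, 6, 19, 42, 71, 96, 106, 96, 71, 42, 19, 6, 1]

def smooth_13_point(values):
    """Apply the 13-point filter to a list of annual values.

    Where the full window does not fit (the first and last six years),
    only the overlapping weights are used and the result is renormalised
    by the sum of those weights (an assumption, see the text above).
    """
    half = len(WEIGHTS) // 2
    smoothed = []
    for i in range(len(values)):
        total = 0.0
        weight_sum = 0
        for k, w in enumerate(WEIGHTS):
            j = i + k - half          # index of the value this weight applies to
            if 0 <= j < len(values):  # skip weights that fall off either end
                total += w * values[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed
```

With this end treatment the final smoothed value depends only on the last seven years, so it is more sensitive to the most recent data than points in the middle of the series.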

The same annex also mentions using regression to estimate trends. For simplicity we have done this using 13-year series and the LINEST algorithm in Excel.
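The same calculation can be sketched outside Excel as a least-squares slope fitted to each run of consecutive years. The function names are hypothetical, and labelling each trend by the final year of its window is my assumption rather than anything stated in the annex.

```python
def linear_trend(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def rolling_trends(years, temps, window=13):
    """Trend (degrees C per year) for each run of `window` consecutive years.

    Returns a dict keyed by the last year of each window (an assumption).
    """
    trends = {}
    for end in range(window, len(years) + 1):
        xs = years[end - window:end]
        ys = temps[end - window:end]
        trends[years[end - 1]] = linear_trend(xs, ys)
    return trends
```

Calling `rolling_trends(years, temps, window=30)` gives the longer trend period instead.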

Using the 13-year period for trends gives the clear impression that the rate of warming has slowed dramatically. Perhaps as a result of this, it is now being suggested that we need at least 30 years to detect a trend. (Though, it should be noted that I can find no reference to the need for a 30-year trend in the IPCC report.) In the following graph we have plotted the 30-year trend lines and the 30-year trend from the average of a model ensemble.

Even using the 30-year trend it is clear that the rate of temperature increase has fallen back. This graph also shows that the modelled 30-year trend was close to the observed one for the period 1975 to 2005 but diverged widely outside that period.

# THE HOCKEY STICK AND THE CLIMATE WARS

This book was better than I expected. My expectation was based not on Mann’s reputation/caricature, but on his previous book “Dire Predictions” co-authored with Lee Kump which we have reviewed previously.

On this site and in blog contributions I’ve recently been trying to promote two themes. The first is that the climate science community is weakening its case by trying to ignore inconvenient facts and data; it then gets doubly blamed for the cover-up and the inconsistency between their claims and the data. The second theme is that when presenting climate science to a largely lay audience, as Mann does here, a scientist has to be more careful than in a published paper. A paper will be thoroughly scrutinised by other scientists; lay people will not know if the wool is being pulled over their eyes.

Much of the book deals with the ‘climate wars’ aspect of the title and the fact that few attacks on climate scientists fail to include Mann or his ‘Hockey Stick’. In the USA the whole issue of climate is much more divisive than in the UK. In the UK the Climate Change Bill, mandating a reduction of 80% in CO2 emissions by 2050, was passed with only 5 votes against. At a recent lecture the Director of the Grantham Institute for Climate Change, Sir Brian Hoskins, was very frank about the shortcomings in climate science. None of this would be possible in the USA. And it goes a long way to explain why Mann devotes time to this topic.

One of my complaints about his previous book was that Mann blithely ignored any criticisms of his work. In this book he tackles some of them – even if not always head on. One example was the use of the word ‘censored’. When the data used for his original millennial temperature reconstruction was released it contained a folder called ‘censored’. This was regarded by anthropogenic global warming antagonists as proof of malfeasance; in reality it is a normal statistical term used to define a data sub-set excluded to test its importance to the overall conclusion. Another area he deals with is what he refers to as the ‘divergence problem’. This is the fact that from about 1960 onwards most tree rings fail to respond to global warming. In the case of his own work he simply says that his data sets ended in the 1970s and 1980s and claims that the idea of adding the observed temperature for recent years to bring the data up to date, and increasing the hockey stick appearance, was suggested by a reviewer. Elsewhere he deals with a “high-elevation site in western United States”, without actually calling the trees ‘bristle cone pines’, and accepts that their growth rates could have been influenced by CO2 enhancement rather than temperature increases. This had been a criticism of his record. Another criticism of his work had been that one of the proxy records, sediments from a lake in Finland, had not only been corrupted by upstream engineering works but had been used ‘upside down’. In one of his comments on this Mann says “one of our methods didn’t assume orientation, while the other used an objective procedure for determining it”. This appears to be an admission that the orientation might not have been correct, though elsewhere he says that this record did not change the overall conclusions.

So, if other climate scientists might have understood the oblique references in the book, how might the public react to it? Well of course few of them would have picked up the allusions and would quite possibly have been unaware of the significance of some of the statements. Another objection to proxy records is that, for statistical reasons, they underestimate the variability of the parameter they are estimating. Mann recognises this and explains that this is a reason for the wide error bands. It is quite possible that increases in temperature such as those from 1910 to 1945 or 1975 to 2005 might have occurred in the past but not have registered in the proxy record. Again few members of the public reading this book would have understood the point; they would simply have seen the ‘blade’ of the hockey stick and not realised that the handle could have been as curvy as the blade. Another example of misleading the public is the graph he presents of a projection of temperature made in 1988: he only includes data “available through 2005 in this analysis” even though later data were available at the time of writing the book and show the projection as being less accurate.

Elsewhere I have argued that there was a need for a book in a popular style to combat the popular books of AGW antagonists – this is indeed such a book. What is now needed is a book which arbitrates between the two sides.

Author: Michael E. Mann

Publisher: Columbia University Press, 2012
E-ISBN: 978-0-231-52638-8

# PRECIPITATION PROJECTIONS

This is the time of year when climate data sets are updated to include annual totals for the preceding year (in this case 2011). Most sites concentrate on temperature - though sometimes include not just observed atmospheric temperature but also variables such as modelled projections and temperature in the oceans. One variable which is often forgotten is precipitation. After all, the positive feedback from water vapour assumes that it remains in the atmosphere rather than becoming precipitation.

On the chart below we use two data sets. The first is the NCDC 5° gridded precipitation anomaly at http://www1.ncdc.noaa.gov/pub/data/ghcn/v2/grid/grid_prcp_1900-current.dat.gz. To get a monthly global figure we averaged the data, cosine weighted on latitude to compensate for reducing grid sizes. The values are in millimetres. The second data set was of hind-cast/projected precipitation downloaded from the Climate Explorer web site at http://climexp.knmi.nl. The data set used was described as “all models, 20c3m/sresa1b” and included 23 models. These data were in mm/day, so to convert them to equivalent units they were multiplied by the number of days in the month. They were adjusted to give values relative to the period 1980 to 2010. As trends were masked by month-to-month variations the five year centred moving averages are also plotted.
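The two processing steps described above, cosine-weighted averaging and the conversion from mm/day to mm/month, can be sketched as follows. The function names and the use of `None` to mark cells without data are my own illustrative choices and say nothing about how the NCDC file itself encodes missing values.

```python
import calendar
import math

def global_mean(grid, lat_centres):
    """Cosine-of-latitude weighted mean of one month's gridded anomalies.

    grid: one list of cell values per latitude band (None = no data)
    lat_centres: latitude in degrees of the centre of each band
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(grid, lat_centres):
        # Grid cells shrink towards the poles in proportion to cos(latitude)
        w = math.cos(math.radians(lat))
        for value in row:
            if value is None:
                continue
            weighted_sum += w * value
            weight_total += w
    return weighted_sum / weight_total

def mm_per_day_to_mm_per_month(rate, year, month):
    """Convert a modelled rate in mm/day to a monthly total in mm."""
    days_in_month = calendar.monthrange(year, month)[1]
    return rate * days_in_month
```

Because cells with no data are skipped, the average is over the observed cells only, so coverage gaps can still bias the global figure.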

This shows, as stated in the IPCC TAR4 report, that the variance of the simulated precipitations is less than that of the observed values (TAR4 section 9.5.4.2.1). The difference, based on the 5-year moving average, is as high as 2 mm/month, which is equivalent to 1.88 W/m2. (1 mm of evaporation over 1 m2 weighs 1 kg. The latent heat of evaporation of 1 kg of water is 2.45 MJ. 1 kWh is 3.6 MJ.)
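The conversion from mm/month to W/m2 can be written out as a short sketch. The latent heat figure is the one quoted above; the average month length is my assumption, so the result lands close to, but not exactly on, the 1.88 W/m2 quoted.

```python
LATENT_HEAT_J_PER_KG = 2.45e6            # 2.45 MJ per kg, from the text
SECONDS_PER_MONTH = 365.25 / 12 * 86400  # assumption: average month length

def mm_per_month_to_w_per_m2(mm_per_month):
    """Energy flux equivalent to evaporating the given depth of water."""
    # 1 mm of water over 1 m2 has a mass of 1 kg
    joules_per_m2 = mm_per_month * LATENT_HEAT_J_PER_KG
    return joules_per_m2 / SECONDS_PER_MONTH

flux = mm_per_month_to_w_per_m2(2.0)  # roughly 1.9 W/m2
```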

# FUN WITH NUMBERS

One theme of this site over the last month or so has been the misleading use of numbers. This is very much related to the ongoing discussion of whether warming has stopped or not. One side, the Anti-AGW, says it has stopped and therefore global warming is not happening and never will, conveniently forgetting that a lot of it has already occurred. The other side, the Pro-AGW, says warming is still happening and introduces new metrics to demonstrate this.

In an age when every ‘scandal’ unearthed by budding ‘Woodward and Bernsteins’ is given a ‘gate’ suffix it is easy to forget that President Nixon was forced from office not for the original crime of breaking into offices of the Democratic Party but for the cover-up. I wonder if something similar is happening with the Climate Science Community at the moment. I have, in earlier blogs, identified examples of presentations and sites trying to prove that the models’ predictions were accurate which have left off data from the recent years (though internal evidence shows that they could have included it). Yes, it would be inconvenient to admit that the current climate stasis was not predicted but playing with numbers to try and disguise it opens the Climate Science Community to charges of intellectual dishonesty.

Below I ‘prove’ that there has been no increase in temperatures since 1998. I did this by assuming that temperatures from 1998 onwards have been flat and calculating the temperature perturbation from the flat line. I then regressed this perturbation, after adjusting for residual seasonality, against sunspots, aerosols and a multivariate ENSO index and got the following graph.

Wow – there’s no warming! Eat your heart out you pro-AGW crowd. But, how did I prove it? I cheated. Firstly the coefficients I derived suggested that higher solar irradiance is associated with cooling and higher aerosols are associated with warming; only the El Nino index behaved sensibly. Then I adjusted the format of the equation so the coefficient was shown to only two places of decimals. As the data are monthly this means I could have had warming of 0.0049 degrees/month (about 0.06 degrees a year, or 0.6 degrees a decade) and still shown ‘zero’ warming. What is more, if I’d played around a bit more with lag times, different indices, and different amounts of smoothing I could have produced an even more convincing, but even more dishonest, proof.

I’ve been completely upfront about what I have done but as long as the Climate Science Community tries to avoid the reality of climate stasis they will face a barrage of similar ‘proofs’ but with any underlying falsification being undisclosed.

# WHEN IS A TREND NOT A TREND?

There has recently been discussion on the blogosphere as to whether or not global temperatures are continuing to rise, have levelled off, or are falling. For example the ‘Met Office in the Media’ site had a rebuttal of an article in the (United Kingdom) Daily Mail which, inter alia, claimed there had been no warming for 15 years (http://metofficenews.wordpress.com/2012/01/29/met-office-in-the-media-29-january-2012/). In their rebuttal the Met Office pointed out that most of the highest temperatures on record have occurred in the last 15 years. They also presented a chart of temperature in 10-year blocks which showed that the first decade of the 21st century was the warmest on record.

In the IPCC Technical Assessment Report of 2007 many of the temperature graphs are smoothed using a 13-point binomial moving average (for example figure 3.8). Elsewhere they talk of decadal smoothing. One of the contributors to the blog in support of the Met Office’s position refers to the ‘Skeptical Science’ web site (http://www.skepticalscience.com/global-warming-stopped-in-1998-intermediate.htm). Here they show temperature using an unweighted 11-year moving average but do not include data after 2007. The data they do show have a positive trend. (This is strange as that page of the web site has been updated to include a 2011 paper by Foster and Rahmstorf, which we have discussed elsewhere (http://www.climatedata.info/Discussions/Discussions/opinions.php?id=3871334005763196947), which purports to compensate for the effect of solar irradiance, volcanoes and El Nino.)

In the graph below we show the HADCRU3V data set (normalised to 1979 to 2008 for compatibility with satellite temperature data series). We have included three smoothed series: a 13-point binomial using the weights in the IPCC report, an 11-point unweighted moving average and a 15-year unweighted moving average.

As can be seen, only the 15 year moving average suggests that temperatures are continuing to rise. At the time of the IPCC report in 2007, the authors were happy with a 13 point smoothing or decadal smoothing. Now that these statistics no longer show a rising trend they are finding new ways of presenting the data to claim that the trend continues. I genuinely believe that rather than moving the goal posts the Met Office would better maintain its credibility by admitting that, at the very least, the rate of temperature rise has slowed. The basic truth of course is that extrapolating from the past climate says little about the future (year on year temperatures are correlated so the temperature in one year does have a bearing on following years).

[The following material was added on 3 February 2012.]

This post was, in part, prompted by the exchange on the Met Office site mentioned above. As part of the moderator's reply he referenced a Met Office document of 2010:
http://www.metoffice.gov.uk/media/pdf/m/6/evidence.pdf

This did accept that the rate of temperature rise in the past decade was slower than in the immediately preceding decades and postulated some tentative ideas as to why this might have happened. However it also contained some other metrics which were presented in such a way as to minimise the impact of this fact. Some of them I have commented on in the blog itself. One of them was a plot of decadal temperatures which showed that, despite the slowdown in the rate of temperature increase, the decade from 2000 to 2009 was the warmest on record. Below we give a plot of the decadal rate of change of temperature in °C per year for both the observed temperatures (HadCRU3V annual average values) and the average of 7 climate models combining the 20c3m and a1b scenarios to cover the period 1870 to 2010.

This shows two things. First, while the rate of rise in the 1990s was the highest on record, there were other decades in the past with rises of the same order of magnitude. Second, the models (which get the overall rise fairly accurately) do not represent the rate of rise at a decadal time scale at all well. My purpose in this posting is not to 'rubbish' the models; it is to suggest that selecting data to agree with the science is the wrong approach. The right approach is to admit shortcomings in the science and work on improving it.

# DISTORTION BY DELETION

Recently there has been a thread at Skeptical Science (www.skepticalscience.com) in which they claim that Pat Michaels has distorted the science of climate change by deleting parts of figures. I hold no brief for Michaels and am not intending to comment on the accuracy or otherwise of the claims. However, if the pot is calling the kettle black it had better make sure that it is burnished and spotlessly shining.

A few months ago I watched Michael Mann’s presentation to TEDx in November 2011, uploaded on 5 December. In it he shows a plot of the predictions of Hansen in 1988. This is from the same source as the graph from which Michaels removed two of the lines. The following slide is from a screen dump of the presentation.

This shows Hansen’s 1988 prediction up to 2019 and observed temperatures up to 2005. Since the presentation was given in November 2011 it would have been possible to include 2010 and, to a high degree of accuracy, the likely value for 2011. In his 2006 paper Hansen also shows his projection with both “station data” and “land-ocean data”. The plot below shows observed data updated to 2011 for both observed records (the 2011 value is provisional) and Hansen's projection digitised from his 2006 paper.

When both data series are included and the data are not truncated to 2005 they tell a slightly different story.

# SHORT TERM VARIATIONS IN TEMPERATURE

In a recent paper Foster and Rahmstorf (F and R, Global temperature evolution 1979–2010) examine the influence of three factors which introduce variability to the temperature record: El Nino/La Nina, volcanoes and Total Solar Irradiance (TSI). They chose to represent the El Nino/La Nina effect by the Multivariate El Nino Index (MEI), volcanoes by Aerosol Optical Thickness Data and TSI by sunspot number. They describe their regression as "the multiple regression includes a linear time trend, MEI, AOD, TSI and a second-order Fourier series with period 1 yr." Effectively they took the temperature perturbation to be the difference from a linear trend plus an allowance for seasonal effects. They chose the period 1979 to 2010 as this included two satellite temperature records in addition to three records based on measurements. They concluded that adjusting the temperature records showed that the underlying temperature trend was upwards for the whole period.
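To make the quoted description concrete, the sketch below builds one row of regressors for a single monthly observation. It is my reading of the sentence quoted from the paper, not F and R's own code: the function name, the explicit intercept column and the use of decimal years for time are assumptions, and any lags applied to the three factors are left out.

```python
import math

def design_row(t_years, mei, aod, tsi):
    """Regressors for one monthly observation at time t_years (decimal years).

    Columns: intercept (assumed), linear trend, MEI, AOD, TSI, then the
    cosine and sine terms of a second-order Fourier series with a period
    of one year.
    """
    angle = 2.0 * math.pi * t_years  # one full cycle per year
    return [
        1.0, t_years, mei, aod, tsi,
        math.cos(angle), math.sin(angle),          # annual harmonic
        math.cos(2 * angle), math.sin(2 * angle),  # semi-annual harmonic
    ]
```

Stacking one such row per month gives the matrix that would be regressed against temperature; the four Fourier columns are what absorb any residual seasonal cycle.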

However data for all three variables are available from 1950 and, for some variables, much earlier. We therefore examined the effect of the three factors on temperature from 1950 to 2011.

As we could not assume a linear temperature trend for the whole of the period 1950 to 2011 we calculated the adjustment (or perturbation) as the difference between the three month mean and the 60 month mean. The three month mean represented the short term effect and the 60 month mean the underlying temperature trend. We worked only with HadCRU3V global temperature series. The need to use the 60 month mean means that our series is truncated relative to the F and R series.
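A minimal sketch of that adjustment follows. Centring both means on the month in question is my assumption (with an even window of 60 the centring can only be approximate), and the function names are hypothetical.

```python
def centred_mean(values, window, i):
    """Mean of `window` values roughly centred on index i.

    Returns None where the window would run off either end of the series.
    For an even window the span is i - window/2 .. i + window/2 - 1.
    """
    start = i - window // 2
    end = start + window
    if start < 0 or end > len(values):
        return None
    return sum(values[start:end]) / window

def perturbation(monthly_temps, short=3, long=60):
    """Short-term mean minus long-term mean for each month (None at the ends)."""
    result = []
    for i in range(len(monthly_temps)):
        short_mean = centred_mean(monthly_temps, short, i)
        long_mean = centred_mean(monthly_temps, long, i)
        if short_mean is None or long_mean is None:
            result.append(None)
        else:
            result.append(short_mean - long_mean)
    return result
```

The `None` values at each end are the truncation mentioned above: roughly 30 months are lost at the start and at the end of the record.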

The first chart shows the adjustment calculated by this method.

As can be seen the adjustment clearly represents short term effects such as El Nino and volcanoes.

We then regressed the adjustment against the same three variables as F and R. This chart shows a linear trend line between the adjustments and the values calculated from the regression equation. The r2 value was 0.3564, a bit lower than the best of the values found by F and R but covering a longer period.

The final plot shows the observed and adjusted temperature series.