THE POST HAS BEEN UPDATED. SEE THE UPDATE AT THE END OF THE POST:
Alternate Title: An Average of Climate Models, Which Individually Give Wrong Answers, Cannot, By Averaging Them, Give the Right Answer, So A Model Mean Can Be Very Misleading. And The Use of Anomalies in Model-Data Comparisons, When Absolute Values Are Known, Can Also Be Very Misleading
For the purposes of examples only, I’m going to initially present comparison graphs of monthly global land-ocean surface temperature data and model outputs, all of which include annual cycles of roughly 3.5 to 4 deg C. Why am I presenting model-data comparisons with overlapping annual cycles, you ask? There’s something very unusual about a couple of ensemble members from the CMIP5 archive (the simulations with historic and RCP8.5 forcings) when you compare them to the Berkeley Earth data. You’ve got to see this to believe it! I couldn’t make this up. And they provide wonderful lead-ins to discussions of one of the climate science community’s favorite presentation devices, the multi-model mean, and of anomalies.
The climate science community regularly averages the outputs of climate model simulations of Earth’s climate for use in scientific studies. For example, they may call the average a multi-model mean or a multi-model ensemble-member mean, depending on the groups of model outputs they’re averaging. I’m providing the model-data comparisons, because they provide wonderful examples of why averaging a bunch of models that individually give the wrong answers CANNOT hope to provide the correct answer. The model mean is simply an average of models that provide the wrong answers. Or, if you like, the climate model mean is a consensus of wrong answers, with some more wrong than others. And that, after a good number of decades, is what we expect from climate science—a consensus of wrong answers—because the foundation of climate science is global politics.
THE DATA AND CLIMATE MODELS
I’m using the monthly land+ocean surface temperature data from Berkeley Earth, because Berkeley Earth, on their global land+ocean surface temperature data page for that product, provides monthly factors that can be added to their monthly anomaly data by users in need of the global mean surface temperature data in absolute, not anomaly, form. Sadly, Berkeley Earth does not specifically state the source of those absolute values. We’ll discuss that later on in the post. [See the update at the end of the post.] Regardless, the model-data presentations are only for example purposes, so let’s proceed.
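The conversion from anomaly to absolute form is simple arithmetic: each monthly anomaly gets the factor for its calendar month added to it. Here is a minimal Python sketch of that step. The function name and every number in it are placeholders of mine for illustration, not Berkeley Earth’s published factors or data.

```python
def anomalies_to_absolute(anomalies, monthly_factors):
    """Convert (year, month, anomaly) records to absolute temperatures.

    anomalies: list of (year, month, anomaly_deg_c) tuples, month in 1..12.
    monthly_factors: list of 12 values, the absolute temperature (deg C)
        to add for January..December. Placeholder values are used below.
    """
    if len(monthly_factors) != 12:
        raise ValueError("expected one factor per calendar month")
    return [
        (year, month, anomaly + monthly_factors[month - 1])
        for year, month, anomaly in anomalies
    ]


# Hypothetical numbers for illustration only (not real data).
example_factors = [12.0, 12.5, 13.0, 14.0, 15.0, 15.5,
                   16.0, 15.5, 15.0, 14.0, 13.0, 12.5]
example_anomalies = [(1850, 1, -0.5), (1850, 7, -0.25)]
example_absolute = anomalies_to_absolute(example_anomalies, example_factors)
```

With real data, the twelve factors would be read from the Berkeley Earth data page rather than typed in.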
Climate model outputs from the CMIP5 archive, which was used by the IPCC for their 5th assessment report, are available from the KNMI Climate Explorer. For the models in this post, I’m using models with historic and RCP8.5 forcings, and I’ve selected the climate model ensemble members that provide the warmest and coolest average surface temperature during the period of 1850 to 1900, which the IPCC now uses for pre-industrial conditions. That is, of the simulations of global mean Surface Air Temperatures (TAS), from 90S-90N, from the 81 individual ensemble members, these two examples provide the warmest and coolest global mean surface temperatures during the period of 1850 to 1900. The coolest (lowest average absolute GMST for the period of 1850-1900) is identified as IPSL-CM5A-LR EM-3 at the KNMI Climate Explorer, and the warmest (highest average absolute GMST for the period of 1850-1900) is identified there as GISS-E2-H p3. The average global mean surface temperatures for the other 79 ensemble members during preindustrial times reside somewhere between the two ensemble members shown in this post. The two ensemble members are the same model outputs used in the recent post What Was Earth’s Preindustrial Global Mean Surface Temperature, In Absolute Terms Not Anomalies?. The WattsUpWithThat cross post is here.
As you’ll recall from that post, for the IPCC-defined pre-industrial period of 1850 to 1900, there is a 3-deg C difference between the global mean surface temperatures of the ensemble member IPSL-CM5A-LR EM-3 (at 12.0 deg C) and the ensemble member GISS-E2-H p3 (at 15.0 deg C). You might say, they’ve got the actual global mean surface temperature surrounded.
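For readers who want to repeat the selection, the following Python sketch ranks ensemble members by their 1850-1900 mean and picks the coolest and warmest. It assumes each member has already been reduced to annual global means. The toy series are constants set to the two baseline values quoted above (12.0 and 15.0 deg C), and the third member is entirely made up. The function names are mine.

```python
def baseline_mean(series, start_year, end_year):
    """Average a {year: temperature} mapping over start_year..end_year inclusive."""
    values = [t for y, t in series.items() if start_year <= y <= end_year]
    if not values:
        raise ValueError("no data in baseline period")
    return sum(values) / len(values)


def coolest_and_warmest(members, start_year=1850, end_year=1900):
    """Return (coolest_name, warmest_name) ranked by baseline-period mean."""
    means = {name: baseline_mean(s, start_year, end_year)
             for name, s in members.items()}
    return min(means, key=means.get), max(means, key=means.get)


# Toy ensemble: constant series at the two baseline means quoted in the post,
# plus one made-up member in between. Only 12.0 and 15.0 come from the post.
years = range(1850, 1901)
toy_members = {
    "IPSL-CM5A-LR EM-3": {y: 12.0 for y in years},
    "hypothetical member": {y: 14.0 for y in years},
    "GISS-E2-H p3": {y: 15.0 for y in years},
}
coolest, warmest = coolest_and_warmest(toy_members)
two_member_mean = (baseline_mean(toy_members[coolest], 1850, 1900)
                   + baseline_mean(toy_members[warmest], 1850, 1900)) / 2
```

The average of the two extremes works out to 13.5 deg C, midway between two members that are 3 deg C apart.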
MODEL-DATA COMPARISONS 1 – BERKELEY EARTH VERSUS GISS-E2-H p3
Figure 1 is a model-data comparison of monthly global mean surface temperatures in absolute form. With the overlaps of the model and data annual cycles, it’s pretty difficult to see what’s so unusual about the model output and the data.
Figures 2 and 3 present the model and data individually. It’s still difficult to see what’s so unusual.
So, to make it easier to see, in Figure 4, I’ve illustrated the two graphs (Figures 2 and 3) side by side, with the data on the left, and the GISS ensemble member on the right.
That’s right! The global mean surface temperatures in the GISS-E2-H p3 ensemble member are so warm that the 1850s in the ensemble member output align with the most-recent decade (2008-2017) of the data’s global mean surface temperatures. So we can say the GISS-E2-H p3 ensemble member is not only off in terms of global mean surface temperature (too high), it is also off in terms of time. That is, based on the time periods when the model output and data overlap, the GISS-E2-H p3 ensemble member is simulating surface temperatures for some time in the future with respect to the observations-based data, not the same time period as the data.
And to confirm that the 1850s in the model align with the most-recent decade (2008-2017) of the data’s global mean surface temperatures, see Figure 5. In it, I’ve compared the 10-year-average annual cycles in monthly model and data global mean surface temperatures, with 2008-2017 used for the data and 1850-1859 used for the GISS-E2-H p3 ensemble member.
If it weren’t for the nearly 160-year difference in time, I’d be willing to admit that there’s a reasonable agreement in the annual cycles. Unfortunately for the GISS-E2-H p3 ensemble member, that 158-year difference between the model and data does exist.
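The 10-year-average annual cycle used in that comparison is just the mean of each calendar month over the chosen decade. Here is a minimal Python sketch of the calculation, run on a synthetic series with an exaggerated year-to-year offset so the expected averages are easy to check by hand. Nothing in it is real model output or real data, and the function name is mine.

```python
def mean_annual_cycle(records, start_year, end_year):
    """Average each calendar month over start_year..end_year inclusive.

    records: iterable of (year, month, temperature) tuples, month in 1..12.
    Returns a list of 12 monthly means (January first).
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for year, month, temp in records:
        if start_year <= year <= end_year:
            sums[month - 1] += temp
            counts[month - 1] += 1
    if 0 in counts:
        raise ValueError("at least one calendar month has no data")
    return [s / c for s, c in zip(sums, counts)]


# Synthetic series for illustration: a fixed seasonal shape plus a
# per-year offset (exaggerated), so the decade means are easy to verify.
shape = [0.0, 0.5, 1.0, 2.0, 3.0, 3.5, 4.0, 3.5, 3.0, 2.0, 1.0, 0.5]
synthetic = [(y, m, 12.0 + (y - 1850) + shape[m - 1])
             for y in range(1850, 1870) for m in range(1, 13)]
cycle_1850s = mean_annual_cycle(synthetic, 1850, 1859)
```

To reproduce a comparison like Figure 5, the same function would be called once on the data for 2008-2017 and once on the ensemble member for 1850-1859.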
MODEL-DATA COMPARISONS 2 – BERKELEY EARTH VERSUS IPSL-CM5A-LR EM-3
Now, in Figures 6 through 10, we’ll run through a similar sequence of model-data comparison graphs for the Berkeley Earth global mean surface temperature data and the IPSL-CM5A-LR EM-3 ensemble member’s simulation of it.
Yup! That’s right. As shown in Figures 9 and 10, the global mean surface temperatures of the first ten years of the data (1850-1859) align with the global mean surface temperatures of the last ten years of the IPSL-CM5A-LR EM-3 ensemble member (2008-2017). The IPSL-CM5A-LR EM-3 ensemble member could also be said to be off in terms of time, but in this case, based on the point at which the model and data align, the ensemble member is almost 16 decades too soon with respect to the observations-based data.
As I said in the opening, you’ve got to see this to believe it! I couldn’t make this up.
WHY PRESENTING A “MULTI-MODEL MEAN” AND THE USE OF ANOMALIES INSTEAD OF ABSOLUTE VALUES CAN BE MISLEADING, ESPECIALLY WHEN THE ABSOLUTE VALUES ARE KNOWN
Above, using two worst-case examples, we’ve seen how poorly two CMIP5-archived climate models actually simulate global mean surface temperatures as represented by data. Now it’s time for the catch. We’re assuming the values of the adjustment factors provided by Berkeley Earth for their data are correct. Berkeley Earth doesn’t cite the source of the adjustment factors. If they do somewhere and I’ve missed it, please correct me and provide a link in the comments. [See the update at the end of the post.]
We may get an idea of the source from Figure 11. In it, for the commonly used period of 1951-1980, I’ve compared the average annual cycles of the Berkeley Earth global mean surface temperature data and the average annual cycles of the two CMIP5 ensemble members (IPSL-CM5A-LR EM-3 and GISS-E2-H p3) along with the average of those two ensemble members (a.k.a. the model mean).
Based on how closely the average of the two extremely poor ensemble members matches the data, I suspect Berkeley Earth used the model mean of one of the groups of historic simulations associated with one of the RCP scenarios. I don’t know for sure, but I’ll look. Maybe one of the regular denizens at WattsUpWithThat who work with Berkeley Earth will provide the answer and save me some time. [See the update at the end of the post.]
Regardless, the graphs in this post were provided as examples…fun examples. I could just as easily have used modeled and observed sea surface temperatures where the data are in absolute form and also furnished in anomaly form.
Figure 12 includes two model-data comparisons of global mean surface temperatures in time-series format for the period of 1850 to 2017. In the top graph, the data and ensemble member outputs are presented in absolute form, while, in the bottom graph, they’re presented in anomaly form, referenced to the often-used period of 1951-1980.
As noted at the bottom of Figure 12, If You Were a Climate Scientist And You Wanted to Illustrate How Well a Group of Terrible Climate Models Simulated Global Mean Surface Temperatures—Or Any Other Metric—Would You Present Them in Absolute or Anomaly Form? Also, Would You Present the Model Mean or the Scattered Individuals? Consider that the next time you read any climate science report with model-data comparisons.
Again, in the top graph, even though the individual ensemble members have provided the wrong answers when compared to the data, when we average the wrong answers, we get something close to the correct answer as represented by the data.
Let’s put that in perspective as it relates to the oft-cited climate model ensemble members that use historic forcings to simulate past climate as far back as 1850 and RCP8.5 forcings for computer-aided, crystal-ball-like prognostications of the future under an unrealistic future scenario. A multi-model mean of 81 ensemble members, where the models are all giving wrong answers, some worse than others, simply provides us with a consensus of the wrong answers. And to compound that, the RCP8.5 scenario is, more and more often as days go by, being found to be unrealistic.
And in the bottom graph, the same climate model outputs and data are compared, but this time in anomaly form. Look at how much better the models appear to simulate the long-term observed global mean surface temperatures. Keep those two graphs in mind the next time someone presents a model-data comparison with the data and model outputs presented in anomaly form…and try not to laugh.
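The arithmetic behind that visual improvement is easy to show. Converting to anomalies subtracts each series’ own 1951-1980 mean, so any constant offset between series disappears. The Python sketch below uses two synthetic series that share a shape but sit 3 deg C apart. The numbers are made up for illustration and are not model output, and the function name is mine.

```python
def to_anomalies(series, base_start=1951, base_end=1980):
    """Subtract the base-period mean from every value in a {year: temp} mapping."""
    base = [t for y, t in series.items() if base_start <= y <= base_end]
    if not base:
        raise ValueError("no data in base period")
    base_mean = sum(base) / len(base)
    return {y: t - base_mean for y, t in series.items()}


# Synthetic example: two series with the same shape, 3 deg C apart in
# absolute terms (the spread quoted earlier in the post).
years = range(1850, 2018)
shape = {y: 0.005 * (y - 1850) for y in years}
cool_member = {y: 12.0 + shape[y] for y in years}
warm_member = {y: 15.0 + shape[y] for y in years}

cool_anom = to_anomalies(cool_member)
warm_anom = to_anomalies(warm_member)
max_absolute_gap = max(abs(warm_member[y] - cool_member[y]) for y in years)
max_anomaly_gap = max(abs(warm_anom[y] - cool_anom[y]) for y in years)
```

In this toy case the 3 deg C gap in absolute form shrinks to essentially zero in anomaly form, even though neither series has changed.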
Oh, go ahead and laugh, we might as well have fun while we endure this nonsense.
Oops, almost forgot. I need to thank Berkeley Earth for publishing those monthly conversion factors. I had a lot of fun preparing this post, and I couldn’t have done it without those factors.
That’s it for this post. Have fun in the comments and enjoy the rest of your day.
And there are people who wonder why I’m a heretic of the religion of human-induced global warming/climate change. At least I’m a happy heretic. I laugh all the time about the nonsense prepared by whiny alarmists.
Stephen Mosher of Berkeley Earth was kind enough to explain the source of Berkeley Earth’s absolute temperature adjustment factors in the comment here. Stephen Mosher writes (excluding his typical rudeness to bloggers at WUWT):
The SOURCE of the absolute temperatures is the data which comes in absolute T
All other methods use station temperatures and then they construct station anomalies and then they combine those anomalies.
Our approach is different. We use kriging.
Temperature at a location is decomposed into two elements: a climate element and a weather element.
T = f(Lat, Elevation) + Weather
f(Lat, Elevation) is the climate element. It states that part of the temperature is a function of the latitude of the station and the elevation of the station. Think of this as a regression. (Short aside: there is also a seasonal component.)
If you take MONTHLY averages, then you can show that over 90% of the monthly mean is explained by the latitude of the station and the elevation of the station (Willis showed something similar with satellite temps). Using this regression approach allows you to predict temperatures where you have no observations. A simple example would be you have the temperature at the base of a hill and you can predict the temperature up to the peak… with error of course, but we know what those errors look like.
This “climate” part of the temperature is then subtracted from T
W = T – f(L, E)
This is the “residual,” the 10% that is not explained by latitude and elevation. We call this “weather.” It’s this residual that changes over time.
After decomposing the temperature into a fixed climate component (f(L, E)) and a variable weather component, we then use kriging to interpolate the weather field.
The Temp we give you is the absolute T. The source is the observations.
Thank you, Stephen.
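To make the decomposition in that comment concrete, here is a rough Python sketch of the regression step only. It is not Berkeley Earth’s code. It assumes a plain linear form for f(Lat, Elevation), ignores the seasonal component, and stops before the kriging step. The station values are made up and generated without noise, so the fitted coefficients can be checked against the ones used to create them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_climate(stations):
    """Least-squares fit of T = b0 + b1*lat + b2*elev.

    stations: list of (lat_deg, elev_km, temp_deg_c) tuples.
    The linear form is an assumption made for this sketch only.
    """
    rows = [(1.0, lat, elev) for lat, elev, _ in stations]
    temps = [t for _, _, t in stations]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, temps)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def weather_residuals(stations, beta):
    """W = T - f(lat, elev) for each station."""
    return [t - (beta[0] + beta[1] * lat + beta[2] * elev)
            for lat, elev, t in stations]


# Made-up stations generated from T = 30 - 0.5*lat - 6.5*elev, with no noise.
toy_stations = [(lat, elev, 30.0 - 0.5 * lat - 6.5 * elev)
                for lat in (0.0, 20.0, 40.0, 60.0)
                for elev in (0.0, 0.5, 1.5)]
beta = fit_climate(toy_stations)
residuals = weather_residuals(toy_stations, beta)
```

Because the toy stations contain no noise, the “weather” residuals come out at essentially zero. With real station data the residuals would be the part left over for interpolation.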
STANDARD CLOSING REQUEST
Please purchase my recently published ebooks. As many of you know, this year I published 2 ebooks that are available through Amazon in Kindle format:
- Dad, Why Are You A Global Warming Denier? (For an overview, the blog post that introduced it is here.)
- Dad, Is Climate Getting Worse in the United States? (See the blog post here for an overview.)
To those of you who have purchased them, thank you. To those of you who will purchase them, thank you, too.
PS: Will I continue to present model-data comparisons using a multi-model mean or a multi-model ensemble-member mean? Of course I will, because I use them to show how poorly the consensus (better said, groupthink) of the climate modeling groups, as represented by the model means, simulates a given metric, usually sea surface temperatures in absolute form or 30-year trends in global mean surface temperatures. And now I have this post to link to in those future posts.
Yes sir, Bob. Models, even when averaged, are questionably neutral, if useful at all. Since Earth is over 4 billion years old, models that go back to 1850 or even a million years ago are highly questionable. I urge you and everyone following you to correspond with geologists who have examined climate through indirect means, specifically through the amount of oxygen, carbon dioxide, nitrogen, and miscellaneous gases in rocks by age, a common indirect method of knowing what the climate was doing at any specific time. My understanding is all rocks can be carbon dated. And finding rocks 4 billion years old or younger and doing the gas measurements in the rock will give a good indication of climate at the time. That would be a real indication of climate on Earth instead of the recent climate only, which is just not useful in talking about climate change. After all, it is the climate change over the life of Earth that matters, not just since humanity has been on Earth.
First, thank you for all the excellent work you have done over many years, and please continue to investigate the data, because you’re doing a great job.
My comment was a reaction to a sentence in this post, which was “Based on how closely the average of the two extremely poor ensemble members matches the data, I suspect Berkeley Earth used the model mean of one of the groups of historic simulations associated with one of RCP scenarios”. The guy from Berkeley Earth answered you about that, but my point is: have you ever imagined the contrary, that the models are purposely made to be very different individually but to come back to the observation curve when you average them?
After all, in CMIP5, you have only about 20 labs making models, they almost all have a seat in the IPCC working groups, and they all know each other. For example, Valérie Masson-Delmotte from the French IPSL has published articles together with James Hansen from GISS, etc. [Note that IPSL is a lab from the French Atomic Energy Commission. If you wonder why a nuclear energy lab works on climate, I don’t have the answer. The only thing I know is that the IPCC is asking repeatedly for tremendous increases in “carbon-free” nuclear energy].
Anyway, the question would be: why would the models be purposely wrong in order to be approximately decent on average? You must remember that the only purpose of the IPCC is the summary for policymakers and the global surface temperature curve compared with the average of the models. They don’t care about being good on polar amplification, the Southern Hemisphere, or Pacific Ocean temperatures; they only want to show a global correlation. But we know that it is impossible to have a model based on GHG forcings that is good for all of the following periods at the same time: 1850-1910, 1910-1940, 1940-1975, 1975-2019.
For example, if a model is good for 1910-1940 like IPSL-CM5A-LR EM-2 as you showed in your post “Global Mean Surface Temperature: Early 20th Century Warming Period – Models versus Models & Models versus Data”, it will give terrible results for 1975-2019 and overwarm for this period (as it is impossible to have the same warming trend with 6 times more emissions), and terrible results for 19th century temperatures.
My hypothesis is that the different labs purposely build models that are each good for a specific, different period of time, regardless of the other periods. That way, when you average all the models, you find something not so bad for all the periods! This would work because models that are very wrong for a period are designed so that their error is canceled by another model that has the inverse error. You showed it with IPSL-CM5A-LR EM-3 and GISS-E2-H p3, which are very wrong but at exactly the same distance from the observations, and so give the observations on average.
Maybe an idea for a post would be to look into each one of the models, period by period, and see which one is good for which period(s) and bad, positively or negatively, for which period(s). This way you could find the detailed mechanics of how each model contributes to the average for a certain period and how its errors in another period are cancelled by the inverse error of another model.
If my hypothesis is correct, this work could expose the tricks of the IPCC averaging by showing that each model has a precise, definite role in it.
What do you think of it?