In his post “Schmidt and Sherwood on climate models,” Andrew Montford of BishopHill commented on the new paper by Schmidt and Sherwood, “A practical philosophy of complex climate modelling” (preprint). I haven’t yet studied the Schmidt and Sherwood paper in any detail, but in scanning it, a few things stood out. Those of you who have studied the paper will surely have additional comments.
DO CLIMATE MODELS SIMULATE GLOBAL SURFACE TEMPERATURES BETTER THAN A LINEAR TREND?
The abstract of Schmidt and Sherwood reads (my boldface):
We give an overview of the practice of developing and using complex climate models, as seen from experiences in a major climate modelling center and through participation in the Coupled Model Intercomparison Project (CMIP). We discuss the construction and calibration of models; their evaluation, especially through use of out-of-sample tests; and their exploitation in multi-model ensembles to identify biases and make predictions. We stress that adequacy or utility of climate models is best assessed via their skill against more naïve predictions. The framework we use for making inferences about reality using simulations is naturally Bayesian (in an informal sense), and has many points of contact with more familiar examples of scientific epistemology. While the use of complex simulations in science is a development that changes much in how science is done in practice, we argue that the concepts being applied fit very much into traditional practices of the scientific method, albeit those more often associated with laboratory work.
The boldfaced sentence (“We stress that adequacy or utility of climate models is best assessed via their skill against more naïve predictions”) caught my attention. A straight line based on a linear trend should be considered a more naïve method of prediction. A linear trend is a statistical model, and it is definitely a whole lot simpler than all of those climate models used by the IPCC. So I thought it would be interesting to see if, when and by how much the CMIP5 climate models simulated global surface temperatures better than a simple straight line…a linear trend line based on global surface temperature data.
Do climate models simulate global surface temperatures better than a linear trend? Over the long term, of course they do, because many of the models are tuned to reproduce global surface temperature anomalies. But the models do not always simulate surface temperatures better than a straight line, and currently, due to the slowdown in surface warming, the models perform no better than a trend line.
Figure 1 compares the modeled and observed annual December-to-November (Meteorological Annual Mean) global surface temperature anomalies. The data (off-green curve) are represented by the GISS Land-Ocean Temperature Index. The models (red curve) are represented by the multi-model ensemble mean of the models stored in the CMIP5 archive. The models are forced with historical forcings through 2005 (later for some models) and the worst-case scenario (RCP8.5) from then to 2014. Also shown is the linear trend (blue line) as determined from the data by Excel. The data and models are referenced to the full term (1881 to 2013) so as not to skew the results.
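For readers who would rather not use a spreadsheet, here is a minimal sketch of the two steps described above: re-referencing anomalies to a base period and fitting a straight line. It assumes the trend is an ordinary least-squares fit, which is what a spreadsheet trend line computes. The function names `rebase` and `linear_trend` and the short sample series are hypothetical placeholders, not the GISS or CMIP5 values.

```python
def rebase(anomalies, years, start, end):
    """Shift anomalies so their mean over the years start..end is zero."""
    base = [a for a, y in zip(anomalies, years) if start <= y <= end]
    offset = sum(base) / len(base)
    return [a - offset for a in anomalies]


def linear_trend(years, values):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up illustrative series, NOT real temperature data.
years = [2000, 2001, 2002, 2003, 2004]
anoms = [0.10, 0.30, 0.20, 0.40, 0.50]

# Reference the series to its full term, then fit the straight line.
rebased = rebase(anoms, years, 2000, 2004)
slope, intercept = linear_trend(years, rebased)
trend_line = [slope * y + intercept for y in years]
```

With real data, the base period passed to `rebase` would be the full term of the record, and the same base period would be applied to the model mean so both curves share a zero point.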
Over the past decade or so, the difference between the models and the data and the difference between the trend and the data appear to be of similar magnitude but of opposite signs. So let’s look at those differences, where the data are subtracted from both the model outputs and the values of the linear trend. See Figure 2. I’ve smoothed the differences with 5-year running-mean filters to remove much of the volatility associated with ENSO and volcanic eruptions.
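The differencing and smoothing just described can be sketched as follows. This assumes a centered 5-year running mean with an odd window (the text above does not say whether the filter is centered or trailing), and the names `difference` and `running_mean` and the sample numbers are hypothetical.

```python
def difference(series, data):
    """Subtract the observations from a model or trend series, point by point."""
    return [s - d for s, d in zip(series, data)]


def running_mean(values, window=5):
    """Centered running mean for an odd window.

    Points near the ends that lack a full window are left as None.
    """
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = sum(values[i - half:i + half + 1]) / window
    return out


# Made-up illustrative numbers, NOT real model output or observations.
data = [0.0, 0.1, 0.3, 0.2, 0.4, 0.5, 0.4]
model = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

model_minus_data = difference(model, data)
smoothed = running_mean(model_minus_data, window=5)
```

The same two calls would be applied to the trend-line values to get the trend-minus-data curve, so both differences are smoothed identically before they are compared.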
Not surprisingly, in recent years, the difference between the models and the data and the difference between the trend line and the data are in fact of similar magnitudes. In other words, recently, a straight line (a linear trend) performs about as well at modeling global surface temperatures as the average of the multimillion-dollar climate models used by the IPCC for their 5th Assessment Report. From about 1950 to the early 1980s, the models perform better than the straight line. Now notice the period between 1881 and 1950. A linear trend line, once again, performs about as well at simulating global surface temperatures as the average of the dozens of multimillion-dollar climate models.
Obviously, the differences between the trend line and the data are caused by the multidecadal variability in the data. On the other hand, differences between the models and the data are caused by poor modeling of global surface temperatures.
For those interested, Figure 3 presents the results shown in Figure 2 but without the smoothing.
SCHMIDT AND SHERWOOD ON SWANSON (2013)
The other thing that caught my eye was the comment by Schmidt and Sherwood about the findings of Swanson (2013) “Emerging Selection Bias in Large-scale Climate Change Simulations.” The preprint version of the paper is here. In the Introduction, Swanson writes (my boldface):
Here we suggest the possibility that a selection bias based upon warming rate is emerging in the enterprise of large-scale climate change simulation. Instead of involving a choice of whether to keep or discard an observation based upon a prior expectation, we hypothesize that this selection bias involves the ‘survival’ of climate models from generation to generation, based upon their warming rate. One plausible explanation suggests this bias originates in the desirable goal to more accurately capture the most spectacular observed manifestation of recent warming, namely the ongoing Arctic amplification of warming and accompanying collapse in Arctic sea ice. However, fidelity to the observed Arctic warming is not equivalent to fidelity in capturing the overall pattern of climate warming. As a result, the current generation (CMIP5) model ensemble mean performs worse at capturing the observed latitudinal structure of warming than the earlier generation (CMIP3) model ensemble. This is despite a marked reduction in the inter-ensemble spread going from CMIP3 to CMIP5, which by itself indicates higher confidence in the consensus solution. In other words, CMIP5 simulations viewed in aggregate appear to provide a more precise, but less accurate picture of actual climate warming compared to CMIP3.
In other words, the current generation of climate models (CMIP5) agree better among themselves than the prior generation (CMIP3), i.e., there is less of a spread between climate model outputs, because they are converging on the same results. Overall, however, the CMIP5 models perform worse than the CMIP3 models at simulating global temperatures. “[M]ore precise, but less accurate.” Swanson blamed this on the modelers trying to better simulate the warming in the Arctic.
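To make the “more precise, but less accurate” distinction concrete, here is a small sketch that separates ensemble spread (precision) from the error of the ensemble mean (accuracy). The numbers are invented purely for illustration and are not CMIP3 or CMIP5 results; the function name `spread_and_error` and the labels `older` and `newer` are hypothetical.

```python
from statistics import mean, pstdev


def spread_and_error(ensemble, observed):
    """Return (spread, error).

    spread: standard deviation across ensemble members (precision)
    error:  absolute difference between the ensemble mean and the
            observed value (accuracy)
    """
    return pstdev(ensemble), abs(mean(ensemble) - observed)


# Made-up trend values in deg C per decade, NOT real model results.
observed = 0.15
older = [0.05, 0.10, 0.15, 0.20, 0.25]  # wide spread, centered on the observation
newer = [0.20, 0.21, 0.22, 0.23, 0.24]  # tight spread, offset from the observation

older_spread, older_error = spread_and_error(older, observed)
newer_spread, newer_error = spread_and_error(newer, observed)
```

In this toy case the `newer` ensemble has the smaller spread but the larger error, which is the pattern Swanson describes: members that agree closely with one another can still agree on the wrong answer.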
Back to Schmidt and Sherwood: The last paragraph under the heading “Climate model development” in Schmidt and Sherwood reads (my boldface):
Arctic sea ice trends provide an instructive example. The hindcast estimates of recent trends were much improved in CMIP5 compared to CMIP3 (Stroeve et al 2012). This is very likely because the observation/model mismatch in trends in CMIP3 (Stroeve et al 2007) lead developers to re-examine the physics and code related to Arctic sea ice to identify missing processes or numerical problems (for instance, as described in Schmidt et al (2014b)). An alternate suggestion that model groups specifically tuned for trends in Arctic sea ice at the expense of global mean temperatures (Swanson 2013) is not in accord with the practice of any of the modelling groups with which we are familiar, and would be unlikely to work as discussed above.
Note that Schmidt and Sherwood did not dispute the fact that the CMIP5 models performed worse than the earlier generation CMIP3 models at simulating global surface temperatures outside of the Arctic over recent decades. Schmidt and Sherwood simply commented on the practices of modeling groups. Regardless of the practices, in recent decades, the CMIP5 models perform better (but still poorly) in the Arctic but worse outside the Arctic than the earlier generation models.
As a result, the CMIP3 models perform better at simulating global surface temperatures over the past 3+ decades than their newer generation counterparts. Refer to Figure 4. That fact stands out quite plainly in a satellite-era sea surface temperature model-data comparison.
Those are the things that caught my eye in the new Schmidt and Sherwood paper. What caught yours?