
Where Did Durbin and Watson Go Wrong?

August 14, 2019

I’m sure you have all been there—the spot where you mention autocorrelation and the Durbin-Watson statistic. I saw this recently in some testimony where the witness went on to explain the statistic. The testimony read something like this (which is taken from Investopedia):

  • The Durbin Watson (DW) statistic is a test for autocorrelation of the residuals from a statistical regression analysis. The DW statistic will always have a value between 0 and 4. A value of 2.0 means that there is no autocorrelation detected in the sample. Values from 0 to less than 2 indicate positive autocorrelation and values from 2 to 4 indicate negative autocorrelation.

If you dig in further, you will find that there is a table that tells you whether the evidence for autocorrelation is conclusive or inconclusive. And the critical values depend on how much data you have and how many variables are in the model, all of which definitely starts to sound like blah, blah, blah…because none of it really makes any sense. You can understand it if you dig into the formulas (which we will do), but the whole thing sounds strange.

A smaller number (further below 2) means stronger positive autocorrelation?

A larger number (further above 2) means stronger negative autocorrelation?

Someone who has a stats background will know the term “correlation” and they will know that the correlation coefficient will be a number between -1 and 1. They will know that a value of 0.0 means uncorrelated. They will know that a value above 0.0 means positive correlation. They will also know that a value below 0.0 means negative correlation.

So, how did we get to this point where we have a test statistic that is flipped and that is centered on 2.0? First, a little history. The DW statistic was developed by James Durbin (British) and Geoffrey Watson (Australian) in 1950. Second, this statistic is only relevant for a time-series model. In a time-series model, you have a set of sequential observations. Let’s label the first observation “1” and the last observation “T.” Then an individual observation is for time period t, where t is between 1 and T. So far so good.

In its most general form, a model with an additive error can be written as:
  y(t) = F[ X(t) ] + e(t)

We don’t really care about F[ ] or X at this point. We really only care about the string of model residuals or errors (the e(t) values). This brings us to the formula developed by Durbin and Watson, which is:
  DW = [ Σ(t=2 to T) ( e(t) − e(t−1) )² ] / [ Σ(t=1 to T) e(t)² ]

On its face, this seems ok. The sum in the numerator is over squared changes in sequential residual values, while the denominator is the sum of squared residuals. But just looking at it, it is not at all clear why this is centered at 2.0.
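To make the formula concrete, here is a minimal Python sketch (the durbin_watson helper and the residual values are my own illustration, not part of the original derivation) that computes the statistic exactly as the formula reads:

```python
import numpy as np

def durbin_watson(e):
    """DW statistic: sum of squared changes in sequential residuals,
    divided by the sum of squared residuals."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# A made-up residual series, just to show the calculation
residuals = [0.5, -0.3, 0.2, -0.4, 0.1, -0.2, 0.3, -0.1]
print(durbin_watson(residuals))  # about 2.9; the alternating signs push it above 2.0
```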

Digging a bit deeper, we can expand the numerator and rewrite DW as follows:
  DW = [ Σ(t=2 to T) e(t)² ] / [ Σ(t=1 to T) e(t)² ]
       − 2 × [ Σ(t=2 to T) e(t)·e(t−1) ] / [ Σ(t=1 to T) e(t)² ]
       + [ Σ(t=2 to T) e(t−1)² ] / [ Σ(t=1 to T) e(t)² ]

The first and last ratios each have a value of about 1.0, the only difference being that the numerator sums run from 2 to T instead of from 1 to T. So, if the middle sum has a value close to zero, the DW value will be about 2.0. That’s why the DW is centered at 2.0.

This means that the middle ratio is where all the action is. Let’s look at this middle term:
  −2 × [ Σ(t=2 to T) e(t)·e(t−1) ] / [ Σ(t=1 to T) e(t)² ]

In the numerator, we have the product of sequential residuals. If the sequential residuals are changing sign (+ to – or – to +), the products will be negative. Sum them up, flip the sign (multiply by -2) and add the result, and the DW goes up above 2.0. If the sequential residuals are not changing sign (+ to + or – to -), the product will be positive. Sum them up, flip the sign (multiply by -2) and add this in and the DW will go down below 2.0. Hmmm…
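To see this sign behavior with numbers, here is a quick sketch (toy residual series of my own, reusing the durbin_watson helper from the sketch above):

```python
import numpy as np

def durbin_watson(e):  # same helper as in the earlier sketch
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Residuals that flip sign every period: sequential products are negative,
# so the -2 term adds a positive amount and DW rises above 2.0
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(durbin_watson(alternating))  # 3.5, well above 2.0

# Residuals that hold their sign in long runs: sequential products are mostly
# positive, so the -2 term subtracts and DW falls below 2.0
persistent = [1.0, 0.9, 0.8, 0.7, -0.7, -0.8, -0.9, -1.0]
print(durbin_watson(persistent))  # about 0.34, well below 2.0
```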

If you take off the -2 in the middle term, it turns out that what is left is the formula for the first-order autocorrelation coefficient for a variable with a zero mean. This is usually represented by the symbol ρ, which is the 17th letter of the Greek alphabet and is pronounced rho.

This means the DW statistic is closely approximated by the formula:
  DW ≈ 2 − 2ρ

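A short simulation makes the approximation visible. This sketch (the simulated series and its parameters are arbitrary choices of my own) computes ρ directly from the zero-mean formula and compares the DW value to 2 − 2ρ:

```python
import numpy as np

def durbin_watson(e):  # same helper as in the earlier sketches
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def rho_hat(e):
    """First-order autocorrelation for a zero-mean series: the sum of
    products of sequential residuals over the sum of squared residuals."""
    e = np.asarray(e, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

# Simulated residuals with some persistence (parameters chosen arbitrarily)
rng = np.random.default_rng(42)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 0.5 * e[t - 1] + rng.normal()

print(durbin_watson(e))    # these two numbers differ only by the small
print(2 - 2 * rho_hat(e))  # end-point terms dropped in the approximation
```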
Based on this approximation, I propose a new statistic (the S statistic, after Watson’s middle name), defined as follows:
  S = 1 − DW/2 ≈ ρ

All the critical value tables can be transformed using the same formula, and then everything would work as it does now with the DW and its critical value tables.
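As a sanity check on the proposal, here is a sketch (the s_statistic name and the simulated series are my own illustration, not an established convention) showing that S reads like an ordinary correlation coefficient:

```python
import numpy as np

def durbin_watson(e):  # same helper as in the earlier sketches
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def s_statistic(e):
    """Proposed S statistic: rescale DW so it is centered at 0 and
    approximates the first-order autocorrelation coefficient."""
    return 1.0 - durbin_watson(e) / 2.0

# Simulate residuals with a true first-order autocorrelation of 0.6
rng = np.random.default_rng(0)
e = np.zeros(1000)
for t in range(1, 1000):
    e[t] = 0.6 * e[t - 1] + rng.normal()

print(s_statistic(e))  # lands near 0.6, reading like a correlation coefficient
```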

But now the Investopedia definition is straightforward and greatly simplified:

  • The S statistic is a test for first order autocorrelation of the residuals from a statistical regression analysis. The S statistic will always have a value between -1.0 and +1.0. A value of 0.0 means that there is no autocorrelation detected in the sample. Values from 0.0 to 1.0 indicate positive autocorrelation and values from 0.0 to -1.0 indicate negative autocorrelation.

Even though this is clearly a great idea, it will never happen. Once a statistic is coined and the textbooks are written, it is too late to change. It will live on forever. It’s kind of like the QWERTY keyboard that I typed this on – not the best design, but the one that made sense in the distant past, the one we are used to, and therefore, the one we are stuck with.

Surely there is another planet or a parallel universe, however, where the counterparts of Durbin and Watson got it right and the S statistic is well known and is easily understood as a simple correlation coefficient.

By Stuart McMenamin


Dr. J. Stuart McMenamin is the Managing Director of Forecasting at Itron, where he specializes in the fields of energy economics, statistical modeling, and software development. Over the last 35 years, he has managed numerous projects in the areas of system load forecasting, price forecasting, retail load forecasting, end-use modeling, regional modeling, load shape development, and utility data analysis. In addition to directing large analysis projects, Dr. McMenamin directs the development of Itron’s forecasting software products (MetrixND, MetrixLT, Forecast Manager, and Itron Load Research System). Prior to these efforts, he directed the development and support of the EPRI end-use models (REEPS, COMMEND, and INFORM). Related to this work, Dr. McMenamin is the author of several dozen papers relating to statistical modeling, energy forecasting, and the application of neural networks to data analysis problems. In prior jobs, Dr. McMenamin conducted research in the telecommunications industry, worked on the President's Council of Wage and Price Stability under the Carter Administration, and lectured in economics at the University of California, San Diego. Dr. McMenamin received his B.A. in Mathematics and Economics from Occidental College and his Ph.D. in Economics from UCSD.

