In econometrics, Prais–Winsten estimation is a procedure for dealing with serial correlation of type AR(1) in a linear model. Conceived by Sigbert Prais and Christopher Winsten in 1954,[1] it is a modification of Cochrane–Orcutt estimation in that it does not lose the first observation, which leads to greater efficiency and makes it a special case of feasible generalized least squares.[2]
Consider the model

$y_t = \alpha + X_t \beta + \varepsilon_t,$

where $y_t$ is the time series of interest at time t, $\beta$ is a vector of coefficients, $X_t$ is a matrix of explanatory variables, and $\varepsilon_t$ is the error term. The error term can be serially correlated over time: $\varepsilon_t = \rho \varepsilon_{t-1} + e_t$ with $|\rho| < 1$, where $e_t$ is white noise. In addition to the Cochrane–Orcutt transformation, which is

$y_t - \rho y_{t-1} = \alpha (1 - \rho) + (X_t - \rho X_{t-1}) \beta + e_t$

for t = 2, 3, ..., T, the Prais–Winsten procedure makes a reasonable transformation for t = 1 in the following form:

$\sqrt{1 - \rho^2}\, y_1 = \alpha \sqrt{1 - \rho^2} + \sqrt{1 - \rho^2}\, X_1 \beta + \sqrt{1 - \rho^2}\, \varepsilon_1.$
The usual least squares estimation is then carried out on this transformed model.
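For illustration, here is a minimal sketch of both transformations in Python/NumPy, assuming $\rho$ is known; the helper name prais_winsten_transform and the variable names are illustrative, not from the original paper.

```python
import numpy as np

def prais_winsten_transform(y, X, rho):
    """Transform (y, X) so that ordinary least squares on the result
    is the Prais-Winsten estimator for a known rho.

    y: (T,) response; X: (T, k) regressors, including a column of
    ones for the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    y_star = np.empty_like(y)
    X_star = np.empty_like(X)
    w = np.sqrt(1.0 - rho ** 2)
    # t = 1: scale the first observation by sqrt(1 - rho^2)
    y_star[0] = w * y[0]
    X_star[0] = w * X[0]
    # t = 2, ..., T: Cochrane-Orcutt quasi-differencing
    y_star[1:] = y[1:] - rho * y[:-1]
    X_star[1:] = X[1:] - rho * X[:-1]
    return y_star, X_star
```

Ordinary least squares on the transformed data, e.g. np.linalg.lstsq(X_star, y_star, rcond=None), then yields the estimates.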
First notice that
$\mathrm{var}(\varepsilon_t) = \mathrm{var}(\rho \varepsilon_{t-1} + e_t) = \rho^2\, \mathrm{var}(\varepsilon_{t-1}) + \mathrm{var}(e_t)$
Noting that for a stationary process, variance is constant over time,
$(1 - \rho^2)\, \mathrm{var}(\varepsilon_t) = \mathrm{var}(e_t)$
and thus,
$\mathrm{var}(\varepsilon_t) = \dfrac{\mathrm{var}(e_t)}{1 - \rho^2}$
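This identity can be checked numerically; the sketch below simulates a long AR(1) series (the seed, $\rho = 0.5$, and the series length are arbitrary choices) and compares the sample variance with $1/(1-\rho^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.5, 500_000
e = rng.standard_normal(T)              # white noise with var(e) = 1
eps = np.empty(T)
eps[0] = e[0] / np.sqrt(1 - rho ** 2)   # draw from the stationary distribution
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + e[t]

print(eps.var())                # approximately 1.333
print(1.0 / (1 - rho ** 2))     # theoretical value: 4/3
```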
Without loss of generality, suppose the variance of the white noise is 1. To do the estimation in a compact way, one must look at the autocovariance function of the error term considered in the model below:

$\mathrm{cov}(\varepsilon_t, \varepsilon_{t+h}) = \rho^{|h|}\, \mathrm{var}(\varepsilon_t) = \dfrac{\rho^{|h|}}{1 - \rho^2}.$
It is easy to see that the variance–covariance matrix, $\mathbf{\Omega}$, of the model is

$\mathbf{\Omega} = \dfrac{1}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{bmatrix}.$
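Since $\Omega_{ij} = \rho^{|i-j|}/(1-\rho^2)$ under the unit-variance assumption, the matrix can be built directly from the autocovariance function; a sketch (the helper name ar1_cov is illustrative):

```python
import numpy as np

def ar1_cov(rho, T):
    """AR(1) variance-covariance matrix with unit white-noise
    variance: Omega[i, j] = rho**|i - j| / (1 - rho**2)."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho ** 2)
```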
Having $\rho$ (or an estimate of it), we see that

$\hat{\Theta} = (\mathbf{Z}^{\mathsf{T}} \mathbf{\Omega}^{-1} \mathbf{Z})^{-1} (\mathbf{Z}^{\mathsf{T}} \mathbf{\Omega}^{-1} \mathbf{Y}),$
where $\mathbf{Z}$ is a matrix of observations on the independent variable ($X_t$, t = 1, 2, ..., T) including a vector of ones, $\mathbf{Y}$ is a vector stacking the observations on the dependent variable ($y_t$, t = 1, 2, ..., T), and $\hat{\Theta}$ includes the model parameters.
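A direct implementation of this generalized least squares formula might look as follows; solving linear systems instead of forming $\mathbf{\Omega}^{-1}$ explicitly is a standard numerical choice, not something prescribed by the source.

```python
import numpy as np

def gls_estimate(Z, Y, Omega):
    """Theta_hat = (Z' Omega^{-1} Z)^{-1} (Z' Omega^{-1} Y)."""
    Oi_Z = np.linalg.solve(Omega, Z)   # Omega^{-1} Z
    Oi_Y = np.linalg.solve(Omega, Y)   # Omega^{-1} Y
    return np.linalg.solve(Z.T @ Oi_Z, Z.T @ Oi_Y)
```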
To see why the initial observation assumption stated by Prais–Winsten (1954) is reasonable, it is helpful to consider the mechanics of the generalized least squares estimation procedure sketched above. The inverse of $\mathbf{\Omega}$ can be decomposed as $\mathbf{\Omega}^{-1} = \mathbf{G}^{\mathsf{T}} \mathbf{G}$ with[3]

$\mathbf{G} = \begin{bmatrix} \sqrt{1 - \rho^2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & -\rho & 1 \end{bmatrix}.$
Pre-multiplying the model in matrix notation by this matrix gives the transformed model of Prais–Winsten.
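The decomposition is easy to verify numerically; the sketch below builds $\mathbf{G}$ and checks $\mathbf{G}^{\mathsf{T}}\mathbf{G} = \mathbf{\Omega}^{-1}$ for a small example, reusing the hypothetical ar1_cov helper from above.

```python
import numpy as np

def pw_G(rho, T):
    """Lower-bidiagonal G with G'G = Omega^{-1}: the first row scales
    the initial observation by sqrt(1 - rho^2); each subsequent row
    implements the quasi-difference y_t - rho * y_{t-1}."""
    G = np.eye(T)
    G[0, 0] = np.sqrt(1.0 - rho ** 2)
    G[np.arange(1, T), np.arange(T - 1)] = -rho
    return G

rho, T = 0.5, 6
G = pw_G(rho, T)
Omega = ar1_cov(rho, T)                             # helper from the earlier sketch
print(np.allclose(G.T @ G, np.linalg.inv(Omega)))   # True
```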
The error term is still restricted to be of an AR(1) type. If $\rho$ is not known, a recursive procedure (Cochrane–Orcutt estimation) or a grid search (Hildreth–Lu estimation) may be used to make the estimation feasible. Alternatively, a full information maximum likelihood procedure that estimates all parameters simultaneously has been suggested by Beach and MacKinnon.[4][5]
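When $\rho$ is unknown, one common feasible variant alternates between estimating $\rho$ from residuals and re-fitting on the transformed data; the sketch below follows this recursive scheme (it is not the Beach–MacKinnon maximum likelihood procedure) and reuses the hypothetical prais_winsten_transform helper from the first sketch.

```python
import numpy as np

def prais_winsten_fit(y, X, tol=1e-8, max_iter=100):
    """Feasible Prais-Winsten: iterate between estimating rho from
    residuals and OLS on the transformed data until rho converges."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # initial OLS fit
    rho = 0.0
    for _ in range(max_iter):
        resid = y - X @ beta
        # OLS estimate of rho: regress e_t on e_{t-1}
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        y_star, X_star = prais_winsten_transform(y, X, rho_new)
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return beta, rho
```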