In econometrics, Prais–Winsten estimation is a procedure for dealing with serial correlation of type AR(1) in a linear model. Conceived by Sigbert Prais and Christopher Winsten in 1954,[1] it is a modification of Cochrane–Orcutt estimation in that it does not lose the first observation, which leads to greater efficiency and makes it a special case of feasible generalized least squares.[2]
Consider the model

$y_t = \alpha + X_t \beta + \varepsilon_t,$

where $y_t$ is the time series of interest at time t, $\beta$ is a vector of coefficients, $X_t$ is a matrix of explanatory variables, and $\varepsilon_t$ is the error term. The error term can be serially correlated over time: $\varepsilon_t = \rho \varepsilon_{t-1} + e_t$ with $|\rho| < 1$, where $e_t$ is white noise. In addition to the Cochrane–Orcutt transformation, which is

$y_t - \rho y_{t-1} = \alpha (1 - \rho) + (X_t - \rho X_{t-1}) \beta + e_t$

for t = 2, 3, ..., T, the Prais–Winsten procedure makes a reasonable transformation for t = 1 in the following form:

$\sqrt{1 - \rho^2}\, y_1 = \alpha \sqrt{1 - \rho^2} + \sqrt{1 - \rho^2}\, X_1 \beta + \sqrt{1 - \rho^2}\, \varepsilon_1.$
The usual least squares estimation is then carried out on this transformed model.
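For illustration, here is a minimal sketch of both transformations in Python/NumPy, assuming $\rho$ is known; the helper name prais_winsten_transform and the variable names are illustrative, not from the original paper.

```python
import numpy as np

def prais_winsten_transform(y, X, rho):
    """Transform (y, X) so that ordinary least squares on the result
    is the Prais-Winsten estimator for a known rho.

    y: (T,) response; X: (T, k) regressors, including a column of
    ones for the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    y_star = np.empty_like(y)
    X_star = np.empty_like(X)
    w = np.sqrt(1.0 - rho ** 2)
    # t = 1: scale the first observation by sqrt(1 - rho^2)
    y_star[0] = w * y[0]
    X_star[0] = w * X[0]
    # t = 2, ..., T: Cochrane-Orcutt quasi-differencing
    y_star[1:] = y[1:] - rho * y[:-1]
    X_star[1:] = X[1:] - rho * X[:-1]
    return y_star, X_star
```

Ordinary least squares on the transformed data, e.g. np.linalg.lstsq(X_star, y_star, rcond=None), then yields the estimates.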
First notice that
$\mathrm{var}(\varepsilon_t) = \mathrm{var}(\rho \varepsilon_{t-1} + e_t) = \rho^2\, \mathrm{var}(\varepsilon_{t-1}) + \mathrm{var}(e_t)$
Noting that for a stationary process, variance is constant over time,
$(1 - \rho^2)\, \mathrm{var}(\varepsilon_t) = \mathrm{var}(e_t)$
and thus,
$\mathrm{var}(\varepsilon_t) = \dfrac{\mathrm{var}(e_t)}{1 - \rho^2}$
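This identity can be checked numerically; the sketch below simulates a long AR(1) series (the seed, $\rho = 0.5$, and the series length are arbitrary choices) and compares the sample variance with $1/(1-\rho^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.5, 500_000
e = rng.standard_normal(T)              # white noise with var(e) = 1
eps = np.empty(T)
eps[0] = e[0] / np.sqrt(1 - rho ** 2)   # draw from the stationary distribution
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + e[t]

print(eps.var())                # approximately 1.333
print(1.0 / (1 - rho ** 2))     # theoretical value: 4/3
```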
Without loss of generality, suppose the variance of the white noise is 1. To do the estimation in a compact way, one must look at the autocovariance function of the error term considered in the model below:

$\mathrm{cov}(\varepsilon_t, \varepsilon_{t+h}) = \rho^{|h|}\, \mathrm{var}(\varepsilon_t) = \dfrac{\rho^{|h|}}{1 - \rho^2}.$
It is easy to see that the variance–covariance matrix, $\mathbf{\Omega}$, of the model is

$\mathbf{\Omega} = \dfrac{1}{1 - \rho^2} \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{bmatrix}.$
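Since $\Omega_{ij} = \rho^{|i-j|}/(1-\rho^2)$ under the unit-variance assumption, the matrix can be built directly from the autocovariance function; a sketch (the helper name ar1_cov is illustrative):

```python
import numpy as np

def ar1_cov(rho, T):
    """AR(1) variance-covariance matrix with unit white-noise
    variance: Omega[i, j] = rho**|i - j| / (1 - rho**2)."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho ** 2)
```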
Having $\rho$ (or an estimate of it), we see that

$\hat{\Theta} = (\mathbf{Z}^{\mathsf{T}} \mathbf{\Omega}^{-1} \mathbf{Z})^{-1} (\mathbf{Z}^{\mathsf{T}} \mathbf{\Omega}^{-1} \mathbf{Y}),$
where $\mathbf{Z}$ is a matrix of observations on the independent variable ($X_t$, t = 1, 2, ..., T) including a vector of ones, $\mathbf{Y}$ is a vector stacking the observations on the dependent variable ($y_t$, t = 1, 2, ..., T), and $\hat{\Theta}$ includes the model parameters.
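A direct implementation of this generalized least squares formula might look as follows; solving linear systems instead of forming $\mathbf{\Omega}^{-1}$ explicitly is a standard numerical choice, not something prescribed by the source.

```python
import numpy as np

def gls_estimate(Z, Y, Omega):
    """Theta_hat = (Z' Omega^{-1} Z)^{-1} (Z' Omega^{-1} Y)."""
    Oi_Z = np.linalg.solve(Omega, Z)   # Omega^{-1} Z
    Oi_Y = np.linalg.solve(Omega, Y)   # Omega^{-1} Y
    return np.linalg.solve(Z.T @ Oi_Z, Z.T @ Oi_Y)
```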
To see why the initial observation assumption stated by Prais–Winsten (1954) is reasonable, it is helpful to consider the mechanics of the generalized least squares estimation procedure sketched above. The inverse of $\mathbf{\Omega}$ can be decomposed as $\mathbf{\Omega}^{-1} = \mathbf{G}^{\mathsf{T}} \mathbf{G}$ with[3]

$\mathbf{G} = \begin{bmatrix} \sqrt{1 - \rho^2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & -\rho & 1 \end{bmatrix}.$
Pre-multiplying the model in matrix notation by this matrix gives the transformed model of Prais–Winsten.
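The decomposition is easy to verify numerically; the sketch below builds $\mathbf{G}$ and checks $\mathbf{G}^{\mathsf{T}}\mathbf{G} = \mathbf{\Omega}^{-1}$ for a small example, reusing the hypothetical ar1_cov helper from above.

```python
import numpy as np

def pw_G(rho, T):
    """Lower-bidiagonal G with G'G = Omega^{-1}: the first row scales
    the initial observation by sqrt(1 - rho^2); each subsequent row
    implements the quasi-difference y_t - rho * y_{t-1}."""
    G = np.eye(T)
    G[0, 0] = np.sqrt(1.0 - rho ** 2)
    G[np.arange(1, T), np.arange(T - 1)] = -rho
    return G

rho, T = 0.5, 6
G = pw_G(rho, T)
Omega = ar1_cov(rho, T)                             # helper from the earlier sketch
print(np.allclose(G.T @ G, np.linalg.inv(Omega)))   # True
```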
The error term is still restricted to be of an AR(1) type. If $\rho$ is not known, a recursive procedure (Cochrane–Orcutt estimation) or a grid search (Hildreth–Lu estimation) may be used to make the estimation feasible. Alternatively, a full information maximum likelihood procedure that estimates all parameters simultaneously has been suggested by Beach and MacKinnon.[4][5]
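When $\rho$ is unknown, one common feasible variant alternates between estimating $\rho$ from residuals and re-fitting on the transformed data; the sketch below follows this recursive scheme (it is not the Beach–MacKinnon maximum likelihood procedure) and reuses the hypothetical prais_winsten_transform helper from the first sketch.

```python
import numpy as np

def prais_winsten_fit(y, X, tol=1e-8, max_iter=100):
    """Feasible Prais-Winsten: iterate between estimating rho from
    residuals and OLS on the transformed data until rho converges."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # initial OLS fit
    rho = 0.0
    for _ in range(max_iter):
        resid = y - X @ beta
        # OLS estimate of rho: regress e_t on e_{t-1}
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        y_star, X_star = prais_winsten_transform(y, X, rho_new)
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        if abs(rho_new - rho) < tol:
            break
        rho = rho_new
    return beta, rho
```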