Estimation of signal parameters via rotational invariant techniques (ESPRIT) is a technique to determine the parameters of a mixture of sinusoids in background noise. This technique was first proposed for frequency estimation.[1] However, with the introduction of phased-array systems in everyday technology, it is also used for angle-of-arrival estimation.[2]
At time instant $t$, the $M$ (complex-valued) output signals (measurements) $y_m[t]$, $m = 1, \ldots, M$, of the system are related to the $K$ (complex-valued) input signals $x_k[t]$, $k = 1, \ldots, K$, as
$$ y_m[t] = \sum_{k=1}^{K} a_{m,k}\, x_k[t] + n_m[t], $$
where $n_m[t]$ denotes the noise added by the system. The one-dimensional form of ESPRIT can be applied if the weights have the form $a_{m,k} = e^{-j(m-1)\omega_k}$, whose phases are integer multiples of some radial frequency $\omega_k$. This frequency depends only on the index of the system's input, i.e. $k$. The goal of ESPRIT is to estimate the $\omega_k$'s, given the outputs $y_m[t]$ and the number of input signals, $K$. Since the radial frequencies are the actual objectives, $a_{m,k}$ is denoted as $a_m(\omega_k)$.
Collating the weights $a_m(\omega_k)$ as $\mathbf{a}(\omega_k) = [\, 1 \;\; e^{-j\omega_k} \;\; e^{-j2\omega_k} \;\; \ldots \;\; e^{-j(M-1)\omega_k} \,]^\top$ and the $M$ output signals at instant $t$ as $\mathbf{y}[t] = [\, y_1[t] \;\; y_2[t] \;\; \ldots \;\; y_M[t] \,]^\top$,
$$ \mathbf{y}[t] = \sum_{k=1}^{K} \mathbf{a}(\omega_k)\, x_k[t] + \mathbf{n}[t], $$
where $\mathbf{n}[t] = [\, n_1[t] \;\; n_2[t] \;\; \ldots \;\; n_M[t] \,]^\top$. Further, when the weight vectors $\mathbf{a}(\omega_1), \mathbf{a}(\omega_2), \ldots, \mathbf{a}(\omega_K)$ are put into a Vandermonde matrix $\mathbf{A} = [\, \mathbf{a}(\omega_1) \;\; \mathbf{a}(\omega_2) \;\; \ldots \;\; \mathbf{a}(\omega_K) \,]$, and the $K$ inputs at instant $t$ into a vector $\mathbf{x}[t] = [\, x_1[t] \;\; \ldots \;\; x_K[t] \,]^\top$, we can write
$$ \mathbf{y}[t] = \mathbf{A}\,\mathbf{x}[t] + \mathbf{n}[t]. $$
With several measurements at instants $t = 1, 2, \ldots, T$ and the notations $\mathbf{Y} = [\, \mathbf{y}[1] \;\; \mathbf{y}[2] \;\; \ldots \;\; \mathbf{y}[T] \,]$, $\mathbf{X} = [\, \mathbf{x}[1] \;\; \mathbf{x}[2] \;\; \ldots \;\; \mathbf{x}[T] \,]$ and $\mathbf{N} = [\, \mathbf{n}[1] \;\; \mathbf{n}[2] \;\; \ldots \;\; \mathbf{n}[T] \,]$, the model equation becomes
$$ \mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{N}. $$
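For concreteness, this model is easy to simulate. The following NumPy sketch builds the Vandermonde matrix $\mathbf{A}$ and generates measurements $\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{N}$; the array size, snapshot count, frequencies, and noise level are arbitrary illustrative choices, and later snippets reuse these variable names.

```python
import numpy as np

rng = np.random.default_rng(0)

M, K, T = 8, 2, 200                    # outputs, inputs, snapshots (illustrative values)
omega = np.array([0.5, 1.3])           # true radial frequencies (assumed for the example)

# Vandermonde matrix A with columns a(omega_k) = [1, e^{-j w_k}, ..., e^{-j(M-1) w_k}]^T
m = np.arange(M)[:, None]              # row index 0..M-1
A = np.exp(-1j * m * omega[None, :])   # shape (M, K)

# Random complex input signals X and additive noise N
X = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)

Y = A @ X + N                          # measurement matrix, shape (M, T)
```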
The weight vector $\mathbf{a}(\omega_k)$ has the property that adjacent entries are related:
$$ [\mathbf{a}(\omega_k)]_{m+1} = e^{-j\omega_k}\, [\mathbf{a}(\omega_k)]_m. $$
To express this relation for the whole vector $\mathbf{a}(\omega_k)$, two selection matrices $\mathbf{J}_1$ and $\mathbf{J}_2$ are introduced:
$$ \mathbf{J}_1 = [\, \mathbf{I}_{M-1} \quad \mathbf{0} \,] \qquad \text{and} \qquad \mathbf{J}_2 = [\, \mathbf{0} \quad \mathbf{I}_{M-1} \,]. $$
Here, $\mathbf{I}_{M-1}$ is an identity matrix of size $(M-1)$ and $\mathbf{0}$ is a vector of zeros.
The vector $\mathbf{J}_1 \mathbf{a}(\omega_k)$ contains all elements of $\mathbf{a}(\omega_k)$ except the last one, and $\mathbf{J}_2 \mathbf{a}(\omega_k)$ contains all elements except the first one. Thus,
$$ \mathbf{J}_2 \mathbf{a}(\omega_k) = e^{-j\omega_k}\, \mathbf{J}_1 \mathbf{a}(\omega_k) $$
and
$$ \mathbf{J}_2 \mathbf{A} = \mathbf{J}_1 \mathbf{A} \mathbf{H}, \quad \text{where} \quad \mathbf{H} := \begin{bmatrix} e^{-j\omega_1} & & & \\ & e^{-j\omega_2} & & \\ & & \ddots & \\ & & & e^{-j\omega_K} \end{bmatrix}. $$
The above relation is the first major observation required for ESPRIT. The second major observation concerns the signal subspace that can be computed from the output signals.
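Continuing the sketch above, the selection matrices and the rotational-invariance relation $\mathbf{J}_2 \mathbf{A} = \mathbf{J}_1 \mathbf{A} \mathbf{H}$ can be checked directly; the relation holds exactly here because $\mathbf{A}$ and $\mathbf{H}$ are noise-free constructions.

```python
# Selection matrices: J1 drops the last row, J2 drops the first row
J1 = np.hstack([np.eye(M - 1), np.zeros((M - 1, 1))])
J2 = np.hstack([np.zeros((M - 1, 1)), np.eye(M - 1)])

H = np.diag(np.exp(-1j * omega))       # diagonal matrix of the sought-after exponentials

# Rotational invariance: J2 A == J1 A H
assert np.allclose(J2 @ A, J1 @ A @ H)
```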
The singular value decomposition (SVD) of $\mathbf{Y}$ is given as
$$ \mathbf{Y} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\dagger, $$
where $\mathbf{U} \in \mathbb{C}^{M \times M}$ and $\mathbf{V} \in \mathbb{C}^{T \times T}$ are unitary matrices and $\mathbf{\Sigma}$ is a diagonal matrix of size $M \times T$ that holds the singular values from the largest (top left) in descending order. The operator $\dagger$ denotes the complex-conjugate transpose (Hermitian transpose).
Let us assume that $T \geq M$. Notice that we have $K$ input signals. If there were no noise, there would only be $K$ non-zero singular values. We assume that the $K$ largest singular values stem from these input signals, and the remaining singular values are presumed to stem from noise. The matrices in the SVD of $\mathbf{Y}$ can be partitioned into submatrices, where some submatrices correspond to the signal subspace and some correspond to the noise subspace:
$$ \mathbf{U} = \begin{bmatrix} \mathbf{U}_\mathrm{S} & \mathbf{U}_\mathrm{N} \end{bmatrix}, \qquad \mathbf{\Sigma} = \begin{bmatrix} \mathbf{\Sigma}_\mathrm{S} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Sigma}_\mathrm{N} & \mathbf{0} \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} \mathbf{V}_\mathrm{S} & \mathbf{V}_\mathrm{N} & \mathbf{V}_0 \end{bmatrix}, $$
where $\mathbf{U}_\mathrm{S} \in \mathbb{C}^{M \times K}$ and $\mathbf{V}_\mathrm{S} \in \mathbb{C}^{T \times K}$ contain the first $K$ columns of $\mathbf{U}$ and $\mathbf{V}$, respectively, and $\mathbf{\Sigma}_\mathrm{S} \in \mathbb{C}^{K \times K}$ is a diagonal matrix comprising the $K$ largest singular values.
Thus, the SVD can be written as
$$ \mathbf{Y} = \mathbf{U}_\mathrm{S} \mathbf{\Sigma}_\mathrm{S} \mathbf{V}_\mathrm{S}^\dagger + \mathbf{U}_\mathrm{N} \mathbf{\Sigma}_\mathrm{N} \mathbf{V}_\mathrm{N}^\dagger, $$
where $\mathbf{U}_\mathrm{S}$, $\mathbf{V}_\mathrm{S}$, and $\mathbf{\Sigma}_\mathrm{S}$ represent the contribution of the input signals $x_k[t]$ to $\mathbf{Y}$. We term $\mathbf{U}_\mathrm{S}$ the signal subspace. In contrast, $\mathbf{U}_\mathrm{N}$, $\mathbf{V}_\mathrm{N}$, and $\mathbf{\Sigma}_\mathrm{N}$ represent the contribution of noise $n_m[t]$ to $\mathbf{Y}$.
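In code, the signal subspace is simply spanned by the first $K$ left singular vectors of $\mathbf{Y}$; a minimal sketch reusing the variables from the earlier snippets:

```python
# Economy-size SVD of the measurement matrix; NumPy returns the singular values
# in descending order, so the first K left singular vectors span the signal subspace.
U, s, Vh = np.linalg.svd(Y, full_matrices=False)
U_S = U[:, :K]                         # estimated signal subspace, shape (M, K)
```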
Hence, from the system model, we can write $\mathbf{A}\mathbf{X} = \mathbf{U}_\mathrm{S} \mathbf{\Sigma}_\mathrm{S} \mathbf{V}_\mathrm{S}^\dagger$ and $\mathbf{N} = \mathbf{U}_\mathrm{N} \mathbf{\Sigma}_\mathrm{N} \mathbf{V}_\mathrm{N}^\dagger$. Also, from the former, we can write
$$ \mathbf{U}_\mathrm{S} = \mathbf{A} \mathbf{F}, \quad \text{where} \quad \mathbf{F} = \mathbf{X} \mathbf{V}_\mathrm{S} \mathbf{\Sigma}_\mathrm{S}^{-1}. $$
In what follows, only the existence of such an invertible matrix $\mathbf{F}$ matters; its actual content will not be important.
Note: The signal subspace can also be extracted from the spectral decomposition of the auto-correlation matrix of the measurements, which is estimated as
$$ \mathbf{R}_\mathrm{YY} = \frac{1}{T} \sum_{t=1}^{T} \mathbf{y}[t]\, \mathbf{y}[t]^\dagger = \frac{1}{T} \mathbf{Y} \mathbf{Y}^\dagger = \frac{1}{T} \mathbf{U} \mathbf{\Sigma} \mathbf{\Sigma}^\dagger \mathbf{U}^\dagger = \frac{1}{T} \mathbf{U}_\mathrm{S} \mathbf{\Sigma}_\mathrm{S}^2 \mathbf{U}_\mathrm{S}^\dagger + \frac{1}{T} \mathbf{U}_\mathrm{N} \mathbf{\Sigma}_\mathrm{N}^2 \mathbf{U}_\mathrm{N}^\dagger. $$
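Equivalently, the signal subspace can be taken from the eigendecomposition of the sample auto-correlation matrix; a sketch, again reusing $\mathbf{Y}$ and $K$ from the earlier snippets:

```python
# Sample auto-correlation matrix of the measurements
R = (Y @ Y.conj().T) / T

# Hermitian eigendecomposition; eigh returns eigenvalues in ascending order,
# so the signal subspace corresponds to the last K eigenvectors.
eigvals, eigvecs = np.linalg.eigh(R)
U_S = eigvecs[:, -K:]                  # spans the same subspace as the SVD-based estimate
```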
We have established two expressions so far: $\mathbf{J}_2 \mathbf{A} = \mathbf{J}_1 \mathbf{A} \mathbf{H}$ and $\mathbf{U}_\mathrm{S} = \mathbf{A} \mathbf{F}$. Now,
$$ \mathbf{J}_2 \mathbf{A} = \mathbf{J}_1 \mathbf{A} \mathbf{H} \implies \mathbf{J}_2 (\mathbf{U}_\mathrm{S} \mathbf{F}^{-1}) = \mathbf{J}_1 (\mathbf{U}_\mathrm{S} \mathbf{F}^{-1}) \mathbf{H} \implies \mathbf{S}_2 = \mathbf{S}_1 \mathbf{P}, $$
where $\mathbf{S}_1 = \mathbf{J}_1 \mathbf{U}_\mathrm{S}$ and $\mathbf{S}_2 = \mathbf{J}_2 \mathbf{U}_\mathrm{S}$ denote the truncated signal subspaces, and
$$ \mathbf{P} = \mathbf{F}^{-1} \mathbf{H} \mathbf{F}. $$
The above equation has the form of an eigenvalue decomposition: since $\mathbf{P}$ is similar to $\mathbf{H}$, the eigenvalues of $\mathbf{P}$ are the diagonal entries of $\mathbf{H}$, and their phases are used to estimate the radial frequencies.
Thus, after solving for $\mathbf{P}$ in the relation $\mathbf{S}_2 = \mathbf{S}_1 \mathbf{P}$, we find the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_K$ of $\mathbf{P}$, where $\lambda_k = \alpha_k e^{-j\omega_k}$ (with magnitude $\alpha_k$ close to one; deviations are due to noise), and the radial frequencies $\omega_1, \omega_2, \ldots, \omega_K$ are estimated from the phases (arguments) of the eigenvalues as $\omega_k = -\arg \lambda_k$.
Remark: In general, $\mathbf{S}_1$ is not invertible. One can use the least squares estimate
$$ \mathbf{P} = (\mathbf{S}_1^\dagger \mathbf{S}_1)^{-1} \mathbf{S}_1^\dagger \mathbf{S}_2. $$
An alternative would be the total least squares estimate.
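The least-squares step and the final phase extraction then look as follows (a sketch continuing the earlier snippets; `np.linalg.lstsq` computes the same least-squares estimate as the normal-equation formula above, but in a numerically more stable way):

```python
# Truncated signal subspaces: drop the last / first row of U_S
S1 = U_S[:-1, :]                       # J1 @ U_S
S2 = U_S[1:, :]                        # J2 @ U_S

# Least-squares solution of S1 @ P = S2
P, *_ = np.linalg.lstsq(S1, S2, rcond=None)

# Eigenvalues of P approximate e^{-j omega_k}; the frequencies are their negated phases
omega_est = -np.angle(np.linalg.eigvals(P))
print(np.sort(omega_est))              # close to the true frequencies [0.5, 1.3]
```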
Input: Measurements $\mathbf{Y} := [\, \mathbf{y}[1] \;\; \mathbf{y}[2] \;\; \ldots \;\; \mathbf{y}[T] \,]$, the number of input signals $K$ (estimate if not already known).
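Putting the pieces of the derivation together, the whole procedure fits in a short function. The following NumPy sketch is one possible implementation of the steps above (signal subspace via SVD, least squares, eigenvalue phases), not a reference implementation:

```python
import numpy as np

def esprit(Y, K):
    """Estimate K radial frequencies from the measurement matrix Y (shape M x T)."""
    # 1. Signal subspace: first K left singular vectors of Y
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    U_S = U[:, :K]

    # 2. Truncated signal subspaces (rows selected by J1 and J2)
    S1, S2 = U_S[:-1, :], U_S[1:, :]

    # 3. Least-squares solution of S1 @ P = S2
    P, *_ = np.linalg.lstsq(S1, S2, rcond=None)

    # 4. Frequencies from the phases of the eigenvalues of P (approximately e^{-j omega_k})
    return -np.angle(np.linalg.eigvals(P))

# Example usage with the Y constructed in the earlier snippets:
# omega_hat = esprit(Y, K)
```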
In the derivation above, the selection matrices $\mathbf{J}_1 = [\, \mathbf{I}_{M-1} \quad \mathbf{0} \,]$ and $\mathbf{J}_2 = [\, \mathbf{0} \quad \mathbf{I}_{M-1} \,]$ were used. However, any appropriate matrices $\mathbf{J}_1$ and $\mathbf{J}_2$ may be used as long as the rotational invariance (i.e., $\mathbf{J}_2 \mathbf{a}(\omega_k) = e^{-j\omega_k} \mathbf{J}_1 \mathbf{a}(\omega_k)$), or some generalization of it (see below), holds; accordingly, the matrices $\mathbf{J}_1 \mathbf{A}$ and $\mathbf{J}_2 \mathbf{A}$ may contain any rows of $\mathbf{A}$.
The rotational invariance used in the derivation may be generalized. So far, the matrix $\mathbf{H}$ has been defined to be a diagonal matrix that stores the sought-after complex exponentials on its main diagonal. However, $\mathbf{H}$ may also exhibit some other structure.[3] For instance, it may be an upper triangular matrix. In this case, $\mathbf{P} = \mathbf{F}^{-1} \mathbf{H} \mathbf{F}$ constitutes a triangularization of $\mathbf{P}$.
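Even in this more general case, the sought-after exponentials can still be read off: since $\mathbf{P}$ is similar to the triangular $\mathbf{H}$, its eigenvalues, which equal the diagonal of a Schur form of $\mathbf{P}$, coincide with the diagonal entries of $\mathbf{H}$. A small sketch of this observation, assuming $\mathbf{P}$ has already been estimated as in the snippets above:

```python
from scipy.linalg import schur

# Complex Schur decomposition P = Q T Q^H with T upper triangular; the diagonal of T
# holds the eigenvalues of P, i.e. the diagonal entries of H (up to ordering).
T_schur, Q = schur(P, output='complex')
omega_est = -np.angle(np.diag(T_schur))
```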