A bivariate normal distribution [BVD] is governed by a symmetric, positive-definite matrix: its covariance matrix, which describes the variances and the correlation of the BVD’s marginal distributions. The contour lines of the probability density function of a BVD are ellipses. The half-axes and the orientation of these ellipses are controlled by the eigenvalues and eigenvectors of the BVD’s covariance matrix. In this post I show a simple way to determine the eigenvalues and eigenvectors of such a symmetric matrix. We will use the characteristic equation of the matrix for this purpose.
In the context of Machine Learning, approximate BVDs appear relatively often, in many cases as projections of approximate Multivariate Normal Distributions onto 2-dimensional coordinate planes of input data or of processed data in latent spaces. The results below will help us in such contexts to determine the parameters of confidence ellipses of the probability densities.
Related posts about covariance matrices of BVDs:
- Bivariate normal distribution – derivation by linear transformation of a random vector for two independent Gaussians
- Bivariate Normal Distribution – derivation of the covariance and correlation by integration of the probability density
- Probability density function of a Bivariate Normal Distribution – derived from assumptions on marginal distributions and functional factorization
- Bivariate normal distribution – explicit reconstruction of a BVD random vector via Cholesky decomposition of the covariance matrix
Invertible symmetric matrix
Let us take an invertible symmetric matrix Aq of the form
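A minimal sketch of such a matrix; the element names a, b, c are an assumption made here for illustration:

\[
A_q \;=\; \begin{pmatrix} a & b \\ b & c \end{pmatrix},
\qquad a,\, b,\, c \,\in\, \mathbb{R},
\qquad \det(A_q) \,=\, ac - b^2 \,\neq\, 0 .
\]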
In addition, we require positive definiteness:
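In the element notation assumed above, positive definiteness amounts to:

\[
a \,>\, 0
\qquad \text{and} \qquad
\det(A_q) \,=\, ac - b^2 \,>\, 0 ,
\]

which together also imply c > 0.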
Such a matrix has an eigenvalue decomposition and its two eigenvalues are real and positive.
Eigenvalue problem
The eigenvalue problem for a symmetric matrix can be written in the form
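In the assumed notation the eigenvalue problem reads:

\[
\left( A_q \,-\, \lambda\, I_2 \right)\, \xi \;=\; 0 ,
\]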
with I2 being the 2-dimensional identity matrix. λ1/2 and ξ1/2 represent the two eigenvalues and the two eigenvectors, respectively. The combined matrix (Aq − λ I2) cannot be invertible if non-trivial solutions ξ1/2 ≠ 0 shall exist. Therefore, its determinant must be zero:
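Written as a condition on λ:

\[
\det\!\left( A_q \,-\, \lambda\, I_2 \right) \;=\; 0 .
\]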
This is also called the “characteristic equation” of our matrix Aq. It is a quadratic equation in λ:
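With the element names assumed above this becomes:

\[
\det\!\left( A_q - \lambda\, I_2 \right)
\;=\; (a - \lambda)(c - \lambda) \,-\, b^2
\;=\; \lambda^2 \,-\, (a + c)\,\lambda \,+\, \left( ac - b^2 \right)
\;=\; 0 .
\]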
This allows for two solutions.
Eigenvalues of the symmetric matrix
The characteristic equation above has the following two solutions:
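Applying the standard formula for quadratic equations (still in the assumed element names):

\[
\lambda_{1/2} \;=\; \frac{1}{2} \left[ (a + c) \;\mp\; \sqrt{ (a + c)^2 \,-\, 4 \left( ac - b^2 \right) } \right] .
\]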
Sorting terms and simplifying gives us two eigenvalues λ1 and λ2:
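A plausible simplified form, with λ1 taken as the smaller of the two values:

\[
\lambda_{1} \;=\; \frac{1}{2} \left[ (a + c) \;-\; \sqrt{ (a - c)^2 + 4\, b^2 } \right],
\qquad
\lambda_{2} \;=\; \frac{1}{2} \left[ (a + c) \;+\; \sqrt{ (a - c)^2 + 4\, b^2 } \right] .
\]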
Note that the enumeration has no deeper meaning. We have chosen the first value to be the one which is the smaller of the two.
Eigenvectors
We are only interested in the directions of the eigenvectors ξ1/2 and not in their length. We simply set the second component of the first eigenvector to 1:
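In components:

\[
\xi_1 \;=\; \begin{pmatrix} \xi_{1,1} \\ 1 \end{pmatrix} .
\]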
The eigenvalue equation gives us two (equivalent) component equations for ξ1,1. We take the second one (resulting from the lower row of the combined matrix):
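In the assumed notation the lower row gives:

\[
b\, \xi_{1,1} \;+\; \left( c - \lambda_1 \right) \cdot 1 \;=\; 0 .
\]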
Thus
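in the assumed notation, and provided b ≠ 0 (for b = 0 the matrix is already diagonal and the coordinate axes are eigenvectors):

\[
\xi_{1,1} \;=\; \frac{\lambda_1 - c}{b} .
\]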
In a completely analogous way we can determine ξ2,1:
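Under the same assumptions:

\[
\xi_{2,1} \;=\; \frac{\lambda_2 - c}{b} .
\]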
So, eventually we have:
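Collecting the results (element names as assumed above):

\[
\xi_1 \;=\; \left( \frac{\lambda_1 - c}{b},\; 1 \right)^{T},
\qquad
\xi_2 \;=\; \left( \frac{\lambda_2 - c}{b},\; 1 \right)^{T} .
\]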
(T symbolizes the transposition operation.) These results are, of course, identical to the results which I have derived in another post on ellipses and their defining matrices. See:
Properties of ellipses by matrix coefficients – I – Two defining matrices and eigenvalues.
Normalization and angle
For practical purposes, and to get angles, it is sometimes better to normalize the eigenvectors to length 1:
Then the components in x- and y-direction give cos θ and sin θ of the angle θ between the respective eigenvector and the x-axis.
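A sketch of the normalization, based on the components derived above:

\[
\hat{\xi}_{1} \;=\; \frac{1}{\sqrt{ \xi_{1,1}^{\,2} + 1 }} \begin{pmatrix} \xi_{1,1} \\ 1 \end{pmatrix},
\qquad
\hat{\xi}_{2} \;=\; \frac{1}{\sqrt{ \xi_{2,1}^{\,2} + 1 }} \begin{pmatrix} \xi_{2,1} \\ 1 \end{pmatrix} .
\]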
An explicit formula for the angle θ between ξ1 and the x-axis is:
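One explicit form which follows directly from the components derived above (in the assumed notation):

\[
\tan \theta \;=\; \frac{1}{\xi_{1,1}} \;=\; \frac{b}{\lambda_1 - c},
\qquad
\theta \;=\; \arctan\!\left( \frac{b}{\lambda_1 - c} \right) .
\]

In numerical code it is usually safer to use atan2 of the two (normalized) components, which avoids the quadrant ambiguity of the plain arctan.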
You can already see some ambiguities here which have to be resolved in practical applications. They have to do with the fact that having two eigenvalues does not by itself tell us anything about their position in the central diagonal matrix of an eigendecomposition. This assignment has to be checked.
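As a quick cross-check of the closed-form expressions, here is a minimal Python sketch (the element names a, b, c and the helper name eig_sym_2x2 are assumptions made for illustration). It evaluates the formulas above and compares them with numpy.linalg.eigh:

```python
import numpy as np

def eig_sym_2x2(a, b, c):
    """Closed-form eigenvalues and eigenvectors of the symmetric 2x2 matrix [[a, b], [b, c]].

    Returns (lam1, lam2, xi1, xi2, theta1) with lam1 <= lam2; assumes b != 0.
    """
    root = np.sqrt((a - c) ** 2 + 4.0 * b ** 2)
    lam1 = 0.5 * ((a + c) - root)              # smaller eigenvalue
    lam2 = 0.5 * ((a + c) + root)              # larger eigenvalue
    xi1 = np.array([(lam1 - c) / b, 1.0])      # un-normalized eigenvectors
    xi2 = np.array([(lam2 - c) / b, 1.0])
    xi1 /= np.linalg.norm(xi1)                 # normalize to length 1
    xi2 /= np.linalg.norm(xi2)
    theta1 = np.arctan2(xi1[1], xi1[0])        # angle of xi1 vs. the x-axis
    return lam1, lam2, xi1, xi2, theta1

# Example: an assumed covariance matrix of an approximate BVD
a, b, c = 2.0, 0.8, 1.0
lam1, lam2, xi1, xi2, theta1 = eig_sym_2x2(a, b, c)

# Cross-check against numpy's eigendecomposition for symmetric matrices
A = np.array([[a, b], [b, c]])
w, V = np.linalg.eigh(A)                       # eigenvalues in ascending order
print(np.allclose([lam1, lam2], w))            # -> True
print(np.allclose(abs(xi1 @ V[:, 0]), 1.0))    # eigenvectors agree up to sign -> True
print(np.degrees(theta1))                      # orientation of the first eigenvector in degrees
```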
Eigenvalues and eigenvectors of the inverse matrix
In the context of statistics and Machine Learning, the inverse (symmetric) matrix (Aq)-1 often plays an important role, too. Its eigenvalues are also real and given by the reciprocal values of the eigenvalues of Aq:
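In the notation used above:

\[
\lambda\!\left( A_q^{-1} \right) \;\in\; \left\{ \frac{1}{\lambda_1},\; \frac{1}{\lambda_2} \right\},
\qquad \text{with } \lambda_1,\, \lambda_2 \text{ as derived above} .
\]

Note that taking reciprocals reverses the ordering: 1/λ1 is now the larger of the two values.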
One can show that the eigenvectors of the inverse matrix (Aq)-1 are the same as those of Aq.
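The argument is a one-liner; multiplying the eigenvalue equation from the left by (1/λ) (Aq)-1 gives:

\[
A_q\, \xi \;=\; \lambda\, \xi
\quad \Longrightarrow \quad
A_q^{-1}\, \xi \;=\; \frac{1}{\lambda}\, \xi .
\]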
Conclusion
In this post we have written down the eigenvalues and eigenvectors of a symmetric, invertible, real-valued matrix explicitly in terms of the matrix elements. In the context of Machine Learning these results will be helpful, e.g., for determining contour lines (ellipses) of approximate bivariate normal distributions.
Stay tuned …