A bivariate normal distribution [BVD] is governed by a symmetric, positive-definite matrix: its covariance matrix, which describes the variances and the correlation of the BVD’s marginal distributions. The contour lines of the probability density function of a BVD are ellipses. The half-axes and the orientation of these ellipses are controlled by the eigenvalues and eigenvectors of the BVD’s covariance matrix. In this post I show a simple way to determine the eigenvalues and eigenvectors of such a symmetric matrix. We will use the characteristic equation of the matrix for this purpose.
In the context of Machine Learning, approximate BVDs appear relatively often, in many cases as projections of approximate Multivariate Normal Distributions onto 2-dimensional coordinate planes of input data or of processed data in latent spaces. The results below will help us in such contexts to determine the parameters of confidence ellipses of the probability densities.
Related posts about covariance matrices of BVDs:
- Bivariate normal distribution – derivation by linear transformation of a random vector for two independent Gaussians
- Bivariate Normal Distribution – derivation of the covariance and correlation by integration of the probability density
- Probability density function of a Bivariate Normal Distribution – derived from assumptions on marginal distributions and functional factorization
- Bivariate normal distribution – explicit reconstruction of a BVD random vector via Cholesky decomposition of the covariance matrix
Invertible symmetric matrix
Let us take an invertible symmetric matrix Aq of the form
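A minimal sketch of such a matrix; the element names a, b, c are an assumption made here for illustration:

\[
A_q \;=\; \begin{pmatrix} a & b \\ b & c \end{pmatrix},
\qquad a,\, b,\, c \,\in\, \mathbb{R},
\qquad \det(A_q) \,=\, ac - b^2 \,\neq\, 0 .
\]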
In addition, we require positive definiteness:
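In the element notation assumed above, positive definiteness amounts to:

\[
a \,>\, 0
\qquad \text{and} \qquad
\det(A_q) \,=\, ac - b^2 \,>\, 0 ,
\]

which together also imply c > 0.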
Such a matrix has an eigenvalue decomposition and its two eigenvalues are real and positive.
Eigenvalue problem
The eigenvalue problem for a symmetric matrix can be written in the form
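In the assumed notation the eigenvalue problem reads:

\[
\left( A_q \,-\, \lambda\, I_2 \right)\, \xi \;=\; 0 ,
\]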
with I2 being the 2-dimensional identity matrix. λ1/2 and ξ1/2 represent the two eigenvalues and the two eigenvectors, respectively. The combined matrix (Aq − λ I2) cannot be invertible if non-trivial solutions ξ1/2 ≠ 0 shall exist. Therefore, its determinant must be zero:
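Written as a condition on λ:

\[
\det\!\left( A_q \,-\, \lambda\, I_2 \right) \;=\; 0 .
\]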
This is also called the “characteristic equation” of our matrix Aq. It is a quadratic equation in λ:
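With the element names assumed above this becomes:

\[
\det\!\left( A_q - \lambda\, I_2 \right)
\;=\; (a - \lambda)(c - \lambda) \,-\, b^2
\;=\; \lambda^2 \,-\, (a + c)\,\lambda \,+\, \left( ac - b^2 \right)
\;=\; 0 .
\]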
This allows for two solutions.
Eigenvalues of the symmetric matrix
The characteristic equation above has the following two solutions:
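Applying the standard formula for quadratic equations (still in the assumed element names):

\[
\lambda_{1/2} \;=\; \frac{1}{2} \left[ (a + c) \;\mp\; \sqrt{ (a + c)^2 \,-\, 4 \left( ac - b^2 \right) } \right] .
\]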
Sorting terms and simplifying gives us two eigenvalues λ1 and λ2:
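A plausible simplified form, with λ1 taken as the smaller of the two values:

\[
\lambda_{1} \;=\; \frac{1}{2} \left[ (a + c) \;-\; \sqrt{ (a - c)^2 + 4\, b^2 } \right],
\qquad
\lambda_{2} \;=\; \frac{1}{2} \left[ (a + c) \;+\; \sqrt{ (a - c)^2 + 4\, b^2 } \right] .
\]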
Note that the enumeration has no deeper meaning. We have chosen the first value to be the one which is the smaller of the two.
Eigenvectors
We are only interested in the directions of the eigenvectors ξ1/2 and not in their length. We simply set the second component of the first eigenvector to 1:
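In components:

\[
\xi_1 \;=\; \begin{pmatrix} \xi_{1,1} \\ 1 \end{pmatrix} .
\]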
The eigenvalue equation gives us two (equivalent) component equations for ξ1,1. We take the second one (resulting from the lower row of the combined matrix):
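In the assumed notation the lower row gives:

\[
b\, \xi_{1,1} \;+\; \left( c - \lambda_1 \right) \cdot 1 \;=\; 0 .
\]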
Thus
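in the assumed notation, and provided b ≠ 0 (for b = 0 the matrix is already diagonal and the coordinate axes are eigenvectors):

\[
\xi_{1,1} \;=\; \frac{\lambda_1 - c}{b} .
\]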
In a completely analogous way we can determine ξ2,1:
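Under the same assumptions:

\[
\xi_{2,1} \;=\; \frac{\lambda_2 - c}{b} .
\]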
So, eventually we have:
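Collecting the results (element names as assumed above):

\[
\xi_1 \;=\; \left( \frac{\lambda_1 - c}{b},\; 1 \right)^{T},
\qquad
\xi_2 \;=\; \left( \frac{\lambda_2 - c}{b},\; 1 \right)^{T} .
\]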
(T symbolizes the transposition operation.) These results are, of course, identical to the results which I have derived in another post on ellipses and their defining matrices. See:
Properties of ellipses by matrix coefficients – I – Two defining matrices and eigenvalues.
Normalization and angle
For practical purposes, and to get angles, it is sometimes better to normalize the eigenvectors to length 1:
Then the components in x- and y-direction give cos θ and sin θ of the angle θ between the respective eigenvector and the x-axis.
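A sketch of the normalization, based on the components derived above:

\[
\hat{\xi}_{1} \;=\; \frac{1}{\sqrt{ \xi_{1,1}^{\,2} + 1 }} \begin{pmatrix} \xi_{1,1} \\ 1 \end{pmatrix},
\qquad
\hat{\xi}_{2} \;=\; \frac{1}{\sqrt{ \xi_{2,1}^{\,2} + 1 }} \begin{pmatrix} \xi_{2,1} \\ 1 \end{pmatrix} .
\]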
An explicit formula for the angle θ between ξ1 and the x-axis is:
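One explicit form which follows directly from the components derived above (in the assumed notation):

\[
\tan \theta \;=\; \frac{1}{\xi_{1,1}} \;=\; \frac{b}{\lambda_1 - c},
\qquad
\theta \;=\; \arctan\!\left( \frac{b}{\lambda_1 - c} \right) .
\]

In numerical code it is usually safer to use atan2 of the two (normalized) components, which avoids the quadrant ambiguity of the plain arctan.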
You can already see some ambiguities here which have to be resolved in practical applications. They have to do with the fact that having two eigenvalues does not by itself tell us anything about their position in the central diagonal matrix of an eigendecomposition. This assignment has to be checked.
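As a quick cross-check of the closed-form expressions, here is a minimal Python sketch (the element names a, b, c and the helper name eig_sym_2x2 are assumptions made for illustration). It evaluates the formulas above and compares them with numpy.linalg.eigh:

```python
import numpy as np

def eig_sym_2x2(a, b, c):
    """Closed-form eigenvalues and eigenvectors of the symmetric 2x2 matrix [[a, b], [b, c]].

    Returns (lam1, lam2, xi1, xi2, theta1) with lam1 <= lam2; assumes b != 0.
    """
    root = np.sqrt((a - c) ** 2 + 4.0 * b ** 2)
    lam1 = 0.5 * ((a + c) - root)              # smaller eigenvalue
    lam2 = 0.5 * ((a + c) + root)              # larger eigenvalue
    xi1 = np.array([(lam1 - c) / b, 1.0])      # un-normalized eigenvectors
    xi2 = np.array([(lam2 - c) / b, 1.0])
    xi1 /= np.linalg.norm(xi1)                 # normalize to length 1
    xi2 /= np.linalg.norm(xi2)
    theta1 = np.arctan2(xi1[1], xi1[0])        # angle of xi1 vs. the x-axis
    return lam1, lam2, xi1, xi2, theta1

# Example: an assumed covariance matrix of an approximate BVD
a, b, c = 2.0, 0.8, 1.0
lam1, lam2, xi1, xi2, theta1 = eig_sym_2x2(a, b, c)

# Cross-check against numpy's eigendecomposition for symmetric matrices
A = np.array([[a, b], [b, c]])
w, V = np.linalg.eigh(A)                       # eigenvalues in ascending order
print(np.allclose([lam1, lam2], w))            # -> True
print(np.allclose(abs(xi1 @ V[:, 0]), 1.0))    # eigenvectors agree up to sign -> True
print(np.degrees(theta1))                      # orientation of the first eigenvector in degrees
```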
Eigenvalues and eigenvectors of the inverse matrix
In the context of statistics and Machine Learning, the inverse (symmetric) matrix (Aq)-1 often plays an important role, too. Its eigenvalues are also real and given by the reciprocal values of the eigenvalues of Aq:
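In the notation used above:

\[
\lambda\!\left( A_q^{-1} \right) \;\in\; \left\{ \frac{1}{\lambda_1},\; \frac{1}{\lambda_2} \right\},
\qquad \text{with } \lambda_1,\, \lambda_2 \text{ as derived above} .
\]

Note that taking reciprocals reverses the ordering: 1/λ1 is now the larger of the two values.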
One can show that the eigenvectors of the inverse matrix (Aq)-1 are the same as those of Aq.
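The argument is a one-liner; multiplying the eigenvalue equation from the left by (1/λ) (Aq)-1 gives:

\[
A_q\, \xi \;=\; \lambda\, \xi
\quad \Longrightarrow \quad
A_q^{-1}\, \xi \;=\; \frac{1}{\lambda}\, \xi .
\]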
Conclusion
In this post we have written down the eigenvalues and eigenvectors of a symmetric, invertible, real-valued matrix explicitly in terms of the matrix elements. In the context of Machine Learning these results will be helpful, e.g., for determining contour lines (ellipses) of approximate bivariate normal distributions.
Stay tuned …