Bivariate Normal Distributions – parameterization of contour ellipses in terms of the Mahalanobis distance and an angle

In my last post about Bivariate Normal Distributions [BVD] I discussed why the contour lines of a BVD’s probability density function [pdf] are concentric ellipses. These contour ellipses are defined by constant values of the so-called Mahalanobis distance. In addition, I discussed a method to create these ellipses from the elements of the BVD’s (variance-)covariance matrix. The method was based on an eigendecomposition of the covariance matrix and the respective eigenvalues and eigenvectors. However, practical evaluations in Machine Learning contexts sometimes require an explicit parameterization of the x- and y-coordinates of points on a contour ellipse.

Below, I give you a parameterization of the (x,y)-coordinates of points on a BVD’s contour ellipse. The parameterization will be given in terms of the Mahalanobis distance and an angle 0 ≤ φ ≤ 2π. The approach is based on previous results regarding a Cholesky decomposition of the covariance matrix (or its inverse, if you like); see this previous post.

Reminder 1: Density function of a Bivariate Normal Distribution

I summarize some results about BVDs derived in other posts of this blog. For the sake of simplicity, I will only discuss BVDs centered in a Cartesian coordinate system [CCS]. The probability density function g_2c(x,y) of a BVD and a respective random vector V = (X,Y)^T (composed of two statistical variables X and Y) can be written as:

\[ g_{2c}(x, y) \,=\, {1 \over 2 \pi \, \sigma_x \, \sigma_y } {1 \over { \sqrt{\, 1\,-\, \rho^2}} } \operatorname{exp} \left( - {1\over2} {1 \over 1\,-\, \rho^2 } \left[ {x^2 \over \sigma_x^2} \,-\, 2 \rho \, {x \over \sigma_x} {y \over \sigma_y} \,+\, {y^2 \over \sigma_y^2} \, \right] \right) \,. \tag{1} \]

ρ is the Pearson correlation coefficient of the statistical variables X and Y (see below). σ_x and σ_y are the standard deviations of the X- and Y-distributions. Using the following notation

\[ \pmb{V} = \begin{pmatrix} X \\ Y \end{pmatrix}, \quad \mbox{concrete values}: \, \pmb{v} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \pmb{\mu} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \quad \pmb{v}_{\mu} = \begin{pmatrix} x - \mu_x \\ y - \mu_y \end{pmatrix} \]

and inserting the symmetric matrix Σ^-1

\[ \pmb{\Sigma}^{-1} \,=\, {1 \over \sigma_x^2\, \sigma_y^2\, \left( 1\,-\, \rho^2\right) } \, \begin{pmatrix} \sigma_y^2 &-\rho\, \sigma_x\sigma_y \\ -\rho\, \sigma_x\sigma_y & \sigma_x^2 \end{pmatrix}, \tag{2} \]

we can rewrite the density of a centered BVD ( μ = 0) in the form

\[ \mbox{centered CCS :} \quad g_{2c}(x, y) \,=\, {1 \over 2 \pi \, \sigma_x \, \sigma_y } {1 \over { \sqrt{\, 1\,-\, \rho^2}} } \operatorname{exp} \left( - {1\over2} \, {\pmb{v}}^{\operatorname{T}} \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \, \right) \,. \tag{3} \]

T symbolizes the transposition operation. The inverse Σ of Σ^-1 is the variance-covariance matrix, with ρ coupling X and Y :

\[ \pmb{\Sigma} \,=\, \begin{pmatrix} \sigma_x^2 &\rho\, \sigma_x\sigma_y \\ \rho\, \sigma_x\sigma_y & \sigma_y^2 \end{pmatrix}, \quad \pmb{\Sigma} \bullet \pmb{\Sigma}^{-1} = \mathbf I_n \,. \tag{5} \]

We require that Σ and Σ^-1 are invertible matrices to avoid a discussion of so-called degenerate cases (where the ellipse collapses to a 1-dimensional object). ρ is related to the covariance of the variables X and Y:

\[ \rho \,=\, { \operatorname{cov} (X,Y) \over \sigma_x\,\sigma_y} \,. \tag{6} \]

g_2c(x,y) depends on the inverse of Σ, only. The scalar d_m given by the exponent of a BVD’s pdf

\[ \begin{align} \left(d_m\right)^2 \,:=\, {\pmb{v}}^T \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \,, \quad d_m \,:=\, \sqrt{ \, {\pmb{v}}^{T^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} } \,. \tag{7} \end{align} \]

is called the Mahalanobis distance. Setting d_m = const. defines a contour line. The reason is that a quadratic form like

\[ \left(d_m\right) ^2 \,:=\, {\pmb{v}}^T \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \, = \, const. = r^2 \tag{8} \]

(for a symmetric, invertible and positive definite matrix Σ^-1) defines vectors whose end-points reside on the border of a centered ellipse. The contour ellipses of a centered BVD are concentric. Their half-axes and orientation in the CCS can be calculated with the help of the eigenvalues and eigenvectors of Σ^-1 or Σ; see this post for details and formulas.
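As a small illustration of eqs. (2) and (7), the following sketch evaluates the Mahalanobis distance of a point (x, y) for a centered BVD. The function name and the example parameter values are hypothetical and chosen only for illustration; the code uses nothing beyond the Python standard library.

```python
import math

def mahalanobis_centered(x, y, sigma_x, sigma_y, rho):
    """Mahalanobis distance d_m of (x, y) for a centered BVD, eqs. (2) and (7)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1 (non-degenerate case)")
    u = x / sigma_x          # coordinates scaled by the standard deviations
    w = y / sigma_y
    # Quadratic form v^T Sigma^-1 v, written out with the elements of eq. (2)
    d2 = (u * u - 2.0 * rho * u * w + w * w) / (1.0 - rho * rho)
    return math.sqrt(max(d2, 0.0))   # guard against tiny negative rounding errors

# Hypothetical example values, not taken from the text
d = mahalanobis_centered(2.0, 0.0, sigma_x=2.0, sigma_y=1.0, rho=0.5)
print(d)
```

For ρ = 0 and σ_x = σ_y = 1 the function reduces to the ordinary Euclidean distance from the origin.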

Reminder 2: Cholesky decomposition of the covariance matrix and explicit construction of a BVD’s random vector

In yet another post, we saw that the decomposition of a symmetric, real-valued matrix such as Σ^-1 or Σ is not unique. Instead of an eigendecomposition, we can also apply a Cholesky decomposition. It involves a lower triangular, real-valued matrix U, which fulfills

\[ \operatorname{\pmb{\Sigma}} \,=\, \operatorname{\pmb{U}} \bullet \, \operatorname{\pmb{U}}^{\operatorname{T}} \,. \tag{9} \]

As our Σ is positive definite, the Cholesky decomposition into triangular matrices is unique, and U's diagonal contains positive values only. With the help of U, the random vector V = (X, Y)^T of a centered BVD (with marginals X and Y) can be created in the following explicit form from a random vector Z = (Z₁, Z₂)^T of two independent centered Gaussians Z₁, Z₂:

\[ \pmb{V} \:=\: \begin{pmatrix} X \\ Y \end{pmatrix} \:=\: \pmb{\operatorname{U}} \bullet \, \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} \,, \tag{10} \]

\[ \begin{align} X \:&=\: \sigma_x \, Z_1 \,, \\[10pt] Y \:&=\: \sigma_y \, \left[ \, \rho \, Z_1 \,+ \, \left( 1 \,-\, \rho^2\right)^{1/2} \, Z_2 \, \right] \,. \end{align} \tag{11} \]
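The following sketch writes out the lower triangular factor U implied by eq. (11) and uses it to draw sample points of a centered BVD. It assumes that Z₁ and Z₂ are standard normal (unit variance), which is what eq. (11) implies for X to have standard deviation σ_x. The function names, the seed and the parameter values are hypothetical.

```python
import math
import random

def cholesky_factor(sigma_x, sigma_y, rho):
    """Lower triangular U with U U^T = Sigma, read off from eq. (11)."""
    return [[sigma_x, 0.0],
            [rho * sigma_y, sigma_y * math.sqrt(1.0 - rho * rho)]]

def sample_bvd(n, sigma_x, sigma_y, rho, seed=0):
    """Draw n points (x, y) via eq. (11) from independent standard normal z1, z2."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    points = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        points.append((sigma_x * z1, sigma_y * (rho * z1 + s * z2)))
    return points

# Hypothetical example values
U = cholesky_factor(2.0, 1.0, 0.5)

# Multiply U by its transpose; the result should reproduce Sigma of eq. (5)
Sigma = [[sum(U[i][k] * U[j][k] for k in range(2)) for j in range(2)]
         for i in range(2)]
print(Sigma)

points = sample_bvd(1000, 2.0, 1.0, 0.5)
```

The product U Uᵀ has σ_x² and σ_y² on the diagonal and ρ σ_x σ_y off the diagonal, in line with eqs. (5) and (9).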

Application of Cholesky decomposition to vectors z defining a contour of Z

As we have seen in other posts in this blog, vectors v of a BVD can in general be generated by a linear transformation of vectors z pointing to a contour line of the pdf of Z by some invertible, real-valued matrix M:

\[ \begin{align} &\operatorname{\pmb{M}} \,: \,\quad \pmb{z} \quad \rightarrow \quad \pmb{v} \,=\, \operatorname{\pmb{M}} \bullet \, \pmb{z} \, \\[10pt] &\mbox{with} \quad \operatorname{ \pmb{\Sigma}}_m \:=\: \operatorname{\pmb{M}} \bullet \, \operatorname{\pmb{M}}^T \,, \quad \operatorname{ \pmb{\Sigma}}_m^{-1} \:=\: \left[\operatorname{\pmb{M}}^{-1}\right]^T \bullet \operatorname{\pmb{M}}^{-1} \,. \tag{12} \end{align}\]

Vectors z pointing to a contour line of the pdf for Z are thereby mapped to vectors v fulfilling

\[ \begin{align} \Rightarrow \quad \|\pmb{z}\|^2 \,=\, \pmb{z}^{\operatorname{T}^{\phantom{A}}} \bullet \pmb{z} \:=\: r^2 \, \quad &\rightarrow \quad d_m^2(\pmb{v}) \:=\: {\pmb{v}}^{\operatorname{T}^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet \pmb{v} \:=\: r^2 \,, \\[10pt] \Rightarrow \quad \|\pmb{z}\| \,=\, \sqrt{ \pmb{z}^{\operatorname{T}^{\phantom{A}}} \bullet \pmb{z} } \:=\: r \, \quad &\rightarrow \quad d_m(\pmb{v}) \:=\: \sqrt{ {\pmb{v}}^{\operatorname{T}^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet \pmb{v} \, } \:=\: r \,. \tag{13} \end{align}\]

Among other valid solutions, we can set M = U. (In the post on the Cholesky decomposition of Σ mentioned above, I have discussed where this degree of freedom comes from.) This in turn means that we can pick concrete vectors z with end-points on a circle of radius r and apply the transformation (11) to get the contour line determined by d_m = r in the target BVD. We can set the components of such z-vectors to

\[ \textit{z} \text{-contour circle of radius }\textit{r} \,\, : \quad \pmb{z} \:=\: \begin{pmatrix} z_1 \\z_2 \end{pmatrix} \:=\: r * \begin{pmatrix} \cos \phi \\ \sin \phi \end{pmatrix} \,. \]

Using transformation (11) we get for the set of contour vectors v_c ( r =const., φ )

\[ \begin{align} \pmb{v_c}\left( r, \phi\right) \:&=\: \begin{pmatrix} x \\ y\end{pmatrix} \:=\: \operatorname{\pmb{U}} \bullet \, \pmb{z} \:=\: r * \operatorname{\pmb{U}} \bullet \, \begin{pmatrix} \cos \phi \\ \sin \phi \end{pmatrix} \quad \Rightarrow \tag{14} \\[10pt] x \:&=\: \sigma_x * r * \cos \phi \,, \tag{15} \\[10pt] y\:&=\: \sigma_y * r * \left( \, \rho * \cos \phi \,+\, \sqrt{\, 1 \,-\, \rho^2 \, } * \sin \phi \, \right) \,, \tag{16} \\[10pt] 0 &\le \phi \,\le\, 2\,\pi \,, \quad d_m \:=\: r \:=\: const. \, . \tag{17} \end{align} \]

This is a very convenient parameterization of the contour ellipse for d_m = r, given by parameters coming from the BVD’s covariance matrix Σ and a variable angle φ. It can easily be handled in numerical applications.
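Eqs. (15) and (16) translate almost literally into code. The sketch below is one possible way to write them down; the function names, the number of points and the example parameter values are hypothetical.

```python
import math

def contour_point(r, phi, sigma_x, sigma_y, rho):
    """Point (x, y) on the contour ellipse d_m = r, eqs. (15) and (16)."""
    x = sigma_x * r * math.cos(phi)
    y = sigma_y * r * (rho * math.cos(phi)
                       + math.sqrt(1.0 - rho * rho) * math.sin(phi))
    return x, y

def contour_ellipse(r, sigma_x, sigma_y, rho, n_points=200):
    """List of points for equally spaced angles phi in [0, 2*pi)."""
    return [contour_point(r, 2.0 * math.pi * k / n_points, sigma_x, sigma_y, rho)
            for k in range(n_points)]

# Hypothetical example values: the point for phi = 0
x0, y0 = contour_point(1.0, 0.0, sigma_x=2.0, sigma_y=1.0, rho=0.5)
print(x0, y0)
```

For φ = 0 the formulas give the point (r σ_x, r ρ σ_y). Since x = σ_x r cos φ is largest there, this is the right-most point of the ellipse.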

Check of the Mahalanobis distance

We check that the transformed vectors v_c really fulfill d_m = r. We indeed get

\[ \begin{align} (d_m)^2 \:&=\: \left[ \, \pmb{v}^{\operatorname{T}} \bullet \pmb{\operatorname{\Sigma}^{-1}} \bullet \pmb{v} \right] \\[8pt] &=\: {1 \over 1 \,-\, \rho^2} \, \left[ r^2 \, \cos^2 \phi \, -\, 2\,\rho \, \left( r * \cos \phi \right) * r \, \left( \rho * \cos \phi \,+\, \sqrt{\, 1 \,-\, \rho^2 \, } \, * \sin \phi \, \right) \,+\, r^2 \, \left( \rho \, \cos \phi \,+\, \sqrt{\, 1 \,-\, \rho^2 \, } \, \sin \phi \, \right)^2 \, \right] \\[8pt] &=\: r^2 \, {1 \over 1 \,-\, \rho^2} \, \left[ \, \cos^2 \phi \,-\, 2\,\rho^2\, \cos^2 \phi \,-\, 2\,\rho\, \sqrt{\, 1\,-\, \rho^2} \, \sin \phi \, \cos \phi \,+\, \rho^2 \, \cos^2 \phi \,+ \, 2 \,\rho \, \sqrt{\, 1\,-\, \rho^2\, } \, \sin\phi\, \cos\phi \,+\, \left( 1 \,-\, \rho^2 \right) \,\sin^2 \phi \, \right] \\[8pt] &=\: r^2 \, {1 \over 1 \,-\, \rho^2} \, \left[ 1 \,-\, \rho^2 \right] \:=\: r^2\,. \end{align} \]

As expected.
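The same check can be repeated numerically. The sketch below evaluates the quadratic form of eq. (7) for points generated with eqs. (15) and (16) and records the largest deviation from r² over a grid of angles. All names and parameter values are hypothetical, and the grid of 360 angles is an arbitrary choice.

```python
import math

def contour_point(r, phi, sigma_x, sigma_y, rho):
    """Point (x, y) on the contour ellipse, eqs. (15) and (16)."""
    x = sigma_x * r * math.cos(phi)
    y = sigma_y * r * (rho * math.cos(phi)
                       + math.sqrt(1.0 - rho * rho) * math.sin(phi))
    return x, y

def mahalanobis_sq(x, y, sigma_x, sigma_y, rho):
    """Squared Mahalanobis distance v^T Sigma^-1 v for a centered BVD."""
    u, w = x / sigma_x, y / sigma_y
    return (u * u - 2.0 * rho * u * w + w * w) / (1.0 - rho * rho)

# Hypothetical example values, including a negative correlation
r, sigma_x, sigma_y, rho = 1.5, 2.0, 0.7, -0.6

max_dev = 0.0
for k in range(360):
    phi = 2.0 * math.pi * k / 360
    x, y = contour_point(r, phi, sigma_x, sigma_y, rho)
    dev = abs(mahalanobis_sq(x, y, sigma_x, sigma_y, rho) - r * r)
    max_dev = max(max_dev, dev)

print(max_dev)   # should be at the level of floating point rounding errors
```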

Steps to construct BVD contour ellipses for a defined Mahalanobis distance

We have found a simple method to numerically construct a contour ellipse for a given constant value r of the Mahalanobis distance d_m = r:

Method based on Cholesky decomposition

Step 1: Get the parameters of the (approximate) BVD’s central covariance matrix. If necessary, determine them numerically from given data.
Step 2: Take vectors z = ( r* cosφ, r* sinφ)^T for a sufficiently dense covering of 0 ≤ φ ≤ 2π.
Step 3: Apply the transformation given by eqs. (15) and (16) to these vectors z and get a list of transformed vectors v.
Step 4: Plot the end-points of the vectors v in a coordinate system.
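The steps above can be sketched as follows. The tiny data set is made up purely for illustration and is already centered. The function names are hypothetical, and the sample estimators with the factor 1/(n − 1) are one common choice, not something prescribed by the text. Plotting (Step 4) is only indicated in a comment to keep the sketch free of external dependencies.

```python
import math

def estimate_parameters(points):
    """Step 1: sample standard deviations and Pearson correlation of 2D data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return math.sqrt(sxx), math.sqrt(syy), sxy / math.sqrt(sxx * syy)

def contour_ellipse(r, sigma_x, sigma_y, rho, n_points=360):
    """Steps 2 and 3: vectors z on a circle of radius r, mapped by eqs. (15), (16)."""
    s = math.sqrt(1.0 - rho * rho)
    ellipse = []
    for k in range(n_points):
        phi = 2.0 * math.pi * k / n_points
        z1, z2 = r * math.cos(phi), r * math.sin(phi)         # Step 2
        ellipse.append((sigma_x * z1,                          # Step 3, eq. (15)
                        sigma_y * (rho * z1 + s * z2)))        # Step 3, eq. (16)
    return ellipse

# Made-up, centered example data (illustration only)
data = [(-2.0, -1.1), (-1.0, -0.3), (0.0, 0.1), (1.0, 0.4), (2.0, 0.9)]

sigma_x, sigma_y, rho = estimate_parameters(data)
ellipse = contour_ellipse(2.0, sigma_x, sigma_y, rho)

# Step 4: plot the end-points, e.g. with matplotlib:
#   xs, ys = zip(*ellipse); plt.plot(xs, ys)
```

The resulting ellipse is centered at the origin. For data that are not centered, the points would have to be shifted by the sample mean before plotting them over the data.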

Conclusion

The Cholesky decomposition and the resulting vector transformation help to construct contour ellipses of a BVD for a defined Mahalanobis distance d_m from the elements of the BVD’s covariance matrix. The necessary steps are relatively simple to implement. In a forthcoming post we will use a respective procedure to get a clearer impression of the effects of covariance matrix parameters on the shape of contour ellipses.

Stay tuned …