
Bivariate Normal Distribution – Mahalanobis distance and contour ellipses

I continue with my posts on Bivariate Normal Distributions [BVDs]. In this post we consider the exponent of a BVD’s probability density function [pdf]. This function is governed by a central matrix Σ-1, the inverse of the variance-covariance matrix of the BVD’s random vector. We define the so-called Mahalanobis distance dm for BVD vectors. A constant value of the Mahalanobis distance defines a contour line of the probability density via a quadratic form. The resulting ellipses can be constructed using results of the eigendecomposition of the covariance matrix Σ or of its inverse Σ-1.

Related posts:

Reminder: Density function of a bivariate normal distribution

I summarize some results of previous posts listed above. A given statistical data distribution can be centered in a Cartesian Coordinate System [CCS] by choosing an appropriate location of the origin: The expectation values for the marginal distributions are thereby set to zero. This corresponds to a simple translation. So, without losing much generality, I will only discuss centered BVDs below.

The probability density function g2c(x,y) of a (centered) BVD and a respective random vector V = (X,Y)T (composed of two statistical variables X and Y) can be written as:

\[ g_{2c}(x, y) \,=\, {1 \over 2 \pi \, \sigma_x \, \sigma_y } {1 \over { \sqrt{\, 1\,-\, \rho^2}} } \operatorname{exp} \left( – {1\over2} {1 \over 1\,-\, \rho^2 } \left[ {x^2 \over \sigma_x^2} \,-\, 2 \rho \, {x \over \sigma_x} {y \over \sigma_y} \,+\, {y^2 \over \sigma_y^2} \, \right] \right) \,. \tag{1} \]

ρ is the so-called Pearson correlation coefficient of the statistical variables X and Y (see below). σx and σy are the standard deviations of the X- and Y-distributions. Using the following notations

\[ \pmb{V} = \begin{pmatrix} X \\ Y \end{pmatrix}, \quad \mbox{concrete values}: \, \pmb{v} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \pmb{\mu} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \quad \pmb{v}_{\mu} = \begin{pmatrix} x – \mu_x \\ y – \mu_y \end{pmatrix} \]

and inserting the symmetric matrix Σ-1

\[ \pmb{\Sigma}^{-1} \,=\, {1 \over \sigma_x^2\, \sigma_y^2\, \left( 1\,-\, \rho^2\right) } \, \begin{pmatrix} \sigma_y^2 &-\rho\, \sigma_x\sigma_y \\ -\rho\, \sigma_x\sigma_y & \sigma_x^2 \end{pmatrix}, \tag{2} \]

we can rewrite the density of a centered BVD ( μ = 0) in the form

\[ \mbox{centered CCS :} \quad g_{2c}(x, y) \,=\, {1 \over 2 \pi \, \sigma_x \, \sigma_y } {1 \over { \sqrt{\, 1\,-\, \rho^2}} } \operatorname{exp} \left( – {1\over2} \, {\pmb{v}}^{\operatorname{T}} \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \, \right) \,. \tag{3} \]

For the sake of completeness, here is the formula for μ ≠ 0 :

\[ \mbox{general CCS :} \quad g_2(x, y) \,=\, {1 \over 2 \pi \, \sigma_x \, \sigma_y } {1 \over { \sqrt{\, 1\,-\, \rho^2}} } \operatorname{exp} \left( – {1\over2} \, {\pmb{v}_{\mu}}^{\operatorname{T}} \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}_{\mu}} \, \right) \,. \tag{4} \]

T symbolizes the transposition operation. The inverse Σ of Σ-1 is the variance-covariance matrix, with ρ coupling X and Y :

\[ \pmb{\Sigma} \,=\, \begin{pmatrix} \sigma_x^2 &\rho\, \sigma_x\sigma_y \\ \rho\, \sigma_x\sigma_y & \sigma_y^2 \end{pmatrix}, \quad \pmb{\Sigma} \bullet \pmb{\Sigma}^{-1} = \mathbf I_n \,. \tag{5} \]
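The relation Σ • Σ-1 = I can easily be verified numerically, e.g. in Python with NumPy. The following sketch (with hypothetical values for σx, σy and ρ, not taken from the post) builds Σ according to eq. (5) and its closed-form inverse according to eq. (2):

```python
import numpy as np

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6

# Covariance matrix Sigma, eq. (5)
Sigma = np.array([[sx**2,     rho*sx*sy],
                  [rho*sx*sy, sy**2    ]])

# Closed-form inverse, eq. (2)
Sigma_inv = (1.0 / (sx**2 * sy**2 * (1.0 - rho**2))) * np.array(
    [[ sy**2,      -rho*sx*sy],
     [-rho*sx*sy,   sx**2    ]])

# Sigma . Sigma^-1 must give the identity matrix
print(np.allclose(Sigma @ Sigma_inv, np.eye(2)))            # True
print(np.allclose(Sigma_inv, np.linalg.inv(Sigma)))         # True
```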

We require that Σ and Σ-1 are invertible matrices to avoid a discussion of so-called degenerate cases (where the ellipse collapses to a 1-dimensional object). For ρ one can show (see here) that it really is related to the covariance of the distribution:

\[ \rho \,=\, { \operatorname{cov} (X,Y) \over \sigma_x\,\sigma_y} \,. \tag{6} \]

Therefore, ρ indeed is a coefficient characterizing the correlation of the X and Y data.

Note that both matrices Σ and Σ-1 are symmetric and invertible. Being symmetric, each of them equals its own transpose.

A notation often used to describe a two-dimensional normal distribution is

\[ \pmb{V}_B \,=\, \mathcal{N} \left[ \pmb{\mu},\, \pmb{\Sigma} \right] \,=\, \mathcal{N} \left[ \begin{pmatrix} \mu_x \\ \mu_y\end{pmatrix}, \, \begin{pmatrix} \sigma_x^2 &\rho\, \sigma_x\sigma_y \\ \rho\, \sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} \right] \,. \tag{7} \]

Note that the exponent of g2c(x,y) depends only on the inverse Σ-1 of Σ.

Mahalanobis distance

The scalar dm, whose square appears (up to a factor -1/2) in the exponent of a BVD’s pdf,

\[ \begin{align} \left(d_m\right)^2 \,:=\, {\pmb{v}}^T \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \,, \\[10pt] d_m \,:=\, \sqrt{ \, {\pmb{v}}^{T^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} } \,. \tag{8} \end{align} \]

is also called the Mahalanobis distance. It reminds us strongly of a norm.

Actually, it measures a kind of “distance” of a point (= endpoint of a vector) in the BVD-distribution from its center. It is clear that setting dm = const. means g2c(x,y) = const., and thus defines the data of a contour line of constant probability density.
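As a minimal numerical illustration (with hypothetical parameter values; the helper names mahalanobis and g2c below are my own, not from a library), one can check that two vectors with the same Mahalanobis distance indeed have the same probability density:

```python
import numpy as np

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6
Sigma = np.array([[sx**2, rho*sx*sy],
                  [rho*sx*sy, sy**2]])
Sigma_inv = np.linalg.inv(Sigma)

def mahalanobis(v, Sigma_inv):
    """d_m = sqrt(v^T . Sigma^-1 . v), eq. (8), for a centered BVD."""
    v = np.asarray(v, dtype=float)
    return np.sqrt(v @ Sigma_inv @ v)

def g2c(v, Sigma, Sigma_inv):
    """Centered BVD density, eq. (3); 2 pi sx sy sqrt(1-rho^2) = 2 pi sqrt(det Sigma)."""
    return np.exp(-0.5 * (v @ Sigma_inv @ v)) / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))

v1 = np.array([1.0, 0.5])
d1 = mahalanobis(v1, Sigma_inv)

# A second point with the same Mahalanobis distance: map a z-vector of
# length d1 with a matrix M fulfilling M . M^T = Sigma (cf. eq. (14) below)
M = np.linalg.cholesky(Sigma)
v2 = M @ np.array([0.0, d1])

print(np.isclose(mahalanobis(v2, Sigma_inv), d1))                        # True
print(np.isclose(g2c(v1, Sigma, Sigma_inv), g2c(v2, Sigma, Sigma_inv)))  # True
```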

I have shown in detail elsewhere that an equation

\[ \left(d_m\right) ^2 \,:=\, {\pmb{v}}^T \bullet \, \pmb{\Sigma}^{-1} \bullet {\pmb{v}} \, = \, const. \tag{9} \]

(for a symmetric and invertible matrix Σ-1) gives us vectors with end-points on the border of a centered ellipse. (dm)2 = const. defines a so-called quadratic form for conic sections – in our case for an ellipse. See the posts quoted above.

Important note: In the quoted posts the ellipse’s half-axes were unfortunately named σ1 and σ2. They must not be confused with the standard deviations named σx/y used in the present post.

Note: When I refer to ellipses below, I always mean its border line and not its area contents – if not stated otherwise.

Ellipses rotated against the coordinate axes

In the posts named above you will find how the coefficients of a general symmetric and real-valued (2×2)-matrix Aq in the following equation for 2-dimensional vectors v

\[ {\pmb{v}}^T \bullet \, \operatorname{\pmb{A}}_q \bullet {\pmb{v}} \, = \, 1 \,, \tag{10} \]
\[ \pmb{\operatorname{A}}_q \:=\: \begin{pmatrix} \alpha & \beta / 2 \\ \beta / 2 & \gamma \end{pmatrix} \tag{11} \]

are related to properties of a corresponding ellipse. Regarding the vector components of v, Aq (with constant elements) defines a general quadratic form – characteristic of an ellipse. Note that in the sense of the BVD discussion above and interpreting Σ-1 as an example for a matrix Aq, eq. (10) corresponds to

\[ {\pmb{v}}^T \bullet \, \operatorname{\pmb{A}}_q \bullet {\pmb{v}} \, = \, d_m^2 \,=\, 1 \,. \tag{12} \]

We first focus on this special value of dm and give a generalization later on.

In the posts quoted above, I have in particular shown why and how the directions of the main axes of the ellipse are given by the directions of the eigenvectors of the coupling matrix Aq (here of Σ-1). The lengths of the half-axes are given by the reciprocals of the square roots of the eigenvalues. See below.

Geometric interpretation

In the posts on ellipses mentioned above, I have shown that a general ellipse defined by Aq can be created from

  • scaling the orthogonal half-axes of a unit circle (centered in a CCS) along the coordinate axes
  • and afterward rotating the resulting figure (and its constituting position vectors).

How can we relate this to Σ-1 ?

Well, in a previous post I have shown that a centered BVD can be created by applying a linear transformation (in the form of a matrix M) to a centered random vector Z = (Z1, Z2)T . The components Z1, Z2 are random variables representing two independent standardized Gaussian distributions. In a CCS, concrete vectors z of Z (and their end-points) are distributed according to the probability density function

\[ g_{\small Z}(\pmb{z}, \pmb{0}, \pmb{\operatorname{I}} ) \, = \, {1 \over (2\, \pi) } \, {\large e}^{ – \, {\Large 1 \over \Large 2} \left( {\Large \pmb{z}^T} \, \bullet \, {\Large \pmb{z}} \right) } \, = \, {1 \over (2\, \pi) } \, {\large e}^{ – \, {\Large 1 \over \Large 2} \left( {\Large \pmb{z}^T} \,\bullet\, {\Large \operatorname{\pmb{I}}} \,\bullet\, {\Large \pmb{z}} \right) } \,. \tag{13} \]

The symmetric matrices Σ and Σ-1 of our BVD incorporate such a linear transformation matrix M in the sense of

\[ \operatorname{\pmb{\Sigma}} \:=\: \operatorname{\pmb{M}} \bullet \, \operatorname{\pmb{M}}^{\operatorname{T}} \,. \tag{14} \]

M maps Z to a distribution V whose vectors follow the pdf of a centered BVD:

\[ \begin{align} \pmb{Z} \, &=\, \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} , \quad \pmb{V} \, =\, \begin{pmatrix} X \\ Y \end{pmatrix}, \\[10pt] \pmb{V} \, &= \, \operatorname{\pmb{M}} \bullet \, \pmb{Z}, \, \quad \pmb{Z} \,=\, \operatorname{\pmb{M}}^{-1} \bullet \pmb{V} \,, \quad \left| \operatorname{\pmb{M}} \right| \, \ne \, 0 \,, \\[10pt] g_v(\pmb{v}) \:&=\: g_{2c} (x, y) \,. \end{align} \tag{15}\]
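A short sketch of this mapping (with hypothetical parameter values; the Cholesky factor used below is just one possible choice for a matrix M fulfilling eq. (14)):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6
Sigma = np.array([[sx**2, rho*sx*sy],
                  [rho*sx*sy, sy**2]])

# One possible choice for M with M . M^T = Sigma, eq. (14): the Cholesky factor
M = np.linalg.cholesky(Sigma)

# Draw independent standard-normal components Z1, Z2 and map them: v = M . z, eq. (15)
Z = rng.standard_normal((2, 100_000))
V = M @ Z

# The sample covariance matrix of the mapped vectors approximates Sigma
print(np.round(np.cov(V), 2))
```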

So, we map vectors linearly and thereby also their probability density function. Note that, following the line of argumentation for the linear transformation in my last post, end-points of z-vectors of unit length are obviously mapped to end-points of vectors v with dm = 1.

\[ \begin{align} \pmb{z} \:\: & \rightarrow \:\: \pmb{v} \\[10pt] \sqrt{\,\pmb{z}^{\operatorname{T}^{\phantom{A}}} \bullet\, \pmb{z} \,} \:=\: 1 \:\: &\rightarrow \:\: d_m\left(\pmb{v}\right) \:=\: 1 \,. \end{align} \]

A unit circle is mapped to an ellipse with dm = 1. In general and for scaled circles given by z-vectors, we have :

\[ \begin{align} \operatorname{\pmb{M}} \,: \,\quad \quad \pmb{z} \quad & \rightarrow \quad \pmb{v} \,=\, \operatorname{\pmb{M}} \bullet \, \pmb{z} \, \\[10pt] \Rightarrow \quad \|\pmb{z}\|^2 \,=\, \pmb{z}^{\operatorname{T}^{\phantom{A}}} \bullet\, \pmb{z} \:=\: r^2 \, \quad &\rightarrow \quad d_m^2(\pmb{v}) \:=\: {\pmb{v}}^{\operatorname{T}^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet \pmb{v} \:=\: r^2 \,. \tag{16} \end{align}\]

But, how do we understand the scaling and rotation occurring in the course of this mapping?

Eigendecomposition of Σ as a key for a geometric understanding of BVDs

For a symmetric and invertible matrix, as Aq or Σ-1, we can always find a decomposition which can be interpreted as a combination of a stretching or scaling operation and a subsequent rotation. This corresponds to the facts

  • that a symmetric invertible matrix Aq has real eigenvalues and perpendicular eigenvectors,
  • that a symmetric invertible matrix such as Aq or Σ-1 has an eigendecomposition of the form Aq = Q • Λ • QT, with Q being an orthogonal matrix (e.g. a rotation matrix parameterized by an angle) fulfilling QT = Q-1. The columns of Q are given by the eigenvectors of Aq. Λ, instead, is a diagonal matrix with the eigenvalues λu and λd of Aq on its diagonal. Note that a transformation by orthogonal matrices does not change the eigenvalues of a matrix.
\[ \pmb{\operatorname{A}_q} \:=\: \begin{pmatrix} \alpha & {1 \over 2}\, \beta \\ {1 \over 2}\, \beta & \gamma \end{pmatrix} \:=\: \pmb{\operatorname{Q}} \, \circ \, \begin{pmatrix} \lambda_u & 0 \\ 0 & \lambda_d \end{pmatrix} \, \circ \, \pmb{\operatorname{Q}}^{\operatorname{T}} \,. \tag{17} \]

The eigenvalues directly correspond to the two perpendicular main half-axes rx and ry of the (rotated) ellipse:

\[ \pmb{\operatorname{A}_q} \:=\: \pmb{\operatorname{Q}} \, \circ \, \begin{pmatrix} {1 / r_x^2} & 0 \\ 0 & {1 / r_y^2} \end{pmatrix} \, \circ \, \pmb{\operatorname{Q}}^{\operatorname{T}} \,. \tag{18} \]

See the articles on ellipses and matrices quoted above for more information.

This means that a general symmetric matrix and a relation like eq. (9) define a centered ellipse, but with its main axes rotated against the coordinate axes of the CCS. The lengths of the ellipse’s half-axes are given by the eigenvalues as

\[ \begin{align} r_x \:&=\: {1 \over \sqrt{\lambda_u}} \,, \\[10pt] r_y \:&=\: {1 \over \sqrt{\lambda_d}} \,. \end{align} \tag{19} \]

The directions of the half-axes of the ellipse are given by the perpendicular eigenvectors. These eigenvectors depend only on the parameters of Aq.
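Numerically, the eigendecomposition of eq. (17) and the half-axes of eq. (19) can be obtained directly, e.g. with NumPy's eigh for symmetric matrices (again with hypothetical parameter values):

```python
import numpy as np

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6
Sigma = np.array([[sx**2, rho*sx*sy],
                  [rho*sx*sy, sy**2]])
Sigma_inv = np.linalg.inv(Sigma)

# Eigendecomposition A_q = Q Lambda Q^T, eq. (17); eigh returns ascending eigenvalues
lam, Q = np.linalg.eigh(Sigma_inv)
print(np.allclose(Q @ np.diag(lam) @ Q.T, Sigma_inv))   # True

# Half-axis lengths for d_m = 1, eq. (19): smaller eigenvalue -> longer half-axis
half_axes = 1.0 / np.sqrt(lam)
print(half_axes[0] > half_axes[1])                      # True
```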

Identifying Σ-1 with the central matrix of a quadratic form for a contour ellipse

To get information about a BVD’s contour lines, we now identify the general matrix Aq for an ellipse as the BVD’s inverse covariance matrix Σ-1 :

\[ \begin{align} \pmb{A}_q \:&=\: \begin{pmatrix} \alpha & {1 \over 2}\, \beta \\ {1 \over 2}\, \beta & \gamma \end{pmatrix} \\[10pt] =\: \pmb{\Sigma}^{-1} \:&=\: {1 \over\left( 1\,-\, \rho^2\right) } \, \begin{pmatrix} {1 \,/\, \sigma_x^2} & -\,\rho \,/\, \left( \sigma_x \, \sigma_y\right) \\ -\,\rho \,/\, \left( \sigma_x \, \sigma_y \right) & {1 \,/\, \sigma_y^2} \end{pmatrix} \,. \end{align} \tag{20} \]

Its inverse Σ is a bit simpler (see above and here):

\[ \begin{align} \pmb{A}_q^{-1} \:&=\: \begin{pmatrix} a & {1\over 2 }\, b \\ {1\over 2 }\, b & c \end{pmatrix} \\[10pt] \:=\: \pmb{\Sigma} \:&=\: \begin{pmatrix} \sigma_x^2 &\rho\, \sigma_x\sigma_y \\ \rho\, \sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} \,. \end{align} \tag{21} \]

Note: The eigenvalues λ1 and λ2 of Σ-1 are the reciprocals of the eigenvalues ξ1 and ξ2 of Σ.

\[ \begin{align} \operatorname{\pmb{A}}_q \,=\, \operatorname{\pmb{\Sigma}}^{-1} \:&\Rightarrow \: \mbox{eigenvalues} \: : \quad \,\, \lambda_1, \quad \quad \quad \lambda_2 \,, \\[10pt] \operatorname{\pmb{A}}_q^{-1} \,=\, \operatorname{\pmb{\Sigma}} \:&\Rightarrow \: \mbox{eigenvalues} \: : \quad \xi_1 = {1 \over \lambda_1} , \:\: \xi_2 = {1 \over \lambda_2} \,. \end{align} \tag{22} \]

Eigenvalues of Σ, Σ-1 and the lengths of the ellipse’s half-axes

In another post of this blog

I have already derived formulas for the eigenvalues of symmetric, real-valued matrices. The formulas there give us the eigenvalues of Σ-1 in terms of

\[ \begin{align} \alpha \,& =\, {1 \over (1 \,-\, \rho^2 ) } \, {1\over \sigma_x^2} \,, \\[10pt] \beta \,& =\, -\, {2\, \rho \over (1 \,-\, \rho^2 ) } \, {1 \over \sigma_x\, \sigma_y } \,, \\[10pt] \gamma \,&= \, {1 \over (1 \,-\, \rho^2 )} \, {1\over \sigma_y^2} \,, \end{align} \tag{23} \]

as

\[ \begin{align} \lambda_1 \:&=\: {1 \over a_L^2} \:=\: {1\over 2} \, \left[ \left(\, \alpha + \gamma \right) \, – \, \left[\, \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \, \right]^{1/2} \,\right] \,, \\[10pt] \lambda_2 \:&=\: {1 \over a_S^2} \:=\: {1\over 2} \, \left[ \left(\, \alpha + \gamma \right) \, + \, \left[\, \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \, \right]^{1/2} \,\right] \,. \end{align} \tag{24} \]

aL and aS are the longer and the shorter of the half-axes, respectively. We have to check

  1. whether aL = rx , aS = ry    <=>   rx > ry    <=>   λ1 = λu , λ2 = λd   <=>   σx > σy
  2. or            aS = rx , aL = ry   <=>   rx < ry   <=>   λ1 = λd , λ2 = λu   <=>   σx < σy .

Note:

  • λ1 always refers to the longer half-axis.
  • One can show that which of the two cases applies is determined by whether σx > σy or σx < σy.
  • The half-axes aL and aS depend on all parameters ρ, σx and σy.

Unfortunately, the given expressions cannot be made much simpler. For Machine Learning this obstacle does not matter much, as for most practical purposes we would use numerical values anyway.
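Indeed, a few lines of NumPy suffice to evaluate eqs. (23) and (24). The sketch below (hypothetical parameter values) also cross-checks the closed-form eigenvalues against a numerical eigensolver:

```python
import numpy as np

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6
f = 1.0 / (1.0 - rho**2)

# Coefficients of Sigma^-1, eq. (23); note that beta carries the factor f as well
alpha = f / sx**2
gamma = f / sy**2
beta  = -2.0 * rho * f / (sx * sy)

# Closed-form eigenvalues, eq. (24)
root = np.sqrt(beta**2 + (alpha - gamma)**2)
lam1 = 0.5 * ((alpha + gamma) - root)    # belongs to the longer half-axis a_L
lam2 = 0.5 * ((alpha + gamma) + root)

# Cross-check with a numerical eigensolver (eigvalsh: ascending order)
Sigma = np.array([[sx**2, rho*sx*sy], [rho*sx*sy, sy**2]])
print(np.allclose([lam1, lam2], np.linalg.eigvalsh(np.linalg.inv(Sigma))))  # True
```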

The angle θ by which our ellipse is rotated against the x-coordinate axis of our CCS was found to be given by

\[ \sin \left(2\,\theta\right) \:=\: \,\mp\, { \beta \over \left[ \,\beta^2 \,+\, \left(\alpha \,-\,\gamma\right)^2\,\right]^{1/2} } \,. \tag{25} \]

The minus-sign holds for (σx > σy), the plus sign for (σx < σy).

There are still multiple solutions. By a more detailed analysis (or by trial and error 🙂 ) one can show that with

\[ \psi \:=\: {1 \over 2} \, \arcsin \left(\, { -\, \beta \over \left[ \,\beta^2 \,+\, \left(\alpha \,-\,\gamma\right)^2\,\right]^{1/2} } \right) \,, \tag{26} \]

we should choose the following options for specific conditions

\[ \begin{align} \sigma_x \, & \ge\, \sigma_y \, \,\land\, \rho \,\gt\, 0 \,\,: \quad \theta = \psi \,, \quad \mbox{with} \quad \psi \gt 0 \,, \\[10pt] \sigma_x \, & \ge\, \sigma_y \, \,\land\, \rho \,\lt\, 0 \,\,: \quad \theta = \psi \,, \quad \mbox{with} \quad \psi \lt 0 \,, \\[14pt] \sigma_x \, & \lt\, \sigma_y \,\land\, \rho \,\gt\, 0 \,\,: \quad \theta = {1 \over 2} \, \pi \,- \, \psi \,, \quad \mbox{with} \quad \psi \gt 0 \,, \\[10pt] \sigma_x \, & \lt\, \sigma_y \,\land\, \rho \,\lt \, 0 \,\,: \quad \theta = {1\over 2} \, \pi \,- \psi \,, \quad \mbox{with} \quad \psi \lt 0 \,. \end{align} \tag{27} \]
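A small sketch of the case rules in eqs. (26) and (27) (hypothetical parameter values with σx > σy and ρ > 0), cross-checked against the direction of the eigenvector belonging to the smaller eigenvalue, i.e. to the longer half-axis:

```python
import numpy as np

# Hypothetical example parameters with sigma_x > sigma_y and rho > 0
sx, sy, rho = 2.0, 1.0, 0.6
f = 1.0 / (1.0 - rho**2)
alpha, gamma = f / sx**2, f / sy**2
beta = -2.0 * rho * f / (sx * sy)

psi = 0.5 * np.arcsin(-beta / np.hypot(beta, alpha - gamma))   # eq. (26)
theta = psi if sx >= sy else 0.5*np.pi - psi                   # case rules, eq. (27)

# Cross-check: angle of the eigenvector of Sigma^-1 for the smaller eigenvalue
Sigma = np.array([[sx**2, rho*sx*sy], [rho*sx*sy, sy**2]])
lam, Q = np.linalg.eigh(np.linalg.inv(Sigma))   # eigenvalues in ascending order
theta_ev = np.arctan2(Q[1, 0], Q[0, 0]) % np.pi
print(np.isclose(theta % np.pi, theta_ev))      # True
```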

In an improved version of this post I will provide a thorough derivation. If you are interested, please visit this post again in a few days.

We now have found a way to construct an ellipse for a Mahalanobis distance dm = 1:

  1. We first build an ellipse whose half-axes ae = (1/λ1)1/2 and be = (1/λ2)1/2 are parallel to the CCS’s x-axis and y-axis, respectively.
  2. Then we rotate the ellipse by the angle θ in counter-clockwise direction. The rotated ae then encloses the angle θ with the x-axis of the CCS.

But, how to construct the ellipses for larger values of dm?

Other constant values of dm?

One can derive something analogous to eqs. (24) and (25) for any value of dm. The expressions in the exponents are quadratic in the vectors. Doubling a vector’s length to ||z|| = 2 gives a squared value ||z||2 = 4 in the pdfs’ exponents, but dm itself simply changes by the same factor to dm = 2. We found this already in eq. (16):

\[ \Rightarrow \quad \|\pmb{z}\|^2 \,=\, \pmb{z}^{\operatorname{T}^{\phantom{A}}} \bullet\, \pmb{z} \:=\: r^2 \, \quad \rightarrow \quad d_m^2(\pmb{v}) \:=\: {\pmb{v}}^{\operatorname{T}^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet \pmb{v} \:=\: r^2 \,. \tag{28} \]

Bringing the right-hand side back into a form with 1 as the target value corresponds to scaling the vector v by a factor 1/r. [Another point of view would be to regard Aq (= Σ-1) as changeable. Then all matrix coefficients, and also the eigenvalues, would be scaled by 1/r2.] We end up with scaling the half-axes of the new ellipse by a factor r. Original z-vectors on a circle with radius r get mapped to an ellipse with dm = r.

So, constructing a contour ellipse for vectors with dm(v) = r just requires starting with an axis-parallel ellipse whose half-axes are scaled by r

\[ \begin{align} &d_m(\pmb{v}) \:=\: \sqrt{\, {\pmb{v}}^{\operatorname{T}^{\phantom{A}}} \bullet \, \pmb{\Sigma}^{-1} \bullet \pmb{v} \,} \:=\: r \quad\Rightarrow \\[10pt] & r_1\:=\: r * {1 \over \sqrt{\lambda_1}} \,, \quad r_2 \:=\: r * {1 \over \sqrt{\lambda_2}} \,. \end{align} \tag{29} \]

Again, we have to determine how the assignment of the longer half-axis (r1 or r2) relates to the cases (σx > σy) or (σx < σy). We will deepen the understanding of these cases in forthcoming posts.
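The construction according to eq. (29) can be sketched as follows (hypothetical parameter values): build an axis-parallel ellipse with half-axes scaled by r, rotate it with the eigenvector matrix, and verify that all resulting points have dm = r:

```python
import numpy as np

# Hypothetical example parameters
sx, sy, rho = 2.0, 1.0, 0.6
Sigma = np.array([[sx**2, rho*sx*sy],
                  [rho*sx*sy, sy**2]])
Sigma_inv = np.linalg.inv(Sigma)

# Eigendecomposition: Sigma_inv = Q diag(lam) Q^T
lam, Q = np.linalg.eigh(Sigma_inv)

r = 2.0                                   # target Mahalanobis distance d_m = r
phi = np.linspace(0.0, 2.0*np.pi, 200)

# Axis-parallel ellipse with half-axes r/sqrt(lam_i), eq. (29), then rotation by Q
pts = Q @ np.vstack([(r / np.sqrt(lam[0])) * np.cos(phi),
                     (r / np.sqrt(lam[1])) * np.sin(phi)])

# Every point on the contour has Mahalanobis distance r
d = np.sqrt(np.sum(pts * (Sigma_inv @ pts), axis=0))
print(np.allclose(d, r))  # True
```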

Note:

  • The angle θ remains unchanged! Despite the scaling, it depends only on the coefficients of the (unchanged) matrix.
  • This also shows that contour ellipses for different constant values of the probability density of a BVD are concentric ellipses!

(If you had instead changed the matrix coefficients, the scaling factors would cancel each other in the fraction of eq. (25).) We will show with the help of numerical calculations and plots in the next post that this gives the correct results.

Special case: Standardized distributions σx = 1, σy = 1

Let us assume that someone has given us data of a (approximate) bivariate distribution in standardized form, such that the standard deviations of the marginals fulfill

\[ \sigma_x \,=\, 1, \quad \sigma_y = 1 \,. \]

Note that the standardization corresponds to a scaling of original x and y values to x’ = x/σx and y’ = y/σy. We assume that we also have determined the Pearson correlation coefficient ρ for our data. Then we get a rather interesting and helpful result:

\[ \begin{align} \lambda_1 \:&=\: {1 \over a_e^2} \:=\: {1\over 2} \, * {1 \over \left(\, 1 \,-\,\rho^2\,\right) } \, \left[ \, 2 \, – \, 2 * \rho \,\right] \:=\: {1 \over 1 \,+\, \rho} \,, \tag{30} \\[10pt] \lambda_2 \:&=\: {1 \over b_e^2} \:=\: {1\over 2} \, * {1 \over \left(\, 1 \,-\,\rho^2\,\right) } \, \left[ \, 2 \,+\, 2*\rho \,\right] \:=\: {1 \over 1 \,-\, \rho} \quad \Rightarrow \tag{31}\\[12pt] a_e \:&=\: \sqrt{ \, 1 \,+\, \rho \, } \,, \quad b_e \:=\: \sqrt{ \, 1 \,-\, \rho \, } \,, \quad \theta \:=\: \pi / 4\,. \tag{32} \end{align} \]
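A quick numerical check of eqs. (30) to (32) for a hypothetical ρ > 0 (for ρ < 0 the roles of the two half-axes simply swap):

```python
import numpy as np

rho = 0.6                                   # hypothetical correlation, rho > 0
# Standardized case: sigma_x = sigma_y = 1
Sigma_inv = np.linalg.inv(np.array([[1.0, rho],
                                    [rho, 1.0]]))

lam = np.linalg.eigvalsh(Sigma_inv)         # ascending: 1/(1+rho), 1/(1-rho)
half_axes = 1.0 / np.sqrt(lam)              # a_e = sqrt(1+rho), b_e = sqrt(1-rho)

print(np.allclose(lam, [1.0/(1.0 + rho), 1.0/(1.0 - rho)]))               # True
print(np.allclose(half_axes, [np.sqrt(1.0 + rho), np.sqrt(1.0 - rho)]))   # True
```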

This, actually, gives rise to another method for the construction of contour ellipses, which I will describe in one of the next posts.

Conclusion

In this post we have seen that constant values of the Mahalanobis distance dm of a BVD define concentric ellipses, which at the same time are contour lines of the BVD’s probability density.

Such contour ellipses can be constructed by calculating their perpendicular half-axes as reciprocals of the square roots of the eigenvalues of the inverse covariance matrix and by determining the inclination angle θ of the half-axes against the axes of the Cartesian coordinate system we work with. θ is also given by elements of the BVD’s covariance matrix.

In the next posts I will first discuss a simple method to parameterize contour ellipses of a BVD by the Mahalanobis distance and an angle 0 ≤ φ ≤ 2π. In addition we will find yet another way to construct contour ellipses.