Properties of BVD confidence ellipses – II – dependency of half-axes on the correlation coefficient

If you have read my last post on confidence ellipses, you may have tried to derive the result for the longer half-axis at maximum correlation via an eigenvalue analysis of the (inverse) covariance matrix of a Bivariate Normal Distribution [BVD]. If you succeeded, you can skip this post. If not, its contents may be interesting for you. It is just a short and pleasant math exercise.

Inverse covariance matrix and eigenvalues

From previous posts we know that the variance-covariance matrix and its inverse of a BVD are given by

\[ \begin{align} &\pmb{\Sigma} \,=\, \begin{pmatrix} \sigma_x^2 &\rho\, \sigma_x\sigma_y \\ \rho\, \sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} \,, \tag{1} \\[10pt] &\pmb{\Sigma}^{-1} \,=\, {1 \over \sigma_x^2\, \sigma_y^2\, \left( 1\,-\, \rho^2\right) } \, \begin{pmatrix} \sigma_y^2 &-\rho\, \sigma_x\sigma_y \\ -\rho\, \sigma_x\sigma_y & \sigma_x^2 \end{pmatrix} \\[10pt] &\quad \quad =\, \begin{pmatrix} \alpha & \beta/2 \\ \beta/2 & \gamma \end{pmatrix} \,, \tag{2} \\[10pt] & \pmb{\Sigma} \bullet \pmb{\Sigma}^{-1} \,=\, \mathbf I_n \,. \tag{3} \end{align} \]

σx and σy are the standard deviations of the BVD’s marginal distributions along the x- and y-axes. ρ is the Pearson correlation coefficient. See [1] for details and more information.

In [2] we have seen that the eigenvalues λ1 and λ2 of Σ-1 are given by:

\[ \begin{align} \lambda_1 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,-\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,, \tag{4} \\[10pt] \lambda_2 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,+\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,.\tag{5} \end{align} \]

They give us the half-axes of a contour ellipse for the Mahalanobis distance dm = 1 as

\[ \begin{align} h_1 \:=\: {1 \over \sqrt{\lambda_1^{\phantom{1}} } } \,, \quad h_2 \:=\: {1 \over \sqrt{\lambda_2^{\phantom{1}} } } \,. \tag{6} \end{align} \]
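As a quick numerical cross-check of eqs. (2) and (4) to (6), here is a minimal Python sketch. The function names and the sample values are illustrative assumptions of mine, not taken from the earlier posts. The sketch builds α, β, γ from σx, σy and ρ, evaluates the eigenvalue formulas, and tests them against the characteristic polynomial of Σ-1, which must vanish at each eigenvalue.

```python
import math

def inverse_cov_elements(sigma_x, sigma_y, rho):
    """Elements alpha, beta, gamma of the inverse covariance matrix, eq. (2)."""
    det = sigma_x**2 * sigma_y**2 * (1.0 - rho**2)   # determinant of Sigma
    alpha = sigma_y**2 / det
    beta = -2.0 * rho * sigma_x * sigma_y / det      # off-diagonal element is beta/2
    gamma = sigma_x**2 / det
    return alpha, beta, gamma

def eigenvalues(alpha, beta, gamma):
    """Eigenvalues lambda_1 <= lambda_2 according to eqs. (4) and (5)."""
    root = math.sqrt(beta**2 + (gamma - alpha)**2)
    return 0.5 * (alpha + gamma - root), 0.5 * (alpha + gamma + root)

def half_axes(sigma_x, sigma_y, rho):
    """Half-axes h_1 >= h_2 of the contour ellipse for d_m = 1, eq. (6)."""
    lam1, lam2 = eigenvalues(*inverse_cov_elements(sigma_x, sigma_y, rho))
    return 1.0 / math.sqrt(lam1), 1.0 / math.sqrt(lam2)

def char_poly(lam, alpha, beta, gamma):
    """Characteristic polynomial det(Sigma^-1 - lam * I); zero at an eigenvalue."""
    return (alpha - lam) * (gamma - lam) - (beta / 2.0)**2

# Hypothetical sample values, chosen only for illustration
a, b, g = inverse_cov_elements(2.0, 1.0, 0.5)
for lam in eigenvalues(a, b, g):
    print(lam, char_poly(lam, a, b, g))   # second number should be close to 0
print(half_axes(2.0, 1.0, 0.0))           # uncorrelated case: half-axes equal the sigmas
```

For ρ = 0 the half-axes reduce to the two standard deviations, which is a handy sanity check before looking at the correlated case.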

Dependency of the eigenvalues on the correlation coefficient

We use the matrix elements according to eq. (2):

\[ \begin{align} \alpha \,&= \, {1 \over 1 \,-\, \rho^2} \, { 1 \over \sigma_x^2} \,, \\[10pt] \beta \,& =\, -\, 2\, {\rho \over 1 \,-\, \rho^2} \, {1 \over \sigma_x \, \sigma_y} \,, \\[10pt] \gamma \,&=\, {1 \over 1 \,-\, \rho^2} \, { 1 \over \sigma_y^2} \,. \end{align} \]

So, e.g. λ1 becomes:

\[ \lambda_1 \,=\, {1\over2} \, \left[ {1 \over 1 \,-\, \rho^2} \, \, \left( {1 \over \sigma_x^2} \, +\, {1\over \sigma_y^2} \right)\,-\, \left[ \, 4 {\rho^2 \over \left(1 \,-\, \rho^2 \right)^2 } \, {1 \over \sigma_x^2 \, \sigma_y^2} \,+\, {1 \over \left(1 \,-\, \rho^2 \right)^2 } \,\left( {1 \over \sigma_x^2} \,-\, {1 \over \sigma_y^2} \right)^2 \, \right]^{1/2} \, \right] \,, \]
\[ \lambda_1 \,=\, {1\over2} \, {1 \over 1 \,-\, \rho^2} \, \left[ \left( {1 \over \sigma_x^2} \, +\, {1\over \sigma_y^2} \right)\,-\, \left[ \left( {1 \over \sigma_x^2} \, +\, {1\over \sigma_y^2} \right)^2 \,-\, {4 \over \sigma_x^2\, \sigma_y^2} \, \left( 1 \,-\, \rho^2\right) \, \right]^{1/2} \, \right] \,, \]
\[ \lambda_1 \,=\, {1\over2} \, {1 \over 1 \,-\, \rho^2} \, {\sigma_x^2 \,+\, \sigma_y^2 \over \sigma_x^2 \, \sigma_y^2 } \, \left[ 1 \,-\, \left[ 1 \,-\, 4\, {\sigma_x^2 \, \sigma_y^2 \over \left( \sigma_x^2 \,+\, \sigma_y^2\right)^2} \, \left(1 \,-\, \rho^2 \right) \, \right]^{1/2} \, \right] \,. \]
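The algebra above is easy to get wrong, so a numerical comparison may be reassuring. The following sketch (function names and sample values are my own assumptions) evaluates λ1 once via eq. (4) with the matrix elements of eq. (2), and once via the last of the three expressions for λ1 above. Both should agree up to rounding errors.

```python
import math

def lambda1_eq4(sigma_x, sigma_y, rho):
    """lambda_1 from eq. (4), with alpha, beta, gamma taken from eq. (2)."""
    det = sigma_x**2 * sigma_y**2 * (1.0 - rho**2)
    alpha, gamma = sigma_y**2 / det, sigma_x**2 / det
    beta = -2.0 * rho * sigma_x * sigma_y / det
    return 0.5 * (alpha + gamma - math.sqrt(beta**2 + (gamma - alpha)**2))

def lambda1_closed_form(sigma_x, sigma_y, rho):
    """lambda_1 as an explicit function of rho (last expression above)."""
    s = sigma_x**2 + sigma_y**2          # sum of the variances
    p = sigma_x**2 * sigma_y**2          # product of the variances
    inner = 1.0 - 4.0 * p / s**2 * (1.0 - rho**2)
    inner = max(inner, 0.0)              # guard against tiny negative rounding errors
    return 0.5 / (1.0 - rho**2) * s / p * (1.0 - math.sqrt(inner))

# Hypothetical sample values, chosen only for illustration
for rho in (-0.8, 0.0, 0.3, 0.9):
    print(rho, lambda1_eq4(2.0, 1.0, rho), lambda1_closed_form(2.0, 1.0, rho))
```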

This is a rather complicated dependency of the longer half-axis of the ellipse on the correlation coefficient and the standard deviations of the marginals. But do not forget that it summarizes the combined effects of correlation, stretching and rotation.

Limit for maximum correlation

For the transition ρ → 1, we have a little obstacle to overcome: the denominator (1 − ρ²) goes to zero. But a first-order Taylor expansion of the square root, √(1 − x) ≈ 1 − x/2 for small x, helps:

\[ \begin{align} \lambda_1 \,&\approx\, {1\over2} \, {1 \over 1 \,-\, \rho^2} \, {\sigma_x^2 \,+\, \sigma_y^2 \over \sigma_x^2 \, \sigma_y^2 } \, \left[ 1 \,-\, \left[ 1 \,-\, 2\, {\sigma_x^2 \, \sigma_y^2 \over \left( \sigma_x^2 \,+\, \sigma_y^2\right)^2} \, \left(1 \,-\, \rho^2 \right) \, \right] \, \right] \\[10pt] &=\, {1 \over \sigma_x^2 \,+\, \sigma_y^2} \,. \end{align} \]

Meaning:

\[ \rho \, \rightarrow \, 1\,, \, \, d_m\,=\, 1 \,\,: \quad h_1 \,=\, \sqrt{\, \sigma_x^2 \,+\, \sigma_y^2 \, } \,. \]

We have confirmed our result from the last post! For dm ≠ 1 we already know that we can account for it by replacing the sigma-values σx and σy by (σx * dm) and (σy * dm):

\[ \rho \, \rightarrow \, 1\,: \quad h_1 \,=\, d_m * \sqrt{\, \sigma_x^2 \,+\, \sigma_y^2 \, } \,. \]
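The limit can also be watched numerically. The sketch below is only an illustration under my own assumptions (the function name `h1`, the sample sigmas and the chosen dm are hypothetical). It computes the longer half-axis from eqs. (2), (4) and (6) for correlation coefficients approaching 1 and prints it next to the limiting value dm * √(σx² + σy²).

```python
import math

def h1(sigma_x, sigma_y, rho, d_m=1.0):
    """Longer half-axis for Mahalanobis distance d_m, via eqs. (2), (4), (6)."""
    sx, sy = d_m * sigma_x, d_m * sigma_y      # account for d_m by scaling the sigmas
    det = sx**2 * sy**2 * (1.0 - rho**2)
    alpha, gamma = sy**2 / det, sx**2 / det
    beta = -2.0 * rho * sx * sy / det
    lam1 = 0.5 * (alpha + gamma - math.sqrt(beta**2 + (gamma - alpha)**2))
    return 1.0 / math.sqrt(lam1)

# Hypothetical sample values, chosen only for illustration
sigma_x, sigma_y, d_m = 2.0, 1.0, 1.5
limit = d_m * math.sqrt(sigma_x**2 + sigma_y**2)
for rho in (0.9, 0.99, 0.999, 0.999999):
    print(rho, h1(sigma_x, sigma_y, rho, d_m), limit)
```

Note that eq. (4) subtracts two large, nearly equal numbers when ρ is very close to 1, so for values of ρ much closer to 1 than in this loop the printed digits should be treated with some care.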

Limit for second half-axis

Following the different sign through all the steps above (for dm = 1) leads us, to first order in (1 − ρ²), to:

\[ \lambda_2 \,\approx\, {1 \over 1 \,-\, \rho^2} \, {\sigma_x^2 \,+\, \sigma_y^2 \over \sigma_x^2 \, \sigma_y^2 } \, \left[ 1 \,-\, {\sigma_x^2 \, \sigma_y^2 \over \left( \sigma_x^2 \,+\, \sigma_y^2\right)^2} \, \left(1 \,-\, \rho^2 \right) \, \right] \,. \]

Since the prefactor 1/(1 − ρ²) diverges while the bracket tends to 1, this means for ρ → 1 and any finite dm:

\[ \rho \, \rightarrow \, 1\,\, : \quad \lambda_2 \,\rightarrow \, \infty \,, \quad h_2 \,\rightarrow\, 0 \,. \]
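How fast does h2 shrink? Setting the bracket in the expression for λ2 to 1 suggests the leading-order estimate h2 ≈ σx σy √(1 − ρ²) / √(σx² + σy²) for dm = 1. This rearrangement is my own reading of the formula above, and the sketch below (with hypothetical function names and sample values) merely compares it with the value obtained from eqs. (2), (5) and (6).

```python
import math

def h2(sigma_x, sigma_y, rho):
    """Shorter half-axis for d_m = 1, via eqs. (2), (5), (6)."""
    det = sigma_x**2 * sigma_y**2 * (1.0 - rho**2)
    alpha, gamma = sigma_y**2 / det, sigma_x**2 / det
    beta = -2.0 * rho * sigma_x * sigma_y / det
    lam2 = 0.5 * (alpha + gamma + math.sqrt(beta**2 + (gamma - alpha)**2))
    return 1.0 / math.sqrt(lam2)

def h2_leading_order(sigma_x, sigma_y, rho):
    """Leading-order estimate: bracket in the lambda_2 expression set to 1."""
    return (sigma_x * sigma_y * math.sqrt(1.0 - rho**2)
            / math.sqrt(sigma_x**2 + sigma_y**2))

# Hypothetical sample values, chosen only for illustration
for rho in (0.9, 0.99, 0.999, 0.999999):
    print(rho, h2(2.0, 1.0, rho), h2_leading_order(2.0, 1.0, rho))
```

Both columns should shrink towards zero together, roughly like √(1 − ρ²).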

This is exactly what we have seen in the plots of my last post.

Conclusion

In this post we have confirmed the somewhat peculiar results for the behavior of the half-axes of a BVD-confidence ellipse for maximum correlation. Centered in a Cartesian coordinate system, the ellipse gets narrower with growing correlation coefficient until it approaches a line segment between the corners of a centered rectangle with side-lengths 2*dm*σx in x-direction and 2*dm*σy in y-direction. The orientation depends on the sign of the correlation coefficient (see my last post).
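The statement about the rectangle corners can be probed with a few lines of Python. This is a sketch under my own assumptions: the function name and sample values are hypothetical, and the code works with the covariance matrix Σ of eq. (1) directly, using the fact that Σ and Σ-1 share their eigenvectors and that the longer half-axis equals dm times the square root of the largest eigenvalue of Σ. The endpoint of the longer half-axis should approach the corner (dm σx, dm σy) for ρ → 1 and (−dm σx, dm σy) for ρ → −1.

```python
import math

def major_axis_endpoint(sigma_x, sigma_y, rho, d_m=1.0):
    """Endpoint of the longer half-axis of the d_m contour ellipse (rho != 0).

    Uses the largest eigenvalue mu of the covariance matrix Sigma, eq. (1),
    and its eigenvector; the half-axis length is d_m * sqrt(mu).
    """
    a, c = sigma_x**2, sigma_y**2
    b = rho * sigma_x * sigma_y
    mu = 0.5 * ((a + c) + math.sqrt((a - c)**2 + 4.0 * b**2))
    vx, vy = b, mu - a                     # unnormalized eigenvector belonging to mu
    norm = math.hypot(vx, vy)              # nonzero as long as rho != 0
    length = d_m * math.sqrt(mu)
    return length * vx / norm, length * vy / norm

# Hypothetical sample values, chosen only for illustration
for rho in (0.9, 0.999999, -0.999999):
    print(rho, major_axis_endpoint(2.0, 1.0, rho, d_m=1.0))
```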