Ellipses are specific two-dimensional geometrical objects. They are of interest in many contexts, e.g. in physics, engineering and cryptography. However, they also appear in statistics, for example as elliptic contour lines of Bivariate Normal Distributions [BVDs] and as elliptic contours of the projections of Multivariate Normal Distributions [MVDs] onto coordinate planes. Approximate BVDs/MVDs in turn often appear as input data of Machine Learning [ML] algorithms, or sometimes also as result data in latent spaces of neural networks.

This post discusses basic mathematical aspects of ellipses. My main objective is to explain the relation between geometrical properties of an ellipse and the elements of two special matrices which describe a (rotated) ellipse in different ways. These matrices are related to so-called quadratic forms describing ellipses.

When we deal with data of statistical vector distributions, we regularly have to derive geometrical properties of contour and confidence ellipses from numerically calculated elements of a variance-covariance matrix. Another task is the calculation of the properties of an ellipse from the coefficients of its quadratic form. The results of this post can be helpful to solve such problems.

This post is a revised and extended version of thoughts and formulas which I have previously presented in a post of a sister blog on Linux topics. This post will show you how results, which you may find elsewhere, are derived from first principles. A special topic is the determination of the rotation angle of an ellipse from matrix elements. While other texts on the topic jump over related ambiguities, I have tried to present respective considerations as clearly as possible. In a second post I will give you numerical examples and show some plots. Other posts in this blog will apply the results to BVDs and MVDs.

We work in a 2-dimensional Cartesian coordinate system [CCS]. When I speak of vectors, I mean position-vectors from the origin of our chosen CCS to points somewhere in our 2-dimensional space. The components of such vectors have values that are equal to the (x,y)-coordinates of the points in our CCS. Required knowledge is some basic trigonometry and some Linear Algebra.

Introduction

Geometrically, we describe ellipses in terms of their two perpendicular principal axes, focal points, ellipticity and rotation angles with respect to a CCS. Another aspect is:

Ellipsoids/ellipses can in general be defined and/or created by matrices operating on position vectors. The elements of such matrices are related to coefficients of quadratic polynomial expressions which control the coordinates of points on an elliptic curve. Such quadratic forms in turn describe ellipses as a specific type of conic sections.

Ellipses are typically rotated against the axes of the chosen CCS. The rotation angle of an ellipse’s longest axis against the x-coordinate axis reflects a specific correlation of the x- and y-components of vectors reaching from the origin of the CCS to points on the ellipse.

Questions about ellipses

There are five questions which I want to cover in this and a forthcoming post:

What kind of matrices define 2-dimensional elliptic curves? How can we build such matrices from elementary vector operations – such as scaling/stretching and rotation? Vector operations applied to vectors of what, exactly?
How do the matrix elements define the coordinates of points on an ellipse?
How can one derive the lengths h₁, h₂ of the (perpendicular) principal axes of an ellipse from the elements of relevant matrices? In particular of symmetric matrices that correspond to quadratic forms?
How do the eigen-decomposition or the Cholesky-decomposition of such a symmetric matrix fit into the picture?
By which formula do the matrix elements provide the inclination angle of the ellipse’s primary axes? What about ambiguities coming from multiple solutions of trigonometric functions and/or from certain degrees of freedom when creating the ellipse?

Three defining matrices for centered ellipses

In this post we only regard ellipses whose centers coincide with the origin of a chosen CCS. We thereby get rid of boring constant terms in the equations we have to solve. We do not lose much generality by this step: Results for an off-center ellipse follow from applying a simple translation to the resulting vector data.

The longer principal axis of an ellipse may be inclined by some angle Φ against the x-axis of our CCS. If Φ = 0, we speak of an axis-parallel ellipse: The half-axes of such an ellipse are oriented in parallel to the coordinate axes of the CCS. The algebraic terms to describe such an ellipse with half-axes h₁ in x-direction and h₂ in y-direction should be familiar from school:

\[ {1 \over h_1^2} * x_E^2 \, + \, {1 \over h_2^2} * y_E^2 \:=\: 1 \,. \tag{E} \]

x_E and y_E are coordinates of points on the axis-parallel ellipse.

There are actually (at least) three important and different ways to define a centered and rotated ellipse by a matrix:

Alternative 1: We define the (rotated) ellipse by a matrix A_E which results from the (matrix) product of two simpler matrices: A_E = R_Φ • D_E:
D_E is a diagonal matrix and corresponds to a scaling operation applied to the components of vectors to points located on a centered unit circle. This leads to an axis-parallel ellipse. We will show that stretching a unit circle by D_E reproduces the standard definition formula (E) of an axis-parallel ellipse.
R_Φ describes a subsequent rotation by an angle Φ. A_E summarizes these geometrical operations in a compact form.
Alternative 2: We define a (rotated) ellipse by a matrix A_q which acts on a vector v such that we get a polynomial equation with quadratic terms in the vector components (v^T • A_q • v; see below). I.e., matrix A_q gives us a so-called quadratic form. Geometrically interpreted, a quadratic form describes an ellipse as a special case of a conic section. The coefficients of the polynomial and the elements of the matrix must, of course, fulfill some particular properties.
Alternative 3: From the inverse of A_q one can derive a matrix K_ch via a Cholesky decomposition. This matrix can also be applied to vectors on a unit circle to create an ellipse.

Note that Alternative 1 comes with an ambiguity besides the choice of an angle: The creator of an ellipse (and of a respective matrix A_E) has the freedom to decide which half-axis shall become the longer one. Formally, we associate h₁ with the half-axis of the original axis-parallel ellipse in x-direction, and h₂ with the half-axis of the original axis-parallel ellipse in y-direction. This gives rise to two situations:

\[ \begin{align} h_1 \:&\ge\: h_2 \,, \\[14pt] h_2 \:&\gt\: h_1\,. \end{align} \tag{C} \]

The differences will occupy us a bit in this text.

Major mathematical steps

While it is relatively simple to derive the matrix elements from known values of h₁, h₂ and Φ, it is a bit harder to derive an ellipse’s properties from the elements of the two defining matrices A_E and A_q. This, however, is our main objective. We will cover both matrices.

Our approach will comprise seven main steps, each of which means solving a specific mathematical task:

Step 1: We derive the dependency of the elements (a, b, c, d) of A_E on the geometrical properties of a corresponding ellipse: A_E = A_E (h₁, h₂, Φ) .
Step 2: We use A_E to create a unique quadratic form which defines the ellipse.
Step 3: We compare the coefficients derived in step 2 with the coefficients of a general quadratic equation mediated by a symmetric matrix A_q: v^T • A_q • v = 1.
We then establish relations of the A_q elements α, β, γ with h₁, h₂, Φ : α=α(h₁,h₂,Φ), β=β(h₁,h₂,Φ), γ=γ(h₁,h₂,Φ).
Step 4: We invert the results of step 3 to get relations of the form
h₁=h₁(α,β,γ), h₂=h₂(α,β,γ), Φ=Φ(α,β,γ) and
h₁=h₁(a,b,c,d), h₂=h₂(a,b,c,d), Φ=Φ(a,b,c,d).
Step 5: We establish relations of h₁, h₂, Φ with the eigenvalues λ₁, λ₂ and respective eigenvectors of A_q.
Step 6: We briefly discuss the creation of an ellipse by a matrix K_ch coming from the Cholesky decomposition of A_q‘s inverse matrix.
Step 7: We discuss various cases and conditions giving us different distinct formulas for the rotation angle Φ = Φ (λ₁, λ₂, α, β, γ) .

For many practical purposes steps 5 and 7 are of special interest.

Step 1: Matrix A_E for the creation of a centered and rotated ellipse – scaling of a unit circle followed by a rotation

Our starting point is a unit circle C whose center coincides with the origin of our CCS. The components of vectors v_c from the origin to points on the circle C fulfill the following conditions:

\[ \pmb{C} \::\: \left\{ \pmb{v}_c \:=\: \begin{pmatrix} x_c \\ y_c \end{pmatrix} \:=\: \begin{pmatrix} \operatorname{cos}(\theta) \\ \operatorname{sin}(\theta) \end{pmatrix}, \quad 0\,\leq\,\theta\, \le 2\pi \right\} \tag{1} \]

and

\[ x_c^2 \, +\, y_c^2 \,=\; 1 \,. \tag{2} \]

We define an ellipse E(h₁, h₂) (having half-axes h₁ and h₂) by the application of two linear operations to the vectors v_c:

\[ \pmb{E}_{h_1,h_2} \::\: \left\{ \, \pmb{v}_E \:=\: \begin{pmatrix} x_E \\ y_E \end{pmatrix} \:=\: \pmb{\operatorname{R}}_{\phi} \circ \pmb{\operatorname{D}}_E \circ \pmb{v}_c , \quad \pmb{v_c} \in \pmb{C} \, \right\} \,. \tag{3} \]

D_E is a diagonal matrix which describes a stretching of the circle along the CCS-axes, and R_Φ is an orthogonal rotation matrix. “○” symbolizes the standard matrix multiplication.

The stretching (or scaling) of the vector-components is done by matrix D_E :

\[ \pmb{\operatorname{D}}_E \:=\: \begin{pmatrix} h_1 & 0 \\ 0 & h_2 \end{pmatrix} \,, \tag{4} \]

\[ \pmb{\operatorname{D}}_E^{-1} \:=\: \begin{pmatrix} {1 / h_1} & 0 \\ 0 & {1 / h_2} \end{pmatrix} \,. \tag{5} \]

We will see in a minute that the application of D_E on vectors v_c indeed is equivalent to the standard definition (E) of a centered, axis-parallel ellipse. Therefore, the elements h₁, h₂ of D_E define the lengths of the principal axes of the yet un-rotated ellipse: h₁ is half of the ellipse’s diameter in x-direction, h₂ is half of the diameter in y-direction.

The subsequent rotation by an angle Φ against the x-axis of the CCS is done by a standard rotation matrix R_Φ:

\[ \pmb{\operatorname{R}}_{\phi} \:=\: \begin{pmatrix} \operatorname{cos}(\phi) & -\,\operatorname{sin}(\phi) \\ \operatorname{sin}(\phi) & \operatorname{cos}(\phi)\end{pmatrix} \:=\: \begin{pmatrix} u_1 & -\,u_2 \\ u_2 & u_1 \end{pmatrix} \,, \tag{6} \]

\[ \pmb{\operatorname{R}}_{\phi}^{\operatorname{T}} \:=\: \pmb{\operatorname{R}}_{\phi}^{-1} \:=\: \pmb{\operatorname{R}}_{-\,\phi} \,\,. \tag{7} \]

Combining both operations results in a matrix A_E with elements ((a, b), (c, d)):

\[ \pmb{\operatorname{A}}_E \:=\: \pmb{\operatorname{R}}_{\phi} \circ \pmb{\operatorname{D}}_E \:=\: \begin{pmatrix} a & b \\ c & d \end{pmatrix} \:=\: \begin{pmatrix} h_1\,u_1 & -\,h_2\,u_2 \\ h_1\,u_2 & h_2\,u_1 \end{pmatrix} \,\,. \tag{8} \]

Note:

\[ \pmb{\operatorname{A}}_E^{-1} \:=\: \begin{pmatrix} (1 / h_1) * u_1 & (1 / h_1) * u_2 \\ -\, (1 / h_2) * u_2 & (1 / h_2) * u_1 \end{pmatrix} \, , \tag{9} \]

\[ \pmb{v}_E \:=\: \begin{pmatrix} x_E \\ y_E \end{pmatrix} \:=\: \pmb{\operatorname{A}}_E \circ \begin{pmatrix} x_c \\ y_c \end{pmatrix} \,, \tag{10} \]

\[ \pmb{v}_c \:=\: \begin{pmatrix} x_c \\ y_c \end{pmatrix} \:=\: \pmb{\operatorname{A}}_E^{-1} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \,. \tag{11} \]

Remember:

\[ u_1 \,=\, \operatorname{cos}(\phi),\quad u_2 \,=\,\operatorname{sin}(\phi), \quad u_1^2 \,+\, u_2^2 \,=\, 1 \,. \tag{12} \]

We introduce new variables λ₁ and λ₂

\[ (1/\lambda_1) \,: =\, h_1^2, \quad\quad (1/\lambda_2) \,: =\, h_2^2 \tag{13} \]

and get

\[ \begin{align} a \,&=\, h_1\,u_1 \,=\, {1 \over \sqrt{\lambda_1}} \, u_1 , \quad b \,=\, -\, h_2\,u_2 \,=\, -\,{1 \over \sqrt{\lambda_2}} \, u_2 \,, \tag{14}\\[10pt] c \,&=\, h_1\,u_2 \,=\, {1 \over \sqrt{\lambda_1}} \, u_2 , \quad d \,=\, h_2\,u_1 \,=\, {1 \over \sqrt{\lambda_2}} \, u_1\,, \tag{15} \end{align} \]

The mathematical meanings of λ₁ and λ₂ will become clearer in a later section. The determinant of matrix A_E is given by

\[ \operatorname{det}\left( \pmb{\operatorname{A}}_E \right) \:=\: a\,d \,-\, b\,c \:=\: h_1\, h_2 \,. \tag{16} \]

h₁ and h₂ refer to the lengths of the principal axes of the ellipse. h₁ and h₂ have positive values by definition. Therefore:

\[ \operatorname{det}\left( \pmb{\operatorname{A}}_E \right) \:=\: a\,d \,-\, b\,c \:\gt\: 0 \,. \tag{17} \]

We have succeeded with step 1: We have defined an ellipse via an invertible matrix A_E (with a positive determinant), whose elements are directly based on geometrical properties.
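The construction of step 1 can be sketched in a few lines of plain Python. This is only an illustration: the function names `make_A_E` and `ellipse_points` and the numerical values for h₁, h₂ and Φ are my own assumptions and not part of the derivation.

```python
import math

def make_A_E(h1, h2, phi):
    """Return A_E = R_phi * D_E as ((a, b), (c, d)), following eq. (8)."""
    u1, u2 = math.cos(phi), math.sin(phi)
    return ((h1 * u1, -h2 * u2),
            (h1 * u2,  h2 * u1))

def ellipse_points(A_E, n=8):
    """Map n points of the unit circle to points of the ellipse, eq. (10)."""
    (a, b), (c, d) = A_E
    pts = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        xc, yc = math.cos(theta), math.sin(theta)
        pts.append((a * xc + b * yc, c * xc + d * yc))
    return pts

# Example values, chosen freely for illustration
h1, h2, phi = 3.0, 1.5, math.pi / 6
A_E = make_A_E(h1, h2, phi)
(a, b), (c, d) = A_E
det_A_E = a * d - b * c      # eq. (16): should equal h1 * h2
points = ellipse_points(A_E)
```

The first generated point (θ = 0) is the end point of the rotated half-axis h₁, and the determinant reproduces the product h₁ h₂ up to rounding errors.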

But as said: Often an ellipse is described by a quadratic form. As a next step we derive such a quadratic equation first for an axis-parallel ellipse. Afterward we move on to a general rotated ellipse. This will in turn define another very useful matrix A_q.

Quadratic form describing a centered, un-rotated, axis-parallel ellipse

Let us look at an ellipse which results from applying our scaling matrix D_E to our unit circle C. I.e., the rotation matrix is assumed to be just the identity matrix:

\[ \pmb{\operatorname{R}}_{\phi} \:=\: \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \,=\, \pmb{\operatorname{I}}, \quad u_1 \,=\, 1,\: u_2 \,=\, 0, \: \phi = 0 \,. \tag{18} \]

We need an expression in terms of (x_E, y_E). To get quadratic terms of vector components it helps to invoke a scalar product. The scalar product of a vector with itself gives us the squared length of a vector. In our case the norm of the inversely scaled vectors has to fulfill:

\[ ||v_c||^2 \,=\, \left[\, \pmb{\operatorname{D}}_E^{-1} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \, \right]^T \,\circ \, \left[\, \pmb{\operatorname{D}}_E^{-1} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \,\right] \:=\: 1 \,\,. \tag{19} \]

This directly results in:

\[ {1 \over h_1^2} * x_E^2 \, + \, {1 \over h_2^2} * y_E^2 \:=\: 1 \,. \tag{20} \]

This, obviously, is the standard definition (E) of an (axis-parallel) ellipse. We replace the denominators to get a convenient quadratic form:

\[ \lambda_1 * x_E^2 \,+\, \lambda_2 * y_E^2 \:=1 \,. \tag{21} \]

If someone had given us the quadratic form more generally with coefficients α, β and γ

\[ \alpha * x_E^2 \,+\, \beta * x_E \, y_E \,+\, \gamma * y_E^2 \,=\, 1 \,, \tag{22} \]

we could have directly related these coefficients with the geometrical properties of our axis-parallel ellipse:

\[ \begin{align} \alpha \,&=\, 1 / a^2 \,=\, 1 / h_1^2 \,=\, \lambda_1 \,, \tag{23}\\[10pt] \gamma \,&=\, 1 / d^2 \,=\, 1 / h_2^2 \,=\, \lambda_2 \,, \tag{24}\\[10pt] \beta \,&=\, b \,=\, c \,=\, 0 \,, \tag{25} \\[10pt] \phi &= 0 \,. \end{align} \]

A first success: We can derive h₁, h₂ and Φ from the coefficients of the (reduced) quadratic form. But an axis-parallel ellipse is a simple case. Things get more difficult for a general rotated ellipse. Then we have to care about the mixing term with β ≠ 0.
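For the axis-parallel case a quick numerical check may help. The following sketch (with freely assumed values for h₁ and h₂) stretches points of the unit circle and evaluates the quadratic form (21). It then goes the opposite way and recovers the half-axes from α and γ for β = 0.

```python
import math

h1, h2 = 3.0, 1.5                        # example half-axes (assumed values)
lam1, lam2 = 1.0 / h1**2, 1.0 / h2**2    # eq. (13)

# Stretch unit-circle points by D_E and evaluate the quadratic form (21)
residuals = []
for k in range(12):
    theta = 2.0 * math.pi * k / 12
    x_E, y_E = h1 * math.cos(theta), h2 * math.sin(theta)
    residuals.append(lam1 * x_E**2 + lam2 * y_E**2 - 1.0)

max_residual = max(abs(r) for r in residuals)

# Inverse direction, eqs. (23) and (24): with beta = 0 the half-axes follow directly
alpha, gamma = lam1, lam2
h1_rec, h2_rec = 1.0 / math.sqrt(alpha), 1.0 / math.sqrt(gamma)
```

The residuals vanish up to rounding errors, and the recovered half-axes agree with the values we started with.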

Step 2: Quadratic form defining a general centered and rotated ellipse

We turn to step 2, now. To get a quadratic polynomial for a rotated ellipse, we use the trick from the last section once again. We apply a full back-transformation [A_E]^-1 to vectors v_E of a general ellipse:

\[ \left[ \,\pmb{\operatorname{A}}_E^{-1} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \, \right]^T \, \circ \, \left[ \, \pmb{\operatorname{A}}_E^{-1} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \,\right] \:=\: 1 \,\,. \tag{26} \]

This leads to equations for the elements of matrix A_E :

\[ {1 \over h_1^2 \, h_2^2 } \, \left[ \, \left( c^2 \,+\, d^2 \right) * x_E^2 \,\, - \,\, 2\left( a\,c\, +\, b\,d \right)* x_E \, y_E \,\, + \,\, \left(a^2 \,+\, b^2\right) * y_E^2 \, \right] \,=\, 1 \, . \tag{27} \]

The rotation included in A_E has obviously led to a mixing of the vector components (x_E, y_E) in the polynomial: The coefficient for x_E y_E is different from zero in the general, non-trivial case.
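Eq. (27) can be tested numerically. The sketch below assumes example values for h₁, h₂ and Φ, creates points of the rotated ellipse via A_E and evaluates the left-hand side of eq. (27) for each of them. The helper name `lhs_eq27` is hypothetical.

```python
import math

h1, h2, phi = 3.0, 1.5, math.pi / 6                  # example values (assumed)
u1, u2 = math.cos(phi), math.sin(phi)
a, b, c, d = h1 * u1, -h2 * u2, h1 * u2, h2 * u1     # elements of A_E, eq. (8)

def lhs_eq27(x, y):
    """Left-hand side of eq. (27), written with the elements of A_E."""
    pref = 1.0 / (h1**2 * h2**2)
    return pref * ((c*c + d*d) * x*x
                   - 2.0 * (a*c + b*d) * x*y
                   + (a*a + b*b) * y*y)

values = []
for k in range(12):
    theta = 2.0 * math.pi * k / 12
    xc, yc = math.cos(theta), math.sin(theta)
    x_E, y_E = a * xc + b * yc, c * xc + d * yc      # eq. (10)
    values.append(lhs_eq27(x_E, y_E))

max_dev = max(abs(v - 1.0) for v in values)
# Coefficient of the mixed term x_E * y_E
mixing_coeff = -2.0 * (a*c + b*d) / (h1**2 * h2**2)
```

For a rotated ellipse with h₁ ≠ h₂ the mixing coefficient is different from zero, while all points fulfill the equation up to rounding errors.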

Step 3: Quadratic form of an ellipse: Definition by a symmetric, invertible (2×2)-matrix A_q

We rewrite our polynomial equation again with general coefficients α, β and γ:

\[ \alpha * x_E^2 \, + \, \beta * x_E \, y_E \, + \, \gamma * y_E^2 \,=\, 1 \,. \tag{28} \]

With the help of a symmetric (2×2)-matrix A_q, the quadratic polynomial can be reformulated:

\[ \pmb{v}_E^T \circ \pmb{\operatorname{A}}_q \circ \pmb{v}_E \:=\: 1 \,, \tag{29} \]

with

\[ \pmb{\operatorname{A}}_q \:=\: \begin{pmatrix} \alpha & \beta / 2 \\ \beta / 2 & \gamma \end{pmatrix} \,\,. \tag{30} \]

Of course, the elements of such a matrix must fulfill certain conditions to really define an ellipse. A comparison with eq. (27) shows

\[ \pmb{\operatorname{A}}_q \:=\: { 1 \over h_1^2 \, h_2^2} \, \begin{pmatrix} c^2 \,+\, d^2 & -\,a\,c \, -\, b\,d \\ -\,a\,c \, -\, b\,d & a^2 \,+\, b^2 \end{pmatrix} \, . \tag{31} \]

This in turn means

\[ \begin{align} \alpha \:&=\: { 1 \over h_1^2 \, h_2^2} \, \left(\, c^2 \,+\, d^2 \,\right) \:=\: \lambda_2 * u_2^2 \, + \, \lambda_1 * u_1^2 \,, \tag{32} \\[10pt] \gamma \:&=\: { 1 \over h_1^2 \, h_2^2} \, \left(\, a^2 \,+\, b^2 \,\right) \:=\: \lambda_2 * u_1^2 \, + \, \lambda_1 * u_2^2 \,, \tag{33} \\[10pt] \beta \:&=\: -\, 2\, { 1 \over h_1^2 \, h_2^2} \, \left(\,a \,c \,+\, b \, d \,\right) \:=\: -\, 2 \, \left( \lambda_2 \,-\, \lambda_1 \right)\, u_1 \, u_2 \,. \tag{34} \end{align} \]

With

\[ u_1 = \cos \phi \,, \quad u_2 = \sin \phi \,, \tag{35} \]

it follows that

\[ \begin{align} \alpha \:&=\: \lambda_2 * \sin^2 \phi \, + \, \lambda_1 * \cos^2 \phi \,, \tag{36} \\[10pt] \gamma \:&=\: \lambda_2 * \cos^2 \phi \, + \, \lambda_1 * \sin^2 \phi \,, \tag{37} \\[10pt] \beta \:&=\: -\, 2 \, \left( \lambda_2 \,-\, \lambda_1 \right)\, \cos \phi \, \sin \phi \,. \tag{38} \end{align} \]

The elements of matrix A_q are intimately related to the ellipse’s geometrical data, but in a somewhat convoluted way. We have to explore these relations in more detail.
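The relations (36) to (38) translate directly into code. The following sketch uses hypothetical helper names (`quadratic_coeffs`, `quad_form`) and freely chosen example values. It computes α, β, γ from h₁, h₂, Φ and checks that a point of the rotated ellipse fulfills v^T • A_q • v = 1.

```python
import math

def quadratic_coeffs(h1, h2, phi):
    """Return (alpha, beta, gamma) according to eqs. (36) to (38)."""
    lam1, lam2 = 1.0 / h1**2, 1.0 / h2**2
    s, c = math.sin(phi), math.cos(phi)
    alpha = lam2 * s*s + lam1 * c*c
    gamma = lam2 * c*c + lam1 * s*s
    beta = -2.0 * (lam2 - lam1) * c * s
    return alpha, beta, gamma

def quad_form(alpha, beta, gamma, x, y):
    """Evaluate v^T A_q v with A_q = ((alpha, beta/2), (beta/2, gamma))."""
    return alpha * x*x + beta * x*y + gamma * y*y

h1, h2, phi = 3.0, 1.5, math.pi / 6      # example values (assumed)
alpha, beta, gamma = quadratic_coeffs(h1, h2, phi)

# A point on the rotated ellipse: rotate (h1 cos t, h2 sin t) by phi
t = 0.7
xs, ys = h1 * math.cos(t), h2 * math.sin(t)
x_E = math.cos(phi) * xs - math.sin(phi) * ys
y_E = math.sin(phi) * xs + math.cos(phi) * ys
value = quad_form(alpha, beta, gamma, x_E, y_E)   # should be close to 1
```

For the chosen values one gets α = 7/36, γ = 13/36 and β = -√3/6, which can be confirmed by hand from eqs. (36) to (38).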

With the help of the elements of A_E we can also show that det(A_q) > 0:

\[ \operatorname{det}\left( \pmb{\operatorname{A}}_q \right) \:=\: \left(\,\alpha \, \gamma \,-\, {1\over 4}\, \beta^2 \, \right) \:=\: { 1 \over h_1^4 \, h_2^4} \, \left(\, b\,c \,-\, a\,d \, \right)^2 \:=\: { 1 \over h_1^2 \, h_2^2} \, \gt \, 0 \,. \tag{39} \]

To describe an ellipse, A_q must be an invertible matrix, too, at least for standard conditions (h₁ > 0, h₂ > 0). Furthermore, A_q is obviously symmetric and thus equal to its own transpose. Eq. (39) can be regarded as a necessary condition for any symmetric matrix that shall describe an ellipse! We will see later that A_q must in addition be a positive-definite matrix.

By the way: Eq. (39) tells us that α and γ must have the same sign – and β must fulfill a condition:

\[ \alpha * \gamma \:\gt\: 0\,, \quad \beta^2 \:\lt \: 4\, \alpha \, \gamma \,. \tag{39.1} \]

We will derive further conditions later on.

Success for step 3: We have written down A_q‘s elements α, β, γ as some relatively simple functions of a, b, c, d and thus of h₁, h₂ and Φ. We now focus on the question of how we can express h₁, h₂ and Φ as functions of either (a, b, c, d) or (α, β, γ).

How to derive h₁, h₂ and Φ from the elements of A_E or A_q in the general case?

Let us assume we have (numerical) data for the coefficients of the quadratic form and thus of matrix A_q. Then we may want to calculate values for the length of the principal axes and the rotation angle Φ of the corresponding ellipse. There are two ways to derive respective formulas:

Approach 1: We use trigonometric relations to directly solve a respective equation system. This corresponds to step 4.
Approach 2: We use an eigenvector decomposition of A_q. This will help us to solve the tasks of step 5.

We study both paths below.

Step 4.1: Derivation of h₁, h₂ and Φ from the elements of matrix A_q via trigonometric relations

Helpful trigonometric relations are:

\[ \begin{align} \sin (2 \, \phi) \:&=\: 2 \,\cos (\phi)\, \sin(\phi) \,, \tag{40}\\[10pt] \cos (2 \, \phi) \:&=\: 2 \,\cos^2 (\phi)\, -\, 1 \\[10pt] \:&=\: 1 \,-\, 2\,\sin^2(\phi) \\[10pt] \:&=\: \cos^2 (\phi)\, -\, \sin^2 (\phi) \,. \tag{41} \end{align} \]

We do not make any assumptions, yet, about the difference h₁ – h₂. Remember, that our ellipse could have been created with either inequality of its half-axes. See the alternatives in the inequalities (C).

By combining the above relations (36) to (38) for the elements of A_q we find

\[ \begin{align} \alpha \, + \, \gamma \,&=\, \lambda_1 \,+\, \lambda_2 \,, \tag{43} \\[10pt] \gamma \,\, - \, \alpha \,&=\, \left( \, \lambda_2 \,-\, \lambda_1\,\right) \, \cos (2\,\phi) \,, \tag{44} \\[10pt] \,-\, \beta \:&=\: \left( \, \lambda_2 \,-\, \lambda_1 \, \right)\, \sin (2\,\phi) \,. \tag{45} \end{align} \]

Eqs. (44) and (45) are key-relations! We have to treat them carefully.

By squaring and adding the last two equations we further get

\[ \begin{align} \lambda_2 \,+\, \lambda_1 \,&=\, \alpha \, + \gamma \,, \tag{46.1} \\[10pt] \left| \lambda_2 \,-\, \lambda_1 \right| \,&=\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \,. \tag{46.2} \end{align} \]

From eq. (46.1) together with ineq. (39.1) we find that the following conditions must be fulfilled at the same time

\[ \alpha \:\gt \: 0 \quad \land \quad \gamma \:\gt \: 0 \quad \land \quad \beta^2 \:\lt \: 4\, \alpha \, \gamma \,. \tag{47} \]

to guarantee that both axes have a positive length. These are conditions which correspond to the invertibility and the positive-definiteness of A_q.
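In numerical work it is useful to test such conditions before any half-axes are computed. A minimal sketch, with a hypothetical function name and made-up test coefficients, could look as follows. I use strict inequalities here, because β² < 4αγ already excludes αγ = 0.

```python
def describes_ellipse(alpha, beta, gamma):
    """Check whether alpha*x^2 + beta*x*y + gamma*y^2 = 1 describes an ellipse.

    Strict inequalities: beta^2 < 4*alpha*gamma excludes alpha*gamma = 0.
    """
    return alpha > 0.0 and gamma > 0.0 and beta * beta < 4.0 * alpha * gamma

# Made-up coefficient sets for illustration
checks = {
    "rotated ellipse": describes_ellipse(7.0 / 36.0, -(3.0 ** 0.5) / 6.0, 13.0 / 36.0),
    "hyperbola": describes_ellipse(1.0, 3.0, 1.0),
    "negative definite": describes_ellipse(-1.0, 0.0, -1.0),
}
```

Only the first coefficient set passes the test. The second violates the condition on β, the third has negative α and γ.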

Two basic cases we have to distinguish

We are now confronted with our two elementary cases (C), depending on how we had chosen to scale the half-axes of the original axis-parallel ellipse and on the subsequent rotation angle Φ.

Case 1: λ₂ ≥ λ₁

In this case we would have started the construction of our ellipse with h₁ ≥ h₂. Let us call the resulting ellipse E_1≥2. Then:

\[ \begin{align} \lambda_1 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,-\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,, \tag{48} \\[10pt] \lambda_2 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,+\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,, \tag{49} \end{align} \]

with, obviously, λ₂ ≥ λ₁. We define the following helper angle ψ, whose sign only depends on β

\[ \psi \::=\:\, {1 \over 2} \operatorname{arcsin}\left( {-\, \beta \over \left[ \beta^2 \, +\, \left( \gamma \,-\, \alpha \right)^2 \, \right]^{1/2} } \right) \,. \tag{50} \]

Note that this reduces the solution space of eq. (45): The arcsin-function only returns values between -π/2 and π/2, so that

\[ -\pi/4 \:\le\: \psi \:\le\: \pi/4 \,. \]

For the rotation angle Φ we, therefore, get two potentially valid results of eq. (45)

\[ \begin{align} E_{1\ge2}\,:\quad \phi \:&=\: \psi \, , \tag{51.1} \\[10pt] \mbox{or} \quad \phi \:&=\: \pi/2 \,-\, \psi \, . \tag{51.2} \end{align} \]

This raises the question: What conditions let us decide between the two options?

Well, we should not forget eq. (44). Let us consider the example of β < 0 and at the same time (γ – α) ≥ 0. Then ψ is positive. But Φ must fulfill two conditions, namely (44) and (45) :

\[ \begin{align} \cos (2 \phi) \:\ge\: 0 \quad &\Rightarrow \quad -\, \pi/4 \:\le\: \phi \:\le\: \pi/4 \,, \\[10pt] \sin (2 \phi) \:\ge\: 0 \quad &\Rightarrow \quad \quad \quad 0 \:\le\: \phi \:\le \pi/2 \, \\[10pt] &\Rightarrow \quad \quad \quad0 \:\le\: \phi \:\le \pi/4 \,. \tag{52} \end{align} \]

For the given conditions this restricts us to the solution given by (51.1).

This example shows that a proper choice depends on all elements α, β, γ of matrix A_q, and on fulfilling both eqs. (44) and (45). See a later section below for a proper treatment of all possible cases. Such a distinction of multiple solutions for the rotation angle is necessary when you want to reconstruct ellipses by numerical methods from a given matrix A_q.
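For Case 1 the selection between (51.1) and (51.2) can be sketched in code. The function name is hypothetical, and the sketch assumes that α, β, γ fulfill the conditions (47). The sign of (γ - α), i.e. the sign of cos(2Φ) in eq. (44), decides which of the two candidate angles is taken. The argument of the arcsin is clamped to [-1, 1] as a precaution against rounding errors.

```python
import math

def ellipse_from_quadratic_case1(alpha, beta, gamma):
    """Recover (h1, h2, phi) under the Case 1 convention lambda_2 >= lambda_1."""
    root = math.sqrt(beta**2 + (gamma - alpha)**2)
    lam1 = 0.5 * ((alpha + gamma) - root)      # eq. (48)
    lam2 = 0.5 * ((alpha + gamma) + root)      # eq. (49)
    h1, h2 = 1.0 / math.sqrt(lam1), 1.0 / math.sqrt(lam2)
    if root == 0.0:
        return h1, h2, 0.0                     # circle: the angle is arbitrary
    ratio = max(-1.0, min(1.0, -beta / root))  # guard against rounding errors
    psi = 0.5 * math.asin(ratio)               # eq. (50)
    if gamma - alpha >= 0.0:
        phi = psi                              # cos(2 phi) >= 0, eq. (51.1)
    else:
        phi = math.pi / 2 - psi                # cos(2 phi) < 0, eq. (51.2)
    return h1, h2, phi

# Coefficients of an ellipse with h1 = 3, h2 = 1.5, rotated by pi/6 and pi/3
res_a = ellipse_from_quadratic_case1(7.0 / 36.0, -math.sqrt(3.0) / 6.0, 13.0 / 36.0)
res_b = ellipse_from_quadratic_case1(13.0 / 36.0, -math.sqrt(3.0) / 6.0, 7.0 / 36.0)
```

Both coefficient sets share the same β and thus the same ψ = π/6. Only the sign of (γ - α) distinguishes the rotation by π/6 from the rotation by π/3.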

Case 2: λ₂ < λ₁

Some matrices may describe ellipses whose half-axis in y-direction had been longer than its half-axis in x-direction, ahead of rotation. Let us call such an ellipse E_2>1. Then λ₁ and λ₂ would switch their role:

\[ \begin{align} \lambda_1 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,+\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,, \tag{53} \\[10pt] \lambda_2 \,&=\, {1\over 2}\, \left[\, (\,\alpha\,+\,\gamma\,) \,-\, \left[\, \beta^2 \,+\, \left(\,\gamma \,-\,\alpha \,\right)^2\, \right]^{1/2} \, \right] \,. \tag{54} \end{align} \]

Note that a rotation of such an ellipse by Φ = -(π/2 - ψ) would (for the same values α, β, γ) give us the same ellipse as case 1!

Eq.(45), once again, indicates that we must choose between two alternatives

\[ \begin{align} E_{2\gt1}\,: \quad \phi \:&=\: -\, \psi \,, \tag{55.1} \\[10pt] \phi \:&=\: -\, \left(\, \pi/2 \,-\, \psi \,\right) \,. \tag{55.2} \end{align}\]

Let us again consider an example, namely for β < 0, (γ – α) ≥ 0. By analyzing the two conditions (44) and (45) we get for the smallest value of |Φ|:

\[ \begin{align} \cos (2 \phi) \:\le\: 0 \quad &\Rightarrow \quad -\, 3\,\pi/4 \:\lt\: \phi \:\le\: -\, \pi/4 \,, \\[10pt] \sin (2 \phi) \:\le\: 0 \quad &\Rightarrow \quad \quad \quad -\,\pi/2 \:\lt\: \phi \:\le 0 \, \\[10pt] &\Rightarrow \quad \quad \quad -\, \pi/2 \:\le\: \phi \:\le -\, \pi/4 \,, \\[10pt] &\Rightarrow \quad \phi \:=\; -\, \left( \, \pi/2 \,-\,\psi\,\right)\,. \tag{55} \end{align} \]

This fits solution (55.2).
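That both constructions lead to the same coefficients α, β, γ, and thus to the same ellipse, can be confirmed numerically. The sketch below uses assumed half-axes and an assumed angle ψ = π/6. The helper `quadratic_coeffs` is a hypothetical name for an implementation of eqs. (36) to (38).

```python
import math

def quadratic_coeffs(h1, h2, phi):
    """Return (alpha, beta, gamma) according to eqs. (36) to (38)."""
    lam1, lam2 = 1.0 / h1**2, 1.0 / h2**2
    s, c = math.sin(phi), math.cos(phi)
    return (lam2 * s*s + lam1 * c*c,
            -2.0 * (lam2 - lam1) * c * s,
            lam2 * c*c + lam1 * s*s)

psi = math.pi / 6                                          # assumed example angle
case1 = quadratic_coeffs(3.0, 1.5, psi)                    # h1 >= h2, phi = psi
case2 = quadratic_coeffs(1.5, 3.0, -(math.pi / 2 - psi))   # h2 > h1, phi = -(pi/2 - psi)
max_diff = max(abs(p - q) for p, q in zip(case1, case2))
```

The difference between the two coefficient triples vanishes up to rounding errors. The matrix A_q alone does not tell us which of the two constructions was used.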

Remaining ambiguities?

Although we have in principle succeeded with parts of step 4, the remaining choices leave us a bit skeptical for the time being. It seems that we can impose restrictive conditions on the range of Φ; this may select one of two possible solutions regarding the sine-function. However, the natural ambiguity described in (C), which led us to distinguish the two cases 1 and 2, could not be resolved by a comparison of A_q with A_E alone.

We will come back to this point in a special section below, where we will summarize all possible scenarios. We will eventually see that whenever we need to create a proper ellipse given by a symmetric, positive definite matrix A_q we can safely cling to Case 1 (λ₂ ≥ λ₁), for which we only must distinguish 4 sub-cases depending on the values of β and (γ – α).

Example: Standardized BVD

A simple example for a matrix A_q would be one that appears in the context of a standardized (!) Bivariate Normal Distribution [BVD] with some correlation imposed onto the statistical variables:

\[ \pmb{\operatorname{A}}_q \:=\: {1\over 1\,-\, \rho^2}\, \begin{pmatrix} 1 & -\,\rho \\ -\,\rho & 1 \end{pmatrix} \,\,. \tag{56} \]

Here, ρ > 0 is the Pearson correlation coefficient. For this case we get the following result for the half-axes of a respective ellipse and the rotation angle Φ:

\[ \begin{align} h_1 \,& =\, \sqrt{\left(\, 1\,+\, \rho\,\right)\, } \,, \tag{57} \\[10pt] h_2 \,& =\, \sqrt{\left(\, 1\,-\, \rho\,\right)\, } \,, \tag{58} \\[10pt] \phi \,& =\, \pi / 4 \,. \tag{59} \end{align} \]

Hint: These are very useful equations for a numerical reconstruction of contour ellipses of a BVD. I will describe this in more detail in other posts of this blog.
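As a small plausibility check, the results (57) to (59) can be reproduced from eqs. (48) to (50). The function name and the value ρ = 0.6 in the following sketch are assumptions made for illustration only.

```python
import math

def bvd_contour_ellipse(rho):
    """Half-axes and angle for the matrix A_q of eq. (56), assuming 0 < rho < 1."""
    pref = 1.0 / (1.0 - rho**2)
    alpha, gamma, beta = pref, pref, -2.0 * rho * pref    # beta/2 = -rho * pref
    root = math.sqrt(beta**2 + (gamma - alpha)**2)
    lam1 = 0.5 * ((alpha + gamma) - root)                 # eq. (48)
    lam2 = 0.5 * ((alpha + gamma) + root)                 # eq. (49)
    ratio = max(-1.0, min(1.0, -beta / root))             # guard against rounding
    psi = 0.5 * math.asin(ratio)                          # eq. (50)
    return 1.0 / math.sqrt(lam1), 1.0 / math.sqrt(lam2), psi

rho = 0.6                                                 # assumed example value
h1, h2, phi = bvd_contour_ellipse(rho)
```

Because γ - α = 0 for this matrix, the angle follows from ψ alone. The half-axes should agree with √(1 + ρ) and √(1 - ρ).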

Step 4.2: Derivation of h₁, h₂ and Φ from the elements of A_E via trigonometric relations

Let us quickly look at the relation of the elements of matrix A_E with geometric properties of a respective ellipse. We set

\[ \begin{align} \epsilon_1 \,&=\, h_1^2 \,=\, {1 \over \lambda_1 } \,, \tag{60} \\[10pt] \epsilon_2 \,&=\, h_2^2 \,=\, {1 \over \lambda_2 } \,. \tag{61} \end{align} \]

This gives us

\[ \begin{align} a^2 \,+\, b^2 \:&=\: \epsilon_1 * \cos^2 \phi \, + \, \epsilon_2 * \sin^2 \phi \,, \\[10pt] c^2 \,+\, d^2 \:&=\: \epsilon_1 * \sin^2 \phi \, + \, \epsilon_2 * \cos^2 \phi \,, \\[10pt] a\, c \,+\, b\, d \:&=\: \left( \, \epsilon_1 \,-\, \epsilon_2 \, \right) * \cos \phi * \sin \phi \end{align} \tag{62} \]

and

\[ \begin{align} 2 \,\left( \, a^2 \,+\, b^2 \, \right) \:&=\: \epsilon_1 * \left(\, 1 \,+\, \cos (2\phi) \, \right) \, + \, \epsilon_2 * \left( \, 1 \,-\, \cos (2\phi) \, \right) \,, \\[10pt] 2 \,\left( \, c^2 \,+\, d^2 \, \right) \:&=\: \epsilon_2 * \left(\, 1 \,+\, \cos (2\phi) \, \right) \, + \, \epsilon_1 * \left( \, 1 \,-\, \cos (2\phi)\, \right) \,, \\[10pt] 2 \, \left(\,a \,c \,+\, b\, d \,\right) \:&=\: \left(\, \epsilon_1 \,-\, \epsilon_2 \, \right) * \sin (2\phi) \,. \end{align} \tag{63} \]

We rearrange terms and get:

\[ \begin{align} \epsilon_1 \,+\, \epsilon_2 \, +\, \left( \epsilon_1 \,-\, \epsilon_2 \right) * \cos (2\phi) \:&=\: \,+ 2 \, \left( \, a^2 \,+\, b^2 \, \right) \,, \\[10pt] \,-\, \epsilon_1 \,-\, \epsilon_2 \,+\, \left( \epsilon_1 \,-\, \epsilon_2 \right) * \cos (2\phi) \:&=\: \,- 2 \, \left( \, c^2 \,+\, d^2 \, \right) \,, \\[10pt] \left(\, \epsilon_1 \,-\, \epsilon_2 \, \right) * \sin (2\phi) \:&=\: \,+ 2 \, \left( \, a\,c \,+\, b\,d \, \right) \,. \end{align} \tag{64} \]

So far, so good. But the equations are a bit hard to tackle. Let us, therefore, define some further variables before we add and subtract the first two of the above equations:

\[ \begin{align} r \:&=\: {1 \over 2} \left(\, a^2 \,+\, b^2 \,+\, c^2 \,+\, d^2 \, \right) \,, \tag{65} \\[10pt] s_1 \:&=\: {1 \over 2} \left(\, a^2 \,+\, b^2 \,-\, c^2 \,-\, d^2 \, \right) \,, \tag{66} \\[10pt] s_2 \:&=\: \left(\, a\,c \,+\, b\,d \, \right) \,, \tag{67} \\[10pt] \pmb{s} \:&=\: \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \,, \tag{68} \\[10pt] s \:&=\: \sqrt{ s_1^2 \,+\, s_2^2 } \, .\tag{69} \end{align} \]

Adding the first two equations of (64), and using the third one, results in:

\[ {1 \over 2} \, \left( \epsilon_1 \,-\, \epsilon_2 \right) \begin{pmatrix} \cos (2\phi) \\ \sin (2\phi) \end{pmatrix} \:=\: \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \,. \tag{70} \]

Again, we must distinguish 2 cases:

Case 1: ε₁ > ε₂

\[ \begin{align} h_1^2 \,=\, \epsilon_1 \:&=\: r \,+\, s \,, \tag{71} \\[10pt] h_2^2 \,=\, \epsilon_2 \:&=\: r \,-\, s \,. \tag{72} \end{align} \]

In terms of the matrix elements a, b, c, d:

\[ \begin{align} h_1^2 \,=\, {1 \over \lambda_1} \:&=\: {1 \over 2} \left[ \, a^2+b^2+c^2 +d^2 \,+\, \left[ 4 (ac + bd)^2 \, +\, \left( c^2+d^2 -a^2 -b^2\right)^2 \, \right]^{1/2} \right] \,, \tag{73} \\[10pt] h_2^2 \,=\, {1 \over \lambda_2} \:&=\: {1 \over 2} \left[ \, a^2+b^2+c^2 +d^2 \,-\, \left[ 4 (ac + bd)^2 \, +\, \left( c^2+d^2 -a^2 -b^2\right)^2 \, \right]^{1/2} \right] \,. \tag{74} \end{align} \]

Who said that life had to be easy? I leave it to the reader to show with the help of eq.(31) that this is identical to eqs. (48) and (49).

Case 2: ε₂ > ε₁

\[ \begin{align} h_1^2 \,=\, \epsilon_1 \:&=\: r \,-\, s \,, \tag{75} \\[10pt] h_2^2 \,=\, \epsilon_2 \:&=\: r \,+\, s \,. \tag{76} \end{align} \]

\[ \begin{align} h_1^2 \,=\, {1 \over \lambda_1} \:&=\: {1 \over 2} \left[ \, a^2+b^2+c^2 +d^2 \,-\, \left[ 4 (ac + bd)^2 \, +\, \left( c^2+d^2 -a^2 -b^2\right)^2 \, \right]^{1/2} \right] \,, \tag{77} \\[10pt] h_2^2 \,=\, {1 \over \lambda_2} \:&=\: {1 \over 2} \left[ \, a^2+b^2+c^2 +d^2 \,+\, \left[ 4 (ac + bd)^2 \, +\, \left( c^2+d^2 -a^2 -b^2\right)^2 \, \right]^{1/2} \right] \,. \tag{78} \end{align} \]

It is relatively easy to prove:

\[ h_1^2 * h_2^2 \,=\, {1 \over \lambda_1 \, \lambda_2} \,=\, \left(\, a\,d\,-\, b\,c \, \right)^2 \,=\, \left[\operatorname{det}\left(\pmb{\operatorname{A}}_E \right)\right]^2 \,. \tag{79} \]

We thus can rewrite eq. (31) as

\[ \begin{align} \alpha \:&=\: { 1 \over \left(\,a\,d \,-\, b\,c\,\right)^2 } \, \left(\, c^2 \,+\, d^2 \,\right) \,, \\[10pt] \gamma \:&=\: { 1 \over \left(\,a\,d \,-\, b\,c\,\right)^2 } \, \left(\, a^2 \,+\, b^2 \,\right) \,, \\[10pt] \beta \:&=\: -\, 2\, { 1 \over \left(\,a\,d \,-\, b\,c\,\right)^2 } \, \left(\,a \,c \,+\, b \, d \,\right)\,. \end{align} \tag{80} \]
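The helper variables r and s make the computation of the half-axes from the elements of A_E rather short. The sketch below (function name and example values are assumptions) returns the longer and the shorter half-axis. Which of them is called h₁ depends on the case chosen above.

```python
import math

def half_axes_from_A_E(a, b, c, d):
    """Return (h_long, h_short) from the elements of A_E via eqs. (65) to (72)."""
    r = 0.5 * (a*a + b*b + c*c + d*d)      # eq. (65)
    s1 = 0.5 * (a*a + b*b - c*c - d*d)     # eq. (66)
    s2 = a*c + b*d                         # eq. (67)
    s = math.hypot(s1, s2)                 # eq. (69)
    return math.sqrt(r + s), math.sqrt(r - s)

# Build A_E from assumed example values and recover the half-axes
h1, h2, phi = 3.0, 1.5, math.pi / 6
u1, u2 = math.cos(phi), math.sin(phi)
a, b, c, d = h1 * u1, -h2 * u2, h1 * u2, h2 * u1
h_long, h_short = half_axes_from_A_E(a, b, c, d)

# The product of the half-axes should equal |det(A_E)|
product_check = h_long * h_short - abs(a * d - b * c)
```

For the example the function gives back the half-axes 3 and 1.5, and their product agrees with the absolute value of the determinant of A_E.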

Determination of the inclination angle Φ via elements of A_E

For the determination of the angle Φ we use:

\[ \begin{pmatrix} \operatorname{cos}(2\phi) \\ \operatorname{sin}(2\phi) \end{pmatrix} \:=\: {1 \over s} \begin{pmatrix} s_1 \\ s_2 \end{pmatrix} \,. \tag{81} \]

We choose

\[ -\pi/2 \,\lt\, \phi \le \pi/2 \,, \tag{82} \]

and get:

\[ \phi \:=\: {1 \over 2} \operatorname{arctan}\left({s_2 \over s_1}\right) \:=\: {1 \over 2} \, \operatorname{arctan}\left( { 2\left(ac \,+\, bd\right) \over (a^2 \,+\, b^2 \,-\, c^2 \,-\, d^2) } \right) \,. \tag{83} \]

Note: All in all there are four different solutions. The reason is that we alternatively could have requested λ₂ ≥ λ₁ or λ₁ ≥ λ₂, and in addition chosen a different angle. Again, we find ambiguities due to a selection of the considered principal axis and rotational symmetries. See a later section for a proper distinction between all reasonable cases.
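A small numerical sketch may illustrate the ambiguity. It assumes that A_E is a counterclockwise rotation applied to a scaling with h₁ > h₂, and it reads s₁ and s₂ off eq. (83) up to a common factor. The second function uses `atan2`, which is not the approach of the text above but a common variant that keeps the quadrant information of eq. (81). All names and sample values are hypothetical.

```python
import math
import numpy as np

def phi_from_AE_arctan(a, b, c, d):
    """Eq. (83): plain arctan; the result always lies in (-pi/4, pi/4)."""
    s1 = a**2 + b**2 - c**2 - d**2
    s2 = 2.0 * (a * c + b * d)
    return 0.5 * math.atan(s2 / s1)           # fails for s1 == 0

def phi_from_AE_atan2(a, b, c, d):
    """Variant (assumption, not from the text): atan2 keeps the quadrant of 2*phi."""
    s1 = a**2 + b**2 - c**2 - d**2
    s2 = 2.0 * (a * c + b * d)
    return 0.5 * math.atan2(s2, s1)           # result in (-pi/2, pi/2]

# Hypothetical sample: longer axis h1 = 3.0 rotated by 70 degrees
h1, h2, phi = 3.0, 1.5, np.deg2rad(70.0)
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
(a, b), (c, d) = R @ np.diag([h1, h2])

phi_arctan = phi_from_AE_arctan(a, b, c, d)
phi_atan2 = phi_from_AE_atan2(a, b, c, d)
print(np.rad2deg(phi_arctan), np.rad2deg(phi_atan2))
```

In this example the `atan2` variant returns the angle of the longer axis, while the plain arctan returns an angle shifted by π/2, i.e. the direction of the other principal axis.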

Step 5: 2nd way to a solution for h₁, h₂ and Φ via eigendecomposition of A_q

We now turn to step 5. For our second way of deriving formulas for h₁, h₂ and Φ we use some Linear Algebra. Above we have written down a symmetric, positive-definite matrix A_q describing an operation on the position vectors of points on our rotated ellipse:

\[ \pmb{v}_E^T \circ \pmb{\operatorname{A}}_q \circ \pmb{v}_E \:=\: 1 \,. \tag{84} \]

From Linear Algebra we know that every symmetric, real-valued and positive definite matrix can be decomposed into a product of orthogonal matrices O, O^T and a diagonal matrix. This reflects the so called eigendecomposition of a symmetric matrix:

\[ \pmb{\operatorname{A}}_q \:=\: \pmb{\operatorname{O}} \circ \pmb{\operatorname{D}}_{diag} \circ \pmb{\operatorname{O}}^T \tag{85} \]

with

\[ \pmb{\operatorname{D}}_{diag} \:=\: \begin{pmatrix} \lambda_{u} & 0 \\ 0 & \lambda_{d} \end{pmatrix} \,. \tag{86} \]

The elements λ_u and λ_d are eigenvalues of both D_diag and A_q. Reason: A similarity transformation with an orthogonal matrix does not change the eigenvalues of the transformed matrix. So, the diagonal elements of D_diag are the eigenvalues of A_q, too. Linear Algebra also tells us that the columns of the matrix O are given by the components of the normalized eigenvectors of A_q. (The order of the eigenvalues along the diagonal is not pre-defined. Only by convention one would arrange them downwards in increasing order.)

We can interpret O as a rotation matrix R_Φ for some angle Φ:

\[ \left[ \pmb{\operatorname{R}}_{-\phi} \circ \pmb{v}_E \right]^T \circ \pmb{\operatorname{D}}_{diag} \circ \left[ \pmb{\operatorname{R}}_{-\phi} \circ \pmb{v}_E \right] \:=\: 1 \,. \tag{87} \]

The whole operation tells us a simple truth, which we are already familiar with:

By our construction procedure for a rotated ellipse we know that a (rotated) CCS exists, in which the ellipse can be described as the result of a scaling operation (along the coordinate axes of the rotated CCS) applied to a unit circle. (This CCS is, of course, rotated by an angle Φ against our working CCS in which the ellipse appears rotated.)

Above we had found

With our matrices R_Φ and the scaling matrix D_E we rewrite this as

\[ \left[ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \, \right]^T \, \circ \, \left[ \pmb{\operatorname{D}}_E^{-1} \circ \pmb{\operatorname{R}}_{-\phi} \right]^T \circ \left[ \pmb{\operatorname{D}}_E^{-1} \circ \pmb{\operatorname{R}}_{-\phi} \right] \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \:=\: 1 \,. \tag{89} \]

A rearrangement tells us:

\[ \left[ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \, \right]^T \, \circ \, \left[ \pmb{\operatorname{R}}_{-\phi} \right]^T \circ \left[ \pmb{\operatorname{D}}_E^{-1} \right]^T \circ \pmb{\operatorname{D}}_E^{-1} \circ \pmb{\operatorname{R}}_{-\phi} \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \:=\: 1 \,. \tag{90} \]

Now, we remember that a diagonal matrix is its own transposed matrix and that the inverse of an orthogonal matrix (rotation) is its transposed matrix:

\[ \left[ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \, \right]^T \, \circ \, \left[ \pmb{\operatorname{R}}_{\phi} \right] \circ \left[ \pmb{\operatorname{D}}_E^{-1} \circ \pmb{\operatorname{D}}_E^{-1} \right] \circ \pmb{\operatorname{R}}_{\phi}^T \circ \begin{pmatrix} x_E \\ y_E \end{pmatrix} \:=\: 1 \,. \tag{91} \]

Comparing with eqs. (85) and (87) we find:

\[ \pmb{\operatorname{D}}_{diag} \:=\: \left[ \pmb{\operatorname{D}}_E^{-1} \circ \pmb{\operatorname{D}}_E^{-1} \right] \,. \tag{92} \]

We therefore may identify the eigenvalues as some familiar terms

\[ \begin{align} \lambda_u \,=\, \lambda_1 \,&=\, {1 \over h_1^2} \,, \\[10pt] \lambda_d \,=\, \lambda_2 \,&=\, {1 \over h_2^2} \,. \end{align} \tag{93} \]

The eigenvalues λ_u and λ_d of our symmetric matrix A_q are just our parameters λ₁ and λ₂. It is noteworthy that the half-axes of the ellipse are given by the reciprocals of the square roots of the matrix’s eigenvalues:

\[ \begin{align} h_1 \:&=\: {1 \over \sqrt{\lambda_u} } \,, \tag{94} \\[10pt] h_2 \:&=\: {1 \over \sqrt{\lambda_d} } \,. \tag{95} \end{align} \]
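The relation between the eigendecomposition and the half-axes can be checked with NumPy. This is a minimal sketch, assuming a counterclockwise rotation matrix and hypothetical sample values. Note that `numpy.linalg.eigh` returns the eigenvalues in ascending order, so the first half-axis computed below is the longer one.

```python
import numpy as np

# Hypothetical sample ellipse: half-axes 2.0 and 1.0, rotated by 30 degrees
h1, h2, phi = 2.0, 1.0, np.deg2rad(30.0)
R = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])    # assumed rotation convention

# A_q = R * D_E^{-1} * D_E^{-1} * R^T, cf. eqs. (91) and (92)
A_q = R @ np.diag([1.0 / h1**2, 1.0 / h2**2]) @ R.T

# eigh is meant for symmetric matrices: ascending eigenvalues,
# orthonormal eigenvectors as columns of O
eigvals, O = np.linalg.eigh(A_q)

half_axes = 1.0 / np.sqrt(eigvals)             # eqs. (94), (95)
A_rebuilt = O @ np.diag(eigvals) @ O.T         # eq. (85)

print(half_axes)
print(np.allclose(A_rebuilt, A_q), np.allclose(O.T @ O, np.eye(2)))
```

The signs of the eigenvector columns returned by NumPy are not fixed, which is one face of the ambiguity discussed in the text.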

Mathematically, a lengthy calculation to solve the eigenvalue-problem indeed reveals that the two eigenvalues of a symmetric matrix A_q with the elements α, β and γ have the following form:

\[ \lambda_{I/II} \:=\: {1 \over 2} \left[\, \left(\alpha \,+\, \gamma \right) \,\mp\, \left[ \beta^2 + \left(\gamma \,-\, \alpha \right)^2 \,\right]^{1/2} \, \right] \,. \tag{96} \]

I have used different indices here because of the following reason:

It is a priori not clear which of the eigenvalues has to be identified with the upper and lower elements of the central matrix D_diag. This reflects the already familiar ambiguity in how the ellipse was oriented before rotation and the resulting problem of finding a proper rotation angle from the eigendecomposition. See a later section for a solution.

Side remark: You can derive the results (94) to (96) a bit more easily by solving the so-called “characteristic equation” of the matrix. For more details see e.g.:
Eigenvalues and eigenvector of a positive-definite, real valued and symmetric matrix
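As a brief sketch of that route, assuming the matrix A_q has the diagonal elements α, γ and the off-diagonal elements β/2 (as used in the eigenvalue problem of this section), the characteristic equation and its two solutions read:

\[ \begin{align} \operatorname{det}\left( \pmb{\operatorname{A}}_q \,-\, \lambda \, \pmb{\operatorname{I}} \right) \:&=\: \left(\alpha \,-\, \lambda \right) \left(\gamma \,-\, \lambda\right) \,-\, {\beta^2 \over 4} \:=\: \lambda^2 \,-\, \left(\alpha \,+\, \gamma\right) \lambda \,+\, \alpha \, \gamma \,-\, {\beta^2 \over 4} \:=\: 0 \,, \\[10pt] \lambda \:&=\: {1 \over 2} \left[\, \left(\alpha \,+\, \gamma \right) \,\mp\, \left[ \left(\alpha \,+\, \gamma \right)^2 \,-\, 4\, \alpha\, \gamma \,+\, \beta^2 \,\right]^{1/2} \, \right] \:=\: {1 \over 2} \left[\, \left(\alpha \,+\, \gamma \right) \,\mp\, \left[ \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2 \,\right]^{1/2} \, \right] \,. \end{align} \]

The last step only uses (α + γ)² − 4αγ = (γ − α)².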

Our result, of course, is consistent with what we have found above by solving respective equations with the help of trigonometric terms. We will prove the fact that λ_I and λ_II indeed are valid eigenvalues in a minute.

Let us first look at respective eigenvectors ξ_I/II. To get them we must solve the equations of the eigenvalue problem:

\[ \left( \begin{pmatrix} \alpha & \beta / 2 \\ \beta / 2 & \gamma \end{pmatrix} \,-\, \begin{pmatrix} \lambda_{I/II} & 0 \\ 0 & \lambda_{I/II} \end{pmatrix} \right) \,\circ \, \pmb{\xi}_{I/II} \:=\: \pmb{0}\,, \tag{97} \]

with

\[ \pmb{\xi_I} \,=\, \begin{pmatrix} \xi_{I,x} \\ \xi_{I,y} \end{pmatrix}, \quad \pmb{\xi}_{II} \,=\, \begin{pmatrix} \xi_{II,x} \\ \xi_{II,y} \end{pmatrix} \,. \tag{98} \]

The following vectors fulfill the conditions (up to a common factor in the components) :

\[ \begin{align} \lambda_I \: &: \quad \pmb{\xi}_I \:=\: \left(\, {1 \over \beta} \left( (\alpha \,-\, \gamma) \,-\, \left[\, \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2\,\right]^{1/2} \right), \: 1 \, \right)^{\operatorname{T}} \,, \tag{99} \\[10pt] \lambda_{II} \: &: \quad \pmb{\xi}_{II} \:=\: \left(\, {1 \over \beta} \left( (\alpha \,-\, \gamma) \,+\, \left[\, \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2\,\right]^{1/2} \right), \: 1 \, \right)^{\operatorname{T}} \,, \tag{100} \end{align} \]

for the eigenvalues

\[ \begin{align} \lambda_I \:&=\: {1 \over 2} \left(\, \left(\alpha \,+\, \gamma \right) \,-\, \left[ \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2 \,\right]^{1/2} \, \right) \,, \tag{101} \\[10pt] \lambda_{II} \:&=\: {1 \over 2} \left(\, \left(\alpha \,+\, \gamma \right) \,+\, \left[ \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2 \,\right]^{1/2} \, \right) \,. \tag{102} \end{align} \]

For a positive-definite matrix we must guarantee that λ_I > 0 and λ_II > 0 (corresponding to positive length-values of the half-axes). This again gives us the conditions

\[ \begin{align} \beta^2 \,\lt\, 4\, \gamma\, \alpha \, \quad &\land \quad \alpha*\gamma \,\gt\, 0 \quad \land \quad \alpha \,+\, \gamma \gt 0 \\[10pt] \Rightarrow \quad \alpha \,\gt\, 0 \quad &\land \quad \gamma \,\gt\, 0 \,. \end{align} \tag{103} \]

Compare this with eqs. (47). Note that the vector components given above are not normalized. This is important for numerical checks, as NumPy and other Linear Algebra libraries typically return normalized eigenvectors of length 1. But you can easily compensate for this by working with

\[ \begin{align} \lambda_I \: &: \quad \pmb{\xi_I^n} \:=\: {1 \over \|\pmb{\xi_I}\|}\, \pmb{\xi_I} \,, \\[10pt] \lambda_{II} \: &: \quad \pmb{\xi_{II}^n} \:=\: {1 \over \|\pmb{\xi_{II}}\|}\, \pmb{\xi_{II}} \,. \end{align} \tag{104} \]
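Such a numerical check could look like the following sketch. It evaluates the closed-form expressions (99) to (102), normalizes the vectors as in eq. (104), and compares them with the output of `numpy.linalg.eigh`. The sample values for α, β, γ are hypothetical; β must be non-zero for the closed-form eigenvectors, and the comparison has to ignore the overall sign of each eigenvector.

```python
import numpy as np

def eigenpairs_closed_form(alpha, beta, gamma):
    """Eigenvalues (101)/(102) and normalized eigenvectors (99)/(100)/(104).

    Requires beta != 0.
    """
    z = np.sqrt(beta**2 + (gamma - alpha)**2)
    lam_I = 0.5 * ((alpha + gamma) - z)
    lam_II = 0.5 * ((alpha + gamma) + z)
    xi_I = np.array([((alpha - gamma) - z) / beta, 1.0])
    xi_II = np.array([((alpha - gamma) + z) / beta, 1.0])
    return (lam_I, lam_II,
            xi_I / np.linalg.norm(xi_I),
            xi_II / np.linalg.norm(xi_II))

# Hypothetical coefficients fulfilling beta^2 < 4 * alpha * gamma
alpha, beta, gamma = 2.0, 1.0, 1.0
A_q = np.array([[alpha, beta / 2.0],
                [beta / 2.0, gamma]])

lam_I, lam_II, xi_I, xi_II = eigenpairs_closed_form(alpha, beta, gamma)
w, V = np.linalg.eigh(A_q)      # ascending eigenvalues, eigenvectors as columns

print(w, (lam_I, lam_II))
# |dot product| of two unit vectors is 1 if they agree up to sign
print(abs(V[:, 0] @ xi_I), abs(V[:, 1] @ xi_II))
```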

Proof for the eigenvalues and eigenvector components

We just prove that the eigenvector conditions are fulfilled for the components of the first eigenvector ξ_I and the respective eigenvalue λ_I.

\[ \begin{align} \left(\alpha \,-\, \lambda_I \right) * \xi_{I,x} \,+\, {1 \over 2} \beta * \xi_{I,y} \,&=\, 0 \,, \tag{105} \\[10pt] {1 \over 2} \beta * \xi_{I,x} \,+\, \left( \gamma \,-\, \lambda_I \right) * \xi_{I,y} \,&=\, 0 \,. \tag{106} \end{align} \]

(The steps for the second eigenvector are completely analogous). We start with the condition for the first component – and fill in the assumed eigenvalue:

\[ \begin{align} &\left( \alpha \,-\, {1\over 2}\left[\,\left(\alpha \, + \, \gamma\right) \,-\, \left[ \beta^2 \,+\, \left( \alpha \,-\, \gamma \right)^2 \right]^{1/2} \right] \right) * \\[10pt] & {1 \over \beta}\, \left[\, \left(\alpha \,-\, \gamma\right) \,-\, \left[ \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \right]^{1/2} \,\right] \,+\, {\beta \over 2 } \,=\, 0 \,, \tag{107} \end{align} \]

Thus

\[ \begin{align} & {1 \over 2 } \left[ \left(\alpha \,-\,\gamma\right) \,+\, \left[ \beta^2 \,+\, \left( \alpha \,-\, \gamma \right)^2 \right]^{1/2} \right] * \\[10pt] & {1 \over \beta}\, \left[\, \left(\alpha \,-\, \gamma\right) \,-\, \left[ \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \right]^{1/2} \,\right] \,+\, {\beta \over 2 } \,=\, 0 \,. \tag{108} \end{align} \]

\[ {1 \over 2 \, \beta} \left[ (\alpha \,-\,\gamma)^2 \,-\, \beta^2 \,-\, (\alpha \,-\,\gamma)^2 \right] \,+\, {\beta \over 2 } \,=\, 0 \,. \tag{109} \]

Obviously, the last equation is true.

One can perform a similar calculation for the other eigenvector component:

\[ \begin{align} {1 \over 2} \, \beta \, & {1 \over \beta}\, \left[\, \left(\alpha \,-\, \gamma\right) \,-\, \left[ \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \right]^{1/2} \,\right] \,+\, \\ & \left( \gamma \, -\, {1\over 2}\left[\,\left(\alpha \, + \, \gamma\right) \,-\, \left[ \beta^2 \,+\, \left( \alpha \,-\, \gamma \right)^2 \right]^{1/2} \right] \right) * 1 \,=\, 0 \,. \tag{110} \end{align} \]

\[ \begin{align} &{1 \over 2} \, \left(\alpha \,-\, \gamma\right) \,-\, {1 \over 2} \left[ \beta^2 \,+\, \left(\alpha \,-\, \gamma \right)^2 \right]^{1/2} \\[10pt] -\, &{1\over 2}\left(\alpha \,-\, \gamma\right) \,+\, {1\over 2}\left[ \beta^2 \,+\, \left( \alpha \,-\, \gamma \right)^2 \right]^{1/2} \,=\, 0 \,. \tag{111} \end{align} \]

True, again.

In a very similar exercise one can show that the scalar product of our two eigenvectors is equal to zero:

\[ \begin{align} & {1 \over \beta^2}\, \left[\, \left(\alpha \,-\, \gamma\right)^2 \,-\, \beta^2 \,-\, \left(\alpha \,-\, \gamma \right)^2 \,\right] \,+\, 1 \\[8pt] & =\, -\, {1 \over \beta^2} * \beta^2 \,+\, 1 \,=\, 0 \,. \tag{112} \end{align} \]

I.e.,

\[ \pmb{\xi}_I \circ \pmb{\xi}_{II} \,=\, \left( \xi_{I,x}, \, \xi_{I,y} \right) \circ \begin{pmatrix} \xi_{II,x} \\ \xi_{II,y} \end{pmatrix} \,= \, 0 \,. \tag{113} \]

The eigenvectors are perpendicular to each other. This is exactly what we expect for the orientations of the principal axes of an ellipse.

A formula for the rotation angle based on the orientation of the eigenvectors

From Linear Algebra results related to an eigendecomposition we know that the columns of the orthogonal (rotation) matrices are the (normalized) eigenvectors, with components given in terms of our fixed, un-rotated CCS, in which we work. It is relatively easy to see that these vectors point along the principal axes of our ellipse.

Therefore, the components of the eigenvectors of A_q should define the rotation angles we are looking for, i.e. the angles of the ellipse’s principal axes against the x-axis of our CCS. Let us prove this. By assuming

\[ \begin{align} \cos (\phi_I) \,&=\, \xi_{I,x}^n \,, \\[10pt] \sin (\phi_I) \,&=\, \xi_{I,y}^n \,, \end{align} \tag{114} \]

and using

\[ \sin(2\phi_I) \,=\, 2\, \sin (\phi_I) \, \cos (\phi_I) \,, \]

we get

\[ \begin{align} \sin (2 \phi_I) \,=\, 2 * { \xi_{I,x} * \xi_{I,y} \over \left[\, \xi_{I,x}^2 \, + \, \xi_{I,y}^2 \,\right] } \,. \tag{115} \end{align} \]

Therefore,

\[ \begin{align} \operatorname{sin}(2 \phi_I) \,&=\, 2 \,\, { {1 \over \large{\beta}} \left( (\alpha \,-\, \gamma) \,-\, \left[\, \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2\,\right]^{1/2} \right) \,*\, 1 \over \left[\, \left( {1 \over \large{\beta}} \left( (\alpha \,-\, \gamma) \,-\, \left[\, \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2\,\right]^{1/2} \right) \right)^2 \,+\, 1^2 \right] } \\[8pt] &=\, 2\,\, { {1 \over \large{\beta}} \left( t \,-\, z \right) \over {1 \over \large{\beta}^2 \phantom{\large{]}} } \left[\, \beta^2 \,+\, \left(\, t \,-\, z \,\right)^2 \right] } \,, \tag{116} \end{align} \]

with

\[ \begin{align} t \,&=\, (\alpha \,-\, \gamma) \,, \\[8pt] z \,&=\, \left[\, \beta^2 \,+\, \left(\gamma \,-\, \alpha \right)^2\,\right]^{1/2} \,. \tag{117} \end{align} \]

Somewhat surprisingly, this looks very different from the simple expression we got above, and a direct approach is cumbersome. The trick is to multiply numerator and denominator by a convenience factor

\[ \left( t \,+\, z \right), \]

and to exploit

\[ \begin{align} \left( t \,-\, z \right) \, \left( t \,+\, z \right) \,&=\, t^2 \,-\, z^2 \,, \tag{118} \\[10pt] \left( t \,-\, z \right) \, \left( t \,+\, z \right) \,&=\,\, - \,\beta^2 \,. \tag{119} \end{align} \]

This gives us

\[ \begin{align} &2 * \beta \, { (t\,-\, z) * ( t\,+\, z) \over \left[ \beta^2 \, + \,( t \,-\, z )^2 \right] * (t \,+\, z) } \\[10pt] &=\: 2 * \beta \, { - \beta^2 \over \beta^2 \, (t\,+\,z) \,-\, \beta^2 \, (t\,-\,z) } \\[10pt] &=\: - \, {\beta \over \left[\, \beta^2 \,+\, (\alpha \,-\, \gamma)^2 \,\right]^{1/2} } \,. \tag{120} \end{align} \]

Hence

\[ \operatorname{sin}\, (2 \phi_I) \:=\: \, - \, { \beta \over \left[\, \beta^2 \,+\, (\alpha \,-\, \gamma)^2 \,\right]^{1/2} }\,, \tag{121} \]

which is of course identical to the result we got with our first solution approach. Again, we have an ambiguity here, as we can get the same sine-value for two different angles. In addition: For any eigenvector, the vector pointing into opposite direction (rotation by π) is an eigenvector, too. This makes life a bit harder when you get a matrix, use the formulas above and try to construct the ellipse with correct orientation.
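Both the formula and the sign ambiguity can be illustrated numerically. The sketch below uses hypothetical coefficients with β ≠ 0, computes sin(2Φ_I) once from the un-normalized eigenvector of eq. (99) and once from eq. (121), and shows that flipping the eigenvector (rotation by π) leaves sin(2Φ_I) unchanged.

```python
import numpy as np

# Hypothetical coefficients fulfilling beta^2 < 4 * alpha * gamma, beta != 0
alpha, beta, gamma = 2.0, 1.0, 1.0

z = np.sqrt(beta**2 + (gamma - alpha)**2)
xi_I = np.array([((alpha - gamma) - z) / beta, 1.0])    # eq. (99), not normalized

# eq. (115): sin(2*phi) from the eigenvector components
sin2phi_from_vector = 2.0 * xi_I[0] * xi_I[1] / (xi_I @ xi_I)

# eq. (121): closed form in terms of alpha, beta, gamma
sin2phi_closed = -beta / z

# Angle of the eigenvector and of the opposite vector (rotation by pi)
phi_I = np.arctan2(xi_I[1], xi_I[0])
phi_I_flipped = np.arctan2(-xi_I[1], -xi_I[0])

print(sin2phi_from_vector, sin2phi_closed)
print(np.sin(2.0 * phi_I), np.sin(2.0 * phi_I_flipped))
```

All four printed values should coincide, although `phi_I` and `phi_I_flipped` differ by π.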

Step 6: Cholesky decomposition of the inverse of [A_q] – and yet another way to reconstruct our ellipse

We have come a long way. However, the ambiguities regarding the determination of an angle from matrix elements are still worrisome. While some examples gave us the feeling that eqs. (44) and (45) will help to narrow down the angle, the basic choice between (λ₂ – λ₁) ≥ 0 or (λ₂ – λ₁) < 0 lurks in the background. Can we hope that we can get rid of this choice when reconstructing an ellipse from the elements of a matrix A_q? Is there something which indicates a method to derive an angle? Is there a matrix (dependent on A_q) which creates an ellipse fulfilling the conditions of eqs. (28) and (29) in a unique way?

Yes, there is: We can use a Cholesky decomposition of the inverse of A_q. Such a decomposition for a positive-definite and symmetric matrix actually is unique:

\[ \left[ \operatorname{\pmb{A}}_q \right]^{-1} \:=\: \operatorname{\pmb{K}}_{ch}\,\bullet\,\operatorname{\pmb{K}}_{ch}^{\operatorname{T}}\, . \tag{122} \]

K_ch is a lower triangular matrix. It is easy to show that K_ch transports vectors on a unit circle in a defined and unique way to our target ellipse. If you insert

\[ \operatorname{\pmb{A}}_q \:=\: \left[ \operatorname{\pmb{K}}_{ch}\,\bullet\,\operatorname{\pmb{K}}_{ch}^{\operatorname{T}}\right]^{-1} \]

into eq. (29) and rearrange terms according to matrix rules you find the definition of vectors for a unit circle:

\[ \left[ \operatorname{\pmb{K}}_{ch}^{-1} \circ \pmb{v}_E \right]^{\operatorname{T}} \circ \left[ \operatorname{\pmb{K}}_{ch}^{-1} \circ \pmb{v}_E \right] \:=\: 1 \,. \tag{123} \]

Thus

\[\pmb{v}_E = \operatorname{\pmb{K}}_{ch} \circ \pmb{v}_{c, ch} \tag{124} \]

for some vector v_c,ch to a point on a unit circle. It is also clear that K_ch operates in a unique way on vectors defining a unit circle. This in turn means that an ellipse defined via A_q should be well defined – without ambiguities.

Furthermore, K_ch gives us a valuable tool to recreate ellipses defined by A_q from vectors on a unit circle. We can use it to check other methods to re-construct ellipses from symmetric, invertible matrices A_q. I will show examples in a forthcoming post in this blog.

Note that A_E normally is not identical with the lower triangular matrix K_ch. That such an alternative way to create our ellipse with a different matrix than A_E exists has the following reason: As we start with a rotation-invariant unit circle, we can always add some initial rotation of this rotationally symmetric object.

The elements of K_ch can be derived with a somewhat lengthy calculation from the elements of A_q. I omit it here. I just give you the result:

\[ \operatorname{\pmb{K}}_{ch} \:=\: \begin{pmatrix} k_1 &0 \\ k_3 & k_4 \end{pmatrix} \,, \tag{125} \]

with

\[\begin{align} k_1 \:&=\: \sqrt{\, {4 \, \gamma \over 4\, \gamma\, \alpha \,-\, \beta^2}\, } \,, \\[10pt] k_2 \:&=\: 0 \,, \\[10pt] k_3 \:&= \: -\, {\beta \over 2 \, \gamma} \, \sqrt{\, {4 \, \gamma \over 4\, \gamma\, \alpha \,-\, \beta^2}\, } \,, \\[10pt] k_4 \:&=\: \sqrt{ \,{1 \over \gamma}\, } \,. \end{align} \tag{126} \]

We will use this matrix in the next post for numerical and graphical examples.
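The closed-form elements of eq. (126) can be compared with a numerical Cholesky factor. The following sketch assumes that A_q has the elements (α, β/2; β/2, γ) and uses hypothetical sample values. `numpy.linalg.cholesky` returns the lower triangular factor, which matches the form of K_ch in eq. (125). The last lines map points of a unit circle with K_ch and test whether the images fulfill the quadratic form of the ellipse.

```python
import numpy as np

def cholesky_factor_closed_form(alpha, beta, gamma):
    """Lower triangular K_ch following eqs. (125)/(126)."""
    disc = 4.0 * gamma * alpha - beta**2        # must be > 0 for an ellipse
    k1 = np.sqrt(4.0 * gamma / disc)
    k3 = -beta / (2.0 * gamma) * k1
    k4 = np.sqrt(1.0 / gamma)
    return np.array([[k1, 0.0],
                     [k3, k4]])

# Hypothetical coefficients fulfilling beta^2 < 4 * alpha * gamma
alpha, beta, gamma = 2.0, 1.0, 1.0
A_q = np.array([[alpha, beta / 2.0],
                [beta / 2.0, gamma]])

K_closed = cholesky_factor_closed_form(alpha, beta, gamma)
K_numpy = np.linalg.cholesky(np.linalg.inv(A_q))    # eq. (122)

# Points on the unit circle, stored as columns, and their images, eq. (124)
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
circle = np.vstack([np.cos(t), np.sin(t)])
ellipse = K_closed @ circle

# v_E^T A_q v_E for every column; should be 1 everywhere
quad = np.einsum('ik,ij,jk->k', ellipse, A_q, ellipse)

print(np.allclose(K_closed, K_numpy), np.allclose(quad, 1.0))
```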

Step 7: Clarifying the determination of the rotation angle from matrix elements

It is time to clarify the situation for the rotation angle of an ellipse defined by a matrix A_q. We need a proper analysis for all cases which might occur regarding the initial orientation of the stretched ellipse and certain values of the elements of A_q. The angles of the eigenvectors will not help much, because here the direction is only defined up to a sign – and we would also have to check for the right order of the half-axes with a positive angle of π/2 in between. Therefore, we refer again to our key eqs. (44) and (45).

\[ \begin{align} \gamma \,-\, \alpha \,&=\, \left( \, \lambda_2 \,-\, \lambda_1\,\right) \, \cos (2\,\phi) \,, \\[10pt] \,-\, \beta \:&=\: \left( \, \lambda_2 \,-\, \lambda_1 \, \right)\, \sin (2\,\phi) \,. \end{align} \]

Analysis of the allowed intervals for the rotation angle Φ

β as well as (γ – α) can become positive or negative. We also have the option of setting (h₁ – h₂) and equivalently (λ₂ – λ₁) greater or less than zero. This gives us 8 cases to distinguish. We also set the rule that |Φ| should be limited to avoid multiple solutions with a shift of 2π:

\[ -\, 3/4 \, \pi \:\le\: \phi \:\le\: 3/4 \pi \,. \]

We also demand that we should get a rather symmetric dependency on our helper angle ψ. Note that the sign of ψ depends on β:

\[ -\,\pi/4 \:\le\: \psi \:=\:\, {1 \over 2} \operatorname{arcsin}\left( {-\, \beta \over \left[ \beta^2 \, +\, \left( \gamma \,-\, \alpha \right)^2 \, \right]^{1/2} } \right) \:\le\: \pi/4 \,.\]

What can we say under these conditions about the angle Φ and the choice between λ_I or λ_II for λ₁ and λ₂? The following table summarizes what we can logically derive. Note that, depending on another definition of ψ, we could have used somewhat other schemes, too, but the given scheme provides a convincing symmetry in the results.

Table 1 – rotation angle determination for different cases and helper angle ψ

[Image: table for the determination of the rotation angle of a matrix-defined ellipse]

This table tells you exactly what to do if you have a matrix A_q and choose an initial scaling of the half-axes such that

\[ \begin{align} \textbf{ Case 1 : } &\quad \lambda_2 \,-\, \lambda_1 \:\ge\: 0 \,, \\[10pt] \textbf{ Case 2 : } &\quad \lambda_2 \,-\, \lambda_1 \:\lt\: 0 \,. \end{align} \]

In case 1, you should assign λ_I (see eq. (101)) to λ₁ and λ_II (see eq. (102)) to λ₂ , in case 2 you should do the opposite. From the λ-values you then get the length of the respective half-axes. The rotation angle then follows from ψ as it was defined in eq. (50). This prescription can directly be encoded in Python programs.

Testing another helper angle : We could also have used another helper angle as e.g.

\[ \omega \::=\:\, {1 \over 2} \operatorname{arcsin}\left( {-\, \beta \over \lambda_2 \, -\, \lambda_1 } \,\right)\,. \tag{127} \]

Then ω flips signs with β and (λ₂ – λ₁). The following table shows the effect:

Table 2 – rotation angle determination for different cases and helper angle ω

[Image: second table for the rotation angle of a matrix-defined ellipse]

You see that as long as (λ₂ – λ₁) and (γ – α) have the same sign, then only the sign of β is relevant for the calculation of Φ.

Simplification for the determination of the rotation angle

Now, we come to an important point: If you look just from a geometrical point of view onto the tables and take into account the symmetries of an ellipse, you see that for given (fixed) values of α, β and γ

case #7 in the table is equivalent to case #1,
case #8 is equivalent to case #2,
case #5 is equivalent to case #3,
and case #6 is equivalent to case #4.

I leave it to the reader to visualize this with some schematic drawings. Please, also mark the upper part of both axes whilst rotating. You will see that we get an exact match of the final ellipse. I will provide plots for real examples in one of the next posts.

Actually, the equivalence of the named cases is a natural thing: We can get one and the same target ellipse from two initial configurations of an axis-parallel ellipse: Either the longer or the shorter axis is aligned with the x-axis. The angle to one and the same rotated ellipse is different by just π/2.

Our result means two things:

The upper part of the table covers already all visually identical cases.
The ambiguity in discriminating between Case 1 and Case 2 can be resolved:
When we get a matrix A_q describing some elliptic data, we can always choose (λ₂ – λ₁) > 0 and then determine the angle Φ from the upper part of table 1, without missing any solution.

We should see this in numerical tests.
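A first numerical test of this simplification might look like the sketch below. It does not reproduce Table 1 with the helper angle ψ. Instead, it fixes (λ₂ – λ₁) > 0 and resolves eqs. (44) and (45) with `atan2`, which is a swapped-in technique that returns Φ in the interval (−π/2, π/2]. The rotation convention (counterclockwise), the function name and the sample values are assumptions.

```python
import math
import numpy as np

def ellipse_from_Aq(alpha, beta, gamma):
    """Return (h1, h2, phi) for the choice lambda_2 > lambda_1, i.e. h1 >= h2.

    phi is the inclination of the longer half-axis h1 against the x-axis.
    """
    z = math.sqrt(beta**2 + (gamma - alpha)**2)    # equals lambda_2 - lambda_1
    lam1 = 0.5 * ((alpha + gamma) - z)             # lambda_I, eq. (101)
    lam2 = 0.5 * ((alpha + gamma) + z)             # lambda_II, eq. (102)
    # eqs. (44)/(45): cos(2 phi) ~ gamma - alpha, sin(2 phi) ~ -beta
    phi = 0.5 * math.atan2(-beta, gamma - alpha)
    return 1.0 / math.sqrt(lam1), 1.0 / math.sqrt(lam2), phi

# Round trip for a hypothetical ellipse with half-axes 3.0 and 1.0
h1_true, h2_true = 3.0, 1.0
results = []
for deg in (-80.0, -30.0, 0.0, 45.0, 85.0):
    p = math.radians(deg)
    R = np.array([[math.cos(p), -math.sin(p)],
                  [math.sin(p),  math.cos(p)]])
    A_q = R @ np.diag([1.0 / h1_true**2, 1.0 / h2_true**2]) @ R.T
    alpha, beta, gamma = A_q[0, 0], 2.0 * A_q[0, 1], A_q[1, 1]
    h1, h2, phi = ellipse_from_Aq(alpha, beta, gamma)
    results.append((deg, h1, h2, math.degrees(phi)))
    print(results[-1])
```

For each test angle the recovered half-axes and the recovered angle should agree with the input values up to rounding.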

Conclusion

In this post we have derived essential properties of centered, rotated ellipses from two matrix-based representations. Such calculations become relevant when e.g. experimental or numerical data only deliver the coefficients of a quadratic form for the ellipse. We have shown how the elements of relevant matrices are related to quantities like the lengths of the principal axes of the ellipse and the inclination of these axes against the x-axis of the chosen coordinate system.

We have seen that the determination of the ellipse’s rotation angle deserves some precision. Determining the right angle depends on characteristic properties of the matrix elements. We have to follow some rules which I have given in a table.

In one of the next posts of this mini-series,

Ellipses via matrix elements – II – numerical tests of formulas

I will briefly show plots of numerical examples, which illustrate some of the results of the present post.
