In another post on the properties of a Bivariate Normal Distribution [BVD] I motivated the form of its probability density function [pdf] by symmetry arguments and by the probability density functions of its marginals, namely 1-dimensional Gaussians. In this post we derive the probability density function along the line of argumentation used for a general Multivariate Normal Distribution [MVD]. We regard a BVD as the result of a linear transformation applied to a random vector of two independent 1-dimensional Gaussian random variables.
μj is the mean value and σj is the square root of the variance of the Wj distribution.
We use a 2-dimensional Cartesian coordinate system [CCS] to work with random vectors W = (W1, W2)T composed of such distributions, and with corresponding concrete vectors w. To make things simpler, we center the distributions by choosing an appropriate location for the origin of our CCS. We also standardize each of the distributions Wj. This gives us distributions Zj and a respective random vector Z, which assumes concrete vectors z (i.e., Z = z) according to a pdf gz(z):
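\[ Z_j \,=\, \frac{W_j - \mu_j}{\sigma_j} \,, \quad g_z(\pmb{z}) \,=\, \frac{1}{2\pi} \, \exp\left( -\frac{1}{2} \, \pmb{z}^T \bullet \pmb{I} \bullet \pmb{z} \right) \,=\, \frac{1}{2\pi} \, \exp\left( -\frac{1}{2} \left( z_1^2 + z_2^2 \right) \right) \]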
For the purpose of a later generalization I have introduced a coupling matrix. For our random vector Z, it is just the identity matrix I.
Reminder 2: Probability density function of a BVD
In another post we have already seen that we can write the probability density function g2(x, y) of a BVD in a characteristic vector notation for two correlated (Gaussian) random variables X and Y:
\[ \pmb{V} = \begin{pmatrix} X \\ Y \end{pmatrix}, \quad \mbox{concrete values}: \, \pmb{v} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \pmb{\mu} = \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \quad \pmb{v}_{\mu} = \begin{pmatrix} x - \mu_x \\ y - \mu_y \end{pmatrix} \,. \tag{5}
\]
g2c(x, y) is the probability density in a centered CCS, in which we have μ = 0. The coupling matrix in this case is the inverse of the so-called variance-covariance matrix (or just covariance matrix), which describes the correlation of X and Y:
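\[ \pmb{\Sigma} \,=\, \begin{pmatrix} \sigma_x^2 & \rho \, \sigma_x \sigma_y \\ \rho \, \sigma_x \sigma_y & \sigma_y^2 \end{pmatrix}, \quad \pmb{\Sigma}^{-1} \,=\, \frac{1}{\sigma_x^2 \sigma_y^2 \left( 1 - \rho^2 \right)} \begin{pmatrix} \sigma_y^2 & - \rho \, \sigma_x \sigma_y \\ - \rho \, \sigma_x \sigma_y & \sigma_x^2 \end{pmatrix} \]

Here ρ denotes the correlation coefficient of X and Y. In the centered CCS the density then takes the form

\[ g_{2c}(x, y) \,=\, \frac{1}{2\pi \, \sigma_x \sigma_y \sqrt{1 - \rho^2}} \, \exp\left( -\frac{1}{2} \, \pmb{v}^T \bullet \pmb{\Sigma}^{-1} \bullet \pmb{v} \right) \,. \]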
Below I only regard centered distributions with zero expectation values.
From two independent standardized Gaussian distributions to a BVD
We work in a centered CCS. We assume that we have a random vector of two independent standardized Gaussian distributions Z1 and Z2, each with the standardized Gaussian probability density defined above. We then apply a linear transformation via a 2×2-matrix M to get another centered random vector Vm:
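\[ \pmb{V}_m \,=\, \pmb{M} \bullet \pmb{Z}, \quad \begin{pmatrix} X \\ Y \end{pmatrix} \,=\, \pmb{M} \bullet \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} \]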
We assume that M is invertible (to exclude degenerate cases). The bullet above indicates standard matrix multiplication.
Regarding the probability density gv(vm) for a concrete vector vm = (x, y)T of the transformed random vector Vm, we must take into account the change of volume elements caused by this transformation, i.e. the Jacobian determinant of the transformation:
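\[ g_v(\pmb{v}_m) \,=\, g_z\left( \pmb{M}^{-1} \bullet \pmb{v}_m \right) \, \left| \det\left( \pmb{M}^{-1} \right) \right| \,=\, \frac{1}{2\pi \left| \det \pmb{M} \right|} \, \exp\left( -\frac{1}{2} \, \pmb{v}_m^T \bullet \left( \pmb{M} \bullet \pmb{M}^T \right)^{-1} \bullet \pmb{v}_m \right) \]

With the abbreviation Σm := M • MT and det Σm = (det M)2, this becomes

\[ g_v(\pmb{v}_m) \,=\, \frac{1}{2\pi \sqrt{\det \pmb{\Sigma}_m}} \, \exp\left( -\frac{1}{2} \, \pmb{v}_m^T \bullet \pmb{\Sigma}_m^{-1} \bullet \pmb{v}_m \right) \,. \]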
By comparison with the results in the above section “Reminder 2”, we see that we have already reached our desired form and that
we should identify Vm with V,
we should identify gv(vm) with g2c(x, y),
we should identify Σm with the covariance matrix Σ .
With M being invertible, one can show that Σm indeed is a positive definite symmetric matrix; a short argument is given below. What remains open is to prove that Σm really represents the covariance matrix of the transformed random vector.
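Symmetry and positive definiteness follow directly from the form Σm = M • MT:

\[ \pmb{\Sigma}_m^T \,=\, \left( \pmb{M} \bullet \pmb{M}^T \right)^T \,=\, \pmb{M} \bullet \pmb{M}^T \,=\, \pmb{\Sigma}_m \,, \quad \pmb{a}^T \bullet \pmb{\Sigma}_m \bullet \pmb{a} \,=\, \left( \pmb{M}^T \bullet \pmb{a} \right)^T \bullet \left( \pmb{M}^T \bullet \pmb{a} \right) \,>\, 0 \quad \mbox{for} \; \pmb{a} \neq \pmb{0} \,, \]

since MT • a ≠ 0 for an invertible M.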
Variance and covariance matrix of the transformed random vector
Let us determine the variance of our transformed random vector Vm. We follow a formal route based on the general definitions. Two aspects are important:
(1) An expectation value of a random vector is defined via the expectation values of its components. Thus, the expectation vector of a random vector S is just the vector composed of the expectation values of its (marginal) component distributions:
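\[ E\left[ \pmb{S} \right] \,=\, \begin{pmatrix} E\left[ S_1 \right] \\ E\left[ S_2 \right] \end{pmatrix} \]

(2) The covariance of a random vector is defined in a matrix-valued way, via an outer product of the centered random vector with its transposed counterpart:

\[ \operatorname{cov}\left( \pmb{S} \right) \,=\, E\left[ \left( \pmb{S} - E\left[ \pmb{S} \right] \right) \bullet \left( \pmb{S} - E\left[ \pmb{S} \right] \right)^T \right] \]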
Note the order of transposition in the definition of cov! A (vertical) vector is combined with a transposed (horizontal) vector. The rules of matrix multiplication then give a matrix as the result! It contains all pairwise combinations of the components.
The expectation value has to be determined for every element of the matrix. Thus, the interpretation of the notation above for the 2-dimensional case is: (a) Pick all pairwise combinations (Sj, Sk) of the component distributions. (b) Calculate the covariance cov(Sj, Sk) of each pair and put it at the (j, k)-place inside the matrix. See this post for more details. Note that the covariance of a distribution with itself is identical to the variance of that distribution: cov(S1, S1) = var(S1).
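For the 2-dimensional case this gives:

\[ \operatorname{cov}\left( \pmb{S} \right) \,=\, \begin{pmatrix} \operatorname{cov}\left( S_1, S_1 \right) & \operatorname{cov}\left( S_1, S_2 \right) \\ \operatorname{cov}\left( S_2, S_1 \right) & \operatorname{cov}\left( S_2, S_2 \right) \end{pmatrix} \,=\, \begin{pmatrix} \operatorname{var}\left( S_1 \right) & \operatorname{cov}\left( S_1, S_2 \right) \\ \operatorname{cov}\left( S_2, S_1 \right) & \operatorname{var}\left( S_2 \right) \end{pmatrix} \]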
The above matrix is the (variance-)covariance matrix of a 2-dim random vector S, which we also abbreviate as ΣS. For a general square transformation matrix M one can show that
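\[ \operatorname{cov}\left( \pmb{M} \bullet \pmb{S} \right) \,=\, E\left[ \pmb{M} \bullet \left( \pmb{S} - E\left[ \pmb{S} \right] \right) \bullet \left( \pmb{S} - E\left[ \pmb{S} \right] \right)^T \bullet \pmb{M}^T \right] \,=\, \pmb{M} \bullet \pmb{\Sigma}_S \bullet \pmb{M}^T \,, \]

using the linearity of the expectation value, i.e. E[M • S] = M • E[S].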
From the definition of the covariance it is easy to derive the following relations for our centered, standardized special random vector Z (with independent component distributions):
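\[ E\left[ Z_j \right] \,=\, 0 \,, \quad \operatorname{cov}\left( Z_j, Z_k \right) \,=\, \delta_{jk} \quad \Rightarrow \quad \pmb{\Sigma}_Z \,=\, \pmb{I} \quad \Rightarrow \quad \operatorname{cov}\left( \pmb{V}_m \right) \,=\, \pmb{M} \bullet \pmb{I} \bullet \pmb{M}^T \,=\, \pmb{\Sigma}_m \]

This closes the gap noted above: Σm is indeed the covariance matrix of the transformed random vector.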
The proof that ρ appearing in cov(X, Y) is the Pearson correlation coefficient has already been given in another post by explicit integration of gv(vm).
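As a quick numerical cross-check (a minimal sketch in Python/NumPy; the concrete matrix M below is an arbitrary invertible example of my choosing), one can sample Z, apply M and compare the empirical covariance of the transformed samples with M • MT:

```python
import numpy as np

rng = np.random.default_rng(42)

# An arbitrary invertible 2x2 transformation matrix M (example values)
M = np.array([[1.5, 0.6],
              [0.3, 1.1]])

# Draw many samples of Z = (Z1, Z2)^T: independent standardized
# Gaussians, arranged with one column per sample
n_samples = 200_000
Z = rng.standard_normal((2, n_samples))

# Apply the linear transformation: Vm = M • Z
Vm = M @ Z

# Empirical covariance matrix of the transformed samples ...
Sigma_emp = np.cov(Vm)

# ... should approximate Sigma_m = M • M^T
Sigma_m = M @ M.T

print("empirical covariance:\n", Sigma_emp)
print("M @ M.T:\n", Sigma_m)
```

For a large number of samples the empirical covariance approaches M • MT, in line with the result derived above.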
Conclusion
In this post we have shown that a general centered Bivariate Normal Distribution can be regarded as the result of a linear transformation of a random vector Z composed of two independent Gaussian distributions Z1 and Z2. We have confirmed the general form of the probability density function with the exponent written in vector form. The central matrix appearing there is the inverse of the covariance matrix of the transformed random vector (X, Y)T = M • Z.
Unfortunately, we so far have no explicit rule, based on the correlation coefficient ρ, for constructing correlated X, Y distributions from Z1, Z2. We will solve this problem in a forthcoming post on BVDs in this blog. Such a construction rule will later allow for an explicit parameterization of the contour lines of a BVD.