{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Decorrelating normally distributed data\n",
"\n",
"Let $A$ be a matrix and $\\mu$ a vector. If $Z$ is a row vector of independent standard normal random numbers,\n",
"then $ZA^T + \\mu^T$ is a vector of normally distributed random numbers with mean $\\mu$ and variance-covariance matrix\n",
"$$\n",
"\\Sigma = A A^T.\n",
"$$\n",
"\n",
"Conversely, if a sample $X$ from a multivariate normal distribution with mean $\\mu$ and positive definite\n",
"variance-covariance matrix $\\Sigma$ is given, then there is exactly one **lower triangular matrix** $A$ with positive diagonal entries such that\n",
"$$\n",
"\\Sigma = A A^T.\n",
"$$\n",
"\n",
"This **Cholesky factorization** is useful because $Z = (X - \\mu^T) (A^T)^{-1}$ is then standard normal: its covariance is $A^{-1} \\Sigma (A^{-1})^T = \\mathbf{I}$.\n",
"This transformation **decorrelates** the sample $X$. The Cholesky factorization can also\n",
"be used to generate random numbers from arbitrary multivariate normal distributions.\n",
"\n",
"If the requirement that $A$ be lower triangular is dropped, the factorization of a\n",
"variance-covariance matrix is **not unique**.\n",
"\n",
"### Example\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
" A &= \n",
" \\begin{pmatrix}\n",
" \\frac{\\sqrt{2}}{2} & \\frac{\\sqrt{2}}{2}\\\\\n",
" \\frac{\\sqrt{2}}{2} & -\\frac{\\sqrt{2}}{2}\n",
" \\end{pmatrix}\\\\\n",
" A A^T &= \\mathbf{I} \\\\\n",
" &= \\mathbf{I} ~\\mathbf{I}^T\n",
"\\end{aligned}\n",
"$$"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A matrix: 2 × 2 of type dbl</caption>\n",
"<tbody>\n",
"\t<tr><td>0.7071068</td><td> 0.7071068</td></tr>\n",
"\t<tr><td>0.7071068</td><td>-0.7071068</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A matrix: 2 × 2 of type dbl\n",
"\\begin{tabular}{ll}\n",
"\t 0.7071068 & 0.7071068\\\\\n",
"\t 0.7071068 & -0.7071068\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 2 × 2 of type dbl\n",
"\n",
"| 0.7071068 | 0.7071068 |\n",
"| 0.7071068 | -0.7071068 |\n",
"\n"
],
"text/plain": [
" [,1] [,2] \n",
"[1,] 0.7071068 0.7071068\n",
"[2,] 0.7071068 -0.7071068"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Orthogonal 2 x 2 matrix; the outer parentheses print the assigned value\n",
"(A <- matrix(c(1, 1, 1, -1), nrow = 2) / sqrt(2))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A matrix: 2 × 2 of type dbl</caption>\n",
"<tbody>\n",
"\t<tr><td>1</td><td>0</td></tr>\n",
"\t<tr><td>0</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A matrix: 2 × 2 of type dbl\n",
"\\begin{tabular}{ll}\n",
"\t 1 & 0\\\\\n",
"\t 0 & 1\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 2 × 2 of type dbl\n",
"\n",
"| 1 | 0 |\n",
"| 0 | 1 |\n",
"\n"
],
"text/plain": [
" [,1] [,2]\n",
"[1,] 1 0 \n",
"[2,] 0 1 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Variance-covariance matrix implied by A\n",
"(Sigma <- A %*% t(A))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A matrix: 2 × 2 of type dbl</caption>\n",
"<tbody>\n",
"\t<tr><td>1</td><td>0</td></tr>\n",
"\t<tr><td>0</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A matrix: 2 × 2 of type dbl\n",
"\\begin{tabular}{ll}\n",
"\t 1 & 0\\\\\n",
"\t 0 & 1\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A matrix: 2 × 2 of type dbl\n",
"\n",
"| 1 | 0 |\n",
"| 0 | 1 |\n",
"\n"
],
"text/plain": [
" [,1] [,2]\n",
"[1,] 1 0 \n",
"[2,] 0 1 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# chol() returns the upper triangular factor; its transpose is the lower triangular one\n",
"t(chol(Sigma))"
]
},
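{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because `Sigma` is the identity here, its Cholesky factor is trivially both upper and lower triangular, so the output\n",
"cannot show which convention `chol()` follows. The following is a minimal sketch with a hypothetical covariance matrix\n",
"`Sigma_demo` (its values, and all names ending in `_demo`, are assumptions chosen for illustration). It checks that\n",
"`chol()` returns the upper triangular factor, that the transpose is the lower triangular factor, and that multiplying\n",
"this factor by an orthogonal matrix gives a second, non-triangular factorization of the same matrix."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical covariance matrix, chosen only for illustration\n",
"Sigma_demo <- matrix(c(4, 2, 2, 3), nrow = 2)\n",
"\n",
"# chol() returns the upper triangular factor R with t(R) %*% R == Sigma_demo\n",
"R_demo <- chol(Sigma_demo)\n",
"# The lower triangular Cholesky factor is its transpose\n",
"L_demo <- t(R_demo)\n",
"\n",
"# Any orthogonal Q_demo gives another factor B_demo with B_demo %*% t(B_demo) == Sigma_demo\n",
"Q_demo <- matrix(c(1, 1, 1, -1), nrow = 2) / sqrt(2)\n",
"B_demo <- L_demo %*% Q_demo\n",
"\n",
"# Both checks compare the reconstructed product with Sigma_demo\n",
"print(all.equal(L_demo %*% t(L_demo), Sigma_demo))\n",
"print(all.equal(B_demo %*% t(B_demo), Sigma_demo))\n",
"\n",
"# B_demo is a valid factor but not triangular\n",
"B_demo"
]
},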
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The matrix $A$ is orthogonal: up to a reflection, it rotates all data points by $45°$. An orthogonal transformation\n",
"of a standard normal distribution is again standard normal, so the result is also (trivially) decorrelated\n",
"by the identity transformation. This is why the Cholesky factor computed above is the identity rather than $A$."
]
},
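{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two uses of the Cholesky factor described at the top of this notebook, generating correlated normal samples and\n",
"decorrelating them again, can be sketched as follows. The mean `mu_gen`, the covariance matrix `Sigma_gen`, the sample\n",
"size `n_gen` and the seed are assumptions made for illustration. Samples are stored as rows, matching the row-vector\n",
"convention $X = ZA^T + \\mu^T$. The sample covariances only approximate their targets, up to sampling error."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"set.seed(1)  # arbitrary seed, only for reproducibility\n",
"\n",
"# Hypothetical parameters of a bivariate normal distribution\n",
"mu_gen <- c(1, -2)\n",
"Sigma_gen <- matrix(c(4, 2, 2, 3), nrow = 2)\n",
"n_gen <- 10000\n",
"\n",
"# Lower triangular factor: chol() returns the upper factor, so transpose it\n",
"A_gen <- t(chol(Sigma_gen))\n",
"\n",
"# Generate: each row of Z_gen is a standard normal vector, X = Z A^T + mu^T\n",
"Z_gen <- matrix(rnorm(n_gen * 2), ncol = 2)\n",
"X_gen <- sweep(Z_gen %*% t(A_gen), 2, mu_gen, '+')\n",
"\n",
"# Decorrelate: Z = (X - mu^T) (A^T)^{-1}\n",
"Z_back <- sweep(X_gen, 2, mu_gen, '-') %*% solve(t(A_gen))\n",
"\n",
"print(round(cov(X_gen), 2))   # sample covariance of the generated data\n",
"print(round(cov(Z_back), 2))  # sample covariance after decorrelation"
]
},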
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.6.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}