- Markov Chain Monte Carlo methods
- Metropolis-Hastings
- Gibbs
- convergence / representativeness
- trace plots
- \(\hat{R}\) statistic
- efficiency
- autocorrelation
- effective sample size
Bayes rule for data analysis:
\[\underbrace{P(\theta \mid D)}_{\text{posterior}} \propto \underbrace{P(\theta)}_{\text{prior}} \times \underbrace{P(D \mid \theta)}_{\text{likelihood}}\]
normalizing constant:
\[ \int P(\theta') \times P(D \mid \theta') \, \text{d}\theta' = P(D) \]
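For a one-dimensional parameter, this integral can be approximated on a grid. A hedged sketch (the coin-flip data, 7 heads in 10 flips with a uniform prior, is made up for illustration):

```r
# grid approximation of the normalizing constant P(D)
thetaGrid  = seq(0.001, 0.999, length.out = 1000)    # grid over parameter space
stepSize   = diff(thetaGrid[1:2])                    # width of one grid cell
prior      = dunif(thetaGrid)                        # P(theta): uniform prior
likelihood = dbinom(7, size = 10, prob = thetaGrid)  # P(D | theta): 7 of 10 heads
# Riemann-sum approximation of the integral P(D)
evidence   = sum(prior * likelihood) * stepSize      # analytically 1/11 here
posterior  = prior * likelihood / evidence           # normalized posterior density
sum(posterior) * stepSize                            # integrates to 1
```

For higher-dimensional \(\theta\) this grid blows up exponentially, which is exactly why we resort to sampling.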
easy to solve only if:
# get density values and samples
library(ggplot2)
xVec = seq(-4, 4, length.out = 10000)
xDense = dnorm(xVec, mean = 0, sd = 1)      # density of a standard normal
xSamples = rnorm(10000, mean = 0, sd = 1)   # 10000 samples from a standard normal
# plot actual densities and estimated densities from samples
plotData = data.frame(xSamples = xSamples)  # put data in data frame for plotting
densData = data.frame(xVec = xVec, xDense = xDense)
myPlot = ggplot(data = plotData, aes(x = xSamples)) +  # prepare plot
  geom_density() +                                     # density estimate from samples
  geom_line(data = densData, aes(x = xVec, y = xDense), color = "blue") +  # actual density
  xlab("x") + ylab("(estimated) probability density") +  # axis labels
  theme(plot.background = element_blank())               # transparent background
show(myPlot)
examples:
problem:
solution:
intuition
Markov property
\[P(X_{n+1} = x_{n+1} \mid X_0 = x_0, \dots, X_n = x_n) = P(X_{n+1} = x_{n+1} \mid X_n = x_n)\]
get a sequence of samples from \(X \sim P\) such that every sample depends only on its immediate predecessor:
clever software helps determine when/how to do Gibbs and how to tune MH's proposal function (next class)
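Rolled by hand, the Metropolis-Hastings idea can be sketched as follows. This is a minimal illustration for a one-dimensional target with a Gaussian random-walk proposal; `mh_sample` is a made-up name, not the `MH` function from the homework:

```r
# minimal Metropolis-Hastings sketch (hypothetical, for illustration only)
# target: an UNNORMALIZED density function; the normalizing constant
# cancels in the acceptance ratio, which is the whole point of MH
mh_sample = function(target, nIter, start = 0, proposalSD = 1) {
  chain = numeric(nIter)
  chain[1] = start
  for (i in 2:nIter) {
    proposal = rnorm(1, mean = chain[i - 1], sd = proposalSD)  # propose next state
    # accept with probability min(1, target(proposal) / target(current))
    if (runif(1) < target(proposal) / target(chain[i - 1])) {
      chain[i] = proposal
    } else {
      chain[i] = chain[i - 1]  # reject: repeat the current state
    }
  }
  chain
}
set.seed(1)
# sample from a standard normal, given only its unnormalized density exp(-x^2/2)
samples = mh_sample(function(x) exp(-x^2 / 2), 20000)
```

After discarding a burn-in (say the first 5000 iterations), the remaining samples should have mean near 0 and standard deviation near 1, the moments of the standard normal target.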
convergence/representativeness
efficiency
require('coda')
require('ggmcmc')
coda and ggmcmc provide largely the same diagnostics; they differ mainly in output format and plotting aesthetics
out = MH(f, 60000, 2, 10000)  # this is the output of the program of Ex 4 in HW 2
                              # it's an MCMC-list from the 'coda' package: i.e., a list
                              # of MCMC-objects
show(summary(out))
## 
## Iterations = 1:50000
## Thinning interval = 1 
## Number of chains = 2 
## Sample size per chain = 50000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean      SD  Naive SE Time-series SE
## mu    0.004904 0.07475 0.0002364      0.0012889
## sigma 1.051322 0.05314 0.0001680      0.0008058
## 
## 2. Quantiles for each variable:
## 
##          2.5%     25%      50%    75%  97.5%
## mu    -0.1484 -0.0424 0.006156 0.0562 0.1471
## sigma  0.9537  1.0136 1.050294 1.0867 1.1619
out2 = ggs(out) # cast the same information into a format that 'ggmcmc' uses show(head(out2)) # this is now a data.frame
## Source: local data frame [6 x 4]
## 
##   Iteration Chain Parameter       value
##       (int) (int)    (fctr)       (dbl)
## 1         1     1        mu -0.01714470
## 2         2     1        mu -0.09520337
## 3         3     1        mu -0.09520337
## 4         4     1        mu -0.09520337
## 5         5     1        mu -0.09520337
## 6         6     1        mu -0.09520337
plot(out)
ggs_traceplot(out2)
EJ Wagenmakers:
\(\hat{R}\)-statistics,
gelman.diag(out)
## Potential scale reduction factors:
## 
##       Point est. Upper C.I.
## mu             1       1.00
## sigma          1       1.01
## 
## Multivariate psrf
## 
## 1
ggs_Rhat(out2)
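Under the hood, the (plain, non-split) \(\hat{R}\) compares between-chain and within-chain variance: if the chains have converged to the same distribution, the pooled variance estimate matches the within-chain variance and the ratio is close to 1. A hedged sketch of that computation (`rhat` is a made-up helper, not the `coda` implementation, which additionally applies sampling-variability corrections):

```r
# plain R-hat sketch for m chains of length n, one chain per matrix column
rhat = function(chains) {
  n = nrow(chains)
  W = mean(apply(chains, 2, var))    # mean within-chain variance
  B = n * var(colMeans(chains))      # between-chain variance
  varPlus = (n - 1) / n * W + B / n  # pooled variance estimate
  sqrt(varPlus / W)
}
set.seed(3)
good = cbind(rnorm(5000), rnorm(5000))       # two chains exploring the same target
bad  = cbind(rnorm(5000), rnorm(5000, 3))    # two chains stuck in different regions
rhat(good)  # close to 1
rhat(bad)   # well above 1
```

This is why \(\hat{R}\) requires multiple chains started from dispersed initial values: with a single chain there is no between-chain variance to compare against.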
autocorr.plot(out)
ggs_autocorrelation(out2)
\[\text{ESS} = \frac{N}{ 1 + 2 \sum_{k=1}^{\infty} ACF(k)}\]
effectiveSize(out)
##       mu    sigma 
## 3365.707 4355.494
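The ESS formula above can be approximated directly from the empirical autocorrelation function. A hedged sketch that truncates the infinite sum at the first non-positive autocorrelation (one common convention; `coda`'s `effectiveSize()` instead uses a spectral-density estimate, so the numbers will differ somewhat):

```r
# ESS sketch: N / (1 + 2 * sum of positive-lag autocorrelations)
ess = function(chain) {
  rho = acf(chain, lag.max = 200, plot = FALSE)$acf[-1]  # drop lag 0
  firstNonPos = which(rho <= 0)[1]                       # truncation point
  if (!is.na(firstNonPos)) rho = rho[seq_len(firstNonPos - 1)]
  length(chain) / (1 + 2 * sum(rho))
}
set.seed(2)
ar1 = as.numeric(arima.sim(list(ar = 0.9), n = 10000))  # strongly autocorrelated chain
ess(ar1)          # far below 10000
ess(rnorm(10000)) # close to 10000 for independent samples
```

For an AR(1) process with coefficient 0.9 the theoretical ESS is \(N/19\), so high autocorrelation can shrink 10000 raw samples to only a few hundred effectively independent ones.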