Please read this: Name your file homework2_lastname_firstname. This set is due 11/18 before class. The submission should be in R Markdown and should go to Fabian with the subject BDA: Homework 2 (submit both the .Rmd and the rendered .html / .pdf). We encourage discussing the exercises with fellow students, but only individual solutions will be accepted.
Read the blog post on p-value logic.
Explain in your own words what “power” is in this context.
Write a function, ppv, that gives the positive predictive value, and a function, fdr, that gives the false discovery rate. Both functions take as arguments the prior probability of \(H_1\), the nominal \(\alpha\) rate, and the statistical power \(1 - \beta\).
ppv <- function(prior, alpha, power) {
  # TODO
}

fdr <- function(prior, alpha, power) {
  # TODO
}
\[ P(H_1|p < \alpha) = \frac{P(p < \alpha|H_1)P(H_1)}{P(p < \alpha)} \]
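One way to approach the stubs, as a sketch rather than the required solution: assuming \(H_0\) and \(H_1\) exhaust the hypothesis space, the denominator expands by the law of total probability into \((1-\beta)P(H_1) + \alpha P(H_0)\).

```r
# Sketch: P(H1 | p < alpha) via Bayes' rule, with the denominator
# expanded as (1 - beta) * P(H1) + alpha * (1 - P(H1)).
ppv <- function(prior, alpha, power) {
  power * prior / (power * prior + alpha * (1 - prior))
}

# The false discovery rate is the complementary probability:
# P(H0 | p < alpha) = 1 - PPV.
fdr <- function(prior, alpha, power) {
  1 - ppv(prior, alpha, power)
}

ppv(prior = 0.5, alpha = 0.05, power = 0.8)  # 0.941...
fdr(prior = 0.5, alpha = 0.05, power = 0.8)  # 0.058...
```

Note how quickly the FDR grows when the prior probability of \(H_1\) is small, even at nominal \(\alpha = 0.05\).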
Given three coin flips, all coming up heads, what is the probability that the next coin flip will show heads? Give the prediction via maximum likelihood and – assuming an uninformative prior – Bayesian inference (use the mean, median, and mode of the posterior). Discuss your findings.
Note: Make every step explicit. How do you derive the MLE? How do you derive the posterior? Etc.
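For checking your derivation numerically, here is a sketch that assumes a Beta(1, 1) prior as the "uninformative" prior, so that three heads out of three flips yield a Beta(4, 1) posterior:

```r
k <- 3  # observed heads
n <- 3  # total flips

# Maximum likelihood: the sample proportion maximizes the binomial likelihood
mle <- k / n                          # 1

# Conjugate update: Beta(1, 1) prior -> Beta(k + 1, n - k + 1) posterior
a <- k + 1                            # 4
b <- n - k + 1                        # 1
postMean   <- a / (a + b)             # 0.8
postMedian <- qbeta(0.5, a, b)        # ~0.841
postMode   <- (a - 1) / (a + b - 2)   # 1 (here the mode sits on the boundary)
```

Since the posterior predictive probability of heads equals the posterior mean, the mean-based prediction differs from the MLE- and mode-based ones, which both predict heads with certainty.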
Let’s look at data on sales after spending various amounts on TV, radio, or newspaper advertising. We are interested in which advertising outlet “significantly” influences sales. Download the data from here, and import it like so:
dat <- read.csv('Advertising.csv', header = TRUE)[, -1]
classicalModel <- lm(Sales ~ TV + Radio + Newspaper, data = dat)
summary(classicalModel)
library('BayesFactor')
dat <- read.csv('Advertising.csv', header = TRUE)[, -1]
BayesModels <- regressionBF(Sales ~ TV + Radio + Newspaper, data = dat)
summary(BayesModels)
Look at this code whose MCMC sample chain output we inspected in class:
require('coda')
fakeData = rnorm(200, mean = 0, sd = 1)
f = function(mu, sigma){
  # zero density outside the support of the prior on sigma
  if (sigma <= 0){
    return(0)
  }
  # uniform priors on mu and sigma
  priorMu = dunif(mu, min = -4, max = 4)
  priorSigma = dunif(sigma, min = 0, max = 4)
  # likelihood of the data under Normal(mu, sigma)
  likelihood = prod(dnorm(fakeData, mean = mu, sd = sigma))
  # unnormalized posterior density
  return(priorMu * priorSigma * likelihood)
}
MH = function(f, iterations = 50, chains = 2, burnIn = 0){
  out = array(0, dim = c(chains, iterations - burnIn, 2))
  dimnames(out) = list("chain" = 1:chains,
                       "iteration" = 1:(iterations - burnIn),
                       "variable" = c("mu", "sigma"))
  for (c in 1:chains){
    # start each chain at a random draw from the priors
    mu = runif(1, min = -4, max = 4)
    sigma = runif(1, min = 0, max = 4)
    for (i in 1:iterations){
      # symmetric random-walk proposals (an asymmetric proposal would
      # bias the sampler without a Hastings correction)
      muNext = mu + runif(1, min = -0.25, max = 0.25)
      sigmaNext = sigma + runif(1, min = -0.25, max = 0.25)
      rndm = runif(1, 0, 1)
      # accept with probability min(1, f(proposal) / f(current))
      if (f(muNext, sigmaNext) >= f(mu, sigma) * rndm) {
        mu = muNext
        sigma = sigmaNext
      }
      # record the state only once burn-in is over (i > burnIn, so the
      # first stored index is 1, not 0)
      if (i > burnIn){
        out[c, i - burnIn, 1] = mu
        out[c, i - burnIn, 2] = sigma
      }
    }
  }
  # build one mcmc object per chain instead of hard-coding two chains
  return(mcmc.list(lapply(1:chains, function(ch) mcmc(out[ch, , ]))))
}
out = MH(f, iterations = 60000, chains = 2, burnIn = 10000)
# save(out, file = "out.Rdata")
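When inspecting chains like `out`, a standard convergence check is the Gelman–Rubin statistic reported by coda's gelman.diag. As a sketch of what that diagnostic computes (without coda's degrees-of-freedom correction), here is a hand-rolled \(\hat{R}\) for a single parameter, demonstrated on two toy chains so it runs stand-alone:

```r
# Gelman-Rubin R-hat: compares between-chain to within-chain variance;
# values near 1 suggest the chains have mixed.
rhat <- function(chains) {               # chains: list of numeric vectors
  n <- length(chains[[1]])
  W <- mean(sapply(chains, var))         # mean within-chain variance
  B <- n * var(sapply(chains, mean))     # between-chain variance
  varPlus <- (n - 1) / n * W + B / n     # pooled variance estimate
  sqrt(varPlus / W)
}

# Two well-mixed toy chains should give an R-hat close to 1
set.seed(1)
rhat(list(rnorm(1000), rnorm(1000)))
```

On the `out` object above you would typically just call gelman.diag(out) and effectiveSize(out) from coda, alongside the traceplots we looked at in class.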