Please read this: Name your file homework3_lastname_firstname. This set is due 12/16 before class. Write your submission in R Markdown and send it to Fabian with the subject BDA: Homework 4 (submit both the .Rmd and the .pdf file). We encourage you to discuss the exercises with fellow students, but only individual solutions will be accepted.

1. Estimating a model’s evidence by clever sampling

Use the method of clever sampling introduced in the lecture to estimate the evidence of the exponential and the power model of forgetting curves. As the aligning distribution \(h\), use an approximation of the posterior by Gamma distributions, just like in class. The models are as before, with priors \(a,b,c,d \sim \text{Unif}(0,1.5)\). Let’s assume that the data are:

y = c(.93, .81, .39, .25, .24, .15)
t = c(  1,   3,   6,   9,  12,  18)
obs = y*100

Here’s what you need to do, step by step (once for each model):

  1. Get the posteriors for the new data, using JAGS and the implementation of the models from the lecture.

  2. Find Gamma approximations for the posterior, using the function getGammaApprox from the lecture. Notice that you have to massage the samples into the right format (as done in the lecture, where the function getGammaApprox was introduced).

  3. Use enough JAGS samples from the posterior distribution to calculate the following estimate of the inverse evidence, where \(h\) is given by your Gamma distributions:

\[ \begin{align*} \frac{1}{P(D)} & \approx \frac{1}{n} \sum_{i=1}^{n} \frac{h(\theta_i)}{P(D \mid \theta_i) \, P(\theta_i)}, \qquad \theta_i \sim P(\theta \mid D) \end{align*} \]

  4. Calculate the Bayes factor from the estimated evidences and interpret the result.
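The estimator used in the steps above can be tried out on a toy model whose evidence is known in closed form. The following is only a minimal sketch under stated assumptions, not the lecture's implementation: it uses a binomial likelihood with a Unif(0,1) prior instead of the forgetting models, draws exact posterior samples with rbeta in place of JAGS output, and fits the Gamma aligning distribution by moment matching as a hypothetical stand-in for getGammaApprox, whose internals are not given here.

```r
# Toy check of the clever-sampling estimator on a model with known evidence.
# Assumption: binomial likelihood with a Unif(0,1) prior on theta; this is a
# stand-in and NOT one of the forgetting models.
set.seed(123)
k <- 7
N <- 24

# Exact posterior samples (available by conjugacy) stand in for JAGS samples.
n <- 50000
theta <- rbeta(n, k + 1, N - k + 1)

# Moment-matched Gamma as aligning distribution h.
# Assumption: a hypothetical replacement for getGammaApprox from the lecture.
shape <- mean(theta)^2 / var(theta)
rate  <- mean(theta) / var(theta)

# Work on the log scale to avoid underflow in likelihood * prior.
log_h     <- dgamma(theta, shape = shape, rate = rate, log = TRUE)
log_lik   <- dbinom(k, N, theta, log = TRUE)
log_prior <- dunif(theta, 0, 1, log = TRUE)

# 1 / P(D) is approximated by the posterior mean of h / (likelihood * prior).
inv_evidence <- mean(exp(log_h - log_lik - log_prior))
evidence_hat <- 1 / inv_evidence

# Analytic evidence of the toy model: 1 / (N + 1).
evidence_true <- 1 / (N + 1)
print(c(estimate = evidence_hat, exact = evidence_true))
```

For the forgetting models the same recipe would presumably use one Gamma density per parameter, so that log_h becomes the sum of the parameters' log densities, and the Bayes factor is then the ratio of the two evidence estimates. Note that any mass the Gamma approximation puts outside the prior's support biases the estimate, so it is worth checking that this mass is negligible.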

2. Transdimensional MCMC

Our data are, as before, \(k=7\) and \(N=24\). Look at the two models that we considered in class for this case:

Use transdimensional MCMC to compare the models. Compute the posterior model odds, given equal prior probabilities for the two models. We calculated the Bayes factor in class, so you should be able to verify your results.
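The idea behind transdimensional MCMC can be sketched in base R with a model indicator that is sampled alongside the parameter. Everything model-specific here is an assumption made for illustration, since the two models from class are not restated above: M0 fixes \(\theta = 0.5\), M1 puts a Unif(0,1) prior on \(\theta\), and the pseudo-prior for \(\theta\) under M0 is taken to be the same uniform distribution. The sketch uses a hand-written Gibbs sampler rather than JAGS.

```r
# Product-space sketch of transdimensional MCMC with a model indicator m.
# Assumptions (hypothetical, for illustration only):
#   M0: theta fixed at 0.5;  M1: theta ~ Unif(0, 1);
#   pseudo-prior for theta while the chain visits M0: Unif(0, 1).
set.seed(2024)
k <- 7
N <- 24
n_iter <- 50000

m     <- integer(n_iter)  # model indicator: 0 = M0, 1 = M1
theta <- 0.5              # current value of M1's parameter

for (i in seq_len(n_iter)) {
  # Update the indicator given theta. Equal prior model probabilities cancel,
  # and prior and pseudo-prior both have density 1 on (0, 1).
  w1 <- dbinom(k, N, theta)  # M1: likelihood at the current theta
  w0 <- dbinom(k, N, 0.5)    # M0: likelihood at the fixed value 0.5
  m[i] <- rbinom(1, 1, w1 / (w0 + w1))

  # Update theta given the indicator: posterior draw under M1,
  # pseudo-prior draw under M0.
  theta <- if (m[i] == 1) rbeta(1, k + 1, N - k + 1) else runif(1)
}

# Posterior model odds in favor of M1; with equal model priors
# these odds estimate the Bayes factor.
post_m1   <- mean(m)
post_odds <- post_m1 / (1 - post_m1)

# Analytic Bayes factor for these assumed models, for comparison.
bf_exact <- (1 / (N + 1)) / dbinom(k, N, 0.5)
print(c(mcmc = post_odds, exact = bf_exact))
```

The pseudo-prior does not change the target posterior over models, only how well the chain moves between them; a pseudo-prior closer to the posterior of \(\theta\) under M1 would typically mix better.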

3. A night at the club

Jack & Jill debate whether they should enter club Whatever tonight. They only want to go if the women/men ratio is about even. Otherwise the atmosphere just isn’t, like, right, you know. Before they decide to pay, they screen the crowd that is lining up at the entrance. It is now 1:30 and there are 28 people in line, of whom 9 are women, 16 are men, and 3 are unidentifiable. How should they test whether the women/men ratio is even in this particular case? Should they use estimation, model comparison or \(p\)-value testing? If you cannot decide, discuss the pros and cons of each. (Think about how they should construe a likelihood function, whether a reasonable prior is at hand, whether the sampling procedure is reasonably clear, whether model complexity plays a role, etc. Suppose, for the sake of argument, that computational complexity is not a problem: Jack & Jill have brought their laptops, of course.)