Assignment 1
You may discuss homework problems with other students, but you have to prepare the written assignments yourself.
Please combine all your answers, the computer code and the figures into one PDF file, and submit a copy to your folder on Gradescope.
Grading scheme: 10 points per question, total of 50.
Due date: 11:59 PM Thursday April 17, 2025 (originally 11:59 PM Tuesday April 15, 2025).
Part A: From Mardia, Kent & Bibby
Problem 3.4.2
Problem 3.4.9
Problem 3.4.20
Problem 3.6.1
Problem 5.5.1
Part B
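Let \(\mathbb{R}^{p \times p} \ni W \sim \text{Wishart}(n, \Sigma)\). Show that the covariance of \(W\) satisfies, for symmetric matrices \(\Delta, \tilde{\Delta}\),
\[ \text{Cov}\left(\text{Tr}(\Delta W), \text{Tr}(\tilde{\Delta} W)\right) = 2n \, \text{Tr}(\Sigma \Delta \Sigma \tilde{\Delta}). \]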
(Hint: use the fact that the cumulant generating function (log moment generating function) of \(-W/2\) is \(-n/2 \log \det \Theta\) with \(\Theta=\Sigma^{-1}\).)
If \(X\) is a random matrix with covariance given by the Kronecker product \(\Sigma_R \otimes \Sigma_C\), show that
Conclude that a Wishart has covariance given by \(2 n \Sigma \otimes \Sigma\).
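A quick Monte Carlo check can make the target of this part concrete before attempting the proof. The sketch below assumes the identity in question is \(\text{Cov}(\text{Tr}(\Delta W), \text{Tr}(\tilde{\Delta} W)) = 2n \, \text{Tr}(\Sigma \Delta \Sigma \tilde{\Delta})\), which is consistent with the \(2 n \Sigma \otimes \Sigma\) conclusion. The particular \(\Sigma\), \(\Delta\), \(\tilde{\Delta}\), \(n\) and the number of draws are arbitrary illustrative choices, not part of the assignment.

```r
set.seed(1)
n <- 10                                        # Wishart degrees of freedom
Sigma   <- matrix(c(1, -0.4, -0.4, 1.3), 2, 2) # illustrative covariance
Delta   <- matrix(c(1, 0.5, 0.5, -1), 2, 2)    # symmetric direction (arbitrary)
Delta_t <- matrix(c(0, 1, 1, 2), 2, 2)         # second symmetric direction (arbitrary)

B <- 100000
W <- rWishart(B, df = n, Sigma = Sigma)        # p x p x B array of draws

# For symmetric matrices, Tr(Delta W) equals the sum of elementwise products.
tr1 <- apply(W, 3, function(w) sum(Delta * w))
tr2 <- apply(W, 3, function(w) sum(Delta_t * w))

empirical <- cov(tr1, tr2)
theory <- 2 * n * sum(diag(Sigma %*% Delta %*% Sigma %*% Delta_t))
c(empirical = empirical, theory = theory)
```

The two numbers should agree up to Monte Carlo error; agreement for a handful of directions is only a sanity check and does not replace the derivation via the cumulant generating function.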
Part C
Consider a bivariate normal distribution \(X \sim N(\mu, \Sigma)\) with
\[ \mu = \begin{pmatrix} 2 \\ -1 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 1 & -0.4 \\ -0.4 & 1.3 \end{pmatrix}. \]
1. Using the language of your choice, plot an ellipsoid that has 95% probability under this distribution.
2. Compute the law of \(X_2|X_1=x_1\) as an explicit function of \(x_1\) for \(\mu\) and \(\Sigma\) as above. What is the support of the law?
3. Compute the law of \(X|X_1=x_1\) as an explicit function of \(x_1\) for \(\mu\) and \(\Sigma\) as above. What is the support of the law?
4. Suppose \(\mu\) were unknown. Show that \(N=X - \Sigma[,1] \frac{X_1}{\Sigma[1,1]}\) is independent of \(X_1\). Show that the law of \(X | N\) depends only on \(\mu_1\) and \(\Sigma\) but not on \(\mu_2\). What is the support of \(X|N=n\)?
5. Suppose you had observed a value \(X=(2.32,-0.57)\). Compute \(n\), the observed value of \(N\), and describe the law of \(X|N=n\). Describe a set that contains 95% probability for this law.
6. Generalize 4. to dimension higher than 2: suppose \(\mathbb{R}^{n_1+n_2} \ni X = (X_1, X_2)\) with mean \((\mu_1, \mu_2)\) and a similarly partitioned covariance matrix. Construct \(N\) such that the law of \(X|N\) depends only on \(\mu_1\). What is the support of \(X|N=n\)? Does \(\Sigma\) need to be non-degenerate? Explain.
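As a starting point for the computational items, here is a minimal R sketch under standard Gaussian facts: the 95% region is taken to be the level set \(\{x : (x-\mu)^T \Sigma^{-1} (x-\mu) \le \chi^2_{2, 0.95}\}\), and the conditional law uses the usual Gaussian conditioning formulas. The helper names `cond_x2_given_x1` and `N_stat` are hypothetical, introduced only for illustration; the supports and the justification are still left to the written answer.

```r
mu <- c(2, -1)
Sigma <- matrix(c(1, -0.4, -0.4, 1.3), 2, 2)

# Boundary of {x : (x - mu)' Sigma^{-1} (x - mu) <= qchisq(0.95, 2)}:
# map the unit circle through a square root of Sigma, scale, and shift.
r <- sqrt(qchisq(0.95, df = 2))
theta <- seq(0, 2 * pi, length.out = 200)
circle <- rbind(cos(theta), sin(theta))   # 2 x 200 points on the unit circle
L <- t(chol(Sigma))                       # lower-triangular factor, Sigma = L L'
ellipse <- t(mu + r * L %*% circle)       # 200 x 2 boundary points

plot(ellipse, type = "l", asp = 1, xlab = "x1", ylab = "x2")
points(mu[1], mu[2], pch = 3)             # mark the mean

# Mean and variance of X2 given X1 = x1 (Gaussian conditioning).
cond_x2_given_x1 <- function(x1) {
  list(mean = mu[2] + Sigma[2, 1] / Sigma[1, 1] * (x1 - mu[1]),
       var  = Sigma[2, 2] - Sigma[2, 1]^2 / Sigma[1, 1])
}
cond_x2_given_x1(2.32)

# The statistic N from item 4, evaluated at an observed x.
N_stat <- function(x) x - Sigma[, 1] * x[1] / Sigma[1, 1]
N_stat(c(2.32, -0.57))
```

Using the Cholesky factor is one convenient choice of square root; an eigendecomposition would trace the same ellipse and additionally expose its principal axes.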
Part D
The two-sample Hotelling \(T^2\) test (with a large sample size, so we can
assume \(\Sigma\) is roughly known) looks for differences in all
possible directions \(a \in \mathbb{R}^p\) by maximizing a (suitably
standardized) test of \(H_{0,a}:\mu_1^Ta=\mu_2^Ta\) over all directions
\(a \in \mathbb{R}^p\). An alternative test would be to maximize only
over coordinate directions. Call this the maxT test. For the
covariance, use the sample covariance of the Sardinia.inland and Sardinia.coast
observations (the two Sardinian levels of region in the oliveoil data), ignoring the eicosenoic measurement.
library(pdfCluster)
data(oliveoil)
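One possible way to form the common covariance estimate is sketched below. It assumes that "the sample covariance of the two regions" means the pooled within-group covariance, which the text does not state explicitly; centering the two groups together would be a different estimate. The helper name `pooled_cov` is hypothetical, and the usage block assumes the data frame has columns named region and eicosenoic as the text suggests.

```r
# Pooled within-group sample covariance of two samples (rows = observations).
pooled_cov <- function(X1, X2) {
  X1 <- as.matrix(X1)
  X2 <- as.matrix(X2)
  n1 <- nrow(X1)
  n2 <- nrow(X2)
  ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)
}

# Usage on the olive oil data, run only if pdfCluster is installed.
if (requireNamespace("pdfCluster", quietly = TRUE)) {
  data(oliveoil, package = "pdfCluster")
  numeric_cols <- names(oliveoil)[sapply(oliveoil, is.numeric)]
  keep <- setdiff(numeric_cols, "eicosenoic")   # drop the ignored measurement
  inland <- oliveoil[oliveoil$region == "Sardinia.inland", keep]
  coast  <- oliveoil[oliveoil$region == "Sardinia.coast", keep]
  Sigma_hat <- pooled_cov(inland, coast)
}
```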
1. Construct distributions \(F_1 = N(\mu_1, \Sigma)\) and \(F_2 = N(\mu_2, \Sigma)\) for which the \(T^2\) test has noticeably more power than the maxT test. Provide plots as a function of sample size (say with fixed \(\mu_1-\mu_2\)) illustrating this difference in power. (How did you calibrate the maxT test?) (If this turns out to be difficult, you can choose your own covariance matrix instead of that from the oliveoil data.)
2. Repeat 1. but switching \(T^2\) and maxT. (If this turns out to be difficult, you can choose your own covariance matrix instead of that from the oliveoil data.)
3. Carry out the Hotelling \(T^2\) and maxT tests, comparing the two regions of Sardinia.
4. Do you think this data fits the hypothesized model reasonably well?
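A power comparison of this kind could be organized along the following lines. This is a sketch under simplifying assumptions, not a prescribed solution: \(\Sigma\) is treated as known, both groups have the same size \(n\), so that \(\sqrt{n/2}\,(\bar{X}_1 - \bar{X}_2) \sim N(\sqrt{n/2}\,(\mu_1 - \mu_2), \Sigma)\), and the maxT critical value is calibrated by Monte Carlo under the null. The equicorrelated `Sigma` and the shift `delta` are hypothetical choices made for illustration rather than taken from the oliveoil data.

```r
set.seed(2)
p <- 4
Sigma <- matrix(0.9, p, p)
diag(Sigma) <- 1                             # hypothetical equicorrelated covariance
delta <- 0.3 * c(1, -1, 1, -1)               # hypothetical mu_1 - mu_2
alpha <- 0.05
B <- 20000
L <- t(chol(Sigma))                          # Sigma = L L'
sds <- sqrt(diag(Sigma))
Sigma_inv <- solve(Sigma)

# Null draws of sqrt(n/2) * (Xbar_1 - Xbar_2) ~ N(0, Sigma); free of n.
Z0 <- t(L %*% matrix(rnorm(p * B), p, B))    # B x p
maxT_null <- apply(abs(sweep(Z0, 2, sds, "/")), 1, max)
crit_maxT <- quantile(maxT_null, 1 - alpha)  # Monte Carlo calibration of maxT
crit_T2 <- qchisq(1 - alpha, df = p)         # chi-square reference, Sigma known

# Fresh noise for the power estimates, shared across sample sizes.
Z1 <- t(L %*% matrix(rnorm(p * B), p, B))

power_at <- function(n) {
  Z <- sweep(Z1, 2, sqrt(n / 2) * delta, "+")  # shift by the standardized mean difference
  T2 <- rowSums((Z %*% Sigma_inv) * Z)         # z' Sigma^{-1} z for each draw
  maxT <- apply(abs(sweep(Z, 2, sds, "/")), 1, max)
  c(n = n, T2 = mean(T2 > crit_T2), maxT = mean(maxT > crit_maxT))
}

ns <- c(5, 10, 20, 40, 80)
power <- t(sapply(ns, power_at))
matplot(power[, "n"], power[, c("T2", "maxT")], type = "b", pch = 1:2,
        xlab = "n per group", ylab = "power")
legend("bottomright", legend = c("T2", "maxT"), pch = 1:2, col = 1:2, lty = 1:2)
```

Here the shift points along a direction in which \(\Sigma\) has small variance, which no single coordinate picks up well; a shift concentrated in one coordinate would be the natural candidate for the reverse comparison.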