\documentclass[11pt]{article}
\def\sol{1}
\usepackage{fullpage,graphicx}
\usepackage{amsmath,amsfonts,amsthm,amssymb,multirow}
\usepackage{algorithmic,comment,url,enumerate}
\usepackage{tikz}
\usepackage{framed}
\usetikzlibrary{decorations.pathreplacing, shapes}
\usepackage[ruled,vlined,commentsnumbered,titlenotnumbered]{algorithm2e}
\newcommand{\expecting}[1]{\noindent{[\textbf{We are expecting:} \em #1\em]}}
\newcommand{\hint}[1]{\noindent{[\textbf{HINT:} \em #1 \em]}}
\newcommand{\pts}[1]{\textbf{(#1 pt.)}}
\newcommand{\sgn}{\mathrm{sgn}}
\definecolor{shadecolor}{gray}{0.95}
\newcommand{\clarification}[1]{\noindent{[\textbf{Clarification:} \em #1 \em]}}
\newcommand{\bE}{\ensuremath{\mathbb{E}}}
\newcommand{\Var}{\operatorname*{Var}}
\newtheorem{theorem}{Theorem}
\newcommand{\Z}{\mathbb{Z}}
\begin{document}
\noindent
\textbf{Problem Set 2} \hfill CS265, Fall 2020 \newline
Due: October 2 (Friday) at 23:59 (Pacific Time)
\ifdefined\sol
\newline
Group members: INSERT NAMES HERE
\fi
%\today
\vspace{.4cm}\noindent
Please follow the homework policies on the course website.
\noindent
\rule{\linewidth}{0.4pt}
\begin{enumerate}
\item \pts{6} \textbf{[Gotta catch 'em all?]}
Let $M$ be an unknown set of molecules of size $|M|=n$ that are all present in a liquid solution. You want to identify the set $M$ using an experiment. One run of your experiment on the solution can identify and output a uniformly random molecule from the set $M$. You can conduct multiple experiments on this solution. Assume that the result of each experiment is independent of the others.
\begin{enumerate}
\item \pts{1} Give the best lower bound you can to the expected number of experiments you must run to identify all the $n$ distinct molecules in $M$. To identify a molecule, it must appear as the output of at least one experiment.
Use big Omega notation to report a simple answer.
\item \pts{4} Suppose the set $M$ of molecules is structured enough for the following to be possible. If you know any $0.99n$ of the items in $M$, you can infer the other $0.01n$. Thus you will stop conducting experiments after identifying $0.99n$ distinct molecules. What is the expected number of experiments? Show your work and use big O notation to report a simple answer.
\hint{Linearity of expectation is still your friend.}
\item \pts{2} Solution A contains molecules from a set $S$ of size $n$. However, $S$ has no helpful structure. To learn $S$ from Solution A, you use Strategy 1.
\textbf{Strategy 1:} Run experiments on Solution A until each of the $n$ molecules of $S$ has been observed as the output of an experiment at least once.
On the other hand, Solution B contains molecules from a different and larger set $S'$. $|S'|=10n$ and one can infer the set $S$ from $S'$. Moreover, $S'$ \textit{is} nicely structured. You can infer $S'$ from any of its subsets of size $9.9n$. To learn $S$ from Solution B, you use Strategy 2.
\textbf{Strategy 2:}
\begin{enumerate}
\item Run experiments on Solution B until at least $9.9n$ distinct molecules have appeared as the output of an experiment at least once each.
\item Infer the set $S$ from the subset of $S'$ of size $9.9n$ you now know.
\end{enumerate}
Your goal is to find the set $S$ and minimize the expected number of experiments you need to run. Do you choose Strategy 1 or 2?\footnote{This scenario is less contrived than you might think, and features in systems where information is stored in DNA. In these systems, enlarging the set from $S$ to $S'$ corresponds to using an error-correcting-code to add redundancy.} Provide a sentence or two of justification for your answer.
\item \pts{0} \textbf{[Optional: this won't be graded.]}
Can you strengthen the argument for your answer to part (c) by coming up with high probability statements for parts (a) and (b) rather than statements in expectation?
\hint{Try to compute an appropriate variance and use Chebyshev's inequality}
\end{enumerate}
\ifdefined\sol
\begin{shaded}
\textbf{SOLUTION:}
\end{shaded}
\fi
\item \pts{10} \textbf{[Markov's Inequality and Alice's Magic.]}
\begin{enumerate}
\item
\pts{4} Let $Z$ be some random variable supported on $(-\infty, m]$ with $\bE[Z] = \mu$.
What is the largest possible value of $\Pr[Z \leq t]$ over all such random variables $Z$?
Express your answer in terms of $m, \mu, t$ and justify it.
Be careful to consider all possible values of $m,\mu,t$.
\hint{Try applying Markov's inequality to $m - Z$. Do not forget to show an example distribution (which may depend on $m, \mu, t$) that achieves this largest possible $Pr[Z \leq t]$.}
\item
\pts{1} Let $X_0,\cdots,X_k$ be a sequence of numbers with the following properties:
\begin{enumerate}
\item $X_0 = 0$;
\item for every $i = 1,\cdots,k$, either $X_{i}\leq 100$ or $X_{i} = X_{i-1} + 1$,
i.e., $X_i \in (-\infty, 100] \cup \{X_{i-1} + 1\}$.
\end{enumerate}
What is the largest possible value of $X_k$?
Express your answer as a function of $k$.
You can assume $k$ is a positive integer,
and you do {\itshape not} need to justify your answer for this problem.
\item
\pts{5}
Alice is a magician who has granted you some large finite number of wishes. You want to use this to make easy money, having started off with $0$ dollars. Alice's magic is not perfect, so she can only grant you the following money-growing wish.
If you have $X \leq 99$ dollars when you ask for a wish, you will have $X'$ dollars when the wish is granted.
Here $X'$ is a random variable drawn from an unknown distribution supported on $(-\infty, 100]$ with mean $X+1$. If $X>99$, Alice's magic will change nothing and your wish will be wasted.
So, once you have more than $99$ dollars, you will stop using your wishes on money and use them for something else.\footnote{As you may have noticed,
you may be unfortunate and get negative money (debt) at some point.
Don't worry. This problem shows that you will very likely reach more than 99 dollars after making sufficiently many wishes.}
You want to understand how many wishes you will need to spend on this money-growing to get to $99$ dollars. Let $N$ denote the number of wishes you use to reach a sum of more than $99$ dollars.
Prove that $\Pr[N > k] \leq 99/k$ for any positive integer $k$.
\hint{There is an elegant solution that simply applies part (a) to
a cleverly constructed random variable $Z$.}
\hint{
You can still pretend to ask for wishes even after you have more than 99 dollars
and define the behavior of those wishes however you want. Try to define it wisely to make the problem easier.
You may find ideas from part (b) useful.}
\hint{There is also a solution that uses neither of the above hints.}
\end{enumerate}
\ifdefined\sol
\begin{shaded}
\textbf{SOLUTION:}
\end{shaded}
\fi
\item \pts{0} \textbf{[This whole problem is optional and will not be graded.]}
In this problem, you'll analyze a different primality test than we saw in class. This one is called the \em Agrawal-Biswas Primality test. \em
Given a degree $d$ polynomial $p(x)$ with integer coefficients, for any polynomial $q(x)$ with integer coefficients, we say $q(x) \equiv t(x) \mod(p(x),n)$ if there exists some polynomial $s(x)$ such that $q(x)=s(x)\cdot p(x)+t(x) \mod n$.
(Here, we say that $\sum_i c_i x^i = \sum_i c'_i x^i \mod n$ if and only if $c_i = c'_i \mod n$ for all $i$.)
%
For example, $x^5+6x^4+3x + 1 \equiv 3x +1 \mod(x^2+x,5)$, since $(x^3)(x^2+x) + (3x+1) = x^5 + x^4 +3x + 1 \equiv x^5 + 6x^4 +3x + 1 \mod 5.$
\bigskip
\boxed{
\begin{minipage}{\textwidth}
\noindent
\textbf{Agrawal-Biswas Primality Test}.
\newline
Given $n$:
\begin{itemize}
\item If $n$ is divisible by 2,3,5,7,11, or 13, or is a perfect power (i.e. $n=c^r$ for integers $c$ and $r$) then output \textbf{composite}.
\item Set $d$ to be the smallest integer greater than $\log n$, and choose a random degree $d$ polynomial with leading coefficient 1: $$r(x)=x^d+c_{d-1}x^{d-1}+\ldots+c_1 x + c_0,$$ by choosing each coefficient $c_i$ uniformly at random from $\{0,1,\ldots,n-1\}.$
\item If $(x+1)^n \equiv x^n +1 \mod (r(x),n)$ then output \textbf{prime}, else output \textbf{composite}.
\end{itemize}
\end{minipage}
}
Consider the following theorem (you can assume this if you like, or for even more optional work, try to prove it!):
\begin{theorem}[Polynomial version of Fermat's little theorem]
\
\begin{itemize}
\item If $n$ is prime, then for any integer $a$, $(x-a)^n=x^n-a \mod n.$
\item If $n$ is not prime and is not a power of a prime, then for any $a$ s.t. $gcd(a,n)=1$ and any prime factor $p$ of $n$, $(x-a)^n \neq x^n - a \mod p.$ %[Hint: you just need to show that if you were to expand the left side, there is one term other than $x^n$ and $-a$ that does not vanish, modulo $p$.]
\end{itemize}
\end{theorem}
First, show that if $n$ is prime, then the Agrawal-Biswas primality test will always return \textbf{prime}.
Now, we will prove that if $n$ is composite, the probability over random choices of $r(x)$ that the algorithm successfully finds a witness to the compositeness of $n$ (and hence returns \textbf{composite}) is at least $\frac{1}{4d}$ .
\begin{enumerate}
\item Using the polynomial version of Fermat's Little Theorem, and the fact that, for prime $q$, every polynomial over $\Z_q$ that has leading coefficient 1 (i.e. that is ``monic'') has a unique factorization into irreducible monic polynomials, prove that the number of irreducible degree $d$ factors that the polynomial $(x+1)^n-(x^n+1)$ has over $\Z_p$ is at most $n/d$, where $p$ is any prime factor of $n$. (A polynomial is irreducible if it cannot be factored, for example $x^2 +1 = (x+1)(x+1) \mod 2$ is not irreducible over $\Z_2$, but $x^2+1$ is irreducible over $\Z_3$.)
\hint{Even though this question sounds complicated, the proof is just one line...}
\item Let $f(d,p)$ denote the number of irreducible monic degree $d$ polynomials over $\Z_p$. Prove that if $n$ is composite, and not a power of a prime, the probability that $r(x)$ is a witness to the compositeness of $n$ is at least $\frac{f(d,p) - n/d}{p^d}$, where $p$ is a prime factor of $n$.
\hint{ $p^d$ is the total number of monic degree $d$ polynomials over $\Z_p$.}
\item Now complete the proof, and prove that the algorithm succeeds with probability at least $1/(4d)$, leveraging the fact that the number of irreducible monic polynomials of degree $d$ over $\mathbb{Z}_p$ is at least $p^d/d-p^{d/2}.$ (You should be able to prove a much better bound, though $1/4d$ is fine.)
\hint{You will also need to leverage the fact that we chose $d > \log n$ and also explicitly made sure that $n$ has no prime factors less than $17$.}
\end{enumerate}
\end{enumerate}
\end{document}