%
% LaTeX Problem Set Template by Sachin Padmanabhan
% I created this when I was a freshman in CS 103,
% and I continue to use it to this day.
%
% Hope you enjoy!
%
% There may be problems with this template.
% If so, feel free to contact me.
%
\documentclass{article}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{mathdots}
\usepackage[pdftex]{graphicx}
\usepackage{fancyhdr}
\usepackage[margin=1in]{geometry}
\usepackage{multicol}
\usepackage{bm}
\usepackage{listings}
\PassOptionsToPackage{usenames,dvipsnames}{color} %% Allow color names
\usepackage{pdfpages}
\usepackage{algpseudocode}
\usepackage{tikz}
\usetikzlibrary{automata,positioning}
\usepackage{enumitem}
\usepackage[T1]{fontenc}
\usepackage{inconsolata}
\usepackage{framed}
\usepackage{wasysym}
\usepackage[thinlines]{easytable}
\usepackage{wrapfig}
\usepackage{hyperref}
\usepackage{mathrsfs}
\usepackage{tabularx} % also loads 'array' package
\newcolumntype{C}{>{\centering\arraybackslash}X} % centered version of 'X' columns
\title{\vspace{-1.5cm} CS 103: Mathematical Foundations of Computing\\Problem Set \#8}
\author{Sachin Padmanabhan}
\date{\today}
\lhead{Sachin Padmanabhan}
\chead{Problem Set \#8}
\rhead{\today}
\lfoot{}
\cfoot{CS 103: Mathematical Foundations of Computing --- Winter 2018}
\rfoot{\thepage}
\newcommand{\abs}[1]{\lvert #1 \rvert}
\newcommand{\absfit}[1]{\left\lvert #1 \right\rvert}
\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}
\newcommand{\eval}[3]{\left[#1\right]_{#2}^{#3}}
\renewcommand{\(}{\left(}
\renewcommand{\)}{\right)}
\newcommand{\floor}[1]{\left\lfloor#1\right\rfloor}
\newcommand{\ceil}[1]{\left\lceil#1\right\rceil}
\newcommand{\pd}[1]{\frac{\partial}{\partial #1}}
\newcommand{\inner}[1]{\langle#1\rangle}
\newcommand{\cond}{\bigg|}
\newcommand{\rank}[1]{\mathbf{rank}(#1)}
\newcommand{\range}[1]{\mathbf{range}(#1)}
\newcommand{\nullsp}[1]{\mathbf{null}(#1)}
\newcommand{\repr}[1]{\left\langle#1\right\rangle}
\newcommand\VRule[1][\arrayrulewidth]{\vrule width #1}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\tr}{tr}
\DeclareMathOperator{\Tr}{\mathbf{Tr}}
\DeclareMathOperator{\diag}{\mathbf{diag}}
\DeclareMathOperator{\dist}{\mathbf{dist}}
\DeclareMathOperator{\prob}{\mathbf{prob}}
\DeclareMathOperator{\dom}{\mathbf{dom}}
\DeclareMathOperator{\E}{\mathbf{E}}
\DeclareMathOperator{\Q}{\mathbb{Q}}
\DeclareMathOperator{\R}{\mathbb{R}}
\DeclareMathOperator{\var}{\mathbf{var}}
\DeclareMathOperator{\quartile}{\mathbf{quartile}}
\DeclareMathOperator{\conv}{\mathbf{conv}}
\DeclareMathOperator{\VC}{VC}
\DeclareMathOperator*{\argmax}{arg\,max}
\DeclareMathOperator*{\argmin}{arg\,min}
\DeclareMathOperator{\Ber}{Bernoulli}
\DeclareMathOperator{\NP}{\mathbf{NP}}
\DeclareMathOperator{\coNP}{\mathbf{coNP}}
\DeclareMathOperator{\TIME}{\mathsf{TIME}}
\DeclareMathOperator{\polytime}{\mathbf{P}}
\DeclareMathOperator{\PH}{\mathbf{PH}}
\DeclareMathOperator{\SIZE}{\mathbf{SIZE}}
\DeclareMathOperator{\ATIME}{\mathbf{ATIME}}
\DeclareMathOperator{\SPACE}{\mathbf{SPACE}}
\DeclareMathOperator{\ASPACE}{\mathbf{ASPACE}}
\DeclareMathOperator{\NSPACE}{\mathbf{NSPACE}}
\DeclareMathOperator{\Z}{\mathbb{Z}}
\DeclareMathOperator{\N}{\mathbb{N}}
\DeclareMathOperator{\EXP}{\mathbf{EXP}}
\DeclareMathOperator{\NEXP}{\mathbf{NEXP}}
\DeclareMathOperator{\NTIME}{\mathbf{NTIME}}
\DeclareMathOperator{\DTIME}{\mathbf{DTIME}}
\DeclareMathOperator{\poly}{poly}
\DeclareMathOperator{\BPP}{\mathbf{BPP}}
\DeclareMathOperator{\ZPP}{\mathbf{ZPP}}
\DeclareMathOperator{\RP}{\mathbf{RP}}
\DeclareMathOperator{\coRP}{\mathbf{coRP}}
\DeclareMathOperator{\BPL}{\mathbf{BPL}}
\DeclareMathOperator{\IP}{\mathbf{IP}}
\DeclareMathOperator{\PSPACE}{\mathbf{PSPACE}}
\DeclareMathOperator{\NPSPACE}{\mathbf{NPSPACE}}
\DeclareMathOperator{\SAT}{\mathsf{SAT}}
\DeclareMathOperator{\NL}{\mathbf{NL}}
\DeclareMathOperator{\PCP}{\mathbf{PCP}}
\DeclareMathOperator{\PP}{\mathbf{PP}}
\DeclareMathOperator{\cost}{cost}
\let\Pr\relax
\DeclareMathOperator*{\Pr}{\mathbf{Pr}}
\definecolor{shadecolor}{gray}{0.9}
\theoremstyle{plain}
\newtheorem*{lem}{Lemma}
\theoremstyle{plain}
\newtheorem*{claim}{Claim}
\theoremstyle{definition}
\newtheorem*{answer}{Answer}
\newtheorem{theorem}{Theorem}[section]
\newtheorem*{thm}{Theorem}
\newtheorem{corollary}{Corollary}[theorem]
\newtheorem{lemma}[theorem]{Lemma}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\footrulewidth}{0.4pt}
\setlength{\parindent}{0pt}
\pagestyle{fancy}
\renewcommand{\thefootnote}{\fnsymbol{footnote}}
\begin{document}
\maketitle
\vspace{-1.5cm}
\section{Designing CFGs}
For each of the following languages, design a CFG for that language. \textit{\textbf{Please use our online tool to design,
test, and submit the CFGs in this problem}}. To use it, visit the CS103 website and click the ``CFG
Editor'' link under the ``Resources'' header. You should only have one member from each team submit
your grammars; tell us who this person is when you submit the rest of the problems through GradeScope.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Given $\Sigma = \{a, b, c\}$, write a CFG for the language $\{\ w \in \Sigma^*\ |\ w$ contains $aa$ as a substring $\}$. For
example, the strings $aa, baac,$ and $ccaabb$ are all in the language, but $aba$ is not.
\item Given $\Sigma = \{a, b\}$, write a CFG for the language $\{\ w \in \Sigma^*\ |\ w$ is \textit{not} a palindrome $\}$, the language
of strings that are not the same when read forwards and backwards. For example, $aab \in L$ and $baabab \in L$, but $aba \not\in L$, $bb \not\in L$, and $\varepsilon \not\in L$. \\
\textit{\textcolor{blue}{ Don't try solving this one by starting with the CFG for palindromes and making modifications to it. In general, there's no way to mechanically turn a CFG for a language L into a CFG for the language $\overline{L}$, since the
context-free languages aren't closed under complementation. However, the idea of looking at the first and last characters of a given string might be a good idea. }}
\item Let $\Sigma$ be an alphabet containing these symbols:
\begin{equation*}
\varnothing \quad\quad \N \quad\quad \{ \quad\quad \} \quad\quad , \quad\quad \cup
\end{equation*}
We can form strings from these symbols which represent sets. Here's some examples:
\begin{center}
\begin{tabularx}{12cm}{X X X X}
$\varnothing$ & $\{\varnothing, \N\} \cup \N \cup\ \varnothing$ & $\{\varnothing\} \cup \N \cup \{\N\}$ & $\{\varnothing, \varnothing, \varnothing\}$ \\[0.1cm]
$\{\{\N, \varnothing\} \cup \{\varnothing\}\}$ & $\N \cup \{\N, \varnothing\}$ & $\{\}$ & $\{\N\}$ \\[0.1cm]
$\{\varnothing, \{\varnothing, \{\varnothing\}\}\}$ & $\{\{\{\{\N\}\}\}\}$ & $\N$ & $\{\varnothing, \{\}\}$
\end{tabularx}
\end{center}
Notice that some of these sets, like $\{\varnothing, \varnothing\}$ are syntactically valid but redundant, and others like
$\{\}$ are syntactically valid but not the cleanest way of writing things. Here's some examples of
strings that don't represent sets or aren't syntactically valid:
\begin{center}
\begin{tabularx}{12cm}{X X X X}
$\varepsilon$ & $\}\varnothing\{$ & $\varnothing\{\N\}$ & $\{\{\}$ \\[0.1cm]
$\N, \varnothing, \{\varnothing\}$ & $\{, \N\}$ & $\{ \N \varnothing \},$ & $\{,\}$ \\[0.1cm]
$\{\varnothing$ & $\}\} \N$ & $\{ \varnothing, \varnothing, \varnothing, \}$ & $\{\N, , , \varnothing\}$
\end{tabularx}
\end{center}
Write a CFG for the language $\{\ w \in \Sigma^*\ |\ w$ is a syntactically valid string representing a set $\}$. \\
When using the CFG tool, please use the letters n, u, and o in place of $\N, \cup,$ and $\varnothing$, respectively. \\
\textit{\textbf{Fun fact}}: The starter files for Problem Set One contain a parser that's designed to take as input a
string representing a set and to reconstruct what set that is. The logic we wrote to do that parsing
was based on a CFG we wrote for sets and set theory. Take CS143 if you're curious how to go
from a grammar to a parser! \\
\textit{\textcolor{blue}{ Test your CFG thoroughly! In Fall 2017, a quarter of the submissions we received weren't able to derive the string $\{\varnothing, \varnothing, \varnothing\}$. }}
\end{enumerate}
\begin{shaded}
Write the SUNetID of the person who submitted the CFGs here.
\end{shaded}
\newpage
\section{The Complexity of Addition}
This problem explores the following question:
\begin{center}
\textit{\textbf{How hard is it to add two numbers?}}
\end{center}
Suppose that we want to check whether $x + y = z$, where $x, y,$ and $z$ are all natural numbers. If we want to
phrase this as a problem as a question of strings and languages, we will need to find some way to standardize
our notation. In this problem, we will be using the \textit{\textbf{unary number system}}, a number system in
which the number $n$ is represented by writing out $n$ 1's. For example, the number 5 would be written as
11111, the number 7 as 1111111, and the number 12 as 111111111111. \\
Given the alphabet $\Sigma = \{1, +, =\}$, we can consider strings encoding $x + y = z$ by writing out $x, y,$ and $z$ in
unary. For example:
\begin{align*}
4 + 3 &= 7 \text{ would be encoded as }111+1111=1111111 \\
7 + 1 &= 8 \text{ would be encoded as }1111111+1=11111111 \\
0 + 1 &= 1 \text{ would be encoded as }+1=1
\end{align*}
Consider the alphabet $\Sigma = \{1, +, =\}$ and the following language, which we'll call \textit{ADD}:
\begin{equation*}
\{\ 1^m+1^n=1^{m+n}\ |\ m, n \in \N\ \}
\end{equation*}
For example, the strings 111+1=1111 and +1=1 are in the language, but 1+11=11 is not, nor is the string
1+1+1=111.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Prove or disprove: the language \textit{ADD} defined above is regular.
\begin{shaded}
Provide a proof or disproof here.
\end{shaded}
\item Write a context-free grammar for \textit{ADD}, showing that \textit{ADD} is context-free. (Please submit your CFG online.)
\textit{\textcolor{blue}{ You may find it easier to solve this problem if you first build a CFG for this language where you're allowed
to have as many numbers added together as you'd like. Once you have that working, think about how you?d
modify it so that you have exactly two numbers added together on the left-hand side of the equation. }}
\begin{shaded}
Write the SUNetID of the person who submitted the CFG here.
\end{shaded}
\end{enumerate}
\section{The Complexity of Pet Ownership}
This problem explores the following question:
\begin{center}
\textit{\textbf{How hard is it to walk your dog without a leash?}}
\end{center}
Let's imagine that you're going for a walk with your dog, but this time don't have a leash. As in Problem
Set Six and Problem Set Seven, let $\Sigma = \{y, d\}$, where $y$ means that you take a step forward and $d$ means
that your dog takes a step forward. A string in $\Sigma^*$ can be thought of as a series of events in which either
you or your dog moves forward one unit. For example, the string $yydd$ means that you take two steps forward,
then your dog takes two steps forward. \\
Let \textit{DOGWALK} $ = \{w \in \Sigma^*\ |\ w$ describes a series of steps where you and your dog arrive at the same
point$\}$. For example, the strings $yyyddd, ydyd,$ and $yyyddddddyyy$ are all in \textit{DOGWALK}.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Prove or disprove: the language \textit{DOGWALK} defined above is regular.
\begin{shaded}
Provide a proof or disproof here.
\end{shaded}
\item Write a context-free grammar for \textit{DOGWALK}, showing that \textit{DOGWALK} is context-free. (Please submit your CFG online.)
\textit{\textcolor{blue}{ Be careful, and test your CFG! As you saw in lecture, a lot of ideas that seem plausible here don't work. }}
\begin{shaded}
Write the SUNetID of the person who submitted the CFG here.
\end{shaded}
\end{enumerate}
\section{RNA Hairpins}
RNA strands consist of strings of \textit{nucleotides}, molecules which encode genetic information. Computational
biologists typically represent each RNA strand as a string made from four different letters, A, C, G,
and U, each of which represents one of the four possible nucleotides. \\
Each of the the four nucleotides has an affinity for a specific other nucleotide. Specifically:
\begin{center}
A has an affinity for U (and vice-versa) \quad\quad\quad\quad C has an affinity for G (and vice-versa)
\end{center}
This can cause RNA strands to fold over and bind with themselves. Consider this RNA strand:
\begin{center}
\begin{tikzpicture}[auto,node distance=1cm, minimum size=0.7cm,
thick,main node/.style={circle,draw}]
\node[main node] (1) {G};
\node[main node] (2) [right of=1] {A};
\node[main node] (3) [right of=2] {U};
\node[main node] (4) [right of=3] {U};
\node[main node] (5) [right of=4] {A};
\node[main node] (6) [right of=5] {C};
\node[main node] (7) [right of=6] {A};
\node[main node] (8) [right of=7] {G};
\node[main node] (9) [right of=8] {G};
\node[main node] (10) [right of=9] {U};
\node[main node] (11) [right of=10] {A};
\node[main node] (12) [right of=11] {A};
\node[main node] (13) [right of=12] {U};
\node[main node] (14) [right of=13] {C};
\path[line width=1mm, style={font=\sffamily\small}]
(1) edge node {} (2)
(2) edge node {} (3)
(3) edge node {} (4)
(4) edge node {} (5)
(5) edge node {} (6)
(6) edge node {} (7)
(7) edge node {} (8)
(8) edge node {} (9)
(9) edge node {} (10)
(10) edge node {} (11)
(11) edge node {} (12)
(12) edge node {} (13)
(13) edge node {} (14);
\end{tikzpicture}
\end{center}
If you perfectly fold this RNA strand in half, you get the following:
\begin{center}
\begin{tikzpicture}[auto,node distance=1cm, minimum size=0.7cm,
thick,main node/.style={circle,draw}]
\node[main node] (1) {G};
\node[main node] (2) [right of=1] {A};
\node[main node] (3) [right of=2] {U};
\node[main node] (4) [right of=3] {U};
\node[main node] (5) [right of=4] {A};
\node[main node] (6) [right of=5] {C};
\node[main node] (7) [right of=6] {A};
\node[main node] (8) [below of=7] {G};
\node[main node] (9) [left of=8] {G};
\node[main node] (10) [left of=9] {U};
\node[main node] (11) [left of=10] {A};
\node[main node] (12) [left of=11] {A};
\node[main node] (13) [left of=12] {U};
\node[main node] (14) [left of=13] {C};
\path[line width=1mm, style={font=\sffamily\small}]
(1) edge node {} (2)
(2) edge node {} (3)
(3) edge node {} (4)
(4) edge node {} (5)
(5) edge node {} (6)
(6) edge node {} (7)
(7) edge node {} (8)
(8) edge node {} (9)
(9) edge node {} (10)
(10) edge node {} (11)
(11) edge node {} (12)
(12) edge node {} (13)
(13) edge node {} (14);
\end{tikzpicture} \quad\quad
\begin{tikzpicture}[auto,node distance=1cm, minimum size=0.7cm,
thick,main node/.style={circle,draw}]
\node[main node] (1) {G};
\node[main node] (2) [right of=1] {A};
\node[main node] (3) [right of=2] {U};
\node[main node] (4) [right of=3] {U};
\node[main node] (5) [right of=4] {A};
\node[main node] (6) [right of=5] {C};
\node[main node] (7) [right of=6] {A};
\node[main node] (8) [below of=7] {G};
\node[main node] (9) [left of=8] {G};
\node[main node] (10) [left of=9] {U};
\node[main node] (11) [left of=10] {A};
\node[main node] (12) [left of=11] {A};
\node[main node] (13) [left of=12] {U};
\node[main node] (14) [left of=13] {C};
\path[line width=1mm, style={font=\sffamily\small}]
(1) edge node {} (2)
edge[line width=0.5mm,dotted] node {} (14)
(2) edge node {} (3)
edge[line width=0.5mm,dotted] node {} (13)
(3) edge node {} (4)
edge[line width=0.5mm,dotted] node {} (12)
(4) edge node {} (5)
edge[line width=0.5mm,dotted] node {} (11)
(5) edge node {} (6)
edge[line width=0.5mm,dotted] node {} (10)
(6) edge node {} (7)
edge[line width=0.5mm,dotted] node {} (9)
(7) edge node {} (8)
(8) edge node {} (9)
(9) edge node {} (10)
(10) edge node {} (11)
(11) edge node {} (12)
(12) edge node {} (13)
(13) edge node {} (14);
\end{tikzpicture}
\end{center}
Notice that each pair of nucleotides -- except for the A and the G on the far right -- are attracted to the corresponding
nucleotide on the other side of the RNA strand. Because of the natural affinities of the nucleotides
in the RNA strand, the RNA strand will be held in this shape. This is an example of an \textit{RNA hairpin},
a structure with important biological roles. \\
For the purposes of this problem, we'll say that an RNA strand forms a hairpin if
\begin{itemize}
\item it has even length (so that it can be cleanly folded in half);
\item it has length at least ten (there are at least four pairs holding the hairpin shut); and
\item all of its nucleotides, except for the middle two, have an affinity for its corresponding nucleotide
when folded over. (The middle two nucleotides in a hairpin might coincidentally have an affinity
for one another, but it's not required. For example, CCCC\underline{AU}GGGG forms a hairpin.)
\end{itemize}
This problem explores the following question:
\begin{center}
\textit{\textbf{How hard is it to determine whether an RNA strand forms a hairpin?}}
\end{center}
Let $\Sigma = \{A, C, G, U\}$ and let $L_{RNA} = \{\ w \in \Sigma^*\ |\ w$ represents an RNA strand that forms a hairpin $\}$. For example,
the strings UGACCCGUCA, GUACAAGUAC, UUUUUUUUUAAAAAAAAA, and CCAACCUUGG are all in $L_{RNA}$,
but the strings AU, AAAAACUUUUU, GGGC, and GUUUUAAAAG are all not in $L_{RNA}$.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Prove that $L_{RNA}$ is not regular. Since this language imposes a lot of requirements on the strings it
contains, if in the course of your proof you want to claim that a particular string is or is not in
$L_{RNA}$, please articulate clearly why the string does or does not meet each of the requirements of
strings in $L_{RNA}$.
\textit{\textcolor{blue}{ There's a good amount of trial and error required here. Test your proof carefully -- pick concrete examples
of strings from your set and make sure that your argument really does work for them. The fact that the two
middle characters don't have to match makes this a little bit trickier than you might initially suspect. }}
\begin{shaded}
Provide a proof here.
\end{shaded}
\item Design a CFG for $L_{RNA}$, which proves that the language is context-free. (Please submit it online.)
\begin{shaded}
Write the SUNetID of the person who submitted the CFG here.
\end{shaded}
\end{enumerate}
\section{Right-Linear Grammars}
A context-free grammar is called a \textit{\textbf{right-linear grammar}} if every production in the grammar has one of
the following three forms:
\begin{itemize}
\item $A \rightarrow \varepsilon$
\item $A \rightarrow B$, where $B$ is a nonterminal.
\item $A \rightarrow aB$, where $a$ is a terminal and $B$ is a nonterminal.
\end{itemize}
For example, the following is a right-linear grammar:
\begin{itemize}[label={}]
\item $A \rightarrow aB\ |\ bB\ |\ \varepsilon$
\item $B \rightarrow aC\ |\ bA\ |\ C$
\item $C \rightarrow bA\ |\ aA\ |\ \varepsilon$
\end{itemize}
The right-linear grammars are all context-free grammars, so their languages are all context-free. However,
it turns out that this class of grammars precisely describe the regular languages. That is, a language
$L$ is regular if and only if there is a right-linear grammar $G$ such that $L = \mathscr{L}(G)$.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Let $G$ be a right-linear grammar. Describe how to construct an NFA $N$ such that $\mathscr{L}(G) = \mathscr{L}(N)$.
Briefly justify why $N$ accepts precisely the strings that $G$ can generate. To illustrate your construction, show the NFA you'd build from the following right-linear
grammar:
\begin{itemize}[label={}]
\item $A \rightarrow aB\ |\ bC$
\item $B \rightarrow aB\ |\ \varepsilon$
\item $C \rightarrow aD\ |\ A\ |\ bC$
\item $D \rightarrow aD\ |\ bD\ |\ \varepsilon$
\end{itemize}
\begin{shaded}
Write the SUNetID of the person who submitted the NFA here.
\end{shaded}
\item Let $N$ be an NFA. Describe how to construct a right-linear grammar $G$ such that $\mathscr{L}(G) = \mathscr{L}(N)$.
Briefly justify why $N$ accepts precisely the strings that $G$ can generate. To illustrate your construction, show the grammar that you'd build from the following NFA:
\begin{center}
\begin{tikzpicture}[shorten >=1pt,node distance=2cm,on grid,auto]
% \node[state, options?] (identifier) [placement] {display};
\node[state, initial] (0) {$q_0$};
\node[state] (1) [right=of 0] {$q_1$};
\node[state, accepting] (2) [right=of 1] {$q_2$};
\node[state, accepting] (3) [below=of 0] {$q_3$};
\node[state, accepting] (4) [right=of 3] {$q_4$};
\node[state] (5) [right=of 4] {$q_5$};
% (start identifier) edge [options?] node {display label} (target identifier)
% edge [options?] node {display label} (another target identifier)
\path[->]
(0) edge node {$a$} (1)
edge [loop above] node {$a$} (0)
edge [bend left=10] node {$\varepsilon$} (3)
(1) edge node {$a, \varepsilon$} (2)
edge [loop above] node {$a$} (1)
edge node {$b$} (4)
(2) edge [bend left=10] node {$a$} (5)
edge [loop above] node {$a, b$} (2)
(3) edge [bend left=10] node {$b$} (0)
(4) edge node {$\varepsilon$} (5)
(5) edge [bend left=10] node {$b$} (2)
edge [loop right] node {$a$} (5);
\end{tikzpicture}
\end{center}
\textit{\textcolor{blue}{ We're expecting you to actually apply your construction to this NFA, not to eyeball the NFA and find a
grammar that happens to have the same language. The point of doing this is to make sure that you've
thought through all of the necessary cases. }}
\begin{shaded}
Write the SUNetID of the person who submitted the CFG here.
\end{shaded}
\end{enumerate}
\section{The Collatz Conjecture}
The \textit{Collatz conjecture} is a famous conjecture (an unproved claim) that says the following procedure
(called the \textit{hailstone sequence}) terminates for all positive natural numbers $n$:
\begin{itemize}
\item If $n = 1$, stop.
\item If $n$ is even, set $n = \frac{n}{2}$ and repeat from the top.
\item If $n$ is odd, set $n = 3n + 1$ and repeat from the top.
\end{itemize}
Let $L = \{\ 1^n\ |\ n \ge 1$ and the hailstone sequence terminates for $n\ \}$ be a language over the singleton alphabet
$\Sigma = \{1\}$. It turns out that it's possible to build a TM for this language which means that $L \in$ \textbf{RE}, and in this problem you'll do just that. Parts (i) and (ii)
will ask you to design two useful subroutines, and you'll assemble the overall machine in part (iii).
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Design a TM subroutine that, given a tape holding a string of the form $1^{2n}$ (where $n \in \N$) surrounded
by infinitely many blanks, ends with $1^n$ written on the tape, surrounded by infinitely many
blanks. You can assume the tape head begins reading the first 1 (or points to an arbitrary blank
cell in the case where $n = 0$), and your TM should end with the tape head reading the first 1 of the
result (or any blank cell if $n = 0$). For example, given the initial configuration
\begin{center}
\begin{tikzpicture}[scale=0.5]
% tape
\draw (0,0) rectangle (1,1) node[pos=.5] {$\ldots$};
\draw (1,0) rectangle (2,1);
\foreach \x in {2,...,9} { % cells with 1s
\draw (\x,0) rectangle (\x+1,1) node[pos=.5] {1};
}
\draw (10,0) rectangle (11,1);
\draw (11,0) rectangle (12,1) node[pos=.5] {$\ldots$};
% arrow
\def\x{2.5} % x-coord of bottom point of arrow
\def\y{1} % y-coord of bottom point of arrow
\coordinate (0) at (\x,\y);
\coordinate (1) at (\x+0.5, \y+0.5);
\coordinate (2) at (\x+0.25, \y+0.5);
\coordinate (3) at (\x+0.25, \y+1);
\coordinate (4) at (\x-0.25, \y+1);
\coordinate (5) at (\x-0.25, \y+0.5);
\coordinate (6) at (\x-0.5, \y+0.5);
\filldraw[draw=black, fill=black] (0) -- (1) -- (2) -- (3) -- (4) -- (5) -- (6) -- cycle;
\end{tikzpicture}
\end{center}
your TM subroutine would end with this configuration:
\begin{center}
\begin{tikzpicture}[scale=0.5]
% tape
\draw (0,0) rectangle (1,1) node[pos=.5] {$\ldots$};
\draw (1,0) rectangle (2,1);
\foreach \x in {2,...,5} { % cells with 1s
\draw (\x,0) rectangle (\x+1,1) node[pos=.5] {1};
}
\foreach \x in {6,...,10} { % blank cells
\draw (\x,0) rectangle (\x+1,1);
}
\draw (11,0) rectangle (12,1) node[pos=.5] {$\ldots$};
% arrow
\def\x{2.5} % x-coord of bottom point of arrow
\def\y{1} % y-coord of bottom point of arrow
\coordinate (0) at (\x,\y);
\coordinate (1) at (\x+0.5, \y+0.5);
\coordinate (2) at (\x+0.25, \y+0.5);
\coordinate (3) at (\x+0.25, \y+1);
\coordinate (4) at (\x-0.25, \y+1);
\coordinate (5) at (\x-0.25, \y+0.5);
\coordinate (6) at (\x-0.5, \y+0.5);
\filldraw[draw=black, fill=black] (0) -- (1) -- (2) -- (3) -- (4) -- (5) -- (6) -- cycle;
\end{tikzpicture}
\end{center}
You can assume that there are an even number of 1s on the tape at startup and can have your TM
behave however you'd like if this isn't the case. Please use our provided TM editor to design, develop,
test, and submit your answer to this question. Since our TM tool doesn't directly support
subroutines, just have your machine accept when it's done.
\textit{\textcolor{blue}{ For reference, our solution has fewer than 10 states. If you have significantly more than this and are struggling to get your TM working, you might want to change your approach. It's totally fine if you have a
bunch of states, provided that your solution works. }}
\textit{\textcolor{blue}{ There are a lot of different solutions here. Some use very little extra tape. Some use a lot of extra tape. Some don't need any other tape symbols. Some do. Be creative, try things out, and don't be afraid to back up and
try something else if your approach doesn't seem to be panning out. }}
\item Design a TM subroutine that, given a tape holding a string of the form $1^n$ (for some $n \in \N$), surrounded
by infinitely many blanks, ends with $1^{3n+1}$ written on the tape, surrounded by infinitely
many blanks. You can assume that the tape head begins reading the first 1, and your TM should
end with the tape head reading the first 1 of the result. For example, given this configuration
\begin{center}
\begin{tikzpicture}[scale=0.5]
% tape
\draw (0,0) rectangle (1,1) node[pos=.5] {$\ldots$};
\draw (1,0) rectangle (2,1);
\foreach \x in {2,...,4} { % cells with 1s
\draw (\x,0) rectangle (\x+1,1) node[pos=.5] {1};
}
\foreach \x in {5,...,12} { % blank cells
\draw (\x,0) rectangle (\x+1,1);
}
\draw (13,0) rectangle (14,1) node[pos=.5] {$\ldots$};
% arrow
\def\x{2.5} % x-coord of bottom point of arrow
\def\y{1} % y-coord of bottom point of arrow
\coordinate (0) at (\x,\y);
\coordinate (1) at (\x+0.5, \y+0.5);
\coordinate (2) at (\x+0.25, \y+0.5);
\coordinate (3) at (\x+0.25, \y+1);
\coordinate (4) at (\x-0.25, \y+1);
\coordinate (5) at (\x-0.25, \y+0.5);
\coordinate (6) at (\x-0.5, \y+0.5);
\filldraw[draw=black, fill=black] (0) -- (1) -- (2) -- (3) -- (4) -- (5) -- (6) -- cycle;
\end{tikzpicture}
\end{center}
your TM subroutine would end with this confguration:
\begin{center}
\begin{tikzpicture}[scale=0.5]
% tape
\draw (0,0) rectangle (1,1) node[pos=.5] {$\ldots$};
\draw (1,0) rectangle (2,1);
\foreach \x in {2,...,11} { % cells with 1s
\draw (\x,0) rectangle (\x+1,1) node[pos=.5] {1};
}
\draw (12,0) rectangle (13,1);
\draw (13,0) rectangle (14,1) node[pos=.5] {$\ldots$};
% arrow
\def\x{2.5} % x-coord of bottom point of arrow
\def\y{1} % y-coord of bottom point of arrow
\coordinate (0) at (\x,\y);
\coordinate (1) at (\x+0.5, \y+0.5);
\coordinate (2) at (\x+0.25, \y+0.5);
\coordinate (3) at (\x+0.25, \y+1);
\coordinate (4) at (\x-0.25, \y+1);
\coordinate (5) at (\x-0.25, \y+0.5);
\coordinate (6) at (\x-0.5, \y+0.5);
\filldraw[draw=black, fill=black] (0) -- (1) -- (2) -- (3) -- (4) -- (5) -- (6) -- cycle;
\end{tikzpicture}
\end{center}
You can assume the number of 1s on the tape at startup is odd and can have your TM behave
however you'd like if this isn't the case. Please use our provided TM editor to design, develop,
test, and submit your answer to this question. Since our TM tool doesn't directly support subroutines,
just have your machine accept when it's done. \textit{(For reference, our solution has fewer than 10
states. If you have significantly more than this, you might want to change your approach.)}
\item Draw the state transition diagram for a Turing machine M that recognizes $L$. Our TM tool is configured
for this problem so that you can use our reference solutions for parts (i) and (ii) as subroutines
in your solution. To do so, follow these directions:
\begin{enumerate}[label=\arabic*.,ref=\arabic*]
\item Create states named \textbf{\texttt{half}}, \textbf{\texttt{half\_}}, \textbf{\texttt{trip}}, and \textbf{\texttt{trip\_}}.
\item To execute the subroutine that converts $1^{2n}$ into $1^n$, transition into the state named \textbf{\texttt{half}}. When
that subroutine finishes, the TM will automagically jump into the state labeled \textbf{\texttt{half\_}}. You do
not need to -- and should not -- define any transitions into \textbf{\texttt{half\_}} or out of \textbf{\texttt{half}}.
\item To execute the subroutine that converts $1^n$ into $1^{3n+1}$, transition into the state named \textbf{\texttt{trip}}.
When that subroutine finishes, the TM will automagically jump into the state labeled \textbf{\texttt{trip\_}}.
You do not need to -- and should not -- define any transitions into \textbf{\texttt{trip\_}} or out of \textbf{\texttt{trip}}.
\end{enumerate}
Calling ahead to Monday's lecture: a TM $M$ recognizes a language $L$ if $M$ accepts all of the strings
in $L$ and either rejects or loops on all strings that are not in $L$. In other words, your TM should accept
every string in $L$, and for any string not in $L$ it can either loop infinitely or reject the string.\\
Please use our provided TM editor to design, develop, test, and submit your answer to this question.
\textit{(For reference, our solution has fewer than 15 states. If you have significantly more than this,
you might want to change your approach.)}
\begin{shaded}
Write the SUNetID of the person who submitted the TMs here.
\end{shaded}
\end{enumerate}
\section{TMs, Formally}
Just as it's possible to formally define a DFA using a 5-tuple, it's possible to formally define a TM as an
8-tuple $(Q, \Sigma, \Gamma, B, q_0, Y, N, \delta)$ where
\begin{itemize}
\item $Q$ is a finite set of states, which can be anything;
\item $\Sigma$ is a finite, nonempty set called the \textit{\textbf{input alphabet}};
\item $\Gamma$ is a finite, nonempty set called the \textit{\textbf{tape alphabet}}, where $\Sigma \subseteq \Gamma$;
\item $B \in \Gamma - \Sigma$ is the \textit{\textbf{blank symbol}};
\item $q_0$ is the start state, where $q_0 \in Q$;
\item $Y \subseteq Q$ is the set of \textit{\textbf{accepting states}};
\item $N \subseteq Q$ is the set of \textit{\textbf{rejecting states}}, where $Y \cap N = \varnothing$; and
\item $\delta$ is the \textit{\textbf{transition function}}, described below.
\end{itemize}
This question explores some aspects of the definition.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Is it possible to have a TM with no states? Justify your answer.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item Is it possible to have a TM with no \textit{accepting} states? Justify your answer.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item Is it possible to have a TM with no \textit{rejecting} states? Justify your answer.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item Why is the restriction $Y \cap N = \varnothing$ there? Justify your answer.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item Is it possible to have a TM where $\Sigma = \Gamma$? Justify your answer.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\end{enumerate}
Now, let's talk about the transition function. As with DFAs, the transition function of a Turing machine is
what formally defines the transitions. If $q$ is a state in a TM that isn't an accepting state or a rejecting state
and $a$ is a symbol that can appear on the TM's tape, then
\begin{equation*}
\delta(q, a) = (r, b, D)
\end{equation*}
where $r$ is the new state to transition into, $b$ is the symbol to write back to the tape, and $D$ is either $L$ for
''move left'' or $R$ for ''move right.'' Because TMs immediately stop running after entering an accepting or
rejecting state, the $\delta$ function should not be defined for any state $q$ that's either accepting or rejecting.
Aside from this, $\delta$ should be defined for every combination of a (non-accepting, non-rejecting) state $q$ and
any symbol $a$ that can appear on the tape.
\begin{enumerate}[resume*]
\item Based on the above description of $\delta$, what should the domain of $\delta$ be? What should it codomain
be? Answer this question by filling in the following blanks, and briefly justify your answer.
\begin{center}
$\delta:$ \underline{\hspace{20ex}} $\rightarrow$ \underline{\hspace{20ex}}
\end{center}
\begin{shaded}
\begin{answer}
$ $
\begin{center}
$\delta$: \underline{Write your answer here.} $\rightarrow$ \underline{Write your answer here.}
\end{center}
\end{answer}
Provide justification here.
\end{shaded}
\end{enumerate}
\section{Regular and Decidable Languages}
In class, we alluded to the fact that \textbf{REG} (the class of all regular languages) is a subset of \textbf{R} (the class of
all decidable languages), but we never actually justified this claim. \\
Describe a construction that, given a DFA $D$, produces a decider $D'$ where $\mathscr{L}(D) = \mathscr{L}(D')$. Briefly justify
why the TM $D'$ you construct is a decider and why it accepts precisely the strings that $D$ accepts. Illustrate
your example by applying it to a small DFA $D$ of your choice. \\
Although you have a formal 5-tuple definition of a DFA and a formal 8-tuple definition of a TM at your
disposal, we're not expecting you to write your solution at that level of detail. \\
\textit{\textcolor{blue}{ Remember that DFAs and TMs work iompletely diferently with regards to aiiepting and rejeiting states
and that the transitions in TMs have a very diferent struiture than the transitions in DFAs! }}
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
\end{shaded}
\section{Jumbled Jargon}
\textit{\textbf{(We will cover the material necessary to solve this problem on Monday.)}} \\
We've introduced a number of terms and definitions pertaining to Turing machines, languages, and what
it means to solve a problem. Some of the terms we've described are adjectives that can only describe
TMs, while others are adjectives that can only describe languages. Using them incorrectly leads to statements
that aren't mathematically meaningful. \\
To reason by analogy, consider the statement ``the set $\N$ is even.'' This statement isn't meaningful, because
``even'' can only be applied to individual natural numbers, and $\N$ isn't a natural number. Similarly, the
statement $1 \in 5$ isn't meaningful, since 5 isn't a set. The statement $\Z \subseteq \N$ is meaningful but not true - it's
the mathematical equivalent of a grammatically correct statement that just happens to be false. \\
Below is a series of statements. For each statement, decide whether that statement is mathematically
meaningful or not. If it's not mathematically meaningful, explain why not. If it is mathematically meaningful,
determine whether it's true or false and briefly justify your answer. \\
\textit{\textcolor{blue}{ You have a ton of experience type-checking things by this point in the quarter. Use that intuition here. }}
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item If $M$ is a Turing machine, $w$ is a string, and $M$ accepts $w$, then $A_{TM}$ accepts $\langle M, w\rangle$.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item If $M$ is a Turing machine, $w$ is a string, and $M$ loops on $w$, then $\langle M, w\rangle \not\in \mathscr{L}(U_{TM})$.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item $U_{TM}$ is decidable.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item $\langle U_{TM}\rangle$ is decidable.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item $\{\langle U_{TM}\rangle\}$ is decidable.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\end{enumerate}
\section{What Does it Mean to Solve a Problem?}
Let $L$ be a language over $\Sigma$ and $M$ be a TM with input alphabet $\Sigma$. Below are three properties that may
hold for $M$:
\begin{center}
\begin{enumerate}
\item $M$ halts on all inputs.
\item For any string $w \in \Sigma^*$, if $M$ accepts $w$, then $w \in L$.
\item For any string $w \in \Sigma^*$, if $M$ rejects $w$, then $w \not\in L$.
\end{enumerate}
\end{center}
At some level, for a TM to claim to solve a problem, it should have at least some of these properties. Interestingly,
though, just having two of these properties doesn't say much. \\
\textit{\textcolor{blue}{ The whole point of this problem is to show that you have to be extremely careful about how you define
``solving a problem,'' since if you define it incorrectly then you can ``solve'' a problem in a way that bears little
resemblance to what we'd think of as solving a problem. Keep this in mind as you work through this one. }}
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Prove that if $L$ is any language over $\Sigma$, then there is a TM $M$ that satisfies properties (1) and (2).
\begin{shaded}
Provide a proof here.
\end{shaded}
\item Prove that if $L$ is any language over $\Sigma$, then there is a TM $M$ that satisfies properties (1) and (3).
\begin{shaded}
Provide a proof here.
\end{shaded}
\item Prove that if $L$ is any language over $\Sigma$, then there is a TM $M$ that satisfies properties (2) and (3).
\begin{shaded}
Provide a proof here.
\end{shaded}
\item Suppose that $L$ is a language over $\Sigma$ for which there is a TM $M$ that satisfies properties (1), (2),
and (3). What can you say about $L$? Prove it.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide a proof here.
\end{shaded}
\end{enumerate}
\section{R and RE Languages}
The following problems are designed to explore some of the nuances of how Turing machines, languages,
decidability, and recognizability all relate to one another. We hope that by working through them, you'll
get a much better understanding of key computability concepts.
\begin{enumerate}[label*=\roman*.,ref=\roman*]
\item Give a high-level description of a TM $M$ such that $\mathscr{L}(M) \in$ \textbf{R}, but $M$ is not a decider (you can
draw a concrete example of TM, or give pseudocode for a program along the lines of what we've
done in class). Briefly justify your answer. This shows that just because a TM's language is decidable,
it's not necessarily the case that the TM itself must be a decider.
\begin{shaded}
\begin{answer}
Provide an answer here.
\end{answer}
Provide justification here.
\end{shaded}
\item Only \textit{languages} can be decidable or recognizable; there's no such thing as an ``undecidable string''
or ``unrecognizable string.'' Prove that for every string $w$, there's an \textbf{R} language containing $w$ and
an \textbf{RE} language containing $w$.
\begin{shaded}
Provide a proof here.
\end{shaded}
\item Here's a weird one. Let $\Sigma$ be an alphabet containing all characters that can appear in a person's
name. Prove that the following language $L_{2020}$ is decidable, subject to the assumption that there is a single
US presidential election in 2020 and that it ends with a single winner:
\begin{center}
$L_{2020} = \{\ w \in \Sigma^*\ |\ w$ is the name of the winner of the 2020 presidential election $\}$
\end{center}
Then, explain how it's possible to build a decider for the language $L_{2020}$ given that no one has any
idea who is going to win the 2020 election!
\begin{shaded}
Provide a proof here.
\end{shaded}
\end{enumerate}
\section*{Optional Fun Problem: TMs and Regular Languages (1 Point Extra Credit)}
Let $M$ be a TM with the following property: there exists a natural number $k$ such that after $M$ is run on
any string $w$, $M$ always halts after at most $k$ steps. Prove that $\mathscr{L}(M)$ is regular.
\begin{shaded}
Provide a proof here.
\end{shaded}
\end{document}