Problem Set 9

Due Wednesday, June 4 at 11:59 PM Pacific

Solutions are available!

🔑 Solutions

DUE: WEDNESDAY June 4 at 11:59pm PT

SPRING QUARTER SCHEDULE NOTE:

Because Spring Quarter has a truncated Week 10, this pset will be due earlier than usual, on WEDNESDAY June 4 at 11:59pm PT, with no late days allowed (so we can release the solutions right away at 12:01am Thursday). In consideration of this shorter time period, we have made the following adjustments:

Cut problems.
All problems will be graded on completeness only (i.e., did you try to write something showing reasonable good-faith effort, regardless of whether it is correct).
Going over the last problem (Problem 4) in lecture on Friday May 30.

Introduction

What problems are beyond our capacity to solve? Why are they so hard? And why is anything that we've discussed this quarter at all practically relevant? In this problem set – the last one of the quarter! – you'll explore the absolute limits of computing power.

Before beginning, we recommend reading

the Guide to Self-Reference, which does a deep-dive into undecidability via self-reference, and
the Guide to the Lava Diagram, which explores how to classify languages.

As always, please feel free to drop by office hours or ask on Ed if you have any questions. We'd be happy to help out.

Good luck, and have fun!

Starter Files

Here are the starter files for the problems you'll submit electronically:

📦 PS9 Starter Files

Unpack the files somewhere convenient and open up the bundled project.

If you would like to type your solutions in $\LaTeX$, you may want to download this template file where you can place your answers:

🖋 PS9 $\LaTeX$ Template

General Advice

First, read through the index of problems below to get an idea of time management for this assignment.
Next, glance through all the questions before you start working. This not only ensures you won't be caught off guard as you reach later problems, but also gives your brain a chance to start working on some of these as a background process while you do other things.
If you get stuck, consider moving on to another problem for a while! Changing things up is a great way to get out of a rut, make new connections between concepts, and spark creative insights!
If you realize you don't have the requisite knowledge to complete a problem, take a step back from it and review the relevant lecture materials. Skipping lecture and trying to extract the relevant tidbits for each problem will lead to a more frustrating experience than engaging fully with a lecture and approaching the problem sets afterward.

Problem One: Verifier Warmups

Verifiers can feel a bit weird the first time you encounter them. The definition is abstract, and it never feels like they're actually solving a problem. These questions are designed to get you playing around a bit more with verifiers so you can build a better intuitive feel for them.

We recommend pulling up the formal definition of a verifier before starting this problem. Remember that the certificate $c$ can be the encoding of an object or the encoding of a group of one or more objects.

Consider the language $L = \Set{ \langle n \rangle \suchthat n \in \naturals \land n \text{ is a perfect square}}\text.$ Below is a list of programs that purport to be verifiers for $L\text.$ Which are, and which aren't? Briefly justify your answers, but no formal proof is necessary.

bool checkSquare1(int n) {
    for (int i = 0; i <= n; i++) {
        if (i * i == n) return true;
    }
    return false;
}

bool checkSquare2(int n) {
    int i = 0;
    while (true) {
        if (i * i == n) return true;
        i++;
    }
}

bool checkSquare3(int n, string w) {
    return w == "n is a perfect square."
}

bool checkSquare4(int n, int k) {
    return k * k == n;
}

bool checkSquare5(int n, int k, int m) {
    return k * m == n;
}

bool checkSquare6(int n, int k) {
    for (int i = 0; i < k; i++) {
        if (i * i == n) return true;
    }
    return false;
}

bool checkSquare7(int n, int i) {
    while (true) {
        if (i * i == n) return true;
        i++;
    }
}

Briefly explain why the following is a verifier for $A_{TM}\text.$ Specifically, explain why this verifier halts on all inputs and why it's the case that $\langle M, w\rangle \in A_{TM}$ if and only if there is a certificate $n$ where the verifier accepts $\langle M, w, n\rangle\text.$ No formal proof is necessary.
```
bool checkAccepts(TM M, string w, int n) {
    set up a simulation of M running on w;
    for (int i = 0; i < n; i++) {
        simulate one more step of M running on w;
    }
    return whether M executed the command "Return True";
}
```

Problem Two: Isn’t Everything Undecidable?

(We recommend reading the Guide to Self-Reference before attempting this problem.)

In lecture, we proved that $A_{TM}$ and the halting problem are undecidable – that, in some sense, they’re beyond the reach of algorithmic problem-solving. The proofs we used involved the nuanced technique of self-reference, which can seem pretty jarring and weird the first time you run into it. The good news is that with practice, you’ll get the hang of the technique pretty quickly!

One of the most common questions we get about self-reference proofs is why you can’t just use a self-reference argument to prove that every language is undecidable. As is often the case in Theoryland, the best way to answer this question is to try looking at some of the ways you might use self-reference to prove that every language is undecidable, then see where those arguments break down.

To begin with, consider this proof:

(Incorrect!) Theorem: All languages are undecidable.

(Incorrect!) Proof: By contradiction; assume there is a decidable language $L$. Then there is a decider $D$ for $L$. We can represent $D$ as a function

bool inL(string w);

that takes in a string $w$, then returns true if $w \in L$ and false otherwise.

Now, consider the following program:

bool trickster(string w) {
    return !inL(w);
}

Since inL decides $L$, we know that

\[\mathtt{inL(w)}\text{ returns true} \qquad \text{if and only if} \qquad w \in L\text.\]

Given how trickster is written, we see that

\[\mathtt{inL(w)}\text{ returns true} \qquad \text{if and only if} \qquad \mathtt{trickster(w)}\text{ returns false.}\]

This means that

\[\mathtt{trickster(w)}\text{ returns false} \qquad \text{if and only if} \qquad w \in L\text.\]

This is impossible. We've reached a contradiction, so our assumption was wrong and $L$ is undecidable. $\qed$

This proof has to be wrong because we know of many decidable languages.

What’s wrong with this proof? To answer this question, give us a specific claim made in the proof that is incorrect, then explain why it's incorrect.

Go one sentence at a time. For each claim that’s made, ask yourself – why, specifically, is this statement correct? Pull up the proof that $A_{TM}$ is undecidable and compare this proof and that one side-by-side, going one sentence at a time if you need to.

Here’s another incorrect proof that all languages are undecidable:

(Incorrect!) Theorem: All languages are undecidable.

(Incorrect!) Proof: By contradiction; assume there is a decidable language $L$. Then there is a decider $D$ for $L$. We can represent $D$ as a function

bool willAccept(string function, string w);

that takes in the source code of a function function and a string $w$, then returns true if function(w) returns true and false otherwise.

Now, consider the following program:

bool trickster(string w) {
    string me = /* source code for trickster */;

    /* See whether we'll accept, then do the opposite. */
    return !willAccept(me, w);
}

Since willAccept decides $L$ and me holds the source code of trickster, we know that

\[\mathtt{willAccept(me, input)}\text{ returns true}\qquad \text{if and only if} \qquad \texttt{trickster(input)} \text{ returns true.}\]

Given how trickster is written, we see that

\[\mathtt{willAccept(me, input)}\text{ returns true}\qquad \text{if and only if} \qquad \texttt{trickster(input)} \text{ doesn't return true.}\]

This means that

\[\mathtt{trickster(input)}\text{ returns true}\qquad \text{if and only if} \qquad \texttt{trickster(input)} \text{ doesn't return true.}\]

This is impossible. We've reached a contradiction, so our assumption was wrong and $L$ is undecidable. $\qed$

It’s a nice read, but this proof isn’t correct.

What’s wrong with this proof? To answer this question, give us a specific claim made in the proof that is incorrect, then explain why it's incorrect.

Many of the examples we’ve seen of undecidable languages involve checking for properties of Turing machines or computer programs, which might give you the sense that every question you might want to ask about TMs or programs is undecidable. That isn’t the case, though, and this question explores why.

Consider the following language $L$:

\[L = \Set{ \encoded{P} \suchthat P \text{ is a syntactically valid C++ program} }\]

Below is a purported proof that $L$ is undecidable:

(Incorrect!) Theorem: The language $L$ is undecidable.

(Incorrect!) Proof: By contradiction; assume $L$ is decidable. Then there is a decider $D$ for $L$. We can represent $D$ as a function

bool isValid(string function);

that takes in the source code of a function function, then returns true if function is syntactically valid and false otherwise.

Now, consider the following program:

void trickster() {
    string me = /* source code for trickster */;

    /* Execute a line based on whether our syntax is right. */
    if (isValid(me)) {
        oops, this line of code isn't valid C++!
    } else {
        int num = 137; // Perfectly valid syntax!
    }
}

Since isValid decides $L$ and me holds the source of trickster, we know that

\[\mathtt{isValid(me)}\text{ returns true} \qquad \text{if and only if} \qquad \mathtt{trickster} \text{ is syntactically valid.}\]

Given how trickster is written, we see that

\[\mathtt{isValid(me)}\text{ returns true} \qquad \text{if and only if} \qquad \mathtt{trickster} \text{ is not syntactically valid.}\]

This means that

\[\mathtt{trickster} \text{ is syntactically valid} \qquad \text{if and only if} \qquad \mathtt{trickster} \text{ is syntactically invalid.}\]

This is impossible. We've reached a contradiction, so our assumption was wrong and $L$ is undecidable. $\qed$

This proof, unfortunately, is incorrect.

What’s wrong with this proof? To answer this question, give us a specific claim made in the proof that is incorrect, then explain why it's incorrect.

Problem Three: Double Verification

This problem explores the following beautiful and fundamental theorem about the relationship between the $\rlangs$ and $\relangs$ languages:

If $L$ is a language, then $L \in \rlangs$ if and only if $L \in \relangs$ and $\overline{L} \in \relangs$.

This theorem has a beautiful intuition: it says that a language $L$ is decidable $(L \in \rlangs)$ precisely if for every string in the language, it's possible to prove it's in the language $(L \in \relangs)$ and, simultaneously, for every string not in the language, it's possible to prove that the string is not in the language $(\overline{L} \in \relangs)$. In this problem, we're going to ask you to prove one of the two directions of this theorem.

Let $L$ be a language where $L \in \relangs$ and $\overline{L} \in \relangs$. This means that there's a verifier for $L$ and a verifier for $\overline{L}$. In software, you could imagine that these verifiers correspond to functions with these signatures:

/* Verifier for L. This function has these properties:
 *
 *    checkInL(w, c) always halts.
 *    for all strings w, (w in L if and only if there's a c
 *        where checkInL(w, c) returns true
 *    )
 */
bool checkInL(string w, string c);

/* Verifier for L^bar. This function has these properties:
 *
 *    checkInLBar(w, c) always halts.
 *    for all strings w, (w in L^bar if and only if there's a c
 *        where checkInLBar(w, c) returns true
 *    )
 */
bool checkInLBar(string w, string c);

Show that $L \in \rlangs$ by writing pseudocode for a function

bool isInL(string w)

that accepts as input a string $w$, then returns true if $w \in L$ and returns false if $w \notin L$. You don't need to write much code here. If you find yourself writing ten or more lines of pseudocode, you're probably missing something. No justification is required.

Look at the appendix for "Unsolvable Problems, Part II" to see how to convert a verifier into a recognizer or vice-versa. You may find some of the ideas there useful here.

Problem Four: The Lava Diagram

(We recommend reading the Guide to the Lava Diagram before starting this problem.)

Download the starter files for Problem Set 9 and extract them somewhere convenient. Run the provided program and choose “The Lava Diagram.” You’ll be presented with a Venn diagram showing the overlap of different classes of languages we've studied so far. We have also provided you a list of twelve numbered languages. For each of those languages, draw where in the Venn diagram that language belongs. As an example, we've indicated where Language 1 and Language 2 should go. No proofs or justifications are necessary – the purpose of this problem is to help you build a better intuition for what makes a language regular, $\rlangs$, $\relangs$, or none of these.

Use the provided to test your solution locally, then submit the file res/LavaDiagram.answers to Gradescope when you’re done.

$\Sigma^\star$
$L_D$
$\Set{ \mathtt{a}^n \suchthat n \in \naturals }$
$\Set{ \mathtt{a}^n \suchthat n \in \naturals \text{ and } n \text{ is a multiple of } 137 }$
$\Set{ \mathtt{1}^m\mathtt{+1}^n\mathtt{=1}^{m+n} \suchthat m, n \in \naturals }$
$\Set{ \encoded{M} \suchthat M \text{ is a TM and } M \text{ is a not a recognizer for } \emptyset }$
$\Set{ \encoded{M} \suchthat M \text{ is a TM and } M \text{ is a recognizer for } \emptyset }$
$\Set{ \encoded{M} \suchthat M \text{ is a TM and } M \text{ is a recognizer for } L_D }$
$\Set{ \encoded{M, n} \suchthat M\text{ is a TM, } n \in \naturals, \text{ and } M \text{ accepts all strings in } \Set{\mathtt{a}, \mathtt{b}}^\star \text{ of length at most } n }$
$\Set{ \encoded{M, n} \suchthat M\text{ is a TM, } n \in \naturals, \text{ and } M \text{ rejects all strings in } \Set{\mathtt{a}, \mathtt{b}}^\star \text{ of length at most } n }$
$\Set{ \encoded{M, n} \suchthat M\text{ is a TM, } n \in \naturals, \text{ and } M \text{ loops on all strings in } \Set{\mathtt{a}, \mathtt{b}}^\star \text{ of length at most } n }$
{ $\encoded{M_1, M_2, M_3, w} \suchthat M_1, M_2,$ and $M_3$ are TMs, $w$ is a string, and at least two of $M_1$, $M_2$, and $M_3$ accept $w$. }

Optional Fun Problem: Undecidability Phase Transitions

Let $A_{TM}^{(n)}$ be the following language:

\[A_{TM}^{(n)} = \Set{ \langle M, w \rangle \suchthat M \text{ is a TM that is at most } n \text{ lines long, } w \text{ is a string, and } M \accepts w }.\]

The language $A_{TM}^{(0)}$ is the empty language because there are no 0-line TMs. As a result, $A_{TM}^{(0)}$ is decidable. Similarly, $A_{TM}^{(1)}$ is decidable - the only legal 1-line TM is the TM Start:, which implicitly rejects every string. $A_{TM}^{(2)}$ is also decidable, but it's a bit more interesting because there are two-line TMs that accept their inputs. More generally, as we increase $n\text,$ we start dealing with more and more complex TMs, and the language becomes more and more complex.

Prove that there exists an $n \in \naturals$ where $A_{TM}^{(n)}$ is undecidable. This shows that there is a "phase transition" in these languages - they begin decidable, are decidable for some range of values of $n\text,$ but then abruptly switch to undecidable.

Grand Challenge Problem: $\mathbf{P} \stackrel{?}{=} \mathbf{NP}$

(Worth an A+, $1,000,000, and a Ph.D)

Prove or disprove: $\mathbf{P} = \mathbf{NP}$.

Take fifteen minutes and try this. Seriously. And if you can’t crack this problem, feel free to submit your best effort, or the silliest answer you can think of.

Congratulations on finishing the last problem set of the quarter!

We’re extremely impressed with how much progress you’ve made since the start of the quarter.

Best of luck on the final exam!