\documentclass[11pt]{article}
\usepackage[pdftex]{graphicx, color}
\usepackage{amsmath}
\usepackage{listings}
\usepackage{ebproof}
\usepackage{tikz}
\usetikzlibrary{graphs}
\headheight 8pt \headsep 20pt \footskip 30pt
\textheight 9in \textwidth 6.5in
\oddsidemargin 0in \evensidemargin 0in
\topmargin -.35in
\lstset{basicstyle=\small\ttfamily,breaklines=true,numbers=left}
\lstset{escapeinside={<@}{@>}}
\usepackage{array}
\newcolumntype{L}[1]{>{\raggedright\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{C}[1]{>{\centering\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\newcolumntype{R}[1]{>{\raggedleft\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
\begin{document}
\begin{center}
%% Change this:
\LARGE YOURNAME - SUNETID \\
\Large CS143 - Written Assignment 4
\end{center}
% A pdf file can be generated from this by running
% pdflatex -jobname=WA4_solved WA4_template.tex
% Upload WA4_solved.pdf to GradeScope
% to submit the assignment!
\begin{enumerate}
% Problem 1
\item
Consider the following program in Cool, representing a ``slightly'' over-engineered implementation which calculates the factorial of 3 using an operator class and a reduce() method:
\begin{lstlisting}
class BinOp {
operate(a: Int, b: Int): Int {
a + b
};
optype(): String {
"BinOp"
};
};
class SumOp inherits BinOp {
optype(): String {
"SumOp"
};
};
class MulOp inherits BinOp {
operate(a: Int, b: Int): Int {
a * b
};
optype(): String {
"MulOp"
};
};
class IntList {
head: Int;
tail: IntList;
empty_tail: IntList; -- Do not assign.
tail_is_empty(): Bool {
tail = empty_tail
};
get_head(): Int { head };
set_head(n: Int): Int {
head <- n
};
get_tail(): IntList { tail };
set_tail(t: IntList): IntList {
tail <- t
};
generate(n: Int): IntList {
let l: IntList <- new IntList in {
-- Point A
l.set_head(n);
if (n = 1) then
l.set_tail(empty_tail)
else
l.set_tail(generate(n-1))
fi;
l;
}
};
};
class Main {
reduce(result: Int, op: BinOp, l: IntList): Int {{
result <- op.operate(result,l.get_head());
if (l.tail_is_empty() = true) then
-- Point B
result
else
reduce(result,op,l.get_tail())
fi;
}};
main(): Object {
let op: BinOp <- new MulOp, l: IntList <- new IntList, io: IO <- new IO in {
l <- l.generate(3);
io.out_int(self.reduce(1,op,l));
}
};
};
\end{lstlisting}
The following is an abstracted representation of a memory layout of the program generated by a hypothetical Cool compiler for the above code (note that this might or might not correspond to the layout generated by your compiler or the reference coolc):
\begin{tabular}{ | c | }
\hline
\textbf{Code segment:} (\textit{low address}) \\
\begin{tabular}{ | R{2cm} | L{9cm} |}
\hline
$\mathtt{maddr}_1$: & \texttt{cgen\_m(BinOp.operate)} \\
\cline{2-2}
$\mathtt{maddr}_2$: & \texttt{cgen\_m(BinOp.optype)} \\
\cline{2-2}
$\mathtt{maddr}_3$: & \texttt{cgen\_m(SumOp.optype)} \\
\cline{2-2}
$\mathtt{maddr}_4$: & \texttt{cgen\_m(MulOp.operate)} \\
\cline{2-2}
$\mathtt{maddr}_5$: & \texttt{cgen\_m(MulOp.optype)} \\
\cline{2-2}
$\mathtt{maddr}_6$: & \texttt{cgen\_m(IntList.tail\_is\_empty)} \\
\cline{2-2}
$\mathtt{maddr}_7$: & \texttt{cgen\_m(IntList.get\_head)} \\
\cline{2-2}
$\mathtt{maddr}_8$: & \texttt{cgen\_m(IntList.set\_head)} \\
\cline{2-2}
$\mathtt{maddr}_9$: & \texttt{cgen\_m(IntList.get\_tail)} \\
\cline{2-2}
$\mathtt{maddr}_{10}$: & \texttt{cgen\_m(IntList.set\_tail)} \\
\cline{2-2}
$\mathtt{maddr}_{11}$: & \texttt{cgen\_m(IntList.generate)} \\
\cline{2-2}
$\mathtt{maddr}_{12}$: & \texttt{cgen\_m(Main.reduce)} \\
\cline{2-2}
$\mathtt{maddr}_{13}$: & \texttt{cgen\_m(Main.main)} \\
\hline
\end{tabular} \\
\textbf{Dispatch tables:} \\
\begin{tabular}{ | R{2cm} | L{9cm} |}
\hline
$\mathtt{maddr}_{14}$: & \texttt{DT[BinOp]} \\
\cline{2-2}
$\mathtt{maddr}_{15}$: & \texttt{DT[SumOp]} \\
\cline{2-2}
$\mathtt{maddr}_{16}$: & \texttt{DT[MulOp]} \\
\cline{2-2}
$\mathtt{maddr}_{17}$: & \texttt{DT[IntList]} \\
\cline{2-2}
$\mathtt{maddr}_{18}$: & \texttt{DT[Main]} \\
\hline
\end{tabular} \\
\textbf{Heap} \\
$\downarrow$ \\
\\
\\
\\
\hline
\\
\\
\\
$\uparrow$ \\
\textbf{Stack} ($\mathtt{maddr}_{19}$) (\textit{high address}) \\
\hline
\end{tabular}
In the above, $\mathtt{maddr}_i$ represents the memory address at which the corresponding method's code or dispatch table starts. You should assume that the above layout is contiguous in memory. Note that the stack starts at a high address and grows towards lower addresses.
\newpage
\begin{enumerate}
\item
Assume the MIPS assembly code to be stored starting at address $\mathtt{maddr}_{12}$ and ending immediately before $\mathtt{maddr}_{13}$ (i.e. not including the instruction starting at $\mathtt{maddr}_{13}$) was generated using the code generation process from Lecture 12 (beginning at Slide 18). In particular, assume that the caller is responsible for saving the frame pointer, and the callee is responsible for restoring the frame pointer. In addition, assume that the address to the self object is stored on the stack along with the other parameters. How many instructions using the frame pointer register ($\$fp$) will be present within such code? Why? \\
\textbf{Answer:}
%% Your answer here
\newpage
\item
The following is a representation of the dispatch table for class Main: \\
\begin{tabular}{ | l | l | l | }
\hline
Method Idx & Method Name & Address \\
\hline
0 & reduce & $\mathtt{maddr}_{12}$ \\
\hline
1 & main & $\mathtt{maddr}_{13}$ \\
\hline
\end{tabular} \\
Provide equivalent representations for the dispatch tables of BinOp, SumOp, MulOp, and IntList.\\
\textbf{Answer:}
%% Your answer here
\newpage
\item
Consider the state of the program at runtime when reaching (for the first time) the beginning of the line marked with the comment ``Point A''. Give the object layout (as per Lecture 12) of every object currently on the heap which is of a class defined by the program (i.e. ignoring Cool base classes such as IO or Int). For attributes, you can directly represent Int values by integers and an unassigned pointer by \textbf{void}. However, note that in a real Cool program, Int is an object and would have its own object layout, omitted here for simplicity. Finally, you can assume class tags are numbers from 1 to 5 given in the same order as the one in which classes appear in the layout above, and that attributes are laid out in the same order as the class definition.\\
\textbf{Answer:}
%% Your answer here
\newpage
\item
The following table represents an abstract view of the layout of the stack at runtime when reaching (for the first time) the beginning of the line marked with the comment ``Point A'': \\
\begin{tabular}{ | r | l | c | l | }
\hline
Address & Method & Contents & Description \\
\hline
$\mathtt{maddr}_{19}$ & Main.main & self & $\mbox{arg}_0$ \\
\hline
$\mathtt{maddr}_{19}-4$ & Main.main & ... & Return \\
\hline
$\mathtt{maddr}_{19}-8$ & Main.main & op & local \\
\hline
$\mathtt{maddr}_{19}-12$ & Main.main & l & local \\
\hline
$\mathtt{maddr}_{19}-16$ & Main.main & io & local \\
\hline
$\mathtt{maddr}_{19}-20$ & IntList.generate & $\mathtt{maddr}_{19} - 4$ & FP \\
\hline
$\mathtt{maddr}_{19}-24$ & IntList.generate & 3 & $\mbox{arg}_1$ \\
\hline
$\mathtt{maddr}_{19}-28$ & IntList.generate & self & $\mbox{arg}_0$ \\
\hline
$\mathtt{maddr}_{19}-32$ & IntList.generate & $\mathtt{maddr}_{13}+\delta$ & Return \\
\hline
$\mathtt{maddr}_{19}-36$ & IntList.generate & l & local \\
\hline
\end{tabular} \\
Note that we are assuming there are no stack frames above Main.main(...). This doesn't necessarily match a real implementation of the Cool runtime system, where main must return control to the OS or the Cool runtime on exit. For the purposes of this exercise, feel free to ignore this issue.\\
Since you don't have the generated code for every method above, you cannot directly calculate the return address to be stored on the stack. You should however give it as $\mathtt{maddr}_{i}+\delta$, denoting an unknown address between $\mathtt{maddr}_{i}$ and $\mathtt{maddr}_{i+1}$. This notation is used in the example above. For locals, you should use the variable name, but remember that in practice it is the heap address that gets stored in memory for objects. For objects that have no variable names, you may give a short description of the object (e.g., ``ptr to [2, 1]'' to represent a pointer to an IntList consisting of [2, 1]).\\
Give a similar view of the stack at runtime when reaching (for the first time) the beginning of the line marked with the comment ``Point B''.\\
\textbf{Answer:}\\
%% Your answer here
\newpage
\end{enumerate}
% Problem 2
\item
Consider the following arithmetic expression: $(12 + 6) * 5 - (20 / (7 + 3)) + (4 / 2)$.
\begin{enumerate}
\item
You are given MIPS code that evaluates this expression using a stack machine with a single accumulator register (similar to the method given in class Lecture 12). This code is wholly unoptimized and will execute the operations given in the expression above in their original order (e.g. it does not perform transformations such as arithmetic simplification or constant folding). How many times in total will this code push a value to or pop a value from the stack (give a separate count for the number of pushes and the number of pops)?\\
\textbf{Answer:}
%% Your answer here
\newpage
\item
Now suppose that you have access to two registers \texttt{r1} and \texttt{r2} in addition to the stack pointer. Consider the code generated using the revised process described in lecture 12 starting on slide 30, with \texttt{r1} as an accumulator and \texttt{r2} storing temporaries. How many loads and stores are now required? \\
\textbf{Answer:}
%% Your answer here
\newpage
\end{enumerate}
% Problem 3
\item
Suppose you want to add a for-loop construct to Cool, having the following syntax:
$$\mbox{for}\ e_1\ \mbox{to}\ e_2\ \mbox{do}\ e_3\ \mbox{rof}$$
The above for-loop expression is evaluated as follows: expressions $e_1$ and $e_2$ are evaluated \textbf{only once}, then the body of the loop ($e_3$) is executed once for every integer in the range $[e_1, e_2]$ (inclusive) in order. Similar to the while loop, the for-loop returns void.
\begin{enumerate}
\item
Give the operational semantics for the for-loop construct above.\\
\textbf{Answer:}
%% Your answer here
\newpage
\item Give the code generation function $\mbox{cgen}(\mbox{for}\ e_1\ \mbox{to}\ e_2\ \mbox{do}\ e_3\ \mbox{rof})$ for this construct. Use the code generation conventions from the lecture. The result of $\mbox{cgen}(...)$ must be MIPS code following the stack-machine with one accumulator model.\\
\textbf{Answer:}
%% Your answer here
\newpage
\end{enumerate}
% Problem 4
\item
Consider the following basic block, in which all variables are integers.
\begin{lstlisting}
a := f * f + 0
b := a + 0
c := 2 + 8
d := c * b
e := f * f
x := e + d
g := b + d
h := b + d
i := g * 1
y := i / h
\end{lstlisting}
Assume that the only variables that are live at the exit of this block are $x$ and $y$, while $f$ is given as an input. In order, apply the following optimizations to this basic block. Show the result of each transformation. For each optimization, you must continue to apply it until no further applications of that transformation are possible, before writing out the result and moving on to the next optimization.
\begin{enumerate}
\item Algebraic simplification
\item Copy propagation
\item Common sub-expression elimination
\item Constant folding
\item Copy propagation
\item Dead code elimination
\end{enumerate}
When you have completed the last of the above transformations, the resulting program will still not be optimal. What optimization(s), in what order, can you apply to optimize the result further?\\
\textbf{Answer:}
\begin{enumerate}
\item
Algebraic simplification \\
%% Your answer here
\item
Copy propagation \\
%% Your answer here
\item
Common sub-expression elimination\\
%% Your answer here
\item
Constant folding\\
%% Your answer here
\item
Copy propagation\\
%% Your answer here
\item
Dead code elimination\\
%% Your answer here
\end{enumerate}
%% Your extra optimization(s) here
\newpage
% Problem 5
\item
Consider the following assembly-like pseudo-code, using 10 temporaries (abstract registers) \texttt{t0} to \texttt{t9}: \\
\begin{lstlisting}
t1 = t0 + 15
t2 = t0 * 3
if t1 < t2
then
t3 = t1 * t2
t4 = t1 * 2
else
t3 = t1 + t2
t4 = t1 + t3
fi
t5 = t4 - t2
t6 = t5 + t3
t2 = t5 + 1
t7 = t2 + t3
if t6 < t7
then
t8 = t6
else
t8 = t7
fi
t9 = t8 * 2
\end{lstlisting}
\begin{enumerate}
\item
At each program point, list the temporaries that are live. Note that \texttt{t0} is the only input temporary for the given code and \texttt{t9} will be the only live value on exit.
\textbf{Answer:}
%% Your answer here
\newpage
\item
Provide a lower bound on the number of registers required by the program.
\textbf{Answer:}
%% Your answer here
\newpage
\item
Draw the register interference graph between temporaries in the above program as described in class.
\textbf{Answer:}
%% Your answer here
\newpage
\item
Using the algorithm described in class, provide a coloring of the graph in part(c). The number of colors used should be your lower bound in part (b). Provide the final $k$-colored graph (you may use the tikz package to typeset it or simply embed an image), along with the order in which the algorithm colors the nodes.
\textbf{Answer:}
%% Your answer here
\newpage
\item
Based on your coloring, write down a mapping from temporaries to registers (labeled \texttt{r1}, \texttt{r2}, etc.).
\textbf{Answer:}
%% Your answer here
\end{enumerate}
\end{enumerate}
\end{document}