### **CS107, Lecture 15**

### Accessing the Architecture: An Introduction to Comp Arch

Reading: B&O 3.1-3.4

This document is copyright (C) Stanford Computer Science, Adam Keppler and Joel Ramirez, licensed under Creative Commons Attribution 2.5 License. All rights reserved. Based on slides created by Nick Troccoli, Chris Gregg, and Raymond Klefstad

# What should someone do if they find a vulnerability? How can we incentivize responsible disclosure?

### Disclosure

What's the best way to disclose vulnerabilities?

- Full disclosure? Make vulnerabilities public as soon as they are found? Few people now endorse this approach due to its drawbacks.
- **Responsible disclosure?** Privately alert software maker to fix in reasonable amount of time before publicizing? *Most common, and recommended by ACM code of ethics.*

### Disclosure

- Various entities may want to financially reward people for finding and reporting vulnerabilities.
- The US Federal Government is one of the largest discoverers and purchasers of 0-day vulnerabilities. It follows a "Vulnerability Equities Process" (VEP) to determine which vulnerabilities to responsibly disclose and which to keep secret and use for espionage or intelligence gathering.

# How do we weigh competing stakeholder interests here, such as country vs. individual?

# Partiality

*Partiality* holds that it is acceptable to give preferential treatment to some people based on our relationships to them or shared group membership with them.

*Impartiality* involves "acting from a position that acknowledges that all persons are ... equally entitled to fundamental conditions of well-being and respect."


### **Degrees of Partiality**

- **Partiality**: preference towards own family, friends, and state is morally acceptable or even required
- **Partial Cosmopolitanism**: limited preference towards own state acceptable
- **Universal Care**: preference towards family acceptable but not towards state
- **Impartial Benevolence**: same moral responsibilities towards all people

### **Case Study: EternalBlue**

- **2012-2017:** NSA secretly stores the EternalBlue Microsoft vulnerability and uses it to spy on both US and non-US citizens.
- **Early 2017:** EternalBlue is stolen by the hacker group the ShadowBrokers. NSA discloses EternalBlue to Microsoft.
- **March 14, 2017:** Microsoft releases a patch for the vulnerability.
- **May 12, 2017:** EternalBlue is the basis of the WannaCry and other ransomware attacks, leading to downtime in critical hospital and city systems and over \$1 billion of damages.

### **Microsoft's Argument**

"[T]his attack provides yet another example of why the **stockpiling of vulnerabilities** by governments is such a problem. ...

We need governments to consider the **damage to civilians** that comes from hoarding these vulnerabilities and the use of these exploits.

This is one reason we called in February for a new "Digital Geneva Convention" to govern these issues, including a **new requirement for governments to report vulnerabilities to vendors**, rather than stockpile, sell, or exploit them.

And it's why we've pledged our support for **defending every customer everywhere** in the face of cyberattacks, **regardless of their nationality**."

#### Full post here

## **Critical Questions**

- Do we have special obligations to our own country and to protect our people? If so, what would this mean?
- If intentionally exploiting a vulnerability is wrong when done by a private citizen, is it equally wrong when done by the government?
- Should I be loyal to my country, a citizen of the world, or both?
- When should I give preference to my family members and when should I strive to treat all equally?

# What you choose matters – the moral obligations you take on constitute who you are.

### **Revisiting EternalBlue**

### Federal Government



- **Partiality**: preference towards own family, friends, and state is morally acceptable or even required
- **Partial Cosmopolitanism**: limited preference towards own state acceptable
- **Universal Care**: preference towards family acceptable but not towards state
- **Impartial Benevolence**: same moral responsibilities towards all people

### **Partiality Takeaways**

- Understanding partiality helps us understand how we balance cases of competing interests and where we may personally fall on this spectrum.
- To evaluate a situation, it's critical to understand the good and the bad that may come of it (e.g. EternalBlue). Better understanding privacy and privacy concerns is critical to this! (more later)

### **GCC Optimizations**

### **Tail Recursion**

**Tail recursion** is an example of a recursive pattern that GCC can recognize and implement more efficiently as a loop.

```
long factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}
```

### **Tail Recursion Example**

Recall the factorial problem from assembly lectures:

```
unsigned int factorial(unsigned int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}
```

What happens with **factorial(-1)**?

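- Infinite recursion in effect (`-1` converts to the largest `unsigned int`, so the base case is billions of calls away) → Literal stack overflow!
- Compiled with -Og!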
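To see why the base case is so far away, it helps to look at what `-1` becomes once it is converted to `unsigned int`. A minimal sketch, assuming a 32-bit `unsigned int` (true on typical x86-64 systems, not guaranteed by the C standard); the helper name `as_unsigned` is made up for illustration:

```c
#include <limits.h>

/* Hypothetical helper: returns the value factorial() actually receives
   when a caller passes a signed int for its unsigned int parameter. */
unsigned int as_unsigned(int n) {
    return (unsigned int)n;   /* conversion wraps modulo UINT_MAX + 1 */
}
```

`as_unsigned(-1)` equals `UINT_MAX`, so the recursive version would need roughly `UINT_MAX` stack frames before `n <= 1` becomes true, far more than the stack can hold.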

### Factorial: -Og vs -O2

-Og:

```
401146 <+0>:   cmp    $0x1,%edi
401149 <+3>:   jbe    0x40115b <factorial+21>
40114b <+5>:   push   %rbx
40114c <+6>:   mov    %edi,%ebx
40114e <+8>:   lea    -0x1(%rdi),%edi
401151 <+11>:  callq  0x401146 <factorial>
401156 <+16>:  imul   %ebx,%eax
401159 <+19>:  pop    %rbx
40115a <+20>:  retq
40115b <+21>:  mov    $0x1,%eax
401160 <+26>:  retq
```

-O2:

```
4011e0 <+0>:   mov    $0x1,%eax
4011e5 <+5>:   cmp    $0x1,%edi
4011e8 <+8>:   jbe    0x4011fd <factorial+29>
4011ea <+10>:  nopw   0x0(%rax,%rax,1)
4011f0 <+16>:  mov    %edi,%edx
4011f2 <+18>:  sub    $0x1,%edi
4011f5 <+21>:  imul   %edx,%eax
4011f8 <+24>:  cmp    $0x1,%edi
4011fb <+27>:  jne    0x4011f0 <factorial+16>
4011fd <+29>:  retq
```

- What happened?
- Did the compiler "fix" the infinite recursion?

### **Breaking Down the -O2**

```
4011e0 <+0>:   mov    $0x1,%eax                 # Initialize %eax with 1.
4011e5 <+5>:   cmp    $0x1,%edi                 # Compare input value (%edi) with 1.
4011e8 <+8>:   jbe    0x4011fd <factorial+29>   # If input <= 1 (unsigned check), jump to return.
4011ea <+10>:  nopw   0x0(%rax,%rax,1)          # No operation (probably for alignment).
4011f0 <+16>:  mov    %edi,%edx                 # Copy current value of %edi to %edx.
4011f2 <+18>:  sub    $0x1,%edi                 # Decrement %edi.
4011f5 <+21>:  imul   %edx,%eax                 # Multiply %eax by %edx and store result in %eax.
4011f8 <+24>:  cmp    $0x1,%edi                 # Compare decremented value of %edi with 1.
4011fb <+27>:  jne    0x4011f0 <factorial+16>   # If %edi is not 1, repeat the multiplication.
4011fd <+29>:  retq                             # Return with the result in %eax.
```

#### -O2:

- Recursive -> Iterative
- No Stack Overflow, Saves Memory and Operations
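One way to read the -O2 listing is to write the same loop back out in C. This is a hand-written sketch that mirrors the instructions one by one, not compiler output; the name `factorial_loop` and the local variable names are made up for illustration:

```c
/* Hand translation of the -O2 loop; comments name the matching instruction. */
unsigned int factorial_loop(unsigned int n) {
    unsigned int result = 1;        /* mov  $0x1,%eax            */
    if (n <= 1) {                   /* cmp  $0x1,%edi ; jbe ret  */
        return result;
    }
    do {
        unsigned int cur = n;       /* mov  %edi,%edx            */
        n -= 1;                     /* sub  $0x1,%edi            */
        result *= cur;              /* imul %edx,%eax            */
    } while (n != 1);               /* cmp  $0x1,%edi ; jne loop */
    return result;                  /* retq                      */
}
```

The loop reuses the same two registers on every iteration, so a huge argument costs time but no extra stack space. The product still wraps around once it no longer fits in an `unsigned int`.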

### **GCC Optimizations**

- Constant Folding
- Common Sub-expression Elimination
- Dead Code Elimination
- Strength Reduction
- Code Motion
- Tail Recursion
- Loop Unrolling
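Several of these can be pictured with one small function written two ways. The sketch below is illustrative only: `naive` and `by_hand` are made-up names, and whether GCC applies each rewrite depends on the optimization level and the surrounding code.

```c
#include <stddef.h>
#include <string.h>

/* Written naively; each comment names the optimization GCC may apply. */
unsigned int naive(const char *s, unsigned int x, unsigned int y) {
    unsigned int seconds = 60 * 60 * 24;  /* constant folding: computed at compile time */
    unsigned int a = (x + y) * 2;         /* common sub-expression: x + y appears twice */
    unsigned int b = (x + y) * 8;         /* strength reduction: multiply may become a shift */
    unsigned int count = 0;
    if (x < y && x > y) {                 /* dead code: this condition can never be true */
        count = 1000;
    }
    for (size_t i = 0; i < strlen(s); i++) {  /* code motion: strlen(s) does not change */
        count++;
    }
    return seconds + a + b + count;
}

/* The same computation with those rewrites done by hand. */
unsigned int by_hand(const char *s, unsigned int x, unsigned int y) {
    unsigned int sum = x + y;             /* computed once, reused */
    size_t len = strlen(s);               /* hoisted out of the loop */
    unsigned int count = 0;
    for (size_t i = 0; i < len; i++) {
        count++;
    }
    return 86400u + (sum << 1) + (sum << 3) + count;
}
```

Both functions return the same value for the same arguments; the second simply does by hand what the compiler is often able to do for us.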

### **Loop Unrolling**

**Loop Unrolling:** Do **n** loop iterations' worth of work per actual loop iteration, so we save ourselves from doing the loop overhead (test and jump) every time, and instead incur overhead only every n-th time.

```
for (int i = 0; i <= n - 4; i += 4) {
    sum += arr[i];
    sum += arr[i + 1];
    sum += arr[i + 2];
    sum += arr[i + 3];
}
// after the loop, handle any leftovers
```
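The snippet above leaves the leftover handling as a comment. A complete version might look like the sketch below. The function name `sum_unrolled` and the use of `size_t` are choices made for this example; with an unsigned index the bound is written as `i + 4 <= n`, because `n - 4` would wrap around when `n < 4`.

```c
#include <stddef.h>

/* Sums n ints, doing four additions per loop test. */
long sum_unrolled(const int *arr, size_t n) {
    long sum = 0;
    size_t i = 0;

    /* Unrolled body: one test-and-jump for every four elements. */
    for (; i + 4 <= n; i += 4) {
        sum += arr[i];
        sum += arr[i + 1];
        sum += arr[i + 2];
        sum += arr[i + 3];
    }

    /* Leftovers: at most three elements remain. */
    for (; i < n; i++) {
        sum += arr[i];
    }
    return sum;
}
```

The cleanup loop is what keeps the result correct when `n` is not a multiple of four.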

### **Into the Architecture!**

| Level   | Layer                                                       | Transition to the layer below                                                  |
|---------|-------------------------------------------------------------|--------------------------------------------------------------------------------|
| Level 7 | Application Layer (Prompt Engineering, UI/UX)               | Intent Interpretation (User -> Code Translation)                               |
| Level 6 | High-Level (Problem/Object Oriented) Programming Languages  | Translation (Compiler)                                                         |
| Level 5 | Assembly Language                                           | Translation (Assembler)                                                        |
| Level 4 | Operating System (aka the Machine Level)                    | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL))  |
| Level 3 | Instruction Set Architecture Level                          | Microprogram Interpretation or Direct Execution                                |
| Level 2 | Micro-architecture Level                                    | Logic Synthesis                                                                |
| Level 1 | Digital Logic / Circuit Design Level                        | Physical/Layout Design                                                         |
| Level 0 | Layout for Fabrication (Defined by the OASIS Standard)      | Lithography                                                                    |
|         | Etched Silicon                                              |                                                                                |

**Program Specific Interactions:** `scanf` / `printf` are marked at Level 7.

**Where GCC Gets Its Name:** `GCC` is marked at Translation (Compiler), the step from Level 6 down to Level 5.

**How far GCC can reach:** from *Start* at Translation (Compiler) down to *Run a.out* at Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)), the step below Level 4.

**GNU Assembler (Inside GCC):** `AS/GAS` is marked at Translation (Assembler), the step from Level 5 down to Level 4.
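The compiler and assembler steps can be observed separately by asking the `gcc` driver to stop early. The sketch below uses a hypothetical file name, `stages.c`, and a trivial function; `-S` and `-c` are standard GCC options, and the exact assembly produced will vary by GCC version and flags.

```c
/* stages.c: a hypothetical file name used only for this sketch.
 *
 *   gcc -Og -S stages.c    stop after the compiler:  writes stages.s (assembly text)
 *   gcc -Og -c stages.c    also run the assembler:   writes stages.o (machine code)
 *   objdump -d stages.o    disassemble the machine code back into assembly
 */
int add_one(int x) {
    return x + 1;
}
```

Comparing `stages.s` with the `objdump -d` output shows the same instructions on either side of the assembler step.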

**OS Manages Program -> Hardware:** *RUN* is marked at Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)), the step from Level 4 down to Level 3.
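From a C program, the syscall interface is usually reached through the C library. The sketch below prints the same text two ways, assuming a POSIX system (for `unistd.h` and `write`); the function names `say_with_printf` and `say_with_write` are made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Library route: printf formats into a user-space buffer, and the C
   library later hands the bytes to the operating system. */
int say_with_printf(const char *msg) {
    int n = printf("%s", msg);
    fflush(stdout);               /* push any buffered bytes to the OS now */
    return n;                     /* number of characters printed */
}

/* Direct route: write() asks the operating system to output the bytes. */
long say_with_write(const char *msg) {
    return (long)write(STDOUT_FILENO, msg, strlen(msg));
}
```

Either way, the program never touches the display hardware itself: it hands bytes to the operating system, which manages the device on the program's behalf.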

**Processing the Machine Code:** *RUN* is marked at Microprogram Interpretation or Direct Execution, the step from Level 3 down to Level 2.

**Very-Large-Scale Integration:** *VLSI* is marked at Logic Synthesis, the step from Level 2 down to Level 1.

**RTL (Register-Transfer Level):** *RTL* is marked at Logic Synthesis, the step from Level 2 down to Level 1.

**Many Steps:** Floorplanning

**Many Steps** (marked at Physical/Layout Design, between Level 1 and Level 0): Wire Routing. Don't Cross the Wires.

**Many Steps** (marked at Physical/Layout Design, between Level 1 and Level 0): Clock Tree Synthesis. Time it Just Right.

|                    | Level 7 | Application Layer (Prompt Engineering, UI/UX)                                 |
|--------------------|---------|-------------------------------------------------------------------------------|
|                    |         | Intent Interpretation (User -> Code Translation)                              |
|                    | Level 6 | High-Level (Problem/Object Oriented) Programming Languages                    |
|                    |         | Translation(Compiler)                                                         |
|                    | Level 5 | Assembly Language                                                             |
|                    |         | Translation(Assembler)                                                        |
|                    | Level 4 | Operating System (aka the Machine Level)                                      |
|                    |         | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)) |
|                    | Level 3 | Instruction Set Architecture Level                                            |
|                    |         | Microprogram Interpretation or Direct Execution                               |
|                    | Level 2 | Micro-architecture Level                                                      |
|                    |         | Logic Synthesis                                                               |
| Many<br>Steps      | Level 1 | Digital Logic / Circuit Design Level                                          |
|                    |         | Physical/Layout Design                                                        |
|                    | Level 0 | Layout for Fabrication (Defined by the OASIS Standard)                        |
|                    |         | Lithography                                                                   |
| Heat & Capacitance |         | Etched Silicon                                                                |

|                           | Level 7 | Application Layer (Prompt Engineering, UI/UX)                                 |
|---------------------------|---------|-------------------------------------------------------------------------------|
|                           |         | Intent Interpretation (User -> Code Translation)                              |
|                           | Level 6 | High-Level (Problem/Object Oriented) Programming Languages                    |
|                           |         | Translation(Compiler)                                                         |
|                           | Level 5 | Assembly Language                                                             |
|                           |         | Translation(Assembler)                                                        |
|                           | Level 4 | Operating System (aka the Machine Level)                                      |
|                           |         | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)) |
|                           | Level 3 | Instruction Set Architecture Level                                            |
|                           |         | Microprogram Interpretation or Direct Execution                               |
|                           | Level 2 | Micro-architecture Level                                                      |
|                           |         | Logic Synthesis                                                               |
|                           | Level 1 | Digital Logic / Circuit Design Level                                          |
|                           |         | Physical/Layout Design                                                        |
|                           | Level 0 | Layout for Fabrication (Defined by the OASIS Standard)                        |
| ASML                      |         | Lithography                                                                   |
| Check out EUV Lithography |         | Etched Silicon                                                                |

|                                | Level 7 | Application Layer (Prompt Engineering, UI/UX)                                 |
|--------------------------------|---------|-------------------------------------------------------------------------------|
|                                |         | Intent Interpretation (User -> Code Translation)                              |
|                                | Level 6 | High-Level (Problem/Object Oriented) Programming Languages                    |
|                                |         | Translation(Compiler)                                                         |
| Which layer throws a segfault? | Level 5 | Assembly Language                                                             |
|                                |         | Translation(Assembler)                                                        |
|                                | Level 4 | Operating System (aka the Machine Level)                                      |
|                                |         | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)) |
|                                | Level 3 | Instruction Set Architecture Level                                            |
|                                |         | Microprogram Interpretation or Direct Execution                               |
|                                | Level 2 | Micro-architecture Level                                                      |
|                                |         | Logic Synthesis                                                               |
|                                | Level 1 | Digital Logic / Circuit Design Level                                          |
|                                |         | Physical/Layout Design                                                        |
|                                | Level 0 | Layout for Fabrication (Defined by the OASIS Standard)                        |
|                                |         | Lithography                                                                   |
|                                |         | Etched Silicon                                                                |

| HAL IS<br>WATCHING                  | Level 7 | Application Layer (Prompt Engineering, UI/UX)                                 |
|-------------------------------------|---------|-------------------------------------------------------------------------------|
|                                     |         | Intent Interpretation (User -> Code Translation)                              |
|                                     | Level 6 | High-Level (Problem/Object Oriented) Programming Languages                    |
|                                     |         | Translation(Compiler)                                                         |
|                                     | Level 5 | Assembly Language                                                             |
|                                     |         | Translation(Assembler)                                                        |
|                                     | Level 4 | Operating System (aka the Machine Level)                                      |
|                                     |         | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)) |
|                                     | Level 3 | Instruction Set Architecture Level                                            |
|                                     |         | Microprogram Interpretation or Direct Execution                               |
|                                     | Level 2 | Micro-architecture Level                                                      |
|                                     |         | Logic Synthesis                                                               |
|                                     | Level 1 | Digital Logic / Circuit Design Level                                          |
|                                     |         | Physical/Layout Design                                                        |
|                                     | Level 0 | Layout for Fabrication (Defined by the OASIS Standard)                        |
|                                     |         | Lithography                                                                   |
| Program Memory Managed<br>By The OS |         | Etched Silicon                                                                |

### **More on the Compiler**

- One Unix Command – A Lot of Steps!

gcc hello.c -o hello



- Preprocessing – Handle Programmer Conveniences
  - Macros (#define) are expanded into normal C code
  - Lines split by \ are joined
  - Comments are removed
    - NOTE: The preprocessor adds its own line markers (lines starting with #), but our comments are removed
  - Bring in function and variable declarations from the headers
    - This is how the #include is resolved

gcc -E hello.c > pre\_processed\_hello



- Compilation – C to Assembly

gcc -S hello.c

- Will generate intermediate 'human-readable' assembly
- x86 assembly comes in different syntax styles; we use AT&T
  - AT&T is also the gcc default



- Object Generation – C to Object File

gcc -c hello.c

- "Just compile; Don't link"
- This outputs a non-human-readable object file
  - It contains incomplete machine code: references to code and data outside the file are not yet resolved
  - With extra metadata to power linking
- Using objdump -d hello.o, we can see the disassembly



- Linking – Bringing All the Pieces Together
  - Object Files & Libraries -> Fully Executable Machine Code

gcc hello.o -o hello

- Under the hood, gcc invokes the linker roughly as: ld -o hello hello.o -lc -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86\_64-linux-gnu/crt1.o /usr/lib/x86\_64-linux-gnu/crti.o /usr/lib/x86\_64-linux-gnu/crtn.o
- NOTE: We can get our .o in more than one way:

gcc -c hello.c

OR

as hello.s -o hello.o



#### What does the Assembler Do?

## **A Two Step Process**

- Pass 1: Set Up Memory Addresses
  - The assembler reads in the assembly program, identifying and tracking:
    - Labels
    - Literals
    - Data Variables
- Pass 2: Generate the Machine Code (Byte/Binary Code)
  - Identify the opcode from each assembly mnemonic
  - Resolve labels/literals/variables using the tables from Pass 1
  - Convert data to binary
  - Identify external (out-of-program) references and place markers for the linker
  - Set up metadata for linking if this program has loadable parts

The final output is not runnable, but it has all the parts needed for linking to complete

## Why do we need a linker?

## **Many Links**

- Every C file compiles to its own .o file
- Libraries can also be made into linkable formats
- We don't want to have to write all our code in 1 file, and we want to use the standard library
- The linker makes this all possible

- Multi-Step Process -> Multiple Failure Points
- Compilation can fail for many reasons at different points
- Failures mainly happen in one of two areas: 'compilation' or linking
- If compilation succeeds, the intermediate assembly will be good!



## **Peeking at Memory**

- The CPU is the most important place
  - Closer to the CPU means less travel time
  - But space is limited, so getting there is a bottleneck
- Think of the CPU like downtown, generally expensive and highly desirable real estate
- The BUS (actual technical name) is our transit system around the computer
- Places close to the CPU are more limited and more valuable, since they can get to the CPU faster



- All of Memory (Temporary Storage on the right) and the registers are rent-only, so data is constantly moving around
- Many algorithms have been developed to decide which data gets to live where and for how long
- Good access patterns make a huge difference in performance



#### Approximate Access Times

| Resource                                       | Approximate Latency              |
|------------------------------------------------|----------------------------------|
| Register                                       | 0 cycles (already here)          |
| Level 1 Cache                                  | ~0.5 ns                          |
| Level 2 Cache                                  | ~7 ns (14x L1)                   |
| RAM                                            | ~100 ns (~14x L2, 200x L1)       |
| SSD                                            | ~100-150 us (~14Kx L2, 200Kx L1) |
| Hard (Spinning) Disk                           | ~10 ms (~1.4Mx L2, 20Mx L1)      |
| Network Packet CA -> Netherlands -> CA         | ~150 ms (~21Mx L2, 300Mx L1)     |
| Average Human Response Time to Visual Stimulus | ~200 ms (~28Mx L2, 400Mx L1)     |



- Pre-emptively requesting and moving data is critical
- High locality yields orders-of-magnitude improvements
- Every part of the pyramid is working on making this faster
- Better BUS, faster storage (both temporary and permanent), bigger RAM, better algorithms



## What is Locality?

- Temporal Locality
  - Has the data been used recently? Then we expect it to be used again soon
- Spatial Locality
  - The data appears close together in the program/memory, so it will likely be needed at the same time.
- Hardware and OS designers consider algorithms to predict and leverage locality to optimize management of memory resources
- Cache in particular is a limited resource and must be used effectively to realize these benefits

## Who Gets to Manage the Memory?

- Registers – Managed by the compiler/assembler
- Cache – Managed by hardware designers
- Memory – Mainly the OS, influenced by hardware
- Disk – Managed by the user and occasionally the OS



#### **Architecture & The ISA**

## **Programming Levels**

|                                    | Level 7 | Application Layer (Prompt Engineering, UI/UX)                                 |
|------------------------------------|---------|-------------------------------------------------------------------------------|
|                                    |         | Intent Interpretation (User -> Code Translation)                              |
|                                    | Level 6 | High-Level (Problem/Object Oriented) Programming Languages                    |
|                                    |         | Translation(Compiler)                                                         |
|                                    | Level 5 | Assembly Language                                                             |
|                                    |         | Translation(Assembler)                                                        |
|                                    | Level 4 | Operating System (aka the Machine Level)                                      |
|                                    |         | Partial Interpretation (Syscall Interface & Hardware Abstraction Layer (HAL)) |
|                                    | Level 3 | Instruction Set Architecture Level                                            |
|                                    |         | Microprogram Interpretation or Direct Execution                               |
| Processor                          | Level 2 | Micro-architecture Level                                                      |
|                                    |         | Logic Synthesis                                                               |
|                                    | Level 1 | Digital Logic / Circuit Design Level                                          |
|                                    |         | Physical/Layout Design                                                        |
|                                    | Level 0 | Layout for Fabrication (Defined by the OASIS Standard)                        |
|                                    |         | Lithography                                                                   |
| These levels are integrally linked |         | Etched Silicon                                                                |

- MIC-1 Architecture (Tanenbaum, Structured Computer Organization, 6<sup>th</sup> Edition)
- IJVM ISA – a subset of the Java Virtual Machine
- A 'Vanilla' processor design



- Control Store is the most important part!
- Our ISA is defined by that unit
- 9 wires in -> 2\*\*9 (512) possible combinations, so 512 possible commands
- Each command drives 36 wires to control the chip
- Assembly/Machine Language is defined by the hardware



- ALU – Arithmetic & Logic Unit
  - Performs Math & Logic Operations
- MAR through H are the registers
- B + Decoder – Enables a register to load onto the B Bus
- Z and N act similar to our condition codes, but in a much more limited/simple way
- C controls the C Bus, informing the destination register to receive its value



- Notice how the ALU can only take its left operand from the H register
- All two-operand ALU operations need to first load the left operand into H
- This is an example of a hardware-based constraint



#### **Better Design Better Performance**

- The MIC-2 fixes this issue by adding another bus, improving the datapath
- Design directly impacts the ISA that we can make available

