Today: ethics: privacy, program style: readability and decomposition, bits and bytes

Ethics - Privacy

By "privacy" here we're referring to individuals vs. the government. (Thanks to Ethics Fellow Wanheng Hu for feedback.)

Encryption Technology - SMS vs. E2E

1. SMS is a traditional setup. Alice and Bob have a key, but Verizon also has the key. The message is encrypted in transit, but Verizon has a copy. (The keys may be per-hop, but the essential feature is that Verizon has a copy.)

2. E2E. Alice and Bob both have a key, and Verizon does not. Thus Verizon only sees the ciphertext. Essential point: if the government asks for the plaintext, Verizon does not have it.
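
The E2E idea can be sketched with a toy cipher. This is only an illustration under loud assumptions: the XOR scheme, the xor_bytes() helper, and the variable names are made up for this sketch, and real messaging apps use far more sophisticated encryption. The point it shows is just that a party holding only the ciphertext, without the key, cannot hand over the plaintext.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key (toy cipher)."""
    return bytes(d ^ k for d, k in zip(data, key))


message = b'meet at noon'

# Alice and Bob share this key. In the E2E setup the carrier never has it.
shared_key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, shared_key)    # what Alice sends
carrier_copy = ciphertext                      # all the carrier can store
decrypted = xor_bytes(ciphertext, shared_key)  # what Bob recovers with the key

print('carrier sees:', carrier_copy.hex())
print('Bob reads:', decrypted)
```

In the SMS-style setup, the carrier would also hold shared_key, so it could run the same decryption step itself.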

Encryption Technology - Phones, Hard Drives

The files on phones are typically encrypted, only unlocked by the owner's PIN or fingerprint/face-id unlock. Likewise, an external hard drive can be encrypted with a user's password. Such encryption is effective, even in the face of law-enforcement efforts.

Ethics: Respect Some Privacy

Respect some privacy for yourself and others. Allowing people some privacy is good for society.

Short answer: tolerance. Giving people some privacy helps give them some individual freedom, even in the face of intolerance. In computer terms, privacy could be described as a "hack" which helps get a sort of tolerance.

Privacy is not a black-and-white issue. You do not want 0% or 100% privacy. This gets back to the dual-use pattern — we have both sympathetic and unsympathetic users of privacy, so we end up with a compromise of "some", but not 100%, privacy.

Privacy - Sympathetic examples

The terror group ISIS was very unfriendly to gay people. That such a person is able to keep their phone and messages encrypted, away from ISIS, seems good. Note also the de-facto tolerance angle. Another example: a dissident smuggling their memoirs out of an authoritarian regime.

If we just look at these examples, privacy looks great. But unfortunately there are just as many unsympathetic examples.

Privacy - Unsympathetic examples

Criminals are highly aware of using encryption for chats, data, etc. One man, an alleged pedophile, refused to unlock his encrypted hard drive. The courts kept him in jail for years, and eventually he was released. The legality of this situation is currently debated in the US. Does the 5th amendment right against self-incrimination apply to one's phone?

The Nth Room case of blackmail and cybersex trafficking on (encrypted) Telegram.

Aside - Crazy "Phone For Criminals" Story

There was a privacy-focussed phone, marketed to criminals. It turned out to be an FBI front, which used the information for convictions. For an entertaining hour, check out the Search Engine podcast episode: Best Phone For Crimes. Evidence was primarily not used against US citizens, I suspect because its collection violated the limited-government need for a warrant. This is perhaps an example of the system working as intended.

Compromise - Some Privacy

So we end up with a compromise where individuals have some, but not absolute, privacy.

US History - limited government. Includes limitations on the government spying on citizens. Compromise: government needs a warrant and probable cause to get info.

Edward Snowden, PRISM: the US was spying on citizens to some degree.

In contrast, in the above crime-phone story, they did not pursue US citizens with the info. Here the limited-government rules seemed to be followed.

E2E vs. Warrant

Note that the technology of end-to-end encryption short-circuits the warrant system. Verizon does not have the data to give.

Law Enforcement Back-Door Requests

Law enforcement has lobbied for a "back door" to be added to encryption, where trusted parts of the government can, say, decrypt anyone's phone. Apple/Google argue convincingly that any such backdoor will then be used by ISIS, Russia etc. etc. The current state in the US is that there is no back door.

Current Headline: UK Asks For Backdoor

UK Asks Apple For Backdoor

This shows the E2E vs. backdoor issue is very live!


History Note - Democracy vs. Authoritarian

Democracy was increasing 1945-2000, but now Authoritarianism seems to be on the rise. I suspect this is temporary, and Democracy will again increase. But who knows, perhaps this is my own wishful thinking? This will be an interesting arc of history that coincides with your adult life, see what happens.

Note that in China, North Korea, Iran ... WhatsApp is illegal in all these countries. Authoritarian governments do not like to extend privacy to their citizens. I think citizens flourish more in democracies, and that's where I want to live.


Big Picture - The Truth About Software

We'll start at the highest level, seeing the truisms that guide software building. There's software in everything, so you should know the lay of the land.

Goal #1 - Code That Computes the Correct Answer

The main thing we want from code. If code produces the wrong answer, do we really care how fast it runs?

Problem: Natural State of Code = Broken

"Broken" is the natural state of code. It's easy to type in some code and have it not work. We need a plan to work in this environment. Code can work very nicely, but we should keep in mind that it can even more easily fail to work.

Can You Judge Code By Looking at It?

Can you judge code correctness by looking at it? The surprising answer is - no. To really judge, you need to simulate what the loops and if-statements will do with various inputs. In effect, you need to run the code to see what it does.

How To Judge - Run Tests

We need to run the code against a few inputs, checking the output for each case. If the code works against a few cases, that suggests it is probably correct. It is not a 100% proof, which is surprisingly difficult or impossible to obtain, but tests are very good in practice.
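
As a minimal sketch of this idea, here is the brackets() function from a previous lecture run against a few inputs. The test_brackets() name and the particular inputs are assumptions chosen for illustration. Each assert checks one input against its expected output.

```python
def brackets(s):
    """Return the text between the first '[' and ']' in s, or '' if no '['."""
    left = s.find('[')
    if left == -1:
        return ''
    right = s.find(']')
    return s[left + 1:right]


def test_brackets():
    # A typical case, plus a few edge cases.
    assert brackets('x[abc]y') == 'abc'
    assert brackets('[a]') == 'a'
    assert brackets('[]') == ''
    assert brackets('no brackets') == ''


test_brackets()
print('all tests passed')
```

If any assert fails, Python stops with an AssertionError pointing at the failing line, which is the signal to go debug.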

Corollary: Code Not Run is Probably Buggy

Code that the computer has never run likely has bugs in it.

This can happen if an if-test is always false in a program. This happened with the AT&T phone network, where there was some code in the phone-switching system like this.

if rare_error_condition:
    code to
    route around     # un-noticed bug here
    error condition

The error-handling code within the if-statement had a simple bug in it, but those lines had never run, so nobody noticed. Until one day the if-statement was true and the code ran (for the first time) and crashed, taking out a part of the US phone system for a while.

Code tests can help with this. There are modern "code coverage" tools that look at all the tests, making sure that every line has been run in some test or other.
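
Here is a small sketch of how a bug can hide in lines that never run. The route_call() function and its typo are invented for this illustration and are not the actual phone-network code. The common path works fine, and the bug only appears when a test forces the rare path to run.

```python
def route_call(primary_ok, backups):
    """Hypothetical router: use the primary line, else the first backup."""
    if primary_ok:
        return 'primary'
    # Rare error path: this line has a typo ("backup" vs. "backups"),
    # but Python only notices when the line actually runs.
    return backup[0]


# The common case works, so everyday use never reveals the bug.
print(route_call(True, ['b1', 'b2']))

# A test that forces the rare path exposes the bug right away.
try:
    route_call(False, ['b1', 'b2'])
    rare_path_error = None
except NameError as err:
    rare_path_error = err
print('rare path raised:', rare_path_error)
```

A coverage tool would flag the last line of route_call() as never run by the first call alone, which is the hint to write the second test.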

Goal #2 - Clean Code

Clean code with good style. This helps reduce bugs in the first place, and it's easier to fix and add features to code that is already clean. Stanford has always put an emphasis on writing clean code with good style.

Goal #3 - Run Fast

If the code works correctly and looks good, we might also want to tune it to run fast or use less memory. For some bits of code, speed is crucial. However, the best strategy is generally getting the code working first before messing with it for maximum performance.


Program Design Strategy

Why is code written the way it is? Today we tell the outside, strategic story, driving what forms of code work best.

For lecture, we'll look over the three Python guide chapters on style, readability, and decomposition.

Python Guide: PEP8 Tactics (mostly did this one in an earlier lecture)


Python Guide: Readable Code - key points copied to these notes.

Readable-1 - Good Function Names

Good function names are the first step in readable code. Function names often use verbs indicating what calling the function will accomplish. Look at how the function names below make the surrounding code read nicely.

delete_files(files)


if is_url_sketchy(url):
    display_alert('That url looks sketchy!')
else:
    html = download_url(url)


s = remove_digits(s)


count = count_duplicates(coordinates)


canvas.draw_line(0, 0, 10, 10)

Boolean Functions: is_xxx() has_xxx()

If a function returns a boolean value, starting its name with is_ or has_ can be a good choice. Think about how the function call will read when used in an if or while:

if is_weak(password):
    ...
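
To see how such names read in practice, here is a sketch with two boolean functions. The specific rules inside is_weak() and the has_digit() helper are assumptions made up for this example, not a real password policy.

```python
def is_weak(password):
    """Return True if password looks weak (toy rule: short or all digits)."""
    return len(password) < 8 or password.isdigit()


def has_digit(s):
    """Return True if s contains at least one digit character."""
    return any(ch.isdigit() for ch in s)


for password in ['12345678', 'kittens', 'correct horse battery']:
    # The call reads almost like an English sentence.
    if is_weak(password):
        print(password, '-> weak')
    else:
        print(password, '-> ok')
```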

Function Name - Principle Of Least Surprise

is_url_sketchy(url)  # does what?

The Principle of Least Surprise is a convention for function names. When designing a function, e.g. is_url_sketchy(url), imagine that another programmer is writing code to call this function. Assume that all the other programmer knows is its name, since they don't bother to read the documentation. Therefore, the function should only take actions that one might expect given its name. So is_url_sketchy() should not, say, delete a bunch of files.

Readable-2 - Good Variable Names
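
A sketch of a function that follows this principle. The rules used here to decide what counts as "sketchy" are invented for illustration. The point is that the function only inspects its input and returns a boolean, with no downloads, no printing, and no file changes.

```python
def is_url_sketchy(url):
    """Return True if url looks sketchy (toy rules, for illustration only).

    Only looks at the string and returns True/False. No side effects.
    """
    lowered = url.lower()
    if not lowered.startswith('https://'):
        return True
    # An '@' in a url can disguise the real destination.
    return '@' in lowered


print(is_url_sketchy('http://example.com'))
print(is_url_sketchy('https://example.com'))
```

A caller can use this in an if-statement without worrying that anything else happens behind the scenes.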

The code in a function is a story, a narrative, and the variable and function names help you keep the parts of the story clear in your mind. A variable name provides a short label for a bit of data in the story.

Bugs - mix up two values. Many bugs result from the programmer mixing up two data values just in the two minutes they are working on those lines, resulting in a round of debugging.

brackets() Example

Previous lecture example - "left" is a fine variable name in there, labelling and distinguishing that value within the function. "x" or "i" would not be good choices.

def brackets(s):
    left = s.find('[')
    if left == -1:
        return ''
    right = s.find(']')
    return s[left + 1:right]

Too Long and Too Short Names

Here are some other possible names for left, exploring how long or short a variable name could be.

left                  # fine
left_index            # fine


int_index_of_left_paren   # too long
index_of_left_paren       # too long
# Don't need to spell out
# every detail in the name

a         # meaningless
li        # cryptic
l         # too short, and don't use "l"

Var Names For Similar/Related Values

Suppose the algorithm stored both the index and the character at that index - two values it would be very easy to mix up in the code. In that case, the variable names need added words to keep the two values straight:

left_index       # index of left char
left_ch          # char at that index

From the Sand homework, the x_from and x_to variables are good variable name examples. That code was difficult, but at least each variable was labeled as what it was. The code would have been more difficult if the four x/y variables were named a, b, c, d.
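
A sketch of code that holds both an index and the character at that index. The first_bracket() function is hypothetical, written only to show the left_index / left_ch naming keeping the two values straight.

```python
def first_bracket(s):
    """Return (left_index, left_ch) for the first '[' or '(' in s.

    Returns (-1, '') if s contains neither character.
    """
    for left_index in range(len(s)):
        left_ch = s[left_index]      # char at that index
        if left_ch in '[(':
            return left_index, left_ch
    return -1, ''


print(first_bracket('ab(c[d'))
print(first_bracket('abc'))
```

With names like a and b here, it would be easy to compare the index against '[' or slice with the character by mistake.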

brackets() - Bad Names a, b, c Example

Here is a version of brackets() with bad, meaningless names - a, b, c:

def brackets(a):
    c = a.find('[')
    if c == -1:
        return ''
    b = a.find(']')
    return a[b + 1:c]   # compare below

Good vs. Bad Vars Example

Looking at the last lines of the good and bad versions demonstrates the role of good variable names. Look at the last line of the bad names version below. Is that line correct?

# Bad names version
return a[b + 1:c]  # buggy?


# Good names version
return s[left + 1:right]

With a bad variable, you have to look upwards in the code to remind yourself what value it holds. That's the sign of bad variable naming! The name of the variable should tell the story right there, not scrolling up to remind yourself what it holds. Save yourself some time and give the variable a sensible name.
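
Running the two versions side by side settles the question. This sketch copies both versions from above, renamed brackets_good() and brackets_bad() (names made up here) so they can coexist in one file.

```python
def brackets_good(s):
    left = s.find('[')
    if left == -1:
        return ''
    right = s.find(']')
    return s[left + 1:right]


def brackets_bad(a):
    c = a.find('[')
    if c == -1:
        return ''
    b = a.find(']')
    return a[b + 1:c]   # b and c are swapped, hard to see with these names


print(repr(brackets_good('x[abc]y')))   # 'abc'
print(repr(brackets_bad('x[abc]y')))    # '' since the slice start is past its end
```

In the bad version, c holds the index of '[' and b the index of ']', so the slice runs backwards and comes out empty.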

Idiomatic Short Variable Names

Some situations are so common that there are standard, idiomatic short variable names for them.
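
As an illustration, here are a few short names that are widely seen in Python code. This particular selection is an assumption based on common convention, not a list taken from these notes, and count_char() is a hypothetical function.

```python
def count_char(s, target):
    """Return how many times the char target appears in the string s."""
    count = 0
    for i in range(len(s)):   # i: loop index
        ch = s[i]             # ch: a single character
        if ch == target:
            count += 1
    return count


n = count_char('banana', 'a')   # n: an integer count
print(n)
```

Here s, i, ch, and n are short, but each one is used in the standard role a reader would expect for it.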

Never name a variable lowercase L or O - these look too much like the digits 1 and 0.


Design strategy of decomposition - here we'll click over to the Python guide chapter for this topic.

Python Guide: Decomposition


Extra topic for fun if we have time.

Bits and Bytes

At the smallest scale in the computer, information is stored as bits and bytes. In this section, we'll look at how that works.

Bit

Byte

How Many Patterns With N Bits?

How many different patterns can be made with 1, 2, or 3 bits?

Number of bits    Different Patterns
1                 0 1
2                 00 01 10 11
3                 000 001 010 011 100 101 110 111

Number of bits    Number of Patterns
1                 2
2                 4
3                 8
4                 16
5                 32
6                 64
7                 128
8                 256
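
The counts above can be reproduced with a short sketch. The bit_patterns() helper is a name made up for this example. It uses itertools.product to generate every combination of '0' and '1' of length n.

```python
from itertools import product


def bit_patterns(n):
    """Return all patterns of n bits as strings, in counting order."""
    return [''.join(bits) for bits in product('01', repeat=n)]


for n in range(1, 4):
    patterns = bit_patterns(n)
    print(n, len(patterns), ' '.join(patterns))

# Each added bit doubles the count, so n bits give 2 ** n patterns.
for n in range(1, 9):
    assert len(bit_patterns(n)) == 2 ** n
```

The doubling happens because every existing pattern can be extended with either a 0 or a 1.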

One Byte - 256 Patterns
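
A byte is 8 bits, so it has 2 ** 8 = 256 patterns, commonly used to hold the numbers 0 through 255. A small sketch of that range using Python's built-in bytes type. The variable names are just for this example.

```python
patterns_in_byte = 2 ** 8
smallest = 0
largest = patterns_in_byte - 1

print(patterns_in_byte)            # 256
print(format(smallest, '08b'))     # 00000000
print(format(largest, '08b'))      # 11111111

one_byte = bytes([largest])        # 255 fits in a single byte

# 256 needs a 9th bit, so it does not fit.
try:
    bytes([256])
    too_big_error = None
except ValueError as err:
    too_big_error = err
print('256 in one byte:', too_big_error)
```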

"HDR" Image

Future Image Format: AVIF