Section 2 Solutions

NOTE: this website is out of date. This is the course web site from a past quarter, Fall 2023. If you are a current student taking the course, you should visit the current class web site instead. If the current website is not yet visible by going to cs111.stanford.edu, it may be accessible by visiting this link until the new page is mounted at this address. Please be advised that courses' policies change with each new quarter and instructor, and any information on this out-of-date page may not apply to you.

Solutions

1. Viewing Logs

Q1: What is the inumber of the file we created, file.txt?

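A1: 101 - we can see this from the following log entry, which adds a directory entry in the root directory for file.txt. The first two patch bytes, 6500, are the inumber in little-endian order (0x0065 = 101), and the remaining bytes are the name: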

[offset 33562730]
* LSN 2945288249
  LogPatch
    blockno: 1026
    offset_in_block: 32
    bytes: 650066696c652e747874000000000000
  dirent (101, "file.txt")
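
The 16 patch bytes can also be decoded by hand. The sketch below assumes the layout implied by the tool's annotation: a 2-byte little-endian inumber followed by a 14-byte, NUL-padded name. `hexToBytes` and `decodeDirent` are hypothetical helper names for illustration, not part of the course tools.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Convert a hex string such as "6500" into raw bytes {0x65, 0x00}.
std::vector<uint8_t> hexToBytes(const std::string& hex) {
    assert(hex.size() % 2 == 0);  // two hex digits per byte
    std::vector<uint8_t> bytes;
    for (size_t i = 0; i < hex.size(); i += 2) {
        bytes.push_back(static_cast<uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));
    }
    return bytes;
}

struct Dirent {
    uint16_t inumber;
    std::string name;
};

// Assumed layout: bytes 0-1 = inumber (little-endian), bytes 2-15 = name.
Dirent decodeDirent(const std::vector<uint8_t>& raw) {
    assert(raw.size() == 16);
    Dirent d;
    d.inumber = static_cast<uint16_t>(raw[0] | (raw[1] << 8));
    // The name is padded with zero bytes; stop at the first one.
    for (size_t i = 2; i < 16 && raw[i] != 0; i++) {
        d.name.push_back(static_cast<char>(raw[i]));
    }
    return d;
}
```

Calling `decodeDirent(hexToBytes("650066696c652e747874000000000000"))` yields inumber 101 and the name file.txt, matching the annotation above.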

Q2: What block number was allocated to store file.txt's payload data?

A2: 1027 - we can see this from the following two log entries, which mark the block as allocated and add it to the inode's list of block numbers (the patch bytes 0304 are 0x0403 = 1027 in little-endian order):

[offset 33562816]
* LSN 2945288253
  LogBlockAlloc
    blockno: 1027
    zero_on_replay: 0

[offset 33562832]
* LSN 2945288254
  LogPatch
    blockno: 8
    offset_in_block: 136
    bytes: 0304
  inode #101 (i_addr[0] = block pointer 1027)
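
The block number and offset in this patch can be checked with a little arithmetic. This is a sketch under assumed layout constants (512-byte blocks, 32-byte inodes, inode #1 at the start of block 2, `i_addr[0]` at byte 8 of the inode); they agree with the log entries in this handout but are not taken from the course code. The names `locateInode` and `le16` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout constants (consistent with the logs shown, not from the course code).
constexpr int kInodeStartBlock = 2;   // inode #1 sits at the start of block 2
constexpr int kInodeSize = 32;        // bytes per on-disk inode
constexpr int kInodesPerBlock = 16;   // 512-byte block / 32-byte inode
constexpr int kAddrOffset = 8;        // i_addr[0] begins 8 bytes into the inode

struct InodeLocation {
    int blockno;
    int offsetInBlock;
};

// Where does inode `inumber` (numbered from 1) start on disk?
InodeLocation locateInode(int inumber) {
    assert(inumber >= 1);
    int index = inumber - 1;
    return {kInodeStartBlock + index / kInodesPerBlock,
            (index % kInodesPerBlock) * kInodeSize};
}

// Interpret two bytes as a little-endian 16-bit value.
uint16_t le16(uint8_t low, uint8_t high) {
    return static_cast<uint16_t>(low | (high << 8));
}
```

For inode 101 this gives block 8 and a starting offset of 128, so `i_addr[0]` lands at offset 128 + 8 = 136, and `le16(0x03, 0x04)` is 1027.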

Q3: What is the inumber for file.txt? Does that line up with your findings from the log?

A3: It should! :) You should see something like the following from running v6:

101 -rw-------   1   0   0       14 Oct 10 2023 23:15:25  /file.txt
      ino: 101
      i_mode: 0100600
      i_nlink: 1
      i_uid: 0
      i_gid: 0
      size(): 14
      i_addr[0]: 1027
      i_addr[1]: 0
      i_addr[2]: 0
      i_addr[3]: 0
      i_addr[4]: 0
      i_addr[5]: 0
      i_addr[6]: 0
      i_addr[7]: 0
      atime: Oct 10 2023 23:15:25
      mtime: Oct 10 2023 23:15:25

Q4: What is the reported file size of file.txt? Does that line up with what you expect given the contents you stored there? (note: echo adds an ending \n character to the text you ask it to print)

A4: It's reported as 14 bytes - 13 for the text, plus 1 for the trailing newline echo adds.

2. Log Replay

Q5: Is there anything there in the version of the filesystem from just after the crash?

A5: Nope :(

Q6: What do you predict the filesystem will look like after these log entries are replayed? What block(s) will no longer be in the free list?

A6: It should have a single file in the root directory called "a-file.txt", and block 4 should no longer be in the free list. Here's an annotated look at the log entries - note that each transaction is wrapped in a LogBegin and LogCommit:

[offset 3584]
* LSN 342534660
  LogBegin

// update root dir size (perhaps to add new dirent)
[offset 3597]
* LSN 342534661
  LogPatch
    blockno: 2
    offset_in_block: 5
    bytes: 003000
  inode #1 (i_size0, i_size1)

// update root dir modified time
[offset 3618]
* LSN 342534662
  LogPatch
    blockno: 2
    offset_in_block: 28
    bytes: 2665983c
  inode #1 (i_mtime)

// make new inode 16 (perhaps for a-file.txt)
[offset 3640]
* LSN 342534663
  LogPatch
    blockno: 2
    offset_in_block: 480
    bytes: 8081010000000000000000000000000000000000000000002665983c2665983c
  inode #16 (whole inode)

// add dirent to root dir for a-file.txt, inode 16
[offset 3690]
* LSN 342534664
  LogPatch
    blockno: 3
    offset_in_block: 32
    bytes: 1000612d66696c652e74787400000000
  dirent (16, "a-file.txt")

// update root dir modified time
[offset 3724]
* LSN 342534665
  LogPatch
    blockno: 2
    offset_in_block: 28
    bytes: 2665983c
  inode #1 (i_mtime)

[offset 3746]
* LSN 342534666
  LogCommit
    sequence: 342534660

[offset 3763]
* LSN 342534667
  LogBegin

// allocate block 4 (perhaps for a-file.txt payload)
[offset 3776]
* LSN 342534668
  LogBlockAlloc
    blockno: 4
    zero_on_replay: 0

// use block 4 as first block for a-file.txt payload
[offset 3792]
* LSN 342534669
  LogPatch
    blockno: 2
    offset_in_block: 488
    bytes: 0400
  inode #16 (i_addr[0] = block pointer 4)

// update a-file.txt size
[offset 3812]
* LSN 342534670
  LogPatch
    blockno: 2
    offset_in_block: 485
    bytes: 001c00
  inode #16 (i_size0, i_size1)

// update a-file.txt modified time
[offset 3833]
* LSN 342534671
  LogPatch
    blockno: 2
    offset_in_block: 508
    bytes: 2665983c
  inode #16 (i_mtime)

[offset 3855]
* LSN 342534672
  LogCommit
    sequence: 342534667

[offset 3872]
* Exiting because: bad checksum
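
The two size patches in this log can be decoded the same way. The sketch assumes the 3-byte field is `i_size0` (the high 8 bits of the size) followed by `i_size1` (the low 16 bits, little-endian); `decodeSize` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the 3-byte size field patched in the log:
// byte 0 = i_size0 (high 8 bits), bytes 1-2 = i_size1 (low 16 bits, little-endian).
uint32_t decodeSize(uint8_t size0, uint8_t size1Low, uint8_t size1High) {
    uint32_t low16 = static_cast<uint32_t>(size1Low) |
                     (static_cast<uint32_t>(size1High) << 8);
    return (static_cast<uint32_t>(size0) << 16) | low16;
}
```

Under that assumption, the root directory patch 003000 decodes to 48 bytes, which is room for three 16-byte directory entries, and the a-file.txt patch 001c00 decodes to 28 bytes.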

Q7: What was recovered by replaying the log? (note that in this case, the file you find has its intended file contents, but that data was not recovered by the log - it must have been written to disk already).

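A7: "a-file.txt" was recovered! Woohoo! Replaying the log restored its metadata: the directory entry, the inode, and the allocation of block 4.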

3. `valgrind` and File Descriptors

Q8: Why does the read function need to be called in a loop?

A8: A single call to read may return fewer bytes than we requested, so we must keep calling it until the total number of bytes read reaches the amount we want (or until it reports end of file or an error).
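
A minimal sketch of such a loop, using the POSIX `read` call. The helper name `readFully` is hypothetical and is not the code from the section handout.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Keep calling read() until `count` bytes have arrived, end of file is hit,
// or an error occurs. Returns the number of bytes actually read, or -1 on error.
ssize_t readFully(int fd, char* buffer, size_t count) {
    size_t total = 0;
    while (total < count) {
        ssize_t n = read(fd, buffer + total, count - total);
        if (n == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        if (n == 0) break;  // end of file: nothing more will arrive
        total += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

Each iteration asks only for the bytes still missing and writes them just past the bytes already received, so a short read simply leads to another pass through the loop.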

Q9: Which file descriptor was left open that should be closed? Add a line to the code to ensure all file descriptors we need to close are closed.

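A9: File descriptor 3 was left open. Descriptors 0, 1, and 2 are standard input, standard output, and standard error, so 3 is the first descriptor the program opens itself. We need to add close(sourceFD); to the end of main to ensure all file descriptors are properly closed.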
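
The handout's fix is the single call close(sourceFD);. As an aside, one way to make a forgotten close less likely in C++ is a small scope guard that closes the descriptor automatically. `FdCloser` and `isOpen` below are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Closes the wrapped descriptor when the guard goes out of scope.
class FdCloser {
public:
    explicit FdCloser(int fd) : fd_(fd) {}
    ~FdCloser() {
        if (fd_ >= 0) close(fd_);
    }
    FdCloser(const FdCloser&) = delete;             // copying would double-close
    FdCloser& operator=(const FdCloser&) = delete;
    int get() const { return fd_; }
private:
    int fd_;
};

// Returns true if `fd` currently refers to an open file descriptor.
bool isOpen(int fd) {
    return fcntl(fd, F_GETFD) != -1;
}
```

A program would wrap the result of open in an `FdCloser` right away; the descriptor is then released on every path out of the enclosing scope, including early returns.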

4. Where the Log Ends

Q10: The log space is reused in a circular fashion, so stale data left over from an earlier pass could appear to be valid entries. How can we use the LSN (log sequence number) to know that an entry is old?

A10: The LSN is a consecutively assigned counter that is unique to each log entry. If the next entry's LSN is not exactly one more than the previous entry's, we know we are looking at old log entries or garbage data.
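
The consecutive-LSN check can be sketched as below. It assumes LSNs fit in 32 bits (as the values in this handout do) and that the first LSN in the list is already known to be the expected one; `validPrefixLength` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many entries at the front of `lsns` form an unbroken run,
// i.e. each LSN is exactly one more than the previous. Everything after
// that point is treated as stale or garbage.
size_t validPrefixLength(const std::vector<uint32_t>& lsns) {
    if (lsns.empty()) return 0;
    size_t count = 1;
    while (count < lsns.size() && lsns[count] == lsns[count - 1] + 1) {
        count++;
    }
    return count;
}
```

For example, given the LSNs 342534670, 342534671, 342534672 followed by a leftover 342534001, only the first three entries are accepted.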

Q11: Sometimes, log entries may span sector boundaries (i.e., part of the entry is at the end of one sector and the rest is at the start of the next). What if a seemingly valid entry actually consists of the first half of a new log entry followed by the second half of an old one? How can storing the LSN in the footer help with this?

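A11: Each entry stores its LSN twice, once in the header and once in the footer. If an entry spanning a sector boundary was only partially written, its header belongs to the new entry while its footer belongs to an old one, so the two LSNs will not match and we can reject the entry.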

Q12: There's still the possibility that the "right" log sequence number for the footer could appear in arbitrary payload data from an older log record. How can we use the checksum to help with this?

A12: Even if the LSN matches, we can recompute the checksum over the header and payload ourselves and compare it with the checksum stored in the footer. If they differ, we know the entry is not valid.
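
The two checks from A11 and A12 can be combined as in the sketch below. The `RawEntry` struct is a hypothetical in-memory view, and `checksumOf` is a stand-in (FNV-1a) chosen only for illustration; the actual log format and checksum used by the course tools may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical in-memory view of one log entry; the real on-disk format may differ.
struct RawEntry {
    uint32_t headerLsn;
    std::vector<uint8_t> payload;
    uint32_t footerLsn;
    uint32_t footerChecksum;
};

// Stand-in checksum (FNV-1a over the header LSN bytes and the payload).
// This is NOT the course's checksum; it only illustrates the idea.
uint32_t checksumOf(uint32_t headerLsn, const std::vector<uint8_t>& payload) {
    uint32_t hash = 2166136261u;
    for (int shift = 0; shift < 32; shift += 8) {
        hash = (hash ^ ((headerLsn >> shift) & 0xffu)) * 16777619u;
    }
    for (uint8_t byte : payload) {
        hash = (hash ^ byte) * 16777619u;
    }
    return hash;
}

// An entry is accepted only if both copies of the LSN agree AND the
// recomputed checksum matches the one stored in the footer.
bool isValidEntry(const RawEntry& e) {
    if (e.headerLsn != e.footerLsn) return false;
    return checksumOf(e.headerLsn, e.payload) == e.footerChecksum;
}
```

The LSN comparison is the cheap first filter; the checksum catches the rarer case where old payload bytes happen to contain the right LSN value.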

Q13: Multiple log entries may be grouped together as part of a transaction. How might we know that a transaction wasn't fully completed and shouldn't be replayed?

A13: If a transaction doesn't end with a LogCommit entry whose sequence field matches the LSN of the transaction's LogBegin, we know that it wasn't fully written to disk and therefore we shouldn't replay any of its entries (since a transaction is supposed to be applied atomically - either in its entirety or not at all).
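
A sketch of that rule, assuming (as in the logs shown earlier) that a LogCommit carries a sequence field naming the LSN of its LogBegin. The `Entry` struct and `replayableLsns` are hypothetical simplifications, not the course's replay code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class EntryType { Begin, Patch, BlockAlloc, Commit };

// Hypothetical simplified entry: `sequence` is only meaningful for Commit.
struct Entry {
    uint32_t lsn;
    EntryType type;
    uint32_t sequence;
};

// Returns the LSNs of the entries that are safe to replay: those inside a
// transaction whose LogCommit names the LSN of that transaction's LogBegin.
std::vector<uint32_t> replayableLsns(const std::vector<Entry>& log) {
    std::vector<uint32_t> result;
    std::vector<uint32_t> pending;  // entries of the transaction in progress
    bool inTransaction = false;
    uint32_t beginLsn = 0;
    for (const Entry& e : log) {
        if (e.type == EntryType::Begin) {
            inTransaction = true;
            beginLsn = e.lsn;
            pending.clear();
        } else if (e.type == EntryType::Commit) {
            if (inTransaction && e.sequence == beginLsn) {
                result.insert(result.end(), pending.begin(), pending.end());
            }
            inTransaction = false;
            pending.clear();
        } else if (inTransaction) {
            pending.push_back(e.lsn);
        }
    }
    return result;  // anything still pending never committed, so it is dropped
}
```

Entries are buffered until their commit is seen, which is what makes the replay all-or-nothing: a transaction cut off by a crash contributes nothing to the result.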

Checkoff Questions

The checkoff questions are all pulled from questions in the actual section handout - see above for answers!