NOTE: This website is out of date. It is the course web site from a past quarter, Fall 2023. If you are a current student taking the course, you should visit the current class web site at cs111.stanford.edu instead. Please be advised that course policies change with each new quarter and instructor, and any information on this out-of-date page may not apply to you.
Solutions
1. Viewing Logs
Q1: What is the inumber of the file we created, file.txt?
A1: 101
- we can see this from the following log entry, which adds a directory entry in the root directory for file.txt:
[offset 33562730]
* LSN 2945288249
LogPatch
blockno: 1026
offset_in_block: 32
bytes: 650066696c652e747874000000000000
dirent (101, "file.txt")
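As a quick check of that annotation, the patch bytes can be decoded by hand. The sketch below assumes the directory entry layout implied by the log entry: a 2-byte little-endian inumber followed by a 14-byte, NUL-padded name. The variable names are illustrative only.

```python
import struct

# Bytes copied from the LogPatch entry above.
raw = bytes.fromhex("650066696c652e747874000000000000")

# Assumed layout: 2-byte little-endian inumber, then a 14-byte
# NUL-padded file name (16 bytes in total).
inumber, padded_name = struct.unpack("<H14s", raw)
name = padded_name.rstrip(b"\x00").decode("ascii")

print(inumber, name)  # 101 file.txt
```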
Q2: What block number was allocated to store file.txt's payload data?
A2: 1027
- we can see this from the following 2 log entries, which mark the block as allocated and add it to the inode's list of block numbers:
[offset 33562816]
* LSN 2945288253
LogBlockAlloc
blockno: 1027
zero_on_replay: 0
[offset 33562832]
* LSN 2945288254
LogPatch
blockno: 8
offset_in_block: 136
bytes: 0304
inode #101 (i_addr[0] = block pointer 1027)
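The location of that second patch can be reproduced with a little arithmetic. This is a sketch under assumptions that are consistent with the log entries on this page but not stated in them: 512-byte blocks, 32-byte inodes, inode #1 at the start of block 2, and i_addr[0] at byte 8 of the inode. The constant and function names are hypothetical.

```python
BLOCK_SIZE = 512          # assumed block size
INODE_SIZE = 32           # assumed on-disk inode size
INODE_START_BLOCK = 2     # assumed: inode #1 sits at block 2, offset 0
I_ADDR_OFFSET = 8         # assumed offset of i_addr[0] inside an inode

def inode_location(inumber):
    """Return (block number, byte offset in block) of an inode."""
    index = inumber - 1                       # inumbers start at 1
    per_block = BLOCK_SIZE // INODE_SIZE      # 16 inodes per block
    return (INODE_START_BLOCK + index // per_block,
            (index % per_block) * INODE_SIZE)

block, offset = inode_location(101)
print(block, offset + I_ADDR_OFFSET)          # 8 136

# The patch bytes 0304 are a little-endian 16-bit block pointer.
pointer = int.from_bytes(bytes.fromhex("0304"), "little")
print(pointer)                                # 1027
```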
Q3: What is the inumber for file.txt? Does that line up with your findings from the log?
A3: It should! :) You should see something like the following from running v6:
101 -rw------- 1 0 0 14 Oct 10 2023 23:15:25 /file.txt
ino: 101
i_mode: 0100600
i_nlink: 1
i_uid: 0
i_gid: 0
size(): 14
i_addr[0]: 1027
i_addr[1]: 0
i_addr[2]: 0
i_addr[3]: 0
i_addr[4]: 0
i_addr[5]: 0
i_addr[6]: 0
i_addr[7]: 0
atime: Oct 10 2023 23:15:25
mtime: Oct 10 2023 23:15:25
Q4: What is the reported file size of file.txt? Does that line up with what you expect given the contents you stored there? (Note: echo adds an ending \n character to the text you ask it to print.)
A4: It's reported as 14 bytes: 13 for the text, plus 1 for the trailing newline that echo adds.
2. Log Replay
Q5: Is there anything there in the version of the filesystem from just after the crash?
A5: Nope :(
Q6: What do you predict the filesystem will look like after these log entries are replayed? What block(s) will no longer be in the free list?
A6: It should have a single file in the root directory called "a-file.txt", and block 4 should no longer be in the free list. Here's an annotated look at the log entries - note that each transaction is wrapped in a LogBegin and LogCommit:
[offset 3584]
* LSN 342534660
LogBegin
// update root dir size (perhaps to add new dirent)
[offset 3597]
* LSN 342534661
LogPatch
blockno: 2
offset_in_block: 5
bytes: 003000
inode #1 (i_size0, i_size1)
// update root dir modified time
[offset 3618]
* LSN 342534662
LogPatch
blockno: 2
offset_in_block: 28
bytes: 2665983c
inode #1 (i_mtime)
// make new inode 16 (perhaps for a-file.txt)
[offset 3640]
* LSN 342534663
LogPatch
blockno: 2
offset_in_block: 480
bytes: 8081010000000000000000000000000000000000000000002665983c2665983c
inode #16 (whole inode)
// add dirent to root dir for a-file.txt, inode 16
[offset 3690]
* LSN 342534664
LogPatch
blockno: 3
offset_in_block: 32
bytes: 1000612d66696c652e74787400000000
dirent (16, "a-file.txt")
// update root dir modified time
[offset 3724]
* LSN 342534665
LogPatch
blockno: 2
offset_in_block: 28
bytes: 2665983c
inode #1 (i_mtime)
[offset 3746]
* LSN 342534666
LogCommit
sequence: 342534660
[offset 3763]
* LSN 342534667
LogBegin
// allocate block 4 (perhaps for a-file.txt payload)
[offset 3776]
* LSN 342534668
LogBlockAlloc
blockno: 4
zero_on_replay: 0
// use block 4 as first block for a-file.txt payload
[offset 3792]
* LSN 342534669
LogPatch
blockno: 2
offset_in_block: 488
bytes: 0400
inode #16 (i_addr[0] = block pointer 4)
// update a-file.txt size
[offset 3812]
* LSN 342534670
LogPatch
blockno: 2
offset_in_block: 485
bytes: 001c00
inode #16 (i_size0, i_size1)
// update a-file.txt modified time
[offset 3833]
* LSN 342534671
LogPatch
blockno: 2
offset_in_block: 508
bytes: 2665983c
inode #16 (i_mtime)
[offset 3855]
* LSN 342534672
LogCommit
sequence: 342534667
[offset 3872]
* Exiting because: bad checksum
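To see why the annotations above lead to that prediction, here is a minimal sketch of what replaying a few of these LogPatch entries does to a disk image. It is not the course's replay code: the 512-byte block size, the tiny zeroed image, and the apply_patch helper are assumptions made for illustration, and the field offsets (size at byte 5, i_addr[0] at byte 8 of an inode) are inferred from the annotations.

```python
import struct

BLOCK_SIZE = 512  # assumed block size

def apply_patch(disk, blockno, offset_in_block, hex_bytes):
    """Overwrite bytes in the image, as a LogPatch entry describes."""
    data = bytes.fromhex(hex_bytes)
    start = blockno * BLOCK_SIZE + offset_in_block
    disk[start:start + len(data)] = data

disk = bytearray(8 * BLOCK_SIZE)  # tiny zeroed stand-in for the real image

# Four of the patches from the log above.
apply_patch(disk, 3, 32, "1000612d66696c652e74787400000000")  # dirent
apply_patch(disk, 2, 488, "0400")                             # i_addr[0]
apply_patch(disk, 2, 485, "001c00")                           # i_size0, i_size1
apply_patch(disk, 2, 508, "2665983c")                         # i_mtime

# Read the results back out of the patched image.
inumber, padded = struct.unpack_from("<H14s", disk, 3 * BLOCK_SIZE + 32)
name = padded.rstrip(b"\x00").decode("ascii")
inode16 = 2 * BLOCK_SIZE + 480
size0, size1 = struct.unpack_from("<BH", disk, inode16 + 5)
(addr0,) = struct.unpack_from("<H", disk, inode16 + 8)
size = (size0 << 16) | size1

print(inumber, name, size, addr0)  # 16 a-file.txt 28 4
```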
Q7: What was recovered by replaying the log? (note that in this case, the file you find has its intended file contents, but that data was not recovered by the log - it must have been written to disk already).
A7: "a-file.txt" was recovered! Woohoo!
3. valgrind and File Descriptors
Q8: Why does the read function need to be called in a loop?
A8: read may return fewer bytes than we request, so we must keep calling it until the total number of bytes read reaches the amount we want (or until it reports end of file or an error).
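The same looping pattern can be sketched in Python, whose os.read wraps the read system call and may likewise return fewer bytes than requested. The read_fully helper is a hypothetical name, and the pipe is only there to give the demo something to read.

```python
import os

def read_fully(fd, nbytes):
    """Call os.read repeatedly until nbytes arrive or end of file is hit."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked for
        if not chunk:                   # empty result means end of file
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Demo: the data is written in two pieces, so one read might not return all of it.
read_end, write_end = os.pipe()
os.write(write_end, b"hello, ")
os.write(write_end, b"world\n")
os.close(write_end)

data = read_fully(read_end, 13)
os.close(read_end)
print(data)  # b'hello, world\n'
```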
Q9: Which file descriptor was left open that should be closed? Add a line to the code to ensure all file descriptors we need to close are closed.
A9: File descriptor 3. We need to add close(sourceFD); to the end of main to ensure all file descriptors are properly closed.
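For comparison, here is a rough Python analogue of that fix (the section's own program is not shown here, so this is an illustration rather than its code). Closing the descriptor in a finally block ensures it is released on every path out of the function. The helper name and the temporary file are hypothetical.

```python
import os
import tempfile

def read_prefix(path, nbytes):
    """Open a file, read up to nbytes, and always close the descriptor."""
    source_fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(source_fd, nbytes), source_fd
    finally:
        os.close(source_fd)  # runs on success and on error

# Demo with a throwaway file.
tmp_fd, tmp_path = tempfile.mkstemp()
os.write(tmp_fd, b"some text\n")
os.close(tmp_fd)

prefix, old_fd = read_prefix(tmp_path, 4)
os.remove(tmp_path)

# Using a closed descriptor raises OSError, which confirms it was released.
try:
    os.fstat(old_fd)
    still_open = True
except OSError:
    still_open = False

print(prefix, still_open)  # b'some' False
```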
4. Where the Log Ends
Q10: The log space is reused and looped back over repeatedly, meaning the garbage data could appear as valid entries. How can we use the LSN to know that an entry is old?
A10: The LSN (log sequence number) is a consecutively assigned counter that is unique for each log entry. For that reason, if we find an entry whose LSN does not directly follow the previous entry's LSN, we know we are looking at old log entries or garbage data.
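A minimal sketch of that check: walk the LSNs in the order they appear in the log area and stop at the first one that is not exactly one more than its predecessor. The function name and the stale LSN values at the end of the list are made up for illustration.

```python
def valid_prefix(lsns):
    """Return the leading run of LSNs that each increase by exactly 1."""
    if not lsns:
        return []
    run = [lsns[0]]
    for lsn in lsns[1:]:
        if lsn != run[-1] + 1:  # gap or jump backwards: old or garbage data
            break
        run.append(lsn)
    return run

# Three LSNs from the log in Q6, followed by hypothetical stale entries
# left over from an earlier pass through the log space.
scanned = [342534670, 342534671, 342534672, 342530001, 342530002]
print(valid_prefix(scanned))  # [342534670, 342534671, 342534672]
```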
Q11: Sometimes, log entries may span sector boundaries (i.e. part of entry at the end of one sector, part at the start of another sector). What if a seemingly valid entry actually consists of the first half of a new log entry followed by the second half of an old one? How can storing the LSN in the footer help with this?
A11: In that situation the header comes from the new entry and the footer from the old one, so their LSNs will differ. Storing a second copy of the LSN in the footer lets us check that it matches the one in the header, and reject the entry if it doesn't.
Q12: There's still the possibility that the "right" log sequence number for the footer could appear in arbitrary payload data from an older log record. How can we use the checksum to help with this?
A12: Even if the LSN matches, we can compute the checksum of the header and payload ourselves and make sure it matches the checksum stored in the footer. If it doesn't, we know the entry is not valid.
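The two checks from A11 and A12 can be combined in a short sketch. The record layout here is hypothetical, not the real on-disk format: a 4-byte LSN header, the payload, then a footer holding a 4-byte checksum and a second copy of the LSN. zlib.crc32 stands in for whatever checksum the real log uses.

```python
import struct
import zlib

def make_entry(lsn, payload):
    """Build a record in the assumed layout: LSN | payload | checksum, LSN."""
    header = struct.pack("<I", lsn)
    checksum = zlib.crc32(header + payload)
    return header + payload + struct.pack("<II", checksum, lsn)

def is_valid(entry):
    """Accept an entry only if both LSN copies and the checksum agree."""
    if len(entry) < 12:
        return False
    (header_lsn,) = struct.unpack_from("<I", entry, 0)
    checksum, footer_lsn = struct.unpack_from("<II", entry, len(entry) - 8)
    if header_lsn != footer_lsn:               # torn entry (the A11 check)
        return False
    return zlib.crc32(entry[:-8]) == checksum  # corrupted bytes (the A12 check)

good = make_entry(342534669, bytes.fromhex("0400"))
old = make_entry(342534001, bytes.fromhex("0900"))

torn = good[:5] + old[5:]                    # new first half, old second half
forged = good[:4] + b"\xff\xff" + good[6:]   # LSNs match, payload does not

print(is_valid(good), is_valid(torn), is_valid(forged))  # True False False
```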
Q13: Multiple log entries may be grouped together as part of a transaction. How might we know that a transaction wasn't fully completed and shouldn't be replayed?
A13: If a transaction doesn't have a matching LogCommit entry at the end whose sequence field holds the LSN of the transaction's LogBegin entry, we know that it wasn't fully written to disk and therefore we shouldn't replay any of its entries (since it is supposed to be done atomically - either in its entirety or not at all).
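As a rough sketch of that rule, the function below keeps only the entries that sit between a LogBegin and a LogCommit whose sequence equals the LogBegin's LSN. The dictionary representation and the function name are assumptions. The first transaction reuses LSNs from the log in Q6, and the second, uncommitted one is made up.

```python
def replayable(entries):
    """Return the entries that belong to fully committed transactions."""
    result = []
    begin_lsn = None
    pending = None  # entries seen since the most recent LogBegin
    for entry in entries:
        if entry["type"] == "LogBegin":
            begin_lsn, pending = entry["lsn"], []
        elif entry["type"] == "LogCommit":
            if pending is not None and entry["sequence"] == begin_lsn:
                result.extend(pending)  # commit matches: safe to replay
            pending = None
        elif pending is not None:
            pending.append(entry)
    return result  # anything still pending never committed, so it is dropped

log = [
    {"lsn": 342534667, "type": "LogBegin"},
    {"lsn": 342534668, "type": "LogBlockAlloc"},
    {"lsn": 342534669, "type": "LogPatch"},
    {"lsn": 342534672, "type": "LogCommit", "sequence": 342534667},
    {"lsn": 342534673, "type": "LogBegin"},   # hypothetical: crash before commit
    {"lsn": 342534674, "type": "LogPatch"},
]
replay_lsns = [entry["lsn"] for entry in replayable(log)]
print(replay_lsns)  # [342534668, 342534669]
```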
Checkoff Questions
The checkoff questions are all pulled from questions in the actual section handout - see above for answers!