Sections Thu Oct 13 to Fri Oct 14
Solutions
1. Viewing Logs
Q1: What is the inumber of the file we created, file.txt?
A1: 101 - we can see this from the following log entry, which adds a directory entry for file.txt to the root directory (the first two bytes, 6500, are the inumber in little-endian order: 0x0065 = 101):
[offset 33562730]
* LSN 1838326412
LogPatch
blockno: 1026
offset_in_block: 32
bytes: 650066696c652e747874000000000000
dirent (101, "file.txt")
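The bytes field of this entry can also be decoded programmatically. Below is a minimal Python sketch, assuming a 16-byte directory entry made of a 2-byte little-endian inumber followed by a 14-byte NUL-padded name (the layout the bytes above are consistent with). The helper name decode_dirent is made up for illustration and is not part of the v6 tool.

```python
import struct

def decode_dirent(hex_bytes):
    """Decode a 16-byte dirent: 2-byte little-endian inumber + 14-byte name."""
    raw = bytes.fromhex(hex_bytes)
    inumber, name = struct.unpack("<H14s", raw)
    # The name is padded with NUL bytes up to 14 characters.
    return inumber, name.rstrip(b"\x00").decode("ascii")

print(decode_dirent("650066696c652e747874000000000000"))  # (101, 'file.txt')
```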
Q2: What block number was allocated to store file.txt's payload data?
A2: 1027 - we can see this from the following two log entries, which mark the block as allocated and add it to the inode's list of block numbers:
[offset 33562846]
* LSN 1838326418
LogBlockAlloc
blockno: 1027
zero_on_replay: 0
[offset 33562862]
* LSN 1838326419
LogPatch
blockno: 8
offset_in_block: 136
bytes: 0304
inode #101 (i_addr[0] = block pointer 1027)
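The blockno and offset_in_block in the second entry follow from where inode #101 lives on disk. Here is a rough Python sketch of that arithmetic. The constants are assumptions (512-byte blocks, 32-byte inodes, inode table starting at block 2, i_addr starting 8 bytes into the inode) chosen because they agree with the offsets shown in the log entries in this handout. The function name i_addr_location is hypothetical.

```python
INODE_START_BLOCK = 2   # assumed: inode table begins at block 2
INODE_SIZE = 32         # assumed: bytes per on-disk inode
BLOCK_SIZE = 512        # assumed: bytes per block
I_ADDR_OFFSET = 8       # assumed: byte offset of i_addr[0] inside an inode

def i_addr_location(inumber, index=0):
    """Return (blockno, offset_in_block) of i_addr[index] for an inode."""
    inodes_per_block = BLOCK_SIZE // INODE_SIZE
    slot = inumber - 1  # inumbers start at 1
    blockno = INODE_START_BLOCK + slot // inodes_per_block
    offset = (slot % inodes_per_block) * INODE_SIZE + I_ADDR_OFFSET + 2 * index
    return blockno, offset

print(i_addr_location(101))                              # (8, 136)
print(int.from_bytes(bytes.fromhex("0304"), "little"))   # 1027
```

The patched bytes 0304 are the 2-byte block pointer stored in little-endian order, which is why they read as 1027.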
Q3: What is the inumber for file.txt? Does that line up with your findings from the log?
A3: It should! :) You should see something like the following from running v6:
101 -rw------- 1 0 0 14 Oct 12 2022 15:39:44 /file.txt
ino: 101
i_mode: 0100600
i_nlink: 1
i_uid: 0
i_gid: 0
size(): 14
i_addr[0]: 1027
i_addr[1]: 0
i_addr[2]: 0
i_addr[3]: 0
i_addr[4]: 0
i_addr[5]: 0
i_addr[6]: 0
i_addr[7]: 0
atime: Oct 12 2022 15:39:44
mtime: Oct 12 2022 15:39:44
Q4: What is the reported file size of file.txt? Does that line up with what you expect given the contents you stored there? (note: echo adds an ending \n character to the text you ask it to print)
A4: It's reported as 14 bytes - 13 for the text, plus 1 for the trailing newline echo adds.
2. Log Replay
Q5: Is there anything there in the version of the filesystem from just after the crash?
A5: Nope :(
Q6: What do you predict the filesystem will look like after these log entries are replayed? What block(s) will no longer be in the free list?
A6: It should have a single file in the root directory called "a-file.txt", and block 4 should no longer be in the free list. Here's an annotated look at the log entries - note that each transaction is wrapped in a LogBegin and LogCommit:
[offset 3584]
* LSN 3691737586
LogBegin
// update root dir size (perhaps to add new dirent)
[offset 3597]
* LSN 3691737587
LogPatch
blockno: 2
offset_in_block: 5
bytes: 003000
inode #1 (i_size0, i_size1)
// update root dir modified time
[offset 3618]
* LSN 3691737588
LogPatch
blockno: 2
offset_in_block: 28
bytes: 4763e04d
inode #1 (i_mtime)
// make new inode 16 (perhaps for a-file.txt)
[offset 3640]
* LSN 3691737589
LogPatch
blockno: 2
offset_in_block: 480
bytes: 8081010000000000000000000000000000000000000000004763e04d4763e04d
inode #16 (whole inode)
// add dirent to root dir for a-file.txt, inode 16
[offset 3690]
* LSN 3691737590
LogPatch
blockno: 3
offset_in_block: 32
bytes: 1000612d66696c652e74787400000000
dirent (16, "a-file.txt")
// update root dir modified time
[offset 3724]
* LSN 3691737591
LogPatch
blockno: 2
offset_in_block: 28
bytes: 4763e04d
inode #1 (i_mtime)
[offset 3746]
* LSN 3691737592
LogCommit
sequence: 3691737586
[offset 3763]
* LSN 3691737593
LogBegin
// allocate block 4 (perhaps for a-file.txt payload)
[offset 3776]
* LSN 3691737594
LogBlockAlloc
blockno: 4
zero_on_replay: 0
// use block 4 as first block for a-file.txt payload
[offset 3792]
* LSN 3691737595
LogPatch
blockno: 2
offset_in_block: 488
bytes: 0400
inode #16 (i_addr[0] = block pointer 4)
// update a-file.txt size
[offset 3812]
* LSN 3691737596
LogPatch
blockno: 2
offset_in_block: 485
bytes: 001c00
inode #16 (i_size0, i_size1)
// update a-file.txt modified time
[offset 3833]
* LSN 3691737597
LogPatch
blockno: 2
offset_in_block: 508
bytes: 4763e04d
inode #16 (i_mtime)
[offset 3855]
* LSN 3691737598
LogCommit
sequence: 3691737593
[offset 3872]
* Exiting because: bad checksum
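To sanity-check the prediction, some of the patches above can be applied to an in-memory stand-in for the disk. This is only a sketch under simplifying assumptions: it starts from zeroed 512-byte blocks rather than the real crashed image, it leaves out the whole-inode and mtime patches for brevity, and the names replay and file_size are made up for illustration.

```python
import struct
from collections import defaultdict

BLOCK_SIZE = 512  # assumed block size

# (blockno, offset_in_block, hex bytes), copied from LogPatch entries above
patches = [
    (2, 5,   "003000"),                            # inode #1 i_size0, i_size1
    (3, 32,  "1000612d66696c652e74787400000000"),  # dirent for a-file.txt
    (2, 488, "0400"),                              # inode #16 i_addr[0]
    (2, 485, "001c00"),                            # inode #16 i_size0, i_size1
]

def replay(patches):
    """Apply each patch, in log order, to an initially zeroed set of blocks."""
    blocks = defaultdict(lambda: bytearray(BLOCK_SIZE))
    for blockno, offset, hex_bytes in patches:
        data = bytes.fromhex(hex_bytes)
        blocks[blockno][offset:offset + len(data)] = data
    return blocks

def file_size(block, inode_offset):
    """Combine i_size0 (high byte) and i_size1 (low 16 bits) into a size."""
    size0, size1 = struct.unpack_from("<BH", block, inode_offset + 5)
    return (size0 << 16) | size1

blocks = replay(patches)
print(file_size(blocks[2], 0))                       # root dir size: 48
print(file_size(blocks[2], 480))                     # a-file.txt size: 28
print(struct.unpack_from("<H", blocks[2], 488)[0])   # i_addr[0]: 4
```

A root directory size of 48 bytes corresponds to three 16-byte entries, which fits the new dirent being written at offset 32 of block 3.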
Q7: What was recovered by replaying the log?
A7: "a-file.txt" was recovered! Woohoo! (note that the file has its intended file contents, but that data was not recovered by the log - it must have been written to disk already).
3. Where the Log Ends
Q8: The log space is reused and looped back over repeatedly, meaning garbage data could appear as valid entries. How can we use the LSN to know that an entry is old?
A8: The LSN (log sequence number) is a consecutively assigned counter that is unique to each log entry. So if an entry's LSN is not exactly one more than the LSN of the entry before it, we know we have run into old log entries or garbage data.
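The consecutive-LSN rule can be sketched in a few lines of Python. This is an illustration only, not the actual replay code: valid_prefix is a hypothetical helper, it ignores counter wrap-around, and the stale LSN in the example is simply one borrowed from the earlier log in this handout.

```python
def valid_prefix(lsns):
    """Return the leading run of LSNs where each is one more than the previous."""
    if not lsns:
        return []
    run = [lsns[0]]
    for lsn in lsns[1:]:
        if lsn != run[-1] + 1:
            break  # stale entry or garbage: the live log ends here
        run.append(lsn)
    return run

# Two live entries followed by a leftover entry from an earlier pass over the log.
print(valid_prefix([3691737597, 3691737598, 1838326412]))
```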
Q9: Sometimes, log entries may span sector boundaries (i.e. part of the entry at the end of one sector, part at the start of another sector). What if a seemingly valid entry actually consists of the first half of a new log entry followed by the second half of an old one? How can storing the LSN in the footer help with this?
A9: Storing a second copy of the LSN in the footer lets us detect this case: we check that the LSN in the footer matches the one in the header. If an entry's first half is new and its second half is old, the two LSNs will differ and we reject the entry.
Q10: There's still the possibility that the "right" log sequence number for the footer could appear in arbitrary payload data from an older log record. How can we use the checksum to help with this?
A10: Even if the LSN matches, we can compute the checksum of the header and payload ourselves and make sure it matches the checksum stored in the footer. If it doesn't, we know the entry is not valid.
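The two checks from A9 and A10 can be combined into a single validity test. The Python sketch below uses an invented entry layout (4-byte LSN header, payload, then a footer holding a 4-byte checksum and a second 4-byte LSN) and uses zlib.crc32 as a stand-in checksum. The real on-disk log format and checksum function may well differ, and make_entry and entry_is_valid are hypothetical names.

```python
import struct
import zlib

def make_entry(lsn, payload):
    """Build an entry in the assumed layout: LSN | payload | checksum | LSN."""
    header = struct.pack("<I", lsn)
    checksum = zlib.crc32(header + payload)  # stand-in for the real checksum
    return header + payload + struct.pack("<II", checksum, lsn)

def entry_is_valid(entry):
    """Check the footer LSN first, then the checksum over header + payload."""
    if len(entry) < 12:
        return False
    header, payload, footer = entry[:4], entry[4:-8], entry[-8:]
    (header_lsn,) = struct.unpack("<I", header)
    checksum, footer_lsn = struct.unpack("<II", footer)
    if header_lsn != footer_lsn:
        return False  # halves come from different entries
    return zlib.crc32(header + payload) == checksum

new = make_entry(100, b"abcdefgh")
old = make_entry(7, b"ijklmnop")
spliced = new[:8] + old[8:]      # first half new, second half old
print(entry_is_valid(new), entry_is_valid(spliced))
```

Checking the LSNs first is cheap and catches the spliced case. The checksum then covers the rarer case where old payload bytes happen to contain the right LSN.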
Q11: Multiple log entries may be grouped together as part of a transaction. How might we know that a transaction wasn't fully completed and shouldn't be replayed?
A11: If a transaction doesn't end with a matching LogCommit entry (one whose sequence field equals the LSN of the transaction's LogBegin entry), we know that it wasn't fully written to disk, so we shouldn't replay any of its entries. A transaction is supposed to be applied atomically - either in its entirety or not at all.
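As a rough illustration of this rule, the Python sketch below keeps only the entries of transactions that are closed by a LogCommit whose sequence field equals the LSN of the opening LogBegin. The dictionary representation of log entries and the name committed_entries are assumptions made for this example. The LSNs are taken from the log in part 2, with the second transaction's tail cut off to simulate a crash.

```python
def committed_entries(entries):
    """Return the entries that belong to fully committed transactions."""
    result = []
    pending = None     # entries of the currently open transaction, if any
    begin_lsn = None   # LSN of the LogBegin that opened it
    for entry in entries:
        kind = entry["type"]
        if kind == "LogBegin":
            pending, begin_lsn = [], entry["lsn"]
        elif kind == "LogCommit":
            if pending is not None and entry["sequence"] == begin_lsn:
                result.extend(pending)
            pending, begin_lsn = None, None
        elif pending is not None:
            pending.append(entry)
    return result  # anything still in `pending` was never committed

log = [
    {"lsn": 3691737586, "type": "LogBegin"},
    {"lsn": 3691737587, "type": "LogPatch"},
    {"lsn": 3691737592, "type": "LogCommit", "sequence": 3691737586},
    {"lsn": 3691737593, "type": "LogBegin"},
    {"lsn": 3691737594, "type": "LogBlockAlloc"},  # no LogCommit follows
]
print([e["lsn"] for e in committed_entries(log)])  # [3691737587]
```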
Checkoff Questions
The checkoff questions are all pulled from questions in the actual section handout - see above for answers!