Writing Comments

Lecture Notes for CS 190
Spring 2016
John Ousterhout

  • Why are comments needed?
    • Code alone can't represent cleanly all the information in the mind of the designer
    • Even if information could be deduced from code, doing so might be time-consuming
    • Comments provide clarity that reduces complexity:
      • E.g., make abstractions clearer
  • Comments are still controversial (!)
    • A significant fraction of all commercial code (50%?) is uncommented
    • Excuses:
      • "This code is self-documenting"
      • "Comments get out of date and become misleading"
      • "I don't have time to write comments"
      • "The comments I have seen are worthless; why bother?"
  • Comments should describe things that are not obvious from the code.
  • Mistake #1: comments duplicate code (see slides)
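    • A minimal Java sketch of this mistake (the class and its names are hypothetical and are not taken from the slides). The first comment only restates the statement below it; the second adds information that the code cannot express by itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example class; not taken from the lecture slides. */
class RetryQueue {
    private final List<String> pending = new ArrayList<>();
    private int attempts = 0;

    void recordFailure(String request) {
        // Bad: repeats the code, so a reader learns nothing new.
        // Increment attempts.
        attempts++;

        // Better: states something the code cannot say by itself.
        // Failed requests are kept in arrival order so that retries
        // can be issued oldest-first.
        pending.add(request);
    }

    int attempts() {
        return attempts;
    }

    List<String> pending() {
        return pending;
    }
}
```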
  • Mistake #2: non-obvious info is not described
    • Lower level details (especially for variables, arguments, return values):
      • Exactly what is this thing?
      • What are the units?
      • Boundary conditions
        • Does "end" refer to the last value, or the value after the last one?
        • Is a null value allowed? If so, what does it mean?
      • If memory is dynamically allocated, who is responsible for freeing it?
      • Invariants?
    • Higher-level information (capture the intuition):
      • Abstractions: a higher-level description of what the code is doing.
      • Rationale for the current design: why the code is this way.
      • How to choose the value of a configuration parameter.
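    • A sketch of variable comments that answer the lower-level questions above: what the value is, its units, its boundary conditions, what null means, and the invariant. The ByteRange class and its fields are assumptions made up for illustration. Memory ownership is not shown because Java is garbage collected.

```java
/** Hypothetical class used only to illustrate variable comments. */
class ByteRange {
    // Index of the first byte in the range. Always >= 0.
    final int start;

    // Index of the byte just after the last byte in the range (exclusive),
    // so the range holds (end - start) bytes. Invariant: end >= start;
    // end == start means the range is empty.
    final int end;

    // How long a reader may wait for this range to be filled, in
    // milliseconds. Null means "wait forever"; zero means "do not wait".
    final Long timeoutMs;

    ByteRange(int start, int end, Long timeoutMs) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("need 0 <= start <= end");
        }
        this.start = start;
        this.end = end;
        this.timeoutMs = timeoutMs;
    }

    int length() {
        return end - start;
    }
}
```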
  • Two kinds of documentation for classes and methods:
    • Interface: what someone needs to know in order to use this class or method
    • Implementation: how the method or class works internally to implement the advertised interface.
    • Important to separate these: do not describe the implementation in the interface documentation!
  • Interface documentation:
    • Put immediately before the class or method declaration
    • Goal: create a simple, intuitive model for users
    • Simpler is better
    • Complete: must include everything that any user might need to know
    • Interface description may use totally different terms than the implementation (if they are simpler)
  • Example: index range lookup
    • Large table of objects in a storage system, split across dozens of servers
    • Table has indexes for looking up objects by certain attributes (name, salary, etc.)
    • IndexLookup class retrieves a range of values in index order
      // Visit each object in the range, in index order.
      query = new IndexLookup(table, index, key1, key2);
      while (true) {
          object = query.getNext();
          if (object == null) {
              break;    // No more objects in the range.
          }
          ...
      }
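    • One way the interface documentation for such a class might read. This is a simplified in-memory stand-in, not the real class: the constructor arguments, the String types, and the inclusive range ends are all assumptions. The point is that the interface comments say what callers need to know and nothing about how objects are located.

```java
import java.util.Iterator;
import java.util.NavigableMap;

/**
 * Retrieves the objects whose index keys fall in a given range, in index
 * order. Create one IndexLookup per query, then call getNext repeatedly
 * until it returns null.
 */
class IndexLookup {
    // Implementation detail (assumption for this sketch): the matching
    // objects come from an in-memory sorted map. Callers never see this.
    private final Iterator<String> remaining;

    /**
     * @param index    Maps each index key to the object stored under it.
     * @param firstKey Smallest key to return (inclusive).
     * @param lastKey  Largest key to return (inclusive); must not be
     *                 smaller than firstKey.
     */
    IndexLookup(NavigableMap<String, String> index, String firstKey, String lastKey) {
        remaining = index.subMap(firstKey, true, lastKey, true).values().iterator();
    }

    /**
     * @return The next object in the range, in index order, or null if
     *         every object in the range has already been returned.
     */
    String getNext() {
        return remaining.hasNext() ? remaining.next() : null;
    }
}
```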
      
  • Implementation documentation (comments inside methods):
    • For many methods, not needed
    • For longer methods, document major blocks of code
      • Describe what's happening at a higher level
      • E.g., what does each loop iteration do?
    • Document tricky aspects, non-obvious reasons for code
    • Document dependencies ("if you change this, you better also...")
    • Documenting variables is less important for method-local variables (all of the uses are visible), but it is sometimes needed for longer methods or tricky variables.
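    • A sketch of implementation comments inside a method, using a hypothetical merge routine. The comments describe what each loop iteration does, explain a non-obvious reason, and record a dependency between two blocks of code.

```java
/** Hypothetical utility used to illustrate comments inside a method. */
class SortedMerge {
    /**
     * Merges two arrays that are each sorted in ascending order.
     *
     * @param a First sorted array; not modified.
     * @param b Second sorted array; not modified.
     * @return A new ascending array holding every element of a and b.
     */
    static int[] merge(int[] a, int[] b) {
        int[] result = new int[a.length + b.length];
        int i = 0;   // Index of the next unused element of a.
        int j = 0;   // Index of the next unused element of b.
        int k = 0;   // Number of elements written to result so far.

        // Each iteration moves the smaller of the two front elements into
        // result. Ties take from a first, which keeps the merge stable.
        while (i < a.length && j < b.length) {
            if (a[i] <= b[j]) {
                result[k++] = a[i++];
            } else {
                result[k++] = b[j++];
            }
        }

        // At most one input still has elements, and all of them are >=
        // everything already in result, so copy them over unchanged.
        // Dependency: this relies on the loop above running until one
        // input is exhausted; if that loop ever exits early, this tail
        // copy is no longer correct.
        System.arraycopy(a, i, result, k, a.length - i);
        k += a.length - i;
        System.arraycopy(b, j, result, k, b.length - j);
        return result;
    }
}
```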
  • Documenting cross-module design decisions:
    • Example: network protocol
    • Example: how does the system deal with zombie servers?
    • Challenging:
      • No single rational place to put the documentation (people won't know where to look for it)
      • Don't want to repeat everywhere
    • One possible approach:
      • Create designNotes file, with various tagged sections
        • "Zombies"
        • "Timing-dependent tests"
      • In the code, just refer to the design notes file:
        // See "Zombies" in designNotes.
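    • A sketch of how two modules might both point at one tagged section instead of each repeating the explanation. The classes are hypothetical, and the sketch assumes that a "zombie" is a server that the cluster has declared dead but that is still sending requests.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical membership tracker; names are illustrative only. */
class ClusterMembership {
    private final Set<String> declaredDead = new HashSet<>();

    void declareDead(String serverId) {
        // See "Zombies" in designNotes.
        declaredDead.add(serverId);
    }

    boolean isDeclaredDead(String serverId) {
        return declaredDead.contains(serverId);
    }
}

/** Hypothetical request handler; names are illustrative only. */
class RequestHandler {
    private final ClusterMembership membership;

    RequestHandler(ClusterMembership membership) {
        this.membership = membership;
    }

    boolean accept(String senderId) {
        // See "Zombies" in designNotes.
        return !membership.isDeclaredDead(senderId);
    }
}
```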
        
  • To maximize value of comments:
    • It must be easy for people to find the right documentation at the right time
    • The documentation must get updated as the code changes
  • Techniques:
    • Document each thing exactly once: don't duplicate documentation (it won't get maintained)
      • Use references rather than repeating documentation: "See documentation for xyz method".
    • Put documentation as close as possible to the relevant code
      • Next to variable and method declarations
      • Push in-method documentation down to the tightest enclosing context
    • Don't say anything more in documentation than you need to
      • e.g., don't use comments in one place to describe design decisions elsewhere
      • Higher-level comments are less likely to become obsolete
    • Look for "obvious" locations where people can easily find documentation (see Status example in slides)
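    • A sketch of two of these techniques with a hypothetical EventLog class. The newline rule is documented once, on append, and appendAll refers to it rather than repeating it. The comment about empty events sits inside the branch it explains, not at the top of the method.

```java
import java.util.List;

/** Hypothetical log writer used to illustrate where comments go. */
class EventLog {
    private final StringBuilder out = new StringBuilder();

    /**
     * Appends one event to the log as a single line.
     *
     * @param event Text of the event; any newline in it is replaced by a
     *              space so that each event stays on one line.
     */
    void append(String event) {
        out.append(event.replace('\n', ' ')).append('\n');
    }

    /**
     * Appends several events in list order. Each one is handled as
     * described in the documentation for {@link #append(String)}.
     *
     * @param events Events to append; empty strings are skipped.
     */
    void appendAll(List<String> events) {
        for (String event : events) {
            if (event.isEmpty()) {
                // A blank line in the log would look like a lost entry,
                // so empty events are dropped here.
                continue;
            }
            append(event);
        }
    }

    /** @return Everything written so far, one event per line. */
    String contents() {
        return out.toString();
    }
}
```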
  • Most people put off writing comments:
    • "Why waste time writing comments when the code is still changing?"
    • "Once I get the code done, I'll write all the comments"
  • Problems with this approach:
    • You probably won't go back and write the comments later
    • If you do, the comments will be bad:
      • You're in a hurry and emotionally checked out
      • You have forgotten many of the design decisions, subtleties
      • Comments will be a superficial duplication of what's obvious from the code
  • Try writing comments at the beginning:
    • I write class comments, method headers (signature and comments) before writing the bodies of methods
    • Helps me to define the overall APIs, juggle functionality between methods
    • As I write and test code, can revise the comments to make them better and better
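    • A sketch of what a class might look like at this early stage, with the class comment and method headers written and the bodies still missing. RecentCache and its eviction rule are hypothetical; the placeholder bodies throw so that the skeleton compiles and unfinished methods are obvious when called.

```java
/**
 * Hypothetical fixed-capacity cache, shown at the stage where only the
 * comments and signatures exist.
 *
 * Holds at most a fixed number of entries. When the cache is full,
 * storing a new key evicts the entry that was read or written least
 * recently.
 */
class RecentCache {
    /** Maximum number of entries held at once; always > 0. */
    private final int capacity;

    /**
     * @param capacity Maximum number of entries held at once; must be > 0.
     */
    RecentCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be > 0");
        }
        this.capacity = capacity;
    }

    /**
     * Looks up a key. A successful lookup counts as a use of the entry
     * for eviction purposes.
     *
     * @param key Key to look up; must not be null.
     * @return The value stored under key, or null if there is none.
     */
    String get(String key) {
        throw new UnsupportedOperationException("body not written yet");
    }

    /**
     * Stores value under key, replacing any previous value for that key.
     *
     * @param key   Key to store under; must not be null.
     * @param value Value to store; must not be null.
     */
    void put(String key, String value) {
        throw new UnsupportedOperationException("body not written yet");
    }
}
```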
  • Eliminate unnecessary comments so you can focus on the ones that matter.
    • "Feature" code needs far less documentation than infrastructure code
  • Name choice is an important form of documentation
    • Take time to think of names that are clear and unambiguous
    • Be specific (but not too long)!
    • Always use the same variable name for the same kind of object
    • Avoid using the same name to refer to different kinds of things
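    • A sketch of the last two points, using a hypothetical file-system helper. A bare name such as "block" could mean a block within a file or a block on disk, so each kind gets its own name (fileBlock, diskBlock) and that name is used for that kind everywhere. The contiguous file layout is an assumption made to keep the example short.

```java
/** Hypothetical file-system helper used to illustrate naming. */
class BlockMapper {
    /** Number of bytes in one block, for both file and disk blocks. */
    static final int BLOCK_SIZE = 4096;

    /**
     * @param fileOffset Byte offset within the file; must be >= 0.
     * @return Index of the file block containing that byte.
     */
    static long fileBlockOf(long fileOffset) {
        return fileOffset / BLOCK_SIZE;
    }

    /**
     * @param fileBlock      Index of a block within the file.
     * @param firstDiskBlock Disk block where the file's block 0 is stored
     *                       (assumes the file is stored contiguously).
     * @return Index of the disk block that holds fileBlock.
     */
    static long diskBlockOf(long fileBlock, long firstDiskBlock) {
        return firstDiskBlock + fileBlock;
    }
}
```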
  • Suggestions for projects:
    • Header comment blocks for every method, every class.
    • Document every class instance variable, every method parameter, every result.
    • Add comments inside methods if/when needed
    • Skip a comment only if you're sure the code will be obvious to readers without it
    • Follow Javadoc conventions
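    • A sketch of what these suggestions look like together in Javadoc form: a class header, a documented instance variable, and @param, @return, and @throws tags for every parameter, result, and error. The Account class is hypothetical.

```java
/**
 * Tracks the balance of a single account. Hypothetical class, used only
 * to show the shape of Javadoc comments.
 */
class Account {
    /** Current balance in cents; never negative. */
    private long balanceCents;

    /**
     * Creates an account with an opening balance.
     *
     * @param openingCents Initial balance in cents; must be >= 0.
     * @throws IllegalArgumentException If openingCents is negative.
     */
    Account(long openingCents) {
        if (openingCents < 0) {
            throw new IllegalArgumentException("openingCents must be >= 0");
        }
        balanceCents = openingCents;
    }

    /**
     * Removes money from the account.
     *
     * @param amountCents Amount to remove, in cents; must be >= 0.
     * @return The balance remaining after the withdrawal, in cents.
     * @throws IllegalArgumentException If amountCents is negative.
     * @throws IllegalStateException If amountCents exceeds the current
     *         balance; the balance is left unchanged.
     */
    long withdraw(long amountCents) {
        if (amountCents < 0) {
            throw new IllegalArgumentException("amountCents must be >= 0");
        }
        if (amountCents > balanceCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceCents -= amountCents;
        return balanceCents;
    }
}
```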
  • Red flags for comments:
    • Hard to come up with a clear and simple name for variable?
    • Method documentation has to document every internal feature of the algorithm in order to be complete?
    • Interface documentation for a class or method has to be very long in order to be complete?