Modular Design
Lecture Notes for CS 190
Spring 2016
John Ousterhout
- How to minimize dependencies?
- Modular Design
- Divide system into modules that are relatively independent
- Ideal: each module completely independent of the others
- System complexity = complexity of worst module
- In reality, modules are not completely independent
- Some modules must invoke facilities in other modules
- Design decisions in one module must sometimes be known to other
modules
- Can't change one module without understanding parts of
other modules
- Abstraction
- Minimize dependencies between modules
- Divide each module into two parts:
- Interface of a module: anything about that module that
must be known to other modules
- Formal aspects: method signatures, public variables, etc.
- Informal aspects: side effects, algorithms that affect behavior of
methods, etc.
- Implementation: code that enforces the promises made
by the interface
- Goal for interface design: maximize functionality/(interface complexity)
(a sweet interface or module)
- How to design sweet interfaces?
- See the Parnas paper: "On the Criteria To Be Used in Decomposing
Systems into Modules" (1972)
- Information Hiding
- Each module (class) encapsulates certain knowledge or design
decisions:
- The interface does not reflect these design decisions (much)
- Can modify the implementation without impacting other classes
- Opposite of information hiding: information leakage
- Implementation details exposed, other classes depend on them
- Information hiding works inside a class as well as between
classes
- Design methods to encapsulate information
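The contrast between leakage and hiding can be sketched as follows. This is a minimal illustration, not code from the course: the class names `LeakyInbox` and `Inbox` and their methods are hypothetical.

```python
class LeakyInbox:
    """Information leakage: callers must know messages live in a list of tuples."""

    def __init__(self):
        self.messages = []  # public: list of (sender, text) tuples


class Inbox:
    """Information hiding: the storage layout is a private design decision."""

    def __init__(self):
        # Could become a list, a file, or a database without changing callers.
        self._by_sender = {}

    def add(self, sender, text):
        self._by_sender.setdefault(sender, []).append(text)

    def from_sender(self, sender):
        # Return a copy so callers cannot depend on (or modify) internal state.
        return list(self._by_sender.get(sender, []))

    def count(self):
        return sum(len(texts) for texts in self._by_sender.values())


inbox = Inbox()
inbox.add("ann", "hi")
inbox.add("bob", "lunch?")
inbox.add("ann", "see notes")
```

Code that uses `LeakyInbox.messages` directly breaks if the tuple layout changes; code that uses `Inbox` depends only on three method signatures.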
- Classes Should Be Thick
- Thin class or method:
- Not much functionality
- Short methods
- It takes almost as much code to invoke a method as it
would take to just retype the method inline
- Classic example: linked list
- Also see User.java
- Thin classes don't hide much information
- Thick class or method:
- Lots of functionality, yet simple interface
- Hides lots of information
- Related problem: classitis
- Too many classes
- Bad example: Java libraries
- Rule of thumb: 200-2000 lines is a good size for classes
- Below 200 lines: probably pretty thin
- Above 2000 lines: internal complexity of the class can
become unmanageable. See if it can be subdivided cleanly.
- However, size itself isn't the most important metric: it's
functionality/(interface complexity)
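A rough sketch of the thin/thick distinction, under the assumption that a small text index is a fair stand-in for "lots of functionality, yet simple interface". Both classes (`ThinList`, `WordIndex`) are hypothetical and are not taken from the course materials.

```python
import re


class ThinList:
    """Thin: each method is a one-liner the caller could have written inline."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def get(self, index):
        return self.items[index]


class WordIndex:
    """Thick: two public methods hide tokenizing, case folding, and an inverted index."""

    def __init__(self):
        self._postings = {}  # word -> set of document ids

    def add_document(self, doc_id, text):
        for word in self._tokenize(text):
            self._postings.setdefault(word, set()).add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that contain every query word."""
        words = self._tokenize(query)
        if not words:
            return []
        matches = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted(matches)

    @staticmethod
    def _tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())
```

`ThinList` has about as much interface as implementation, so it hides almost nothing. `WordIndex` scores better on functionality/(interface complexity): callers never see the tokenizer or the index structure.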
- When to bring things together into one class, when to separate?
- Bring together if:
- Shared information: when you see information leakage, see if you
can pull the code together
- Related task: do the whole job in one place
- Repeated pattern
- Situations that need to be handled in the same way
(exception handling)
- Benefits of bringing together:
- Eliminates dependencies
- Eliminates redundancy
- Examples from Tweeter project:
- HTTP request gets processed twice: one class reads it in, puts it in a
string; then another class parses it.
- Query value parsing: parsed values retained their URL escaping and
were unescaped only on use, so every user of a value had to know
about escaping.
- Separate if:
- There isn't much shared information
- Things truly need to be treated differently
- Multi-purpose (generic) versus single-purpose (application-
or feature-specific)
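The query-parsing example above suggests doing the whole job in one place: unescape while parsing, so no caller ever sees an escaped value. The sketch below assumes a simple `name=value&name=value` query format; `parse_query` is a hypothetical helper, not the Tweeter project's code.

```python
from urllib.parse import unquote_plus


def parse_query(query_string):
    """Parse 'a=1&b=x%20y' into a dict whose names and values are fully unescaped."""
    values = {}
    for pair in query_string.split("&"):
        if not pair:
            continue  # tolerate empty segments such as a trailing '&'
        name, _, raw = pair.partition("=")
        # Unescape here, once, so knowledge of URL escaping stays in this function.
        values[unquote_plus(name)] = unquote_plus(raw)
    return values


params = parse_query("q=modular%20design&tag=a%26b&empty=")
```

With this arrangement the escaping rules are shared information that lives in exactly one function, which removes both the dependency and the repeated unescape calls.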
- API Simplicity
- How to design APIs for a class?
- Decide what's important, design the interface around that
- Focus on the things that are done most frequently
- Technique #1: if a particular task is invoked repeatedly,
design an API around that task (even better, do it automatically,
without having to be invoked).
- Technique #2: if a collection of tasks is similar but not identical,
look for common features shared by all of them; design
APIs for the common features.
- It's OK to provide APIs for infrequently-used features,
but design them so that users of the common features
don't need to be aware of them.
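One possible way to keep rarely used features out of the way is to expose them as optional parameters with sensible defaults. The function below is a hypothetical illustration (the name `slugify` and its options are assumptions made for this sketch).

```python
import re


def slugify(title, *, separator="-", max_length=None):
    """Turn a title into a URL-friendly slug.

    The common call is slugify(title). The keyword-only options cover
    rarer needs and can be ignored by callers who do not need them.
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    slug = separator.join(words)
    if max_length is not None:
        # Truncate, then drop any separator left dangling at the end.
        slug = slug[:max_length].rstrip(separator)
    return slug


common = slugify("Modular Design: Lecture 3!")
rare = slugify("Modular Design: Lecture 3!", separator="_", max_length=8)
```

A caller who only needs the common case reads one parameter; the extra options add no burden until they are wanted.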
- Bad example: Java I/O
- Good example: device-independent I/O in UNIX/Linux:
- Before UNIX: different kernel calls for opening and accessing
files vs. devices.
- Different kernel calls for each device: terminal, tape, etc.
- Different naming mechanisms for each device
- UNIX emphasized commonality across devices:
- Devices have names in the file system: special device files
- All devices have same basic access structure: open, read,
write, seek, close
- Handle device-specific operations with one additional kernel
call:
int ioctl(int fd, int request,
          void* inBuffer, int inputSize,
          void* outBuffer, int outputSize);
(Simplified for illustration; the POSIX declaration is
int ioctl(int fildes, int request, ...).)
- Pick the most general-purpose/abstract API that meets today's needs
- Example: not "open file" or "open device", just "open"
- Increases the likelihood you can reuse for other purposes
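The device-independent structure described above (one "open", a shared set of basic operations, and a single escape hatch for device-specific requests) can be sketched in miniature. Everything here is hypothetical: the class names, the device paths, and the `control` method standing in for ioctl are assumptions for illustration, not a model of a real kernel.

```python
class MemoryFile:
    """A 'file' device: stores bytes and supports the common operations."""

    def __init__(self):
        self._data = bytearray()

    def write(self, data):
        self._data.extend(data)
        return len(data)

    def read(self):
        return bytes(self._data)

    def control(self, request, arg=None):
        raise ValueError(f"unsupported request: {request}")


class Terminal:
    """A 'terminal' device: same common operations, plus one device-specific request."""

    def __init__(self):
        self._chunks = []
        self._echo = True

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def read(self):
        return b"".join(self._chunks)

    def control(self, request, arg=None):
        # The single escape hatch for operations only this device understands.
        if request == "set_echo":
            self._echo = bool(arg)
            return self._echo
        raise ValueError(f"unsupported request: {request}")


# Hypothetical device names; one naming scheme and one "open" for every device.
_DEVICES = {"/dev/mem0": MemoryFile, "/dev/tty0": Terminal}


def open_path(path):
    return _DEVICES[path]()


def copy(src, dst):
    """Works for any pair of devices because it uses only the common operations."""
    return dst.write(src.read())
```

`copy` never needs to know which kind of device it was given, which is the reuse benefit of choosing the most general interface.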
- How much to plan ahead?
- "Should I implement extra features beyond those that I need today?"
- Design facilities that are general-purpose when possible
(but don't get carried away)
- Don't create a lot of specific features that aren't needed now;
you can always add them later.
- When you discover that new features or a more general architecture
are needed, make the change right away: don't hack around it.
- Module writers should embrace suffering:
- Take on hard problems
- Solve completely
- Make solution easy for others to use
- Take more challenges for yourself, so that others have
fewer issues to deal with
- Push complexity down into modules:
- Let a few module developers suffer, rather than thousands
of users
- Simple APIs are more important than a simple implementation
- Solve, don't punt:
- Handle error conditions rather than throwing exceptions
- Minimize "voodoo constants" (configuration parameters)
- If you don't know the right value, how will a user or
administrator ever figure it out?
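As one illustration of handling an error condition inside the module rather than throwing it at the caller, a substring helper can define out-of-range indices to mean "clamp to the string" instead of raising. The function is a hypothetical sketch; whether clamping is the right behavior depends on the application.

```python
def substring(text, start, end):
    """Return the characters of text in [start, end), clamping out-of-range indices.

    Indices before the beginning or past the end are treated as the nearest
    valid position, and an end before start yields the empty string, so
    callers never need a try/except or a bounds check.
    """
    start = max(0, min(start, len(text)))
    end = max(start, min(end, len(text)))
    return text[start:end]


head = substring("hello", -3, 2)
tail = substring("hello", 3, 100)
```

The module writer does slightly more work once, and every caller is spared the bounds checking.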
- Are long methods OK?
- Sometimes: see TransportDispatcher.cc
(method consists of relatively independent pieces).
- Shorter is generally better, but only decompose if it can
be done cleanly (are there dependencies between the parts?).
- Applying These Ideas
- May be hard initially to apply these ideas when writing code.
- Make 2 designs and compare
- Pick one and write some code
- Review this topic to look for potential problems
- Revise code
- Take advantage of code reviews
- Red flags to look for:
- Information leakage & dependencies
- Thin classes
- Repeated pieces of code (DRY)
- Very deep call stacks (especially if one method simply calls another
with essentially the same arguments)
- Lint: little bits of unnecessary complexity
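The call-stack red flag above (one method simply calling another with essentially the same arguments) might look like the first service class below. All names are hypothetical; the second service is one possible way for a layer to earn its place.

```python
class Storage:
    """Lowest layer: a key/value store."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows[key]


class PassThroughService:
    """Red flag: every method forwards the same arguments one layer down."""

    def __init__(self, storage):
        self._storage = storage

    def save(self, key, value):
        return self._storage.save(key, value)

    def load(self, key):
        return self._storage.load(key)


class ProfileService:
    """Same position in the stack, but it adds validation, key layout, and a default."""

    def __init__(self, storage):
        self._storage = storage

    def set_name(self, user_id, name):
        name = name.strip()
        if not name:
            raise ValueError("name must not be blank")
        self._storage.save(("name", user_id), name)

    def get_name(self, user_id):
        try:
            return self._storage.load(("name", user_id))
        except KeyError:
            return "anonymous"
```

`PassThroughService` adds interface without adding functionality, so it deepens the call stack and hides nothing. `ProfileService` hides the key layout and the missing-user case from its callers.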