Style Guide

Written by Nick Troccoli and Julie Zelenski, based on writeups by Matthew Trost and others. Based on CS106B and CS107 style guides.

Aiming for good style isn't just something to do for good code review scores - it's something that will help you write code that is easier to reason about, maintain, debug, and develop. Below are some of the general style qualities that we expect your programs to have in order to receive full credit. This is not an exhaustive list; please also refer to each assignment spec for other style practices to follow. While it may be possible to write good code that violates these guidelines (and you may feel free to contact us if you are unclear about or disagree with some of them), the course staff will use this style guide when doing code reviews. In most professional work environments you are expected to follow that company's style standards, and learning to carefully obey a style guide, and writing code with a group of other developers where the style is consistent among them, are valuable job skills.

This document is a work in progress.

Any guidelines written here are in addition to what is mentioned in the given assignment's spec, so you are responsible for reading that spec and following its instructions. If there is ever a conflict between this style guide and an assignment spec, follow the assignment spec.

Whitespace and Indentation

Indenting: Increase your indentation by one increment on each brace {, and decrease it once on each closing brace }.
Use 2-4 spaces per indent level, being consistent, and use spaces instead of tabs for editor consistency.
Place a line break after every {.
Do not place more than one statement on the same line.

// worse style
int x = 3, y = 7;  double z = 4.25;  x++;
if (a == b) { foo(); }

// better style
int x = 3;
int y = 7;
double z = 4.25;

x++;
if (a == b) {
    foo();
}

Long Lines: When any line is longer than 120 characters, break it into two lines by pressing Enter after an operator and resuming on the next line. Indent the trailing second part of the line by two increments (e.g. eight spaces if your indent increment is four spaces). For example:

int result = reallyLongFunctionOne() + reallyLongFunctionTwo() +
        reallyLongFunctionThree() + reallyLongFunctionFour();

int result2 = reallyLongFunction(parameterOne, parameterTwo, 
        parameterThree, parameterFour, parameterFive);

Expressions: Place a space between operators and their operands. Add parentheses to disambiguate intended order of operations if it may not be clear to the reader.

int x = (a + b) * c / d + foo();

Parameters: Place a space between elements separated by commas.

myFunction(x, y, z) // not myFunction(x,y,z)

Blank Lines: Place a blank line between functions and between groups of statements.

void foo() {
    ...
}
                            // this blank line here
void bar() {
    ...
}

Naming and Variables

Structs: When using structs, use a typedef to avoid having to type the struct keyword whenever you declare a new variable of the struct type.

typedef struct MyStruct {
    ...
} MyStruct;

...

// Because we used the syntax above, now we can say
MyStruct s = ...

// instead of
struct MyStruct s = ...

Names: Give variables descriptive names, such as firstName or accountStatus. Avoid one-letter names like x or c, except for loop counter variables such as i. When naming variables, ask "what is it?" and use a noun, e.g. scores or maxScore. When naming functions, ask "what does it do?" and use a verb, e.g. findSmallest, getAge, isPrime.
Capitalization: Be consistent in your name capitalization. Name variables and functions using either camel-casing likeThis, or using snake-casing like_this. Always name constants in uppercase LIKE_THIS. Capitalize the names of classes/types/structs, e.g. GridLocation.
Types: Choose appropriate data types for your variables. For example, if a given variable can store only integers, give it type int rather than double.
Constants: If a particular constant value is used frequently in your code, or if it has special significance, declare it as a constant using const or #define at the top of your program, and always refer to the constant in the rest of your code rather than referring to the corresponding value. If the value is the size of a type or variable, compute it with sizeof rather than hardcoding it. Don't have "magic numbers", which are hardcoded numbers that seem to be "magically chosen" because it's not necessarily clear to another reader why that particular value is being used. That being said, you don't need to make a constant to replace every number literal hardcoded into your program. A good general rule of thumb is to ask yourself whether the number represents something more significant than just the hardcoded number itself. For instance, if you have a for loop like this:

for (int i = 0; i < NUM_ITERATIONS; i++) {
    ...
}

The 0 probably doesn't need to be a constant, because it's representing just the literal number 0. However, the limit here represents the number of iterations in addition to its actual value (e.g. 8), so that is a good thing to make a constant. Another way to think about it is, if someone else is reading your code, would that reader have a hard time understanding the significance of the hardcoded number? If so, that may be a sign that a constant is a good idea. Constants must be named with ALL_CAPS and underscores. The names of constants must be descriptive - if they cannot sufficiently describe the purpose of the constant, then you can add a comment above them to further explain what they are for and/or why they have the value they do.

const int MAX_RESPONSE_LENGTH = 100;
// or
#define MAX_RESPONSE_LENGTH 100

There are some cases where an inline comment would suffice to document the value if it only appears once - too many constants can pollute your code and force the reader to constantly cross-reference the constants at the top of the file as they read. But, even if a value only appears once, it can at times still be useful to document it as a constant if it is sufficiently important and where perhaps someone would want to come along and change the value later in the program.
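
As a small illustration of the sizeof advice above, the sketch below derives an array's element count from the array itself instead of repeating a literal. The function names sum_scores and total_of_example_scores and the score values are hypothetical, chosen only for this example.

```c
#include <stddef.h>

// Hypothetical helper: adds up the first `count` entries of `scores`.
int sum_scores(const int scores[], size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += scores[i];
    }
    return total;
}

// Hypothetical example: sums a fixed set of scores.
int total_of_example_scores(void) {
    int scores[] = {90, 85, 77};

    // sizeof yields the element count, so no magic number 3 is needed
    // and the count stays correct if a score is added or removed.
    size_t num_scores = sizeof(scores) / sizeof(scores[0]);
    return sum_scores(scores, num_scores);
}
```

Note that this sizeof idiom only works where the array itself is in scope; inside sum_scores the parameter is a pointer, which is why the count is passed separately.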

Variable Scope

Scope: Declare variables in the narrowest possible scope. For example, if a variable is used only inside a specific if statement, declare it inside that if statement rather than at the top of the function or at the top of the file.
Avoid Global Variables: Never declare a modifiable global variable. The only global named values in your code should be constants. Instead of making a value global, pass it as a parameter and/or return it as needed.

// worse style
int count;  // global variable!

void func1() {
    count = 42;
}

void func2() {
    count++;
}

int main() {
    func1();
    func2();
}

// better style
int func1() {
    return 42;
}

int func2(int count) {
    return count + 1;
}

int main() {
    int count = func1();
    count = func2(count);
}
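
For the Scope guideline above, here is a minimal sketch of declaring each variable where it is first needed. The function name sum_of_squares_of_evens is hypothetical and exists only for this illustration.

```c
// Hypothetical example: adds up the squares of the even values in an array.
int sum_of_squares_of_evens(const int values[], int length) {
    int total = 0;  // returned at the end, so declared at function scope

    for (int i = 0; i < length; i++) {  // i exists only inside the loop
        if (values[i] % 2 == 0) {
            int square = values[i] * values[i];  // used only inside this if
            total += square;
        }
    }
    return total;
}
```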

Core C/C++ Statements

Prefer C++ idioms over C idioms: Since C++ is based on C, there is often a "C++ way" to do a given task and also a "C way". For example, the "C++ way" to print output is via the output stream cout, while the "C way" is using printf. C++ strings use the string class, while older code uses the C-style char *. When writing C++ code/files, prefer the modern C++ way, and only use the C way when writing C files.
for vs. while: Use a for loop when the number of repetitions is known (definite); use a while loop when the number of repetitions is unknown (indefinite).

// repeat exactly 'size' times
for (int i = 0; i < size; i++) {
    ...
}

// repeat until the end of a linked list
while (node->next != NULL) {
    ...
}

break and continue in loops: Wherever possible, structure loops with a clear loop condition (and clear loop start and increment steps for for loops), and no additional loop control like break - this makes it easier for the reader to understand when the loop runs and exits. However, there are some uses of break that are necessary and ok, such as loop-and-a-half (while(true) with break) or a need to exit a loop mid-iteration. Try to avoid using continue (it is rare and often confusing) - instead, restructure your loop body to achieve the desired behavior.
Use of fallthrough in switch cases: A switch case should almost always end with a break or return that prevents continuing into the subsequent case. In the very rare case that you intend to "fallthrough", add a comment to make that clear. Accidental fallthrough is the source of many a difficult bug!

switch (val) {
    case 1:
        handleOne();
        break;
    case 2:
        handleTwo();
        // NOTE: fallthrough ***
    case 3:
        handleTwoOrThree();
        break;
}
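
For the "break and continue in loops" guideline above, the following is one possible sketch of the loop-and-a-half pattern, where the exit test has to sit between reading a value and using it. The function name sum_until_sentinel and the SENTINEL value are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdlib.h>

#define SENTINEL (-1)  // hypothetical value that marks the end of the data

// Hypothetical example: sums the whitespace-separated integers in `text`,
// stopping at SENTINEL or when no further integer can be parsed.
int sum_until_sentinel(const char *text) {
    int total = 0;
    while (true) {
        char *end;
        long value = strtol(text, &end, 10);  // first half: read a value

        // strtol leaves end == text when it could not parse a number
        if (end == text || value == SENTINEL) {
            break;  // the exit test sits in the middle of the iteration
        }

        total += (int)value;  // second half: use the value
        text = end;
    }
    return total;
}
```

Here the break is the single, clearly visible exit from the loop, which is the kind of use the guideline treats as acceptable.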

{} And Control Statements: When using control statements like if/else, for, while, etc., always include {} and proper line breaks, even if the body of the control statement is only a single line.

// worse style
if (size == 0) printf("not ok\n");
else
    for (int i = 0; i < 10; i++) printf("ok\n");

// better style
if (size == 0) {
    printf("not ok\n");
} else {
    for (int i = 0; i < 10; i++) {
        printf("ok\n");
    }
}

if/else Patterns: When using if/else statements, properly choose between various if and else patterns depending on whether the conditions are related to each other. Avoid redundant or unnecessary if tests.

// worse style
if (degreesF >= 80) {
    printf("It's hot outside.");
}
if (degreesF >= 60 && degreesF < 80) {
    printf("It's nice outside.");
}
if (degreesF >= 50 && degreesF < 60) {
    printf("It's cool outside.");
}

// better style
if (degreesF >= 80) {
    printf("It's hot outside.");
} else if (degreesF >= 60) {
    printf("It's nice outside.");
} else if (degreesF >= 50) {
    printf("It's cool outside.");
}

Returning Booleans: If you have an if/else statement that returns a bool value based on a test, just directly return the test's result instead.

// worse style
if (score1 == score2) {
    return true;
}
return false;

// better style
return score1 == score2;

Testing Booleans: Don't test whether a bool value is == or != to true or false.

// worse style
if (x == true) {
    ...
} else if (x != true) {
    ...
}

// better style
if (x) {
    ...
} else {
    ...
}

In C++, favor &&, ||, and ! over and, or, and not: For various reasons mostly related to international compatibility, C++ has two ways of representing the logical connectives AND, OR, and NOT. Traditionally, the operators &&, ||, and ! are used for AND, OR, and NOT, respectively, and they are the preferred way of expressing compound boolean expressions. The words and, or, and not can be used instead, but it would be highly unusual to do so and a bit jarring for C++ programmers used to the traditional operators.

// worse style
if ((even and positive) or not zero) {
    ...
}

// better style
if ((even && positive) || !zero) {
    ...
}

Clean Syntax: Use the cleanest, most direct, conventional syntax available to you, e.g. ptr->field instead of (*ptr).field. Similarly, be thoughtful/consistent in use of array subscripts vs. pointer arithmetic. It's more common to use subscripts when accessing an individual array element, and more common to use pointer arithmetic when accessing a subarray. Avoid unnecessary use of obscure constructs, such as the comma operator, unions, etc. Use standard language features appropriately, for instance the bool type from stdbool.h when in C, const for read-only pointers, etc.
Appropriate Pointer Usage:
- no unnecessary levels of indirection in variable/parameter declarations
- uses specific pointee type whenever possible, void* only where required
- low-level pointer manipulation/raw memory operators used only when required
- allocation uses appropriate storage (stack versus heap, based on requirements)
- allocations are of appropriate size
- use typecasts only and exactly where necessary and appropriate
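
The Clean Syntax and Appropriate Pointer Usage points above might look like the following in practice. This is only a sketch: the Point type and the function names are hypothetical and were invented for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical type used only for this illustration.
typedef struct Point {
    int x;
    int y;
} Point;

// Returns true if any of the `count` points lies on the x-axis.
// The pointer is const because the points are only read.
bool any_on_x_axis(const Point *points, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (points[i].y == 0) {  // subscript to access a single element
            return true;
        }
    }
    return false;
}

// Same check, but skipping the first point.
bool any_after_first_on_x_axis(const Point *points, size_t count) {
    if (count == 0) {
        return false;
    }
    // pointer arithmetic to refer to the subarray that starts at index 1
    return any_on_x_axis(points + 1, count - 1);
}

// Moves a point to the right by `distance`.
// A single level of indirection is enough to modify the caller's Point.
void move_right(Point *point, int distance) {
    point->x += distance;  // ptr->field rather than (*ptr).field
}
```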

Redundancy

Minimize Redundant Code: If you repeat the same code two or more times, try to find a way to remove the redundant code so that it appears only once. For example, place it into a helper function that is called from both places. If the repeated code is nearly but not entirely the same, try making your helper function accept a parameter to represent the differing part.

// worse style
foo();
x = 10;
y++;
...

foo();
x = 15;
y++;

// better style
helper(10, &x, &y);
...
helper(15, &x, &y);

void helper(int newX, int *x, int *y) {
    foo();
    *x = newX;
    (*y)++;
}

if/else Factoring: Move common code out of if/else statements so that it is not repeated.

// worse style
if (x < y) {
    foo();
    x++;
    printf("hi");
} else {
    foo();
    y++;
    printf("hi");
}

// better style
foo();
if (x < y) {
    x++;
} else {
    y++;
}
printf("hi");

Efficiency

Save expensive call results in a variable: If you are calling an expensive function and using its result multiple times, save that result in a variable rather than having to call the function multiple times.

// worse style
for (int i = 0; i < strlen(str); i++) {
    ...
}

str[strlen(str)] = '\0';
if (strlen(str) > 10) {
    ...
}

// better style
size_t stringLength = strlen(str);
for (size_t i = 0; i < stringLength; i++) {
    ...
}

str[stringLength] = '\0';
if (stringLength > 10) {
    ...
}

Avoid making copies of data: when possible, avoid making unneeded copies of large data in your programs. In C++, pass the object by reference to avoid this. In C, avoid passing the large data or, if appropriate, pass a pointer.
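
In C, the difference between copying a large struct and passing a pointer to it can be sketched as follows. The SampleSet type, the MAX_SAMPLES capacity, and the average function are hypothetical and exist only for this example.

```c
#define MAX_SAMPLES 1000  // hypothetical capacity for this illustration

// Hypothetical type that is large enough for copying to be wasteful.
typedef struct SampleSet {
    double samples[MAX_SAMPLES];
    int count;
} SampleSet;

// Returns the average of the stored samples, or 0.0 if there are none.
// Taking a SampleSet by value would copy all MAX_SAMPLES doubles on every
// call; a const pointer copies only an address and still promises that
// the data will not be modified.
double average(const SampleSet *set) {
    if (set->count == 0) {
        return 0.0;
    }

    double total = 0.0;
    for (int i = 0; i < set->count; i++) {
        total += set->samples[i];
    }
    return total / set->count;
}
```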

Comments

The purpose of commenting your code is to make it easy to read and understand. Programmers spend significantly more time reading than writing code, and code you write only once can be read many times by many people. It is therefore important to prioritize readability throughout the code-writing process.

The Comment Hierarchy is a structure for doing just this. Each decreasing rung in the hierarchy becomes more specific to the implementation. From top to bottom:

1: File headers: At the top of the hierarchy are file-level headers. These should give the reader the highest-level understanding of what is contained in the file. You should place a descriptive comment heading on the top of every file to describe that file's purpose, along with your name and the course name. As an example, if the file is a collection of utility functions, it could describe the general category of the utilities and where they might be used. If it’s designed to be run (i.e. has a main()), then it can briefly describe the program behavior. If the file is a class, then it can briefly describe the type it defines. Assume that the reader of your comments is an intelligent programmer but not someone who has seen this assignment before. This comment should not describe great detail about how it is implemented. Do not mention language-specific details like the fact that the code uses an if/else statement, that a function declares an array, that a function loops over a list and counts various elements, etc.

/* CS107 Lecture 7
 * Code by Nick Troccoli and Lisa Yan
 *
 * This program converts text to pig latin as an example of
 * how to use dynamic memory allocation on the heap with malloc,
 * realloc and free.
 *
 * If there are no additional command-line arguments, the program
 * prints out tests of converting individual words to pig latin.
 * If there are additional command-line arguments, the program
 * concatenates the pig latin versions of them and prints out the
 * resulting string.
 */

#include <stdlib.h>
#include <stdio.h>

// ...rest of file...

2: Function headers: Function headers are second in the hierarchy, and should be placed on each function in your file to describe each function's behavior. In addition to describing the high-level functionality of the function, these should describe the inputs and outputs to the function, and critically, any assumptions the function makes about those parameters. Specifically, if your function accepts parameters, briefly describe their purpose and meaning. If your function returns a value, briefly describe what it returns. If your function makes any assumptions, such as assuming that parameters will have certain values, mention this in your comments. It should also describe the function’s response to errors, if relevant. It should not describe great detail about how it is implemented. Do not mention language-specific details like the fact that the function uses an if/else statement, that the function declares an array, that the function loops over a list and counts various elements, etc. Note that for the main function, a comment above main is not always needed, since it may be a duplicate of the file header comment for the program. But it could be appropriate for some programs to talk about its behavior in slightly more detail than the program header comment might. Try to make the best decision you can for your program for whether a main comment is helpful for the reader or redundant. Note: if you have a header file with function prototypes, it's fine to have function header comments in either the header file or the implementation file, but not both.

/* Function: pig_latin
 * --------------------------
 * This function returns a pig-latinified version of the in string.
 * It is the caller's responsibility to free the returned string,
 * which is allocated on the heap.
 * Simplified pig latin rules are:
 * - if the word starts with a vowel, append "way"
 * - otherwise, move all initial consonants to the end, and append
 *   "ay"
 *
 * This function assumes that the provided word is lowercase.
 * If the word begins with non-alphabetic characters, this function
 * returns NULL.
 */
char *pig_latin(const char *in) {
    // If the word starts with non-alphabetic characters, we can't translate
    if (strcspn(in, LOWERCASE_ALPHABET) > 0) {
        return NULL;
    }

    // ...rest of function...

3: Block comments: Third are block comments - code blocks, e.g. if statements, for loops, or just groups of related lines, should be commented if it is unclear from just reading the code what is happening or if they are sufficiently lengthy. These comments should describe the logical flow of whatever part of the function the block contains. They do not describe the mechanics of the operations but instead provide insight into what the result of the operations means and how it is used. A good rule of thumb is: explain what the code accomplishes rather than repeat what the code says. If what the code accomplishes is obvious, then don't bother.

char *pig_latin(const char *in) {
    // If the word starts with non-alphabetic characters, we can't translate
    if (strcspn(in, LOWERCASE_ALPHABET) > 0) {
        return NULL;
    }

    char *out = NULL;

    // If the word starts with a vowel, add "way"
    if (strchr(LOWERCASE_VOWELS, in[0]) != NULL) {
        int out_len = strlen(in) + strlen("way");
        out = malloc(sizeof(char) * (out_len + 1)); // +1 for null terminator
        assert(out != NULL);

        pig_way(out, in);
    } else {
        // Otherwise, move all initial consonants to the end, and append "ay"
        int out_len = strlen(in) + strlen("ay");
        out = malloc(sizeof(char) * (out_len + 1));
        assert(out != NULL);

        // ...rest of function...

4: Line comments: Lastly, line-level comments are for explaining the mechanics of dense operations on a given line or two. It is much easier to read English than it is to read complex bit operations or arithmetic. (Note, however, that very dense lines are generally a sign that you can make your code simpler or at least break it into multiple lines.) Again, a good rule of thumb is: explain what the code accomplishes rather than repeat what the code says. If what the code accomplishes is obvious, then don't bother.

out = malloc(sizeof(char) * (out_len + 1)); // +1 for null terminator

Redundancy: Don't repeat what is already said by the code; instead, if you add comments, add additional detail or explanation.
Wording: Your comment headers should be written in complete sentences, and should be written in your own words, not copied from other sources (such as copied verbatim from the homework spec document).
TODOs: You should remove any // TODO: comments from a program before turning it in.
Commented-out code: It is considered bad style to turn in a program with chunks of code "commented out". It's fine to comment out code as you are working on a program, but if the program is done and such code is not needed, just remove it. You should also remove other dead or unused code.

Functions and Procedural Design

Designing a good function: A well-designed function exhibits properties such as the following:
- Fully performs a single independent, coherent task.
- Does not do too large a share of the work.
- Is not unnecessarily connected to other functions.
- Helps indicate and subdivide the structure of the overall program.
- Helps remove redundancy that would otherwise be present in the overall program.
- Interface (parameters, return value) is clean and well-encapsulated.
- Uses parameters for flexibility/re-use (rather than not being generalized).
- Clear relationship between information in (parameters) and out (return value)
Function Structure: If you have a single function that is very long, break it apart into smaller sub-functions. The definition of "very long" is vague, but let's say a function longer than roughly 40 lines is pushing it. If you try to describe the function's purpose and find yourself using the word "and" a lot, that probably means the function does too many things and should be split into sub-functions.
C++ Value vs. reference parameters: In C++, use reference parameters when you need to modify the value of a parameter passed in, or to send information out from a function. Prefer reference parameters over raw pointers. Don't use reference parameters when it is not necessary or beneficial. Notice that a, b, and c are not reference parameters in the following function because they don't need to be:

/* 
 * Solves a quadratic equation ax^2 + bx + c = 0,
 * storing the results in output parameters root1 and root2.
 * Assumes that the given equation has two real roots.
 */
void quadratic(double a, double b, double c,
               double& root1, double& root2) {
    double discr = sqrt((b * b) - (4 * a * c));
    root1 = (-b + discr) / (2 * a);
    root2 = (-b - discr) / (2 * a);
}

Returning vs parameters: When possible, favor returning a result from a function instead of using a pointer parameter (or, in C++, a reference parameter) to pass back data.

// worse style
void max(int a, int b, int *result) {
    if (a > b) {
        *result = a;
    } else {
        *result = b;
    }
}

// better style
int max(int a, int b) {
    if (a > b) {
        return a;
    }
    return b;
}

// worse style
void max(int a, int b, int& result) {
    if (a > b) {
        result = a;
    } else {
        result = b;
    }
}

// better style
int max(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}

const parameters: If you are passing a pointer or reference type to a function and your code will not modify the data it refers to, pass it as const. For example, if you will not modify the contents of a string, pass it as a const char * instead of a char *. If you will not modify a reference to a vector, pass it as const vector<...>&.
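
As a minimal C sketch of the const guideline above, the function below only reads its string argument, so the parameter is declared const char *. The name count_char is hypothetical.

```c
#include <stddef.h>

// Hypothetical example: counts how many times `target` occurs in `str`.
// `str` is a const char * because the function never modifies the string,
// so callers can safely pass string literals and read-only buffers.
size_t count_char(const char *str, char target) {
    size_t count = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        if (str[i] == target) {
            count++;
        }
    }
    return count;
}
```
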
Reimplementation: avoid reimplementing the functionality of standard library functions; instead, use the provided functions where possible (e.g. string manipulation, type conversion, etc.).
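
To illustrate the Reimplementation guideline, the sketch below builds a small string utility out of strlen and strcmp from string.h rather than hand-written character loops. The function name ends_with is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical example: reports whether `word` ends with `suffix`.
// strlen and strcmp do the work that a hand-written loop would otherwise
// have to reimplement.
bool ends_with(const char *word, const char *suffix) {
    size_t word_len = strlen(word);
    size_t suffix_len = strlen(suffix);
    if (suffix_len > word_len) {
        return false;
    }

    // Compare the tail of word, starting where the suffix would begin.
    return strcmp(word + (word_len - suffix_len), suffix) == 0;
}
```
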
Appropriate data structures: choose appropriate data structures and types for your program data.
Avoid "chaining" calls: Chaining calls is where many functions call each other in a chain without ever returning to main. Make sure that main is a concise summary of your overall program. Here is a rough diagram of call flow with and without chaining:

// worse style
main
|
+-- function1
    |
    +-- function2
        |
        +-- function3
            |
            +-- function4
            |
            +-- function5
                |
                +-- function6

// better style
main
|
+-- function1
|
+-- function2
|   |
|   +-- function3
|       |
|       +-- function4
|
+-- function5
|   |
|   +-- function6

Dead Code: delete any "dead code", meaning code that is never executed and cannot be reached. For instance, code after a return statement that always executes will never be reached.

// worse style
void doSomething() {
    ...
    return;

    // Dead code!  Never executed.
    int x = 2;
}

// better style
void doSomething() {
    ...
    return;
}

C++ Class Design

Default to private members: only make an instance variable or method public if necessary.
Prefer getters and setters instead of public instance variables: rather than making an instance variable public, consider providing public functions that get and/or set that variable's value. These functions help limit access to the variable - for instance, a setter can do error-checking to restrict how the instance variable can be changed.