2013/Style

Style

Coding style refers to some of the literary aspects of writing code. What vocabulary do you use (for instance, for function names or variable names)? What spelling and punctuation conventions? Overall, what orthography?

Everyone has their own speaking style, their own prose style, and their own coding style. That diversity is great. But some coding styles are worse than others: harder for others to read, harder to maintain. In the conflict between descriptive and prescriptive linguistics, we must be a little prescriptive.

The point of coding style is to make code readable and easily understandable by other human beings (and your future self). The best way to make code readable is to pick a consistent style and stick with it. And the best styles to pick are those with precedent in others’ code. These guidelines can help in that goal.

This guide isn’t about mandatory rules; treat it instead as a set of suggestions. Alternatively, you can follow the “K&R style”, named after Brian Kernighan and Dennis Ritchie, the authors of the book The C Programming Language (Ritchie also created the language itself); or perhaps GNU’s style guidelines on C.

Example: The Three Bears

What does this code do?

int srev(char*x, char*y) { int l;for(l=0;
y[0];
l++)y++;while(l){*x++=*--y;--l;
}*x = NULL
;}

OK, what does this code do?

int reverse_string(char *the_input_string, char *the_string)
{
   // Function: The characters in "the_input_string" are reversed and stored into "the_string".
   // For instance, if "the_input_string" equals "abcde", then after the function returns,
   // "the_string" will equal "edcba". The function returns 0 on success and -1 on failure.

   int length_of_string = strlen(the_input_string);
   int position = 0;      // tracks our position in the OUTPUT string
   int reverse_position = 0; // tracks our position in the input string
   char *the_output_string;        // the destination of the copy

   the_output_string = the_string;


   // We want to copy the_input_string IN REVERSE!!! So we need to move reverse_position to the END of the_input_string first....
   for (reverse_position = 0; ; reverse_position += 1)
   {

       // When we reach a null character we are at the end of the string!
       if (the_input_string[reverse_position] == NULL)
       {

            // So we should exit the loop
            break;
       } // end if

   } // end for

   // That loop left reverse_position pointing ONE BEYOND the last character in the string. That's no good!
   reverse_position -= 1;

   // Now run backwards!!
   while (reverse_position >= 0)
   {

       // Make sure we really have a character!!
       assert(the_input_string[reverse_position] != '\0');

       // OK, we have a character.
       // Better copy it to the output string!
       memcpy  (&the_output_string[position],
               &the_input_string[reverse_position],
               1);

       // Great, we did the copy!! Advance to the next position in the output string.
       position += 1;
       // And back up to the PREVIOUS position in the INPUT string!
       reverse_position -= 1;

    } // end while

    // OK, we've copied all the characters into the output string. But we're not done yet!!!
    // We did not copy the last character in the string. Better do that now.
    memset(&the_output_string[position], '\0', 1);

    // Now we're really done!!
    return position;


} // end reverse_string function!

And finally, what does this code do?

/** @brief Reverse the characters of @a src into @a dst, returning the length of @a src. */
int strreverse(char *dst, const char *src) {
    assert(dst && src);  // neither parameter should be NULL
    assert(dst != src);   // we can’t reverse a string into itself
    int len = strlen(src);
    for (int pos = 0; pos != len; ++pos)
        dst[pos] = src[len - pos - 1];
    dst[len] = 0;
    return len;
}

Well, the answer is they all do the same thing: they reverse the input string. But we feel the style of the third example is much better than the styles of the first two. The first example is too terse, the second too verbose, and the third (arguably) just right. Let's talk about some of the problems with the first two examples, and how the third example avoids them.

Example 1

  • Inscrutable variable and function names (x? y? srev?) don’t give good hints about the function’s behavior.
  • Horrible, inconsistent indentation makes the program look like random noise.
  • No explanatory comments.
  • No parameter checking.
  • Incorrect code: the function claims to return an int, but the function body doesn’t have a return statement.

Example 2

Beginning students often err in this direction: making code too verbose.

  • Verbose variable names (the_string, the_input_string) take up space and mental energy without adding descriptiveness.
  • Even though the variable names are verbose, they’re not consistent. length_of_string is actually length_of_input_string. Why have both the_string and the_output_string? Why not call the parameter version the_output_string, to contrast with the_input_string?
  • Chatty comments that repeat information obvious from the code. (E.g. // ... exit the loop.)
  • Comments that lie. The comment above memset, near the end of the function, claims that the following line copies the last character in the input string. No it doesn’t: the line terminates the output string with a null character. And the header comment claims the function returns 0 or -1. No it doesn’t: it returns the length of the input string.
  • Unused variables (length_of_string). Did the programmer make them redundant and then just forget to remove them, or does the programmer really need them, but hasn’t gotten around to using them yet? If you really intend to leave a variable unused, tell the reader explicitly, with a statement like “(void) length_of_string;”.
  • Type mixups. An if statement in the first loop compares a character against NULL. But NULL is a pointer, not a character. Use the right types to show you understand what’s going on.
  • Redundant assertions. Consider assert(the_input_string[reverse_position] != '\0'). The code above the loop ensures that this assertion is true! The assertion indicates that the programmer didn’t understand their own code.
  • Not using standard library functions. The first loop could be written much shorter as reverse_position = strlen(the_input_string); The standard library, particularly the simple functions like strlen, forms a basic vocabulary all C programmers share. Use it!
  • Not following standard library conventions. Most C standard library string functions have names that start with str. Furthermore, those standard functions write their output through their first argument. (strcpy(a, b) copies the string b into the string a.) This function writes its output through its second argument.
  • Overusing standard library functions. Note the use of memcpy inside the second loop to copy a single character.
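
Several of the points above can be seen in isolation in a small sketch. The helper names here (last_index, put_char, ignore_flag) are hypothetical and exist only for illustration; they are not part of the course code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: index of the last character of s, or -1 if s is
   empty. strlen replaces the hand-written scanning loop. */
int last_index(const char *s) {
    assert(s);
    return (int) strlen(s) - 1;
}

/* Hypothetical helper: store one character with plain assignment (no
   memcpy), then terminate with the character '\0', not the pointer NULL.
   The caller must supply room for two chars. */
void put_char(char *dst, char c) {
    dst[0] = c;
    dst[1] = '\0';
}

/* Hypothetical callback whose parameter is deliberately ignored. */
int ignore_flag(int flag) {
    (void) flag;   // explicit: leaving this unused is intentional
    return 0;
}
```

Each helper is a few lines long, so none of them needs the running commentary that Example 2 carries.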

Example 3: “Just Right” (sort of)

  • Short, yet evocative, variable names (dst and src).
  • Use of C types to provide more information about arguments (the const in const char *src promises that the function will not modify the characters in src).
  • Standard library functions used when appropriate (strlen).
  • Following standard library conventions for function names and argument order.
  • Correct, yet brief, description of the function’s job.
  • Assertions that check for error conditions, including one you might not have thought of on first reading.
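
As a rough usage sketch, here is how a caller might use Example 3. The function is repeated so the sketch compiles on its own; is_palindrome and its 64-byte buffer are assumptions made for illustration. Note that strreverse does not check the size of dst, so the caller must provide room for strlen(src) + 1 characters.

```c
#include <assert.h>
#include <string.h>

/* Example 3, repeated here so this sketch is self-contained. */
int strreverse(char *dst, const char *src) {
    assert(dst && src);  // neither parameter should be NULL
    assert(dst != src);  // we can't reverse a string into itself
    int len = strlen(src);
    for (int pos = 0; pos != len; ++pos)
        dst[pos] = src[len - pos - 1];
    dst[len] = 0;
    return len;
}

/* Hypothetical caller: returns 1 if word reads the same in both
   directions, 0 if it does not, and -1 if word does not fit in the
   local buffer (an arbitrary limit chosen for this sketch). */
int is_palindrome(const char *word) {
    char buf[64];
    if (strlen(word) >= sizeof(buf))
        return -1;
    strreverse(buf, word);
    return strcmp(buf, word) == 0;
}
```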

You may find Example 3 harder to follow than Example 2 at first. Learning to read code like Example 3 quickly is a skill, and talented programmers will disagree on whether Example 3 is really “just right,” or too terse. Feel free to write code to your taste! But do think about your style, and think about the points above, and you’ll become a better programmer naturally.

The Golden Rule

Be consistent…

  • If a construct appears more than once, you should format it the same way each time. So format function calls the same way each time; name variables in similar ways; and so forth.
  • Develop a set of conventions and stick to them.

…within reason.

  • There are always special situations that call for flexibility. The end goal is readability; don’t lose sight of it.

Formatting

Again, the particular format you choose matters less than applying it consistently throughout your program.

  • Choose an indentation depth and stick with it. 4 spaces is a common indentation depth.
  • Consider indenting with spaces and not tabs. Tabs are displayed at different widths by different editors and diff tools, so tab-indented code can look misaligned to other readers.
  • Use line breaks to separate logical sections in your program.
  • Separate longer programs/files into sections. For example, group all the macros together, all the data structure definitions together, all the global variables together, etc.
  • Avoid super long lines. 80 characters is a common limit, but it's not a hard limit in this course.
  • Be consistent with curly brackets.
  • Be consistent with white space between operators.
  • A variable or function name should be clear and concise, such that the name is sufficient to understand the purpose of the variable/function/struct.
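
One possible set of choices, applied uniformly, might look like the sketch below: 4-space indentation, opening braces on the same line, braces around every body, spaces around binary operators, and blank lines between logical sections. The function sum_positive is hypothetical, and other consistent choices are just as acceptable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function used only to illustrate consistent formatting:
   adds up the strictly positive entries of values[0..count-1]. */
int sum_positive(const int *values, size_t count) {
    assert(values != NULL || count == 0);

    int total = 0;

    for (size_t i = 0; i != count; ++i) {
        if (values[i] > 0) {
            total += values[i];
        }
    }

    return total;
}
```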

Comments

Commenting is good, but commenting too much can be bad. Even worse, comments need to be updated when code is changed, so having more comments requires more maintenance.

Ideally, you should try to create code that is itself readable and does not require comments. This isn't always possible though, so comments may be necessary to explain especially tricky or complicated sections in your program. If another student cannot read your code (or worse, you can't understand your own code), consider changing the code to be more readable, or add a comment if all else fails. That said, here are some tips for commenting:

  • Make the code itself readable.
  • Comment the tricky sections/hacks.
  • Avoid comments that simply repeat what the code does, or state the obvious.
  • Comment each function on what it does, what its arguments are, and what the return value is. See the part about Doxygen below.
  • If working with multiple source files, write a brief comment at the top of the file explaining the purpose of the file.

Although not required, one good practice is to use an automatic documentation generator. Doxygen will scan through your source files and automatically generate documentation (as HTML, PDF, etc.), provided that your comments are formatted in a certain way. For Java programmers, this is very similar to Javadoc.
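
A Doxygen-style function comment might look like the following sketch. The function count_char is hypothetical; @brief, @param, @return, and @a are standard Doxygen commands.

```c
#include <assert.h>
#include <stddef.h>

/**
 * @brief Count how many times a character occurs in a string.
 *
 * @param s  Null-terminated string to search; must not be NULL.
 * @param c  Character to look for.
 * @return   Number of positions in @a s that hold @a c.
 */
size_t count_char(const char *s, char c) {
    assert(s);
    size_t n = 0;
    for (; *s; ++s) {
        if (*s == c) {
            ++n;
        }
    }
    return n;
}
```

The comment says what the function does, what each argument means, and what comes back, without narrating the loop.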

Miscellaneous

  • Check every return value for errors.
  • Avoid creating global variables, unless it's really necessary.
  • Use macros instead of “magic numbers.”
  • Use prototypes (also called “signatures”), and put them either at the top of the source code file or in a corresponding header file (foo.h). foo.c will include foo.h and provide the actual implementations.
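
The sketch below tries to show three of these points together: a macro in place of a magic number, a prototype ahead of the implementation, and a check on every return value. The names GREETING_CAPACITY and make_greeting are hypothetical; in a larger program the macro and the prototype would probably live in a header file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define GREETING_CAPACITY 64   // named constant instead of a bare 64

/* Prototype; a header file (say, greeting.h) would normally hold this. */
char *make_greeting(const char *name);

/* Returns a newly allocated "Hello, <name>!" string, or NULL if the
   allocation fails or the result would not fit. The caller frees it. */
char *make_greeting(const char *name) {
    assert(name);

    char *buf = malloc(GREETING_CAPACITY);
    if (!buf) {
        return NULL;                          // malloc can fail
    }

    int n = snprintf(buf, GREETING_CAPACITY, "Hello, %s!", name);
    if (n < 0 || n >= GREETING_CAPACITY) {    // error, or output truncated
        free(buf);
        return NULL;
    }

    return buf;
}
```

Callers, in turn, should check the result of make_greeting for NULL before using it.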