Coding style refers to the literary aspects of software. What vocabulary is used? How are functions and variables named? What are the spelling and punctuation conventions? What orthography?
Everyone has their own speaking style, their own prose style, and their own coding style. This is great—the more the better!—but some coding styles are worse than others. Code should not only be correct and performant: it should also be readable and easily understandable by other human beings, such as your future self. (Anyone can read the code they wrote earlier that day. The real test comes a couple days later during debugging!) Since humans often need to understand or modify code, coding styles that confuse human readers cause problems. In the conflict between descriptive and prescriptive linguistics, coding style must be a little prescriptive.
The best way to make code readable is to pick a consistent style and stick with it. And the best styles to pick are those with precedent in others’ code.
This guide isn’t about mandatory rules; treat it instead as a set of suggestions. Alternatively, you can follow the “K&R style”, named after Brian Kernighan and Dennis Ritchie, the authors of the book The C Programming Language (Ritchie also created the language); or perhaps GNU’s style guidelines on C.
Example: The Three Bears
What does this code do?
int srev(char*x, char*y) { int l;for(l=0; y[0];l++)
y++;while(l){*x++ =*--y;--l;
}*x = NULL
;}
OK, what does this code do?
int reverse_string(char* the_input_string, char* the_string)
{
// Function: The characters in "the_input_string" are reversed and stored into "the_string".
// For instance, if "the_input_string" equals "abcde", then after the function returns,
// "the_string" will equal "ebcda". The function returns 0 on success and -1 on failure.
int length_of_string = strlen(the_input_string);
int position = 0; // tracks our position in the OUTPUT string
int reverse_position = 0; // tracks our position in the input string
char *the_output_string; // the destination of the copy
the_output_string = the_string;
// We want to copy the_input_string IN REVERSE!!! So we need to move reverse_position to the
// END of the_input_string first.
for (reverse_position = 0; ; reverse_position += 1)
{
// When we reach a null character we are at the end of the string!
if (the_input_string[reverse_position] == NULL)
{
// So we should exit the loop
break;
} // end if
} // end for
// That loop left reverse_position pointing ONE BEYOND the last character in the string. That's no good!
reverse_position -= 1;
// Now run backwards!!
while (reverse_position >= 0)
{
// Make sure we really have a character!!
assert(the_input_string[reverse_position] != '\0');
// OK, we have a character.
// Better copy it to the output string!
memcpy (&the_output_string[position],
&the_input_string[reverse_position],
1);
// Great, we did the copy!! Advance to the next position in the output string.
position += 1;
// And back up to the PREVIOUS position in the INPUT string!
reverse_position -= 1;
} // end while
// OK, we've copied all the characters into the output string. But we're not done yet!!!
// We did not copy the last character in the string. Better do that now.
memset(&the_output_string[position], '\0', 1);
// Now we're really done!!
return position;
} // end reverse_string function!
And finally, what does this code do?
/** @brief Reverse the characters of @a src into @a dst, returning the length of @a src. */
size_t strreverse(char* dst, const char* src) {
assert(dst && src); // neither parameter should be NULL
assert(dst != src); // we can’t reverse a string into itself
size_t len = strlen(src);
for (size_t pos = 0; pos != len; ++pos) {
dst[pos] = src[len - pos - 1];
}
dst[len] = 0;
return len;
}
Well, they all do the same thing: they reverse the input string. But we feel the style of the third example is much better than the styles of the first two. The first example is too terse, the second too verbose, and the third pretty good. Here are some of the problems with the first two examples, and how the third example avoids them.
Example 1
- Inscrutable variable and function names (x? y? srev?) don’t give good hints about the function’s behavior.
- Horrible, inconsistent indentation makes the program look like random noise.
- No explanatory comments.
- No parameter checking.
- Incorrect code: the function claims to return an int, but the function body doesn’t have a return statement.
Example 2
Beginning students often err in this direction: making code too verbose.
- Verbose variable names (the_string, the_input_string) take up space and mental energy without adding descriptiveness.
- Even though the variable names are verbose, they’re not consistent. length_of_string is actually length_of_input_string. Why have both the_string and the_output_string? Why not call the parameter version the_output_string, to contrast with the_input_string?
- Chatty comments that repeat information obvious from the code. (E.g. // ... exit the loop.)
- Comments that lie. The comment above memset, near the end of the function, claims that the following line copies the last character in the input string. No it doesn’t: the line terminates the output string with a null character. And the header comment claims the function returns 0 or -1. No it doesn’t: it returns the length of the input string.
- Unused variables (length_of_string). Did the programmer make them redundant and then just forget to remove them, or does the programmer really need them, but hasn’t gotten around to using them yet? If you really intend to leave a variable unused, tell the reader explicitly, with a statement like “(void) length_of_string;”.
- Type mixups. An if statement in the first loop compares a character against NULL. But NULL is a pointer, not a character. Use the right types to show you understand what’s going on.
- Redundant assertions. Consider assert(the_input_string[reverse_position] != '\0'). The code above the loop ensures that this assertion is true! The assertion indicates that the programmer didn’t understand their own code.
- Not using standard library functions. The first loop could be written much shorter as
    reverse_position = strlen(the_input_string);
  The standard library, particularly the simple functions like strlen, forms a basic vocabulary all C programmers share. Use it!
- Not following standard library conventions. Most C standard library string functions have names that start with str. Furthermore, those standard functions modify their first argument. (strcpy(a, b) copies the string b into the string a.) This function modifies its second argument.
- Overusing standard library functions. Note the use of memcpy inside the second loop to copy a single character.
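A minimal sketch of three points from the list above: compare characters against '\0' rather than NULL, prefer strlen to a hand-written loop, and mark an intentionally unused variable with a (void) cast. The function names manual_length, library_length, and count_spaces are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hand-written length loop, kept only for comparison. The character is
// compared against '\0' (a character), not NULL (a pointer constant).
size_t manual_length(const char* s) {
    assert(s);
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// The same job expressed with the vocabulary all C programmers share.
size_t library_length(const char* s) {
    assert(s);
    return strlen(s);
}

// Hypothetical function whose `flags` parameter is intentionally unused.
// The (void) cast tells the reader that this is deliberate.
size_t count_spaces(const char* s, int flags) {
    (void) flags;
    assert(s);
    size_t count = 0;
    for (size_t pos = 0; s[pos] != '\0'; ++pos) {
        if (s[pos] == ' ') {
            ++count;
        }
    }
    return count;
}
```

Both length functions return the same value for any valid string; the second is shorter and harder to get wrong.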
Example 3: “Just Right” (sort of)
- Short, yet evocative, variable names (dst and src).
- Use of C types to provide more information about arguments (the const in const char* src promises that the function will not modify the characters in src).
- Standard library functions used when appropriate (strlen).
- Types match standard library types (strlen returns size_t, so does this function).
- Following standard library conventions for function names and argument order.
- Correct, yet brief, description of the function’s job.
- Assertions that check for error conditions, including one you might not have thought of on first reading.
You may find Example 3 harder to follow than Example 2 at first. Learning to read code like Example 3 quickly is a skill, and talented programmers will disagree on whether Example 3 is really “just right,” or too terse. Feel free to write code to your taste! But do think about your style, and think about the points above, and you’ll become a better programmer naturally.
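To see how Example 3's conventions look from the caller's side, here is a small sketch. strreverse is repeated from Example 3; reversed_copy is a hypothetical helper, not part of the original, and it assumes the caller frees the returned buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/** @brief Reverse the characters of @a src into @a dst, returning the length of @a src. */
size_t strreverse(char* dst, const char* src) {
    assert(dst && src);   // neither parameter should be NULL
    assert(dst != src);   // we can't reverse a string into itself
    size_t len = strlen(src);
    for (size_t pos = 0; pos != len; ++pos) {
        dst[pos] = src[len - pos - 1];
    }
    dst[len] = 0;
    return len;
}

/** @brief Return a newly allocated reversed copy of @a src, or NULL if
 *  allocation fails. Hypothetical helper; the caller must free the result. */
char* reversed_copy(const char* src) {
    assert(src);
    char* dst = malloc(strlen(src) + 1);   // +1 for the terminating null
    if (dst == NULL) {
        return NULL;
    }
    strreverse(dst, src);
    return dst;
}
```

Because the destination comes first, as in strcpy, the call strreverse(dst, src) reads the way a C programmer already expects. The dst != src assertion matters because the loop would overwrite characters of src before it had read them.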
The Golden Rule
Be consistent…
- If a construct appears more than once, you should format it the same way each time. So format function calls the same way each time; name variables in similar ways; and so forth.
- Develop a set of conventions and stick to them.
…within reason.
- No conventions will fit all situations. The end goal is readability; don’t lose sight of it.
Formatting
Again, it doesn't really matter how you format your program, as long as you do so consistently and apply the same style rules uniformly to your program. But here are some guidelines.
- Choose an indentation depth and stick with it. 4 spaces is a common indentation depth, and the code we hand out indents by 4 spaces.
- Use line breaks to separate logical sections in your program.
- Separate longer programs/files into sections. For example, group all the macros together, all the data structure definitions together, all the global variables together, etc.
- Avoid super long lines. 80 characters is a common limit, but it's not a hard limit in this course.
- Be consistent with curly brackets.
- Be consistent with white space between operators.
- A variable or function name should be clear and concise, such that the name is sufficient to understand the purpose of the variable/function/struct.
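The guidelines above might look like this in a small file. This is only one possible layout; the names MAX_POINTS, point, and sum_x are hypothetical, and the section markers are a matter of taste.

```c
#include <assert.h>
#include <stddef.h>

// ---- Macros ----
#define MAX_POINTS 8

// ---- Data structures ----
typedef struct point {
    int x;
    int y;
} point;

// ---- Functions ----

// Return the sum of the x coordinates of the first n points.
int sum_x(const point* pts, size_t n) {
    assert(pts || n == 0);
    assert(n <= MAX_POINTS);

    int total = 0;
    for (size_t i = 0; i != n; ++i) {
        total += pts[i].x;
    }

    return total;
}
```

Every block indents by 4 spaces, opening braces always sit at the end of the line, binary operators always have a space on each side, and blank lines separate the checks, the loop, and the return.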
Comments
Commenting is good, but commenting too much can be bad. Comments need to be updated whenever the code changes, so more comments mean more maintenance, and a comment that falls out of date misleads the reader.
Ideally, you should try to create code that is itself readable and does not require comments. This isn't always possible though, so comments may be necessary to explain especially tricky or complicated sections in your program. If another student cannot read your code (or worse, you can't understand your own code), consider changing the code to be more readable, or add a comment if all else fails. That said, here are some tips for commenting:
- Make the code itself readable.
- Comment the tricky sections/hacks.
- Avoid comments that simply repeat what the code does, or state the obvious.
- Comment each function on what it does, what its arguments are, and what the return value is. See the part about Doxygen below.
- If working with multiple source files, write a brief comment at the top of the file explaining the purpose of the file.
Although not required, one good practice is to use an automatic documentation generator. Doxygen will scan through your source files and automatically generate documentation (as HTML, PDF, etc.), provided that your comments are formatted in a certain way. For Java programmers, this is very similar to Javadoc.
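As a rough sketch of what Doxygen-formatted comments can look like, the block below documents a file and one function using the @file, @brief, @param, and @return commands. The clamp function is a hypothetical example chosen only to have something to document.

```c
/**
 * @file
 * @brief Small integer helpers (hypothetical file used only for this sketch).
 */

#include <assert.h>

/**
 * @brief Clamp @a value to the inclusive range from @a lo to @a hi.
 *
 * @param value  The number to clamp.
 * @param lo     Lower bound; must not exceed @a hi.
 * @param hi     Upper bound.
 * @return @a lo if @a value is below the range, @a hi if it is above,
 *         and @a value otherwise.
 */
int clamp(int value, int lo, int hi) {
    assert(lo <= hi);   // an empty range is a caller error
    if (value < lo) {
        return lo;
    }
    if (value > hi) {
        return hi;
    }
    return value;
}
```

The comment says what the function does, what each argument means, and what comes back, without repeating how the body works.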
Miscellaneous
- Check every return value for errors.
- Avoid creating global variables, unless it's really necessary.
- Use macros instead of “magic numbers.”
- Use prototypes (also called “signatures”), and put them either at the top of the source code file or in a corresponding header file (foo.hh). foo.cc will include foo.hh and provide the actual implementations.
- When printing to stdout or stderr, it is good practice to append a newline (\n) to whatever string is being printed. This prevents anything printed afterward (expected or unexpected, like an error message) from messily joining to your intended message:
    My message has printed!Aborted (core dumped)
  Instead print:
    My message has printed!
    Aborted (core dumped)
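Several of these points fit in one short sketch. The names BUFFER_SIZE, make_message, and print_message are hypothetical. In a multi-file project the two prototypes would normally live in a header such as foo.hh rather than at the top of the source file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BUFFER_SIZE 64   // a named constant instead of a bare 64

// Prototypes up front; the implementations follow below.
int make_message(char* buf, size_t size, int count);
int print_message(FILE* out, const char* msg);

// Format a newline-terminated status line into buf.
// Returns 0 on success, -1 if formatting failed or the text did not fit.
int make_message(char* buf, size_t size, int count) {
    assert(buf && size > 0);
    int n = snprintf(buf, size, "Processed %d items\n", count);
    // snprintf returns a negative value on error; otherwise it returns the
    // length the full text needs, which may be larger than the buffer.
    if (n < 0 || (size_t) n >= size) {
        return -1;
    }
    return 0;
}

// Write msg to out, checking the return value of fputs.
// Returns 0 on success, -1 on a write error.
int print_message(FILE* out, const char* msg) {
    assert(out && msg);
    if (fputs(msg, out) == EOF) {
        return -1;
    }
    return 0;
}
```

A caller would declare char buf[BUFFER_SIZE], call make_message(buf, sizeof buf, 3), and pass the result to print_message only if the first call returned 0. Because the message already ends in \n, anything printed later starts on its own line.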