Code is read more often than it is written.
Despite this fact, programmers often write code as if they do not expect to have to maintain it or even read it in the future. This leads to code that is incomprehensible when it is read months or years later.
Therefore, one of the most important things you can do as a programmer (in any language) is to write readable code.
This chapter explores principles for writing readable code, as well as some of the standards adopted by the Python community at large for writing code in a consistent manner.
Before discussing specific standards that the Python community has adopted, or additional recommendations that have been proposed by others, it is important to consider a few overarching principles.
Remember that the purpose of readability standards is to improve readability. The rules exist to serve the people reading and writing code, not the other way around.
This section discusses a few principles to keep in mind.
It is very easy to believe that the work you are doing at the moment will not require additions or maintenance in the future. This is because it is difficult to anticipate future needs, and it is easy to underestimate your own propensity to introduce bugs. However, very little of the code that you write will simply exist untouched into perpetuity.
If you assume that code that you are writing is going to be “a one-off” and something that you will not have to read, debug, or amend later, it is frighteningly easy to ignore other principles of readable code simply because you believe that “it does not matter this time.”
Therefore, preserve a healthy distrust of any instinct you may have that code you write will not need to be maintained. The safe bet is always that you will see that code again. Furthermore, if you do not, someone else will.
The two aspects of consistency are internal consistency and external consistency.
Your code should be as internally consistent as possible. This is true both of style and structure. The style should be consistent in that any formatting rules should be followed throughout the project. The structure should be consistent in that the same types of code should be organized into the same places, so that projects are navigable.
Your code should also be externally consistent. Structure your projects and your code similarly to how other people do. If a new developer opens up your project, he or she should not react by saying, “I have never seen anything like this before.” Community guidelines matter, because they are what developers will expect to see when they come to your project. Similarly, and for the same reasons, take seriously the standards surrounding how to accomplish common tasks and how to organize code when using certain frameworks.
Ontology basically means “the study of being.” In philosophy (where the word is most commonly used), ontology is the study of the nature of reality and existence, and is a subset of metaphysics.
When it comes to writing software applications, ontology refers to a focus on what the various “things” in your application are. How do you represent your concepts in your database? What about your class structure?
What this sort of question ultimately affects is the way you write and structure your code. Do you use inheritance or composition to structure the relationship between two classes? In what database table does this or that column belong?
This advice effectively boils down to, “Think before you write.” Specifically, think about what the objects in your application are, and how they interact with one another. Your application is a world where objects and data interact. So, what are the rules by which they work together?
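As one illustration of the inheritance-versus-composition question, the following minimal sketch models both kinds of relationship. The class names (Engine, Car, and ElectricCar) are hypothetical and chosen only for this example; the right answer for a real application depends on what its objects actually are.

```python
class Engine(object):
    """Represent an engine that can be started."""

    def start(self):
        return 'engine started'


class Car(object):
    """Represent a car, which *has an* engine (composition)."""

    def __init__(self, engine):
        self.engine = engine

    def start(self):
        # Delegate to the component rather than inheriting from it.
        return self.engine.start()


class ElectricCar(Car):
    """Represent an electric car, which *is a* car (inheritance)."""

    def charge(self):
        return 'charging'
```

Here a car is not a kind of engine, so the engine is stored as an attribute, while an electric car is a kind of car, so it subclasses Car.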
When writing code, consider situations in which you are reusing a value that could change over time. Is that value being used in multiple modules and functions? How much work would it be to change it if it became necessary to do so?
The same principle applies to functions. Do you have a common boilerplate that you find yourself constantly repeating throughout your application? If the boilerplate is longer than a couple of lines, you may want to consider abstracting it out into a function, so that if the need to change it arises, it is manageable to do so.
On the other hand, it is possible to take this principle too far. Not every value needs to be defined as a constant in a module (and doing so can impair readability and maintainability). Use wise judgment. Keep asking the question, “If this changes, how much work would it be to update it everywhere?”
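A minimal sketch of this idea follows. The constant MAX_LOGIN_ATTEMPTS and both function names are hypothetical, invented for this example; the point is only that the value and the arithmetic around it each live in one place.

```python
# Hypothetical module-level constant: the one place to update the value.
MAX_LOGIN_ATTEMPTS = 3


def attempts_remaining(attempts_made):
    """Return how many login attempts are left, never less than zero."""
    return max(MAX_LOGIN_ATTEMPTS - attempts_made, 0)


def is_locked_out(attempts_made):
    """Return True if no login attempts remain."""
    return attempts_remaining(attempts_made) == 0
```

If the limit ever changes, only the constant needs to be edited, and every caller of either function picks up the new value.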
Your code is a story. It is an explanation of what occurs, from beginning to end, as users interact with your program. The program starts in one location (potentially with some input), moves through a series of “choose your own adventure” steps to reach an end point, and then concludes (probably with some output).
Consider adopting a commenting style where each group of a few lines of code is preceded by a comment block explaining what that code is doing. If your code is a story, your comments are an illumination and explanation of that story.
When narrative commenting is done well, a reader can parse the code (for example, when trying to troubleshoot a problem or maintain the code) by reading the comments to get the story, then quickly zero in on the code that requires maintenance, and only then focus on the vocabulary of the code itself.
Narrative commenting also helps explain intent. It helps answer the question, “What did the person who wrote this code aim to accomplish?” Occasionally, it will help answer the question, “Why was this done this way?” These are questions you naturally ask when you read code, and providing the answers to those questions aids in understanding.
Therefore, comments should explain the rationale for anything in the code that is not simple and salient. If a somewhat complex algorithm is being used, consider including a link to an article explaining the pattern and providing other examples of its use.
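The following sketch shows what narrative commenting might look like in practice. The function summarize_orders and the shape of its input (a list of dictionaries with 'price' and 'cancelled' keys) are assumptions made up for this example.

```python
def summarize_orders(orders):
    """Return the total and average price of the non-cancelled orders."""
    # Cancelled orders should not count toward revenue, so filter them
    # out before doing any arithmetic.
    active = [order for order in orders if not order['cancelled']]

    # An empty list would cause a division by zero below; report zeros
    # instead, since "no orders" is a normal state rather than an error.
    if not active:
        return {'total': 0.0, 'average': 0.0}

    # Add up the prices, then derive the average from the total.
    total = sum(order['price'] for order in active)
    return {'total': total, 'average': total / len(active)}
```

Reading only the comments gives the story (filter, guard, compute), and each comment explains why a step exists rather than restating the code.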
The most important principle for writing maintainable code is colloquially known as Occam's Razor: the simplest solution is usually the best one. In his “The Zen of Python” web posting (https://www.python.org/dev/peps/pep-0020/), which is a collection of proverbs for programming (type import this in a Python console to read it), Tim Peters includes a similar line: “If the implementation is hard to explain, it's a bad idea.”
This principle is true in both how your code works and how it looks. When it comes to how your code works, simple systems are more maintainable. Simplicity of implementation means that you are less likely to write esoteric bugs, and that those who come after you to maintain your work (including yourself) are more likely to intuitively understand what is happening and be able to add to the application without hitting unexpected snags.
As far as how your code looks, remember that, as much as is possible, reading code should be about learning the story of what the code is doing, not about parsing the vocabulary. The vocabulary is the means, while the story is the end. It is easy to write rules such as, “Do not use ternary operators.” However, following rules you can run through a linter (while valuable) is not a sufficient condition for clarity. Focus on writing and organizing code so that it is as simple as possible.
The Python community largely follows a style guide known as PEP 8 (https://www.python.org/dev/peps/pep-0008/), whose authors include Guido van Rossum (the creator of Python), and which is adopted by most major Python projects, including the Python standard library.
The universality of the PEP 8 standard is one of its greatest strengths. It has been adopted by so much of the community that you can reasonably expect that most Python code you encounter will conform to it. As you write code this way, it will become easier to read code written similarly.
Many of the guidelines in PEP 8 are quite straightforward. Highlights include the following:
Use four spaces for indentation. Do not use literal tabs (\t).
Variables should be spelled with underscores, not camel case (my_var, not myVar). Class names start with a capital letter and are in camel case (for example, MyClass).
If a variable is intended to be “internal use only,” prefix it with an underscore.
Use a single space around operators (for example, x + y, not x+y), including assignment (z = 3, not z=3), except in keyword arguments, in which case, the spaces are omitted.
Omit unnecessary whitespace in lists and dictionaries (for example, [1, 1, 2, 3, 5], not [ 1, 1, 2, 3, 5 ]).
Read the Python style guide for additional examples and further discussion on these rules.
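The highlights above can be seen together in one short sketch. The ShoppingCart class and its attribute names are hypothetical and exist only to demonstrate the formatting rules.

```python
class ShoppingCart(object):
    """Hold item prices and compute totals."""

    def __init__(self, prices=None):
        # The leading underscore marks this attribute as internal.
        self._prices = list(prices or [])

    def total(self, tax_rate=0.0):
        """Return the sum of all prices with tax applied."""
        sub_total = sum(self._prices)
        return sub_total + sub_total * tax_rate


# Spaces surround the assignment, but not the = in keyword arguments.
cart = ShoppingCart(prices=[1, 1, 2, 3, 5])
cart_total = cart.total(tax_rate=0.5)
```

Note the four-space indentation, the capitalized class name, the underscore-separated variable names, and the list literal with no padding inside its brackets.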
Remember that, in Python, if the first statement in a function or class is a string, that string is automatically assigned to the special __doc__ variable, and is then used if you call help (and in a few other cases).
PEP 8 defers to PEP 257 for docstring conventions, and PEP 257 designates that docstrings (as they are colloquially called) should be written as an imperative sentence.
"""Do X, Y, and Z, then return the result."""
This is contrasted with writing the docstring as a description, which is frowned upon.
"""Does X, Y, and Z, then returns the result."""
If the docstring is a single line, follow it with an empty line before the body of the class or function begins. If the docstring spans multiple lines, place the closing quotes on their own line in lieu of the empty line.
"""Do X, Y, and Z, then call the a() method to transform all the things,
then return the result.
"""
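To show a docstring in context, here is a minimal sketch of a function with a multi-line imperative docstring. The function normalize is hypothetical and was written only for this example.

```python
def normalize(values):
    """Scale the values so that they sum to 1, then return them.

    The values are assumed to be non-negative numbers with a nonzero sum.
    """
    total = float(sum(values))
    return [value / total for value in values]
```

Because the string is the first statement in the function, it is available as normalize.__doc__, and it is what help(normalize) displays.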
Blank lines are used for logical segmentation.
PEP 8 designates that two blank lines should separate “top level” classes and function definitions in a module.
class A(object):
    pass


class B(object):
    pass
PEP 8 also designates that after the top level, class and function definitions should be separated by one blank line each.
class C(object):
    def foo(self):
        pass

    def bar(self):
        pass
It is acceptable to use single blank lines within functions or other blocks of code to delineate logical segments. Consider preceding all such segments with comments explaining the block.
Python allows both absolute and relative imports. In Python 2, the interpreter will attempt a relative import, and then attempt an absolute import if no relative import matches.
In Python 3, relative imports are given a special syntax (a leading period character, written as .) and “normal” imports only attempt absolute imports. The Python 3 syntax has been available since Python 2.5. Additionally, in Python 2 you can turn off implicit relative imports using from __future__ import absolute_import.
You should always stick to absolute imports whenever possible. If you must use a relative import, you should use the explicit style. If you are writing code for Python 2.6 and 2.7, consider explicitly opting in to the Python 3 behavior.
When you are importing modules, each module should be given its own line.
import os
import sys
However, if you are importing multiple names from the same module, it is perfectly acceptable to group them on the same line.
from datetime import date, datetime, timedelta
Additionally, although PEP 8 does not mandate this, consider keeping imports grouped by the packages that they come from. Within each group, sort imports by alphabetical order.
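One common way to apply this grouping is sketched below, using only standard library modules so that the example runs anywhere. The ordering of the groups (standard library, then third-party packages, then the current project) is a widespread convention and is assumed here; the variable names are illustrative.

```python
# Standard library group: plain imports first, then from-imports, with
# each kind in alphabetical order and one module per line.
import os
import sys
from datetime import date, datetime, timedelta

# Third-party imports would form a second group here, followed by a
# third group for imports from the current project.

start = datetime(2020, 1, 31, 12, 0)
next_day = (start + timedelta(days=1)).date()
is_february = next_day == date(2020, 2, 1)
running_python_3 = sys.version_info[0] >= 3
log_path = os.path.join('logs', 'app.log')
```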
Also, when doing imports, do not forget about the ability to alias names that are imported using the as keyword.
from foo.bar import really_long_name as name
This often allows you to shorten long or unwieldy names that are going to be repeated often. Aliasing is a valuable tool when an import is used frequently, and when the original name is difficult for whatever reason.
On the other hand, remember that when you do this, you are effectively masking the original name within your module, which can reduce clarity if you do it when it is not really necessary. Like any tool, use this with discretion.
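As a concrete case, the standard library module xml.etree.ElementTree is often imported under the short alias ET. The sketch below assumes that alias and uses a made-up XML string.

```python
import xml.etree.ElementTree as ET

# The alias keeps repeated references short without hiding their origin.
root = ET.fromstring('<book><title>Readable Code</title></book>')
title = root.find('title').text
```

This alias works well because it is widely recognized; an alias that only you would recognize is more likely to confuse a reader than to help.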
As mentioned earlier, variable names are spelled with underscores, not camel case (for example, my_var, not myVar). Additionally, it is important that variable names be descriptive.
It is generally not appropriate to use extremely short variable names, although there are situations where this is acceptable, such as the iterator variables in loops (for example, for k, v in mydict.items()).
Avoid naming variables after common names already in the Python language, even when the interpreter would allow it. You should never name a variable or a function something like sum or print. Similarly, avoid type names such as list or dict.
If you must name a variable after a Python type or keyword, the convention is to include a trailing underscore; this is explicitly preferable over altering the spelling. For example, if you are passing a class to a function, the function argument should be named class_, not klass. (The exception to this is class methods, which by convention take cls as their initial argument.)
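These naming conventions can be sketched as follows. The function build and the class Point are hypothetical, introduced only to show the trailing underscore, the cls argument, and names that avoid shadowing built-ins.

```python
def build(class_, *args, **kwargs):
    """Create and return an instance of the given class."""
    return class_(*args, **kwargs)


class Point(object):
    """Represent a point in two dimensions."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def origin(cls):
        """Return the point at (0, 0)."""
        return cls(0, 0)


# Descriptive names that leave the built-in sum and list untouched.
prices = [2, 3, 5]
price_total = sum(prices)
```

Had the total been stored in a variable named sum, the call to the built-in sum would have stopped working for the rest of the module.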
Comments should be written in English, using complete sentences, and written in a block above the relevant code. You should use correct capitalization, spelling, and grammar.
Also, ensure that comments are kept up to date. If the code changes, the comments may need to change along with it. You do not want to end up with a series of comments that actually contradict the code, which can easily cause confusion.
Modules may include a comment header, usually generated by your version-control system, detailing the version of that file. This can make it easier to see if the file has been changed, and is particularly useful if you are distributing a module for use by others.
The single most controversial (and most often rejected) aspect of the Python style guide is its limitations on line length. PEP 8 requires that lines be no longer than 79 characters, and that docstring lines be no longer than 72 characters.
This rule frustrates many developers, who point out that we live in an age of 27-inch monitors and widescreen displays. GitHub, a popular website for sharing code, uses a window with a width of 120 characters.
Proponents point out that many people still use narrower displays or 80-character terminals, or simply do not set their code window up to maximize the screen.
There will likely never be harmony on this issue. You should code to the standards of the projects you are working on. Regardless of whether you conform to a 79-character standard or some greater width, you should know how to wrap code when the situation arises.
The best way to wrap a long single line is by using parentheses, as shown here:
if (really_long_identifier_that_maybe_should_be_shorter and
        other_really_long_identifier_that_maybe_should_be_shorter):
    do_something()
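Whenever it is feasible, use this method instead of using a \ character before the line break. Note that in cases where an operator such as and is being used, older versions of PEP 8 placed it before the line break, as in this example. The current guide permits a break on either side of a binary operator, as long as the choice is consistent locally, and suggests breaking before the operator in new code.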
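A runnable sketch of parenthesized wrapping follows, with the operators kept at the ends of lines to match the example above. The function can_check_out and its argument names are hypothetical.

```python
def can_check_out(cart_is_not_empty, payment_details_are_valid,
                  shipping_address_is_known):
    """Return True if every precondition for checkout is met."""
    return (cart_is_not_empty and
            payment_details_are_valid and
            shipping_address_is_known)


# Parentheses also wrap long expressions without any backslashes.
total_seconds = (2 * 24 * 60 * 60 +
                 3 * 60 * 60 +
                 15 * 60)
```

Because the expression is enclosed in parentheses, Python treats the physical lines as one logical line, so no continuation character is needed.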
It is also possible to wrap function calls. PEP 8 lists many acceptable ways to do this. The general rule to follow is that indentation of the trailing lines should be consistent.
really_long_function_name(
    categories=[
        x.y.COMMON_PHRASES,
        x.y.FONT_PREVIEW_PHRASES,
    ],
    phrase='The quick brown fox jumped over the lazy dogs.',
)
When using line continuation within a function call, list, or dictionary, include a trailing comma on the final line.
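The same trailing-comma habit applies to list and dictionary literals, as in this small sketch. The names SUPPORTED_FORMATS and DEFAULT_OPTIONS and their contents are made up for illustration.

```python
SUPPORTED_FORMATS = [
    'json',
    'xml',
    'yaml',
]

DEFAULT_OPTIONS = {
    'indent': 4,
    'sort_keys': True,
}
```

With the trailing comma in place, adding a new item later changes only one line, which keeps version-control diffs small and avoids forgetting the comma on the previous line.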
Many times, the person coming along a year later and reading your code will be you. Memories are never as good as they intuitively seem to be, and code written without a constant eye to readability and maintainability will be naturally difficult to read and maintain.
Throughout this book, you have learned how to use various modules, classes, and structures in the Python language. When deciding how to solve a problem, remember that it often takes more skill to debug code than it does to write it.
Therefore, aim to have your code be as simple as possible, and as readable as possible. You will thank yourself a year from now. Your coworkers and fellow contributors will, too.