Most programming languages have standards for documenting code using comments. Java, Python, and Kotlin have explicit rules for documentation comments. They also have tools to create professional-looking API documentation from these comments!
IDEs use documentation comments to provide context-sensitive help as you write code, even for your own project code (if you write the documentation comments in a standard format).
The Python convention for writing documentation in comments is called a docstring. You should use it.
Python Docstring Comments
Documentation comments in Python are called docstrings.
Unfortunately, there are several different conventions for writing docstrings.
| Docstring Style | Description |
|---|---|
| Sphinx/Pydoc | Standard for Python, used by Sphinx |
| Google Docstring | Verbose style using visual formatting |
| NumPy/SciPy | Combines reStructuredText and Google style |
| EpyDoc | Structured docstrings similar to Javadoc |
| ISP Style | Sphinx style + Type Hints |
Common Example
All the styles have a few things in common, as shown in this example:
def gcd(a, b):
    """Return the greatest common divisor of two ints a and b.  (1 & 2)
                                                                 (3)
    The greatest common divisor is the largest int n such that  (4)
    `a/n` and `b/n` are integers.  If both a and b are zero,
    the gcd is 1, unlike the Python math.gcd which returns 0
    in this case.

    Parameters:                                                 (5)
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor
    Returns:                                                    (6)
        int: the greatest common divisor of a and b.
    Raises:
        TypeError if a or b are not type `int`.                 (7)
    """
The rules are:
1. The docstring is the first thing inside a function or class, and is written as a triple-quoted string (“””).
2. The first line of the docstring is a complete sentence describing what the function or class does, ending with a period.
   - The Sphinx style seems to allow the initial sentence to span more than one line.
3. A blank line after the first sentence.
4. (Optional) Additional text describing the function or class in more detail. You can omit this if the function is simple.
5. Describe the parameters: their names, meaning, and any preconditions on their values, e.g. “must be positive”, “may not be empty”.
6. Document what the function returns (return value).
7. Document any exceptions raised.
8. (Optional) “See Also”: references to other documents or other functions.
The docstring styles vary in how they document parameters, returns, and exceptions.
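The rules about the first line can be checked mechanically. The following is a minimal sketch, not a standard tool: the helper `check_summary` and the sample functions `gcd` and `bad` are made up for this illustration, and only `inspect.getdoc` comes from the standard library.

```python
import inspect


def check_summary(obj) -> list[str]:
    """Return a list of problems found in the first lines of obj's docstring.

    Hypothetical helper for illustration only.
    """
    doc = inspect.getdoc(obj)  # None when obj has no docstring
    if not doc:
        return ["missing docstring"]
    lines = doc.splitlines()
    problems = []
    # Rule 2: the summary is a sentence ending with a period.
    if not lines[0].rstrip().endswith("."):
        problems.append("summary line should end with a period")
    # Rule 3: a blank line separates the summary from the rest.
    if len(lines) > 1 and lines[1].strip():
        problems.append("summary should be followed by a blank line")
    return problems


def gcd(a, b):
    """Return the greatest common divisor of two ints a and b.

    More detail would go here.
    """


def bad(a, b):
    """compute something
    with no blank line"""


print(check_summary(gcd))  # no problems found
print(check_summary(bad))  # two problems found
```

A check like this only covers the shape of the docstring; whether the sentence actually describes the function still needs a human reader.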
Sphinx Style
This style uses reStructuredText (reST) markup. It is the default style generated by PyCharm and used on ReadTheDocs.org.
def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that
    a/n and b/n are integers. If both a and b are zero,
    the gcd is 1, unlike the Python math.gcd which returns 0.

    :param int a: first value for greatest common divisor
    :param int b: second value for greatest common divisor
    :returns: the greatest common divisor of `a` and `b`.
    :rtype: int
    :raises TypeError: if a or b is not an int.
    """
You can also specify parameter type on a separate line:
:param a: first value for greatest common divisor
:type a: int
- Sphinx Docstring Tutorial https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html
- reStructured Text Tutorial
- JetBrains PyCharm generates Sphinx-style docstrings by default.
- In VS Code, the autoDocstring extension can generate Sphinx-style docstrings. You have to select it in the settings (the default is Google style).
Google Docstring Style
def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers a and b.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers. If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.
    Google Docstrings allow `PEP 484`_ type annotations.

    Args:
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor

    Returns:
        int: the greatest common divisor of `a` and `b`.

    Raises:
        TypeError: if a or b is not an int.

    .. _Google Python Style Guide:
       https://google.github.io/styleguide/pyguide.html

    .. _PEP 484:
       https://www.python.org/dev/peps/pep-0484/
    """
- Relies on visual formatting and unstructured text.
- Too long. A docstring can easily fill the whole screen.
- Data types are documented in comments instead of type hints.
Google Python Style Guide covers coding style in great detail.
Numpy Docstring Style
The same style is used by SciPy.
def gcd(a, b):
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers. If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.

    Parameters
    ----------
    a : int
        first value for greatest common divisor.
    b : int
        second value for greatest common divisor.

    Returns
    -------
    int
        The greatest common divisor of a and b.

    Raises
    ------
    TypeError
        if `a` or `b` is not type `int`.

    See Also
    --------
    math.gcd : Python library function for GCD.

    .. note::
        This implementation uses Euclid's algorithm.

    You can include hyperlinks `like this <https://somehost/somepath>`_
    and relative hyperlinks `like this </reference/gcd>`_.

    Examples
    --------
    >>> gcd(16, 24)
    8
    >>> gcd(-16, -20)
    4
    """
- First Line: “A one-line summary that does not use variable names or the function name.”
- Too long! Waste of time and space on boilerplate text.
- Details: https://numpydoc.readthedocs.io/en/latest/format.html
ISP Docstring Style
In ISP, please use
- Sphinx Style Docstring, but…
- Use type hints for parameter and return types instead of writing them in comments.
Why?
- Visual formatting is a waste of your time!
  - You waste time correcting the formatting when things change.
  - Extra lines make comments longer, so you cannot see the entire comment on one screen.
- Sphinx style is the easiest for tools to parse and use.
- Type hints are used by static code analyzers to find errors. IDEs use type hints to improve code completion, offer inline help, and flag problems. Documentation tools handle Sphinx style well.
- Documenting data types in comments is useless and error-prone (you may change the data type and forget to update the comment).
- Use type hints and don’t duplicate type info in comments (DRY).
This max function works with either int or float, so we declare the type as Number which includes both int and float:
from numbers import Number

# Note: defining max here shadows Python's built-in max() in this module.
def max(a: Number, b: Number) -> Number:
    """Return the maximum of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the maximum of a and b
    :raises TypeError: if a or b is not a numeric value (Number)
    """
    if not isinstance(a, Number) or not isinstance(b, Number):
        raise TypeError("parameters must be int or float")
    if a > b:
        return a
    return b
Notice the docstring does not include the data type of parameters or returns.
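Because the types live in the signature, a tool does not need to parse the docstring to find them. Here is a small sketch of how a tool might read them with `inspect.signature` from the standard library. The function is named `largest` here (an assumed name) only to avoid shadowing the built-in `max`.

```python
import inspect
from numbers import Number


def largest(a: Number, b: Number) -> Number:
    """Return the maximum of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the maximum of a and b
    """
    return a if a > b else b


# The signature object exposes each parameter's annotation.
sig = inspect.signature(largest)
for name, param in sig.parameters.items():
    print(name, "->", param.annotation.__name__)
print("returns ->", sig.return_annotation.__name__)
```

The docstring carries the meaning of each parameter and the signature carries its type, so neither repeats the other.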
Module and Class Comments
PEP 257 recommends:
- A file begins with a module comment.
- Classes have a comment describing the class and its members.
- It is OK to omit “protected” members from comments.
"""A bank account that manages deposits and withdrawals."""
from re import split
from money import Money

class BankAccount:
    """The first line is a sentence describing bank account.

    Then a longer description of a bank account and its methods.
    """

    def __init__(self, name: str, min_balance: float | None = 0):
        """Create a new bank account with an owner and initial balance of zero.

        :param name: name of the account
        :param min_balance: minimum required balance, default is 0.
        """
Docstring for Classes
- Document the purpose of the class and what the class does, not how it does it.
- Document special dependencies or preconditions required by the class.
- Give an example of how to create objects of the class.
- Document attributes of the class.
- Don’t write a summary of all the methods (as the PEP does). That’s redundant! Each method has its own docstring that describes it.
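As an illustration of these points, here is one possible class docstring. It is a sketch under assumptions: the `deposit` method, the `balance` attribute, and the use of plain floats for amounts are invented for the example, and `:ivar:` is the Sphinx field for documenting instance attributes.

```python
class BankAccount:
    """A bank account with a minimum required balance.

    The account accepts deposits and keeps a running balance.
    Amounts are plain floats here only to keep the sketch short.

    Example:

    >>> account = BankAccount("Ann", min_balance=100.0)
    >>> account.deposit(250.0)
    >>> account.balance
    250.0

    :ivar name: name of the account owner
    :ivar min_balance: minimum balance the account should hold
    :ivar balance: current balance, initially zero
    """

    def __init__(self, name: str, min_balance: float = 0.0):
        """Create a new bank account with an initial balance of zero.

        :param name: name of the account owner
        :param min_balance: minimum required balance, default is 0.
        """
        self.name = name
        self.min_balance = min_balance
        self.balance = 0.0

    def deposit(self, amount: float) -> None:
        """Add a positive amount to the balance.

        :param amount: amount to deposit, must be positive
        :raises ValueError: if amount is not positive
        """
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
```

The class docstring says what the class is for, shows how to create an object, and lists the attributes; the details of each method stay in that method's own docstring.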
Where to Write Docstrings
You should write docstring comments for:
- Classes
  - describes purpose of the class
  - parameters of the constructor
  - public methods
  - example of using the class
- Functions and methods
  - describe purpose of the function or method
  - parameters, and restrictions on their values
  - the return value, if any
  - exceptions that may be raised
- Modules
  - describe purpose of the module
  - module docstrings go at top of the file, before imports
- Packages (for this course, package docstrings are not required)
  - put package docstrings in the package’s __init__.py file
  - purpose of the package
  - list the modules and subpackages (this can become out-of-date! Python should do this automatically)
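A hand-written list of modules can go stale, but the list can also be produced on demand. The sketch below uses `pkgutil.iter_modules` from the standard library; the `json` package is used only because it is always installed, and any package with a `__path__` would work the same way.

```python
import json      # any installed package would do; json is in the standard library
import pkgutil

# iter_modules yields one ModuleInfo(module_finder, name, ispkg) per entry
# found in the package's directory.
names = []
for info in pkgutil.iter_modules(json.__path__):
    kind = "package" if info.ispkg else "module"
    print(f"{kind}: json.{info.name}")
    names.append(info.name)
```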
Viewing Python docstrings
You can view the docstring comments in the Python interpreter. This works for functions, classes, modules, and packages (if they have a docstring):
>>> help(print)
print(...)
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file: a file-like object (stream); defaults to the current sys.stdout.
sep: string inserted between values, default a space.
...
>>> import re # 're' is the incredibly useful regular expression module
>>> help(re)
Another way is to simply print the __doc__ “magic” attribute:
>>> print(max.__doc__)
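When you read a docstring from code rather than at the prompt, `inspect.getdoc` is often more convenient than `__doc__`, because it returns the text with the common indentation and surrounding blank lines removed. (Python versions before 3.13 keep the source indentation in `__doc__`.) The function `area` below is a made-up example.

```python
import inspect


def area(width: float, height: float) -> float:
    """Return the area of a rectangle.

    :param width: width of the rectangle
    :param height: height of the rectangle
    """
    return width * height


clean = inspect.getdoc(area)   # docstring with indentation normalized
lines = clean.splitlines()

print(lines[0])   # the one-line summary
print(lines[2])   # the first :param line, starting at column 0
```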
Here’s an example class docstring from the popular requests library,
an add-on package for performing HTTP requests.
It uses the reStructuredText docstring format (PEP 287).
>>> import requests
>>> help(requests.Request)
class Request(RequestHooksMixin)
A user-created :class:`Request <Request>` object.
Used to prepare a :class:`PreparedRequest <PreparedRequest>`, which is sent to the server.
:param method: HTTP method to use.
:param url: URL to send.
:param headers: dictionary of headers to send.
:param files: dictionary of {filename: fileobject} files to multipart upload.
Usage::
>>> import requests
>>> req = requests.Request('GET', 'https://httpbin.org/get')
Use pydoc to view docstrings
Use pydoc from the command line to view documentation for packages, classes, modules, and functions:
cmd> pydoc os
(shows docstring for os module)
cmd> pydoc math.sqrt
(shows docstring for sqrt function)
cmd> pydoc requests.Request
(Request class in requests module)
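The same text can be produced from inside Python with the `pydoc` module, which is handy if you want to save or search the documentation. This is a small sketch using `pydoc.render_doc`; passing `renderer=pydoc.plaintext` asks for plain text instead of the terminal formatting that pydoc uses for bold headings.

```python
import math
import pydoc

# Render the documentation for math.sqrt as a plain string.
text = pydoc.render_doc(math.sqrt, renderer=pydoc.plaintext)
print(text)
```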
Python Doctest Comments
Doctest comments are runnable code examples included in docstrings. They provide examples of how to invoke a method, class, or function, and also provide a quick test. Here’s an example:
from typing import List

def average(values: List[float]) -> float:
    """Return the average of a list of numbers.

    Parameters:
        values: a list or tuple of numbers to average

    >>> average([2, 3, 4])
    3.0
    >>> average([2, 3, 4, 0])
    2.25
    >>> average([2])
    2.0
    """
    return sum(values)/len(values) if len(values) > 0 else 0.0
Each line starting with >>> is a Python statement that produces
some result. The next line(s) are the expected result.
The doctest module will execute the doctest statements and
compare the actual and expected results.
By default, doctest prints nothing if the test is correct and an error if it fails. Use the “verbose” option (or -v flag) to always print the result.
There are two ways to run doctest. Using the command line:
cmd> python -m doctest -v average.py
3 tests in 1 item.
3 passed and 0 failed.
Or by providing a “main” block that runs doctest:
if __name__ == '__main__':
    import doctest
    doctest.testmod(verbose=True)
The -v flag and verbose=True arguments are optional, of course.
Doctests are most commonly used in function and method docstrings, but you can also use them in a class docstring to illustrate how to use the class.
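Here is a sketch of a doctest in a class docstring. The `Stack` class is invented for the example. Normally you would run it with `python -m doctest` or `doctest.testmod()`; the last lines use `DocTestParser` and `DocTestRunner` directly only so that the example is self-contained.

```python
import doctest


class Stack:
    """A last-in, first-out collection.

    >>> s = Stack()
    >>> s.push("a")
    >>> s.push("b")
    >>> s.pop()
    'b'
    >>> len(s)
    1
    """

    def __init__(self):
        self._items = []

    def push(self, item) -> None:
        """Add item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item."""
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)


# Extract the examples from the class docstring and run them.
test = doctest.DocTestParser().get_doctest(
    Stack.__doc__, {"Stack": Stack}, "Stack", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

The examples show a reader how the class is meant to be used, and they fail loudly if the behavior of `push` or `pop` changes.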
The expected output that you write for a doctest must exactly
match the actual output. For the average function, if we wrote:
def average(values):
    """
    >>> average([2, 3, 4])
    3
    """
    return sum(values)/len(values) if len(values) > 0 else 0.0
The test will fail, because the actual output is 3.0, not 3.
If the expected output is a string, then use single quotes, not double quotes, because that’s the way the Python interpreter displays strings.
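Both pitfalls can be seen by running two versions of the same examples. This is a sketch: `run_examples` and `greet` are hypothetical helpers written for this illustration, and the examples are kept in plain strings so that the wrong and right versions can be compared side by side.

```python
import doctest


def run_examples(text: str, globs: dict) -> doctest.TestResults:
    """Run the doctest examples in text and return the results.

    Hypothetical helper for this illustration only.
    """
    test = doctest.DocTestParser().get_doctest(
        text, dict(globs), "example", None, 0)
    return doctest.DocTestRunner(verbose=False).run(test)


def average(values):
    return sum(values) / len(values) if len(values) > 0 else 0.0


def greet(name):
    return "Hello, " + name


# Expected output written carelessly: int instead of float, double quotes.
wrong = """
>>> average([2, 3, 4])
3
>>> greet("Ann")
"Hello, Ann"
"""

# Expected output copied exactly from the interpreter.
right = """
>>> average([2, 3, 4])
3.0
>>> greet("Ann")
'Hello, Ann'
"""

names = {"average": average, "greet": greet}
wrong_results = run_examples(wrong, names)   # prints two failure reports
right_results = run_examples(right, names)   # prints nothing
print(wrong_results.failed, right_results.failed)
```

The safest way to get the expected output right is to run the statement in the interpreter and paste what it prints.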
Type Hints (Annotations)
To call a function or object constructor, a programmer needs to know what type of value(s) to pass as parameters. Python doesn’t require you to define the data type of parameters and returned values, but you can optionally do so.
Python 3.6 and above let you include type annotations in your code. Python does not enforce them when running your code.
Annotations help document your code, and are also used by some tools to find bugs
such as passing incorrect value types to a function.
Example: a function to compute the length of an (x,y) vector. It’s also the hypotenuse of a right triangle, hence the function name.
import math

def hypot(x: float, y: float) -> float:
    """Return the Euclidean norm of a vector with given x and y lengths."""
    return math.sqrt(x*x + y*y)
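To see that annotations are not enforced at run time, consider this small sketch. The function `double` is a made-up example.

```python
def double(x: int) -> int:
    """Return twice the value of x."""
    return x * 2


print(double(4))               # 8
print(double("ab"))            # abab: the str argument is accepted at run time
print(double.__annotations__)  # the hints are stored, but not checked
```

The second call runs without error even though it violates the hint. A static type checker such as mypy would flag it before the code is ever run, which is why the hints are worth writing.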
Meandering PEPs
Many PEPs address docstrings, but the proposals are not consistent and some have been replaced or rejected (PEP 256 and PEP 258 were rejected; PEP 257 is still active).
Resources
To learn more about Python docstrings:
- Documenting Python Code on realpython.com has examples of function and class docstrings, and advice on how to write them. They have some videos, too.
- Sphinx Docstring Tutorial https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html.
- Google Docstrings example at readthedocs.io.
- Google Python Style Guide covers coding style in great detail.
- NumPy Docstrings example also on readthedocs.io.
- NumPyDoc official documentation of NumPy/SciPy docstrings.
Java:
- The Java documentation standard is called “Javadoc”, which is much better than Python’s and can generate beautiful, cross-referenced HTML pages. The entire Java API docs are created using Javadoc.