Java, Python, Kotlin, and other languages have standards for documenting code using comments. They also have tools that create professional-looking API documentation from these comments.
IDEs use documentation comments to provide context-sensitive help as you write code – even help on your own project code (if you write the document comments in a standard format).
The Python convention for writing documentation in comments is called docstring. You should use it.
Python Docstring Comments
Documentation comments in Python are called docstrings.
Unfortunately, there are several different conventions for writing docstrings:
Docstring Style | Description |
---|---|
Sphinx/Pydoc | Standard for Python, used by Sphinx |
Google Docstring | Verbose style using visual formatting |
NumPy/SciPy | Combines reStructuredText and Google style |
EpyDoc | Structured docstrings similar to Javadoc |
All the styles have a few things in common, as shown in this example:
def gcd(a, b):
    """Return the greatest common divisor of two ints a and b.  (1 & 2)
                                                                (3)
    The greatest common divisor is the largest int n such that  (4)
    `a/n` and `b/n` are integers. If both a and b are zero,
    the gcd is 1, unlike the Python math.gcd which returns 0
    in this case.

    Parameters:                                                 (5)
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor
    Returns:                                                    (6)
        int: the greatest common divisor of a and b.
    Raises:
        TypeError if a or b are not type `int`.                 (7)
    """
The rules are (the numbers match the markers in the example):
1. The docstring must be the first statement inside a function and use a triple-quoted string ("""). The same rule applies to class docstrings.
2. The first line of the docstring is a complete sentence describing what the function does, ending with a period. The Sphinx style seems to allow the initial sentence to span more than one line.
3. A blank line follows the first sentence.
4. (Optional) Additional text describing the function or class in more detail. You can omit this if the function is simple.
5. Document the parameters: their names, what they mean, and any preconditions on their values, e.g. “must be positive” or “may not be empty”.
6. Document what the function returns (return value).
7. Document any exceptions raised.
8. (Optional) “See Also”: references to other documents.
The docstring styles vary in how they document parameters, returns, and exceptions.
Sphinx Style
This style uses reStructuredText (reST) and some Markdown formatting. It is the default style generated by PyCharm and used on ReadTheDocs.org.
def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that
    a/n and b/n are integers. If both a and b are zero,
    the gcd is 1, unlike the Python math.gcd which returns 0.

    :param int a: first value for greatest common divisor
    :param int b: second value for greatest common divisor
    :returns: the greatest common divisor of `a` and `b`.
    :rtype: int
    :raises TypeError: if a or b is not an int.
    """
You can also specify parameter type on a separate line:
:param a: first value for greatest common divisor
:type a: int
- Sphinx Docstring Tutorial https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html
- reStructured Text Tutorial
- JetBrains PyCharm generates Sphinx-style docstrings by default.
- In VS Code, the autoDocstring extension can generate Sphinx-style docstrings. You have to select it in settings (the default is Google style).
Google Docstring
def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers a and b.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers. If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.
    Google Docstrings allow `PEP 484`_ type annotations.

    Args:
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor

    Returns:
        int: the greatest common divisor of `a` and `b`.

    Raises:
        TypeError: if a or b is not an int.

    .. _Google Python Style Guide:
       https://google.github.io/styleguide/pyguide.html

    .. _PEP 484:
       https://www.python.org/dev/peps/pep-0484/
    """
- Relies on visual formatting and unstructured text.
- Too long. A docstring can easily fill the whole screen.
- Data types documented in comments instead of type hints.
Google Python Style Guide covers coding style in great detail.
Numpy Docstring
The same style is used by SciPy.
def gcd(a, b):
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers. If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.

    Parameters
    ----------
    a : int
        first value for greatest common divisor.
    b : int
        second value for greatest common divisor.

    Returns
    -------
    int
        The greatest common divisor of a and b.

    Raises
    ------
    TypeError
        if `a` or `b` is not type `int`.

    See Also
    --------
    math.gcd : Python library function for GCD.

    .. note::
        This implementation uses Euclid's algorithm.

    You can include hyperlinks `like this <https://somehost/somepath>`_
    and relative hyperlinks `like this </reference/gcd>`_.

    Examples
    --------
    >>> gcd(16, 24)
    8
    >>> gcd(-16, -20)
    4
    """
- First Line: “A one-line summary that does not use variables names or the function name.”
- https://numpydoc.readthedocs.io/en/latest/format.html
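The gcd examples above show only the docstring, not the function body. As a rough sketch, a body that behaves the way those docstrings promise could look like the following. It assumes Euclid's algorithm (as the note in the NumPy example mentions) and the documented convention that the gcd of 0 and 0 is 1. The short Sphinx-style docstring and the type hints are a choice made for this sketch.

```python
def gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of two integers.

    :param a: first value for greatest common divisor
    :param b: second value for greatest common divisor
    :returns: the greatest common divisor of a and b
    :raises TypeError: if a or b is not an int

    >>> gcd(16, 24)
    8
    >>> gcd(-16, -20)
    4
    """
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("a and b must be int")
    # Work with absolute values so the result is never negative.
    a, b = abs(a), abs(b)
    # Documented special case: unlike math.gcd, return 1 when both are zero.
    if a == 0 and b == 0:
        return 1
    # Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero.
    while b:
        a, b = b, a % b
    return a
```

The two doctest examples are the same ones used in the NumPy-style docstring.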
ISP Docstring Style
In ISP, please use the Sphinx style (which looks like reStructuredText), but use type hints for parameter and return types instead of writing them in comments.
Why?
- Visual formatting is a waste of your time!
- waste time correcting formatting when things change
- documenting data types in comments is useless. Use type hints.
- extra lines cause comments to fill the window, so you cannot see comments and the code in one screen.
- Type hints are used by static analyzers and IDEs, so document types using type hints instead of in comments.
This max function works with either int or float, so we declare the type as Number, which includes both int and float:
from numbers import Number

def max(a: Number, b: Number) -> Number:
    """Return the maximum of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the maximum of a and b
    :raises TypeError: if a or b is not a numeric value (Number)
    """
    if not isinstance(a, Number) or not isinstance(b, Number):
        raise TypeError("parameters must be numeric (int or float)")
    if a > b:
        return a
    return b
Notice the docstring does not include the data type of parameters or returns.
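Leaving types out of the docstring does not hide them from tools. The sketch below shows one way to read them back at runtime with `typing.get_type_hints` and `inspect.signature`. The function name `larger` is made up for this illustration, so that it does not shadow the built-in `max`.

```python
import inspect
from numbers import Number
from typing import get_type_hints


def larger(a: Number, b: Number) -> Number:
    """Return the larger of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the larger of a and b
    """
    return a if a > b else b


# The types come from the annotations, not from the docstring.
hints = get_type_hints(larger)
signature = inspect.signature(larger)

print(hints["a"], hints["return"])
print(signature)
# The docstring still supplies the human-readable summary.
print(inspect.getdoc(larger).splitlines()[0])
```

IDEs and documentation generators can combine the two sources in a similar way: types from the signature, descriptions from the docstring.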
Module and Class Comments
PEP 257 recommends:
- File begins with a module comment
- Classes have a comment describing the class and its members.
- OK to omit “protected” members from comments
"""A bank account that performs deposits and withdrawals."""
from re import split
from money import Money

class BankAccount:
    """The first line is a sentence describing the bank account.

    Then a longer description of a bank account and its methods.
    """

    def __init__(self, name, min_balance=0):
        """Create a new bank account with an owner and initial balance of zero.

        Parameters:
            name (str): name of the account
            min_balance (float): minimum required balance, default is 0.
        """
My Recommendation for Class Docstrings
- Document what the class does, not how it does it.
- Document any special dependencies or preconditions required by the class.
- Give an example of how to create objects of the class.
- Document attributes of the class.
- Don’t write a summary of all the methods (as the PEP does). That’s redundant! Each method has its own docstring that describes it.
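As a sketch of these recommendations, here is one possible class docstring. The details are assumptions made for illustration only: the `deposit` and `withdraw` methods, the `balance` attribute, and the use of plain `float` amounts instead of the `Money` class are not taken from the course code.

```python
class BankAccount:
    """A bank account that accepts deposits and withdrawals.

    A withdrawal must leave at least the minimum balance in the account.

    Example:

    >>> account = BankAccount("Alice", min_balance=100.0)
    >>> account.deposit(500.0)
    >>> account.balance
    500.0

    :ivar name: name of the account owner
    :ivar min_balance: smallest balance a withdrawal may leave behind
    :ivar balance: current balance, starts at zero
    """

    def __init__(self, name: str, min_balance: float = 0.0):
        """Create a new account with a balance of zero.

        :param name: name of the account owner
        :param min_balance: minimum required balance, default is 0
        """
        self.name = name
        self.min_balance = min_balance
        self.balance = 0.0

    def deposit(self, amount: float) -> None:
        """Add an amount to the balance.

        :param amount: amount to deposit, must be positive
        :raises ValueError: if amount is not positive
        """
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount

    def withdraw(self, amount: float) -> None:
        """Remove an amount from the balance.

        :param amount: amount to withdraw, must be positive
        :raises ValueError: if amount is not positive or the balance
            would fall below min_balance
        """
        if amount <= 0 or self.balance - amount < self.min_balance:
            raise ValueError("invalid withdrawal amount")
        self.balance -= amount
```

The class docstring says what the class does, states its one precondition, shows how to create an object, and lists the attributes. The methods are described only in their own docstrings.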
Viewing Python docstrings
You can view the docstring comments in the Python interpreter. This works for functions, classes, modules, and packages (if they have a docstring):
>>> help(print)
print(...)
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file: a file-like object (stream); defaults to the current sys.stdout.
sep: string inserted between values, default a space.
...
>>> import re # 're' is the incredibly useful regular expression module
>>> help(re)
Another way is to simply print the __doc__ “magic” attribute:
>>> print(max.__doc__)
Here’s an example class docstring from the popular requests library, an add-on package for performing HTTP requests. It uses the PEP 287 standard (reST).
>>> import requests
>>> help(requests.Request)
class Request(RequestHooksMixin)
A user-created :class:`Request <Request>` object.
Used to prepare a :class:`PreparedRequest <PreparedRequest>`, which is sent to the server.
:param method: HTTP method to use.
:param url: URL to send.
:param headers: dictionary of headers to send.
:param files: dictionary of {filename: fileobject} files to multipart upload.
Usage::
>>> import requests
>>> req = requests.Request('GET', 'https://httpbin.org/get')
Use pydoc to view docstrings
Use pydoc from the command line to view documentation for packages, classes, modules, and functions:
cmd> pydoc os
(shows docstring for os module)
cmd> pydoc math.sqrt
(shows docstring for sqrt function)
cmd> pydoc requests.Request
(Request class in requests module)
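The views described above (`help()`, `__doc__`, and `pydoc`) can also be used from a script. The sketch below is illustrative only: `area` is a made-up function, not part of the course code. It reads the raw `__doc__` attribute, then the cleaned-up text from `inspect.getdoc`, then the page that `pydoc` would display, rendered as plain text.

```python
import inspect
import math
import pydoc


def area(width: float, height: float) -> float:
    """Return the area of a rectangle.

    :param width: width of the rectangle
    :param height: height of the rectangle
    """
    return width * height


# The raw docstring, as stored on the function object.
raw = area.__doc__

# inspect.getdoc() returns the same text with indentation and
# leading/trailing blank lines cleaned up.
clean = inspect.getdoc(area)
summary = clean.splitlines()[0]
print(summary)  # Return the area of a rectangle.

# pydoc.render_doc() builds the text that the pydoc command displays.
# The plaintext renderer avoids terminal bold characters.
page = pydoc.render_doc(math.sqrt, renderer=pydoc.plaintext)
print(page)
```

Taking the first line of the cleaned docstring is a simple way to get the one-sentence summary that the docstring rules require.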
When to Write Docstrings
You should write docstring comments for:
- Classes
  - describe the purpose of the class
  - parameters of the constructor
  - public methods
  - example of using the class
- Functions and methods
  - describe the purpose of the function or method
  - parameters, and restrictions on their values
  - the return value, if any
  - exceptions that may be raised
- Modules
  - describe the purpose of the module
  - module docstrings go at the top of the file, before imports
- Packages (for this course, package docstrings are not required)
  - put package docstrings in the package’s __init__.py file
  - purpose of the package
  - list the modules and subpackages (this can become out-of-date! Python should do this automatically)
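One way to check the list above against real code is a small helper that reports public functions and classes with no docstring. This is only a sketch: `missing_docstrings` is a hypothetical name, and it looks only at top-level names defined in the module itself, not at methods.

```python
import inspect
from types import ModuleType
from typing import List


def missing_docstrings(module: ModuleType) -> List[str]:
    """Return names of public functions and classes that lack a docstring.

    :param module: an imported module to inspect
    :returns: sorted names of top-level functions and classes defined in
        the module that have no docstring
    """
    missing = []
    # inspect.getmembers() returns (name, object) pairs sorted by name.
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and "magic" names
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        if getattr(obj, "__module__", None) != module.__name__:
            continue  # skip names imported from other modules
        if not (obj.__doc__ or "").strip():
            missing.append(name)
    return missing
```

After importing your own module, call `missing_docstrings(your_module)`. An empty list means every public top-level function and class has a docstring.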
Python Doctest Comments
Doctest comments are runnable code examples included in docstrings. They provide examples of how to invoke a method, class, or function, and also provide a quick test. Here’s an example:
from typing import List

def average(values: List[float]) -> float:
    """Return the average of a list of numbers.

    Parameters:
        values: a list or tuple of numbers to average

    >>> average([2, 3, 4])
    3.0
    >>> average([2, 3, 4, 0])
    2.25
    >>> average([2])
    2.0
    """
    return sum(values)/len(values) if len(values) > 0 else 0.0
Each line starting with >>> is a Python statement that produces some result. The next line(s) are the expected result. The doctest module will execute the doctest statements and compare the actual and expected results.
By default, doctest prints nothing if the test is correct and an error if it fails. Use the “verbose” option (or -v flag) to always print the result.
There are two ways to run doctest. Using the command line:
cmd> python -m doctest -v average.py
3 tests in 1 item.
3 passed and 0 failed.
Or by providing a “main” block that runs doctest:
if __name__ == '__main__':
    import doctest
    doctest.testmod(verbose=True)
The -v flag and verbose=True arguments are optional, of course.
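If you want the pass and fail counts in code rather than printed output, the doctest module also exposes the finder and runner classes that `testmod` uses. The following is a minimal sketch; `run_doctests` is a hypothetical helper name, and the `average` function is a shortened copy of the one above.

```python
import doctest
from typing import List


def average(values: List[float]) -> float:
    """Return the average of a list of numbers.

    >>> average([2, 3, 4])
    3.0
    >>> average([2, 3, 4, 0])
    2.25
    """
    return sum(values) / len(values) if len(values) > 0 else 0.0


def run_doctests(func) -> doctest.TestResults:
    """Run the doctest examples in func's docstring and return the counts."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # module=False tells the finder not to look up the defining module;
    # globs supplies the names the examples are allowed to use.
    tests = finder.find(func, name=func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(average)
print(results.failed, results.attempted)  # 0 2
```

The result has `failed` and `attempted` fields, which is handy if you want a script to exit with an error when any example fails.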
Doctests are most commonly used in function and method docstrings, but you can also use them in a class docstring to illustrate how to use the class.
The expected output that you write for a doctest must exactly match the actual output. For the average function, if we wrote:
def average(values):
    """
    >>> average([2, 3, 4])
    3
    """
    return sum(values)/len(values)
The test will fail, because the actual output is 3.0, not 3.
If the expected output is a string, then use single quotes not double quotes because that’s the way the Python interpreter displays strings.
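The exact-match rule is easy to try out. The sketch below runs four tiny docstrings through doctest and counts the failures. `count_failures` is a hypothetical helper written for this illustration; it uses the `DocTestParser` and `DocTestRunner` classes from the doctest module.

```python
import doctest


def count_failures(docstring: str) -> int:
    """Run the doctest examples in a docstring and return how many fail."""
    parser = doctest.DocTestParser()
    # Arguments: text, globals for the examples, test name, filename, line number.
    test = parser.get_doctest(docstring, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # out=... discards the failure report so only the count is returned.
    results = runner.run(test, out=lambda text: None)
    return results.failed


print(count_failures(">>> 9 / 3\n3.0\n"))                # 0: matches 3.0
print(count_failures(">>> 9 / 3\n3\n"))                  # 1: 3 is not 3.0
print(count_failures(">>> 'py' + 'thon'\n'python'\n"))   # 0: single quotes match
print(count_failures('>>> "py" + "thon"\n"python"\n'))   # 1: double quotes do not
```

In the last case the code itself is fine. Only the expected output is wrong, because the interpreter displays the string with single quotes.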
Using Type Hints (Annotations)
To call a function or object constructor, a programmer needs to know what type of value(s) to pass as parameters. Python doesn’t require you to define the data type of parameters and returned values, but you can optionally do so.
Python 3.6 and above let you include type annotations in your code. Python ignores them when running your code.
Annotations help document your code, and are also used by some tools to find bugs such as passing incorrect value types to a function.
Example: a function to compute the length of an (x,y) vector. It’s also the hypotenuse of a right triangle, hence the function name.
import math

def hypot(x: float, y: float) -> float:
    """Return the Euclidean norm of a vector with given x and y lengths."""
    return math.sqrt(x*x + y*y)
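To see that Python really does ignore annotations at run time, consider this small sketch. The function `double` is made up for the illustration. The annotations are stored on the function, where tools can read them, but nothing stops a caller from passing a string.

```python
from typing import get_type_hints


def double(x: int) -> int:
    """Return twice the value of x."""
    return x * 2


# The annotations are stored with the function for tools to read.
hints = get_type_hints(double)
print(hints)

print(double(21))    # 42
# Python does not check the annotation, so a str argument is accepted
# and the * operator repeats the string instead of doing arithmetic.
print(double("ab"))  # abab
```

A static type checker such as mypy should report the second call as an error before the code runs, which is the kind of bug-finding the annotations make possible.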
Meandering PEPs
Many PEPs address docstrings, but the proposals are not consistent and some have been replaced or rejected (PEP 256 and PEP 258 were rejected; PEP 257 is still active).
Resources
To learn more about Python docstrings:
- Documenting Python Code on realpython.com has examples of function and class docstrings, and advice on how to write them. They have some videos, too.
- Sphinx Docstring Tutorial https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html.
- Google Docstrings example at readthedocs.io.
- Google Python Style Guide covers coding style in great detail.
- NumPy Docstrings example also on readthedocs.io.
- NumPyDoc official documentation of NumPy/SciPy docstrings.
Java:
- The Java documentation standard is called “Javadoc”, which is much better than Python’s and can generate beautiful, cross-referenced HTML pages. The entire Java API docs are created using Javadoc.