Java, Python, Kotlin, and other languages have standards for documenting code using comments. They also have tools to create professional-looking API documentation from these comments.

IDEs use documentation comments to provide context-sensitive help as you write code, even help on your own project code (if you write the documentation comments in a standard format).

The Python convention for documentation comments is the docstring. You should use it.

Python Docstring Comments

Documentation comments in Python are called docstrings.

Unfortunately, there are several different conventions for writing docstrings:

Docstring Style    Description
Sphinx/Pydoc       Standard for Python, used by Sphinx
Google Docstring   Verbose style using visual formatting
NumPy/SciPy        Combines reStructuredText and Google style
EpyDoc             Structured docstrings similar to Javadoc

All the styles have a few things in common, as shown in this example:

def gcd(a, b):
    """Return the greatest common divisor of two ints a and b.   (1 & 2)
                                                                 (3)
    The greatest common divisor is the largest int n such that   (4)
    `a/n` and `b/n` are integers.  If both a and b are zero, 
    the gcd is 1, unlike the Python math.gcd which returns 0 
    in this case.

    Parameters:                                                  (5)
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor

    Returns:                                                     (6)
        int: the greatest common divisor of a and b.
    
    Raises:
        TypeError if a or b are not type `int`.                  (7)
    """

The rules are:

  1. The docstring must be the first thing inside a function and use a triple-quoted string ("""). Same rule for class docstrings.
  2. The first line of the docstring is a complete sentence describing what the function does, ending with a period. The Sphinx style seems to allow the initial sentence to span more than one line.
  3. A blank line after the first sentence.
  4. (Optional) Additional text describing the function or class in more detail. You can omit this if the function is simple.
  5. Document parameters: their names, what they mean, and any preconditions on their values, e.g. “must be positive”, “may not be empty”.
  6. Document what the function returns (return value).
  7. Document any Exceptions raised.
  8. (Optional) “See Also” - reference to other documents.
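
Rule 1 matters because Python only treats a string literal as a docstring when it is the first statement in the body. A minimal sketch, using two hypothetical functions (documented and undocumented) that are not part of the example above:

```python
def documented():
    """Return the answer."""
    return 42


def undocumented():
    answer = 42
    """This string comes after a statement, so it is not a docstring."""
    return answer


# Only the first function has a docstring attached to it.
print(documented.__doc__)    # Return the answer.
print(undocumented.__doc__)  # None
```

The misplaced string is still valid Python; it is simply evaluated and discarded, so help() and IDEs will not see it.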

The docstring styles vary in how they document parameters, returns, and exceptions.

Sphinx Style

This style uses reStructuredText (reST) markup. It is the default style generated by PyCharm and used on ReadTheDocs.org.

def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that
    a/n and b/n are integers.  If both a and b are zero,
    the gcd is 1, unlike the Python math.gcd which returns 0.

    :param int a: first value for greatest common divisor
    :param int b: second value for greatest common divisor
    :returns: the greatest common divisor of `a` and `b`.
    :rtype:   int
    :raises TypeError: if a or b is not an int.

    """

You can also specify parameter type on a separate line:

    :param a: first value for greatest common divisor
    :type a:  int
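
For reference, here is one way the function described by this docstring could be implemented. This is only a sketch: the examples here show the docstring without a function body, so the use of Euclid's algorithm and the handling of negative values with abs() are assumptions.

```python
def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers.

    If both a and b are zero, the gcd is 1, unlike the Python
    math.gcd which returns 0.

    :param int a: first value for greatest common divisor
    :param int b: second value for greatest common divisor
    :returns: the greatest common divisor of `a` and `b`.
    :rtype:   int
    :raises TypeError: if a or b is not an int.
    """
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("a and b must be int")
    # Assumption: signs are ignored, so the result is never negative.
    a, b = abs(a), abs(b)
    # Euclid's algorithm: replace (a, b) by (b, a mod b) until b is zero.
    while b:
        a, b = b, a % b
    # a is now the gcd, or 0 when both inputs were 0 (the documented special case).
    return a if a != 0 else 1
```

Note that the special case for two zeros is handled in one place, at the return statement, which keeps the code in step with what the docstring promises.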

Google Docstring

def gcd(a, b) -> int:
    """Return the greatest common divisor of two integers a and b.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers.  If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.

    Google Docstrings allow `PEP 484`_ type annotations.

    Args:
        a (int): first value for greatest common divisor
        b (int): second value for greatest common divisor

    Returns:
        int: the greatest common divisor of `a` and `b`.
    
    Raises:
        TypeError: if a or b is not an int.

    .. _Google Python Style Guide:
       https://google.github.io/styleguide/pyguide.html

    .. _PEP 484:
       https://www.python.org/dev/peps/pep-0484/
    """

Drawbacks of this style:

  • Relies on visual formatting and unstructured text.
  • Too long. A docstring can easily fill the whole screen.
  • Data types are documented in comments instead of type hints.

The Google Python Style Guide covers coding style in great detail.

NumPy Docstring

The same style is used by SciPy.

def gcd(a, b):
    """Return the greatest common divisor of two integers.

    The greatest common divisor is the largest int n such that a/n and b/n
    are integers.  If both a and b are zero, the gcd is 1, unlike
    the Python math.gcd which returns 0 in this case.

    Parameters
    ----------
    a : int
        first value for greatest common divisor.
    b : int
        second value for greatest common divisor.

    Returns
    -------
    int 
        The greatest common divisor of a and b.
    
    Raises
    ------
    TypeError 
        if `a` or `b` is not type `int`.

    See Also
    --------
    math.gcd : Python library function for GCD.

    .. note::

       This implementation uses Euclid's algorithm.
       You can include hyperlinks `like this <https://somehost/somepath>`_
       and relative hyperlinks `like this </reference/gcd>`_.

    Examples
    --------
    >>> gcd(16, 24)
    8
    >>> gcd(-16, -20)
    4
    """

ISP Docstring Style

In ISP, please use the Sphinx style (which looks like reStructuredText), but use type hints for parameter and return types instead of writing them in comments.

Why?

  1. Visual formatting is a waste of your time!
    • you waste time correcting the formatting when things change
    • documenting data types in comments is useless. Use type hints.
    • extra lines cause comments to fill the window, so you cannot see the comments and the code on one screen.
  2. Type hints are used by static analyzers and IDEs, so document types using type hints instead of in comments.

This max function works with either int or float, so we declare the type as Number which includes both int and float:

from numbers import Number

def max(a: Number, b: Number) -> Number:
    """Return the maximum of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the maximum of a and b
    :raises TypeError: if a or b is not a numeric value (Number)
    """
    if not isinstance(a, Number) or not isinstance(b, Number):
        raise TypeError("parameters must be numeric (int or float)")
    if a > b:
        return a
    return b

Notice the docstring does not include the data type of parameters or returns.
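
Type hints are stored on the function object itself, which is one reason tools can use them without parsing the docstring. A small sketch; larger is a hypothetical stand-in for the max function above, renamed here so that it does not shadow the built-in max:

```python
from numbers import Number


def larger(a: Number, b: Number) -> Number:
    """Return the larger of two numeric values.

    :param a: first value to compare
    :param b: second value to compare
    :returns: the larger of a and b
    """
    return a if a > b else b


# The hints are available at run time in the __annotations__ dict,
# keyed by parameter name, with the return type under the key "return".
print(larger.__annotations__)
```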

Module and Class Comments

PEP 257 recommends:

  • File begins with a module comment
  • Classes have a comment describing the class and its members.
  • OK to omit “protected” members from comments

"""A bank account that performs deposits and withdrawals."""
from re import split

from money import Money


class BankAccount:
    """The first line is a sentence describing bank account.

    Then a longer description of a bank account and its methods.
    """

    def __init__(self, name, min_balance=0):
        """Create a new bank account with an owner and initial balance of zero.

        Parameters:
            name (str): name of the account
            min_balance (float):  minimum required balance, default is 0.
        """

My Recommendation for Class Docstrings

  1. Document what the class does, not how it does it.

  2. Document any special dependencies or preconditions required by the class.

  3. Give an example of how to create objects of the class.

  4. Document attributes of the class.

  5. Don’t write a summary of all the methods (as the PEP does). That’s redundant! Each method has its own docstring that describes it.
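
As a sketch of these recommendations, here is a class docstring for a simplified bank account. The attribute names, the use of float for money, and the deposit method are assumptions made for this illustration; they are not taken from the BankAccount example above.

```python
class BankAccount:
    """A bank account that accepts deposits and tracks a balance.

    Precondition: all amounts are in one currency; this class does
    no currency conversion.

    Example:

    >>> account = BankAccount("Ann", min_balance=100)
    >>> account.deposit(250)
    >>> account.balance
    250.0

    :ivar name: name of the account owner
    :ivar min_balance: minimum required balance
    :ivar balance: current balance, initially zero
    """

    def __init__(self, name: str, min_balance: float = 0):
        """Create a new account with a balance of zero.

        :param name: name of the account owner
        :param min_balance: minimum required balance, default is 0
        """
        self.name = name
        self.min_balance = float(min_balance)
        self.balance = 0.0

    def deposit(self, amount: float) -> None:
        """Add a positive amount to the balance.

        :param amount: amount to deposit, must be positive
        :raises ValueError: if amount is not positive
        """
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
```

The class docstring says what the class is for, states a precondition, shows how to create an object, and lists the attributes. It does not summarize the methods, because each method documents itself.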

Viewing Python docstrings

You can view the docstring comments in the Python interpreter. This works for functions, classes, modules, and packages (if they have a docstring):

>>> help(print)
print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    ...
>>> import re    # 're' is the incredibly useful regular expression module
>>> help(re)

Another way is simply to print the __doc__ “magic” attribute:

>>> print(max.__doc__)
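
Depending on the Python version, the raw __doc__ string may still carry the indentation of the source code. The standard library function inspect.getdoc returns the same text with that indentation removed. A minimal sketch; area is a hypothetical function used only for illustration:

```python
import inspect


def area(width: float, height: float) -> float:
    """Return the area of a rectangle.

    Both sides must be non-negative.
    """
    return width * height


# getdoc() strips the common leading whitespace and the trailing blank line.
print(inspect.getdoc(area))
```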

Here’s an example class docstring from the popular requests library, an add-on package for performing HTTP requests. It uses the PEP 287 standard (reST).

>>> import requests
>>> help(requests.Request)

class Request(RequestHooksMixin)
    A user-created :class:`Request <Request>` object.
    
    Used to prepare a :class:`PreparedRequest <PreparedRequest>`, which is sent to the server.
    
    :param method: HTTP method to use.
    :param url: URL to send.
    :param headers: dictionary of headers to send.
    :param files: dictionary of {filename: fileobject} files to multipart upload.

    Usage::
    >>> import requests
    >>> req = requests.Request('GET', 'https://httpbin.org/get')

Use pydoc to view docstrings

Use pydoc from the command line to view documentation for packages, classes, modules, and functions:

cmd> pydoc os
   (shows docstring for os module)
cmd> pydoc math.sqrt
   (shows docstring for sqrt function)
cmd> pydoc requests.Request
   (Request class in requests module)
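
The same text is also available from inside a program. This sketch uses pydoc.render_doc with the plain-text renderer from the standard library; it is an alternative to the command line shown above, offered as an illustration rather than a required technique:

```python
import math
import pydoc

# Render the documentation for math.sqrt as plain text
# (the default renderer adds terminal bold/overstrike characters).
text = pydoc.render_doc(math.sqrt, renderer=pydoc.plaintext)
print(text)
```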

When to Write Docstrings

You should write docstring comments for:

  • Classes
    • describes purpose of the class
    • parameters of the constructor
    • public methods
    • example of using the class
  • Functions and methods
    • describe purpose of the function or method
    • parameters, and restrictions on their values
    • the return value, if any
    • exceptions that may be raised
  • Modules
    • describe purpose of the module
    • module docstrings go at top of the file, before imports
  • Packages (for this course, package docstrings are not required)
    • put package docstrings in the package’s __init__.py file.
    • purpose of the package
    • list the modules and subpackages (this can become out-of-date! Python should do this automatically)

Python Doctest Comments

Doctest comments are runnable code examples included in docstrings. They provide examples of how to invoke a method, class, or function, and also provide a quick test. Here’s an example:

from typing import Sequence

def average(values: Sequence[float]) -> float:
    """Return the average of a list of numbers.

    Parameters:
        values: a list or tuple of numbers to average

    >>> average([2, 3, 4])
    3.0
    >>> average([2, 3, 4, 0])
    2.25
    >>> average([2])
    2.0
    """
    return sum(values)/len(values) if len(values) > 0 else 0.0

Each line starting with >>> is a Python statement that produces some result. The next line(s) are the expected result. The doctest module will execute the doctest statements and compare the actual and expected results.

By default, doctest prints nothing if the test is correct and an error if it fails. Use the “verbose” option (or -v flag) to always print the result.

There are two ways to run doctest. Using the command line:

cmd>  python -m doctest -v average.py

3 tests in 1 item.
3 passed and 0 failed.

Or by providing a “main” block that runs doctest:

if __name__ == '__main__':
    import doctest
    doctest.testmod(verbose=True)

The -v flag and verbose=True arguments are optional, of course.

Doctests are most commonly used in function and method docstrings, but you can also use them in a class docstring to illustrate how to use the class.

The expected output that you write for a doctest must exactly match the actual output. For the average function, if we wrote:

def average(values):
    """
    >>> average([2, 3, 4])
    3
    """
    return sum(values)/len(values)

The test will fail because the actual output is 3.0, not 3.
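
To see the exact-match rule in action, the sketch below runs two doctests programmatically with the standard library classes doctest.DocTestFinder and doctest.DocTestRunner. The names average_int, average_float, and count_failures are hypothetical and exist only for this comparison:

```python
import doctest


def average_int(values):
    """
    >>> average_int([2, 3, 4])
    3
    """
    return sum(values) / len(values)


def average_float(values):
    """
    >>> average_float([2, 3, 4])
    3.0
    """
    return sum(values) / len(values)


def count_failures(func):
    """Run the doctests in func's docstring and return how many fail."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = 0
    for test in finder.find(func, func.__name__, globs={func.__name__: func}):
        # Discard the failure report; only the counts matter here.
        failed += runner.run(test, out=lambda text: None).failed
    return failed


print(count_failures(average_int))    # 1, because 3.0 does not match "3"
print(count_failures(average_float))  # 0
```

Both functions compute the same value. Only the expected text in the docstring differs, and that alone decides whether the doctest passes.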

If the expected output is a string, then use single quotes not double quotes because that’s the way the Python interpreter displays strings.
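
A short sketch of the quoting rule, using a hypothetical function named initials. The first example shows the value as the interpreter echoes it (with single quotes); the second uses print, so the expected output has no quotes at all:

```python
def initials(name: str) -> str:
    """Return the upper-case initials of a full name.

    >>> initials("ada lovelace")
    'AL'
    >>> print(initials("ada lovelace"))
    AL
    """
    return "".join(word[0].upper() for word in name.split())
```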

Using Type Hints (Annotations)

To call a function or object constructor, a programmer needs to know what type of value(s) to pass as parameters. Python doesn’t require you to define the data type of parameters and returned values, but you can optionally do so.

Python 3.6 and above let you include type annotations in your code. Python does not enforce them when running your code.
Annotations help document your code, and are also used by some tools to find bugs such as passing incorrect value types to a function.
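
A minimal sketch of what this means in practice, using a hypothetical function named double. The call below contradicts the type hint, yet it runs without error; catching this kind of mistake is the job of an IDE or a static type checker, not of the interpreter:

```python
def double(x: int) -> int:
    """Return twice the value of x."""
    return x * 2


# Python does not check the hint at run time, so a str is accepted
# and is repeated twice instead of raising an error.
result = double("ab")
print(result)  # abab
```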

Example: a function to compute the length of an (x,y) vector. It’s also the hypotenuse of a right triangle, hence the function name.

import math

def hypot(x: float, y: float) -> float:
    """Return the Euclidean norm of a vector with given x and y lengths."""
    return math.sqrt(x*x + y*y)

Meandering PEPs

Many PEPs address docstrings, but the proposals are not consistent and some have been either replaced or rejected (PEP 256 was rejected; PEP 257 is still active).

  • PEP 8 Style Guide for Python Code
  • PEP 256 Docstring Processing System Framework (contains a road map to the docstring PEPs)
  • PEP 287 reStructuredText Docstring Format

Resources

To learn more about Python docstrings:

Java:

  • The Java documentation standard is called “Javadoc”. It is much better than the Python docstring conventions and can generate beautiful, cross-referenced HTML pages. The entire Java API documentation is created using Javadoc.