Python is a great language for writing stand-alone scripts. Getting the desired result with such a script usually takes a few dozen to a few hundred lines of code. Once the work is done, you can simply forget about the code and move on to the next problem.
If, say, six months after such a “one-time” script was written, someone asks its author why it crashes, the author may well not know. This happens because the script has no documentation, relies on hard-coded parameters, logs nothing while it runs, and has no tests that would help pinpoint the cause of the problem quickly.
Turning a hastily written script into something much better is not that difficult. Such a script is fairly easy to turn into reliable, understandable code that is convenient to use and simple to maintain, both for its author and for other programmers.
The author of the material, the translation of which we publish today, demonstrates such a “transformation” using the classic Fizz Buzz Test problem as an example. The task is to print the numbers from 1 to 100, replacing some of them with special strings: if a number is a multiple of 3, print Fizz instead; if it is a multiple of 5, print Buzz; and if both conditions hold, print FizzBuzz.
Source
Here is the source code for a Python script that solves the problem:
import sys
for n in range(int(sys.argv[1]), int(sys.argv[2])):
    if n % 3 == 0 and n % 5 == 0:
        print("fizzbuzz")
    elif n % 3 == 0:
        print("fizz")
    elif n % 5 == 0:
        print("buzz")
    else:
        print(n)
Let's talk about how to improve it.
Documentation
I find it helpful to write documentation before writing code. It makes the work easier and keeps the documentation from being postponed indefinitely. The documentation for the script can be placed at the top of the file. For example, it might look like this:
"""Simple fizzbuzz generator.

This script prints out a sequence of numbers from a provided range
with the following restrictions:

 - if the number is divisible by 3, then print out "fizz",
 - if the number is divisible by 5, then print out "buzz",
 - if the number is divisible by 3 and 5, then print out "fizzbuzz".
"""
The first line gives a brief description of the purpose of the script. The remaining paragraphs contain additional information about what the script does.
Command line arguments
The next improvement is to replace the values hard-coded in the script with documented values passed through command-line arguments. This can be done with the argparse module. In our example, we let the user specify a range of numbers as well as the "fizz" and "buzz" modulo values used when checking the numbers in that range.
import argparse
import sys


class CustomFormatter(argparse.RawDescriptionHelpFormatter,
                      argparse.ArgumentDefaultsHelpFormatter):
    pass


def parse_args(args=sys.argv[1:]):
    """Parse arguments."""
    parser = argparse.ArgumentParser(
        description=sys.modules[__name__].__doc__,
        formatter_class=CustomFormatter)

    g = parser.add_argument_group("fizzbuzz settings")
    g.add_argument("--fizz", metavar="N",
                   default=3,
                   type=int,
                   help="Modulo value for fizz")
    g.add_argument("--buzz", metavar="N",
                   default=5,
                   type=int,
                   help="Modulo value for buzz")

    parser.add_argument("start", type=int, help="Start value")
    parser.add_argument("end", type=int, help="End value")

    return parser.parse_args(args)


options = parse_args()
for n in range(options.start, options.end + 1):
    # ...
These changes bring a lot of value to the script. The parameters are now properly documented, and you can find out their purpose using the --help flag. Moreover, the same command also displays the documentation we wrote in the previous section:
$ ./fizzbuzz.py --help
usage: fizzbuzz.py [-h] [--fizz N] [--buzz N] start end

Simple fizzbuzz generator.

This script prints out a sequence of numbers from a provided range
with the following restrictions:

 - if the number is divisible by 3, then print out "fizz",
 - if the number is divisible by 5, then print out "buzz",
 - if the number is divisible by 3 and 5, then print out "fizzbuzz".

positional arguments:
  start       Start value
  end         End value

optional arguments:
  -h, --help  show this help message and exit

fizzbuzz settings:
  --fizz N    Modulo value for fizz (default: 3)
  --buzz N    Modulo value for buzz (default: 5)
The argparse module is a very powerful tool. If you are not familiar with it, it is worth reading its documentation. In particular, I like its ability to define subcommands and groups of arguments.
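As an illustration of those two features, here is a minimal sketch that combines them. The subcommand names "print" and "count" and the build_parser() helper are hypothetical and are not part of the article's script; the shared options are put in a parent parser so that each subcommand inherits them.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical subcommands: "print" and "count"."""
    parser = argparse.ArgumentParser(description="fizzbuzz with subcommands")
    # dest="command" stores the name of the chosen subcommand on the namespace.
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Options shared by every subcommand live in a parent parser.
    common = argparse.ArgumentParser(add_help=False)
    g = common.add_argument_group("fizzbuzz settings")
    g.add_argument("--fizz", metavar="N", default=3, type=int,
                   help="Modulo value for fizz")
    g.add_argument("--buzz", metavar="N", default=5, type=int,
                   help="Modulo value for buzz")
    common.add_argument("start", type=int, help="Start value")
    common.add_argument("end", type=int, help="End value")

    subparsers.add_parser("print", parents=[common],
                          help="print the sequence")
    subparsers.add_parser("count", parents=[common],
                          help="count the fizz values")
    return parser


options = build_parser().parse_args(["count", "--fizz", "4", "1", "20"])
print(options.command, options.fizz, options.buzz, options.start, options.end)
```

The code that runs afterwards can dispatch on options.command. Note that the required=True argument of add_subparsers() needs Python 3.7 or later.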
Logging
Giving the script the ability to display some information while it runs is a pleasant addition to its functionality. The logging module is well suited for this purpose. First, we define an object that implements logging:
import logging
import logging.handlers
import os
import sys

logger = logging.getLogger(os.path.splitext(os.path.basename(sys.argv[0]))[0])
Next, we make the verbosity of the logging controllable. The logger.debug() call should output something only if the script is run with the --debug switch. If the script is run with the --silent switch, it should not display anything except exception messages. To implement these features, add the following code to parse_args():
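A minimal sketch of such additions, assuming two boolean switches placed in a mutually exclusive group. The short forms -d and -s, the help strings, and the trimmed-down parse_args() around them are assumptions made for illustration:

```python
import argparse


def parse_args(args):
    """Parse arguments, including the two logging switches."""
    parser = argparse.ArgumentParser(description="fizzbuzz")

    # Assumption: the switches are mutually exclusive, since asking for
    # debug output and for silence at the same time makes little sense.
    g = parser.add_mutually_exclusive_group()
    g.add_argument("--debug", "-d", action="store_true", default=False,
                   help="enable debugging")
    g.add_argument("--silent", "-s", action="store_true", default=False,
                   help="don't log to console")

    parser.add_argument("start", type=int, help="Start value")
    parser.add_argument("end", type=int, help="End value")
    return parser.parse_args(args)


options = parse_args(["--debug", "1", "3"])
print(options.debug, options.silent)
```

With store_true, both attributes always exist on the returned namespace, so code such as setup_logging() can read options.debug and options.silent without checking for their presence.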
Add the following function to the project code to configure logging:
def setup_logging(options):
    """Configure logging."""
    root = logging.getLogger("")
    root.setLevel(logging.WARNING)
    logger.setLevel(options.debug and logging.DEBUG or logging.INFO)
    if not options.silent:
        ch = logging.StreamHandler()
        ch.setFormatter(logging.Formatter(
            "%(levelname)s[%(name)s] %(message)s"))
        root.addHandler(ch)
The main script code will change as follows:
if __name__ == "__main__":
    options = parse_args()
    setup_logging(options)

    try:
        logger.debug("compute fizzbuzz from {} to {}".format(options.start,
                                                             options.end))
        for n in range(options.start, options.end + 1):
            # ...
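The rest of that block is not shown here. As a rough sketch of how such an entry point might be completed, assuming that any exception is logged and turned into a non-zero exit status: the run() wrapper, the logger name, and the hard-coded modulo values below are hypothetical simplifications.

```python
import logging

logger = logging.getLogger("fizzbuzz")


def run(start, end):
    """Hypothetical wrapper: run the loop and return a process exit status."""
    try:
        logger.debug("compute fizzbuzz from {} to {}".format(start, end))
        for n in range(start, end + 1):
            if n % 3 == 0 and n % 5 == 0:
                print("fizzbuzz")
            elif n % 3 == 0:
                print("fizz")
            elif n % 5 == 0:
                print("buzz")
            else:
                print(n)
    except Exception as e:
        # logger.exception() logs at ERROR level and appends the traceback.
        logger.exception("%s", e)
        return 1
    return 0


# A real script would pass this value to sys.exit().
status = run(1, 5)
print("exit status:", status)
```

Returning the status instead of calling sys.exit() inside the function keeps the logic easy to call from tests.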
If you plan to run the script without direct user involvement, for example from crontab, you can send its output to syslog:
def setup_logging(options):
    """Configure logging."""
    root = logging.getLogger("")
    root.setLevel(logging.WARNING)
    logger.setLevel(options.debug and logging.DEBUG or logging.INFO)
    if not options.silent:
        if not sys.stderr.isatty():
            facility = logging.handlers.SysLogHandler.LOG_DAEMON
            sh = logging.handlers.SysLogHandler(address='/dev/log',
                                                facility=facility)
            sh.setFormatter(logging.Formatter(
                "{0}[{1}]: %(message)s".format(
                    logger.name,
                    os.getpid())))
            root.addHandler(sh)
        else:
            ch = logging.StreamHandler()
            ch.setFormatter(logging.Formatter(
                "%(levelname)s[%(name)s] %(message)s"))
            root.addHandler(ch)
For our small script, this may seem like a lot of code just to be able to call logger.debug(). In real scripts, though, the overhead no longer looks so large, and the benefit comes to the fore: users can follow the progress of the work.
$ ./fizzbuzz.py --debug 1 3
DEBUG[fizzbuzz] compute fizzbuzz from 1 to 3
1
2
fizz
Tests
Unit tests are a useful tool for checking that an application behaves as it should. They are rarely used in scripts, but including them significantly improves the reliability of the code. We turn the code inside the loop into a function and describe several interactive examples of its use in its docstring:
def fizzbuzz(n, fizz, buzz):
    """Compute fizzbuzz nth item given modulo values for fizz and buzz.

    >>> fizzbuzz(5, fizz=3, buzz=5)
    'buzz'
    >>> fizzbuzz(3, fizz=3, buzz=5)
    'fizz'
    >>> fizzbuzz(15, fizz=3, buzz=5)
    'fizzbuzz'
    >>> fizzbuzz(4, fizz=3, buzz=5)
    4
    >>> fizzbuzz(4, fizz=4, buzz=6)
    'fizz'

    """
    if n % fizz == 0 and n % buzz == 0:
        return "fizzbuzz"
    if n % fizz == 0:
        return "fizz"
    if n % buzz == 0:
        return "buzz"
    return n
You can verify that the function works correctly using pytest:
$ python3 -m pytest -v --doctest-modules ./fizzbuzz.py
============================ test session starts =============================
platform linux -- Python 3.7.4, pytest-3.10.1, py-1.8.0, pluggy-0.8.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/bernat/code/perso/python-script, inifile:
plugins: xdist-1.26.1, timeout-1.3.3, forked-1.0.2, cov-2.6.0
collected 1 item

fizzbuzz.py::fizzbuzz.fizzbuzz PASSED                                  [100%]

========================== 1 passed in 0.05 seconds ==========================
For all this to work, the script name must end with the .py extension. I don't like adding extensions to script names: the language is just a technical detail that does not need to be shown to the user. However, giving the script name an extension seems to be the easiest way to let test runners such as pytest find the tests included in the code.
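The docstring examples can also be exercised without pytest, using the standard-library doctest module. The following is a minimal sketch and not the approach the article takes: it parses the docstring of a shortened copy of fizzbuzz() and runs the examples directly.

```python
import doctest


def fizzbuzz(n, fizz, buzz):
    """Compute fizzbuzz nth item given modulo values for fizz and buzz.

    >>> fizzbuzz(5, fizz=3, buzz=5)
    'buzz'
    >>> fizzbuzz(15, fizz=3, buzz=5)
    'fizzbuzz'
    >>> fizzbuzz(4, fizz=3, buzz=5)
    4
    """
    if n % fizz == 0 and n % buzz == 0:
        return "fizzbuzz"
    if n % fizz == 0:
        return "fizz"
    if n % buzz == 0:
        return "buzz"
    return n


# Extract the examples from the docstring and run them; the globals
# dictionary makes the name "fizzbuzz" visible to the examples.
parser = doctest.DocTestParser()
test = parser.get_doctest(fizzbuzz.__doc__, {"fizzbuzz": fizzbuzz},
                          "fizzbuzz", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

The runner reports any mismatch on standard output, and the returned result carries the number of failed and attempted examples.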
If an error occurs, pytest displays a message indicating the location of the corresponding code and the nature of the problem:
$ python3 -m pytest -v --doctest-modules ./fizzbuzz.py -k fizzbuzz.fizzbuzz
============================ test session starts =============================
platform linux -- Python 3.7.4, pytest-3.10.1, py-1.8.0, pluggy-0.8.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/bernat/code/perso/python-script, inifile:
plugins: xdist-1.26.1, timeout-1.3.3, forked-1.0.2, cov-2.6.0
collected 1 item

fizzbuzz.py::fizzbuzz.fizzbuzz FAILED                                  [100%]

================================== FAILURES ==================================
________________________ [doctest] fizzbuzz.fizzbuzz _________________________
100
101     >>> fizzbuzz(5, fizz=3, buzz=5)
102     'buzz'
103     >>> fizzbuzz(3, fizz=3, buzz=5)
104     'fizz'
105     >>> fizzbuzz(15, fizz=3, buzz=5)
106     'fizzbuzz'
107     >>> fizzbuzz(4, fizz=3, buzz=5)
108     4
109     >>> fizzbuzz(4, fizz=4, buzz=6)
Expected:
    fizz
Got:
    4

/home/bernat/code/perso/python-script/fizzbuzz.py:109: DocTestFailure
========================== 1 failed in 0.02 seconds ==========================
Unit tests can also be written as regular code. Imagine that we need to test the following function:
def main(options):
    """Compute a fizzbuzz set of strings and return them as an array."""
    logger.debug("compute fizzbuzz from {} to {}".format(options.start,
                                                         options.end))
    return [str(fizzbuzz(i, options.fizz, options.buzz))
            for i in range(options.start, options.end+1)]
At the end of the script, we add the following unit tests, which use pytest's parameterized test functions:
import pytest
import shlex


@pytest.mark.parametrize("args, expected", [
    ("0 0", ["fizzbuzz"]),
    ("3 5", ["fizz", "4", "buzz"]),
    ("9 12", ["fizz", "buzz", "11", "fizz"]),
    ("14 17", ["14", "fizzbuzz", "16", "17"]),
    ("14 17 --fizz=2", ["fizz", "buzz", "fizz", "17"]),
    ("17 20 --buzz=10", ["17", "fizz", "19", "buzz"]),
])
def test_main(args, expected):
    options = parse_args(shlex.split(args))
    options.debug = True
    options.silent = True
    setup_logging(options)
    assert main(options) == expected
Note that, since the script code ends with a call to sys.exit(), the tests are not executed when the script is run normally. Thanks to this, pytest is not needed to run the script.
The test function is called once for each group of parameters. The args value is used as input to the parse_args() function, which gives us the options we need to pass to the main() function. The expected value is compared with what main() returns. Here is what pytest tells us if everything works as expected:
$ python3 -m pytest -v --doctest-modules ./fizzbuzz.py
============================ test session starts =============================
platform linux -- Python 3.7.4, pytest-3.10.1, py-1.8.0, pluggy-0.8.0 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/bernat/code/perso/python-script, inifile:
plugins: xdist-1.26.1, timeout-1.3.3, forked-1.0.2, cov-2.6.0
collected 7 items

fizzbuzz.py::fizzbuzz.fizzbuzz PASSED                                  [ 14%]
fizzbuzz.py::test_main[0 0-expected0] PASSED                           [ 28%]
fizzbuzz.py::test_main[3 5-expected1] PASSED                           [ 42%]
fizzbuzz.py::test_main[9 12-expected2] PASSED                          [ 57%]
fizzbuzz.py::test_main[14 17-expected3] PASSED                         [ 71%]
fizzbuzz.py::test_main[14 17 --fizz=2-expected4] PASSED                [ 85%]
fizzbuzz.py::test_main[17 20 --buzz=10-expected5] PASSED               [100%]

========================== 7 passed in 0.03 seconds ==========================
If an error occurs, pytest provides useful information about what happened:
$ python3 -m pytest -v --doctest-modules ./fizzbuzz.py
[...]
================================== FAILURES ==================================
__________________________ test_main[0 0-expected0] __________________________

args = '0 0', expected = ['0']

    @pytest.mark.parametrize("args, expected", [
        ("0 0", ["0"]),
        ("3 5", ["fizz", "4", "buzz"]),
        ("9 12", ["fizz", "buzz", "11", "fizz"]),
        ("14 17", ["14", "fizzbuzz", "16", "17"]),
        ("14 17 --fizz=2", ["fizz", "buzz", "fizz", "17"]),
        ("17 20 --buzz=10", ["17", "fizz", "19", "buzz"]),
    ])
    def test_main(args, expected):
        options = parse_args(shlex.split(args))
        options.debug = True
        options.silent = True
        setup_logging(options)
        assert main(options) == expected
E       AssertionError: assert ['fizzbuzz'] == ['0']
E         At index 0 diff: 'fizzbuzz' != '0'
E         Full diff:
E         - ['fizzbuzz']
E         + ['0']

fizzbuzz.py:160: AssertionError
----------------------------- Captured log call ------------------------------
fizzbuzz.py                125 DEBUG    compute fizzbuzz from 0 to 0
===================== 1 failed, 6 passed in 0.05 seconds =====================
The output of the logger.debug() call is included in this report. This is another good reason to use logging in scripts. If you want to know more about the great features of pytest, take a look at this material.
Summary
You can make Python scripts more reliable by following these four steps:
- Equip the script with documentation located at the top of the file.
- Use the argparse module to document the parameters with which the script can be called.
- Use the logging module to display information about the script's operation.
- Write unit tests.
Here is the complete code for the example discussed here. You can use it as a template for your own scripts.
Interesting discussions started around this material; you can find them here and here. The audience seems to have welcomed the recommendations on documentation and command-line arguments, but the advice on logging and tests struck some readers as "using a cannon to shoot sparrows."
Here is the material that was written in response to this article.
Dear readers! Do you plan to apply the recommendations for writing Python scripts given in this publication?