Chapter 11
Unit Testing

When you think about testing the code that you write, the first thing that probably comes to mind is simply running your program directly. If your program executes, you at least know that you do not have any syntax errors (provided every module was imported).

Similarly, if you provide appropriate inputs, and do not get a traceback, you know that your program completes successfully with those inputs. And, if the result matches the result you expect, that is additional inductive evidence that your program works.

This has a couple of key limitations, though. The first is that for a non-trivial program, it is not possible to test every scenario. It is impossible to avoid this limitation, although it is important to be as complete as possible when thinking through potential scenarios to test.

The second limitation (and the one that the bulk of this chapter covers) is time. For most applications, it is not practical to manually test every scenario you imagine for every change that you ever make to your program, because iterating over these scenarios is time-consuming.

It is possible, however, to ameliorate this limitation somewhat by automating your tests. An automated test suite can run while you are absent or working on something else, providing a significant time savings and making it much easier to test your work early and often.

This chapter explores some of the world of testing. Specifically, it focuses on unit testing using the built-in tools provided by the Python standard library (such as unittest and mock), and some common packages available for testing.

The Testing Continuum

So, what is a unit test exactly? Furthermore, how does it differ from a functional test, an integration test, or some other kind of test? To answer this, this chapter discusses two different testing scenarios.

The Copied Ecosystem

First, consider a very complete testing environment. If you are writing an application that primarily runs on servers, this might entail a “staging” server that has a copy of relevant data, and where potentially breaking actions can be performed safely. For a script or desktop application, the principle is the same. It runs in an area with a copy of anything it must touch or alter.

In this scenario, everything your program must do mimics what it does in its actual live environment. If you connect to a particular type of database, that database is still present in your test environment (just at a different location). If you get data from a web service, you still make that same request.

Essentially, in the copied ecosystem, any external dependencies your program relies on must still be present and set up in an identical way.

This type of testing scenario is designed not only to test specific code being worked on, but also to test that the entire ecosystem structure that is put in place is viable. Any data that is passed back and forth between different components of your application is actually passed in exactly the same way.

Automated tests that are run against a copied ecosystem such as this are generally called system tests. This term signifies the complete duplicated ecosystem under which these tests run. This kind of test is designed not only to test your specific code, but also to detect breaking changes in the external environment.

The Isolated Environment

Another very distinct type of test is one that is intended to test a very specific block of code, and to do so in an isolated environment.

In a copied ecosystem, any external requirements and dependencies (such as a database, external service, or the like) are all duplicated. On the other hand, tests intended to be run in an isolated environment do so generally by hand-waving the interactions between the tested code and the external dependencies, focusing only on what the actual code does.

This sort of hand wave is done by stipulating that an external service or dependency received a given input and returned a given output. The purpose of this kind of test is explicitly not to test the interaction between your application and the other service. Rather, it is to test what your application does with the data it receives from that service.

For example, consider a function that determines a person's age at the time of his or her wedding. It first gets information about the person (birthday and anniversary) from an external database, and then computes the delta between the two dates to determine the person's age at the time.

Such a function might look like this:

def calculate_age_at_wedding(person_id):
    """Calculate the age of a person at his or her wedding, given the
    ID of the person in the database.
    """
    # Get the person from the database, and pull out the birthday
    # and anniversary datetime.date objects.
    person = get_person_from_db(person_id)
    anniversary = person['anniversary']
    birthday = person['birthday']
    # Calculate the age of the person on his or her wedding day.
    age = anniversary.year - birthday.year

    # If the birthday occurs later in the year than the anniversary,
    # then subtract one from the age.
    if birthday.replace(year=anniversary.year) > anniversary:
        age -= 1

    # Done; return the age.
    return age

Of course, if you try to actually run this function, it will fail. This function depends on another function, get_person_from_db, which is not defined in this example. You intuitively understand from reading the comments and code around it that it gets a specific type of record from a database and returns a dictionary-like object.

When testing a function like this, a copied ecosystem would simply reproduce the database, pull a person record with a particular ID, and test that the function returns the expected age. In contrast, a test in an isolated environment wants to avoid dealing with the database at all. An isolated environment test would declare that you got a particular record, and test the remainder of the function against that record.

This kind of test, which seeks to isolate the code being tested from the rest of the world (and even sometimes the rest of the application itself) is called a unit test.

Advantages and Disadvantages

Both of these fundamental types of tests have advantages and disadvantages, and most applications must have some of both types of tests as part of a robust testing framework.

Speed

One of the most important advantages to unit tests that run in an isolated environment is speed. Tests that run against a copied ecosystem often have long setup and teardown processes. Furthermore, the I/O required to pass data between the various components is often one of the slowest aspects of your application.

By contrast, tests that run in an isolated environment are usually extremely fast. In the previous example, the time it takes to do the arithmetic to determine this person's age is far less (by several orders of magnitude) than the time it takes to ask the database for the row corresponding to the person's ID and to pass the data over the pipe.

Having a set of isolated tests that run very fast is valuable, because you are able to run them extremely often and get feedback from running those tests very quickly.

Interactivity

The primary reason why isolated tests are so fast is precisely because they are isolated. Isolated tests stipulate the interactions between various services involved in powering your application.

However, these interactions require testing, too. This is why you also need tests in a copied ecosystem. This enables you to ensure that these services continue to interact the way that you expect.

Testing Code

The focus of this chapter is specifically on unit testing. So, how can you write a test that runs the calculate_age_at_wedding function from the previous example? Your goal is to avoid actually talking to a database to get a person record, so your test must provide that information to the function itself.

Code Layout

In many cases, the best and by far the most straightforward way to handle testing such a function is simply to organize your code in a way that makes it easily testable.

In the example of the calculate_age_at_wedding function, you may not need to retrieve a record from the database at all. Depending on your application, it might be fine (and even preferable) to have the function simply accept the full record, rather than the person_id variable. In other words, the baton handoff to this function would not happen until the database call already occurred, and the only thing this function would do would be to perform the arithmetic.

Reorganizing in this way would also make the function less opinionated about what kind of data it gets. Any dictionary-like object with the appropriate keys would do.

The following trimmed-down function only does the calculation of the age, and is expected to receive a full person record (where it gets it from is not relevant).

def calculate_age_at_wedding(person):
    """Calculate the age of a person at his or her wedding, given the
    record of the person as a dictionary-like object.
    """
    # Pull out the birthday and anniversary datetime.date objects.
    anniversary = person['anniversary']
    birthday = person['birthday']
    # Calculate the age of the person on his or her wedding day.
    age = anniversary.year - birthday.year

    # If the birthday occurs later in the year than the anniversary,
    # then subtract one from the age.
    if birthday.replace(year=anniversary.year) > anniversary:
        age -= 1

    # Done; return the age.
    return age

In most ways, this function is almost exactly the same as the previous version. The only thing that has changed is that the call to get_person_from_db has been removed (and the comments and docstring updated to match).

Testing the Function

When it comes to testing this function, the problem is now very simple. Just pass a dictionary and make sure you get the correct result.

>>> from datetime import date
>>>
>>> person = {'anniversary': date(2012, 4, 21),
...           'birthday': date(1986, 6, 15)}
>>> age = calculate_age_at_wedding(person)
>>> age
25

Of course, a couple limitations exist here. First, this is still something that was run manually in the interactive terminal. The value of a unit testing suite is that you run it in an automated fashion.

A second (and even more important) limitation to recognize is that this tests only one input against only one output. Suppose you gutted the function the next day and replaced it with the following:

def calculate_age_at_wedding(*args, **kwargs):
    return 25

The test would still pass, even though the function would be extremely broken.

Indeed, the test does not even cover some sections of this function. After all, there is an if block in the function based on whether or not the birthday falls before or after the anniversary in a calendar year. At a minimum, you would want to ensure that your test takes both pathways.

The following test function handles this:

from datetime import date

def test_calculate_age_at_wedding():
    """Establish that the 'calculate_age_at_wedding' function seems
    to calculate a person's age at his wedding correctly, given a
    dictionary-like object representing a person.
    """
    # Assert that if the anniversary falls before the birthday in a
    # calendar year, that the calculation is done properly.
    person = {'anniversary': date(2012, 4, 21),
              'birthday': date(1986, 6, 15)}
    age = calculate_age_at_wedding(person)
    assert age == 25, 'Expected age 25, got %d.' % age

    # Assert that if the anniversary falls after the birthday in a
    # calendar year, that the calculation is done properly.
    person = {'anniversary': date(1969, 8, 11),
              'birthday': date(1945, 2, 15)}
    age = calculate_age_at_wedding(person)
    assert age == 24, 'Expected age 24, got %d.' % age

Now you have a function that can be run by an automated process. Python includes a test runner, which is explored shortly. Also, this test covers a couple of different permutations of the function. It certainly does not cover every possible input (it would be impossible to do that), but it provides a slightly more complete sanity check.

However, always remember that the tests are not an exhaustive check. They only test the inputs and outputs that you provide. For example, this test function says nothing about what would happen if the calculate_age_at_wedding function were sent something other than a dictionary, or if it were sent a dictionary with the wrong keys, or if datetime objects were used instead of date objects, or if you were to send an anniversary date that is earlier than the birth date, or any number of other permutations. This is fine. It is simply important to understand what the limits of your tests are.

The assert Statement

What about the assert statement that the test function is using? Consider what a unit test fundamentally is. A unit test is an assertion or a set of assertions. In this case, you assert that if you send a properly formatted dictionary with specific dates, you get a specific integer result.

In Python, assert is a keyword, and assert statements are used almost exclusively for testing (although they need not appear exclusively in test code). The assert statement expects the expression sent to it to evaluate to True. If it does, the assert statement does nothing; if it does not, AssertionError is raised. You can optionally provide a custom error message to be raised with the AssertionError, as the previous example does.

When writing tests, you want to use AssertionError as the exception to be raised when a test fails, either by raising it directly, or (usually) by using the assert statement to assert the test's pass conditions, because all of the unit testing frameworks will catch the error and handle it appropriately when compiling test failures.

Unit Testing Frameworks

Now that you have your test as a function, the next step is to set up a process to run that test (as well as any others you may write to test the remainder of the application).

Several unit testing frameworks, such as py.test and nose, are available as third-party packages. However, the Python standard library also ships with a quite robust unit testing framework, available under the unittest module in the standard library.

Consider the testing function from the previous example, but structured to be run by the unittest module.

import unittest
from datetime import date

class Tests(unittest.TestCase):
    def test_calculate_age_at_wedding(self):
        """Establish that the 'calculate_age_at_wedding' function seems
        to calculate a person's age at his wedding correctly, given a
        dictionary-like object representing a person.
        """
        # Assert that if the anniversary falls before the birthday
        # in a calendar year, that the calculation is done properly.
        person = {'anniversary': date(2012, 4, 21),
                  'birthday': date(1986, 6, 15)}
        age = calculate_age_at_wedding(person)
        self.assertEqual(age, 25)

        # Assert that if the anniversary falls after the birthday
        # in a calendar year, that the calculation is done properly.
        person = {'anniversary': date(1969, 8, 11),
                  'birthday': date(1945, 2, 15)}
        age = calculate_age_at_wedding(person)
        self.assertEqual(age, 24)

In most ways, this looks the same as what you saw before. However, it has a couple of key differences. The first difference is that you now have a class, which subclasses unittest.TestCase. The unittest module expects to find tests grouped using unittest.TestCase subclasses. Each test must be a function whose name begins with test. As a corollary, because the test itself is now a method of the class rather than an unbound function, it now has self as an argument.

The other change is that the raw assert statements have been replaced with calls to self.assertEqual. The unittest.TestCase class provides a number of wrappers around assert that standardize error messages and provide some other boilerplate.

Running Unit Tests

Now it is time to actually run this test within the unittest framework. To do this, save both the function and the test class in a single module, such as wedding.py.

The Python interpreter provides a flag, -m, which takes a module in the standard library or on sys.path, and runs it as a script. The unittest module supports being run in this way, and accepts the Python module to be tested. (If you named your module wedding.py, this would be wedding.)

$ python -m unittest wedding
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

What is happening here? The wedding module was loaded, and the unittest module found a unittest.TestCase subclass. It instantiated the class and then ran every method beginning with the word test, which the test_calculate_age_at_wedding method does.

The unittest output prints a period character (.) for a successful test, or a letter for failures (F), errors (E), and a few other cases, such as tests that are intentionally skipped (s). Because there was only one test, and it was successful, you see a single . character followed by the concluding output.

Failures

You can observe what happens when a test fails by simply changing the test's condition so that it will intentionally fail.

To illustrate this, add the following method to your Tests class:

def test_failure_case(self):
    """Assert a wrong age, and fail."""
    person = {'anniversary': date(2012, 4, 21),
              'birthday': date(1986, 6, 15)}
    age = calculate_age_at_wedding(person)
    self.assertEqual(age, 99)

This is a similar test, except that it asserts that the age is 99, which is wrong. Observe what happens if you run tests on the module now:

$ python -m unittest wedding
.F
======================================================================
FAIL: test_failure_case (wedding.Tests)
Assert a wrong age, and fail.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "wedding.py", line 50, in test_failure_case
    self.assertEqual(age, 99)
AssertionError: 25 != 99
----------------------------------------------------------------------
Ran 2 tests in 0.000s
FAILED (failures=1)

Now you have two tests. You have the main test from before, which still passes, and a second test with a bogus age, which fails.

If you ran the function directly, you would just get a standard traceback when AssertionError is raised. However, the unittest module actually catches this error and tracks the failure, and prints the output nicely at the end of the test run.

This may seem like an unimportant distinction at this point, but if you have hundreds of tests, this difference matters. A Python module will terminate when it comes across the first uncaught exception, so your test run would stop on the first failure. When you're using unittest, the tests continue to run, and you get all the failures at once at the end.

The unittest output also includes the test function and the beginning of the docstring, so it is easy to go find the failing test and investigate, as well as the full traceback, so you still have the same insight into the offending code.

Errors

Only a small difference distinguishes an error from a failure. A test that raises AssertionError is considered to have failed, whereas a test that raises any exception other than AssertionError is considered to be in error.

Consider what would happen if the person variable being tested is an empty dictionary. Add the following function to your Tests class in the wedding module:

def test_error_case(self):
    """Attempt to send an empty dict to the function."""
    person = {}
    age = calculate_age_at_wedding(person)
    self.assertEqual(age, 25)

Now what happens when you run tests?

$ python -m unittest wedding
.EF
======================================================================
ERROR: test_error_case (wedding.Tests)
Attempt to send an empty dict to the function.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "wedding.py", line 55, in test_error_case
    age = calculate_age_at_wedding(person)
  File "wedding.py", line 10, in calculate_age_at_wedding
    anniversary = person['anniversary']
KeyError: 'anniversary'
======================================================================
FAIL: test_failure_case (wedding.Tests)
Assert a wrong age, and fail.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "wedding.py", line 50, in test_failure_case
    self.assertEqual(age, 99)
AssertionError: 25 != 99
----------------------------------------------------------------------
Ran 3 tests in 0.000s
FAILED (failures=1, errors=1)

Now you have three tests. You have the passing and failing test from earlier, and a test that is in error. Instead of raising AssertionError, the error case raised KeyError, because the calculate_age_at_wedding function expected an anniversary key in the dictionary (and the key was not there).

For most practical purposes, you probably will not actually put much stock in the difference between a failure and an error. They are simply failing tests that fail in slightly different ways.

Skipped Tests

It is also possible to mark that a test should be skipped under certain situations. For example, say that an application is designed to run under Python 2 or Python 3, but a particular test only makes sense in one of the two environments. Rather than have the test fail when it should not, it is possible to declare that a test should run only under certain conditions.

The unittest module provides skipIf and skipUnless decorators that take an expression. The skipIf decorator causes the test to be skipped if the expression it receives evaluates to True, and the skipUnless decorator causes the test to be skipped if the expression it receives evaluates to False. In addition, both decorators take a second, required argument, which is a string that describes why the test was skipped.
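The Python 2/Python 3 scenario described above might be expressed with skipUnless. This sketch (the VersionTests class and its test method are illustrative, not from the chapter's wedding module) skips the test on any interpreter older than Python 3:

```python
import sys
import unittest

class VersionTests(unittest.TestCase):
    # Skipped (with the given reason) unless the interpreter
    # is at least Python 3.
    @unittest.skipUnless(sys.version_info[0] >= 3, 'Requires Python 3.')
    def test_bytes_decode(self):
        self.assertEqual(b'spam'.decode('ascii'), 'spam')
```

The equivalent skipIf form would simply negate the expression: `@unittest.skipIf(sys.version_info[0] < 3, 'Requires Python 3.')`.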

To see skipped tests in action, add the following function to your Tests class. (To keep the output shown here down to a reasonable size, the failure and error tests have been removed.)

@unittest.skipIf(True, 'This test was skipped.')
def test_skipped_case(self):
    """Skip this test."""
    pass

This function is decorated with unittest.skipIf. True is a valid expression in Python, and obviously evaluates to True. Now see what happens when you run the tests:

$ python -m unittest wedding
.s
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK (skipped=1)

The output for a skipped test is an s, rather than the traditional period character that denotes a test that passed. The use of a lowercase letter rather than an uppercase one (as in F and E) signifies that this is not an error condition, and indeed, the complete test run is considered to be a success.

Loading Tests

So far, you have run tests out of a single module, and the tests have lived in the same module where the code that it is testing also lives. This is fine for a trivial example but entirely unfeasible for a large application.

The unittest module understands this, and provides an extensible mechanism for programmatically loading tests from a complete project tree. The default class, which is suitable for most needs, is unittest.TestLoader.
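You can also drive the default loader programmatically rather than through the command line. This is a minimal sketch (the Tests class here is a stand-in for any TestCase subclass of your own):

```python
import io
import unittest

class Tests(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)

# Collect every method beginning with 'test' from the TestCase
# into a suite, then run the suite with a text runner.
loader = unittest.TestLoader()
suite = loader.loadTestsFromTestCase(Tests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun)         # -> 1
print(result.wasSuccessful())  # -> True
```

This is essentially what `python -m unittest` does on your behalf.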

If you are just using the default test loading class, which is what you want most of the time, you can trigger it by using the word discover instead of the module name to be tested.

$ python -m unittest discover
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK

Where did your tests go? Test discovery follows certain rules to determine where to look for tests. By default, it expects all files containing tests to be named according to the pattern test*.py.

This is what you really want to do anyway. The value of test discovery is that you can separate your tests from the rest of your code. So, if you move the passing test itself from the wedding.py file to a new file matching that pattern (for example, test_wedding.py), the test discovery system will find it. (Note that you must import the calculate_age_at_wedding function explicitly, because it is not in the same module anymore!)

Sure enough, now the test discovery finds the tests:

$ python -m unittest discover
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

Mocking

To make the calculate_age_at_wedding function something that was capable of being easily unit tested, recall how you had to remove part of the function. The idea was that you organize your code to make that function easily testable by doing a database call elsewhere.

Often, organizing your code in a way that makes it easily testable is the ideal approach to this problem, but sometimes it is not possible or wise. Instead of implicitly hand-waving certain functionality by organizing your code around atomic testing, how do you explicitly hand-wave a segment of tested code?

The answer is mocking. Mocking is the process of declaring within a test that a certain function call should be stipulated to give a particular output, and the function call itself should be suppressed. Additionally, you can assert that the mocked call that you expect was made in a particular way.

Beginning in Python 3.3, the unittest module ships with unittest.mock, which contains tools for mocking. If you are using Python 3.2 or earlier, you can use the mock package, which you can download from www.pypi.python.org.

The API of the two is identical; only the import changes. If you are using Python 3.3 or later, you want from unittest import mock; if you are using the installed package, you want import mock.
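A small compatibility shim makes the import uniform, by trying the standard library location first and falling back to the installed package:

```python
# Prefer the standard library location (Python 3.3+); fall back
# to the third-party 'mock' package on older interpreters.
try:
    from unittest import mock
except ImportError:
    import mock
```

Either way, the rest of your test code refers simply to `mock`.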

Mocking a Function Call

Consider again the original function for calculate_age_at_wedding, which included a call to retrieve a record from an unspecified database. (If you are following along, you should create a new file.)

def calculate_age_at_wedding(person_id):
    """Calculate the age of a person at his or her wedding, given the
    ID of the person in the database.
    """
    # Get the person from the database, and pull out the birthday
    # and anniversary datetime.date objects.
    person = get_person_from_db(person_id)
    anniversary = person['anniversary']
    birthday = person['birthday']
    # Calculate the age of the person on his or her wedding day.
    age = anniversary.year - birthday.year

    # If the birthday occurs later in the year than the anniversary,
    # then subtract one from the age.
    if birthday.replace(year=anniversary.year) > anniversary:
        age -= 1

    # Done; return the age.
    return age

Before, you tested most of this function by actually changing the function itself. You reorganized the code around ease of testability. However, you also want to be able to test code where this is either impossible or undesirable.

First things first. You still do not actually have a get_person_from_db function, so you want to suppress that function call. Therefore, add a function that raises an exception.

def get_person_from_db(person_id):
    raise RuntimeError("The real 'get_person_from_db' function "
                       "was called.")

At this point, if you actually try to run the calculate_age_at_wedding function, you will get a RuntimeError. This is convenient for this example because it will make it very obvious if your mocking does not work. Your test will loudly fail.

Next comes the test. If you just try to run the same test from before, it will fail (with RuntimeError). You need a way of getting around the get_person_from_db call. This is where mock comes in.

The mock module is essentially a monkey-patching library. It temporarily replaces a variable in a given namespace with a special object called a MagicMock, and then returns the variable to its previous value after the scope of the mock is concluded. The MagicMock object itself is extremely permissive. It accepts (and tracks) basically any call made to it, and returns whatever you tell it.
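A quick sketch of just how permissive a MagicMock is:

```python
from unittest import mock

m = mock.MagicMock()
m.return_value = 42

# Any call signature is accepted, recorded, and answered with
# the configured return value.
print(m('spam', eggs=True))  # -> 42
print(m.call_count)          # -> 1
```

The recorded calls are what make the assertion methods discussed later in this chapter possible.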

In this case, you want the get_person_from_db function to be replaced with a MagicMock object for the duration of your test.

import unittest
import sys
from datetime import date
# Import mock regardless of whether it is from the standard library
# or from the PyPI package.
try:
    from unittest import mock
except ImportError:
    import mock

class Tests(unittest.TestCase):
    def test_calculate_age_at_wedding(self):
        """Establish that the 'calculate_age_at_wedding' function seems
        to calculate a person's age at his wedding correctly, given
        a person ID.
        """
        # Since we are mocking a name in the current module, rather than
        # an imported module (the common case), we need a reference to
        # this module to send to 'mock.patch.object'.
        module = sys.modules[__name__]

        with mock.patch.object(module, 'get_person_from_db') as m:
            # Ensure that the get_person_from_db function returns
            # a valid dictionary.
            m.return_value = {'anniversary': date(2012, 4, 21),
                              'birthday': date(1986, 6, 15)}

            # Assert that the calculation is done properly.
            age = calculate_age_at_wedding(person_id=42)
            self.assertEqual(age, 25)

The big new thing going on here is the call to mock.patch.object. This is a function that can be used either as a context manager or a decorator, and it takes two required arguments: a module that contains the callable being mocked, and then the name of the callable as a string. In this case, because the function and the test are all contained in a single file (which is not what you would normally do), you must get a reference to the current module, which is always sys.modules[__name__].

The context manager returns a MagicMock object, which is m in the previous example. Before you can call the function being tested, however, you must specify what you expect the MagicMock to do. In this case, you want it to return a dictionary that approximates a valid record of a person. The return_value property of the MagicMock object is what handles this. Setting it means that every time the MagicMock is called, it will return that value. If you do not set return_value, another MagicMock object is returned.
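The default behavior is easy to see in isolation:

```python
from unittest import mock

m = mock.MagicMock()

# With no return_value set, calling the mock yields yet
# another MagicMock object.
default = m()
print(isinstance(default, mock.MagicMock))  # -> True

# Once return_value is set, every call returns that value.
m.return_value = 'spam'
print(m())  # -> spam
```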

If you run tests on this module, you will see that the test passes. (Here, the new module is named mock_wedding.py.)

$ python -m unittest mock_wedding
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

Asserting Mocked Calls

This test passes, but it is still fundamentally incomplete in one important way. It mocks the function call to get_person_from_db, and tests that the function does the right thing with the output.

What the test does not do is verify that the baton handoff to the get_person_from_db function actually occurred. In some ways, this is redundant. You know the call happened, because otherwise you would not have received the return value from the mock object. However, sometimes you will mock function calls that do not have a return value.

Fortunately, MagicMock objects track calls made to them. Rather than just spitting out the return value and being done, the object stores information about how many times it was called, and the signature of each call. Finally, MagicMock provides methods to assert that calls occurred in a particular fashion.

Probably the most common method you will use for this purpose is MagicMock.assert_called_once_with. This asserts two things: that the MagicMock was called once and exactly once, and that the specified argument signature was used. Consider an augmented test function that ensures that the get_person_from_db method was called with the expected person ID:

class Tests(unittest.TestCase):
    def test_calculate_age_at_wedding(self):
        """Establish that the 'calculate_age_at_wedding' function seems
        to calculate a person's age at his wedding correctly, given
        a person ID.
        """
        # Since we are mocking a name in the current module, rather than
        # an imported module (the common case), we need a reference to
        # this module to send to 'mock.patch.object'.
        module = sys.modules[__name__]
        with mock.patch.object(module, 'get_person_from_db') as m:
            # Ensure that the get_person_from_db function returns
            # a valid dictionary.
            m.return_value = {'anniversary': date(2012, 4, 21),
                              'birthday': date(1986, 6, 15)}

            # Assert that the calculation is done properly.
            age = calculate_age_at_wedding(person_id=42)
            self.assertEqual(age, 25)

            # Assert that the 'get_person_from_db' method was called
            # the way we expect.
            m.assert_called_once_with(42)

The thing that has changed here is that the MagicMock object is now being checked at the end to ensure that you got the call to it that you expected. The call signature is simply a single positional argument: 42. This is the person ID used in the test (just a few lines earlier). It is sent as a positional argument because that is the way the argument is provided in the original function.

    person = get_person_from_db(person_id)

Notice that person_id is provided as a single positional argument, so that is what the MagicMock will record.

If you run the test, you will see that it still passes:

$ python -m unittest mock_wedding
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

What happens if the MagicMock's assertions are incorrect? The tests fail with a useful failure message, as you can see by changing the assert_called_once_with argument signature (in this case, from 42 to 84):

$ python -m unittest mock_wedding
F
======================================================================
FAIL: test_calculate_age_at_wedding (mock_wedding.Tests)
Establish that the 'calculate_age_at_wedding' function seems
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/luke/Desktop/wiley/mock_wedding.py", line 58, in
       test_calculate_age_at_wedding
    m.assert_called_once_with(84)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/unittest
       /mock.py", line 771, in assert_called_once_with
    return self.assert_called_with(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/unittest
       /mock.py", line 760, in assert_called_with
    raise AssertionError(_error_message()) from cause
AssertionError: Expected call: get_person_from_db(84)
Actual call: get_person_from_db(42)
----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

Here you are told which call the MagicMock expected to get, as well as the call it actually received. You would get similar errors if there were no call, or more than one call.

The assert_called_once_with method has a close cousin, assert_called_with. This is identical except that it does not fail if the MagicMock has been called more than once, and it checks the call signature against only the most recent call.
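A quick sketch illustrates the difference between the two methods on a mock that has been called twice:

```python
from unittest.mock import MagicMock

m = MagicMock()
m(1)
m(2)

# assert_called_with checks only the most recent call, so this
# passes even though the mock has been called twice.
m.assert_called_with(2)

# assert_called_once_with fails here, because the mock was
# called twice.
try:
    m.assert_called_once_with(2)
except AssertionError:
    called_once = False
else:
    called_once = True
```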

Inspecting Mocks

You can inspect MagicMock objects in several other ways to determine what occurred. You may just want to know that it was called, or how many times it was called. You also may want to assert a sequence of calls, or only look at part of the call's signature.

Call Count and Status

A couple of the easiest and most straightforward questions are whether a MagicMock has been called, and how many times it has been called.

If you just want to know whether a MagicMock has been called at all, you can check the called property, which is set to True the first time that the MagicMock is called.

>>> from unittest import mock
>>> m = mock.MagicMock()
>>> m.called
False
>>> m(foo='bar')
<MagicMock name='mock()' id='4315583152'>
>>> m.called
True

On the other hand, you may also want to know exactly how many times the MagicMock has been called. This is available, too, as call_count.

>>> from unittest import mock
>>> m = mock.MagicMock()
>>> m.call_count
0
>>> m(foo='bar')
<MagicMock name='mock()' id='4315615752'>
>>> m.call_count
1
>>> m(spam='eggs')
<MagicMock name='mock()' id='4315615752'>
>>> m.call_count
2

The MagicMock class does not have built-in methods for asserting the presence of a call or a given call count, but the assertEqual and assertTrue methods that are part of unittest.TestCase are more than sufficient for that task.
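For example, a test case might check the call count like this (a minimal sketch, with the test class and method names invented for illustration):

```python
import unittest
from unittest import mock

class CallCountTests(unittest.TestCase):
    def test_called_twice(self):
        m = mock.MagicMock()
        m('spam')
        m('eggs')

        # Assert that the mock was called, and called exactly twice.
        self.assertTrue(m.called)
        self.assertEqual(m.call_count, 2)
```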

Multiple Calls

You may also want to assert the composition of multiple calls to a MagicMock in one fell swoop. MagicMock objects provide the assert_has_calls method for this purpose.

To use assert_has_calls, you must understand call objects, which are provided as part of the mock library. Whenever you make a call to a MagicMock object, it internally creates a call object that stores the call signature (and appends it to the mock_calls list on the object). These call objects are considered to be equivalent if the signatures match.

>>> from unittest.mock import call
>>> a = call(42)
>>> b = call(42)
>>> c = call('foo')
>>> a is b
False
>>> a == b
True
>>> a == c
False

This is actually how assert_called_once_with and similar methods work under the hood. They make a new call object, and then ensure that it is equivalent to the one in the mock_calls list.
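You can observe this bookkeeping directly by comparing mock_calls against call objects you construct yourself; a short sketch:

```python
from unittest.mock import MagicMock, call

m = MagicMock()
m(42)
m(spam='eggs')

# Each call is recorded as a call object in mock_calls, and the
# assertion methods compare against this list.
expected = [call(42), call(spam='eggs')]
```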

The assert_has_calls method takes a list (or other similar object, such as a tuple) of call objects. It also accepts an optional keyword argument, any_order, which defaults to False. If it remains False, the calls are expected to have occurred in the same sequence as they appear in the list. If it is set to True, only the presence of each call to the MagicMock is relevant, not the order of the calls.

Here is what assert_has_calls looks like in action:

>>> from unittest.mock import MagicMock, call
>>>
>>> m = MagicMock()
>>> m.call('a')
<MagicMock name='mock.call()' id='4370551920'>
>>> m.call('b')
<MagicMock name='mock.call()' id='4370551920'>
>>> m.call('c')
<MagicMock name='mock.call()' id='4370551920'>
>>> m.call('d')
<MagicMock name='mock.call()' id='4370551920'>
>>> m.assert_has_calls([call.call('b'), call.call('c')])

It is worth noting that although assert_has_calls does expect the calls to occur in order, it does not require that you send it the entire list of calls. Having other calls on either end of the list is fine.
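Both behaviors can be seen in a sketch that calls the mock object directly, rather than through an attribute as in the preceding transcript:

```python
from unittest.mock import MagicMock, call

m = MagicMock()
for arg in ('a', 'b', 'c', 'd'):
    m(arg)

# With any_order left as False, a consecutive run of calls is
# enough; calls on either end of the provided list are ignored.
m.assert_has_calls([call('b'), call('c')])

# With any_order=True, only the presence of each call matters.
m.assert_has_calls([call('d'), call('a')], any_order=True)
```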

Inspecting Calls

Sometimes, you may not want to test the entirety of a call signature. Perhaps it is only important that a certain argument be included. This is a little bit more difficult to do. There is no ready-made method for a call to declare that it matches anything other than a complete call signature (although the mock.ANY sentinel can stand in for individual arguments whose values you do not care about).

However, it is possible to inspect the call object itself and look at the arguments sent to it. The call class is actually a subclass of tuple, and a call object constructed this way has three elements: the name of the call (an empty string here), followed by the two halves of the call signature.

>>> from unittest.mock import call
>>> c = call('foo', 'bar', spam='eggs')
>>> c[1]
('foo', 'bar')
>>> c[2]
{'spam': 'eggs'}

By inspecting the call object directly, you can get a tuple of the positional arguments and a dictionary of the keyword arguments.

This gives you the capability to test only part of a call signature. For example, what if you want to ensure that the string bar was one of the arguments given to the call, but you do not care about the rest of the arguments?

>>> assert 'bar' in c[1]
>>> assert 'baz' in c[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>> assert c[2]['spam'] == 'eggs'

Once you have access to the positional arguments as a tuple and the keyword arguments as a dictionary, testing for the presence or absence of a single argument is no different than testing for the presence of an element in a list or dictionary.
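The same technique applies to calls recorded on a mock object. Entries in the mock_calls list unpack as (name, args, kwargs) tuples, so a partial check might look like this sketch:

```python
from unittest.mock import MagicMock

m = MagicMock()
m('foo', 'bar', spam='eggs')

# Unpack the recorded call into its name, positional arguments,
# and keyword arguments.
name, args, kwargs = m.mock_calls[0]

# Test for a single argument without matching the whole signature.
has_bar = 'bar' in args
spam_value = kwargs.get('spam')
```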

Other Testing Tools

Several other testing tools are available that you may want to consider using as you build out a unit test suite in your applications.

coverage

How do you actually know what code is being tested? Ideally, you want to test as much of your code as possible in each test run, while still maintaining a test suite that runs quickly.

If you want to know just how much of your code your test suite is exercising, you will want to use the coverage application, which is available from www.pypi.python.org. Originally written by Ned Batchelder, coverage is a tool that keeps track of all of the lines of code in each module that run as your tests are running, and provides a report detailing what code did not run. Of course, coverage runs on both Python 2 and Python 3.

Installing the package provides a coverage command, and you use coverage run as a substitute for python when invoking a Python script of any kind, including your unit test script. The output will look fundamentally similar.

$ coverage run -m unittest mock_wedding
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

However, if you look at the directory, you will see that a .coverage file was created in the process. This file contains information about what code in the file actually ran.

You can view this information with coverage report.

$ coverage report
Name           Stmts   Miss  Cover
----------------------------------
mock_wedding      22      1    95%

This report shows the total number of statements in the file and how many of them did not run. So, you know that one statement was missed, but not which one. Adding -m to the command adds output showing which lines were skipped:

$ coverage report -m
Name           Stmts   Miss  Cover   Missing
--------------------------------------------
mock_wedding      22      1    95%   24

Now you know that line 24 was the statement that did not run. (In the example mock_wedding.py file, line 24 corresponds to the RuntimeError that is raised if the “real” get_person_from_db function is called.)

The coverage application can also write attractive HTML output using the coverage html command. This highlights in red the lines that did not run. Additionally, if you have a statement with multiple branches (such as an if statement), it highlights those in yellow if only one path was taken.

tox

Many Python applications need to run on multiple versions of Python, including both Python 2 and Python 3. If you are writing an application that runs in multiple environments (even just multiple minor revisions), you want to run your tests against all of those environments.

Attempting to run tests manually across every environment you support is likely to be cumbersome. If you need to do this, consider tox. Written by Holger Krekel, tox is a tool that automatically creates virtual environments (using virtualenv) with the appropriate versions of Python (provided you have them installed) and runs the tests within those environments.
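You configure tox with a tox.ini file in your project directory. A minimal sketch might look like the following (the environment names and the test command here are assumptions; adjust them to the interpreters you actually support):

```ini
[tox]
envlist = py27, py34

[testenv]
commands = python -m unittest discover
```

Running the tox command then builds each listed environment and runs the configured command in all of them.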

Other Test Runners

This chapter has focused primarily on the test runner provided by Python itself, but other alternatives are available. Some, such as nose and py.test, are quite popular, and add numerous features and hooks for extensibility.

These libraries are easy to adopt even if you already have a robust unit test suite, because both support running unittest tests out of the box. However, both libraries also support other ways of adding tests to the pool.

Both of these libraries are available on www.pypi.python.org, and run on Python 2.6 and up.

Summary

Unit testing is a powerful way to ensure that your code remains consistent over time. It is a useful way to discover when a change to your code alters its behavior, so that you can make adjustments accordingly.

This is an important facet of any application. Having a robust testing suite makes it easier to detect some bugs and makes you aware when a function's behavior changes, thus simplifying application maintenance.

Chapter 12 examines the optparse and argparse tools for using Python on the command-line interface (CLI).
