
Chapter 8: Monkeypatching #

Alfredo Deza

The Python testing community has had quite a few mocking and patching libraries. Perhaps the most popular one is mock, which is now part of the standard library. The mock library is very flexible and has lots of different helpers meant to ease testing. Before Python 3, it wasn’t surprising to see projects declare mock as a dependency, but since Python 3.3 it has been available in the standard library as the unittest.mock module.

Pytest has the monkeypatch fixture, which can also be used and offers a few ways to patch objects properly. Explaining patching can be tricky, especially for a dynamic language like Python, but in short, it allows us to (safely) replace code at runtime with other code that has the desired behavior for a test. The difficulty comes when trying to configure the patching. Patching is needed when there is no way to pass an object (with the desired behavior) to the code under test. While writing code, it pays to think ahead about how the implementation will allow writing tests easily. If writing tests is challenging, it is a red flag, because it means that the code is not modular, or has too many dependencies on other parts of the code.

Similarly, when patching is overused, it is also a red flag because it means that the code defines lots of state or dependencies that it doesn’t control. It also means that tests will have to do the heavy lifting by creating (or recreating in some cases) the state and dependencies required to verify specific behavior.

Why and when to monkeypatch? #

Most of the code that a developer will interact with is already written. There is not that much chance to create large codebases from scratch. The reality is that a lot of large codebases have almost no tests, which goes hand in hand with high-complexity code. When writing new code, like a function, for example, if testing is treated as a first-class citizen, at the forefront of producing the function, then it is highly probable that the function will be easier to test. The reason is that as you are writing the function, you will think about how to make it easier for yourself when a test gets added.

When testing is not even considered, then expect things to go wild. That small function you produced will grow and grow until it is seven hundred lines of code that no one wants to touch because something always breaks when a change is made. Reason number one to monkeypatch? When a piece of code has dependencies that cannot be passed in as arguments, making them impossible to control from a test.

For large pieces of code, I recommend extracting the functionality one needs to interact with and then test that. Do not try to monkeypatch the universe so that you can keep growing the seven hundred line function!

Monkeypatching is hard #

Don’t be discouraged if the patching doesn’t work. I’ve written Python professionally for over a decade, and I still fumble on patching correctly. It is so rare that I get it right on the first try that it becomes something I celebrate when I do. Patching is difficult because the patching needs to happen where the code (or module) is used, not where it lives. If you are testing a helper() function in my_project.utils and you need to patch the urllib.request module, then the path to patch would be my_project.utils.urllib.request, not urllib.request!

If the import statement in the utils.py file is:

from urllib import request

Then the patching does not need to include urllib, and it becomes: my_project.utils.request. This is not intuitive, it takes time to get used to it, and it is why patching is so hard. It remains a difficult thing to master.
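
To see why the lookup location matters, here is a minimal sketch that builds two throwaway modules in memory. The names lives, used, fetch, and helper are made up for illustration, and plain attribute assignment stands in for a patching helper:

```python
import types

# Two throwaway in-memory modules (hypothetical names, for illustration only).
lives = types.ModuleType("lives")
exec("def fetch():\n    return 'real'\n", lives.__dict__)

used = types.ModuleType("used")
used.fetch = lives.fetch  # the effect of `from lives import fetch`
exec("def helper():\n    return fetch()\n", used.__dict__)


def fake_fetch():
    return "fake"


# Patch where the function lives: `used` still holds its own reference.
original = lives.fetch
lives.fetch = fake_fetch
patched_where_it_lives = used.helper()
lives.fetch = original

# Patch where the function is used: helper() now sees the fake.
original = used.fetch
used.fetch = fake_fetch
patched_where_it_is_used = used.helper()
used.fetch = original

print(patched_where_it_lives)    # real
print(patched_where_it_is_used)  # fake
```

The from-import copies a reference into the importing module's namespace, and that copied name is the one helper() looks up at call time.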

The simplest monkeypatching #

I’m going to re-use the build_message() helper from Chapter 7, and modify it to demonstrate how hard it would be to test unless patching is done. The small function created a Python dictionary with useful information from a response object that is passed in as an argument. I’m modifying it so that is no longer the case, and now the function calls out to a separate function in the same module.

from urllib import request

def make_request():
    response = request.urlopen('http://api.example.com/')
    response.body = response.read()
    return response

def build_message():
    response = make_request()
    message = {
        "success": True,
        "error": "",
    }
    if response.status >= 400:
        message["success"] = False
        message["error"] = response.body
    return message

This code is now a tragedy. The build_message() function no longer accepts the response object. Instead, it uses a separate function call (make_request()) to make a request to some URL and then get the response object. This type of code is easy to write when not thinking about how to test it out. Patching can get us out of this problem without having to modify production code.

The first thing we need to do is determine exactly where the module is being imported so that the patching can be done correctly. That is the most challenging part of patching! The file layout for the example has utils.py at the same level as test_utils.py; it looks like this:

.
├── test_utils.py
└── utils.py

0 directories, 2 files

The most straightforward path is going to be patching the make_request() function so that when it is called by build_message(), it gets the FakeResponse object. Since we now know the name of the function we need to patch, we need to figure out the right path to it. The utils module is a top-level module, so the path is going to be: utils.make_request. This is an important thing to understand! Remember: the path is determined by where the code is used, not where it lives. If the make_request function were defined in an http.py module and imported into utils.py, the patching path would not change!

Create a test that uses monkeypatching to patch the make_request function:

import utils

class FakeResponse:

    def __init__(self, status=200, body=""):
        self.status = status
        self.body = body


def test_build_message_success(monkeypatch):
    def fake_request():
        return FakeResponse()
    monkeypatch.setattr('utils.make_request', fake_request)
    result = utils.build_message()
    assert result["success"] is True

Since build_message() lives in the utils.py file, the test requires the import statement at the top. The FakeResponse class is the same as before, and the test function changes a bit to require the monkeypatch fixture. It creates a nested function called fake_request that just returns a FakeResponse instance. This is followed by the actual patching, which is done with setattr (short for set attribute). That patching statement means that at runtime, the build_message() function will not call make_request(); instead it calls fake_request. All of this is handled by the Pytest framework, which patches the code for the test and leaves everything as it was at the end of the test.

The test passes, and it ends the test run. One important thing to note is that the framework is also in charge of cleaning up after every test. Since the code gets altered at runtime, it needs to ensure that after the test ends (regardless of success, error, or failure), it cleans up and leaves things how they were initially. If that didn’t happen, other tests that require the normal behavior would run into unexpected problems.

Another approach for patching if using the module path as a string is not working (or inconvenient) is by importing the module containing the callable and then using that module. So instead of: monkeypatch.setattr('utils.make_request', fake_request) the patching for the example test would look like this:

import utils

def test_build_message_success(monkeypatch):
    def fake_request():
        return FakeResponse()
    monkeypatch.setattr(utils, 'make_request', fake_request)
    result = utils.build_message()
    assert result["success"] is True

Not much different, but offers some flexibility to achieve the same objective: that make_request shouldn’t be using a real HTTP request.

Automatic and global monkeypatching #

In a small project I worked on, a library had to implement an abstraction to interact with the Jenkins API. The library depended on a third-party dependency that would always try to contact the remote Jenkins API over HTTP. The code was written in a way that required this behavior at runtime, and being unit tests, having a remote Jenkins instance for testing was out of the question.

Fixtures can help here because they can be configured to be used automatically. In this case in particular, I want to monkeypatch the library globally, for all tests all the time. First define the stub, or code that will react in place of the real library:

class fake_jenkins:
    def get_node_config(self, *a):
        return """<?xml version="1.0" encoding="UTF-8"?><slave></slave>"""

    def __getattr__(self, *a):
        return self

    def __call__(self, *a, **kw):
        return {}

The fake_jenkins class is fairly specific: it sets some default behavior, returning valid XML when get_node_config is called. Next, create the fixture:

import pytest

@pytest.fixture(autouse=True)
def no_jenkins_requests(monkeypatch):
    monkeypatch.setattr("jenkins.Jenkins", lambda *a: fake_jenkins())

This fixture, in particular, is using two essential features: it defines autouse=True so that it is automatically called, and it depends on the monkeypatch fixture. The combination of these two features does the patching work automatically via the framework. Placing the fixture in conftest.py and setting the autouse argument to True makes jenkins.Jenkins return the patched version (fake_jenkins) for anything that calls it during the test session. No extra configuration or flags needed.

Other patching #

Patching is not only for modules, functions, or classes. The monkeypatch fixture offers a lot more than that. So far, I’ve only demonstrated the setattr() call that sets whatever attribute we need on the patched function, class, or class attribute. Just like setattr() sets the attribute, the delattr() removes attributes.

Another useful one is monkeypatch.setitem, which sets an item in a dictionary. This might not seem like a big deal, but applications that hold a global configuration based on dictionaries can easily get polluted after a test messes with values. setitem() safely adds what it needs to and cleans up afterward (like all patching callables).

Two other useful patching helpers are setenv() and delenv(), for when environment variables are needed to alter behavior in tests. Like all patching helpers, these also take care of leaving everything as it was before the manipulation.
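
As a rough sketch of how these helpers fit together, the tests below patch a dictionary-based configuration and an environment variable. The CONFIG dictionary, the retries() function, and the APP_RETRIES variable are all made up for the example:

```python
import os

# Hypothetical global configuration, standing in for an application's settings.
CONFIG = {"retries": 3}


def retries():
    # The (made-up) APP_RETRIES variable wins over the dictionary value.
    return int(os.environ.get("APP_RETRIES", CONFIG["retries"]))


def test_retries_from_config(monkeypatch):
    # Keep a value from the developer's shell from leaking into the test.
    monkeypatch.delenv("APP_RETRIES", raising=False)
    monkeypatch.setitem(CONFIG, "retries", 5)
    assert retries() == 5


def test_retries_from_environment(monkeypatch):
    monkeypatch.setenv("APP_RETRIES", "7")
    assert retries() == 7
```

Once each test finishes, CONFIG and the environment go back to what they were, so the order in which the tests run does not matter.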

This is the list of monkeypatch utilities:

monkeypatch.setattr(obj, name, value, raising=True)
monkeypatch.delattr(obj, name, raising=True)
monkeypatch.setitem(mapping, name, value)
monkeypatch.delitem(obj, name, raising=True)
monkeypatch.setenv(name, value, prepend=None)
monkeypatch.delenv(name, raising=True)
monkeypatch.syspath_prepend(path)
monkeypatch.chdir(path)

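The raising=True argument means that if the target doesn’t exist, the helper raises an error: an AttributeError for setattr() and delattr(), and a KeyError for delitem() and delenv(). Passing raising=False skips that check, so the patch is applied (or the deletion silently ignored) without any error handling.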
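
A small sketch of the difference, using a made-up Settings class that has no retries attribute:

```python
class Settings:
    # Hypothetical settings holder with a single attribute.
    timeout = 30


def test_missing_attribute_is_an_error_by_default(monkeypatch):
    try:
        monkeypatch.setattr(Settings, "retries", 3)
    except AttributeError:
        pass
    else:
        raise AssertionError("expected an AttributeError")
    assert not hasattr(Settings, "retries")


def test_raising_false_creates_the_attribute(monkeypatch):
    monkeypatch.setattr(Settings, "retries", 3, raising=False)
    assert Settings.retries == 3
```

The default is the safer choice: a typo in the attribute name fails loudly instead of quietly patching something the code never reads.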

Finally, the chdir() utility can change the current working path for the executing test. It is utilities like this one in the framework that make it so nice to work with. After going through some of the lesser-known monkeypatch capabilities, I’m sure I will be using those more often, rather than implementing them (poorly) on my own.
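
Here is a sketch of chdir() paired with pytest’s tmp_path fixture. The write_report() helper is hypothetical; it only exists to have something that depends on the current working directory:

```python
import os


def write_report(name="report.txt"):
    # Hypothetical helper that writes relative to the current working directory.
    with open(name, "w") as handle:
        handle.write("ok\n")
    return os.path.abspath(name)


def test_report_is_written_in_cwd(monkeypatch, tmp_path):
    # tmp_path is a per-test temporary directory provided by pytest.
    monkeypatch.chdir(tmp_path)
    write_report()
    assert (tmp_path / "report.txt").read_text() == "ok\n"
```

When the test ends, the working directory is restored, so nothing is left behind in the project tree.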

When not to monkeypatch #

I want to emphasize that complex code, with lots of conditionals and nesting and too many dependencies, is not only hard to test but also horrible to patch correctly. So much patching has to be done that the test becomes brittle. Write new functionality (or extract existing functionality) into smaller chunks that are easily testable.

The following test uses the mock library. It abuses the patching so that the state is just perfect for the code it needs to verify. It isn’t a good test because there is too much going on, and it is easy to break. When a test is too brittle, it defeats its purpose, which is to give confidence when developing. No one likes to fix several broken tests because the patching is wrong!

    @patch('barn.suite.util.find_git_parent')
    @patch('barn.suite.run.Run.schedule_jobs')
    @patch('barn.suite.util.has_packages_for_distro')
    @patch('barn.suite.util.get_package_versions')
    @patch('barn.suite.util.get_install_task_flavor')
    @patch('__builtin__.file')
    @patch('barn.suite.run.build_matrix')
    @patch('barn.suite.util.git_ls_remote')
    @patch('barn.suite.util.package_version_for_hash')
    @patch('barn.suite.util.git_validate_sha1')
    @patch('barn.suite.util.get_arch')
    def test_newest_failure(
        self,
        m_get_arch,
        m_git_validate_sha1,
        m_package_version_for_hash,
        m_git_ls_remote,
        m_build_matrix,
        m_file,
        m_get_install_task_flavor,
        m_get_package_versions,
        m_has_packages_for_distro,
        m_schedule_jobs,
        m_find_git_parent,
    ):

If more than three patches are needed, it is probably time to rethink the strategy. Not only is the above test brittle, but it is also hard to read and understand what is going on. Readable tests keep developers happy and keep you sane. If developers are happy and you have kept your sanity, it snowballs into writing more tests when code is modified or new features are added.

Once you fully understand how to patch and do it correctly, it is very tempting to try and patch everything all the time. It is the path of least resistance! You need to add a test, and you can be done with patching quickly or take a bit longer and possibly refactor the code. Which one to pick?

One thing I tend to do to avoid patching is to allow passing dependencies as arguments, even if these aren’t required, just to make testing easier. This is an example based on the official Python documentation on how to use the argparse module for command-line tools:

>>> parser = argparse.ArgumentParser(prog='myprogram')
>>> parser.print_help()

Use it to create a command-line tool in a single file named cli.py:

import argparse

def main():
    parser = argparse.ArgumentParser(prog='myprogram')
    # With no arguments, parse_args() reads from sys.argv
    parser.parse_args()
    parser.print_help()

if __name__ == '__main__':
    main()

It prints the help menu on the terminal when executed with python cli.py. To test different scenarios here requires messing with how Python sees the arguments via sys.argv. I don’t want to do that, and I want to avoid a possible collision. Allowing argv to be passed into the main() function makes it straightforward, but there is a catch: this is only useful for testing, and the regular sys.argv should be used otherwise. I use this pattern of optionally allowing it, falling back to the real sys.argv that is needed:

import argparse
import sys

def main(argv=None):
    # Fall back to the real arguments, minus the program name
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(prog='myprogram')
    parser.parse_args(argv)
    parser.print_help()

Setting argv=None and then switching to the real sys.argv when it is None allows the code to work regularly and makes it so much easier to test. Some refer to this technique as dependency injection. It means that dependencies are injected (passed in as arguments) into the code that needs them.

At test time, there is no need to patch; argv can be passed in:

>>> from cli import main
>>> main(argv=["--help"])
usage: myprogram [-h]

optional arguments:
  -h, --help  show this help message and exit
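
The same idea carries over to a test file. This is a sketch under a couple of assumptions that are not in the example above: main() returns the parsed arguments instead of printing the help menu, and the --verbose flag is invented to have something to assert on:

```python
import argparse
import sys


def main(argv=None):
    # Only fall back to the real command line when nothing was passed in.
    if argv is None:
        argv = sys.argv[1:]
    parser = argparse.ArgumentParser(prog="myprogram")
    parser.add_argument("--verbose", action="store_true")  # hypothetical flag
    return parser.parse_args(argv)


def test_verbose_flag():
    assert main(["--verbose"]).verbose is True


def test_defaults():
    # An empty list is a valid argv and must not fall back to sys.argv.
    assert main([]).verbose is False
```

Checking for None explicitly, rather than relying on truthiness, keeps an empty argument list from silently picking up whatever the test runner was started with.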

Patching builtin modules #

Don’t do it! Patching builtin modules is problematic because it affects the behavior of other libraries (including the test framework). One of the most used built-in modules is the os module. Patching anything in the os module risks unexpected behavior from other tools that are outside the control of the code under test.

If the code tested is using the os module directly, there are a couple of workarounds rather than attempting to patch it. One of them is adding an optional argument so that the dependency is injected.

This code uses os.walk to traverse the filesystem to find files that have the Python suffix:

import os

def find(path='.'):
    if path == '.':
        path = os.getcwd()
    python_files = []
    for root, dirs, files in os.walk(path):
        for _file in files:
            if _file.endswith('.py'):
                python_files.append(os.path.join(root, _file))
    return python_files

To add tests for this function, you shouldn’t patch os.walk. Modify it to use dependency injection as the first approach:

import os

def find(path='.', _walk=None):
    _walk = _walk or os.walk
    if path == '.':
        path = os.getcwd()
    python_files = []
    for root, dirs, files in _walk(path):
        for _file in files:
            if _file.endswith('.py'):
                python_files.append(os.path.join(root, _file))
    return python_files
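
With the injectable _walk argument in place, a test can hand in a generator that mimics what os.walk yields, with no patching involved. In this sketch the fake_walk() function and the /project path are made up, and find() is repeated so the example stands on its own:

```python
import os


def find(path='.', _walk=None):
    _walk = _walk or os.walk
    if path == '.':
        path = os.getcwd()
    python_files = []
    for root, dirs, files in _walk(path):
        for _file in files:
            if _file.endswith('.py'):
                python_files.append(os.path.join(root, _file))
    return python_files


def fake_walk(path):
    # Yields (root, dirs, files) tuples, the same shape os.walk produces.
    yield (path, ["pkg"], ["setup.py", "README.md"])
    yield (os.path.join(path, "pkg"), [], ["__init__.py", "data.json"])


def test_find_only_returns_python_files():
    result = find("/project", _walk=fake_walk)
    assert result == [
        os.path.join("/project", "setup.py"),
        os.path.join("/project", "pkg", "__init__.py"),
    ]
```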

By using the pattern of _walk = _walk or os.walk, it defaults to the real os.walk function unless a replacement is passed in (injected). The other alternative would be to extract the part where the filesystem is traversed:

import os

def walk(path):
    for root, dirs, files in os.walk(path):
        for _file in files:
            yield os.path.join(root, _file)

def find(path='.'):
    if path == '.':
        path = os.getcwd()
    python_files = []
    for _file in walk(path):
        if _file.endswith('.py'):
            python_files.append(_file)
    return python_files

Extracting and refactoring looks perfect for this example, as it adapts the walk() function to be a bit more useful by returning full paths, which in turn makes the find() function smaller and faster to grasp.
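
The extracted version can also be exercised against a real, throwaway directory instead of a fake. This is a sketch that assumes pytest’s tmp_path fixture; the file names are made up, and walk() and find() are repeated so the example stands on its own:

```python
import os


def walk(path):
    for root, dirs, files in os.walk(path):
        for _file in files:
            yield os.path.join(root, _file)


def find(path='.'):
    if path == '.':
        path = os.getcwd()
    python_files = []
    for _file in walk(path):
        if _file.endswith('.py'):
            python_files.append(_file)
    return python_files


def test_find_in_temporary_tree(tmp_path):
    # Build a tiny tree: one Python file in a subdirectory, one text file.
    (tmp_path / "pkg").mkdir()
    (tmp_path / "pkg" / "module.py").write_text("")
    (tmp_path / "notes.txt").write_text("")
    assert find(str(tmp_path)) == [str(tmp_path / "pkg" / "module.py")]
```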