Chapter05 Command Line Tools

    Chapter 5: Build a command-line tool with Click #

    Alfredo Deza

    Command-line tools are fun. Most of my first projects using Python were command-line tools to automate tedious tasks at work as a system administrator. I describe them as one of the hidden concepts behind the DevOps culture, where automation is king. Whenever I’ve thought about solving a problem with a command-line tool, it is automation that I am thinking about.

    One of those tasks was to create a new issue tracker site, along with a sub-domain, for every new client the media agency had. This was particularly important for large clients and big projects that required lots of coordination with project managers, designers, and developers. From beginning to end, it would take me about half an hour to get all of these things wired up together: installations, configurations, DNS changes, and restarting or reloading the web server. The list of items to accomplish kept growing longer, and with it, the chance of missing a step or getting things out of order. Whenever I did make a mistake, I would have to backtrack, think about what exactly I had missed, and debug. No fun.

    Useful basics #

    Command-line tools follow specific rules and conventions; a few of those took me a while to get acquainted with. It is good to have an understanding of these environments to be productive. In most Linux and Unix environments, you have exit codes, for example. These are important because a non-zero exit code signals an error condition, while a zero exit status commonly indicates successful completion.

    The relationship between the operating system and the command-line tool is very tight. In Python, one can interact with the command-line arguments (if any) by using the sys.argv list from the sys module. For example, in this file (save it as cli.py):

    import sys
    # Print each command-line argument on its own line
    for item in sys.argv:
        print(item)
    

    Run it in the terminal directly with Python first, and later with some bogus arguments to see what happens:

    $ python cli.py
    cli.py
    $ python cli.py one-arg --flag
    cli.py
    one-arg
    --flag
    

    If you are curious, try printing out the whole sys.argv. It is a list! So under the covers, any library (including the default ones that come with Python) is looking at this list, trying to figure out flags, arguments, and any other value that might be present. One can wonder why utilities are needed if all you need is to poke inside the sys.argv list to determine what is going on. One of the many problems is that everything present in sys.argv is a string:

    import sys
    
    print(sys.argv)
    

    Save it as cli.py once again and run it in the terminal:

    $ python cli.py --foo 1 3 400 3.1
    ['cli.py', '--foo', '1', '3', '400', '3.1']
    

    If the command-line tool intends to deal with integers or floats, then this is a problem. Further, if the purpose of the --foo flag is to be a boolean flag (that signals True or False), then the tool needs to treat the next argument as separate (the '1' in this case). Lots of questions need solving: is it a flag or an argument that requires a value? If it is a value, what type of value? An integer, float, or perhaps a comma-separated list of values? Even for the most straightforward use cases, I would recommend understanding how tooling like argparse and click work to help out. Just like small scripts grow larger with time (and sometimes end in production), command-line tools grow options and flags too, and handling them directly with sys.argv is far from ideal.
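    To get a feel for how quickly manual handling grows, here is a minimal sketch of parsing the arguments from the example above by hand. The parse_args() helper and the layout of the dictionary it returns are made up for illustration; they are not part of any library.

```python
def parse_args(argv):
    """Split raw argument strings into a boolean flag and typed numbers."""
    options = {"foo": False, "values": []}
    for item in argv:
        if item == "--foo":
            # Treat --foo as a boolean flag that takes no value
            options["foo"] = True
            continue
        # Every item arrives as a string, so try int first, then float
        try:
            options["values"].append(int(item))
        except ValueError:
            options["values"].append(float(item))
    return options


# In a real script the input would be sys.argv[1:] (skipping the script name)
print(parse_args(["--foo", "1", "3", "400", "3.1"]))
```

    Even this tiny version has to decide what counts as a flag and which type each value has, and it still fails with an unhandled ValueError on input like "abc". That is the sort of work that argparse and Click take over.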

    I mentioned exit status but didn’t say how that is detected. Continuous integration and delivery platforms (referred to as CI/CD) like Jenkins, Gitlab, and CircleCI depend on this exit status to determine if a test run was successful or not. Run the cli.py file again and check for yourself what the exit code is:

    $ python cli.py
    cli.py
    $ echo $?
    0
    

    You check the exit status of the last program run in the terminal with echo $?. In this case, it is 0. With the sys module, you can force a non-zero exit status for any given condition you need:

    import sys
    
    sys.exit(72)
    

    Rerun the program:

    $ python cli.py
    $ echo $?
    72
    

    Being able to use any number from 0 to 255 (usually) gives you the ability to be as granular as you choose when signaling an error condition to other systems.
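    As a rough sketch of that granularity, the snippet below maps a couple of error conditions to distinct exit codes and then reads the exit status of a child process, which is similar to what a CI system does. The specific codes (3 and 4) and the helper names are assumptions picked for the example, not a convention.

```python
import subprocess
import sys

# Hypothetical exit codes for illustration only
EXIT_OK = 0
EXIT_MISSING_INPUT = 3
EXIT_BAD_VALUE = 4


def exit_code_for(argv):
    """Return the exit status this imaginary tool would use for argv."""
    if not argv:
        return EXIT_MISSING_INPUT
    if not argv[0].isdigit():
        return EXIT_BAD_VALUE
    return EXIT_OK


def run_and_report(code):
    """Run a child Python process that exits with `code`; return its status."""
    child = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return child.returncode


print(exit_code_for([]), exit_code_for(["abc"]), exit_code_for(["10"]))
print(run_and_report(72))
```

    In a real tool, the value returned by exit_code_for() would be handed to sys.exit(), and the calling system would branch on the status it gets back.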

    Standard library tools #

    Python does come with a few helpers to create useful command-line tools. But not everything that comes in the standard library is meant to have the best developer experience. One of the reasons is that the standard library in Python moves at a glacial pace. Changes have to wait for the next Python release, and they have to be fully backwards compatible in most cases. Libraries and tooling outside of that cycle aren’t bound by most of these restrictions, so they can prototype and implement incredible things very quickly.

    It’s useful to take a look at what Python has to offer to build command-line tools to understand why Noah and I recommend using a separate library. In this case, the module is argparse. Its API is very rich, and it can be daunting to browse the documentation to figure it out.

    Without much effort, and in its simplest form, the module gets you a help menu and some handling. Create a new cli.py file with the following:

    import argparse
    parser = argparse.ArgumentParser(
        description='My first command-line tool, it handles arguments and values'
    )
    parser.parse_args()
    

    In the terminal, it shouldn’t do anything or output anything unless the help flag gets passed in. Try again with -h:

    $ python cli.py -h
    usage: cli.py [-h]
    
    My first command-line tool, it handles arguments and values
    
    optional arguments:
      -h, --help  show this help message and exit
    

    A help menu! For free! It also takes care of one of my pet peeves with command-line tools by accepting both -h and --help to show it. There is not much going on in this file, and there aren’t any functions at all. To separate the argument handling from other functions that do other work, it is common (but not required!) to create a main() function and add some special syntax to call it:

    import argparse
    
    
    def main():
        parser = argparse.ArgumentParser(
            description='My first command-line tool, it handles arguments and values'
        )
        parser.parse_args()
    
    
    if __name__ == '__main__':
        main()
    

    The quirky if __name__ check at the bottom tells Python to execute the main() function only when the file is run directly on the command-line, and not when it is imported as a module. This type of separation is vital so that the code is contained and separated in individual functions as the program grows. You should avoid having everything piled into one large file without separation of concerns. Run the same script, and the results are going to be the same; even though everything is within the function, nothing else has changed in the flow of code.

    The argparse module has several utilities to help with arguments and processing them. You can use validators and ensure that the inputs are correct even before they hit code in the application. This is great because it avoids having to use try/except blocks everywhere dealing with the inputs, making code much cleaner and easier to read.
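    A minimal sketch of that kind of validation, assuming a tool that only accepts positive sizes. The positive_int() helper and the --unit option are invented for this example; the type and choices parameters are the argparse features doing the work.

```python
import argparse


def positive_int(value):
    """Convert a command-line string to an int and reject values below 1."""
    number = int(value)  # raises ValueError for non-numeric input
    if number < 1:
        raise argparse.ArgumentTypeError(f"{value} is not a positive integer")
    return number


def build_parser():
    parser = argparse.ArgumentParser(description="Validate inputs while parsing")
    # `type` converts and validates; `choices` restricts the accepted values
    parser.add_argument("size", type=positive_int, help="Size to work with")
    parser.add_argument("--unit", choices=["mb", "gb"], default="gb")
    return parser


# Passing a list stands in for what would normally come from sys.argv
args = build_parser().parse_args(["10", "--unit", "mb"])
print(args.size, args.unit)
```

    By the time the values reach the rest of the program, size is already an integer. Invalid input makes argparse print an error and exit with a non-zero status, so no try/except is needed in the application code.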

    One of the issues that I find with argparse, and that isn’t a problem with Click, is that adding sub-commands is done by instantiating other classes as “sub-parsers”. Grasping what these sub-parsers are and how they work is tricky, to say the least, even for me, and I’ve used argparse several times throughout the past decade in multiple command-line tools I’ve maintained.

    A few years ago, one of the tools I worked on had several different sub-commands. The tool had to deal with filesystem technology like LVM and ZFS. These sub-parsers had to implement logic that was somewhat similar, but distinct enough to require them to be separate. To roughly replicate the structure of the tool with argparse, create a new file called filesystems.py, and re-use the if __name__ pattern so that the first function to get executed is main():

    import argparse
    
    def main():
        parser = argparse.ArgumentParser(
            description="A tool that deals with filesystems"
        )
    
        parser.add_argument('--verbose', action='store_true', help='Produce more output')
    
    
        subparsers = parser.add_subparsers(help='Filesystem sub-commands')
    
        # LVM sub-command
        lvm_parser = subparsers.add_parser('lvm', help='The LVM sub-command')
        lvm_parser.add_argument('size', type=int, help='The size to work with')
    
        # ZFS sub-command
        zfs_parser = subparsers.add_parser('zfs', help='The ZFS sub-command')
        zfs_parser.add_argument('size', type=int, help='The size to work with')
    
        parser.parse_args()
    
    
    if __name__ == '__main__':
        main()
    

    There is quite a bit of information and new classes and methods that are getting called in argparse to make this whole apparatus work with sub-commands. Before taking it apart, run it with the help menu flag to check the output:

    $ python filesystems.py -h
    usage: filesystems.py [-h] [--verbose] {lvm,zfs} ...
    
    A tool that deals with filesystems
    
    positional arguments:
      {lvm,zfs}   Filesystem sub-commands
        lvm       The LVM sub-command
        zfs       The ZFS sub-command
    
    optional arguments:
      -h, --help  show this help message and exit
      --verbose   Produce more output
    

    Even though the tool defines sub-commands, the output looks clunky. I’m not sure what exactly “positional arguments” are supposed to be when what I implemented were sub-commands. The help output is a bit confusing, and although it can be improved by (once again) making more changes to the module, it is far from ideal. The impression is that it is a module that doesn’t support sub-commands.
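    For reference, this is a sketch of the kind of extra configuration that improves the help output. It assumes the same lvm and zfs sub-commands and uses the title and dest parameters of add_subparsers(); the build_parser() function is only a convenience for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="filesystems.py",
        description="A tool that deals with filesystems",
    )
    parser.add_argument(
        "--verbose", action="store_true", help="Produce more output"
    )
    # `title` renames the help section; `dest` records the chosen sub-command
    subparsers = parser.add_subparsers(title="sub-commands", dest="command")
    for name in ("lvm", "zfs"):
        sub = subparsers.add_parser(name, help=f"The {name.upper()} sub-command")
        sub.add_argument("size", type=int, help="The size to work with")
    return parser


# Passing a list stands in for what would normally come from sys.argv
args = build_parser().parse_args(["--verbose", "lvm", "1"])
print(args.command, args.size, args.verbose)
```

    With title set, the sub-commands get their own section in the help menu instead of being listed under “positional arguments”, and args.command tells the program which sub-command was requested. It works, but it is more to remember.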

    Sub-commands, then, are created by adding sub-parsers, which are associated with a parent parser that defines options and descriptions. Those sub-parsers need to define their own options, as I did with the size argument that needs to be an integer. A few tests in the terminal demonstrate the behavior with the other commands:

    $ python filesystems.py lvm -h
    usage: filesystems.py lvm [-h] size
    
    positional arguments:
      size        The size to work with
    
    optional arguments:
      -h, --help  show this help message and exit
    
    $ python filesystems.py zfs -h
    usage: filesystems.py zfs [-h] size
    
    positional arguments:
      size        The size to work with
    
    optional arguments:
      -h, --help  show this help message and exit
    

    So size shows up as a positional argument, just like when the sub-commands got defined. This is confusing! Is it a sub-command, a flag, or an argument? What is a positional argument? Try it out once again and see what happens:

    $ python filesystems.py lvm size
    usage: filesystems.py lvm [-h] size
    filesystems.py lvm: error: argument size: invalid int value: 'size'
    

    An error returns because size is defined as requiring an integer. My first impression is that perhaps I can add an integer to it to make it happy:

    $ python filesystems.py lvm size 1
    usage: filesystems.py lvm [-h] size
    filesystems.py lvm: error: argument size: invalid int value: 'size'
    

    The error handling here is pretty weak; it thinks that size is a value, and it completely ignores the integer. Again, these are things that can be handled better with more configuration in argparse. But these are the sort of things that have caused other tooling to proliferate and handle command-line construction better. In the end, what the filesystems.py file wants is this:

    $ python filesystems.py lvm 1
    

    In the next section, I use the same structure for this example to demonstrate how it looks with Click. You can decide what fits your use case better, but we highly recommend using Click whenever possible.

    Introduction to Click #

    Click is incredible. The project was started by Armin Ronacher, one of the most prolific Pythonistas out there, who has several (extremely) popular libraries under his belt. These libraries aren’t just popular; they are extremely well-written. These are just a few of the ones he has authored or co-authored throughout the years; see if you can recognize a few:

    • Pygments
    • Jinja
    • Sphinx
    • Werkzeug
    • Flask

    And of course, Click, which is the library we highly recommend for dealing with command-line tools. This is the simplest example you can write to create a command-line tool with Click:

    import click
    
    @click.command()
    def main():
        return
    
    if __name__ == '__main__':
        main()
    

    Run this in the command-line, and if you went through the argparse section, compare how it behaves now:

    $ python cli.py --help
    Usage: cli.py [OPTIONS]
    
    Options:
      --help  Show this message and exit.
    

    Almost identical! That is good. One thing that I wish would work out of the box is support for -h, --help, and help. Even though I consider this type of customization a little bit advanced, let’s make the changes necessary so that both -h and --help work with cli.py:

    import click
    
    CONTEXT_SETTINGS = dict(help_option_names=['-h', '--help'])
    
    @click.command(context_settings=CONTEXT_SETTINGS)
    def main():
        return
    
    if __name__ == '__main__':
        main()
    

    Because the help menu is built into Click, it requires some customization via the context_settings argument. Try it again to see how it behaves:

    $ python cli.py -h
    Usage: cli.py [OPTIONS]
    
    Options:
      -h, --help  Show this message and exit.
    
    $ python cli.py --help
    Usage: cli.py [OPTIONS]
    
    Options:
      -h, --help  Show this message and exit.
    

    One thing I haven’t mentioned up until now is that the Click framework uses decorators. If you are familiar with decorators and how they work, then this all feels very natural to work with. If you aren’t, decorators are a way to define functions that wrap other functions, that is: their input is a function. So @click.command is, in reality, a function that is taking main() as an input. In the process, that function is making some modifications. Decorators are a whole world of fun and have many different uses (they can even be classes). But knowing that they alter functions like main() is enough to understand Click.
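    If decorators are new to you, this plain-Python sketch shows the mechanics without involving Click at all. The announce() decorator is invented for the example.

```python
import functools


def announce(func):
    """A decorator: it receives a function and returns a wrapped version."""
    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


@announce  # equivalent to writing: main = announce(main)
def main():
    return "done"


print(main())
```

    Calling main() now prints "calling main" before returning "done", because the name main refers to the wrapper. Click’s decorators follow the same idea, except that what they hand back is a command object rather than a plain function.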

    Sub-commands #

    Now that there is some parity between the two cli.py files (one with Click, and the other one with argparse) in their help menus, dive in to get sub-commands working with a couple of options. First, create a new file called filesystems.py by copying the existing cli.py, and make these modifications:

    import click
    
    CONTEXT_SETTINGS = dict(help_option_names=['-h', '--help'])
    
    @click.command(context_settings=CONTEXT_SETTINGS)
    @click.option('--verbose', is_flag=True, help='Produce more output')
    def main(verbose):
        """ A Tool that deals with filesystems """
        return
    
    if __name__ == '__main__':
        main()
    

    A new option is added, called --verbose. By using is_flag=True, it tells the framework that this should be treated like a boolean flag (on or off behavior). Another addition is the docstring inside the function which is extracted by Click to produce the help output. Run it to see it in action:

    $ python filesystems.py -h
    Usage: filesystems.py [OPTIONS]
    
      A Tool that deals with filesystems
    
    Options:
      --verbose   Produce more output
      -h, --help  Show this message and exit.
    

    Now add the sub-commands. Keep the filesystems.py mostly the same, but make these changes:

    @click.group(context_settings=CONTEXT_SETTINGS)
    @click.option('--verbose', is_flag=True, help='Produce more output')
    def main(verbose):
        """ A Tool that deals with filesystems """
        return
    
    @main.command()
    def zfs():
        pass
    
    @main.command()
    def lvm():
        pass
    

    Two important changes occurred here. One is that @click.group decorates the main() function instead of @click.command. The other is that the new zfs() and lvm() functions are decorated with main.command(), that is, with the name of the main function. The first time I saw this, it took me a while to wrap my head around what was going on. These two other functions get decorated with something that seems magical because it doesn’t look like it is defined anywhere else. The framework makes that happen seamlessly (the group decorator turns main into an object that has a command() method), and it is how it ties everything together:

    $ python filesystems.py -h
    Usage: filesystems.py [OPTIONS] COMMAND [ARGS]...
    
      A Tool that deals with filesystems
    
    Options:
      --verbose   Produce more output
      -h, --help  Show this message and exit.
    
    Commands:
      lvm
      zfs
    

    Now both lvm and zfs appear correctly as sub-commands. Adding the size argument like in the argparse example is not too complicated; it needs its own decorator:

    @main.command()
    @click.argument('size')
    def zfs(size):
        pass
    
    @main.command()
    @click.argument('size')
    def lvm(size):
        pass
    

    Note how the framework requires click.argument instead of main.argument or even click.option. Good to have this distinction! The help menu takes care of explaining:

    $ python filesystems.py lvm --help
    Usage: filesystems.py lvm [OPTIONS] SIZE
    
    Options:
      -h, --help  Show this message and exit.
    

    One thing that is missing is the type of the argument itself. Just like argparse (and very unlike looking at sys.argv directly), the framework can map values into a type. The size in this case needs to be an integer, and this argument needs to be just one, not multiple:

    @main.command()
    @click.argument('size', type=int)
    def lvm(size):
        pass
    

    If any other type gets passed in, the tool complains. This is a form of validation, and allows the application to assume the value will always be an integer:

    $ python filesystems.py lvm some-invalid-input
    Usage: filesystems.py lvm [OPTIONS] SIZE
    Try 'filesystems.py lvm -h' for help.
    
    Error: Invalid value for 'SIZE': some-invalid-input is not a valid integer
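    The same validation can be exercised without a terminal. This sketch goes one step further than the example above and assumes that sizes below 1 should also be rejected, using click.IntRange; it relies on the CliRunner helper from click.testing to call the tool in-process.

```python
import click
from click.testing import CliRunner


@click.group()
def main():
    """A Tool that deals with filesystems"""


@main.command()
# IntRange converts to int and also rejects values below the minimum
@click.argument("size", type=click.IntRange(min=1))
def lvm(size):
    click.echo(f"size is {size}")


runner = CliRunner()
ok = runner.invoke(main, ["lvm", "100"])
bad = runner.invoke(main, ["lvm", "some-invalid-input"])
print(ok.exit_code, ok.output.strip())
print(bad.exit_code)
```

    The valid call exits with 0, and the invalid one exits with a non-zero status, which ties back to the exit codes from the beginning of the chapter.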
    

    Accessing values #

    Since Click requires commands and sub-commands to declare the options and arguments as function arguments, that is essentially how you can interact with the input. If I remove the size argument from the lvm() function, Click complains:

    $ python filesystems.py lvm 100
    Traceback (most recent call last):
      File "filesystems.py", line 22, in <module>
        main()
      File "lib/python3.8/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "lib/python3.8/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "lib/python3.8/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "lib/python3.8/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "lib/python3.8/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
    TypeError: lvm() got an unexpected keyword argument 'size'
    

    The way to access values from arguments and options in the command-line, then, is to declare them as arguments of the function, just like in a normal Python function:

    @main.command()
    @click.argument('size', type=int)
    def lvm(size):
        if size > 50:
            print('Got a large-enough size!')
        else:
            print('Size of %s might not be enough for this operation' % size)
    

    The print statements show in the output now:

    $ python filesystems.py lvm 100
    Got a large-enough size!
    

    One advanced feature of the framework is that a context object can get passed around from one command to the other. This special object has information about arguments and other options and has access to the parent command. When sub-commands need to poke inside other parameters and flags, or do something interesting that may affect other sub-commands, that is when click.pass_context should be used:

    @click.group(context_settings=CONTEXT_SETTINGS)
    @click.option('--verbose', is_flag=True, help='Produce more output')
    @click.pass_context
    def main(ctx, verbose):
        """ A Tool that deals with filesystems """
        return
    

    Now that the main() function uses the context decorator, it must accept it as an argument. Other sub-commands can make use of the context now:

    @main.command()
    @click.argument('size', type=int)
    @click.pass_context
    def lvm(ctx, size):
        print('Parent params: %s' % ctx.parent.params)
        print('Current params: %s' % ctx.params)
    

    The lvm() function is looking into the parameters of the current and parent commands, but there are lots of other advanced things one can do with the context. Run it to see it print out some information about the context:

    $ python filesystems.py lvm 100
    Parent params: {'verbose': False}
    Current params: {'size': 100}
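    One of those other uses is sharing state through the context. The sketch below assumes the group should hand the --verbose flag down to its sub-commands; it stores the value on ctx.obj, which sub-command contexts inherit from the parent, and uses CliRunner from click.testing to call the tool in-process.

```python
import click
from click.testing import CliRunner


@click.group()
@click.option("--verbose", is_flag=True, help="Produce more output")
@click.pass_context
def main(ctx, verbose):
    """A Tool that deals with filesystems"""
    # ctx.obj is free-form storage that sub-command contexts inherit
    ctx.ensure_object(dict)
    ctx.obj["verbose"] = verbose


@main.command()
@click.argument("size", type=int)
@click.pass_context
def lvm(ctx, size):
    if ctx.obj["verbose"]:
        click.echo("verbose mode is on")
    click.echo(f"size: {size}")


result = CliRunner().invoke(main, ["--verbose", "lvm", "100"])
print(result.output)
```

    This keeps the sub-command from reaching into ctx.parent.params directly, so the group decides what it exposes to the commands below it.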
    

    Recommendations #

    Before Click existed, I preferred to avoid argparse at all costs and even rolled my own clunky version of a command-line framework. The hard requirement of classes that have to be instantiated with all kinds of parameters, and the lack of real support for sub-commands, are annoying enough in argparse that Click should be a top choice.

    As I’ve mentioned in this chapter, not having both -h and --help defined is annoying in command-line tools. If a user is trying to get help, let’s try to make it as easy as possible. I once had an argument with a developer where he wanted to display a link to the online documentation instead of displaying it in the terminal. This developer was lazy because he felt it was too much work to update both the command-line help and the online version of the documentation. Be as helpful as you can, with rich information on flags.

    When adding descriptions to flags and arguments, use less software-engineering wording and more real-life wording. For example, refer to flags as actions or options that affect behavior; don’t refer to them by a Python class or internal object name that nobody (except perhaps you) understands. For instance, in a command-line tool I started working with, the input is the name of a container. If this container is not found, the following error message returns:

    Error: cannot use input image string (no discovered imageDigest)
    

    What is an “image string”? And what is an imageDigest? If I look at the function that implements it, the imageDigest is an internal object. This makes it easier for the engineer implementing it, but hard for the user to read the error report.
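    As a sketch of what a friendlier report could look like with Click, the example below raises click.ClickException with wording aimed at the user. The inspect command and the KNOWN_CONTAINERS set are hypothetical stand-ins and have nothing to do with the tool that produced the message above.

```python
import click
from click.testing import CliRunner

# Hypothetical stand-in for a real lookup of known containers
KNOWN_CONTAINERS = {"web", "db"}


@click.command()
@click.argument("container")
def inspect(container):
    """Show details about CONTAINER."""
    if container not in KNOWN_CONTAINERS:
        # ClickException prints "Error: <message>" and exits with status 1
        raise click.ClickException(
            f"Container '{container}' was not found. "
            f"Available containers: {', '.join(sorted(KNOWN_CONTAINERS))}"
        )
    click.echo(f"{container} is running")


result = CliRunner().invoke(inspect, ["cache"])
print(result.exit_code)
print(result.output)
```

    The message names the thing the user typed and what they can try next, and no internal object names leak into the output.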

    Writing useful command-line tools is hard work, and like most good things, they take some effort to get right. Take a look at other tools that you like and see what features and positive things you can implement yourself. Perhaps you find other nifty ways to have a useful help menu!