Chapter 6: Build a web application with Flask #

Alfredo Deza

Continuing with great projects that Armin Ronacher has authored, we move towards the excellent Flask web framework. If you went through the chapter on building command-line applications with Click, you’ll find a few similarities in its API.

Web frameworks can be intimidating and complicated. As with most tools in any field, there is a hard balance to achieve between being very useful and being easy to use. The two sound similar, but they are certainly not the same. The struggle comes with covering more use cases and adding more options, which makes a tool harder to use. Back in 2010, I was working at a small startup that was using a framework called TurboGears. This framework used a way of routing requests to Python code called “object dispatch”. I struggled so much trying to understand this concept that it made me doubt I was the right person for the job. Object dispatching relies on classes and defining unique class methods to jump from one class to the other for each part of a URL. Not for the faint of heart!

Even though object dispatch is excellent (once you understand it, that is), it exemplifies this issue: it is a fantastic choice for many different things, including exceptional cases for power users, yet complicated for people getting started. As someone who co-authored a whole web framework based on object dispatch (called Pecan, if you are interested), I do believe that Flask is great for getting started and very well suited for broad applications.

Without web frameworks like Flask, dealing with HTTP requests in pure Python is very hard. I know because I was naive enough to try it, and I failed miserably.

HTTP Basics #

Before diving into framework-specific things, some HTTP basics need to be in place. I used to work with a friend of mine who is known as an HTTP stickler. I had to look up the word stickler to understand what was going on. A stickler is someone who insists on a specific quality or pattern, and I started learning from this friend that HTTP could get quite complicated beyond simple requests and responses.

It turns out that a lot happens to map an HTTP request to Python code at runtime. Is it a GET or POST request? What about JSON? There are quite a few HTTP verbs that are available, but most commonly, you handle a subset:

  • GET
  • POST
  • DELETE
  • PUT
  • HEAD

Those are the request methods, and they can optionally include a body. On the receiving end, servers (and therefore frameworks) have to account for those possibilities. The GET request is the most common one and is the one that happens in a web browser when you open a URL like www.google.com, for example. When a request gets sent, the client always expects a response, and here is where things get tricky: sometimes the network is not up and running when the response is issued. The client portion of the interaction (the web browser would be a client, and google.com would be a server) has to account for broken requests and set proper timeouts. There are all kinds of complicated interactions between clients and servers, and they both need to adhere to the HTTP specification.
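To make the request and response cycle concrete, here is a minimal sketch that uses only the Python standard library. The HelloHandler class and the fetch() helper are hypothetical names made up for this illustration; the point is only that a client sends a request, waits for a response, and has to decide what to do when the response never comes.

```python
import socket
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a small text body."""

    def do_GET(self):
        body = b'hello'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the terminal quiet for this sketch


def fetch(url, timeout=2.0):
    """Client side: return (status, body), or (None, reason) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status code
        return error.code, error.read()
    except (urllib.error.URLError, socket.timeout) as error:
        # No usable response: connection refused, network down, timeout
        return None, str(error).encode()


# Port 0 asks the operating system for any free port
server = HTTPServer(('127.0.0.1', 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = 'http://127.0.0.1:%d/' % server.server_address[1]

status, body = fetch(url)
print(status, body)  # → 200 b'hello'

server.shutdown()
server.server_close()

# With the server gone, the same request has no response to wait for
print(fetch(url, timeout=0.5)[0])
```

The timeout argument is what keeps the client from waiting forever; the right value depends on the application.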

Even in a simplified form, the interactions between a client (like a web browser) and a server are fundamental to understand, especially at the point where things aren’t quite working. In this chapter, we go through some requests and responses, and even some automatic behavior!

The most simple web application #

Most web frameworks have a simple one or two function example that directly routes a request to the Python code. Flask is no different, and it is, in fact, a single function with a couple of other lines (save it as basic.py):

from flask import Flask
app = Flask('Simple App')

@app.route('/')
def index():
    return 'index function'

Like all the other chapters in this book that require Python libraries, create a virtual environment, activate it, and then install the flask framework:

$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install "flask==1.1.1"
Collecting flask==1.1.1
  Using cached Flask-1.1.1-py2.py3-none-any.whl (94 kB)
Collecting Jinja2>=2.10.1
  Using cached Jinja2-2.11.1-py2.py3-none-any.whl (126 kB)
Collecting itsdangerous>=0.24
  Using cached itsdangerous-1.1.0-py2.py3-none-any.whl (16 kB)
Collecting Werkzeug>=0.15
  Using cached Werkzeug-1.0.0-py2.py3-none-any.whl (298 kB)
Collecting click>=5.1
  Using cached click-7.1.1-py2.py3-none-any.whl (82 kB)
Collecting MarkupSafe>=0.23
  Using cached MarkupSafe-1.1.1-cp38-cp38-macosx_10_9_x86_64.whl (16 kB)
Installing collected packages:
  MarkupSafe, Jinja2, itsdangerous, Werkzeug, click, flask
Successfully installed:
  Jinja2-2.11.1 MarkupSafe-1.1.1 Werkzeug-1.0.0 click-7.1.1
  flask-1.1.1 itsdangerous-1.1.0

The Flask class becomes the application after getting called with a name. Flask uses this name to work out where to look for other files that your app might need, which is why the later examples in this chapter pass __name__ instead of an arbitrary string like 'Simple App'. The code then uses app as a decorator on the index() function, through a callable named route() that indicates which URL should map to the function. Here, '/' means that the top-level URL for the application hits this function. To run this application, set an environment variable that points to the file with the application code; in this case, it is the basic.py file. On the same line, after defining the environment variable, call the flask executable to run the application:

$ FLASK_APP=basic.py flask run
 * Serving Flask app "basic.py"
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Open a web browser and use the address that is shown in the output (http://127.0.0.1:5000/ in the example output), or use curl in the command line to issue a GET request:

$ curl http://127.0.0.1:5000
index function

Try requesting a URL that is not mapped in the application to check the behavior when there is a failure:

$ curl http://127.0.0.1:5000/foo
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>

This is the most straightforward web application you can build with Flask. With a couple of concepts like the application and route callable, you are now ready to expand onto a new application that is bigger with more functionality.
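The same two checks can also be made without starting a server. The sketch below uses the test client that Flask applications provide through app.test_client(); it assumes Flask is installed in the active virtual environment and repeats the minimal application so that it runs on its own.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/')
def index():
    return 'index function'


# The test client sends requests straight to the application object,
# so nothing needs to be listening on port 5000
client = app.test_client()

found = client.get('/')
missing = client.get('/foo')

print(found.status_code, found.get_data(as_text=True))  # → 200 index function
print(missing.status_code)  # → 404
```

This is handy for quick experiments, although curl remains the better tool for seeing exactly what goes over the wire.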

URLs to Python code #

So far, a single route to the index() function is implemented. By using .route('/'), the code instructs Flask that any request that comes to the top-level URL (/) goes to the decorated function. What about other URLs and HTTP methods? Try sending a POST request with curl to the running Flask app to see what happens:

$ curl -X POST http://127.0.0.1:5000/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>

Not allowed! Because the function doesn’t explicitly define that it allows POST requests, the framework is rejecting it. To allow other methods, the decorator needs to be updated:

@app.route('/', methods=['GET', 'POST'])
def index():
    return 'index function'

Using curl again to submit the POST request works out well:

$ curl -X POST http://127.0.0.1:5000/
index function
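As a side-by-side illustration of the methods argument, this sketch defines two hypothetical routes (/get-only and /get-and-post are names invented for the example) and sends a POST to each one with the Flask test client.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/get-only')
def get_only():
    # No methods argument: POST is not accepted here
    return 'GET only'


@app.route('/get-and-post', methods=['GET', 'POST'])
def get_and_post():
    return 'GET or POST'


client = app.test_client()

rejected = client.post('/get-only')
accepted = client.post('/get-and-post')

print(rejected.status_code)  # → 405
# The rejection normally carries an Allow header naming the accepted methods
print(rejected.headers.get('Allow'))
print(accepted.status_code)  # → 200
```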

A POST request without information like JSON or any argument is not very useful. The framework allows functions to look inside the request to inspect the body or other aspects of the request, like the method. In this case, we have a single function that allows both the GET and POST requests, and using request can help us modify this to add behavior specific to POST:

from flask import Flask, request

app = Flask('Simple App')

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        if request.data:
            return 'Got a POST request with a body: %s' % str(request.data)
        else:
            return 'Got a POST request without a body!'
    else:
        return 'index function from a GET request'

Now that the function is handling different request methods, check the behavior in the command line. These types of checks are easier in the command line than with a browser (unlike before) because the requests need to be carefully crafted. Creating a POST request with custom headers and payloads in a browser isn’t trivial. Learning tools like curl is always a great way to step up your expertise!

First, a request with curl and a plain-text body:

$ curl -d 'a plain text body' -X POST http://127.0.0.1:5000/
Got a POST request without a body!

This looks unexpected. We sent a body with the -d flag. The problem is that curl labels data sent with -d as form data by default (Content-Type: application/x-www-form-urlencoded), so Flask parsed the body as a form and left request.data empty. Set the right header for the plain-text body and try again:

$ curl -d 'a plain text body' -H 'Content-Type: text/plain' \
  -X POST http://127.0.0.1:5000/
Got a POST request with a body: b'a plain text body'

Exactly what was supposed to happen. The body of the request got loaded into request.data. What would happen if we send JSON?

$ curl -d '{"key": "value"}'  -X POST http://127.0.0.1:5000/
Got a POST request without a body!

Similarly, this comes back empty because the request doesn’t advertise the right Content-Type header. Fix that and try again:

$ curl -d '{"key": "value"}' -H "Content-Type: application/json" \
   -X POST http://127.0.0.1:5000/
Got a POST request with a body: b'{"key": "value"}'
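The effect of the Content-Type header on request.data can be reproduced in a few lines. This is a sketch with the Flask test client, not a replacement for the curl commands; the form-encoded content type stands in for what curl sends by default with -d.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route('/', methods=['POST'])
def index():
    # Report the declared type and how many raw bytes request.data holds
    return '%s: %d bytes in request.data' % (request.mimetype, len(request.data))


client = app.test_client()
body = 'a plain text body'

# Form-encoded, like curl -d without an explicit header
as_form = client.post(
    '/', data=body, content_type='application/x-www-form-urlencoded'
)
# Explicit plain text
as_text = client.post('/', data=body, content_type='text/plain')

print(as_form.get_data(as_text=True))
# → application/x-www-form-urlencoded: 0 bytes in request.data
print(as_text.get_data(as_text=True))
# → text/plain: 17 bytes in request.data
```

In the form-encoded case the body is not lost; Flask exposes it through request.form instead.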

With the header in place the body arrives, but we are in the same place as with the plain text: a function that expects JSON would still have to parse the raw incoming bytes itself. The Flask framework has helpers that automatically load JSON for the function; the parsed JSON (a Python dictionary in this case) is available as request.json instead of request.data. This distinction is useful because a function that needs to tell apart the raw payload from what Flask loaded as JSON can check both attributes of the request object. Modify the function to use request.json and POST the same request again:

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'POST':
        if request.json:
            return 'Got a POST request with a body: %s' % str(request.json)
        else:
            return 'Got a POST request without a body!'
    else:
        return 'index function from a GET request'

Note how request.json is used instead of request.data. Now submit the same POST request with JSON:

$ curl -d '{"key": "value"}' -H "Content-Type: application/json"  -X POST http://127.0.0.1:5000/
Got a POST request with a body: {'key': 'value'}

The difference might be hard to detect, but the return value is now a Python dictionary. These types of facilities the framework offers are convenient, allowing a great deal of flexibility.
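A related helper worth knowing is request.get_json(), the method behind request.json. The sketch below is one possible way to use it, assuming you want a 400 response when the body is not JSON; the error message and the response shape are made up for the example.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/', methods=['POST'])
def index():
    # silent=True returns None instead of raising when the body is not JSON
    payload = request.get_json(silent=True)
    if payload is None:
        return jsonify(error='expected a JSON body'), 400
    return jsonify(received=payload, kind=type(payload).__name__)


client = app.test_client()

# json= serializes the dictionary and sets Content-Type: application/json
good = client.post('/', json={'key': 'value'})
bad = client.post('/', data='not json', content_type='text/plain')

print(good.status_code, good.get_json())
print(bad.status_code, bad.get_json())
```

Checking for None explicitly also keeps an empty JSON object ({}) from being mistaken for a missing body, which a plain truthiness check would do.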

Creating an application #

There are many great tutorials on building web applications. I even remember a couple of famous tutorials from a few years back that explained how to create a blogging app in less than ten minutes. Although Flask can scale nicely and be as simple as a single function, as I’ve shown in the previous examples, I think that building a useful application while keeping it as simple as possible is critical to gain more knowledge. This section creates an application that covers a few key aspects, like serving files and using templates, which haven’t come up yet in this chapter. Along the way, a few problems show up, and I try to correct them or work around them when there are no options.

Gaining necessary knowledge on how to get this done is more important than the application itself.

In my local development environment, I have multiple different Python projects that I clone from online repositories all the time. Some of them I work on a lot, and some others are left there without much attention. In a few of the critical projects, I run lots of tests and generate reports with the coverage tool. The coverage tool can generate an HTML report of how much of the production code the tests cover - a fundamental metric when creating unit tests. This web application does a directory listing of the path where all these projects exist and creates links to each one of them. The idea is that I want to be able to browse through those projects and then look at the generated HTML in the web browser.

Start by defining where your Python projects live so that the path can be used in the application. My projects live in /Users/alfredo/python, so I use that in the examples below. The first iteration of the application is to display every project:

import os
from flask import Flask, render_template
app = Flask(__name__)

@app.route('/')
def index():
    TOP_DIR='/Users/alfredo/python'
    python_projects = [i for i in os.listdir(TOP_DIR)]
    return render_template('index.html', files=python_projects)

Before running the application, a new concept needs an explanation: templating! If you’ve never seen HTML templates or worked with a template engine for the web before, they are pretty cool. They can be powerful because they avoid repetitive blocks of markup and can even handle logic to produce formatted HTML. In the current example, TOP_DIR is the location of the projects I want to show. The function lists that directory to capture the directory names and finally returns the result of a Flask helper called render_template(). The helper renders index.html and receives a keyword argument called files. The HTML file is the template, and this is how it looks:

<ul>
    {% for file in files %}
    <li><a href="{{ file }}">{{ file }}</a></li>
    {% endfor %}
</ul>

It looks like plain HTML except for the curly brackets and % characters. Anything within those brackets is a variable or a logic block. Within the double curly brackets, the template is using the name file which comes from looping the values of files. If you double-check the application code, you can see that files is what gets passed into the template. The template itself receives that information and can use it as Python mixed with HTML.
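To see what the template engine produces without running the application, the same markup can be rendered directly with Jinja2, the engine that gets installed alongside Flask. This is only a sketch: the template lives in a string instead of templates/index.html, and the project names are placeholders.

```python
from jinja2 import Template

# Same markup as index.html, kept in a string for the sake of the sketch
source = """<ul>
    {% for file in files %}
    <li><a href="{{ file }}">{{ file }}</a></li>
    {% endfor %}
</ul>"""

# The keyword argument name has to match the name used inside the template
html = Template(source).render(files=['remoto', 'pecan'])
print(html)
```

Each item in files produces one list entry, which is the repetition the template saves you from writing by hand.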

To make this work seamlessly, I’m relying on a convention of Flask applications which is to place templates in a directory called templates that is living in the same top directory as the application:

$ tree
.
├── app.py
└── templates
    └── index.html

1 directory, 2 files

Now that both the app.py file and the template are in place, run the application. Make sure that the TOP_DIR variable represents an actual path in your local environment where your Python projects live:

$ FLASK_APP=app.py flask run

Go to the URL, and the site displays a list of links, one for each directory in TOP_DIR.

The output is very rough but useful enough for our purposes. There is one big flaw, though: all the projects that exist in that directory are showing, not only the ones I’m working on. Further, I’m interested in only the ones that have an HTML coverage report available. This condition is essential because if there isn’t any HTML to display, then it doesn’t make sense to list the project, at least not for this particular application.

Remove the list comprehension and add a loop that checks if the project has the htmlcov/index.html file present in the directory:


@app.route('/')
def index():
    TOP_DIR='/Users/alfredo/python'

    python_projects = []
    for directory in os.listdir(TOP_DIR):
        index_path = os.path.join(TOP_DIR, directory, 'htmlcov/index.html')
        if os.path.exists(index_path):
            python_projects.append(directory)

    return render_template('index.html', files=python_projects)
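The filtering loop can be tried out away from Flask and away from your real projects. In this sketch, projects_with_coverage() is a hypothetical helper that takes the top directory as an argument, and a temporary directory stands in for TOP_DIR.

```python
import os
import tempfile


def projects_with_coverage(top_dir):
    """Return the directory names under top_dir that hold htmlcov/index.html."""
    found = []
    for directory in sorted(os.listdir(top_dir)):
        index_path = os.path.join(top_dir, directory, 'htmlcov', 'index.html')
        if os.path.isfile(index_path):
            found.append(directory)
    return found


# Build a throwaway layout: one project with a report, one without
with tempfile.TemporaryDirectory() as top_dir:
    os.makedirs(os.path.join(top_dir, 'with-report', 'htmlcov'))
    os.makedirs(os.path.join(top_dir, 'without-report'))
    report = os.path.join(top_dir, 'with-report', 'htmlcov', 'index.html')
    with open(report, 'w') as f:
        f.write('<h1>Coverage report</h1>')

    result = projects_with_coverage(top_dir)

print(result)  # → ['with-report']
```

Passing the directory as an argument instead of hard-coding it is what makes the helper easy to check in isolation.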

Rerun the application and reload the URL to see which projects meet the new constraints.

In my case, it is only a single project called remoto. This project has lots of tests already; all I had to do to generate the report was run the coverage tool in the top directory. Reusing the same environment as the Flask application, I installed pytest and pytest-cov to generate the coverage report. Installing pytest-cov also pulls in the coverage tool, which I then run:

$ pip install pytest pytest-cov
[...]
Successfully installed attrs-19.3.0 coverage-5.0.4 more-itertools-8.2.0
  packaging-20.3 pluggy-0.13.1 py-1.8.1 pyparsing-2.4.6 pytest-5.4.1
  pytest-cov-2.8.1 six-1.14.0 wcwidth-0.1.9

I ran the tests with pytest, and then I generated the report:

$ pytest --cov remoto
[...]
 111 passed in 4.20s
$ coverage html

That produces an htmlcov directory with lots of files, including the index.html file that the Flask application is looking for. At this point, I have everything that I need, except for the fact that if I click on the remoto link on the website, the application returns a 404 code because there is no handler for the /remoto path. Let’s fix that by introducing a new concept: Flask doesn’t need to know every project name to work; it can work with dynamic URL input. If I add ten more projects with coverage reports tomorrow, the application shouldn’t require ten more functions. I’m introducing a new function, and I see that the directory listing is going to be reused, so I extract that into a helper and create the new function. This is how that handling looks:

import os
from flask import Flask, abort, render_template, send_file
app = Flask(__name__)


def valid_projects():
    TOP_DIR='/Users/alfredo/python'
    python_projects = []
    for directory in os.listdir(TOP_DIR):
        index_path = os.path.join(TOP_DIR, directory, 'htmlcov/index.html')
        if os.path.exists(index_path):
            python_projects.append(directory)
    return python_projects


@app.route('/')
def index():
    return render_template('index.html', files=valid_projects())


@app.route('/<project>')
def project(project):
    TOP_DIR='/Users/alfredo/python'
    if project not in valid_projects():
        abort(404, description='Invalid Project, not found!')
    index_file = os.path.join(TOP_DIR, project, 'htmlcov/index.html')
    return send_file(index_file)

The application is importing a new Flask helper called send_file to serve a static file. The index.html is not a template. In this case, it is a fully functioning file on its own, so it has to be served as a static file. The new handler function (project()) returns a 404 if the project doesn’t have the coverage report available, using the abort() helper from Flask. Rerun the application and click on one of the available projects. Again, in my case, it is the remoto project, which now shows the coverage report.

Things don’t look quite right; the page doesn’t have any styling. Before going into the details, let’s examine what is going on. First, the application is listing all projects that have a valid htmlcov/index.html file present at the root of the directory. In my case, I only have one project that has this. If I click on the link, it takes me to the /remoto URL, which is handled by the project() function. This function is correctly returning the index.html that it finds for the remoto project. If you are curious and want to exercise the code, try accessing some other endpoint that might not exist, like /fake-project, to see what happens. A 404 should be returned, handled by the project() function.
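Both cases, a known project and a made-up one, can be exercised with the Flask test client. The sketch below is a trimmed-down version of the handler: it checks for the report file directly instead of calling valid_projects(), and it uses a temporary directory with a fake remoto report as a stand-in for TOP_DIR.

```python
import os
import shutil
import tempfile

from flask import Flask, abort, send_file

app = Flask(__name__)

# Throwaway stand-in for TOP_DIR with a single project that has a report
TOP_DIR = tempfile.mkdtemp()
os.makedirs(os.path.join(TOP_DIR, 'remoto', 'htmlcov'))
with open(os.path.join(TOP_DIR, 'remoto', 'htmlcov', 'index.html'), 'w') as f:
    f.write('<h1>Coverage report</h1>')


@app.route('/<project>')
def project(project):
    # Whatever comes after the slash arrives as the project argument
    index_file = os.path.join(TOP_DIR, project, 'htmlcov', 'index.html')
    if not os.path.isfile(index_file):
        abort(404, description='Invalid Project, not found!')
    return send_file(index_file)


client = app.test_client()

known = client.get('/remoto')
known_body = known.get_data(as_text=True)
unknown = client.get('/fake-project')

print(known.status_code, known_body)  # → 200 <h1>Coverage report</h1>
print(unknown.status_code)  # → 404

# Release the file handle before removing the temporary directory
known.close()
shutil.rmtree(TOP_DIR)
```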

On the server side of things, this is the output I have:

 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /remoto HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /style.css HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /jquery.min.js HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /keybd_closed.png HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /keybd_open.png HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /debounce.min.js HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /jquery.min.js HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /jquery.hotkeys.js HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /coverage_html.js HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /keybd_closed.png HTTP/1.1" 404 -
127.0.0.1 - - [05/Apr/2020 09:51:17] "GET /keybd_open.png HTTP/1.1" 404 -

What is going on here? There are multiple 404 (not found) responses for things like JavaScript and CSS files, which is why the page looks so odd. The index.html refers to its CSS and JavaScript requirements with relative paths. Because the page comes from the /remoto URL, which has no trailing slash, the browser resolves those paths against / and requests /style.css instead of a path under the project. Ideally, you want to serve static assets and other static files via a web server like Nginx and not with Flask directly. But getting into this sort of trouble by not following my recommendation is great because now you know a horrible workaround to get these things to work nicely!
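The way a browser resolves a relative path can be reproduced with urljoin() from the standard library, which follows the same URL resolution rules. A short sketch:

```python
from urllib.parse import urljoin

# Page served without a trailing slash: "remoto" is treated like a file name,
# so the relative asset resolves against the parent path "/"
without_slash = urljoin('http://127.0.0.1:5000/remoto', 'style.css')
print(without_slash)  # → http://127.0.0.1:5000/style.css

# Page served with a trailing slash: "remoto/" is treated like a directory
with_slash = urljoin('http://127.0.0.1:5000/remoto/', 'style.css')
print(with_slash)  # → http://127.0.0.1:5000/remoto/style.css
```

The first result matches the /style.css request that shows up as a 404 in the server output.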

Since I don’t have a function called remoto() to handle the /remoto URL, I have to rely on the dynamic function that is already implemented. The workaround involves adapting the handler to treat the project argument as something that might be a static file, and if so, serve it instead of giving a 404:

@app.route('/<project>')
def project(project):
    TOP_DIR='/Users/alfredo/python'
    if project not in valid_projects():
        # project may be a static file
        for valid_project in valid_projects():
            static_file = os.path.join(TOP_DIR, valid_project, 'htmlcov', project)
            if os.path.exists(static_file):
                return send_file(static_file)
        abort(404, description='Invalid Project, not found!')
    index_file = os.path.join(TOP_DIR, project, 'htmlcov/index.html')
    return send_file(index_file)

I am not proud of this function, but it gets the job done. After reloading the application and hitting the URL once again, the page looks way nicer.

And on the server side, where the application is running in the terminal, the output is full of 200s:

 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /remoto HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /style.css HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /debounce.min.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /tablesorter.min.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /jquery.hotkeys.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /jquery.min.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /coverage_html.js HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /keybd_closed.png HTTP/1.1" 200 -
127.0.0.1 - - [05/Apr/2020 18:02:45] "GET /keybd_open.png HTTP/1.1" 200 -
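For comparison, here is a sketch of one possible alternative to the workaround, which is not what the application above does. It assumes each report is served under a URL with a trailing slash (/remoto/), so that relative assets resolve to /remoto/style.css, and it uses Flask's send_from_directory() helper. The project_asset() function, the temporary directory, and the file contents are all made up for the illustration.

```python
import os
import shutil
import tempfile

from flask import Flask, send_from_directory

app = Flask(__name__)

# Throwaway stand-in for TOP_DIR: one project with a report and a stylesheet
TOP_DIR = tempfile.mkdtemp()
report_dir = os.path.join(TOP_DIR, 'remoto', 'htmlcov')
os.makedirs(report_dir)
for name, content in [
    ('index.html', '<link rel="stylesheet" href="style.css">'),
    ('style.css', 'body { margin: 0; }'),
]:
    with open(os.path.join(report_dir, name), 'w') as f:
        f.write(content)


@app.route('/<project>/')
def project(project):
    # The trailing slash makes relative assets resolve under /<project>/
    directory = os.path.join(TOP_DIR, project, 'htmlcov')
    return send_from_directory(directory, 'index.html')


@app.route('/<project>/<path:filename>')
def project_asset(project, filename):
    # send_from_directory answers with a 404 when the file does not exist
    directory = os.path.join(TOP_DIR, project, 'htmlcov')
    return send_from_directory(directory, filename)


client = app.test_client()
redirected = client.get('/remoto')  # Flask redirects to /remoto/
page = client.get('/remoto/')
asset = client.get('/remoto/style.css')
missing = client.get('/fake-project/')

statuses = [r.status_code for r in (redirected, page, asset, missing)]
print(statuses)

# Release file handles before removing the temporary directory
for response in (page, asset):
    response.close()
shutil.rmtree(TOP_DIR)
```

The trade-off is that every project gets its own URL space for assets, so two projects with a style.css no longer collide the way they can with the workaround.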