Patrick's Software Blog

Learning Python Web Application Development with Flask

Using Docker for Flask Application Development (not just Production!)

Introduction

I’ve been using Docker for my staging and production environments, but I’ve recently figured out how to make Docker work for my development environment as well.

When I work on my personal web applications, I have three environments:

  • Production – the actual application that serves the users
  • Staging – a replica of the production environment on my laptop
  • Development – the environment where I write source code, unit/integration test, debug, integrate, etc.

While having a development environment that is significantly different (i.e., not using Docker) from the staging/production environments is not necessarily an issue, I’ve really enjoyed the switch to using Docker for development.

The key aspects that were important to me when deciding to switch to Docker for my development environment were:

  1. Utilize the Flask development server instead of a production web server (Gunicorn)
  2. Allow easy access to my database (Postgres)
  3. Maintain my unit/integration testing capability

This blog post shows how to configure Docker and Docker Compose for creating a development environment that you can easily use on a day-to-day basis for developing a Flask application.

For reference, my Flask project that is the basis for this blog post can be found on GitLab.

Architecture

The architecture for this Flask application is illustrated in the following diagram:

Docker Application Architecture

Each key component has its own sub-directory in the repository:

$ tree
.
├── docker-compose.yml
├── nginx
│   ├── Dockerfile
│   ├── family_recipes.conf
│   └── nginx.conf
├── postgresql
│   └── Dockerfile  * Not included in git repository
└── web
    ├── Dockerfile
    ├── create_postgres_dockerfile.py
    ├── instance
    ├── project
    ├── requirements.txt
    └── run.py

Configuration of Dockerfiles and Docker Compose for Production

The setup for my application utilizes separate Dockerfiles for the web application, Nginx, and Postgres; these services are integrated together using Docker Compose.

Web Application Service

Originally, I had been using the python-*:onbuild image for my web application image, as this seemed like a convenient and reasonable option (it provided the standard configurations for a python project). However, according to the notes on the python page on Docker Hub, the use of the python-*:onbuild images is no longer recommended.

Therefore, I created a Dockerfile that I use for my web application:

FROM python:3.6.1
MAINTAINER Patrick Kennedy <patkennedy79@gmail.com>

# Create the working directory
RUN mkdir -p /usr/src/app/web
WORKDIR /usr/src/app/web

# Install the package dependencies (this step is separated
# from copying all the source code to avoid having to
# re-install all python packages defined in requirements.txt
# whenever any source code change is made)
COPY requirements.txt /usr/src/app/web
RUN pip install --no-cache-dir -r requirements.txt

# Copy the source code into the container
COPY . /usr/src/app/web

It may seem odd or out of sequence to copy the requirements.txt file from the local system into the container separately from the rest of the repository, but this is intentional. If you copy over the entire repository and then ‘pip install’ all the packages in requirements.txt, any change in the repository will invalidate the cache and cause all the packages to be re-installed when you build this container (which can take a long time and is unnecessary). A better approach is to first copy over just the requirements.txt file and then run ‘pip install’. If changes are made to the repository (but not to requirements.txt), then the cached intermediate container (or layer in your service) will be utilized. This is a big time saver, especially during development. Of course, if you do change requirements.txt, this will be detected during the next build and all the python packages will be re-installed in the intermediate container.

Nginx Service

Here is the Dockerfile that I use for my Nginx service:

FROM nginx:1.11.3
RUN rm /etc/nginx/nginx.conf
COPY nginx.conf /etc/nginx/
RUN rm /etc/nginx/conf.d/default.conf
COPY family_recipes.conf /etc/nginx/conf.d/

There is a lot of complexity when it comes to configuring Nginx, so please refer to my blog post entitled ‘How to Configure Nginx for a Flask Web Application’.

Postgres Service

The Dockerfile for the postgres service is very simple, but I actually use a python script (create_postgres_dockerfile.py) to auto-generate it based on the credentials of my postgres database. The structure of the Dockerfile is:

FROM postgres:9.6

# Set environment variables
ENV POSTGRES_USER <postgres_user>
ENV POSTGRES_PASSWORD <postgres_password>
ENV POSTGRES_DB <postgres_database>
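For reference, here is a minimal sketch of what create_postgres_dockerfile.py might look like; the credential values and output path are illustrative assumptions, not the actual script from the repository:

# create_postgres_dockerfile.py - a minimal sketch (hypothetical values);
# the real script fills in the credentials of the postgres database
DOCKERFILE_TEMPLATE = """FROM postgres:9.6

# Set environment variables
ENV POSTGRES_USER {user}
ENV POSTGRES_PASSWORD {password}
ENV POSTGRES_DB {database}
"""

if __name__ == '__main__':
    contents = DOCKERFILE_TEMPLATE.format(user='my_user',
                                          password='my_password',
                                          database='my_database')
    # Write the generated Dockerfile into the postgresql/ sub-directory
    with open('Dockerfile', 'w') as dockerfile:
        dockerfile.write(contents)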

Docker Compose

Docker Compose is a great tool for connecting different services (i.e., containers) to create a fully functioning application. The configuration of the application is defined in the docker-compose.yml file:

version: '2'

services:
  web:
    restart: always
    build: ./web
    expose:
      - "8000"
    volumes:
      - /usr/src/app/web/project/static
    command: /usr/local/bin/gunicorn -w 2 -b :8000 project:app
    depends_on:
      - postgres

  nginx:
    restart: always
    build: ./nginx
    ports:
      - "80:80"
    volumes:
      - /www/static
    volumes_from:
      - web
    depends_on:
      - web

  data:
    image: postgres:9.6
    volumes:
      - /var/lib/postgresql
    command: "true"

  postgres:
    restart: always
    build: ./postgresql
    volumes_from:
      - data
    expose:
      - "5432"

The following commands need to be run to build and then start these containers:

  docker-compose build
  docker-compose -f docker-compose.yml up -d

Additionally, I utilize a script to re-initialize the database, which is frequently used in the staging environment:

  docker-compose run --rm web python ./instance/db_create.py

To see the application, open your favorite web browser and navigate to http://ip_of_docker_machine/; this will often be http://192.168.99.100/. The command ‘docker-machine ip’ will tell you the IP address to use.

Changes Needed for Development Environment

The easiest way to make the necessary changes for the development environment is to create the changes in the docker-compose.override.yml file.

Docker Compose automatically checks for docker-compose.yml and docker-compose.override.yml when the ‘up’ command is used. Therefore, in development use ‘docker-compose up -d’ and in production or staging use ‘docker-compose -f docker-compose.yml up -d’ to prevent the loading of docker-compose.override.yml.

Here are the contents of the docker-compose.override.yml file:

version: '2'

services:
  web:
    build: ./web
    ports:
      - "5000:5000"
    environment:
      - FLASK_APP=run.py
      - FLASK_DEBUG=1
    volumes:
      - ./web/:/usr/src/app/web
    command: flask run --host=0.0.0.0

  postgres:
    ports:
      - "5432:5432"

Each setting in docker-compose.override.yml overrides the applicable setting in docker-compose.yml.

Web Application Service

For the web application container, the web server is switched from Gunicorn (used in production) to the Flask development server. The Flask development server auto-reloads the application whenever a change is made and provides debugging capability right in the browser when an exception occurs. These are great features to have during development. Additionally, port 5000 is now accessible from the web application container, which allows the developer to access the Flask web server directly by navigating to http://ip_of_docker_machine:5000.

Postgres Service

For the postgres container, the only change that is made is to allow access to port 5432 by the host machine instead of just other services. For reference, here is a good explanation of the use of ‘ports’ vs. ‘expose’ from Stack Overflow.

This change allows direct access to the postgres database using the psql shell. When accessing the postgres database, I prefer specifying the URI:

psql postgresql://<username>:<password>@192.168.99.100:5432/<postgres_database>

Direct access to the postgres database will come in really handy at some point during development (almost a guarantee).

Nginx Service

While there are no override settings for the Nginx service, this service is basically ignored during development, as the web application is accessed directly through the Flask web server by navigating to http://ip_of_docker_machine:5000/. I have not found a clean way to disable a service, so the Nginx service is left untouched.

Running the Development Application

The following commands should be run to build and run the containers:

docker-compose stop    # If there are existing containers running, stop them
docker-compose build
docker-compose up -d

Since you are running in a development environment with the Flask development server, you will need to navigate to http://ip_of_docker_machine:5000/ to access the application (for example, http://192.168.99.100:5000/). The command ‘docker-machine ip’ will tell you the IP address to use.

Another helpful command that allows quick access to the logs of a specific container is:

docker-compose logs <service>

For example, to see the logs of the web application, run ‘docker-compose logs web’. In the development environment, you should see something similar to:

$ docker-compose logs web
Attaching to flaskrecipeapp_web_1
web_1 | * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
web_1 | * Restarting with stat
web_1 | * Debugger is active!
web_1 | * Debugger pin code: ***-***-***

Conclusion

Docker is an amazing product that I have really come to enjoy using for my development environment. I really feel that using Docker makes you think about your entire architecture, as Docker provides such an easy way to start integrating complex services, like web services, databases, etc.

Using Docker for a development environment does require a good deal of setup, but once you have the configuration working, it’s a great way to quickly develop your application while still keeping one foot in the production environment.

References

Docker Compose File (version 2) Reference
https://docs.docker.com/compose/compose-file/compose-file-v2/

Dockerizing Flask With Compose and Machine – From Localhost to the Cloud
NOTE: This was the blog post that got me really excited to learn about Docker!
https://realpython.com/blog/python/dockerizing-flask-with-compose-and-machine-from-localhost-to-the-cloud/

Docker Compose for Development and Production – GitHub – Antonis Kalipetis
https://github.com/akalipetis/docker-compose-dev-prod
Also, check out Antonis’ talk from DockerCon17 on YouTube.

Overview of Docker Compose CLI
https://docs.docker.com/compose/reference/overview/

Docker Command Reference

Start or Re-start Docker Machine:
$ docker-machine start default
$ eval $(docker-machine env default)

Build all of the images in preparation for running your application:
$ docker-compose build

Using Docker Compose to run the multi-container application (in daemon mode):
$ docker-compose up -d
$ docker-compose -f docker-compose.yml up -d

View the logs from the different running containers:
$ docker-compose logs
$ docker-compose logs web # or whatever service you want

Stop all of the containers that were started by Docker Compose:
$ docker-compose stop

Run a command in a specific container:
$ docker-compose run --rm web python ./instance/db_create.py
$ docker-compose run web bash

Check the containers that are running:
$ docker ps

Stop all running containers:
$ docker stop $(docker ps -a -q)

Delete all containers:
$ docker rm $(docker ps -a -q)

Delete all untagged Docker images:
$ docker rmi $(docker images | grep "^<none>" | awk '{print $3}')

Software Development Checklist for Python Applications

Introduction

After resurrecting one of the first python applications that I wrote, I realized that I had learned a lot about the python language and ecosystem since my first adventure with the language. After going through the updates that I wanted to make to this application, I found that I had created a checklist of software development concepts that would be beneficial to most python projects.

This list is inspired by the classic article by Joel Spolsky entitled “The Joel Test: 12 Steps to Better Code”. I provide some narrative around each checklist item, including links to more detailed descriptions or tutorials to help with learning more about the concept.

Checklist

1. Are you using virtual environments to manage the packages used by your application?

Virtual environments are a must for every python project. Virtual environments isolate the packages used for a specific application, so you can easily use different versions of packages for different projects.

I’ve been really happy using virtualenvwrapper. I wrote a blog post about how to use the virtualenvwrapper module.

As of python 3.3, there is now a standard library module (venv) available for creating virtual environments.
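For example, creating and activating a virtual environment with the built-in venv module looks like this (the environment name ‘venv’ is just a common convention):

$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt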

2. Are you utilizing a source control tool to manage your source code?

The de facto source control tool is git. It is widely used because it is a great tool. I highly recommend learning the more complex aspects of git to gain the benefits of this powerful tool. Pro Git is a great book for learning the details of git.

3. Are you using a centralized source repository (GitLab, GitHub, BitBucket) to enable collaboration and sharing?

I’m a big fan of GitLab as it has so many features combined (source control, bug tracking, and CI). However, all of the popular choices (GitLab, GitHub, and BitBucket) are good choices.

For an example of a GitLab project, here is the project that I’ve updated as I developed this checklist: https://gitlab.com/patkennedy79/picture_video_organizer

4. Do you have a README file to assist both active developers and first-time viewers?

The README file is often the first impression that people get of your project.

I’ve found that a good README file can benefit both the developers of a project and the people viewing the project for the first time. A good README file should appeal to both groups by including the following:

  • Purpose of the project
  • Screenshot or visual aid of the project
  • How to run the application and run the tests for the project
  • Current status of the project (test status: pass/fail, test coverage)

For a good overview of what the README should contain, I recommend the following blog post from Dan Bader: Write a Great README.

5. Are you testing (unit/integration/function) the application?

Testing a python application has become quite easy and there are lots of great options to help with quickly writing and running tests. In terms of frameworks, both unittest (part of the standard library) and pytest are great choices. Additionally, I’ve used nose2 as a unit test runner to easily run my unit tests.

I wrote a blog post about how to use the unittest module for writing unit tests.
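As a minimal illustration, a unittest test case looks like this (the test and assertion here are placeholders rather than tests from an actual project):

import unittest


class TestExample(unittest.TestCase):
    """A trivial test case illustrating the unittest structure."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


if __name__ == '__main__':
    unittest.main()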

If you’re using the Flask web framework, check out my blog post about writing tests for a Flask application.

6. Are you using at least one static analysis tool to check for common errors in your application?

Linters are static analysis tools that check your source code for coding style errors, design issues, and bad design patterns. There are a lot of great options available, but I would recommend the following modules:

flake8
pydocstyle

The best way to run these tools is via a Continuous Integration (CI) tool…

7. Are you using Continuous Integration (CI) to automate your testing?

Setting up a Continuous Integration (CI) process just makes life easier. It’s easy to set up a CI process to run all your tests (unit/integration) and run linters. You no longer have to worry about remembering to run all your test cases and check the linter results manually. There are lots of great options for CI: TravisCI, Jenkins, GitLab, etc.

I have a CI system configured on GitLab for a command-line (CLI) application that runs the unit tests and executes two linters (flake8 and pydocstyle) whenever a new commit is made. This is all defined in a single file (.gitlab-ci.yml).
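As a rough sketch of what such a .gitlab-ci.yml file could look like (the image tag, job names, and commands are assumptions, not my actual configuration):

image: python:3.6.1

before_script:
  - pip install -r requirements.txt

unit_tests:
  script:
    - nose2

flake8:
  script:
    - flake8 .

pydocstyle:
  script:
    - pydocstyle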

8. Are you thoroughly documenting your code, including the use of a document generation tool (Sphinx)?

The documentation of your project is important for getting other people to understand your source code, but it is also really important for the developers who are trying to remember what they did in a specific module six months ago.

I recommend using Sphinx to help generate nice documentation. I wrote a blog post about how to setup Sphinx for a python project.

9. Are you using a logging tool to track the execution of your application?

Using print statements to debug an application might work for getting something running, but I really prefer using a logging tool to track the execution of an application. The standard library module (logging) is a bit complex to set up, but it works quite well.

Check out my blog post about how to use the logging module for more information.

10. Are you tracking bugs and enhancement ideas?

Assuming that you’re using a centralized source repository (GitLab, GitHub, BitBucket), there should be a built-in system for tracking defects and enhancements. This is pretty simple… you need to document what bugs there are with your project so that you can work down these issues; you need to document good ideas to enhance your project so that you know what improvements to make.

11. Can you easily run the application with limited introduction to the tool?

The ideal situation for an application is being able to run it with a single command. If you need to execute multiple manual steps, you’re just asking for weird problems to arise and for people to lose interest in your project.

While there might need to be some configuration for your application, this process should be easy to follow. You should ideally be able to run your application with a single command.

Honorable Mentions (or Future Checklist Items)

1. Docker

The use of containers for running and deploying applications is a great idea, especially for web applications. There is a lot to learn when it comes to Docker, so I would recommend starting out with the book “Docker for Developers” by Chris Tankersley. This book provides a great introduction into the history of containers and clearly describes all the different aspects/components of Docker.

I’ve recently switched to using Docker for developing and deploying my Flask web applications, and I documented the why and how in separate blog posts.

2. Debugging

At some point in the development of a project, you are likely to run into a difficult problem that can’t be easily solved using print statements or logging statements. This might be the time to use a debugger to understand the details of how the program is executing. The pdb module is a standard library module for debugging python source code.

A great resource for learning about pdb is the tutorial from the Python Module of the Week site: https://pymotw.com/2/pdb/
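As a quick taste of pdb, dropping a set_trace() call into your code pauses execution at that line and opens an interactive debugging prompt (the function here is just a placeholder):

import pdb


def average(values):
    # Execution pauses at the next line; inspect variables with 'p',
    # step with 'n'/'s', and continue with 'c'
    pdb.set_trace()
    return sum(values) / len(values)


print(average([1, 2, 3]))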

3. Profiling

Profiling is a dynamic (i.e., while your application is running) analysis that measures how long it takes all the pieces of your application to run. This analysis can be really beneficial in identifying parts of your source code that take up large amounts of execution time (it’s often not the parts that you suspect!).
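The standard library’s cProfile module is one place to start; for example, running it from the command line (assuming your entry point is app.py) prints a table of call counts and cumulative times:

$ python -m cProfile -s cumtime app.py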

Conclusion

As I was coming up with this list, I realized just how amazing the python ecosystem has become. The number of great tools that are available for building web applications, writing unit tests, performing static analysis, etc. is just incredible.

Having a good understanding of the python modules that are available for you can really help save you tons of time in terms of testing, documenting, and maintaining your source code.

Python Logging Tutorial

Introduction

Somewhere in between getting your python project to run and getting to the point where even a debugger won’t help you find a bug, you might realize that creating a log file of what your program is doing could be really beneficial. I’ve typically found that once you get a working version of your program, you’ll want to really understand what is happening during subsequent executions of the program. The simplest (note: not best) way to accomplish this is to put lots of print statements in your code. However, this is really a bad idea, as you’re going to get lots of output in your console and you are likely going to delete these print statements once the bug is resolved. So what’s a better approach? Logging!

Python has a built-in library called ‘logging’, which is a great library for logging your program’s execution to a file. The tutorial for ‘logging’ provides a good range of examples from basic to more advanced uses of the library. In this blog post, however, I want to illustrate how I got the ‘logging’ module to work successfully.

The source code for these examples can be found on GitLab, organized by tags:
https://gitlab.com/patkennedy79/python_logging

Example #1 – Single File Application

tag: v0.1 – https://gitlab.com/patkennedy79/python_logging/tree/v0.1

The first example is a simple program to help illustrate the basic capabilities of the ‘logging’ module. This program consists of a single file named app.py that contains a single class:

class FirstClass(object):
    def __init__(self):
        self.current_number = 0

    def increment_number(self):
        self.current_number += 1

    def decrement_number(self):
        self.current_number -= 1

    def clear_number(self):
        self.current_number = 0

number = FirstClass()
number.increment_number()
number.increment_number()
print("Current number: %s" % str(number.current_number))
number.clear_number()
print("Current number: %s" % str(number.current_number))

You can run this program by executing:

$ python app.py 
Current number: 2
Current number: 0

This program uses two print statements to print to the console. This is fine for getting a program to work, but switching over to logging messages is a better long-term approach.

Configuring the Logging Module

The configuration of the logging module can get complex as you start to specify more and more details of how the logging should be performed. I’ve found the following sequence provides a good configuration for:

  • setting the logging severity level
  • setting the file to log messages to
  • setting the format of the log messages

In order to configure the logging module, add ‘import logging’ to the top of app.py and update the constructor for the FirstClass class to:

    def __init__(self):
        self.current_number = 0

        # Create the Logger
        self.logger = logging.getLogger(__name__)
        self.logger.setLevel(logging.WARNING)

        # Create the Handler for logging data to a file
        logger_handler = logging.FileHandler('python_logging.log')
        logger_handler.setLevel(logging.WARNING)

        # Create a Formatter for formatting the log messages
        logger_formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')

        # Add the Formatter to the Handler
        logger_handler.setFormatter(logger_formatter)

        # Add the Handler to the Logger
        self.logger.addHandler(logger_handler)
        self.logger.info('Completed configuring logger()!')

The constructor configures the usage of the logging module and finishes by logging a message that the configuration of the logger is completed.

Adding Log Messages

In order to add log messages, you can utilize one of the methods on the logger object to log messages at different severity levels:

    def increment_number(self):
        self.current_number += 1
        self.logger.warning('Incrementing number!')
        self.logger.info('Still incrementing number!!')

    def clear_number(self):
        self.current_number = 0
        self.logger.warning('Clearing number!')
        self.logger.info('Still clearing number!!')

If you run the program again, you’ll still see the same console output:

$ python app.py 
Current number: 2
Current number: 0

Logging Severity Levels

To see what the logging module is doing, check the log file that was created:

$ cat python_logging.log 
__main__ - WARNING - Incrementing number!
__main__ - WARNING - Incrementing number!
__main__ - WARNING - Clearing number!

Interesting, is that what you expected? I initially thought that there should be two “Still incrementing number!!” statements, but they don’t get displayed. Why? Well, the ‘logging’ module has 5 severity levels:

  • DEBUG (lowest)
  • INFO
  • WARNING
  • ERROR
  • CRITICAL (highest)

Our program sets the logging severity to WARNING (which also happens to be the default), which means that any log message with a lower severity than WARNING will not be displayed. Hence, the INFO messages are not displayed.
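For reference, each severity level has a corresponding method on the logger object (the messages here are illustrative):

self.logger.debug('Detailed information for diagnosing problems')
self.logger.info('Confirmation that things are working as expected')
self.logger.warning('Something unexpected happened')
self.logger.error('A serious problem prevented an operation from completing')
self.logger.critical('The program may be unable to continue running')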

Change the following lines in __init__() to lower the severity threshold from WARNING to DEBUG:

        # Create the Logger
        self.logger = logging.getLogger(__name__)
        self.logger.setLevel(logging.DEBUG)

        # Create the Handler for logging data to a file
        logger_handler = logging.FileHandler('python_logging.log')
        logger_handler.setLevel(logging.DEBUG)

Now run the program again and check the log file to see that all of the log messages are included:

$ cat python_logging.log 
__main__ - WARNING - Incrementing number!
__main__ - WARNING - Incrementing number!
__main__ - WARNING - Clearing number!
__main__ - INFO - Completed configuring logger()!
__main__ - WARNING - Incrementing number!
__main__ - INFO - Still incrementing number!!
__main__ - WARNING - Incrementing number!
__main__ - INFO - Still incrementing number!!
__main__ - WARNING - Clearing number!
__main__ - INFO - Still clearing number!!

Example #2 – Logging in a Module

tag: v0.2 – https://gitlab.com/patkennedy79/python_logging/tree/v0.2

The second example is slightly more complex, as it updates the structure of the program to include a package with a single module:

python_logging
    python_logging
        __init__.py
        first_class.py
    run.py

The first_class.py file basically contains the FirstClass class that was created in the first example:

import logging


class FirstClass(object):
    def __init__(self):
        self.current_number = 0

        # Create the Logger
        self.logger = logging.getLogger(__name__)
        self.logger.setLevel(logging.DEBUG)

        # Create the Handler for logging data to a file
        logger_handler = logging.FileHandler('python_logging.log')
        logger_handler.setLevel(logging.DEBUG)

        # Create a Formatter for formatting the log messages
        logger_formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')

        # Add the Formatter to the Handler
        logger_handler.setFormatter(logger_formatter)

        # Add the Handler to the Logger
        self.logger.addHandler(logger_handler)
        self.logger.info('Completed configuring logger()!')

    def increment_number(self):
        self.current_number += 1
        self.logger.warning('Incrementing number!')
        self.logger.info('Still incrementing number!!')

    def decrement_number(self):
        self.current_number -= 1

    def clear_number(self):
        self.current_number = 0
        self.logger.warning('Clearing number!')
        self.logger.info('Still clearing number!!')

In order to utilize this module, update the run.py file in the top-level directory to import and use the FirstClass class:

from python_logging.first_class import FirstClass


number = FirstClass()
number.increment_number()
number.increment_number()
print("Current number: %s" % str(number.current_number))
number.clear_number()
print("Current number: %s" % str(number.current_number))

Check that the log file is unchanged.

Example #3: Logging in a Package (Part I)

tag: v0.3 – https://gitlab.com/patkennedy79/python_logging/tree/v0.3

The third example adds a second class to the python_logging package to show how to configure the logging module for a whole package. Here is the structure of this example:

python_logging
    python_logging
        __init__.py
        first_class.py
        second_class.py
    run.py

Here is the basic version of the second_class.py file:

class SecondClass(object):
    def __init__(self):
        self.enabled = False

    def enable_system(self):
        self.enabled = True

    def disable_system(self):
        self.enabled = False

Your first inclination might be to duplicate the configuration of the logger in the constructor of this class (copy from first_class.py). This would result in a lot of unnecessary, repetitive code. A better option is to move the configuration of the logger to the __init__.py file:

from os import path, remove
import logging
import logging.config

from .first_class import FirstClass
from .second_class import SecondClass


# If applicable, delete the existing log file to generate a fresh log file during each execution
if path.isfile("python_logging.log"):
    remove("python_logging.log")

# Create the Logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# Create the Handler for logging data to a file
logger_handler = logging.FileHandler('python_logging.log')
logger_handler.setLevel(logging.DEBUG)

# Create a Formatter for formatting the log messages
logger_formatter = logging.Formatter('%(name)s - %(levelname)s - %(message)s')

# Add the Formatter to the Handler
logger_handler.setFormatter(logger_formatter)

# Add the Handler to the Logger
logger.addHandler(logger_handler)
logger.info('Completed configuring logger()!')

Depending on the type of program you are creating, you might find it beneficial to delete any existing log files prior to logging any new messages from this execution of the program. One option to consider if you want to maintain an on-going log of an application is the RotatingFileHandler within the logging module.
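Here is a sketch of how that swap might look in __init__.py, using RotatingFileHandler in place of FileHandler (the size and backup limits are arbitrary choices):

import logging
from logging.handlers import RotatingFileHandler

# Roll over to a new file at roughly 1 MB and keep the five most recent
# log files, instead of deleting the log at each start-up
logger_handler = RotatingFileHandler('python_logging.log',
                                     maxBytes=1024 * 1024,
                                     backupCount=5)
logger_handler.setLevel(logging.DEBUG)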

Now that the configuration of the logging module is done in __init__.py, the second_class.py file can be greatly simplified to utilize the logger, instead of having to worry about configuring it first:

import logging


class SecondClass(object):
    def __init__(self):
        self.enabled = False
        self.logger = logging.getLogger(__name__)

    def enable_system(self):
        self.enabled = True
        self.logger.warning('Enabling system!')
        self.logger.info('Still enabling system!!')

    def disable_system(self):
        self.enabled = False
        self.logger.warning('Disabling system!')
        self.logger.info('Still disabling system!!')

There are similar updates to the first_class.py file.

Finally, the updates to __init__.py result in needing the following updates to run.py:

from python_logging import FirstClass, SecondClass


number = FirstClass()
number.increment_number()
number.increment_number()
print("Current number: %s" % str(number.current_number))
number.clear_number()
print("Current number: %s" % str(number.current_number))

system = SecondClass()
system.enable_system()
system.disable_system()
print("Current system configuration: %s" % str(system.enabled))

Try running the program again and looking at the log file:

$ cat python_logging.log 
python_logging - INFO - Completed configuring logger()!
python_logging.first_class - WARNING - Incrementing number!
python_logging.first_class - INFO - Still incrementing number!!
python_logging.first_class - WARNING - Incrementing number!
python_logging.first_class - INFO - Still incrementing number!!
python_logging.first_class - WARNING - Clearing number!
python_logging.first_class - INFO - Still clearing number!!
python_logging.second_class - WARNING - Enabling system!
python_logging.second_class - INFO - Still enabling system!!
python_logging.second_class - WARNING - Disabling system!
python_logging.second_class - INFO - Still disabling system!!

Notice how the module names are printed! This is a really handy feature to quickly identify where specific operations are happening.

Example #4: Logging in a Package (Part II)

tag: v0.4 – https://gitlab.com/patkennedy79/python_logging/tree/v0.4

The fourth (and final) example expands upon the logging capability that was added to the package by using an input file (JSON) to configure the logger. Keep an eye out for how the logging calls in the source code are unaffected by this configuration change…

The first change for this example is to __init__.py to change the configuration of the logger to utilize a JSON input file:

from os import path, remove
import logging
import logging.config
import json

from .first_class import FirstClass
from .second_class import SecondClass


# If applicable, delete the existing log file to generate a fresh log file during each execution
if path.isfile("python_logging.log"):
    remove("python_logging.log")

with open("python_logging_configuration.json", 'r') as logging_configuration_file:
    config_dict = json.load(logging_configuration_file)

logging.config.dictConfig(config_dict)

# Log that the logger was configured
logger = logging.getLogger(__name__)
logger.info('Completed configuring logger()!')

Now that we have the code to process the input file, let’s define the input file (python_logging_configuration.json). Place this file in the top-level folder; since the program is run from the top-level folder, the JSON file will be in the current/working directory and therefore accessible to the python interpreter.

Here is the configuration file:

{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "simple": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        }
    },

    "handlers": {
        "file_handler": {
            "class": "logging.FileHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "filename": "python_logging.log",
            "encoding": "utf8"
        }
    },

    "root": {
        "level": "DEBUG",
        "handlers": ["file_handler"]
    }
}

Run the program again and take a look at the log file:

$ cat python_logging.log 
2017-03-09 22:32:01,846 - python_logging - INFO - Completed configuring logger()!
2017-03-09 22:32:01,847 - python_logging.first_class - WARNING - Incrementing number!
2017-03-09 22:32:01,847 - python_logging.first_class - INFO - Still incrementing number!!
2017-03-09 22:32:01,847 - python_logging.first_class - WARNING - Incrementing number!
2017-03-09 22:32:01,847 - python_logging.first_class - INFO - Still incrementing number!!
2017-03-09 22:32:01,847 - python_logging.first_class - WARNING - Clearing number!
2017-03-09 22:32:01,847 - python_logging.first_class - INFO - Still clearing number!!
2017-03-09 22:32:01,848 - python_logging.second_class - WARNING - Enabling system!
2017-03-09 22:32:01,848 - python_logging.second_class - INFO - Still enabling system!!
2017-03-09 22:32:01,848 - python_logging.second_class - WARNING - Disabling system!
2017-03-09 22:32:01,848 - python_logging.second_class - INFO - Still disabling system!!

The log file is almost the same, but the date/time stamp has been added to each log message. I like this format of logging messages: date/time - package.module - severity - log message

If JSON is not your favorite, the input file can also be defined as a YAML file. Here’s an example: Good Logging Practice in Python
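For instance, the loading code in __init__.py would only change slightly (a sketch, assuming the third-party PyYAML package and a YAML file that mirrors the JSON structure above):

import logging.config

import yaml  # third-party PyYAML package

with open("python_logging_configuration.yaml", 'r') as logging_configuration_file:
    config_dict = yaml.safe_load(logging_configuration_file)

logging.config.dictConfig(config_dict)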

Conclusion

Moving beyond print statements to actually logging messages is made easy thanks to the logging module that is built in to python. The logging module requires a bit of configuration, but that is a small price to pay for such an easy-to-use module.

I’ve found that having a log file for a program, especially command-line applications, provides a great way to understand what the program is doing. You might go weeks without looking at the log file if things are working well, but having a lot of data about your program’s execution readily available is so beneficial when things go wrong.

References

Python Logging Module Documentation: https://docs.python.org/3/howto/logging.html

Blog Post on Good Logging Practice in Python: https://fangpenlin.com/posts/2012/08/26/good-logging-practice-in-python/
 
Blog Post on Logging in Python: https://www.relaxdiego.com/2014/07/logging-in-python.html

Receiving Files with a Flask REST API

Introduction

Over the past two months, I’ve spent a lot of time learning about designing and implementing REST APIs. It’s been a lot of fun learning what a REST API is and I really enjoyed learning how to implement one from scratch. While I thought about writing a series of blog posts about what I’ve learned, I think there are already two excellent resources available from Miguel Grinberg.

After using these guides, I decided to implement a REST API for one of my websites that allows you to store recipes (including an image of the recipe). The one area that I struggled with was receiving a file via the API, so I wanted to document how I designed, implemented, and tested this feature of the API.

The source code for this project can be found on my GitLab page: https://gitlab.com/patkennedy79/flask_recipe_app

Design

The concept of sending a file and the associated metadata to a REST API has many design options, as outlined on the following Stack Overflow discussion: Posting a File and Associated Data to a RESTful Webservice

After researching the choices available, I really liked the following sequence:

  1. Client creates a new recipe (via a POST)
  2. Server returns a URL to the new resource
  3. Client sends the image (via PUT)

This sequence feels different from the process used via a web form, where all the data and the image are sent together as a ‘multipart/form-data’ POST. However, the concept of creating a new recipe (via POST) and then following up with the upload of the image to a specific URL seems like a clean and straightforward implementation for the API.

Implementation

One of my biggest takeaways from Miguel Grinberg’s Building Web APIs with Flask (video course) was the concept of minimizing the amount of logic/code in your routes (views.py) and pushing the complexity to your database interface (models.py). The idea behind this is to keep your routes as simple to understand as possible to improve maintainability. I really like this philosophy and I’ll be utilizing it during this implementation.

The first change for being able to process files is to update the import_data() method in the Recipe model (…/web/project/models.py). Previously, this method imported only the JSON data associated with a recipe, but now it also needs to be able to import an image (the image-handling branch at the top of the try block is new):

    def import_data(self, request):
        """Import the data for this recipe by either saving the image associated
        with this recipe or saving the metadata associated with the recipe. If
        the metadata is being processed, the title and description of the recipe
        must always be specified."""
        try:
            if 'recipe_image' in request.files:
                filename = images.save(request.files['recipe_image'])
                self.image_filename = filename
                self.image_url = images.url(filename)
            else:
                json_data = request.get_json()
                self.recipe_title = json_data['title']
                self.recipe_description = json_data['description']
                if 'recipe_type' in json_data:
                    self.recipe_type = json_data['recipe_type']
                if 'rating' in json_data:
                    self.rating = json_data['rating']
                if 'ingredients' in json_data:
                    self.ingredients = json_data['ingredients']
                if 'recipe_steps' in json_data:
                    self.recipe_steps = json_data['recipe_steps']
                if 'inspiration' in json_data:
                    self.inspiration = json_data['inspiration']
        except KeyError as e:
            raise ValidationError('Invalid recipe: missing ' + e.args[0])
        return self

This method assumes that the image will be defined as ‘recipe_image’ within the request.files dictionary. If this element is defined, then the image is saved and the image filename and URL are extracted.

The remainder of the method is unchanged for processing the JSON data associated with a recipe.

The route that uses the import_data() function is the api1_2_update_recipe() function (defined in …/web/project/recipes_api/views.py):

@recipes_api_blueprint.route('/api/v1_2/recipes/<int:recipe_id>', methods=['PUT'])
def api1_2_update_recipe(recipe_id):
    recipe = Recipe.query.get_or_404(recipe_id)
    recipe.import_data(request)
    db.session.add(recipe)
    db.session.commit()
    return jsonify({'result': 'True'})

The simplicity of this function is amazing. All of the error handling is handled elsewhere, so you’re just left with what the true purpose of the function is: save data to an existing recipe in the database.

Since the import_data() method associated with the Recipe model was changed, the function for creating a new recipe via the API needs to be updated as well to pass in the request instead of the JSON data extracted from the request:

@recipes_api_blueprint.route('/api/v1_2/recipes', methods=['POST'])
def api1_2_create_recipe():
    new_recipe = Recipe()
    new_recipe.import_data(request)
    db.session.add(new_recipe)
    db.session.commit()
    return jsonify({}), 201, {'Location': new_recipe.get_url()}

Testing

There is already a test suite for the Recipes API Blueprint, so expanding these tests is the best approach for testing the receipt of images. The tests are defined in …/web/project/tests/ and each test suite has a filename that starts with ‘test_’ so it can be discovered by Nose2.

Here is the test case for sending a file:

    def test_recipes_api_sending_file(self):
        headers = self.get_headers_authenticated_admin()
        with open(os.path.join('project', 'tests', 'IMG_6127.JPG'), 'rb') as fp:
            file = FileStorage(fp)
            response = self.app.put('/api/v1_2/recipes/2', data={'recipe_image': file}, headers=headers,
                                    content_type='multipart/form-data', follow_redirects=True)
            json_data = json.loads(response.data.decode('utf-8'))

            self.assertEqual(response.status_code, 200)
            self.assertIn('True', json_data['result'])

This test utilizes an image (IMG_6127.JPG) that I’ve copied into the same folder as the test suites (…/web/project/tests/). The test first calls get_headers_authenticated_admin(), which is defined in the Helper Functions section of this test suite since it is utilized by multiple test cases. Next, the image file is opened using a context manager to ensure proper cleanup. The PUT call to ‘/api/v1_2/recipes/2’ includes the header with the authentication keys, the content type of ‘multipart/form-data’, and the actual image defined in the data dictionary. Note how the file is associated with ‘recipe_image’ in the data dictionary to ensure that it is processed properly by the API.

The response from the PUT call is then checked to make sure the result was successful and the status code returned is 200, as expected.

Common Issue Encountered

I ran into a lot of problems trying to get this unit test to work properly. One error message that I got numerous times was along the lines of:

requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))

This error message feels a bit misleading, as it typically indicates that the server did not properly process the file that the client (which could be a unit test case) attempted to send. Treat it as an error with saving the file on the server side, and double-check the following section in …/web/project/models.py:

                filename = images.save(request.files['recipe_image'])
                self.image_filename = filename
                self.image_url = images.url(filename)

Client Application

Getting the right combination of inputs for the unit test was a frustrating exercise, but necessary to develop a test case for this new functionality. Luckily, developing a simple client application with the Requests module from Kenneth Reitz is a much more enjoyable process.

Here’s a simple application to test out some of the functionality of the Recipes API:

import requests


URL_BASE = 'http://localhost:5000/'
auth = ('EMAIL_ADDRESS', 'PASSWORD')

# API v1.2 - Get Authentication Token
print('Retrieving authentication token...')
url = URL_BASE + 'get-auth-token'
r = requests.get(url, auth=auth)
print(r.status_code)
print(r.headers)
auth_request = r.json()
token_auth = (auth_request['token'], 'unused')

# API v1.2 - GET (All)
print('Retrieving all recipes...')
url = URL_BASE + 'api/v1_2/recipes'
r = requests.get(url, auth=token_auth)
print(r.status_code)
print(r.text)

# API v1.2 - PUT (Metadata)
print('Updating recipe #2...')
url = URL_BASE + 'api/v1_2/recipes/2'
json_data = {'title': 'Updated recipe', 'description': 'My favorite recipe'}
r = requests.put(url, json=json_data, auth=token_auth)
print(r.status_code)
print(r.text)

# API v1.2 - PUT (Add image)
print('Updating recipe #2 with recipe image...')
url = URL_BASE + 'api/v1_2/recipes/2'
r = requests.put(url, auth=token_auth, files={'recipe_image': open('IMG_6127.JPG', 'rb')})
print(r.status_code)
print(r.text)

Utilizing the Requests module to write this simple client application was a joy, especially compared to getting the test case to work. Sending a file in a PUT call requires a single line of code (the ‘files’ argument in the final requests.put() call).

In order to run this client application, you need to have the application running with the development server in a separate terminal window:

(ffr_env) $ python run.py 
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

In another window, run the client:

$ python testing_api.py

You’ll see lots of text go flying by, but take a look at the output from the Flask development server:

127.0.0.1 - - [23/Jan/2017 22:16:22] "GET /get-auth-token HTTP/1.1" 200 -
127.0.0.1 - - [23/Jan/2017 22:16:22] "GET /api/v1_2/recipes HTTP/1.1" 200 -
127.0.0.1 - - [23/Jan/2017 22:16:22] "PUT /api/v1_2/recipes/2 HTTP/1.1" 200 -
127.0.0.1 - - [23/Jan/2017 22:16:23] "PUT /api/v1_2/recipes/2 HTTP/1.1" 200 -

This shows that each request to the server was handled successfully (as seen by the 200 status code sent back for each request). The JSON data returned from each request also confirms the proper response from the server.

Conclusion

I’ve thoroughly enjoyed learning how to design, implement, and test a REST API in Flask. Having a REST API for your web application is a great addition. My recommendation is to develop your own REST API from scratch, especially if you are new to REST or developing APIs. Later, you may want to consider a module (such as Flask-RESTful) to assist you in developing a REST API.

The materials from Miguel Grinberg are excellent and I highly recommend them to anyone interested in REST APIs for Flask. This blog post was intended to supplement that material by describing how to handle receiving an image with a REST API.

For the complete source code discussed in this blog post, please see my GitLab page: https://gitlab.com/patkennedy79/flask_recipe_app

Creating Charts with Chart.js in a Flask Application

Introduction

As I’ve been learning about developing websites using Python and Flask, I’ve had to learn a decent amount of HTML and CSS in order to make the web pages look decent.  Bootstrap has been the primary framework that I’ve been using to make the web pages look good.  Up to this point, I’ve avoided doing anything with Javascript, as I’ve been satisfied with the combination of HTML/CSS/Bootstrap for the websites that I’ve been developing.

My main motivation for learning Javascript is that I wanted to be able to generate charts/graphs in my web applications.  I found the Chart.js library to be awesome, so that was the motivation for learning Javascript!  Chart.js has almost 27,000 stars on GitHub as of mid-December 2016. Based on my experience, I would highly recommend learning the basics of Javascript first (I thought the Codecademy course called Learning Javascript provided a great introduction).
 
This blog post provides an introduction to Chart.js and provides three examples (of increasing complexity) of line charts.   The documentation for Chart.js is excellent, so this blog post should be considered a supplement to the documentation with a focus on using Chart.js within a Flask application.
 
The source code from this blog post can be found on GitLab with separate tags (example1, example2, and example3) for each example: https://gitlab.com/patkennedy79/flask_chartjs_example

Structure of the Application

The structure of this application will be simple:

.
├── README.md
├── app.py
├── requirements.txt
├── static
│   └── Chart.min.js
└── templates
    ├── chart.html
    ├── line_chart.html
    └── time_chart.html

While there is the option to use a CDN (Content Delivery Network) site to get the Chart.js library, I’ve chosen to just download the library and store it in the static/ directory. In the third example, I’ll show the other method of using a CDN for retrieving the Moment.js library.

Within the templates directory, there are separate template files for each of the three examples:

  • Example #1 – chart.html
  • Example #2 – line_chart.html
  • Example #3 – time_chart.html

Example 1: Simple Line Chart

The first example illustrates using a simple Flask application to supply the data that Chart.js displays as a simple line chart.  I’ll be focusing on the Javascript side of this application, but please refer to my Flask Tutorial for a more detailed description of how to create a Flask application.

Flask Application

The Flask application is contained within the app.py file:
 

from flask import Flask
from flask import render_template


app = Flask(__name__)


@app.route("/simple_chart")
def chart():
    legend = 'Monthly Data'
    labels = ["January", "February", "March", "April", "May", "June", "July", "August"]
    values = [10, 9, 8, 7, 6, 4, 7, 8]
    return render_template('chart.html', values=values, labels=labels, legend=legend)


if __name__ == "__main__":
    app.run(debug=True)

This file creates a new Flask application which has a single route (‘/simple_chart’) that will render the chart.html file. The data being passed to the chart.html template is a set of values for the first 8 months of the year (just for illustrative purposes).

Template File (including Javascript)

The template file (chart.html) is a combination of a number of languages:

  • HTML
  • Jinja2 template scripts
  • Javascript

In order to use the Chart.js library, the ‘Chart.min.js’ file needs to be specified in the ‘head’ section:

  <head>
    <meta charset="utf-8" />
    <title>Chart.js Example</title>
    <!-- import plugin script -->
    <script src='static/Chart.min.js'></script>
  </head>

The chart is then defined as an HTML5 canvas element:

    <h1>Simple Line Chart</h1>
    <!-- bar chart canvas element -->
    <canvas id="myChart" width="600" height="400"></canvas>
    <p id="caption">The chart is displaying a simple line chart.</p>

Finally, the Javascript section of this file does the following in order:

  1. Defines the global parameters that apply to all charts
  2. Defines the chart data for this specific chart
  3. Gets the HTML canvas element
  4. Creates the chart to be displayed in the canvas element

Here are the contents of the ‘script’ section containing the Javascript that utilizes the Chart.js library:

      // Global parameters:
      // do not resize the chart canvas when its container does (keep at 600x400px)
      Chart.defaults.global.responsive = false;

      // define the chart data
      var chartData = {
        labels : [{% for item in labels %}
                   "{{item}}",
                  {% endfor %}],
        datasets : [{
            label: '{{ legend }}',
            fill: true,
            lineTension: 0.1,
            backgroundColor: "rgba(75,192,192,0.4)",
            borderColor: "rgba(75,192,192,1)",
            borderCapStyle: 'butt',
            borderDash: [],
            borderDashOffset: 0.0,
            borderJoinStyle: 'miter',
            pointBorderColor: "rgba(75,192,192,1)",
            pointBackgroundColor: "#fff",
            pointBorderWidth: 1,
            pointHoverRadius: 5,
            pointHoverBackgroundColor: "rgba(75,192,192,1)",
            pointHoverBorderColor: "rgba(220,220,220,1)",
            pointHoverBorderWidth: 2,
            pointRadius: 1,
            pointHitRadius: 10,
            data : [{% for item in values %}
                      {{item}},
                    {% endfor %}],
            spanGaps: false
        }]
      }

      // get chart canvas
      var ctx = document.getElementById("myChart").getContext("2d");

      // create the chart using the chart canvas
      var myChart = new Chart(ctx, {
        type: 'line',
        data: chartData,
      });

Most of the parameters that are set in this section are straight from the Chart.js documentation, so please refer to the documentation for descriptions of each parameter being set for this chart.

The key parameter that I changed from the default value is the only parameter defined in the global settings section (‘responsive’). I recommend setting this parameter to ‘false’ to ensure that the chart is not resized, but is maintained at 600x400px (a good size for viewing on a laptop/desktop). This parameter can easily be changed, but it provides an example of setting a global parameter that will be applied to all Chart objects in your application.

Running the Application

In order to run the application, go to a terminal and navigate to the top-level directory of the project. Next, run:

$ python app.py

Then go to your favorite web browser and navigate to ‘http://localhost:5000/simple_chart’. You should see the following web page:

Excellent! We’ve been able to create a simple chart that ties data passed from a Flask application into a template utilizing the Chart.js library.

Example 2: Adding Callback Functions to a Line Chart

As you start to read through the documentation for Chart.js, you’ll definitely notice that there are a ton of amazing features of the library. One aspect that I really thought was cool was being able to define callback functions when certain actions occur (such as clicking on a part of the chart). In this second example, I’ll provide two examples of callback functions:

  1. Callback function to update the caption that gets displayed for each data point
  2. Callback function to update the selected data point on the chart

Flask Application

The Flask application (defined in the app.py file) is just updated to add a new route:

@app.route("/line_chart")
def line_chart():
    legend = 'Temperatures'
    temperatures = [73.7, 73.4, 73.8, 72.8, 68.7, 65.2,
                    61.8, 58.7, 58.2, 58.3, 60.5, 65.7,
                    70.2, 71.4, 71.2, 70.9, 71.3, 71.1]
    times = ['12:00PM', '12:10PM', '12:20PM', '12:30PM', '12:40PM', '12:50PM',
             '1:00PM', '1:10PM', '1:20PM', '1:30PM', '1:40PM', '1:50PM',
             '2:00PM', '2:10PM', '2:20PM', '2:30PM', '2:40PM', '2:50PM']
    return render_template('line_chart.html', values=temperatures, labels=times, legend=legend)

This new route defines a more complex set of data to display. I’ll show a better method for displaying time-based data in the third example.

Adding a Callback for Updating the Caption

The template file (line_chart.html) that is used for this example utilized the previous template (chart.html) as a base. The Javascript code that creates the chart is updated to add a callback function for changing the caption that is displayed when the user hovers over a data point:

      // create the chart using the chart canvas
      var myChart = new Chart(ctx, {
        type: 'line',
        data: chartData,
        options: {
          tooltips: {
            enabled: true,
            mode: 'single',
            callbacks: {
              label: function(tooltipItems, data) {
                       return tooltipItems.yLabel + ' degrees';
                     }
            }
          },
        }
      });

This Javascript code defines a callback function that adds the word ‘degrees’ to the temperature that gets displayed in the caption.

Adding a Callback for the Selected Data Point

The Javascript code that creates the callback function to update the text for which data point (using a zero-based index) has been selected starts by creating a variable for identifying the canvas:

      // get chart canvas
      var holder = document.getElementById("myChart");

Next, a variable is created for identifying the text to be updated:

      // get the text element below the chart
      var pointSelected = document.getElementById("pointSelected");

Finally, the callback function is defined:

      // create a callback function for updating the selected index on the chart
      holder.onclick = function(evt){
        var activePoint = myChart.getElementAtEvent(evt);
        pointSelected.innerHTML = 'Point selected... index: ' + activePoint[0]._index;
      };

The callback function gets called whenever the user clicks on a data point on the chart and it updates the text below the chart indicating the index of the data point that was selected. Additionally, you could include some logs to see what additional information is available from the activePoint variable:

        console.log(activePoint);
        console.log('x:' + activePoint[0]._view.x);
        console.log('maxWidth: ' + activePoint[0]._xScale.maxWidth);
        console.log('y: ' + activePoint[0]._view.y);

Running the Application

Just like with the previous application, you run the application through the terminal by navigating to the top-level directory of the project and running:

$ python app.py

Then go to your favorite web browser and navigate to ‘http://localhost:5000/line_chart’. You should see the following web page:

One of the great features of Chart.js is the flexibility in terms of what you can do and I’m just providing a few examples of what can be done with callback functions.

Example 3: Time-based Chart

The biggest problem with example #2 is that the time data is simply defined as strings. A better approach is to utilize the time class from the datetime module on the Python/Flask side of the application and the Moment.js library on the Javascript side. The Moment.js library provides the data structures and functions for easily working with time values.

The Chart.js documentation provides some recommendations on using the Chart.js library with Moment.js. I prefer the method of explicitly including Moment.js prior to Chart.js in the ‘head’ section of the template file. This guarantees that Moment.js is only included once and that the order of loading the libraries is explicitly defined.

As with example 2, this example is building off of the previous examples by providing additional functionality.

Updates to the Flask Application

The first update to the Flask side (defined in app.py) is to import the time class from the datetime module:

from datetime import time

Next, a new route is defined for ‘/time_chart’, which defines a number of pre-canned temperature values and then a list of times in the hh:mm:ss format:

@app.route("/time_chart")
def time_chart():
    legend = 'Temperatures'
    temperatures = [73.7, 73.4, 73.8, 72.8, 68.7, 65.2,
                    61.8, 58.7, 58.2, 58.3, 60.5, 65.7,
                    70.2, 71.4, 71.2, 70.9, 71.3, 71.1]
    times = [time(hour=11, minute=14, second=15),
             time(hour=11, minute=14, second=30),
             time(hour=11, minute=14, second=45),
             time(hour=11, minute=15, second=0),
             time(hour=11, minute=15, second=15),
             time(hour=11, minute=15, second=30),
             time(hour=11, minute=15, second=45),
             time(hour=11, minute=16, second=0),
             time(hour=11, minute=16, second=15),
             time(hour=11, minute=16, second=30),
             time(hour=11, minute=16, second=45),
             time(hour=11, minute=17, second=0),
             time(hour=11, minute=17, second=15),
             time(hour=11, minute=17, second=30),
             time(hour=11, minute=17, second=45),
             time(hour=11, minute=18, second=0),
             time(hour=11, minute=18, second=15),
             time(hour=11, minute=18, second=30)]
    return render_template('time_chart.html', values=temperatures, labels=times, legend=legend)
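
As an aside, an evenly-spaced list of times like this doesn’t have to be written out by hand; it could just as easily be generated. Here’s a minimal sketch (not from the original application) using datetime arithmetic:

from datetime import datetime, timedelta

# Build 18 time objects spaced 15 seconds apart, starting at 11:14:15
start = datetime(2017, 1, 1, hour=11, minute=14, second=15)
times = [(start + timedelta(seconds=15 * i)).time() for i in range(18)]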

Updates to Javascript

On the Javascript side of the application, the time values being used as the x-axis of the chart are now going to utilize the Moment.js library.

In order to utilize the Moment.js library, it needs to be included in the ‘head’ section of the template file. The Chart.js documentation specifies that Moment.js needs to be included prior to Chart.js:

  <head>
    <meta charset="utf-8" />
    <title>Chart.js Example</title>
    <!-- import plugin script -->
    <script src="http://cdnjs.cloudflare.com/ajax/libs/moment.js/2.13.0/moment.min.js"></script>
    <script src='static/Chart.min.js'></script>
  </head>

It may look strange to have the Moment.js library retrieved from a CDN while the Chart.js library is imported straight from the project files, but I wanted to show both methods of including a library in this example.

The first update within the ‘script’ section is to define a new function that converts a time element passed in from the Flask application into a formatted time string (hh:mm:ss) using Moment.js:

      var timeFormat = 'hh:mm:ss';

      function newDateString(hours, minutes, seconds) {
        return moment().hour(hours).minute(minutes).second(seconds).format(timeFormat);
      }

This new function (newDateString) is utilized for defining the labels of the chart:

      // define the chart data
      var chartData = {
        labels : [{% for item in labels %}
                   newDateString( {{item.hour}}, {{item.minute}}, {{item.second}} ),
                  {% endfor %}],
      …

That’s all the updates needed!

Running the Application

Just like with the previous application, you run the application through the terminal by navigating to the top-level directory of the project and running:

$ python app.py

Then go to your favorite web browser and navigate to ‘http://localhost:5000/time_chart’. You should see the following web page:

Conclusion

In retrospect, my first experience with Javascript was quite enjoyable. I think I was very fortunate to find such a well constructed and well documented library like Chart.js as my first experience with Javascript. The documentation for Chart.js is excellent and it provided a great resource for getting started with the library and for doing more complex things with the library (like callback functions).

I would highly recommend learning the basics of Javascript before diving into a Javascript library. I’ve had limited exposure to Javascript, so the Codecademy course called Learning Javascript provided a great base for understanding the language and its syntax.

My initial feeling about working with the Javascript language is that it is less enjoyable to code in than Python. Working with the Python ecosystem is just such an enjoyable experience, in my opinion. With all that said, I can really see how Javascript can be powerful for developing richer client-side experiences. Therefore, I’m eager to continue learning about Javascript, and my next adventure will be learning about the Vue.js framework.
 

Unit Testing a Flask Application

Introduction

This blog post provides an introduction to unit testing a Flask application.  I’ve been on a big unit testing kick at work recently which is spilling over into updating the unit tests for my personal projects.  Therefore, I thought it would be a good time to document the basics of unit testing a Flask application.
 
There seem to be a lot of opinions about unit testing amongst software developers, but I believe the following:

  • 100% code coverage using unit tests does not mean your code works, but it’s a good indicator that the code has been developed well
  • No unit tests mean that the code does not work and cannot be expanded upon

 
The second belief might be a bit extreme, but if I inherit a software module that does not have unit tests, I’m already assuming that it does not run properly and has not been maintained.  Simply put, the lack of unit tests leads to a negative impression of the software module.

I believe that you should create an environment for developing unit tests that is both easy and fun to use. These may sound like very fluffy words, but they have real meaning for me:

  • create helper methods to allow quick generation of unit tests
  • provide quick feedback on progress of unit testing
  • create an environment that allows for tests to be added quickly
  • make sure that the unit tests can be executed with a single command
  • provide visual feedback of path coverage

As with so many things in software development, if it’s easy to use, it will be used. If it’s tough to use, it won’t be used as much as it should be.

Unit Test Frameworks

The two unit test frameworks for python that I’ve used are:

  • unittest – built-in unit test framework that is based on the xUnit framework
  • py.test – a third-party framework for building unit tests

Both work really well, but I tend to prefer unittest since it’s built into the python standard library (there is a wealth of incredible modules built into python!).  I’ll be using unittest throughout this blog post.
 
If you want to see just how many options there are for tools to help with unit testing in python, check out this listing of python testing tools.  An incredible number of tools are available!

Where to Store Your Unit Tests

Based on the flexibility that using unit test runners gives you, you could probably store your unit test files in any location in your Flask application. However, I find it best to store the unit tests (in this case located in the ‘tests’ directory) at the same level as the files that you are going to be testing:

$ tree
├── instance
├── migrations
├── project
│   ├── __init__.py
│   ├── models.py
│   ├── recipes
│   ├── static
│   ├── templates
│   ├── tests
│   │   ├── test_basic.py
│   │   ├── test_recipes.py
│   │   └── test_users.py
│   └── users
├── requirements.txt
└── run.py

In this structure, the unit tests are stored in the ‘tests’ directory and the unit tests are going to be focused on testing the functionality in the ‘recipes’ and ‘users’ modules, which are at the same level.

Creating a Basic Unit Test File

There are a lot of important principles to follow when writing unit tests, but I really believe it’s important to create an environment that makes writing unit tests easy. I’m a fan of writing unit tests, but I know a lot of software developers who just don’t like it. Therefore, I think it’s important to develop a unit testing structure with a lot of ‘helper’ functions to facilitate writing unit tests.

With that in mind, let’s create a simple unit test file that is not testing our Flask application, but is simply showing the structure of a unit test:

# project/tests/test_basic.py


import os
import unittest

from project import app, db, mail


TEST_DB = 'test.db'


class BasicTests(unittest.TestCase):

    ############################
    #### setup and teardown ####
    ############################

    # executed prior to each test
    def setUp(self):
        app.config['TESTING'] = True
        app.config['WTF_CSRF_ENABLED'] = False
        app.config['DEBUG'] = False
        app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///' + \
            os.path.join(app.config['BASEDIR'], TEST_DB)
        self.app = app.test_client()
        db.drop_all()
        db.create_all()

        # Disable sending emails during unit testing
        mail.init_app(app)
        self.assertEqual(app.debug, False)

    # executed after each test
    def tearDown(self):
        pass


    ###############
    #### tests ####
    ###############

    def test_main_page(self):
        response = self.app.get('/', follow_redirects=True)
        self.assertEqual(response.status_code, 200)


if __name__ == "__main__":
    unittest.main()

This unit test file creates a class, BasicTests, based on the unittest.TestCase class. Within this class, the setUp() and tearDown() methods are defined. This is really critical, as the setUp() method is called prior to each unit test executing and the tearDown() method is called after each unit test finishes executing.

Since we want to test out the functionality of the web application, this unit test uses a SQLite database for storing data instead of the typical Postgres database that we’ve been using. While I’ve seen lots of discussions about whether or not this is a good idea, I feel it really simplifies the development of the unit tests, so I’m in favor of it.
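
As defined above, tearDown() is just a stub. If you want each test to also finish with a completely clean slate, a minimal sketch of a tearDown() method (reusing the ‘db’ object imported above) might look like:

    # executed after each test
    def tearDown(self):
        # Release the test database session and drop all tables
        db.session.remove()
        db.drop_all()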

Running the Unit Tests

If you want to run the unit test that was just created, the easiest way is to just execute that file:

$ python project/tests/test_basic.py 
.
----------------------------------------------------------------------
Ran 1 test in 0.122s

OK

Please take note of the directory that the unit test was run from (the top-level folder of the Flask application). Why can’t we just run from the directory where the unit test file is located? We would have that option for a simple unit test, but since we’re importing the ‘app’, ‘db’, and ‘mail’ objects, we need to run from a location where those are discoverable. Running from the top-level directory of the Flask application allows the python interpreter to find these objects within the ‘project’ directory.

Taking it one step further, I’d highly recommend using Nose2 as the unit test runner. A unit test runner provides the ability to easily detect the unit tests in your project and then execute them.

The easiest way to run Nose2 is simply to call the executable from the top-level directory:

$ nose2

This command will find all of the unit tests (as long as the files start with test_*.py) and execute them. I’m a bit preferential to using the verbose mode:

$ nose2 -v

If you just want to run a single unit test file, you can still use Nose2:

$ nose2 -v project.tests.test_basic
test_main_page (project.tests.test_basic.BasicTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.130s

OK

I’ve found Nose2 to be so easy to use and I highly recommend using it for running your unit tests.

Adding Helper Methods to Facilitate Developing More Unit Tests

For the unit testing in this blog post, we’re going to be focused on testing some of the user management aspects of the Flask application. As such, we’re likely going to be doing a lot of logging in, logging out, and registering for a new account. Let’s simplify the development of future unit tests by creating some helper methods to perform these steps:

    ########################
    #### helper methods ####
    ########################

    def register(self, email, password, confirm):
        return self.app.post(
            '/register',
            data=dict(email=email, password=password, confirm=confirm),
            follow_redirects=True
        )

    def login(self, email, password):
        return self.app.post(
            '/login',
            data=dict(email=email, password=password),
            follow_redirects=True
        )

    def logout(self):
        return self.app.get(
            '/logout',
            follow_redirects=True
        )

The ‘register’ method registers a new user by sending a POST request to the ‘/register’ URL of the Flask application. The ‘login’ method logs a user in by sending a POST request to the ‘/login’ URL. The ‘logout’ method logs a user out by sending a GET request to the ‘/logout’ URL. These will come in handy when we…

…add a test to make sure we can register a new user:

    def test_valid_user_registration(self):
        response = self.register('patkennedy79@gmail.com', 'FlaskIsAwesome', 'FlaskIsAwesome')
        self.assertEqual(response.status_code, 200)
        self.assertIn(b'Thanks for registering!', response.data)

This unit test checks that we can successfully register a new user and receive a positive confirmation. We’re testing a nominal situation here, but it’s also good to check what happens when off-nominal data is provided to the application. Let’s try registering a new user where the confirmation password does not match the original password:

    def test_invalid_user_registration_different_passwords(self):
        response = self.register('patkennedy79@gmail.com', 'FlaskIsAwesome', 'FlaskIsNotAwesome')
        self.assertIn(b'Field must be equal to password.', response.data)

Try running the unit tests and you should see that all three unit tests pass:

$ nose2 -v project.tests.test_basic
test_invalid_user_registration_different_passwords (project.tests.test_basic.BasicTests) ... ok
test_main_page (project.tests.test_basic.BasicTests) ... ok
test_valid_user_registration (project.tests.test_basic.BasicTests) ... ok

----------------------------------------------------------------------
Ran 3 tests in 3.544s

OK

One of the important checks that is in the application is to prevent duplicate emails being used for registration. Let’s check that this functionality works:

    def test_invalid_user_registration_duplicate_email(self):
        response = self.register('patkennedy79@gmail.com', 'FlaskIsAwesome', 'FlaskIsAwesome')
        self.assertEqual(response.status_code, 200)
        response = self.register('patkennedy79@gmail.com', 'FlaskIsReallyAwesome', 'FlaskIsReallyAwesome')
        self.assertIn(b'ERROR! Email (patkennedy79@gmail.com) already exists.', response.data)

Running the unit tests:

$ nose2 -v project.tests.test_basic
test_invalid_user_registration_different_passwords (project.tests.test_basic.BasicTests) ... ok
test_invalid_user_registration_duplicate_email (project.tests.test_basic.BasicTests) ... ok
test_main_page (project.tests.test_basic.BasicTests) ... ok
test_valid_user_registration (project.tests.test_basic.BasicTests) ... ok

----------------------------------------------------------------------
Ran 4 tests in 10.442s

OK

Excellent!

Please take note that the order that the unit tests are listed in test_*.py does not determine the order in which the unit tests are run. This is really important to understand, as each unit test needs to be self-contained. Don’t count on unit test #1 registering a new user and unit test #2 logging that user in.
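
For example, a self-contained login test would do its own registration first, using the helper methods defined above, rather than relying on a previous test to have created the user. Here is a sketch (the exact assertions will depend on your application):

    def test_valid_user_login(self):
        # Register the user within this test instead of relying on
        # another test having run first
        self.register('patkennedy79@gmail.com', 'FlaskIsAwesome', 'FlaskIsAwesome')
        response = self.login('patkennedy79@gmail.com', 'FlaskIsAwesome')
        self.assertEqual(response.status_code, 200)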

Code Coverage

In order to check the code coverage of your unit tests, I’d recommend using the Coverage module. This module works great with unit tests that have been written using the unittest module.

Start by installing the Coverage module and updating your list of modules for your project:

$ pip install coverage
$ pip freeze > requirements.txt

The order of commands to execute when using the coverage module is:

  1. coverage run …
  2. coverage report …

The ‘run’ command runs the unit tests and collects the data for determining the code coverage. The ‘report’ command shows a basic text output of the code coverage.

For example, I’ve written some unit tests for the application, and here is how I use the Coverage module:

$ coverage run project/tests/test_basic.py 
$ coverage report project/users/*.py
Name                        Stmts   Miss  Cover
-----------------------------------------------
project/users/__init__.py       0      0   100%
project/users/forms.py         14      0   100%
project/users/views.py        177    118    33%
-----------------------------------------------
TOTAL                         191    118    38%

If you add the -m flag, you can see which lines are not being tested:

$ coverage report -m project/users/*.py
Name                        Stmts   Miss  Cover   Missing
---------------------------------------------------------
project/users/__init__.py       0      0   100%
project/users/forms.py         14      0   100%
project/users/views.py        177    118    33%   33-35, 69-80, 109-124, 130-136, 141-159, 164-179, 184-206, 212, 218-238, 244-254, 260-266, 272-277
---------------------------------------------------------
TOTAL                         191    118    38%

This text-based output is nice for getting a summary of the code coverage, but just seeing the line numbers in a file that are not being tested is not that beneficial. Luckily, the Coverage module is able to generate HTML to provide better insight into which lines are being tested and which are not. You can generate the HTML output by running:

$ coverage html project/users/*.py 

Now you can navigate to your project’s folder, open the newly created ‘htmlcov’ directory, and open the index.html file. In your web browser, you should see a summary of the path coverage for this directory:

[Screenshot: HTML coverage summary for the project/users/ directory]

By clicking on views.py, you get a detailed view of the file contents including a color-coded line-by-line view of which lines are tested by the unit tests and which are not:

[Screenshot: line-by-line coverage view of views.py]

Conclusion

This blog post was intended to show you how to develop unit tests for a Flask application with a focus on using the right tools to make the process of writing and running unit tests as easy as possible. The recommended modules for unit testing a Flask application are:
– unittest – built-in python module for developing unit tests
– nose2 – runner for identifying and running the unit tests
– Coverage – seeing the code coverage of your unit tests

To recap, I think that the following aspects are needed to have a functioning setup for writing and running unit tests:
– create helper methods to allow quick generation of unit tests
– provide quick feedback on progress of unit testing
– create an environment that allows for tests to be added quickly
– make sure that the unit tests can be executed with a single command
– provide visual feedback of path coverage

I’ve found that unit tests are critical to testing out any software module that I write. Writing unit tests should be an efficient process that can be greatly simplified by setting up a proper infrastructure.

For the source code utilized in this blog post, please see my GitLab repository.

Relational Database Migrations using Flask-Migrate

Introduction

In this blog post, I’ll show how to use the Flask-Migrate module to simplify database schema changes for a Flask web application that is using SQLAlchemy. The Flask-Migrate module is written by Miguel Grinberg and utilizes the Alembic module for changing the database schema (ie. performing database migrations). Flask-Migrate provides a convenient command-line interface for performing database migrations, including versioning of the database schema.

One of the more challenging aspects of working with relational databases is making changes to the database schema, such as adding or deleting columns. During the early development of an application, making changes to a table in a relational database is easy if you’re not worried about deleting all of the data in your database. However, once you get into production and are actually storing real data in your relational database, you need to be very cautious when changing a table. While relational databases have a lot of strong points, being able to easily update the underlying schema of a database table is not one of them.

To be explicit, the use of the Flask-Migrate module is intended for Flask applications that are using SQLAlchemy as the ORM (Object-Relational Mapper) for interfacing with the underlying database. Typically, using a PostgreSQL database is a good choice for a Flask application and SQLAlchemy is a great tool for allowing you to work in terms of python instead of SQL for interfacing with the database.
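For example, fetching a user by email becomes a plain python expression instead of a hand-written SQL SELECT statement. A sketch, assuming a ‘User’ model with an ‘email’ column like the one used later in this post:

# SELECT * FROM users WHERE email = '...' LIMIT 1, expressed in python
user = User.query.filter_by(email='patkennedy79@gmail.com').first()
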

Why Is a Database Migration Tool Needed?

At first, it may seem like having a tool for doing database migrations is overkill. It very well may be for a simple application, but I’d argue that any application that is going into production (where there is real data being stored) should utilize a database migration tool.

Let’s take a simple example to show how a database migration tool (such as Flask-Migrate) can be beneficial… Suppose you have a web application with a mature database schema that you’ve pushed to production, and there are a few users already using your application. As part of a new feature, you want to add the ability to have users be part of groups, so you need to update your database schema to allow users to be associated with groups. While this is a change that you can test out in your development environment with test data, it’s a significant change for your production database, as you must ensure that you are not going to delete or alter any existing data. At this point, you could write a script to perform this database migration. OK, it’s not a huge deal to write a single script. But if you have to make three database migrations in two weeks, or seven each week, writing a new script every time becomes quite tedious. So why not use a tool that was developed just for this purpose and has been thoroughly tested?

Configuring Flask-Migrate

OK, hopefully you’ve been convinced that using a database migration tool (in this case, Flask-Migrate) is a good idea. Let’s start off by configuring the Flask-Migrate module for use with an existing Flask application (I’ll be using the application that I’ve been documenting on this blog: Flask Tutorial).

The first step is to install the Flask-Migrate module using pip and then update your listing of modules being used by your application (make sure that you are working within your virtual environment!):

(ffr_env) $ pip install Flask-Migrate
(ffr_env) $ pip freeze > requirements.txt

If you look at the Flask-Migrate documentation, there are two ways to utilize this module:

  1. Including Flask-Migrate directly into your application
  2. Creating a separate script for handling database migrations

I’ve used both methods successfully, but I prefer #1 due to the simplicity. To utilize Flask-Migrate, you’ll need to update the configuration of your Flask application by adding the following lines to the __init__.py file in …/web/ (source code is truncated to just show the updates):

#################
#### imports ####
#################

from flask import Flask, render_template
from flask_sqlalchemy import SQLAlchemy
from flask_login import LoginManager
from flask_bcrypt import Bcrypt
from flask_mail import Mail
from flask_uploads import UploadSet, IMAGES, configure_uploads
from flask_pagedown import PageDown
from flask_migrate import Migrate


################
#### config ####
################

app = Flask(__name__, instance_relative_config=True)
app.config.from_pyfile('flask.cfg')

db = SQLAlchemy(app)
bcrypt = Bcrypt(app)
mail = Mail(app)
pagedown = PageDown(app)
migrate = Migrate(app, db)

You now have the ability to control the database migrations using the command-line interface.

Setting Up Versioning of the Database Schema

Before starting this section, make sure that you have the FLASK_APP environment variable set to your top-level python file for running your Flask application:

(ffr_env) $ export FLASK_APP=run.py

This will allow you to use the format ‘flask …’ from the command line instead of ‘python run.py …’. This may seem like a trivial change, but it’s the recommended method for running a Flask application.

After configuring Flask-Migrate, you should create a migration repository (it will be created in …/web/migrations/) for storing the different versions of your database schema:

(ffr_env) $ flask db init
  Creating directory .../flask_recipe_app/web/migrations ... done
  Creating directory .../flask_recipe_app/web/migrations/versions ... done
  Generating .../flask_recipe_app/web/migrations/alembic.ini ... done
  Generating .../flask_recipe_app/web/migrations/env.py ... done
  Generating .../flask_recipe_app/web/migrations/README ... done
  Generating .../flask_recipe_app/web/migrations/script.py.mako ... done
  Please edit configuration/connection/logging settings in '.../flask_recipe_app/web/migrations/alembic.ini' before proceeding.

Now that you have a versioning system for your database schema, it’s a good time to add it to your overall version control system (ie. git repository):

(ffr_env) $ git status
(ffr_env) $ git add .
(ffr_env) $ git commit -m "Adding initial version of database schema versioning"
(ffr_env) $ git push origin master

Making a Change to the Database Schema

This example will be a simple update to the database schema, but it will illustrate all the steps to follow when making changes to a database schema. The database schema is defined in …/web/models.py, and we’re adding a new ‘inspiration’ field to the recipes table:

class Recipe(db.Model):
    __tablename__ = "recipes"

    id = db.Column(db.Integer, primary_key=True)
    recipe_title = db.Column(db.String, nullable=False)
    recipe_description = db.Column(db.String, nullable=False)
    is_public = db.Column(db.Boolean, nullable=False)
    image_filename = db.Column(db.String, default=None, nullable=True)
    image_url = db.Column(db.String, default=None, nullable=True)
    recipe_type = db.Column(db.String, default=None, nullable=True)
    rating = db.Column(db.Integer, default=None, nullable=True)
    ingredients = db.Column(db.Text, default=None, nullable=True)
    ingredients_html = db.Column(db.Text, default=None, nullable=True)
    recipe_steps = db.Column(db.Text, default=None, nullable=True)
    recipe_steps_html = db.Column(db.Text, default=None, nullable=True)
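    # The new field being added in this migration: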
    inspiration = db.Column(db.String, default=None, nullable=True)
    user_id = db.Column(db.Integer, db.ForeignKey('users.id'))

After you’re satisfied with the change to the database schema, generate the initial migration:

(ffr_env) $ flask db migrate -m "Added inspiration field to the recipe table"

This command will generate a script for performing the database upgrade. I really like including a message for each change to the database schema, as this really helps to reinforce the similarities to a version control system like git.

It’s a good idea to review the script to make sure it’s doing what you intended… look at …/web/migrations/versions/*_added_inspiration_field*.py:

"""Added inspiration field to the recipe table

Revision ID: 56b6764ae94a
Revises: None
Create Date: 2016-10-27 22:40:30.168097

"""

# revision identifiers, used by Alembic.
revision = '56b6764ae94a'
down_revision = None

from alembic import op
import sqlalchemy as sa


def upgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.add_column('recipes', sa.Column('inspiration', sa.String(), nullable=True))
    ### end Alembic commands ###


def downgrade():
    ### commands auto generated by Alembic - please adjust! ###
    op.drop_column('recipes', 'inspiration')
    ### end Alembic commands ###

You can see in the script that the command to upgrade the database schema will add the inspiration field, while the command to downgrade the database schema will drop the inspiration field. Once you are happy with the script contents, you can apply the upgrade to the database schema:

(ffr_env) $ flask db upgrade
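
If an upgrade turns out to be problematic, Flask-Migrate can also revert the schema to the previous revision by running the downgrade portion of the migration script:

(ffr_env) $ flask db downgrade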

Now that the changes to the database schema have been applied, add the changes to your git repository:

$ git status
$ git add .
$ git commit -m "Updated database schema to add the inspiration field to the recipe table"
$ git push origin master

This is the typical flow that you’ll want to follow as you make updates to your database schema:

  1. Make an update to your database schema (in your development environment) and test it out
  2. ‘flask db migrate -m "<message describing the change>"’
  3. Check the migration script
  4. ‘flask db upgrade’
  5. Commit the changes to your git repository

Updating the Database Schema in Production

Once you’re happy with the changes to your database schema and the changes have been pushed up to your git repository (I prefer GitLab), head over to your production server and execute the following commands if you are using Docker and Docker-Compose:

$ git pull origin master
$ docker-compose stop
$ docker-compose build
$ docker-compose up -d
$ sudo docker-compose run --rm web bash
    > export FLASK_APP=run.py
    > flask db upgrade
    > exit

You should now see that your production database has been upgraded to include the new field in the recipes table and that all of the existing data is still intact.

Additional Helpful Commands

If you want to be able to see all of the database schema migrations:

$ flask db history

If you want to see the current version of the database schema:

$ flask db current --verbose

If you want to see a list of commands that you can use with Flask-Migrate:

$ flask db --help

Conclusion

This blog post showed how to simplify the process of making database schema changes using the Flask-Migrate module. I highly recommend using this module for any Flask application that is using SQLAlchemy. After a few configuration steps, Flask-Migrate allows you to easily make changes to your database schema using the following steps:

  1. Make an update to your database schema (in your development environment) and test it out
  2. ‘flask db migrate -m "<message describing the change>"’
  3. Check the migration script
  4. ‘flask db upgrade’
  5. Commit the changes to your git repository

All of the source code from this blog post can be found on GitLab.

The Flask-Migrate module was written by Miguel Grinberg. He has been a huge inspiration to me in terms of learning about Flask, so I highly recommend his Flask Mega-Tutorial and his excellent book on Flask web development: Flask Web Development.

References

Flask-Migrate Documentation: https://flask-migrate.readthedocs.io/en/latest/

Miguel Grinberg’s Blog Post introducing Flask-Migrate: https://blog.miguelgrinberg.com/post/flask-migrate-alembic-database-migration-wrapper-for-flask

Alembic Documentation: http://alembic.zzzcomputing.com/en/latest/

How to use Docker and Docker Compose to Create a Flask Application

Introduction

In one of my previous blog posts, I explained WHY I was switching from a traditional deployment approach to using Docker. In this blog post, I’ll dive into the details of HOW to use Docker and Docker Compose to create a Flask web application.

My experience with using Docker has been filled with ups and downs, as it is part exhilarating and part frustrating. I ran into a few roadblocks when configuring my application to work with Docker that were quite frustrating to resolve (which I’ll detail in this blog post). However, the sense of accomplishment when you get a working application running in Docker, and knowing that it will be an exact match to your production environment, is AMAZING!

This blog post is not intended to be an introduction to Docker. If you’re new to Docker, I’d highly recommend ‘Docker for Developers’ by Chris Tankersley ($19.99 recommended price). If you want to gain a solid understanding of all the pieces to Docker and some of the history leading up to Docker, read this book!

Architecture

One of the (good) side-effects of using Docker is needing to think about the overall architecture of your application early in the development process. While it’s definitely beneficial to use a development server (such as the built-in development server that comes with the Flask framework), it’s also beneficial to be able to switch to a production environment using Docker for testing early in the development cycle.

The work that I did with Docker involved using the Flask web application that I’ve been documenting on this site (Flask Tutorial). Here is a diagram that illustrates the structure of the application and how Docker fits into it:

[Diagram: Docker application architecture]

There are four Docker containers used in this architecture:

  1. Web application – Flask web application with the Gunicorn WSGI server
  2. Web server – NGINX
  3. Relational database – PostgreSQL server
  4. Data volume – persistent data storage for Postgres database

The first three components are all created from Docker images that expand on the respective official images from Docker Hub. Each of these images is built using a separate Dockerfile. Docker Compose is then used to create all four containers and connect them correctly into a unified application.

Directory Structure

For a typical Flask application, your directory structure will look similar to:

$ tree
.
├── README.md
├── instance
│   ├── db_create.py
│   ├── flask.cfg
├── project
│   ├── __init__.py
│   ├── models.py
│   ├── recipes
│   │   ├── __init__.py
│   │   ├── forms.py
│   │   └── views.py
│   ├── static
│   ├── templates
│   ├── tests
│   └── users
│       ├── __init__.py
│       ├── forms.py
│       └── views.py
├── requirements.txt
└── run.py

When adding Docker to your application, I’d recommend changing the directory structure of your application to:

$ tree
.
├── README.md
├── docker-compose.yml
├── nginx
│   ├── Dockerfile
│   ├── family_recipes.conf
│   └── nginx.conf
├── postgresql
│   └── Dockerfile  * Not included in git repository
└── web
    ├── Dockerfile
    ├── create_postgres_dockerfile.py
    ├── instance
    │   ├── db_create.py
    │   ├── flask.cfg
    ├── project
    │   ├── __init__.py
    │   ├── models.py
    │   ├── recipes
    │   │   ├── __init__.py
    │   │   ├── forms.py
    │   │   └── views.py
    │   ├── static
    │   ├── templates
    │   ├── tests
    │   └── users
    │       ├── __init__.py
    │       ├── forms.py
    │       └── views.py
    ├── requirements.txt
    └── run.py

At first glance, this may seem more complicated, but it’s actually a great change. Your repository now stores your source code AND the configuration of your application environment. You have the configuration of your web server (NGINX) included, the configuration of your database (Postgres) included, and a way to tie all the pieces together (docker-compose.yml).

As I was preparing to write this blog post, I wanted to check on the NGINX configuration that I had created for a previous web application. That required logging in to my remote server and hunting down the configuration files, and there was no version history of those files. Being able to store the configuration of your application environment in your repository is so powerful.

Docker Image #1 – NGINX

In order to use NGINX for this web application, we’re going to take the official NGINX image from Docker Hub and then add layers on top of it to configure it for a Flask web application. Let’s take a look at the directory for NGINX in the new directory structure:

$ pwd
.../flask_recipe_app/nginx
$ tree
.
├── Dockerfile
├── family_recipes.conf
└── nginx.conf

This folder contains a Dockerfile and then two configuration files (family_recipes.conf and nginx.conf). The Dockerfile is used to specify how the new image should be created:

FROM nginx:1.11.3
RUN rm /etc/nginx/nginx.conf
COPY nginx.conf /etc/nginx/
RUN rm /etc/nginx/conf.d/default.conf
COPY family_recipes.conf /etc/nginx/conf.d/

This file starts by taking the ‘nginx:1.11.3’ image (either stored locally on your computer or downloaded from Docker Hub), removing the default NGINX configuration files, and then copying the new NGINX configuration files from this directory to their appropriate locations in the NGINX image.

The configuration of NGINX is a very interesting topic, which I covered in detail in a previous blog post: How to Configure NGINX for a Flask Web Application. To summarize, we’re configuring NGINX to serve static content (CSS, JavaScript, Images, etc.) and to reverse proxy to our WSGI server (Gunicorn) for our Flask application to process requests.

Docker Image #2 – Web Application (including Gunicorn)

The Web Application image stores our Flask web application, the Gunicorn web server, and all of the dependent modules (Flask, SQLAlchemy, etc.). This image is built on top of the official python 3.4.5 Docker image from Docker Hub. I picked this version of python3, as I started developing this application with python 3.4.x. Remember, it’s always best to explicitly state a version number when selecting a base Docker image instead of selecting “xxx:latest”, as this will result in the version changing constantly as new versions of a particular Docker image are added to Docker Hub.

The Dockerfile that defines the Web container is just a single line(!!!):

FROM python:3.4.5-onbuild

Within the vast array of the official Docker images for python, there are a number of images that end with “-onbuild”. These are special Docker images that include a set pattern for creating a standalone python application. To better understand what these images do, let’s look at the source code for the Dockerfile for the python:3.4-onbuild:

FROM python:3.4

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

ONBUILD COPY requirements.txt /usr/src/app/
ONBUILD RUN pip install --no-cache-dir -r requirements.txt

ONBUILD COPY . /usr/src/app

This image uses the official python:3.4 image as the base. It is assumed that you are in the top-level of your python application when using this image.

The first steps create a new directory in the Docker image for storing the python application source code (/usr/src/app) and set this directory as the working directory, from which all subsequent commands will be executed.

Next, the requirements.txt file is copied into the working directory and all of the dependent modules defined in the requirements.txt file are installed. For our application, we’re installing the Flask web framework, SQLAlchemy, WTForms, etc. Included in this list is also Gunicorn, which is the web server that we’ll be using to run our Flask web application. More details on how to configure Gunicorn are upcoming when we define the docker-compose.yml file for having all of the different containers of our web application run together.

Finally, the source code for our python application is copied into the working directory (/usr/src/app). Our web application image is now complete and ready to use.

Docker Image #3 – Postgres

The official Postgres image from the Docker Hub is a great starting point. This image will create a default user (‘postgres’) and database (‘postgres’) for you, which is convenient for development. However, it is advisable to create a separate user/password for accessing a specific database. This is accomplished by setting the following environment variables within the Postgres Docker image:

  • POSTGRES_PASSWORD
  • POSTGRES_USER
  • POSTGRES_DB

Within my Flask application, I’ve found that creating an ‘instance’ directory to store the sensitive information and then keeping that directory out of my git repository to be a convenient method for attempting to protect the sensitive parameters of the application. Within the ‘instance’ directory, there is a file called ‘flask.cfg’ which defines the Secret Key for my application, the parameters for the Postgres database, and other sensitive parameters that I don’t want others to be able to access. Therefore, storing a Dockerfile for creating the Postgres image in my git repository with all of this sensitive information like username/password is not a good idea.
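
To make this concrete, here is a minimal sketch of what such a ‘flask.cfg’ might contain. All of the values below are placeholders, not the real configuration; the ‘postgres’ host name matches the database service name defined in the docker-compose.yml file later in this post:

# instance/flask.cfg -- kept out of the git repository; values are placeholders
SECRET_KEY = 'replace-with-a-long-random-string'
POSTGRES_USER = 'recipes_user'
POSTGRES_PASSWORD = 'replace-me'
POSTGRES_DB = 'recipes_db'
SQLALCHEMY_DATABASE_URI = 'postgresql://{user}:{pw}@postgres:5432/{db}'.format(
    user=POSTGRES_USER, pw=POSTGRES_PASSWORD, db=POSTGRES_DB)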

Luckily, we’re using python, so there is incredible flexibility to solve problems. My solution? Create a script to automatically generate the Postgres Dockerfile by reading the sensitive parameters associated with the Postgres database from the ‘instance’ directory.

In order to get access to the correct python modules, it’s easiest to store this script in the ‘…/flask_recipe_app/web/’ directory:

import os
from project import app


# Postgres Initialization Files
docker_file = 'Dockerfile'
source_dir = os.path.abspath(os.curdir)
destination_dir = os.path.join(source_dir, '../postgresql')

# Before creating files, check that the destination directory exists
if not os.path.isdir(destination_dir):
    os.makedirs(destination_dir)

# Create the 'Dockerfile' for initializing the Postgres Docker image
with open(os.path.join(destination_dir, docker_file), 'w') as postgres_dockerfile:
    postgres_dockerfile.write('FROM postgres:9.6')
    postgres_dockerfile.write('\n')
    postgres_dockerfile.write('\n# Set environment variables')
    postgres_dockerfile.write('\nENV POSTGRES_USER {}'.format(app.config['POSTGRES_USER']))
    postgres_dockerfile.write('\nENV POSTGRES_PASSWORD {}'.format(app.config['POSTGRES_PASSWORD']))
    postgres_dockerfile.write('\nENV POSTGRES_DB {}'.format(app.config['POSTGRES_DB']))
    postgres_dockerfile.write('\n')

This script will create the destination directory (‘…/flask_recipe_app/postgresql/’) if it does not exist. Then the sensitive Postgres parameters are read from the application configuration and the Dockerfile is written. Let’s check:

$ pwd
.../flask_recipe_app/postgresql
$ tree
.
└── Dockerfile
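
As a reminder, this Dockerfile is generated by the create_postgres_dockerfile.py script; re-run the script from the …/web/ directory whenever the database credentials in the ‘instance’ directory change.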

Nice! The Postgres image is ready to be used. Be sure to add ‘/postgresql/Dockerfile’ to your .gitignore file to make sure this Dockerfile isn’t included in your git repository.

Docker Compose

Docker Compose allows you to define the structure of your application, utilizing multiple containers. Docker Compose handles so much for you, and all you have to do is define a simple docker-compose.yml file.

Docker Compose reads the docker-compose.yml file and builds the applicable ‘docker run’ commands (in the correct order!) to create the multi-container application. While it is possible to create a series of ‘docker run’ commands to build up the multi-container application, Docker Compose simplifies this process significantly. Plus, Docker Compose does a lot of stuff in the background that you don’t even have to worry about, like automatically creating a network for the containers to talk to each other on.

Here is the docker-compose.yml file contents:

version: '2'

services:
  web:
    restart: always
    build: ./web
    expose:
      - "8000"
    volumes:
      - /usr/src/app/project/static
    command: /usr/local/bin/gunicorn -w 2 -b :8000 project:app
    depends_on:
      - postgres

  nginx:
    restart: always
    build: ./nginx
    ports:
      - "80:80"
    volumes:
      - /www/static
    volumes_from:
      - web
    depends_on:
      - web

  data:
    image: postgres:9.6
    volumes:
      - /var/lib/postgresql
    command: "true"

  postgres:
    restart: always
    build: ./postgresql
    volumes_from:
      - data
    ports:
      - "5432:5432"

The first line defines the version of the Docker Compose file format. I recommend using version 2, which is the latest.

The next line (‘services:’) starts the definition of each service (think of this as the containers and data volumes) for the application.

The first container that is defined is the web application. The ‘restart’ command should be set to ‘always’ to make sure that the container is always restarted regardless of the exit code. Remember, we’ll be using this same configuration in production, so we want our containers to always restart. The ‘build’ command defines the location of the Dockerfile to build the image with. The ‘expose’ command specifies the port that should be exposed to the other containers on the network, but does not publish this port to the outside world. The ‘volumes’ command specifies the persistent data to maintain in this container, even on restarts. The ‘command’ command specifies the override of the default command for an image. Since we’re using Gunicorn as a WSGI server to generate the dynamic content of our application, we want to start the Gunicorn server using ‘/usr/local/bin/gunicorn -w 2 -b :8000 project:app’ (two worker processes, bound to port 8000) once the container starts running. Finally, the ‘depends_on’ command specifies which service(s) this service depends on. This is important for making sure the containers are started in a proper order.

The second container that is defined is the NGINX container. There are similar commands used for creating this container, but there is a new command: ports. This command exposes the specified port of the container to the specified port of the host (ie. to the outside world). Since NGINX is our web server, we’re allowing the standard HTTP port (80) to be exposed to the outside world to allow access to our application.

The third container that is defined is the persistent data volume which stores the Postgres database. The ‘command’ entry is used to override the default command for the image. Since this volume is just intended to store persistent data, there is no need to run the full Postgres initialization; running ‘true’ lets the container exit immediately while still providing its volume to the other containers.

The fourth (and last) container that is defined is the Postgres container. The Dockerfile at ./postgresql is used to build this container. This container exposes the 5432 port (the standard Postgres port) to the other containers in the network to allow access to the Postgres database. Due to the sensitive data stored in this Dockerfile, it is not kept in the git repository; instead, it is automatically generated using the script created above in the Postgres image section.

To test out the application, build the application components and then run the application in the foreground (ie. not as a daemon) by executing the following in your top-level project directory:

$ docker-compose build
$ docker-compose up

By just running ‘docker-compose up’ without any options, you are running the Docker multi-container application in the foreground. You’ll be able to see all the log information that the containers output. This can be convenient for checking the configuration of a docker-compose.yml file, but it’s preferable to run the application as a daemon (background process):

$ docker-compose up -d

One interesting thing to note is the order in which the containers are started, which we defined using the ‘depends_on’ parameters in the docker-compose.yml file:

$ docker-compose up -d
Creating flaskrecipeapp_data_1
Creating flaskrecipeapp_postgres_1
Creating flaskrecipeapp_web_1
Creating flaskrecipeapp_nginx_1

If you want to see the logs from the different containers:

$ docker-compose logs

You can also see that the individual containers are running:

$ docker ps -a

You can even see the network that Docker Compose automatically created for the application (flaskrecipeapp_default in this case):

$ docker network ls
NETWORK ID          NAME                     DRIVER              SCOPE
714ecc20febe        bridge                   bridge              local               
412dc758466a        flaskrecipeapp_default   bridge              local               
f5b6025063dd        host                     host                local               
6d1d581fe9e3        none                     null                local      

Before we can access our application via a web browser, we need to create the tables in our Postgres database. There is a script in the …/web/instance folder (outside of the git repository) that automatically creates the tables and populates some initial data. This script can be run in the context of Docker:

$ docker-compose run --rm web python ./instance/db_create.py

This script contains some text output to indicate that it was successful.

Let’s test that the application is running… find the IP address of the Docker Machine that you are running:

$ docker-machine ls

Go to your favorite web browser and enter the IP address of the Docker Machine. You should see the main page for the Flask application:

[Screenshot: main page of the Flask application]

Issues Encountered

Coming up with the proper configuration for getting this Flask web application running with Docker did have its challenges. Here are some of the issues that I encountered:

Issue #1 – Issue Encountered While Creating docker-compose.yml

The biggest issue that I encountered with getting my Flask application running in Docker was trying to initialize the Postgres database via a python script. I utilize the …/instance/ folder to store the configuration parameters for my Flask application (flask.cfg) and the script to initialize the Postgres database (db_create.py). Since these files contain a lot of sensitive information (secret key, postgres credentials, admin credentials, etc.), I do not include this folder in my git repository.

During my development work running with the Flask development server on my laptop, I’ve always been able to navigate to the top-level directory of my project and run:

> python instance/db_create.py

Honestly, I took it for granted that this just worked. Here are the sanitized contents of the ‘db_create.py’ file:

from project import db
from project.models import Recipe, User


# Drop all of the existing database tables
db.drop_all()

# Create the database and the database table
db.create_all()

# Insert user data
user1 = User(email='***', plaintext_password='***', role='user')
user2 = User(email='***', plaintext_password='***', role='user')
user3 = User(email='***', plaintext_password='***', role='user')
admin_user = User(email='***', plaintext_password='***', role='admin')
db.session.add(user1)
db.session.add(user2)
db.session.add(user3)
db.session.add(admin_user)

# Commit the changes for the users
db.session.commit()

# Insert recipe data
recipe1 = Recipe('Slow-Cooker Tacos', 'Delicious ground beef that has been simmering in taco seasoning and sauce.  Perfect with hard-shelled tortillas!', admin_user.id, False)
recipe2 = Recipe('Hamburgers', 'Classic dish elevated with pretzel buns.', admin_user.id, True)
recipe3 = Recipe('Mediterranean Chicken', 'Grilled chicken served with pitas, hummus, and sautéed vegetables.', user1.id, True)
db.session.add(recipe1)
db.session.add(recipe2)
db.session.add(recipe3)

# Commit the changes for the recipes
db.session.commit()

Here’s what happened… I created and started up the Docker containers using the docker-compose.yml file:

> docker-compose build
> docker-compose up -d

Before being able to actually use the Flask application, the Postgres database must be configured, which is why I run the …/instance/db_create.py file:

> docker-compose run --rm web python instance/db_create.py
Traceback (most recent call last):
  File "./instance/db_create.py", line 6, in <module>
    from project import db
ImportError: No module named 'project'

What!??! I was not expecting to see an error when I ran this script. This error message is saying that the ‘project’ module cannot be found. I had always been trying to avoid this type of error, which is why I was running the script from the ./web/ directory:

> python instance/db_create.py

As opposed to running within the ./web/instance/ directory:

> cd instance
> python db_create.py

By running in the top-level directory, I was expecting the python interpreter to recognize the …/project/ folder as a python module since it included an __init__.py file.

So why is this not the case when running in a Docker container? I’m not exactly clear why it is implemented this way, but running the following command causes the /usr/src/app/instance/ folder (instead of the working directory) to be added to the list of paths that the python interpreter searches through (sys.path). I was really surprised to see this when I added some debug statements to the …/instance/db_create.py script:

root@f132890a7897:/usr/src/app# python ./instance/db_create.py 
sys.path: ['/usr/src/app/instance', '/usr/local/lib/python34.zip', '/usr/local/lib/python3.4', '/usr/local/lib/python3.4/plat-linux', '/usr/local/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/site-packages']

I really expected the first folder listed to be /usr/src/app instead of /usr/src/app/instance.
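
For reference, debug output like the above only requires a couple of lines at the top of the script (a minimal sketch):

import sys

# Print the module search path the interpreter is actually using
print('sys.path: {}'.format(sys.path))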

Luckily, python provides the tools to solve almost any problem, so I updated …/instance/db_create.py to check whether the top-level directory (/usr/src/app) is in sys.path and to add it if it is not:

# Check sys.path before beginning to ensure that the top-level
# directory is included.  If not, append the top-level directory.
# This allows the modules within the .../project/ directory to be discovered.
import sys
import os

if os.path.abspath(os.curdir) not in sys.path:
    print('...missing directory in PYTHONPATH... added!')
    sys.path.append(os.path.abspath(os.curdir))

<.. rest of file …>

Problem solved, though this took a lot more trial-and-error than I expected would be needed with Docker. Lesson learned (again!)… no technology is perfect for every situation.

Issue #2 – Passing environment variables to a Docker container or image

I honestly don’t know why this was so challenging, but I found many different syntax examples online for setting environment variables, either in docker-compose.yml or in a Dockerfile. After sorting through the noise, I found that setting environment variables in a Dockerfile using the following format was the most straightforward and worked successfully:

ENV POSTGRES_DB flask_family_recipes_db
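
For completeness, here is what the equivalent setting looks like in docker-compose.yml (a sketch; the service name ‘postgresql’ is an assumption based on this project’s directory layout):

postgresql:
  build: ./postgresql
  environment:
    - POSTGRES_DB=flask_family_recipes_db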

Conclusion

Docker is awesome! The frustrations encountered while trying to configure my Flask application to work with Docker were just part of the learning curve. The excitement of creating a fully functioning web application that runs locally on your laptop is amazing. This isn’t just a development version of your application; it is the production version.

In my next blog post, I’ll discuss how to take this application and deploy it to a production server on DigitalOcean.

The source code for this application can be found on GitLab.

Reference – Key Commands when Running Docker and Docker Compose

Build all of the images in preparation for running your application:
$ docker-compose build

Using Docker Compose to run the multi-container application (in daemon mode):
$ docker-compose up -d

View the logs from the different running containers:
$ docker-compose logs

Stop all of the containers that were started by Docker Compose:
$ docker-compose stop

Run a command in a specific container:
$ docker-compose run --rm web python ./instance/db_create.py
$ docker-compose run web bash

Check the containers that are running:
$ docker ps

Stop all running containers:
$ docker stop $(docker ps -a -q)

Delete all containers (running or stopped):
$ docker rm $(docker ps -a -q)

Delete all untagged Docker images (untagged images are listed as '<none>'):
$ docker rmi $(docker images | grep "^<none>" | awk '{print $3}')

References

Dockerizing Flask With Compose and Machine – From Localhost to the Cloud (from Real Python Blog):
https://realpython.com/blog/python/dockerizing-flask-with-compose-and-machine-from-localhost-to-the-cloud/

Docker Documentation:
https://docs.docker.com

Docker Compose Documentation:
https://docs.docker.com/compose/gettingstarted/
https://docs.docker.com/compose/compose-file/

Docker for Beginners:
https://prakhar.me/docker-curriculum/

Docker for Developers (eBook) by Chris Tankersley:
http://leanpub.com/dockerfordevs ($19.99)

How to Configure NGINX for a Flask Web Application

Introduction

In this blog post, I’ll be explaining what NGINX is and how to configure it for serving a Flask web application. This blog post is part of a larger series on deploying Flask applications. There is a lot of documentation about NGINX and how to configure it, but I wanted to dive into the details of how NGINX is actually used in a Flask web application. I’ve found the configuration of NGINX to be a bit confusing, as a lot of the documentation simply shows configuration files without explaining what each directive does. Hopefully this blog post provides some clarity on configuring NGINX for your application.

What is NGINX?

From the NGINX (pronounced ‘engine-X’) website, here is the high-level description of the tool:

NGINX is a free, open-source, high-performance HTTP server and reverse proxy, as well as an IMAP/POP3 proxy server. NGINX is known for its high performance, stability, rich feature set, simple configuration, and low resource consumption.

Let’s expand on this description… NGINX is a server that handles HTTP requests for your web application. For a typical web application, NGINX can be configured to perform the following with these HTTP requests:

  • Reverse proxy the request to an upstream server (such as Gunicorn, uWSGI, Apache, etc.)
  • Serve static content (Javascript files, CSS files, images, documents, static HTML files)

NGINX also provides a load balancing capability to allow requests to be serviced by multiple upstream servers, but that functionality is not discussed in this blog post.

Here’s a diagram illustrating how NGINX fits into a Flask web application:

nginx-in-production-environment

NGINX handles the HTTP requests that come in from the internet (ie. the users of your application). Based on how you configure NGINX, it can directly provide the static content (Javascript files, CSS files, images, documents, static HTML files) back to the requester. Additionally, it can reverse proxy requests to your WSGI (Web Server Gateway Interface) server, allowing the dynamic content (HTML) generated by your Flask web application to be delivered back to the user.

This diagram assumes the use of Docker, but the configuration of NGINX would be very similar if not using Docker (just omit the concept of containers from the diagram).

Why do you need NGINX and Gunicorn?

NGINX is an HTTP server that is used in lots of different application stacks. It performs a lot of functions, but it is not able to directly interface with a Flask application. That is where Gunicorn comes into play. HTTP requests are received by NGINX and passed along to Gunicorn to be processed by your Flask application (think of the routes defined in your views.py). Gunicorn is a WSGI server that handles HTTP requests and routes them to any WSGI-compliant python application, such as one built with Flask, Django, Pyramid, etc.
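
To make the hand-off concrete, here is roughly how the Gunicorn side is started (a sketch; it assumes run.py exposes the Flask application object as ‘app’):

$ gunicorn --bind 0.0.0.0:8000 run:app

NGINX then reverse proxies incoming requests to port 8000, which is exactly what the proxy_pass directive configures later in this post.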

Structure of NGINX Configuration Files

NOTE: This blog post uses NGINX v1.11.3. The configuration files could be located at different locations depending on your specific version of NGINX, such as /opt/nginx/conf/.

Depending on how you installed or are using NGINX, the structure of the configuration files will be slightly different. Both structures are presented below…

Structure 1

If you compile NGINX from the source code or use an official Docker image, then the configuration files are located at: /etc/nginx/ and the main configuration file is /etc/nginx/nginx.conf. At the bottom of /etc/nginx/nginx.conf is a line to include any additional configuration files located in the /etc/nginx/conf.d/ directory:

  • include /etc/nginx/conf.d/*.conf;

Structure 2

If you installed NGINX using a package manager (such as apt-get on Ubuntu), then you will also have the following sub-directories in the /etc/nginx/ directory:

  • sites-available – contains the different configuration files, often for different sites
  • sites-enabled – contains a symbolic link to a file defined in sites-available

These directories are holdovers from Apache that have been applied to the configuration of NGINX.
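
For example, enabling a site under this structure is just a matter of creating that symbolic link and reloading NGINX (a sketch; the file name myapp.conf is hypothetical):

$ sudo ln -s /etc/nginx/sites-available/myapp.conf /etc/nginx/sites-enabled/
$ sudo nginx -s reload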

Since the Flask applications that we’re developing are using Docker, we’ll be focusing on ‘Structure 1’ in this blog post.

NGINX Configuration

The top-level configuration file for NGINX is nginx.conf. NGINX allows for multiple layers of configuration files, which allows a lot of flexibility in configuring it just right for your application. For specific details about a parameter, the NGINX documentation provides a nice reference.

The configuration parameters for NGINX are grouped into blocks. Here are the blocks that we’ll be working with in this blog post:

  • Main – defined in nginx.conf (anything not defined in a block)
  • Events – defined in nginx.conf
  • Http – defined in nginx.conf
  • Server – defined in _application_name_.conf

The breakdown of these blocks into different files allows you to define the high-level configuration parameters of NGINX in nginx.conf and the specific parameters for a virtual host(s)/server(s) to be in a *.conf file(s) that is specific to your web application.

Details of nginx.conf

The default version of nginx.conf that comes with the installation of NGINX is a good starting point for most servers. Let’s investigate the details of nginx.conf and see how to expand upon the default settings…

Main Section

The main section (ie. configuration parameters not defined within blocks) of nginx.conf is:

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

The first parameter (user) defines the user that will own and run the Nginx server. This default value is good to use, especially when working with NGINX via a Docker container.

The second parameter (worker_processes) defines the number of worker processes. A recommended value for this parameter is the number of cores that are being used by your server. For a basic virtual private server (VPS), the default value of 1 is a good choice. Increment this number as you expand the performance of your VPS.

The third parameter (error_log) defines the location on the file system of the error log, plus the minimum severity of messages to log. The default value for this parameter is good.

The fourth parameter (pid) defines the file that will store the process ID of the main NGINX process. No need to change this default value.

events Block

The events block defines the parameters that affect connection processing. The events block is the first block in the nginx.conf file:

events {
    worker_connections  1024;
}

This block has a single parameter (worker_connections), which defines the maximum number of simultaneous connections that can be opened by a worker process. The default value for this parameter is good, but keep in mind that the limit applies per worker and covers all connections (both connections from users requesting pages and connections to the WSGI server).
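
As a rough rule of thumb (an approximation, since proxied connections to the WSGI server count against the same limit):

max_clients ≈ worker_processes × worker_connections
            = 1 × 1024
            = 1024 simultaneous connections

If every client request also opens a proxied connection to Gunicorn, the effective number of clients is roughly half of that.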

http Block

The http block defines a number of parameters for how NGINX should handle HTTP web traffic. The http block is the second block in the nginx.conf file:

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

The first parameter (include) specifies a configuration file to include, which is located at /etc/nginx/mime.types. This configuration file defines a long list of file types that are supported by NGINX. The default value should be kept for this parameter.

The second parameter (default_type) specifies the default file type that is returned to the user. For a Flask application that is generating dynamic HTML files, this parameter should be changed to: default_type text/html;

The third parameter (log_format) specifies the format of log messages. The default value should be kept for this parameter.

The fourth parameter (access_log) specifies the location of the log of access attempts to NGINX. The default value should be kept for this parameter.

The fifth parameter (sendfile) and sixth parameter (tcp_nopush) start to get a bit more complicated. See this blog post about optimizing NGINX to get more details on these parameters (plus tcp_nodelay). Since we’re planning to use NGINX to deliver static content, we should set these parameters as such:

    sendfile        on;
    tcp_nopush     on;
    tcp_nodelay    on;

The seventh parameter (keepalive_timeout) defines the timeout value for keep-alive connections with the client. The default value should be kept for this parameter.

The eighth parameter (gzip) defines the usage of the gzip compression algorithm to reduce the amount of data to transmit. This reduction in data size is offset by an increase in processing needed to perform the compression. The default value (off) should be kept for this parameter.

The ninth (and last) parameter (include) defines additional configuration files (ending in *.conf) from /etc/nginx/conf.d/. We’ll now see how to use these additional configuration files to define the serving of static content and to define the reverse proxy to our WSGI server.

Final Configuration of nginx.conf

By taking the default version of nginx.conf and adjusting a few parameters for our needs (plus adding comments), here is the final version of nginx.conf:

# Define the user that will own and run the Nginx server
user  nginx;
# Define the number of worker processes; recommended value is the number of
# cores that are being used by your server
worker_processes  1;

# Define the location on the file system of the error log, plus the minimum
# severity to log messages for
error_log  /var/log/nginx/error.log warn;
# Define the file that will store the process ID of the main NGINX process
pid        /var/run/nginx.pid;


# events block defines the parameters that affect connection processing.
events {
   # Define the maximum number of simultaneous connections that can be opened by a worker process
   worker_connections  1024;
}


# http block defines the parameters for how NGINX should handle HTTP web traffic
http {
   # Include the file defining the list of file types that are supported by NGINX
   include       /etc/nginx/mime.types;
   # Define the default file type that is returned to the user
   default_type  text/html;

   # Define the format of log messages.
   log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"';

   # Define the location of the log of access attempts to NGINX
   access_log  /var/log/nginx/access.log  main;

   # Define the parameters to optimize the delivery of static content
   sendfile        on;
   tcp_nopush     on;
   tcp_nodelay    on;

   # Define the timeout value for keep-alive connections with the client
   keepalive_timeout  65;

   # Define the usage of the gzip compression algorithm to reduce the amount of data to transmit
   #gzip  on;

   # Include additional parameters for virtual host(s)/server(s)
   include /etc/nginx/conf.d/*.conf;
}

Configuring NGINX for Serving Static Content and as a Reverse Proxy

If you look at the default version of /etc/nginx/conf.d/default.conf, it defines the server block and provides a simple configuration with a lot of options to uncomment if you choose. Instead of going through each item in this file, let’s discuss the key parameters that are needed for configuring NGINX to deliver static content and for reverse proxying requests to our WSGI server. Here are the recommended contents of _application_name_.conf:

# Define the parameters for a specific virtual host/server
server {
   # Define the directory where the contents being requested are stored
   # root /usr/src/app/project/;

   # Define the default page that will be served if no page was requested
   # (ie. if www.kennedyfamilyrecipes.com is requested)
   # index index.html;

   # Define the server name, IP address, and/or port of the server
   listen 80;
   # server_name xxx.yyy.zzz.aaa

   # Define the specified charset to the “Content-Type” response header field
   charset utf-8;

   # Configure NGINX to deliver static content from the specified folder
   location /static {
       alias /usr/src/app/project/static;
   }

   # Configure NGINX to reverse proxy HTTP requests to the upstream server (Gunicorn (WSGI server))
   location / {
       # Define the location of the proxy server to send the request to
       proxy_pass http://web:8000;

       # Redefine the header fields that NGINX sends to the upstream server
       proxy_set_header Host $host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

       # Define the maximum file size on file uploads
       client_max_body_size 5M;
   }
}

The server block defines the parameters for a specific virtual host/server, which is typically the single web application that you are hosting on a VPS.

The first parameter (root) defines the directory where the contents being requested are stored. NGINX will start looking in this directory when it receives a request from a user. This parameter should be commented out, as it is unnecessary for this configuration since there is a default location of ‘/’ defined.

The second parameter (index) defines the default page that will be served if no page was requested (ie. if www.kennedyfamilyrecipes.com is requested). This parameter should be commented out, as we want all dynamic content, including the main page, to be generated by our Flask web application.

The first two parameters (root and index) are included in this configuration file, as they can be useful for some configurations of NGINX.

The third parameter (listen) and fourth parameter (server_name) should be used together. If you have a single web application being served, then you can simply specify the address to listen on (note: a port does not need to be specified, as it will default to port 80):

server {
   …
   listen 192.241.229.181;
   …
}

If you want requests for blog.kennedyfamilyrecipes.com to be served by a different Flask application than the standard www.kennedyfamilyrecipes.com, then you will need separate ‘server’ blocks using ‘server_name’ and ‘listen’:

server {
    listen 80;
    server_name *.kennedyfamilyrecipes.com;

    . . .

}

server {
    listen 80;
    server_name blog.kennedyfamilyrecipes.com;

    . . .

}

NGINX will always select the ‘server_name’ that is the best match for the request. This means that a request for ‘blog.kennedyfamilyrecipes.com’ will be a better match to ‘blog.kennedyfamilyrecipes.com’ than ‘*.kennedyfamilyrecipes.com’.

The fifth parameter (charset) defines the charset that is added to the “Content-Type” response header field. This value should be set to ‘utf-8’.

The first ‘location’ block defines where NGINX should deliver static content from:

   location /static {
       alias /usr/src/app/project/static;
   }

The location block defines how to process the requested URI (the part of the request that comes after the domain name or IP address/port). In this first location block (/static), we are specifying that NGINX should retrieve files from the ‘/usr/src/app/project/static’ directory on the server when a request comes in for www.kennedyfamilyrecipes.com/static/. For example, a request for www.kennedyfamilyrecipes.com/static/img/img_1203.jpg will return the image located at /usr/src/app/project/static/img/img_1203.jpg. If this file does not exist, then a 404 (NOT FOUND) error code will be returned to the user.
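
A quick way to sanity-check this mapping (assuming the application is running locally on port 80 and the example image actually exists) is to request the file and inspect only the response headers:

$ curl -I http://localhost/static/img/img_1203.jpg

A 200 response means NGINX found the file under the alias directory and served it directly; a 404 means the mapping (or the file) is missing.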

The second location block (‘/’) defines the reverse proxy. This location block defines how NGINX should pass these requests to the WSGI server (Gunicorn) that can interface with our Flask application. Let’s look at each parameter in more detail:

   location / {
       proxy_pass http://web:8000;
       proxy_set_header Host $host;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       client_max_body_size 5M;
   }

The first parameter (proxy_pass) in this location block defines the location of the proxy server to send the request to. If you just want to pass the request to a local server running on the same machine:

proxy_pass http://localhost:8000/;

If you want to pass the request to a specific Unix socket, such as when you have NGINX and Gunicorn running on the same server:

proxy_pass http://unix:/tmp/backend.socket:/;

If you are using NGINX as a Docker container that is talking to a Gunicorn container, then you simply need to include the name of the service running Gunicorn:

proxy_pass http://web:8000;
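
This works because Docker Compose makes each service reachable by its service name on the network it creates. Here is a sketch of the relevant docker-compose.yml fragment (illustrative, not the project’s exact file):

# 'web' is resolvable as a hostname from the nginx container,
# which is why proxy_pass http://web:8000 works
web:
  build: ./web

nginx:
  build: ./nginx
  ports:
    - "80:80"
  links:
    - web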

The second parameter (proxy_set_header) allows you to redefine the header fields that NGINX sends to the upstream server (ie. Gunicorn). This parameter is used four times to define:

  • the name and port of the NGINX server (Host $host)
  • the schema of the original client request, as in whether it was an http or an https request (X-Forwarded-Proto $scheme)
  • the IP address of the user (X-Real-IP $remote_addr)
  • the IP addresses of every server the client has been proxied through up to this point (X-Forwarded-For $proxy_add_x_forwarded_for)

The third parameter (client_max_body_size) defines the maximum size for files being uploaded, which is critical if your web application allows file uploads. Given that images are often around 2 MB in size, a value of 5 MB provides enough flexibility to support almost any image.

Conclusion

This blog post described what the NGINX server does and how to configure it for a Flask application. NGINX is a key component to most web applications as it serves static content to the user, reverse proxies requests to an upstream server (WSGI server in our Flask web application), and provides load balancing (not discussed in detail in this blog post). Hopefully, the configuration of NGINX is easier to understand after reading this blog post!

References

How to Configure NGINX (Linode)

NGINX Wiki

NGINX Pitfalls and Common Mistakes

How to Configure the NGINX Web Server on a VPS (DigitalOcean)

Understanding NGINX Server and Location Block Selection Algorithms (DigitalOcean)

NGINX Optimization: Understanding sendfile, tcp_nodelay, and tcp_nopush

Why I Switched from a Traditional Deployment to Using Docker

Introduction

Honestly, my motivation for looking into Docker was to see what all the hype was about… is the hype true? Can Docker really help out with my development experience? Can Docker make deploying a web application easier?

From what I’ve experienced, Docker is an amazing tool that can greatly help with developing, testing, and deploying software. I’ve heard a lot of people talk about how Docker is an ever changing ecosystem, which makes it tough to keep up to date with the tool (seriously, it feels like any blog post/video/book that is older than six months is out of date when it comes to Docker!). I’d argue that this makes it an exciting time to learn about Docker and how it can be a great addition to your development and production environments. Given the short time period that Docker has been out, I think it has a huge amount of capability (Docker, Docker Hub, Docker Compose, Docker Swarm, etc.).

In this blog post, I’ll discuss why I’m switching from a traditional deployment strategy to using Docker to simplify the deployment process. I’ll discuss what I’m calling a “traditional” deployment and how Docker can simplify this process, with a focus on Flask web applications. I’ve found that Docker allows you to move past a lot of the headaches of traditional deployments, while also allowing you to place your overall application environment under configuration control.

Traditional Deployment

The traditional deployment process that I’ve followed in the past involved using a development web server (built-in to the Flask framework) on my laptop, but then using a production web server (combination of Nginx and Gunicorn) on my production server. While this process works, it is by no means an easy process to switch from running in the development environment to the production environment. It usually felt like the challenges in developing a Flask web application paled in comparison to getting the production environment configured and working.

Since this always felt like a big effort to deploy to a production server, I usually waited until my web application was of a decent maturity before taking on this endeavor (there might be a bit of procrastination thrown in too!). As with a lot of software projects, waiting to tackle the really high-risk items until later in the project is usually a bad idea, even with a simple personal project.

Here’s a diagram that shows a high-level view of what I’m calling a traditional deployment:

traditional-deployment

For my development, I would utilize the development server that is built-in to the Flask framework for testing my application. The Flask development server is sufficient for this type of work, but it is not a good choice in a production environment due to it not scaling well and serving only a single request at a time (see Flask documentation). As with most web applications, a database is needed for storing persistent data in a structured manner. I’ve been using Postgres as my database, which requires having a Postgres server running on your laptop to interface with the Postgres database. This is one of the areas of setting up a development environment that can be a nightmare, as the installation of Postgres can be challenging (especially before discovering Postgres.app).

When switching over to the production server, there are a ton of new technologies to learn and get configured together. Instead of using the Flask development web server, I use Gunicorn as the WSGI server and Nginx as the web server. Why are both needed? Nginx is the front-end web server that serves up static content (images, documents, css files, etc.) and reverse-proxies requests to Gunicorn. Gunicorn provides the connection to the actual Flask web application that has been developed. Nginx has no way to directly talk to your Flask application, so Gunicorn provides this middle layer (called a WSGI (Web Server Gateway Interface) server). As with the development server, a Postgres server and database are utilized.

So what are the issues with this traditional development process that would make me want to consider a better or simpler approach? I’ve found that there are three major problems with this traditional deployment process:

  • Different Environments: having a different environment for my development work on my laptop vs. on the production server
  • Configuration Management: not being able to easily place the application environment under configuration management
  • Collaboration: not being able to easily replicate an application environment (either development or production) for a colleague/friend

As it turns out, Docker is a great solution for solving each of these issues!

Using Docker for Development and Deployment

Switching to using Docker does not preclude you from using the development environment described above in the traditional deployment section. In fact, this is always a good option for starting off development of a new web application.

However, Docker provides a much easier solution for structuring web applications. One of the side effects that I’ve really enjoyed about Docker is having to think about the overall architecture of my web application early in the development process. Docker forces you to understand how all the pieces of your web application need to work together in a production environment, because you can easily jump right to your production configuration on your laptop. This is one of the great things about new technologies like Docker; they simplify a difficult process and allow you to focus on developing better applications/tools/services.

Here’s a high-level diagram that shows how I use Docker for my development and production environments:

docker-for-deployments

There are four containers that interface with each other to encapsulate the full web application:

  1. Web application container – contains the Flask web application plus all of the dependent modules; this container includes the Gunicorn WSGI server
  2. Nginx container – runs an official Nginx image with our configuration
  3. Postgres container – runs an official Postgres image with our configuration
  4. Data volume – container for storing the Postgres database
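
To make this concrete, here is a minimal docker-compose.yml sketch that ties these four containers together (the directory names follow this project’s repository layout; the restart policies, ports, image version, and data-volume pattern are illustrative assumptions):

web:
  restart: always
  build: ./web
  expose:
    - "8000"
  links:
    - postgres

nginx:
  restart: always
  build: ./nginx
  ports:
    - "80:80"
  links:
    - web

postgres:
  restart: always
  build: ./postgresql
  volumes_from:
    - data

data:
  image: postgres:9.6
  volumes:
    - /var/lib/postgresql
  command: "true"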

Here’s where Docker becomes so powerful… the environment that we configure for using in our development environment (such as your laptop) is the same as what gets deployed to the production server!

Let’s see how Docker solves the three major problems with traditional deployments that were identified in the previous section:

  • Different Environments: Docker allows you to come up with your “production” environment on your development machine (your laptop, for example) and then deploy this exact configuration to your production server. There are no differences between these two environments!
  • Configuration Management: The Docker configuration files (Dockerfiles and docker-compose.yml) are included in your git repository and they allow you to exactly define the configuration of your overall web application. You can specify the exact version of each image to use, such as Nginx 1.11.3.
  • Collaboration: Since the Docker configuration files are stored in your git repository on GitLab (or GitHub or BitBucket), it is easy for a colleague/friend to clone the repository and set up the same application environment on their machine using Docker (see the example below).
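
For example, with placeholders for the repository URL and directory, the entire collaboration setup boils down to:

$ git clone <your-repository-url>
$ cd <project-directory>
$ docker-compose build
$ docker-compose up -d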

While Docker solves a lot of problems with a traditional deployment process, there are still a number of technologies that you need to understand to do a deployment:

  • How to secure a web server
  • What Nginx is and how to configure it
  • Why you need a WSGI server, such as Gunicorn, in your web application
  • How to configure and work with Postgres

Conclusion

After spending a good deal of time researching Docker and actually using it to implement the application environment for a Flask web application, I’m a believer in Docker! The hype is real and Docker is a maturing technology that has so many advantages over traditional deployments. I really feel like Docker helps to reduce a lot of the bottlenecks that can be really frustrating with deploying a web application.

This blog post discussed the WHY of using Docker for web applications, and the next blog post will dive into the details of HOW to use Docker to create an application environment on your development machine that can be deployed to a production server.

References

Full Stack Python Guide to Deployments by Matt Makai
This is a great book that is focused more on traditional deployments, but it provides such a great base for how to deploy web applications. I’ve used this book as the basis for deploying a Django and a Flask web application to DigitalOcean. I learned a ton from this book!

Docker for Developers by Chris Tankersley
I would recommend reading this book as the first step in learning about Docker. This book uses examples for running a PHP application, but that doesn’t take away from the great job it does of explaining the technology leading up to Docker and what all the different components of the Docker ecosystem are.

Dockerizing Flask With Compose and Machine – From Localhost to the Cloud (from Real Python Blog):
This blog post provides a really concise example of how to use Docker for a Flask application. I gained a lot of knowledge of how to configure a Flask application from this blog post and working through the example presented in this blog post was my first “Ah ha!” with Docker.

Docker Documentation:
https://docs.docker.com

Docker for Beginners by Prakhar Srivastav
Excellent blog post on how to go from just using a Dockerfile for a single container to using Docker Compose to create a full application with multiple containers.
