In this post we’ll go through the entire setup of a basic python project. It will cover, in no particular order:

  • Directory structure :link:
  • Virtual environments with pipenv :link:
  • The vscode IDE, which integrates really well with python :link:
  • Enforcing coding rules with static code analysis tools :link:
    • Pylint: looks for programming errors, helps enforce a coding standard, sniffs for code smells and offers simple refactoring suggestions :link:
    • mypy: a static type checker :link:
  • Code formatter black :link:
  • Unit tests with pytest :link:
  • Code coverage with coverage.py :link:
  • Prepare project for packaging :link:
  • git as a revision control system :link:
  • Continuous integration with github action :link:
  • Documentation with sphinx :link:
  • Next steps :link:

I took inspiration from famous python repositories like scikit-learn, Flask, Keras, Sentry, Django, Ansible, Tornado, Pandas, and also from this darker repository, hoping that the tools they use are durable and scale well to most python projects.

This post is not a complete walk-through tutorial; its aim is to give you a starting point if you are relatively new to python and are looking for good practices on how to structure a python project. I also give a bunch of links if you want to dig deeper or learn more about alternatives.

The order of the post may look messy (yes, in fact it is), so feel free to jump straight to the part you are interested in by clicking on the links in the index table above.

My requirements

Here are some of my requirements:

  • Cross platform (Windows, macOS and Linux)
  • Flawless integration with the IDE
  • One environment per project
  • Suitable for various python projects (web app, desktop app, framework …)

Directory structure

A basic python project looks something like this:

.
├── .github
│   └── workflows
│       └── test.yaml
├── .vscode
│   ├── extensions.json
│   ├── launch.json
│   └── settings.json
├── docs
│   ├── Makefile
│   ├── conf.py
│   └── index.rst
├── src
│   └── mypkg
│       ├── __init__.py
│       ├── app.py
│       └── view.py
├── tests
│   ├── __init__.py
│   ├── foo
│   │   ├── __init__.py
│   │   └── test_view.py
│   └── bar
│       ├── __init__.py
│       └── test_view.py
├── .gitignore
├── LICENSE
├── Pipfile
├── Pipfile.lock
├── README.md
├── setup.py
└── setup.cfg

As you might suppose, none of the files or directories are chosen randomly. You'll learn more about these choices while reading the post. It's worth noting that this structure should be familiar to most programmers working with github and python. Indeed, inspiration for the structure comes from the pytest good practices, and from the repositories previously listed.

Just one reminder when naming your files and directories: avoid spaces!

Pipenv

Pipenv automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the Pipfile.lock, which is used to produce deterministic builds.

Pipenv: setup

For installation you may refer to the official procedure.

I personally used brew to install pipenv, and thus to handle python too. Since pipenv can manage different python versions via pyenv, it's preferable to set it up globally instead of installing it only for a specific python version using pip.

brew install pipenv

I tested it on the ubuntu bionic distribution, but it also works on macOS and on Windows with WSL. In my case, I had to add some paths to the ~/.bashrc file.

:warning: with the integrated vscode terminal, it appears that sourcing the ~/.bashrc file or equivalent is not sufficient. It also seems that vscode shares the same terminal instance across windows.

The Brewfile configuration file makes the setup task easy. Feel free to look at this example file.

You can check version and installation path of your current python installation by running

python3 -c $'import sys; print(sys.version); print(sys.executable)'

You also might want to create aliases in your ~/.bashrc or equivalent to run python3 by default:

alias python=python3
alias pip=pip3

Pipenv: custom settings

In my case I wanted the virtual environment folder to be in the project directory. For that, pipenv offers a configuration which can be activated via the PIPENV_VENV_IN_PROJECT environment variable. Just set it in your ~/.bashrc file or equivalent.

export PIPENV_VENV_IN_PROJECT=1

See the doc for details, as well as this tutorial and this issue.

And finally you can set your .vscode/settings.json with a fixed virtual environment folder across projects:

"python.pythonPath": ".venv/bin/python"

Pipenv: some basic commands

If you are completely new to pipenv, I highly recommend this short tutorial.

| Description | Shell |
| ----------- | ----- |
| Open the virtual environment | cd <project_directory> then pipenv shell |
| Exit the virtual environment | exit |
| First setup, install all packages | pipenv install --dev |
| Add a package | pipenv install <package> --dev |
| Import dependencies from requirements.txt | pipenv install -r <requirements.txt> |
| Install dependencies from Pipfile.lock | pipenv sync |
| Install dependencies from Pipfile.lock on the system (for docker) | pipenv install --system --deploy --ignore-pipfile |
| Rebuild the dev environment | pipenv --rm then pipenv install --dev |
| Upgrade packages | pipenv update |
| Update the lock file | pipenv lock |
| Display dependencies in the requirements.txt fashion | pipenv lock -r |

:warning: Locking may take a lot of time (you will see Locking [packages] dependencies...).
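For reference, here is a minimal sketch of what a Pipfile looks like after a few installs (the package names are purely illustrative):

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]
pylint = "*"
pytest = "*"

[packages]
requests = "*"

[requires]
python_version = "3.8"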

Visual Studio Code

Visual Studio Code is a versatile code editor which natively integrates with python.

Some of its advantages are:

  • Cross platform (Windows, macOS and Linux)
  • Built-in pythonic functionalities
  • Advanced customization settings
  • Common programming tasks like renaming, code snippets, and other editing sugar
  • Also integrates with liveshare, which makes remote pair programming possible!

We present two more features that are worth noting: debugging and workspace settings.

Visual Studio Code: Debugging tools

Apart from the great vscode debugging support, vscode also supports Jupyter Notebooks natively through the Python Interactive window, which enables you to:

  • Work with Jupyter-like code cells
  • Run code in the Python Interactive Window
  • View, inspect, and filter variables using the Variable explorer and data viewer
  • Debug a Jupyter notebook
  • Export a Jupyter notebook

For the vscode interactive window to be active, you need these three packages: jupyter, ipykernel and notebook. Here are the installation commands with pipenv:

# Jupyter
pipenv install jupyter --dev
# ipykernel
pipenv install ipykernel --dev
# notebook
pipenv install notebook --dev
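Once installed, any python file can be split into Jupyter-like code cells with the # %% marker, for example:

# scratch.py: each "# %%" starts a new cell in the Python Interactive window
# %%
import sys

print(sys.version)

# %%
print("this runs as a separate cell")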

Visual Studio Code: Workspace settings

Workspace settings are specific to a project; they make the development process easier and easily shareable with others. Configuration is done through files located in the .vscode folder at the root. Here we present two of them:

  • settings.json
  • launch.json

settings.json gathers all general settings specific to the current project, while launch.json specifies debugging scenarios. One cool thing about launch.json is that it supports platform-specific properties, which means you can have specific launch commands depending on your OS.
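Here is a minimal launch.json sketch with a Windows-specific override (the program path is just an example):

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: app",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/src/mypkg/app.py",
            "console": "integratedTerminal",
            // platform-specific properties: applied only on Windows
            "windows": {
                "program": "${workspaceFolder}\\src\\mypkg\\app.py"
            }
        }
    ]
}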

Linters

Linting enforces coding rules by highlighting syntactical and stylistic problems in your Python source code. It often helps you identify and correct subtle programming errors or unconventional coding practices that can lead to errors.

More details from code.visualstudio.com:

For example, linting detects use of an uninitialized or undefined variable, calls to undefined functions, missing parentheses, and even more subtle issues such as attempting to redefine built-in types or functions. Linting is thus distinct from Formatting because linting analyzes how the code runs and detects errors whereas formatting only restructures how code appears.

Linters: pylint

By looking for programming errors, pylint helps enforce a coding standard. It also gives simple refactoring suggestions. pylint is the default linter for vscode.

Installation via pipenv

pipenv install pylint --dev

And add this configuration line to the vscode settings.json file

"python.linting.enabled": true

Other linters are available out there: flake8 and pep8, just to mention two of them.

Linters: mypy

mypy is a static type checker. Type hints were introduced in Python 3.5, but since the Python runtime does not enforce function and variable type annotations, a type checker is needed if you want type checking.

Installation via pipenv

pipenv install mypy --dev

Update the vscode settings.json file

"python.linting.mypyEnabled": true

For its configuration, mypy uses the mypy.ini file by default, with a fallback to setup.cfg.
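To give an idea of what mypy catches, here is a small illustrative example:

# example.py
def greet(name: str) -> str:
    return "Hello " + name

# mypy reports: Argument 1 to "greet" has incompatible type "int"; expected "str"
greet(42)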

Formatter

Restructure your code with a formatting tool.

Black

black is a code formatter which does not require configuration. It integrates with vscode as well.

pipenv install black --dev --pre

Using the --pre switch was mandatory because black does not have a stable release yet. See this issue for more information.

To make it work with vscode I added these configuration lines in settings.json

"[python]": {
   "editor.formatOnSave": true
},
"python.formatting.provider": "black"

It’s also possible to sort Python import definitions alphabetically with isort.

Another useful way to share coding styles across IDEs is the .editorconfig file, see this for more info.
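A minimal .editorconfig sketch for a python project (the values are just an example):

# .editorconfig
root = true

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[*.py]
indent_style = space
indent_size = 4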

Testing

Pytest

pytest is a full-featured Python testing tool. It is already used by a lot of repositories.

Installation with pipenv

pipenv install pytest --dev

Then update the vscode settings.json file with these lines

"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [
   "tests"
]

To customize pytest, your configuration must go in one of these files: pytest.ini, tox.ini or setup.cfg.

For discovery, pytest searches for files named test_*.py or *_test.py, and then looks for functions and methods prefixed by test. See this for the full explanation. As an alternative, pytest can also natively discover unittest and nose tests.
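As an illustration, a minimal test file for the structure above could look like this (render is a hypothetical function of mypkg.view):

# tests/foo/test_view.py
from mypkg.view import render  # hypothetical function


def test_render_returns_a_string():
    assert isinstance(render(), str)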

I could not omit to mention tox, a tool that automates and standardizes testing in Python. It integrates easily with pytest. What does tox do? Basically, it creates a virtual environment and runs the tests for you, as well as checking the package installation. Consequently, it will make your life easier when you go for a continuous integration workflow.
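A minimal tox.ini sketch for the layout above (the python version is just an example):

# tox.ini
[tox]
envlist = py38

[testenv]
deps = pytest
commands = pytest tests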

Code coverage

Coverage measurement is used to gauge the effectiveness of tests. It can show which parts of your code are being exercised by tests, and which are not.

For this task we use Coverage.py. Here is how to install and run it with pipenv.

pipenv install coverage --dev
pipenv shell
coverage erase  # clears previous data if any
coverage run --source=src -m pytest
coverage report  # prints to stdout
coverage html  # creates ./htmlcov/*.html including annotated source

We can then upload the report as an artifact with a github action, which enables us to download the coverage report from the github actions tab.
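As a sketch, the corresponding step in .github/workflows/test.yaml could look like this (the artifact name is arbitrary):

- name: Upload coverage report
  uses: actions/upload-artifact@v2
  with:
    name: coverage-report
    path: htmlcov/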

Github also integrates with codecov, which makes it easier to visualize reports.

You can also add a comment on a pull request with this action.

Documentation

Sphinx is a tool that makes it easy to create intelligent and beautiful documentation. Originally created for the Python documentation, it's used by a wide range of projects. It can output the documentation in HTML and LaTeX (among other formats).

Sphinx has a lot of built-in extensions, just to name a few interesting ones:

  • sphinx.ext.coverage: collect doc coverage stats
  • sphinx.ext.graphviz: adds graphviz graph support; combining it with pyreverse would be great.
  • sphinx.ext.mathjax: render math via javascript

Installation with pipenv:

pipenv install sphinx --dev
pipenv install sphinx_rtd_theme --dev

Documentation: initialization

cd docs
sphinx-quickstart

The quickstart will ask you a few questions, and then you are almost ready. As of version 3.0.4, it creates 4 files: conf.py, index.rst, Makefile and make.bat. You should now populate your master file index.rst and create other documentation source files. Use the Makefile to build the docs, like so: make builder, where “builder” is one of the supported builders, e.g. html, latex or linkcheck.

Edit the conf.py file like so

import os
import sys

sys.path.insert(0, os.path.abspath("../src"))

extensions = ["sphinx.ext.autodoc"]

html_theme = "sphinx_rtd_theme"

And then edit the index.rst file to add the modules page (generated below by sphinx-apidoc) to the toctree directive:

.. toctree::
   :maxdepth: 2

   modules

You can then run the following commands to build a basic documentation

cd docs
# you might want to delete the generated *.rst files (except index.rst) first
sphinx-apidoc -o . ../src/mypkg --ext-autodoc
make html
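For sphinx.ext.autodoc to produce useful pages, your modules need docstrings. A minimal sketch (render is a hypothetical function):

# src/mypkg/view.py
def render():
    """Render the main view.

    :returns: the rendered page as a string
    """
    return "<html></html>"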

readthedocs is a service that allows you to create, host and browse documentation. It has a complete tutorial to integrate with sphinx.

One alternative to sphinx is mkdocs.

Python package

In python terminology, a folder is a package, a file is a module, and a module contains definitions and statements. The file name is the module name with the suffix .py appended. An __init__.py file is required to import the directory as a regular package, and it can simply be an empty file. More information here.

We then need a build script for setuptools. It tells setuptools about your package (such as the name and version) as well as which code files to include. It's commonly done in a setup.py located at the root of the repository. I personally prefer the configuration way, with a setup.cfg file.
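Here is a minimal sketch of such a declarative setup.cfg for the src/ layout above (the metadata values are placeholders):

# setup.cfg
[metadata]
name = mypkg
version = 0.1.0

[options]
package_dir =
    = src
packages = find:

[options.packages.find]
where = src

The setup.py then boils down to a stub:

# setup.py
from setuptools import setup

setup()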

If you are curious, take a look at Poetry, it makes python packaging and dependency management easy.

Git

  • Download and install the latest version of git.
  • Configure git with your username and email.
    git config --global user.name 'your name'
    git config --global user.email 'your email'
    
  • Clone the git repository ..
    git clone <repository>
    
  • .. or add origin
    git remote add origin <repository>
    git branch -u origin/master
    

Be sure to create a .gitignore file and set it properly.

It’s always good to have issue and pull request templates. These are located in

.github
├── ISSUE_TEMPLATE.md
└── PULL_REQUEST_TEMPLATE.md

Continuous integration

Since October 2018, GitHub Actions enables developers to automate, customize, and execute workflows directly in their repositories. By workflow, I mean building, testing, packaging, releasing, or deploying your software. Besides the complete built-in continuous integration service within github, it has two other interesting features:

  • Built in secret store
  • Multi-container testing, to play with docker-compose

There are two other well-known services: Travis and Jenkins, but as my requirements are not that high, and for the sake of simplicity, I chose github actions.
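As an example, a minimal .github/workflows/test.yaml running the test suite on every push could look like this (the python version is just an example):

name: test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - name: Install dependencies
        run: |
          pip install pipenv
          pipenv install --dev
      - name: Run tests
        run: pipenv run pytest tests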

To go further: handling artifacts and publishing packages are common tasks, e.g. registering a Docker container image with a package provider, and both are supported by github actions. Go there to know more about packaging.

Github Action: Add a badge

You can easily add a workflow status badge associated with a github workflow.

For example to show on your readme:

![](https://github.com/<OWNER>/<REPOSITORY>/workflows/<WORKFLOW_FILE_PATH>/badge.svg)

Next Steps

Project template and scaffolding

There are various ways to help create a project from scratch, each one having its own specificities:

  • yeoman: code generator ecosystem, cross platform and technology agnostic
  • cookiecutter: a command line utility to create python projects from templates
  • github template: not a scaffolding tool, but an easy way to reuse a repository without having to fork it.

Database

The choice of the database is a crucial point when designing an app. If your database has to be centralized, you can take a look at Parse server and Firebase. Check this article for more information.

If you want an easy-to-use local database, take a look at tinydb, a minimal document-oriented database, and ZODB, an object-oriented database.

Configuration files

There are a lot of ways to save configurations/settings. Apart from saving settings in the database, it's common to have configuration files in these formats:

  • json
  • yaml
  • py (yes, just a classic py file)

Check this excellent post to know more about how to implement it.
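As a tiny sketch, loading settings from a json file boils down to this (config.json is a hypothetical file name):

# settings.py
import json
from pathlib import Path


def load_config(path="config.json"):
    """Return the settings stored in a json configuration file."""
    return json.loads(Path(path).read_text())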

Zen of Python

Just as a reminder, open python and run import this:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!