A basic python development setup
In this post we’ll go through the entire setup of a basic python project. It will cover, in no particular order:
- Directory structure
- Virtual environments with pipenv
- IDE vscode, which integrates really well with python
- Enforce coding rules with static code analysis tools
- Code formatter black
- Unit tests with pytest
- Code coverage with coverage.py
- Prepare project for packaging
- git as a revision control system
- Continuous integration with github action
- Documentation with sphinx
- Next steps
I took inspiration from famous python repositories like scikit-learn, Flask, Keras, Sentry, Django, Ansible, Tornado, Pandas, and also from this darker repository, hoping that the tools they're using are durable and scale well to most python projects.
This post is not a complete walk-through tutorial. Its aim is to give you a starting point if you are relatively new to python and are looking for good practices on how to structure a python project. I also give a bunch of links if you want to dig deeper or learn about alternatives.
The order of the post may look messy (yes, in fact it is), so feel free to just go to the part of your interest by clicking on the links in the index table above.
My requirements
Here are some of my requirements:
- Cross platform (Windows, macOS and Linux)
- Flawless integration with the IDE
- One environment per project
- Suitable for various python projects (web app, desktop app, framework …)
Directory structure
A basic python project looks something like this:
.
├── .github/workflows
│   └── test.yaml
├── .vscode
│   ├── extensions.json
│   ├── launch.json
│   └── settings.json
├── docs/
│   ├── Makefile
│   ├── conf.py
│   └── index.rst
├── src/
│   └── mypkg/
│       ├── __init__.py
│       ├── app.py
│       └── view.py
├── tests/
│   ├── __init__.py
│   ├── foo/
│   │   ├── __init__.py
│   │   └── test_view.py
│   └── bar/
│       ├── __init__.py
│       └── test_view.py
├── .gitignore
├── LICENSE
├── Pipfile
├── Pipfile.lock
├── README.md
├── setup.py
└── setup.cfg
As you might suppose, none of the files or directories are chosen randomly. You’ll learn more about these choices while reading the post. It’s worth noting that this structure should be familiar to most programmers working with github and python. Indeed, inspiration for the structure comes from pytest good practices, and from the repositories previously listed.
Just one reminder when naming your files and directories: avoid spaces!
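As a rough sketch of how the layout above could be created in one go, here is a small script. The list of paths is only a subset of the tree, mypkg is the placeholder package name from the tree, and the scaffold helper is a hypothetical name of mine, not part of any tool mentioned in this post. It also refuses names containing spaces.

```python
from pathlib import Path
import tempfile

# Hypothetical subset of the tree shown above; "mypkg" is the placeholder package name.
LAYOUT = [
    "src/mypkg/__init__.py",
    "src/mypkg/app.py",
    "src/mypkg/view.py",
    "tests/__init__.py",
    "tests/foo/__init__.py",
    "tests/foo/test_view.py",
    "tests/bar/__init__.py",
    "tests/bar/test_view.py",
    "docs/index.rst",
    "README.md",
    "setup.py",
    "setup.cfg",
]


def scaffold(root, layout=LAYOUT):
    """Create empty files (and their parent directories) under root."""
    created = []
    for relative in layout:
        # Enforce the naming reminder: no spaces in file or directory names.
        if " " in relative:
            raise ValueError(f"avoid spaces in names: {relative!r}")
        target = Path(root) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is written to the project.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(tmp):
            print(path.relative_to(tmp))
```

Running it against a real project root instead of a temporary directory would leave existing files untouched, since touch only creates what is missing.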
Pipenv
Pipenv automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the Pipfile.lock, which is used to produce deterministic builds.
Pipenv: setup
For installation you may refer to the official procedure.
I personally used brew to install pipenv, and thus to handle python too. Since pipenv can manage different python versions via pyenv, it’s preferable to have it set up globally instead of installing it only for a specific python version using pip.
brew install pipenv
I tested it on the ubuntu bionic distribution, but it also works on macOS and Windows with WSL. In my case, I had to add some paths to the ~/.bashrc file.
With the integrated vscode terminal, it appears that sourcing the ~/.bashrc file or equivalent is not sufficient. It also seems that vscode shares the same terminal instance across windows.
The Brewfile configuration file makes the setup task easy. Feel free to look at this example file.
You can check version and installation path of your current python installation by running
python3 -c $'import sys; print(sys.version); print(sys.executable)'
You also might want to create aliases in your ~/.bashrc or equivalent to run python3 by default:
alias python=python3
alias pip=pip3
Pipenv: custom settings
In my case I wanted the virtual environment folder to be in the project directory. For that, pipenv offers a configuration which can be activated via the PIPENV_VENV_IN_PROJECT environment variable. Just set it in your ~/.bashrc file or equivalent.
export PIPENV_VENV_IN_PROJECT=1
See doc for details, this tutorial, and this issue.
And finally you can point your .vscode/settings.json to a virtual environment folder that stays the same across projects:
"python.pythonPath": ".venv/bin/python"
Pipenv: some basic commands
If you are completely new to pipenv, I highly recommend this short tutorial.
Description | shell |
---|---|
Open virtual environment | cd <project_directory> && pipenv shell |
Exit virtual environment | exit |
First setup, install all packages | pipenv install --dev |
Add package | pipenv install <package> --dev |
Import dependencies from requirements.txt | pipenv install -r <requirements.txt> |
Locking may take a lot of time | # => Locking [packages] dependencies... # => Locking ... |
Install dependencies from Pipfile.lock | pipenv sync |
Install dependencies from Pipfile.lock on system (for docker) | pipenv install --system --deploy --ignore-pipfile |
Update dev environment | pipenv --rm && pipenv install --dev |
Upgrade packages | pipenv update |
Update lock file | pipenv lock |
Display dependencies in the requirements.txt fashion | pipenv lock -r |
Visual Studio Code
Visual Studio Code is a versatile code editor, which natively integrates with python.
Some of its advantages are:
- Cross platform (Windows, macOS and Linux)
- Built-in pythonic functionalities
- Advanced customization settings
- Common programming tasks like renaming, code snippets, and other editing sugar
- Integration with liveshare, which makes remote pair programming possible!
Two more features are worth noting: debugging and workspace settings.
Visual Studio Code: Debugging tools
Apart from the great vscode debugging support, vscode also supports Jupyter Notebooks natively through the Python interactive window, which enables you to:
- Work with Jupyter-like code cells
- Run code in the Python Interactive Window
- View, inspect, and filter variables using the Variable explorer and data viewer
- Debug a Jupyter notebook
- Export a Jupyter notebook
For the vscode interactive window to be active, you need these three packages: jupyter, ipykernel and notebook.
Here are the installation commands with pipenv:
# Jupyter
pipenv install jupyter --dev
# ipykernel
pipenv install ipykernel --dev
# notebook
pipenv install notebook --dev
Visual Studio Code: Workspace settings
Workspace settings are specific to a project. They make the development process easier and can easily be shared with others. Configuration is done through files located in the .vscode folder at the root of the project. Here we present two of them:
- settings.json
- launch.json
settings.json gathers all general settings specific to the current project.
launch.json specifies the debugging scenarios. One cool thing about launch.json is that it has platform-specific properties, which means you can have specific launch commands depending on your OS.
Linters
Linting enforces coding rules by highlighting syntactical and stylistic problems in your Python source code. It often helps you identify and correct subtle programming errors or unconventional coding practices that can lead to errors.
More details from code.visualstudio.com:
For example, linting detects use of an uninitialized or undefined variable, calls to undefined functions, missing parentheses, and even more subtle issues such as attempting to redefine built-in types or functions. Linting is thus distinct from Formatting because linting analyzes how the code runs and detects errors whereas formatting only restructures how code appears.
Linters: pylint
By looking for programming errors, pylint helps to enforce a coding standard. It also gives simple refactoring suggestions. pylint is the default linter for vscode.
Installation via pipenv
pipenv install pylint --dev
And add this configuration line to the vscode settings.json file
"python.linting.enabled": true
Other linters are available, such as flake8 and pep8 (just to mention two of them).
Linters: mypy
mypy is a static type checker. Type hints were introduced in python 3.5, but the Python runtime does not enforce function and variable type annotations, so a type checker is needed if you want type checking.
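To make the point concrete, here is a minimal sketch (the total_price function is made up for illustration). The second call violates the annotations, yet Python runs it without complaint. A static type checker such as mypy is expected to report the incompatible argument before the code ever runs.

```python
def total_price(unit_price: float, quantity: int) -> float:
    """Return the price of `quantity` items."""
    return unit_price * quantity


# Correct usage: the arguments match the annotations.
print(total_price(2.5, 4))

# Wrong usage: a str is passed where a float is annotated.
# The runtime does not check annotations, so this silently performs
# string repetition instead of a multiplication of numbers.
result = total_price("2.5", 4)  # a type checker should flag this call
print(repr(result))
```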
Installation via pipenv
pipenv install mypy --dev
Update the vscode settings.json file
"python.linting.mypyEnabled": true
By default, mypy reads its configuration from the mypy.ini file, with a fallback to setup.cfg.
Formatter
Restructure your code with a formatting tool.
Black
black is a code formatter which does not require configuration. It integrates with vscode as well.
pipenv install black --dev --pre
Using the --pre switch was mandatory because black has only published pre-release versions so far. See this issue for more information.
To make it work with vscode I added these configuration lines in settings.json
"[python]": {
"editor.formatOnSave": true
},
"python.formatting.provider": "black"
It’s also possible to sort Python import definitions alphabetically with isort.
Another useful way to share coding styles across IDEs is using the .editorconfig file, see this for more info.
Testing
Pytest
pytest is a full-featured Python testing tool. It is already used by a lot of repositories.
Installation with pipenv
pipenv install pytest --dev
Then update the vscode settings.json file with these lines
"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [
"tests"
]
To customize pytest, your configuration must go in one of these files: pytest.ini, tox.ini or setup.cfg.
For discovery, pytest usually searches for files named test_*.py or *_test.py and then looks for functions and methods prefixed by test. See this for the full explanation. pytest can also natively discover and run unittest and nose tests.
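Here is a minimal sketch of a test module that follows these discovery rules. It would live in a file such as tests/foo/test_view.py. The render_title function is hypothetical and defined inline only to keep the example self-contained; in a real project it would be imported from the package under src/.

```python
# Sketch of tests/foo/test_view.py: the file name matches test_*.py,
# so pytest collects it with the default discovery rules.


def render_title(text: str) -> str:
    """Hypothetical function under test: strip and title-case the text."""
    return text.strip().title()


# Functions prefixed with "test" are collected as test cases.
def test_render_title_strips_whitespace():
    assert render_title("  hello world ") == "Hello World"


def test_render_title_empty_string():
    assert render_title("") == ""


class TestRenderTitle:
    # Methods prefixed with "test" inside a class prefixed with "Test"
    # are collected as well.
    def test_already_title_cased(self):
        assert render_title("Hello") == "Hello"
```

With the vscode settings above, running pytest tests from the project root should pick up all three tests.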
I could not omit tox, a tool that automates and standardizes testing in Python. It integrates easily with pytest. What does tox do? Basically it creates a virtual environment and runs the tests for you, and it also checks that the package installs correctly. Consequently, it will make your life easier when you move to a continuous integration workflow.
Code coverage
Coverage measurement is used to gauge the effectiveness of tests. It can show which parts of your code are being exercised by tests, and which are not.
For this task we use Coverage.py.
Here is how you can install and run it with pipenv.
pipenv install coverage --dev
pipenv shell
coverage erase # clears previous data if any
coverage run --source='src' -m pytest
coverage report # prints to stdout
coverage html # creates ./htmlcov/*.html including annotated source
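To illustrate what such a report tells you, consider this small sketch (the classify function and its test are made up for illustration). The test only exercises one of the two branches, so a coverage run would be expected to list the other return statement as missed.

```python
def classify(n: int) -> str:
    """Return a label describing the sign of n."""
    if n < 0:
        return "negative"  # not executed by the test below
    return "non-negative"


def test_classify_non_negative():
    # Only the non-negative branch is exercised here.
    assert classify(3) == "non-negative"
```

Adding a second test that calls classify with a negative number would close the gap.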
We can then upload it as an artifact with github action, which lets us download the coverage report from the github action tab.
Github also integrates with codecov, which makes it easier to visualize the report.
You can also add a comment on a pull request with this action.
Documentation
Sphinx is a tool that makes it easy to create intelligent and beautiful documentation. Originally created for the Python documentation, it’s used by a wide range of projects. It can output the documentation in HTML and LaTeX (among other formats).
Sphinx has a lot of built-in extensions, just to name a few interesting ones:
- sphinx.ext.coverage: collect doc coverage stats
- sphinx.ext.graphviz: add graphviz graphs support; combining it with pyreverse would be great
- sphinx.ext.mathjax: render math via javascript
Installation with pipenv:
pipenv install sphinx --dev
pipenv install sphinx_rtd_theme --dev
Documentation: initialization
cd docs
sphinx-quickstart
The quickstart will ask you a few questions and you are almost ready. As of version 3.0.4, it creates 4 files: conf.py, index.rst, Makefile and make.bat.
You should now populate your master file index.rst and create other documentation source files. Use the Makefile to build the docs, like so: make builder, where “builder” is one of the supported builders, e.g. html, latex or linkcheck.
Edit the conf.py file like so
import os
import sys
sys.path.insert(0, os.path.abspath("../src"))
extensions = ["sphinx.ext.autodoc"]
html_theme = "sphinx_rtd_theme"
And then edit the index.rst file
modules
You can then run the following commands to build a basic documentation
cd docs
# might delete *.rst (except index.rst) files before
sphinx-apidoc -o . ../src/mypkg --ext-autodoc
make html
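The autodoc extension builds the pages from docstrings, so the code needs some for the generated documentation to be useful. Here is a minimal sketch of a function documented with the reStructuredText field-list style that Sphinx understands. The scale function is a hypothetical example, not part of the project.

```python
def scale(values, factor=2.0):
    """Multiply every value by a constant factor.

    :param values: numbers to scale
    :type values: list[float]
    :param factor: multiplier applied to each value
    :type factor: float
    :returns: a new list with the scaled values
    :rtype: list[float]
    """
    return [value * factor for value in values]


if __name__ == "__main__":
    # The docstring is what autodoc would read when building the docs.
    print(scale.__doc__)
```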
readthedocs is a service that allows you to create, host and browse documentation. It has a complete tutorial to integrate with sphinx.
One alternative to sphinx is mkdocs.
Python package
By default, in python terminology, a folder is a package, a file is a module, and a module contains definitions and statements. The file name is the module name with the suffix .py appended. __init__.py is required to import the directory as a regular package, and can simply be an empty file. More information here.
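The terminology above can be checked with a small experiment. This sketch builds a throwaway package in a temporary directory and imports a module from it. The names demo_pkg, view and greet are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_and_import(root: Path):
    """Create a throwaway package on disk and import one of its modules."""
    package_dir = root / "demo_pkg"        # the folder is the package
    package_dir.mkdir()
    (package_dir / "__init__.py").touch()  # an empty file is enough
    (package_dir / "view.py").write_text(  # the file is the module
        "def greet(name):\n    return f'Hello, {name}!'\n"
    )
    sys.path.insert(0, str(root))          # make the package importable
    try:
        importlib.invalidate_caches()
        return importlib.import_module("demo_pkg.view")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    view = build_and_import(Path(tmp))

# The module name is the file name without the .py suffix,
# qualified by the package (folder) name.
print(view.__name__)
print(view.greet("world"))
```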
We then need a build script for setuptools. It tells setuptools about your package (such as the name and version) as well as which code files to include. It’s commonly done in a setup.py located at the root of the repository. I personally prefer the configuration-file way, with a setup.cfg file.
If you are curious, take a look at Poetry, it makes python packaging and dependency management easy.
Git
- Download and install the latest version of git.
- Configure git with your username and email.
git config --global user.name 'your name'
git config --global user.email 'your email'
- Clone the git repository ..
git clone <repository>
- .. or add origin
git remote add origin <repository>
git branch -u origin/master
Be sure to create a .gitignore file and set it properly.
It’s always good to have issue and pull request templates. These are located in
├── .github
│   ├── ISSUE_TEMPLATE.md
│   └── PULL_REQUEST_TEMPLATE.md
Continuous integration
Since October 2018, GitHub Actions has enabled developers to automate, customize, and execute workflows directly in their repositories. By workflow, I mean building, testing, packaging, releasing, or deploying your software. Besides the complete built-in continuous integration service within github, it has two other interesting features:
- Built-in secret store
- Multi-container testing, to play with docker-compose
There are two other well-known services, Travis and Jenkins, but as my requirements are not that high, and for the sake of simplicity, I chose github action.
To go further: uploading artifacts and publishing packages (for example registering a Docker container image with a package provider) are common tasks, and both are supported by github action. Go there to know more about packaging.
Github Action: Add a badge
You can easily add a workflow status badge associated with a github workflow.
For example to show on your readme:
![](https://github.com/<OWNER>/<REPOSITORY>/workflows/<WORKFLOW_FILE_PATH>/badge.svg)
Next Steps
Project template and scaffolding
There are various ways to help create a project from scratch, each one having its own specificity:
- yeoman: code generator ecosystem, cross platform and technology agnostic
- cookiecutter: a command line utility to create python projects from templates
- github template: not a scaffolding tool, but an easy way to reuse a repository without having to fork it.
Database
The choice of the database is a crucial point when designing an app. If your database has to be centralized, you can take a look at Parse server and Firebase. Check this article for more information.
If you want an easy-to-use local database, take a look at tinydb, a minimal document-oriented database, and ZODB, an object-oriented database.
Configuration files
There are a lot of ways to save configurations/settings. Apart from saving settings in the database, it’s common to have configuration files in these formats:
- json
- yaml
- py (yes, just a classic py file)
Check this excellent post to know more about how to implement it.
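As a minimal sketch of the json option, using only the standard library: values from a settings file are merged over a set of defaults, and a missing file simply yields the defaults. The keys debug and port, the file names and the load_settings helper are all hypothetical, chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults; real keys depend on the application.
DEFAULTS = {"debug": False, "port": 8000}


def load_settings(path: Path) -> dict:
    """Merge values from a JSON file over DEFAULTS; a missing file yields the defaults."""
    settings = dict(DEFAULTS)
    if path.exists():
        settings.update(json.loads(path.read_text(encoding="utf-8")))
    return settings


# Demonstrate with a throwaway file so nothing is written to the project.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "settings.json"
    config_file.write_text(json.dumps({"port": 5000}), encoding="utf-8")
    loaded = load_settings(config_file)
    missing = load_settings(Path(tmp) / "absent.json")

print(loaded)
print(missing)
```

Keeping the defaults in code means the configuration file only has to list what differs from them.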
Zen of Python
Just as a reminder, open python and run import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!