My Python Project Setup
I have 8+ Python Open Source projects on github.com and on codeberg.org. This post describes the tools and practices I currently use for maintaining them.
We have a pretty active Python community at Zalando, so I could learn some good practices from colleagues who are much more experienced than me. I would not have found or adopted some of the tools without my helpful Zalando colleagues.
My setup for Python projects includes:
Python 3.7+
Poetry for dependency management
Make to leverage muscle memory
black for code formatting
mypy for type checking
py.test for unit and e2e tests
pre-commit hooks to run formatting and linting
ReadTheDocs for documentation
CalVer for releases
Python 3.7+
I try to keep my projects up-to-date with the latest Python features. I make use of more recent features such as:
pathlib (3.4+): nicer path handling (replaces most uses of the os module), e.g. path = Path(__file__).parent / "myfile.txt"
f-strings (3.6+): fast inline template strings: f"Hello {name}!" instead of "Hello {}".format(name)
asyncio (3.7+ for asyncio.run()): write concurrent code with async/await
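As a minimal sketch, the three features can be combined as follows. The names greeting, greet_later and greet_all and the "project" directory are hypothetical and only serve the illustration; asyncio.run() is the part that actually requires Python 3.7.

```python
import asyncio
from pathlib import Path
from typing import List


def greeting(name: str) -> str:
    # f-string (3.6+) instead of "Hello {}".format(name)
    return f"Hello {name}!"


async def greet_later(name: str) -> str:
    # Yield control to the event loop once, standing in for real I/O
    await asyncio.sleep(0)
    return greeting(name)


async def greet_all(names: List[str]) -> List[str]:
    # Run all coroutines concurrently; gather() keeps the input order
    return list(await asyncio.gather(*(greet_later(n) for n in names)))


# pathlib (3.4+): build paths with the "/" operator
config_path = Path("project") / "myfile.txt"

# asyncio.run() (3.7+) starts an event loop and runs the coroutine to completion
greetings = asyncio.run(greet_all(["Alice", "Bob"]))
```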
My projects therefore require at least Python 3.7.
I use pyenv on my local computer to get the latest Python version (3.8.1). See Real Python's blog series of cool new Python features: 3.7, 3.8.
Poetry
I use Poetry for dependency and package management. Poetry uses virtualenvs, has a better dependency resolver than Pipenv, and implements PEP 518 (aka pyproject.toml).
I used Pipenv before and converted the Pipfiles with dephell to pyproject.toml configuration (still needed some manual work afterwards).
Make
Leveraging muscle memory is powerful, and make is one of those things that is easy to remember and the first thing to try in a repo.
GNU Make is pretty ubiquitous and can safely be assumed to be present on a developer machine, so by trial and error any dependency (like Poetry) will be discovered:
    make
    make: poetry: Command not found  # <-- ah, "poetry" is required!
    Makefile:3: recipe for target 'install' failed
    make: *** [install] Error 127
My standard targets are make lint and make test.
This blog post had some nice learnings for me (but I did not follow all advice).
black & Flake8
Code formatting can spark heated and unnecessary debates and I was jealous of Go having go fmt, so I was very happy to see black come along:
black is a non-compromising code formatter for Python - it has nearly no options to tweak and therefore sets
a standard across the globe. Luckily Python already had PEP8 and Flake8, so black is merely a tool to achieve
standards-compliance without human effort.
There are still some dark corners with black+Flake8: black sometimes generates code which Flake8 complains about, so we need to tell Flake8 to ignore these violations. This can be done via .flake8:
ignore E501: black won't always ensure a max line length, e.g. it won't linebreak docstrings or comments
ignore E203: black puts a space before the colon in slices such as mylist[len(prefix) :], which Flake8 reports as whitespace before ':'
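To illustrate the E203 case, here is a small sketch of code in the shape black produces. strip_prefix is a hypothetical helper made up for this illustration; the point is only the space before the colon in the slice, which Flake8 flags unless E203 is ignored.

```python
def strip_prefix(text: str, prefix: str) -> str:
    """Return text without the given prefix (hypothetical helper)."""
    if text.startswith(prefix):
        # black formats the slice with a space before the colon because the
        # lower bound is a function call; Flake8 reports this as E203
        return text[len(prefix) :]
    return text


shortened = strip_prefix("kube-web-view", "kube-")
```

The behavior is identical to text[len(prefix):]; only the whitespace differs.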
mypy
I started using mypy for type checking.
This was triggered by some bug I introduced months ago:
I refactored a function signature and had no tests for it. Tests would have caught the bug, but mypy would also have covered it. I introduced mypy instead of adding tests (shame on me!).
Having mypy cover these cases is better than nothing, and typing can be improved gradually: specific lines can be ignored by adding a # type: ignore comment.
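A minimal sketch of the kind of bug meant here, using a hypothetical function total_size whose parameter was refactored from a single int to a list of ints. The stale call is only shown in a comment because it would fail at runtime; mypy would flag it as an incompatible argument type without running anything. The last line is a contrived case for the ignore comment.

```python
from typing import List


def total_size(sizes: List[int], unit: str = "B") -> str:
    """Hypothetical function: before the refactoring it took a single int."""
    return f"{sum(sizes)} {unit}"


# A caller that was forgotten during the refactoring would look like this:
#     total_size(42)
# Python only notices at runtime (TypeError, an int is not iterable), whereas
# mypy reports the incompatible argument type statically.

summary = total_size([10, 20, 12])

# Typing can be adopted gradually: this line deliberately assigns a str to an
# int-annotated name and silences mypy for that one line only.
first_token: int = summary.split()[0]  # type: ignore
```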
py.test
Using py.test as a test framework instead of the "old" unittest library does not need to be elaborated: it's just so much easier to use and less code to write! Example with asserting a certain exception message:
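    from datetime import datetime, timezone

    import pytest

    # matches_time_spec is the function under test, imported from the project's own module


    def test_invalid_weekday_range():
        # Monday, November 27th 2017
        dt = datetime(2017, 11, 27, 15, 33, tzinfo=timezone.utc)
        with pytest.raises(ValueError) as excinfo:
            matches_time_spec(dt, "Sun-Fri 15:30-16:00 UTC")
        assert "invalid range (Sun is after Fri)" in str(excinfo.value)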
I created pytest-kind as a py.test plugin to support e2e testing with a local kind Kubernetes cluster.
Pre-Commit Hooks
Do you know the pre-commit framework? I did not, but fell in love with it recently!
The framework lets you configure pluggable hooks to check all kinds of files, e.g.:
make sure that file endings are consistent (also important when working with Windows colleagues)
strip unnecessary whitespace (avoids unnecessary git diffs)
validate YAML/Dockerfile/... syntax
validate Kubernetes manifests (easy to get some deployment spec wrong)
format Python code with black
All code formatting (black) and linting are executed via pre-commit hooks on Travis CI. make lint runs pre-commit on all files, e.g.:
    $ make lint
    poetry run pre-commit run --all-files
    Check hooks apply to the repository.........................Passed
    Check for useless excludes..................................Passed
    Check Kubernetes manifests..................................Passed
    Reorder python imports......................................Passed
    black.......................................................Passed
    pydocstyle..................................................Passed
    yamllint....................................................Passed
    mypy........................................................Passed
    Dockerfile linter...........................................Passed
    Check for added large files.................................Passed
    Check docstring is first....................................Passed
    Debug Statements (Python)...................................Passed
    Fix End of Files............................................Passed
    Flake8......................................................Passed
    Trim Trailing Whitespace....................................Passed
    Check python ast............................................Passed
    Check builtin type constructor use..........................Passed
    Detect Private Key..........................................Passed
    Mixed line ending...........................................Passed
    Tests should end in _test.py................................Passed
    type annotations not comments...............................Passed
    use logger.warning(.........................................Passed
    check for eval()............................................Passed
    check for not-real mock methods.............................Passed
    check blanket noqa..........................................Passed
By using the pre-commit git hooks locally, I get quick feedback and don't have to remember to run make lint manually.
The .pre-commit-config.yaml file is a helpful abstraction to share common formatting/linting configuration across repositories, i.e. I can copy .pre-commit-config.yaml around to apply a common standard to my projects.
My .pre-commit-config.yaml for Python looks like:
    minimum_pre_commit_version: 1.21.0
    repos:
    - repo: meta
      hooks:
      - id: check-hooks-apply
      - id: check-useless-excludes
    # reorder Python imports
    - repo: https://github.com/asottile/reorder_python_imports
      rev: v1.9.0
      hooks:
      - id: reorder-python-imports
    # format Python code with black
    - repo: https://github.com/ambv/black
      rev: 19.10b0
      hooks:
      - id: black
    # check docstrings
    - repo: https://github.com/PyCQA/pydocstyle
      rev: 5.0.2
      hooks:
      - id: pydocstyle
        args: ["--ignore=D10,D21,D202"]
    # static type checking with mypy
    - repo: https://github.com/pre-commit/mirrors-mypy
      rev: v0.761
      hooks:
      - id: mypy
    - repo: https://github.com/pre-commit/pre-commit-hooks
      rev: v2.4.0
      hooks:
      - id: check-added-large-files
      - id: check-docstring-first
      - id: debug-statements
      - id: end-of-file-fixer
      - id: flake8
        additional_dependencies: ["flake8-bugbear"]
      - id: trailing-whitespace
      - id: check-ast
      - id: check-builtin-literals
      - id: detect-private-key
      - id: mixed-line-ending
      - id: name-tests-test
        args: ["--django"]
ReadTheDocs
Not all my open source projects have dedicated documentation sites, but if I need one, I pick ReadTheDocs with Sphinx to publish documentation. See the Kubernetes Web View Documentation as an example.
Calendar Versioning
I switched all my projects to Calendar Versioning (CalVer). Releases now have a version like YY.MM.MICRO, e.g. 20.1.0 for the first release in January 2020.
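The scheme is simple enough to sketch in a few lines. calver_version and age_in_months are hypothetical helpers for illustration, not part of any release tooling, and the parsing assumes that two-digit years mean 20YY.

```python
from datetime import date


def calver_version(release_date: date, micro: int) -> str:
    """Format a YY.MM.MICRO version; micro counts releases within the month from 0."""
    return f"{release_date.year % 100}.{release_date.month}.{micro}"


def age_in_months(version: str, today: date) -> int:
    """Return how many months lie between a CalVer release and today."""
    yy, mm, _micro = (int(part) for part in version.split("."))
    # Assumption: two-digit years mean 20YY
    return (today.year - (2000 + yy)) * 12 + (today.month - mm)


first_release = calver_version(date(2020, 1, 15), micro=0)  # "20.1.0"
age = age_in_months("18.2.0", today=date(2020, 1, 15))  # 23
```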
Why? I believe SemVer is mostly a lie: it sounds good in theory, but in practice any change can be breaking (e.g. bug fixes) and often nobody knows when to increment the major version:
Kubernetes: nobody knows when to increment from 1.* to 2.*; breaking changes are introduced over multiple releases
Some projects never make it to 1.0 ("ZeroVer", e.g. Cython is still 0.28, but used in production). This was also the case for my personal projects (I never had the courage to make it to version 1.0)
SemVer would only really work if the previous version is maintained so that users can stay with the previous major version and still get bug fixes. I don't plan to support older stable releases for my open source projects, i.e. users don't really have the option to not upgrade (if they want to receive potential bug fixes).
That being said, I still try to keep compatibility and avoid unnecessary breaking changes, but I won't guarantee it.
A simple release counter would also do it (like Kubernetes does with 1.X where X just increments all the time), but CalVer has some nice benefits:
old versions are immediately visible: "I still use the foo library in version 18.2.0? It's 2020, the version is 2 years old!"
it encourages working in small batches and releasing more often: regular monthly updates are good for staying up-to-date with the environment (all kinds of dependencies update all the time)
I think that SemVer has its merits, but it's not a silver bullet for all projects: just having a tuple of 3 numbers does not make a semantic version.
Summary
I'm relatively happy with my current collection of tools & practices around Python. There are always new things to learn and tools to discover, e.g. I was surprised to learn about pre-commit only very recently. Anything I can do better? Do you have tips and suggestions? Please let me know on Twitter or Mastodon!