This post describes the steps involved in creating a Python package. If you are looking for information on installing packages, that is done with pip.
Python packaging is in a constant state of flux. There is the official Python Packaging User Guide and the Python Packaging Authority (PyPA) which is probably the best resource to read but things change, and often quickly. The focus here is on the PyPA Setuptools using pyproject.toml
which works with Python >= 3.7, but you may wish to consider other packages such as Poetry or PDM which offer some advantages but with additional frameworks to learn.
A few examples of Python packages that I have packaged are listed below, most have also been released to PyPI.
Package Structure
You should place your code within a Git version controlled directory for your project. It is then normal to place all files in an organised hierarchy with a sub-directory of the same name for Python code, known as a "flat" structure and tests under tests
directory. It is possible to have more than one directory containing code but for now I'm sticking to the flat structure.
.
├── ./build
├── ./dist
├── ./
├── ./my_package
├── ./my_package/__init__.py
├── ./my_package/module_a.py
├── ./my_package/module_b.py
├── ./my_package/something/module_c.py
└── ./tests
├── ./tests/conftest.py
├── ./tests/resources
├── ./tests/test_module_a.py
├── ./tests/test_module_b.py
└── ./tests/something/test_module_c.py
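As a rough sketch, the layout above could be scaffolded with a few lines of Python. The file names are simply those from the tree (placeholders for your own modules), and the `scaffold` helper is a made-up name for illustration, not part of any packaging tool.

```python
from pathlib import Path
import tempfile

# Hypothetical names taken from the tree above; adjust to your own package.
LAYOUT = [
    "my_package/__init__.py",
    "my_package/module_a.py",
    "my_package/module_b.py",
    "my_package/something/module_c.py",
    "tests/conftest.py",
    "tests/test_module_a.py",
    "tests/test_module_b.py",
    "tests/something/test_module_c.py",
]


def scaffold(root: Path) -> list[Path]:
    """Create empty placeholder files for the flat layout under ``root``."""
    created = []
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
        created.append(path)
    # tests/resources holds fixture data rather than code, so it is a bare directory.
    (root / "tests" / "resources").mkdir(parents=True, exist_ok=True)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(Path(tmp)):
            print(path.relative_to(tmp))
```

The build/ and dist/ directories are left out deliberately as they are generated when the package is built.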
__init__.py
In older versions of Python (<3.3) an __init__.py
was required in every directory and sub-directory that was to be a module/sub-module. In more recent versions of Python (>=3.3) they are not strictly essential as Python supports namespace packages, but in most cases it's simpler to include such a file in the top level of your package directory. __init__.py
files can be completely empty or they can contain code that is used throughout your package, such as setting up a logger.
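To illustrate the logger case, here is a minimal sketch of what a top-level __init__.py might contain. The package name `my_package` is the placeholder used throughout this post, and attaching a `NullHandler` is one common convention for libraries rather than the only option.

```python
"""Hypothetical contents of my_package/__init__.py."""
import logging

# Library code should not configure handlers for the application, so attach a
# NullHandler and let the caller decide where log messages go.
LOGGER_NAME = "my_package"
logger = logging.getLogger(LOGGER_NAME)
logger.addHandler(logging.NullHandler())
```

Other modules can then call `logging.getLogger("my_package.module_a")` and will inherit whatever configuration the application applies to the `my_package` logger.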
Configuration pyproject.toml
Package configuration has been and is in a state of flux. There was originally setup.py
which was then complemented and gradually replaced by setup.cfg
. The new method on the block though is pyproject.toml
which, with a little tweaking and judicious choice of packages, can handle everything.
Setuptools is shifting towards using pyproject.toml
and whilst it is still under development it's already highly functional. The file is written in Tom's Obvious Minimal Language (TOML) and isn't too dissimilar in structure to setup.cfg
.
A useful reference for writing your configuration in pyproject.toml
is Configuring setuptools using pyproject.toml files. It is based around PEP 621 – Storing project metadata in pyproject.toml | peps.python.org.
A bare-bones pyproject.toml
file should reside in the top level of your directory with the following (NB This includes the minimum versions and setuptools_scm
extension for dynamically setting package version)…
build-system
[build-system]
requires = ["setuptools>=65.6.3", "setuptools_scm[toml]>=6.2", "wheel"]
build-backend = "setuptools.build_meta"
Traditionally configuration of meta-data such as author, code repository and license was made via setup.py
but you can either specify some (or most) of this in pyproject.toml
or a concurrent setup.cfg
.
project
This is the main body of the project description detailing name
, authors
, description
, readme
, license
, keywords
, classifiers
, dependencies
and version
amongst other things.
license is the type of license you have chosen to apply to your package. For guidance see Choose an Open Source License.
readme is the README
of your package, which may be in Markdown or reStructuredText.
dynamic lists the fields of your package metadata that are set dynamically. In this example we only set the version dynamically using setuptools_scm
.
The dependencies
are those that are required for running the code. They should not include packages that are required for development (e.g. black
, flake8
, ruff
, pre-commit
, pylint
etc.), nor those required for testing (e.g. pytest
, pytest-regtest
, pytest-cov
etc.) or documentation (e.g. Sphinx
, numpydoc
, sphinx_markdown_tables
, sphinx-autodoc-typehints
, sphinxcontrib-mermaid
etc.) as these are defined in a separate section.
[project]
name = "my_package"
authors = [
{name = "Author 1", email="author1@somewhere.com"},
{name = "Author 2", email="author2@somewhere.com"},
{name = "Author 3", email="author3@somewhere.com"},
]
description = "A package that does some magic!"
license = {text = "GNU GPLv3 only"}
readme = "README.md"
dynamic = ["version"]
dependencies = [
"numpy",
"pandas",
"tqdm",
]
All other sections are considered subsections, either of project
or tool
and are defined under their own heading with [project|tool].<package>[.<options>]
.
project.urls
These are important as they define where people can find the Source
, Documentation
and Bug_Tracker
amongst other things. There may be more fields that can be configured here but I've not used them yet. Substitute these to reflect where your package is hosted, your username and the package name.
[project.urls]
Source = "https://gitlab.com/username/my_package"
Bug_Tracker = "https://gitlab.com/username/my_package/issues"
Documentation = "https://username.gitlab.io/my_package"
project.optional-dependencies
This is where you list dependencies that are not required for running a package but are required for different aspects such as development, documentation, publishing to PyPI, additional Notebooks and so forth, the options are limitless.
[project.optional-dependencies]
dev = [
"black",
"flake8",
"Flake8-pyproject",
"pre-commit",
"pylint",
"ruff",
]
docs = [
"Sphinx",
"myst-parser",
"numpydoc",
"pydata_sphinx_theme",
"sphinx-autodoc-typehints",
"sphinx_markdown_tables",
"sphinxcontrib-mermaid",
]
pypi = [
"build",
"pytest-runner",
"setuptools-lint",
"setuptools_scm",
"twine",
"wheel"
]
test = [
"pytest",
"pytest-cov",
]
notebooks = [
"ipython",
"ipywidgets",
"jupyter_contrib_nbextensions",
"jupyterthemes",
]
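To make the split between runtime and optional dependencies concrete, the sketch below parses a trimmed-down fragment of the configuration and prints what is installed in each case. It assumes Python 3.11+ for `tomllib` (falling back to the third-party `tomli` backport otherwise), and the fragment is illustrative rather than a complete pyproject.toml.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters need the third-party backport
    import tomli as tomllib

# A trimmed, illustrative fragment of the pyproject.toml described above.
PYPROJECT = """
[project]
name = "my_package"
dependencies = ["numpy", "pandas", "tqdm"]

[project.optional-dependencies]
dev = ["black", "ruff"]
test = ["pytest", "pytest-cov"]
"""

config = tomllib.loads(PYPROJECT)
runtime = config["project"]["dependencies"]
extras = config["project"]["optional-dependencies"]

print("Always installed:", runtime)
for group, packages in extras.items():
    # Each group is requested explicitly, e.g. pip install "my_package[test]"
    print(f"Installed with [{group}]:", packages)
```

Several groups can be requested at once by separating them with commas inside the square brackets, for example `pip install ".[dev,test]"` from the root of the repository.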
project.scripts
(Entry Points)
Entry points or scripts
are a neat method of providing a simple command line interface to your package by linking a command name directly to a specific function in a module.
These are defined under the project.scripts
section, where each entry has the form command = "module:function".
[project.scripts]
tcx2gpx = "tcx2gpx:process"
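For the entry above to work, the `tcx2gpx` package has to expose a callable named `process`. The following is only a sketch of what such a function could look like; the argument it accepts and the message it prints are invented for illustration and are not taken from the real tcx2gpx package.

```python
import argparse


def process(argv=None):
    """Hypothetical entry point; the real tcx2gpx ``process`` may differ."""
    parser = argparse.ArgumentParser(prog="tcx2gpx", description="Illustrative CLI.")
    parser.add_argument("path", help="file to convert (hypothetical argument)")
    args = parser.parse_args(argv)  # argv=None makes argparse read sys.argv[1:]
    print(f"Would process {args.path}")
    return 0
```

Once the package is installed, the generated `tcx2gpx` command calls this function with no arguments and uses its return value as the exit status, which is why the sketch returns `0` on success.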
tool
tool.setuptools
setuptools is perhaps the most common package for configuring Python packages and is the one that is being described here. Its configuration is multi-level depending on which component you are configuring.
tool.setuptools.packages.find
Uses the find
utility to search for packages to include. Based on my understanding it looks for __init__.py
in a directory and includes that directory as a package (see above note about these no longer being required in every directory). Typically you would want to exclude tests/
from a package you are making as most users won’t need to run the test suite (if they do they would clone from the source repository).
[tool.setuptools.packages.find]
where = ["."]
include = ["tcx2gpx"]
exclude = ["tests"]
tool.setuptools.package-data
This allows additional, non .py
files to be included. They are listed on a per package basis, and each value is an array of glob patterns (in TOML parlance, a list in Python terms).
[tool.setuptools.package-data]
tcx2gpx = ["*.yaml", "*.json"]
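Once data files are shipped inside the package they can be read at runtime with `importlib.resources` (Python 3.9+) rather than by building paths by hand. So that the sketch below runs anywhere, it creates a throwaway package in a temporary directory; `demo_pkg` and `default_config.yaml` are made-up names standing in for your package and one of its data files.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package so the example runs anywhere; "demo_pkg" and
    # "default_config.yaml" are made-up names.
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "default_config.yaml").write_text("units: metric\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        # files() locates the package without hard-coding a filesystem path.
        resource = importlib.resources.files("demo_pkg") / "default_config.yaml"
        config_text = resource.read_text(encoding="utf-8")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg", None)

print(config_text)
```

In a real package only the `importlib.resources.files(...)` lines would be needed, with the package name swapped for your own.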
tool.pytest
[tool.pytest.ini_options]
minversion = "7.0"
addopts = "--cov --mpl"
testpaths = [
"tests",
]
filterwarnings = [
"ignore::DeprecationWarning",
"ignore::UserWarning"
]
tool.black
[tool.black]
line-length = 120
target-version = ["py38", "py39", "py310", "py311"]
exclude = '''
(
/(
\.eggs # exclude a few common directories in the
| \.git # root of the project
| \.venv
)/
)
'''
tool.flake8
The developers of Flake8 will not be supporting pyproject.toml
for configuration. This is a shame but a work around is available in the form of Flake8-pyproject. Make sure to add this to your requirements section to ensure it is installed when people use pre-commit
.
[tool.flake8]
ignore = ['E231', 'E241']
per-file-ignores = [
'__init__.py:F401',
]
max-line-length = 120
count = true
tool.setuptools_scm
setuptools_scm is a simple to use extension to setuptools that dynamically sets the package version based on the version control data. It is important to note that by default setuptools_scm
will attempt to bump the version of the release. The following configuration forces the use of the current git tag
.
[tool.setuptools_scm]
write_to = "tcx2gpx/_version.py"
version_scheme = "post-release"
local_scheme = "no-local-version"
git_describe_command = "git describe --tags"
tool.ruff
ruff is a Python linter written in Rust which is therefore very fast. It provides much of the same functionality as black
, flake8
and pylint
and can auto-correct many issues if configured to do so. A GitHub Action is also available. I'd recommend checking it out.
[tool.ruff]
fixable = ["A", "B", "C", "D", "E", "F", "R", "S", "W", "U"]
unfixable = []
Versioning
Typically the version is defined in the __version__
variable/object in the top-level __init__.py
or as a value in [metadata]
of setup.cfg
or [project] of pyproject.toml
but this has some downsides: you have to remember to update the string manually when you are ready for a release, and it doesn't tie in with using tags in Git to tag versions of your commits.
It is worth taking a moment to read and understand about Semantic Versioning which you are likely to use when tagging versions of your software to work with setuptools_scm
.
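As a small illustration of why tags following Semantic Versioning should be compared numerically rather than as strings, the sketch below parses `vMAJOR.MINOR.PATCH` tags into tuples. It is a simplification that ignores pre-release and build metadata, and `parse_tag` is a made-up helper, not how setuptools_scm itself reads tags.

```python
import re

# MAJOR.MINOR.PATCH with an optional leading "v", as used for Git tags here.
# Pre-release and build metadata are deliberately ignored in this sketch.
TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple[int, int, int]:
    """Split a tag such as 'v1.2.3' into integers so it sorts numerically."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


tags = ["v0.9.0", "v0.10.0", "v0.2.1"]
latest = max(tags, key=parse_tag)
print(latest)  # v0.10.0
```

A plain string comparison would pick `v0.9.0` as the newest tag here, because `"9"` sorts after `"1"`.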
Setuptools-scm
setuptools_scm is simpler to set up and use than versioneer as it relies solely on configuration via pyproject.toml
rather than being dependent on setup.py
, direct invocation of which is now deprecated.
As shown above you should have set the minimum versions of "setuptools>=65.6.3"
and "setuptools_scm[toml]>=6.2"
, dynamic = ["version"]
under project
and set the write_to = "pkg/_version.py"
(NB substitute pkg
for your package directory, whether it's src
or the package name).
[build-system]
requires = ["setuptools>=65.6.3", "setuptools_scm[toml]>=6.2"]
[project]
dynamic = ["version"]
[tool.setuptools_scm]
write_to = "pkg/_version.py"
version_scheme = "post-release"
local_scheme = "no-local-version"
git_describe_command = "git describe --tags"
Including Version in Sphinx Documentation
If you have Sphinx documentation you can add the following to docs/conf.py
from importlib.metadata import version
release = version("myproject")
version = ".".join(release.split(".")[:2])
Building your Package
Generate Distribution Archive
In your package directory you can create a distribution of your package with the latest versions of setuptools
and wheel
. First check that the build system is declared in pyproject.toml as shown below. The documentation for how to do this is at Building and Distributing Packages with Setuptools.
[build-system]
requires = [
"setuptools >= 65.6.3",
"wheel",
]
build-backend = "setuptools.build_meta"
The package can now be built locally with…
python -m pip install --upgrade setuptools wheel build
python -m build --no-isolation
…and the resulting package will be generated in the dist/
directory.
Publishing to PyPI
Before pushing the package to the main PyPI server it is prudent to test things out on TestPyPI first. You must first generate an API Token from your account settings page. It needs a name and the scope should be `Entire account (all projects)`. This token will be shown only once so do not navigate away from the page until you have copied it.
You use twine to upload the package and should create a .pypirc
file in the root of the package directory that contains your API token as the password and the username __token__
. For the TestPyPI server it has the following format.
[testpypi]
username = __token__
password = pypi-dfkjh9384hdszfkjnkjahkjfhd3YAJKSHE0089asdf0lkjsjJLLS_-0942358JKHDKjhkljna39o854yurlaoisdvnzli8yw459872jkhlkjsdfkjhasdfadsfasdf
Once this is in place you are ready to use twine
to upload the package using the configuration file you have just created.
twine upload --config-file ./.pypirc --repository testpypi dist/*
Testing Download
After having uploaded your package to the TestPyPI server you should create a clean virtual environment and try installing the package from where you have just uploaded it. You can do this using pip
with the --index-url
and --extra-index-url
options. The former installs your package from TestPyPI, the latter installs dependencies from PyPI.
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ your-package
Once installed you can try running the code, scripts or notebooks associated with the package as you would normally.
Repeat for PyPI
Once you are happy this is working you can repeat the process on the main PyPI server. You can add the token that you generate to ./.pypirc
under a separate heading.
[testpypi]
username = __token__
password = pypi-dfkjh9384hdszfkjnkjahkjfhd3YAJKSHE0089asdf0lkjsjJLLS_-0942358JKHDKjhkljna39o854yurlaoisdvnzli8yw459872jkhlkjsdfkjhdfJZZZZZF
[pypi]
username = __token__
password = pypi-dfkjh9384hdszfkjnkjahkjfhd3YAJKSHE0089asdf0lkjsjJLLS_-0942358JKHDKjhkljna39o854yurlaoisdvnzli8yw459872jkhlkjsdfkjhdfJZZZZZF
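Before uploading, a quick sanity check of the .pypirc structure can catch simple mistakes such as a missing section or the wrong username. The sketch below uses the standard library's `configparser` with placeholder tokens; `check_pypirc` is a hypothetical helper and only checks the shape of the file, not whether the token is actually valid.

```python
import configparser

# Placeholder content only; never paste a real token into example code.
PYPIRC = """
[testpypi]
username = __token__
password = pypi-EXAMPLE-TEST-TOKEN

[pypi]
username = __token__
password = pypi-EXAMPLE-TOKEN
"""


def check_pypirc(text: str) -> dict[str, bool]:
    """Report, per repository section, whether it looks like token auth."""
    # RawConfigParser avoids treating "%" characters in tokens as interpolation.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    results = {}
    for section in parser.sections():
        username = parser.get(section, "username", fallback="")
        password = parser.get(section, "password", fallback="")
        results[section] = username == "__token__" and password.startswith("pypi-")
    return results


print(check_pypirc(PYPIRC))
```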
GitHub Action
Manually uploading is somewhat time consuming and tedious. Fortunately, with setuptools_scm
in place and tokens generated we can automate the process of building and uploading packages to PyPI using the GitHub Action gh-action-pypi-publish (read more about GitHub Actions). You will have already generated a PyPI token (and similarly one for TestPyPI) and these can be stored in the project's GitHub repository under Settings > Secrets > Actions with the names PYPI_API_TOKEN
and TEST_PYPI_API_TOKEN
respectively. You can then add the following GitHub Action under .github/workflows/pypi.yaml
.
name: Publish package to PyPi
on:
  push:
    tags:
      - v*
jobs:
  build-release:
    runs-on: ubuntu-latest
    name: Publish package to PyPi
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Setup Python
        uses: actions/setup-python@v4.3.0
        with:
          python-version: 3.9
          cache: 'pip'
      - name: Installing the package
        run: |
          pip3 install .
          pip3 install .[pypi]
      - name: Build package
        run: |
          python -m build --no-isolation
      - name: Publish package to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}
Releasing via GitHub
With setuptools_scm
in place and a GitHub Action set up and configured it is now possible to make a release to PyPI via GitHub Releases.
- Go to the Releases page (it's linked from the right-hand side of the front-page).
- Draft a New release.
- Create a new tag using semantic versioning and select “Create new tag v#.#.# on publish”.
- Click the "Generate Release Notes" button. This adds the titles of all Pull Requests; I'll often remove these but leave the link to the ChangeLog that is generated for the release.
- Write your release notes.
- Select "Set as latest release".
- Select "Create a discussion for this release" and select "Announcements".
- Click on "Publish Release".
Packaging Frameworks
There are some frameworks that are meant to ease the pain of this process. I'm yet to test these because I wanted to understand what is going on under the hood before learning an additional framework.
PDM
PDM (Python package and Dependency Manager) handles all stages of setting up and creating a package and managing its dependencies. In essence it's a tool for interactively generating the configuration files described above. I've not yet used it.
Poetry
Poetry is another package for managing packaging and dependencies. Again, I've not yet used it.
Links
- PyPA : Building and Distributing Packages with Setuptools
- PyPA : Specifications
- Packaging Python Projects
- Python package structure information — pyOpenSci Python Packaging Guide
- Packaging Data files in a Python Distribution
- PDM - Python package and Dependency Manager
- Why you shouldn't invoke setup.py directly
- python-versioneer/python-versioneer: version-string management for VCS-controlled trees
- pypa/setuptools_scm: the blessed package to manage your versions by scm tags
- rye one-stop-shop for Python
Reuse
Citation
@online{shephard2023,
author = {Shephard, Neil},
title = {Python {Packaging}},
date = {2023-03-25},
url = {https://blog.nshephard.dev/posts/python-packaging/},
langid = {en}
}