Put fewer packages on PyPI #487

Open
@mattip

This project requires a disproportionate amount of storage space on PyPI. This is problematic since the storage space is donated and the general assumption is that projects will not over-use the resources. To analyze what is going on, let's look at some data.

Each release of the project creates these artifacts (taken from the 4.35.1 release):

cp27-cp27m-macosx_10_9_x86_64.whl, 684.1 kB
cp27-cp27m-manylinux1_i686.whl, 2.4 MB
cp27-cp27m-manylinux1_x86_64.whl, 2.6 MB
cp27-cp27m-manylinux2010_i686.whl, 2.4 MB
cp27-cp27m-manylinux2010_x86_64.whl, 2.6 MB
cp27-cp27mu-manylinux1_i686.whl, 2.4 MB
cp27-cp27mu-manylinux1_x86_64.whl, 2.6 MB
cp27-cp27mu-manylinux2010_i686.whl, 2.4 MB
cp27-cp27mu-manylinux2010_x86_64.whl, 2.6 MB
cp35-cp35m-macosx_10_9_x86_64.whl, 698.8 kB
cp35-cp35m-manylinux1_i686.whl, 2.9 MB
cp35-cp35m-manylinux1_x86_64.whl, 3.1 MB
cp35-cp35m-manylinux2010_i686.whl, 2.9 MB
cp35-cp35m-manylinux2010_x86_64.whl, 3.1 MB
cp35-cp35m-manylinux2014_aarch64.whl, 3.7 MB
cp35-cp35m-win32.whl, 354.9 kB
cp35-cp35m-win_amd64.whl, 422.1 kB
cp36-cp36m-macosx_10_9_x86_64.whl, 700.7 kB
cp36-cp36m-manylinux1_i686.whl, 3.0 MB
cp36-cp36m-manylinux1_x86_64.whl, 3.3 MB
cp36-cp36m-manylinux2010_i686.whl, 3.0 MB
cp36-cp36m-manylinux2010_x86_64.whl, 3.3 MB
cp36-cp36m-manylinux2014_aarch64.whl, 3.8 MB
cp36-cp36m-win32.whl, 383.6 kB
cp36-cp36m-win_amd64.whl, 451.7 kB
cp37-cp37m-macosx_10_9_x86_64.whl, 704.6 kB
cp37-cp37m-manylinux1_i686.whl, 3.0 MB
cp37-cp37m-manylinux1_x86_64.whl, 3.2 MB
cp37-cp37m-manylinux2010_i686.whl, 3.0 MB
cp37-cp37m-manylinux2010_x86_64.whl, 3.2 MB
cp37-cp37m-manylinux2014_aarch64.whl, 3.8 MB
cp37-cp37m-win32.whl, 381.7 kB
cp37-cp37m-win_amd64.whl, 452.2 kB
cp38-cp38-macosx_10_9_x86_64.whl, 730.4 kB
cp38-cp38-manylinux1_i686.whl, 4.0 MB
cp38-cp38-manylinux1_x86_64.whl, 4.2 MB
cp38-cp38-manylinux2010_i686.whl, 4.0 MB
cp38-cp38-manylinux2010_x86_64.whl, 4.2 MB
cp38-cp38-manylinux2014_aarch64.whl, 4.8 MB
cp38-cp38-win32.whl, 394.0 kB
cp38-cp38-win_amd64.whl, 479.7 kB
cp39-cp39-macosx_10_9_x86_64.whl, 734.2 kB
cp39-cp39-manylinux1_i686.whl, 3.5 MB
cp39-cp39-manylinux1_x86_64.whl, 3.8 MB
cp39-cp39-manylinux2010_i686.whl, 3.5 MB
cp39-cp39-manylinux2010_x86_64.whl, 3.8 MB
cp39-cp39-manylinux2014_aarch64.whl, 4.3 MB
cp39-cp39-win32.whl, 392.6 kB
cp39-cp39-win_amd64.whl, 479.4 kB
pp27-pypy_73-macosx_10_9_x86_64.whl, 501.4 kB
pp27-pypy_73-manylinux1_x86_64.whl, 543.0 kB
pp27-pypy_73-manylinux2010_x86_64.whl, 543.0 kB
pp27-pypy_73-win32.whl, 342.4 kB
pp36-pypy36_pp73-macosx_10_9_x86_64.whl, 498.5 kB
pp36-pypy36_pp73-manylinux1_x86_64.whl, 542.1 kB
pp36-pypy36_pp73-manylinux2010_x86_64.whl, 542.1 kB
pp36-pypy36_pp73-win32.whl, 300.8 kB
pp37-pypy37_pp73-macosx_10_9_x86_64.whl, 498.5 kB
pp37-pypy37_pp73-manylinux1_x86_64.whl, 542.1 kB
pp37-pypy37_pp73-manylinux2010_x86_64.whl, 542.0 kB
pp37-pypy37_pp73-win32.whl, 300.8 kB

I think I left out the source tarball. This sums to about 122 MB per release. The project has had about 50 releases in the first half of 2021, sometimes multiple releases on a single day, which comes out to about 12 GB a year. It seems this project has under 2000 downloads a month. SciPy, by comparison, ships 18 wheels, each about 30 MB, twice a year for 30 GB of yearly storage and has about 30 million downloads a month (take those statistics with a grain of salt: they say the last version of this package is 1.2.0).
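For anyone who wants to re-check the totals, here is a minimal sketch of the arithmetic. It assumes the sizes are decimal units (1 MB = 1000 kB) and that each line has the `name.whl, size unit` shape used in the listing above. `SAMPLE_LISTING`, `total_mb` and `yearly_gb` are names made up for this illustration, and only a few lines of the listing are included to keep it short.

```python
import re

# A few lines copied from the 4.35.1 listing; paste the full listing here
# to reproduce the per-release total.
SAMPLE_LISTING = """\
cp39-cp39-manylinux2010_x86_64.whl, 3.8 MB
cp39-cp39-manylinux2014_aarch64.whl, 4.3 MB
cp39-cp39-win32.whl, 392.6 kB
cp39-cp39-win_amd64.whl, 479.4 kB
"""

# Assumption: decimal units, i.e. 1 MB = 1000 kB.
UNIT_TO_MB = {"kB": 1 / 1000, "MB": 1.0}

LINE_RE = re.compile(r"^(?P<name>\S+\.whl),\s*(?P<size>[\d.]+)\s*(?P<unit>kB|MB)$")


def total_mb(listing: str) -> float:
    """Sum the sizes of all wheel lines in the listing, in MB."""
    total = 0.0
    for line in listing.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognised lines
        total += float(match["size"]) * UNIT_TO_MB[match["unit"]]
    return total


def yearly_gb(per_release_mb: float, releases_per_year: int) -> float:
    """Storage added per year, in GB, for a given release cadence."""
    return per_release_mb * releases_per_year / 1000


print(f"sample total: {total_mb(SAMPLE_LISTING):.3f} MB")
# ~50 releases in half a year is roughly 100 releases a year.
print(f"yearly estimate: {yearly_gb(122, 100):.1f} GB")
```

With the full listing pasted in, the per-release total should land near the 122 MB quoted above, and 100 releases a year at that size gives roughly 12 GB.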

So how can you reduce the resource requirements by three orders of magnitude?

  • Release a pure-Python version of the package. This would reduce both the number of wheels and their size. Is it clear that the Cython speed is a requirement of the project? Note that this would not preclude building wheels for the "more important" platforms: pip install will prefer binary wheels to pure-Python ones. You may be interested in refactoring the code to use the "pure Python" mode available in Cython 3.0, which will make supporting both modes in the codebase simpler.
  • Release 4 times a year instead of ~100 times a year (a 25x reduction).
  • Do not release both manylinux1 and manylinux2010 packages (a 2x reduction). I would stick with manylinux2010, but you know your users better than I do.
  • Drop older versions of Python (3.5, 3.6, PyPy 2.7, PyPy 3.6) (around a 2x reduction).
  • Strip the builds. I see you use cibuildwheel; there is a discussion on how to do this in pypa/cibuildwheel#331, "Strip debug symbols of wheels" (maybe ~3x reduction, maybe more?).
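As a rough sanity check, the estimated factors in the list can be multiplied together. This is only a sketch: it assumes the reductions are independent and compound cleanly, which they probably do not (stripping may not shrink every wheel equally, for example). The pure-Python option is left out because no factor is given for it. The variable names are invented for this illustration.

```python
import math

BASELINE_GB_PER_YEAR = 12.0  # yearly storage figure quoted in this issue

# Reduction factors as estimated in the list; all of them are rough guesses.
REDUCTIONS = {
    "release 4 times a year instead of ~100": 25,
    "drop the duplicate manylinux1 wheels": 2,
    "drop older Python versions": 2,
    "strip debug symbols": 3,
}

# Assumption: the factors are independent, so they simply multiply.
combined = math.prod(REDUCTIONS.values())
remaining_mb = BASELINE_GB_PER_YEAR * 1000 / combined
orders_of_magnitude = math.log10(combined)

print(f"combined reduction: {combined}x")
print(f"remaining storage: about {remaining_mb:.0f} MB a year")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

Under these assumptions the four quantified items alone come to about 300x, or roughly two and a half orders of magnitude, so the rest would have to come from the pure-Python release or a larger gain from stripping.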
