Let’s talk about wheel today, definitely hoping it was a ferris wheel but its going to be python’s wheel for now. ferris wheel

A wheel file is a relocatable package format for distribution of python packages for easy, quick and deterministic installations. All the packages published on pypi have under their Download files section a tar file and whl file.

For example you can head over to Django and download its wheel file which can then be installed simply by doing

pip install Django-5.0.6-py3-none-any.whl

Naming Format

A wheel file name tells us details about the particular package that we are going to install. In our case the Django wheel file tells us that:

PropertyValue
Package/Distribution NameDjango
Version5.0.6
Implementation Languagepy3
ABI (Application Binary Interface)none
Platformany

Contents

A wheel file apart from the package code also contains metadata. To look at the contents of wheel file we can run the following commands to unpack things as its just a zip file.

cp Django-5.0.6-py3-none-any.whl Django-5.0.6-py3-none-any.zip
unzip Django-5.0.6-py3-none-any.zip -d Django-5.0.6-py3-none-any
ls Django-5.0.6-py3-none-any

There are two directory in inside of the uncompressed directory

  • django
  • Django-5.0.6.dist-info

Of these two the dist-info folder is always present in all packages it contains the metadata of the package. The rest packages are generally collected either using automatic discovery or based on findings of

setuptools.find_packages(where = "src")

Distribution Information

The following files are present inside the dist-info directory

➜ ls --tree Django-5.0.6.dist-info
Django-5.0.6.dist-info
├── AUTHORS
├── entry_points.txt
├── LICENSE
├── LICENSE.python
├── METADATA
├── RECORD
├── top_level.txt
└── WHEEL

METADATA contains information about the package name, version, readme, dependencies etc. The WHEEL file consists details about how the package build was generated. Rest files have a self understandable name.

RECORD File

The RECORD file is a CSV file, composed of records, one line per installed file. It ensures that file used during installation are the same ones from which package was built. It is done by keeping digital signatures of each file.

django/views/templates/technical_500.txt,sha256=b0ihE_FS7YtfAFOXU_yk0-CTgUmZ4ZkWVfkFHdEQXQI,3712

Each line is composed of three elements:

  • The path of included file
  • A hash of the file’s contents
  • Size of file in bytes

The hash used here is sha256 earlier it used to be md5sum which is a weak hash. The sha256 hash is encoded in a URL safe base64 encoding with no trailing “=” characters.

Incorrect Package Build

The problem I encountered was that the wheel for a package I received didn’t include all its required dependencies. The person who built this package and knew where code was had left the team :) so I couldn’t really bother him

So the only possible workaround I had to remedy this was to specify the required dependencies of this package inside my module.

no fricking way

I said a big NO! because this is a bad practice. We definitely don’t want to start specifying transitive dependencies of each package that we use it’s their responsibility.

METADATA injection

Armed with information provided above I updated the METADATA file to correctly specify all required additional dependencies.

Requires-Dist: ruptures ==1.1.9
Requires-Dist: numba ==0.59.1

Re-Hash Signature

Given I went through contents of RECORD file it was clear that just updating METADATA won’t be sufficient as the install will start failing due to mis match of expected v/s newly generated digital signature of METADATA.

I took help from our old friends at stackoverflow instead of fighting with AI Chat bots. The suggested way to generate hash was using this python script.

from hashlib import sha256
from base64 import urlsafe_b64encode

file_content = open('METADATA', 'rb').read()
digest = sha256(file_content).digest()
encoding = urlsafe_b64encode(digest)
print(encoding.decode('latin1').rstrip('='))

Given the size of file might have also changed I needed to update that as well

wc -c METADATA

Release New Version

We also have to update the version references of package so that a new one can be released from these places:

  • METADATA
  • Django-5.0.6.dist-info

We can just update from 5.0.6 to 5.0.7 for now.

Now that we have modified required files we need to re compress things and build the wheel file again, which can be done as below

cd Django-5.0.6-py3-none-any
zip -r Django-5.0.7-py3-none-any.zip ./*
cp Django-5.0.7-py3-none-any.zip Django-5.0.7-py3-none-any.whl

What all was left was to push this package to our package repository and then using it.

twine upload -r $REPOSITORY_NAME Django-5.0.7-py3-none-any.whl
pip install $mypackage

thank you

That’s all folks been messing with python for sometime now. You can be sure that I still don’t like it there are too many foot guns but am learning about it somebody has to so that you can keep using your LLMs.