To download wheel dependencies for various offline machines using a laptop in a cafe with a dicey, flaky Internet connection, I would like to know the URLs of the files to fetch so I can use a more robust tool like aria2.
A command like the following is very fragile:
$ python -m pip download --verbose -d ./wheel_cache/ argostranslate
It can only handle the job if the uplink is fast and reliable. The logs of a download look like this:
…
Collecting sacremoses<0.2,>=0.0.53
Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.5/897.5 kB 645.5 kB/s eta 0:00:00
…
It’s a shame the URLs are concealed even with the verbose flag, and even when there are errors (which is when the URL is most needed). The docs show no way of increasing log verbosity any further.
UV – not ideal, but the UV method is still worth discussing
I heard a suggestion that UV could likely reveal the URLs. UV is becoming a popular replacement for the pip* tools, but for me it’s not mature or popular enough (judging by its absence from the official Debian repos). For the record, it would still be interesting to document how to use UV to derive the URLs.
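My rough, untested guess at how that would look: UV writes the full wheel URLs into its lock file, so a throwaway project could be created purely to produce that file and the URLs grepped out of it. The --no-sync flag and the lock file layout are assumptions on my part, so check them against the UV docs:

# throwaway project purely so UV has something to resolve
$ uv init /tmp/urlgrab && cd /tmp/urlgrab
# resolve the dependency tree and write uv.lock without installing anything
# (--no-sync is an assumption; the point is to lock, not to sync an env)
$ uv add --no-sync argostranslate
# uv.lock should contain one https URL per wheel/sdist
$ grep -o 'https://files\.pythonhosted\.org/[^"]*' uv.lock > urls.txt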
Hacking
I am most interested in deriving the URLs using Debian-supported apps. Hacker methods welcome, e.g. clever use of strace with a pip* command. Though I doubt strace would see URLs, perhaps something like privoxy or socat would. Of course this could be quite tedious because the pip* commands have no simulation mode, so pip must first be given an opportunity to fetch every file. When a fetch fails on one file, pip terminates, which would force us to feed one URL at a time to aria2 in a manually intensive serial procedure.
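One catch with the proxy idea: pip talks HTTPS, so out of the box a plain relay like privoxy or socat would only ever see CONNECT pypi.org:443, never the full file URL. An intercepting proxy that re-signs TLS should work, though. Here is a sketch using mitmproxy (also in the Debian repos) in place of privoxy/socat – pip’s --proxy and --cert options are real, but treat the CA path below as an assumption about mitmproxy’s defaults:

# terminal 1: logging MITM proxy; it prints each request URL as pip fetches it
$ mitmdump -p 8080

# terminal 2: route pip through the proxy and trust its CA so the interception works
$ python -m pip download -d ./wheel_cache/ \
      --proxy http://127.0.0.1:8080 \
      --cert ~/.mitmproxy/mitmproxy-ca-cert.pem \
      argostranslate

This still makes pip do the downloads, but the URLs land in mitmdump’s log even when a transfer fails partway, so they can be replayed with aria2 afterwards.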
@evenwicht The --debug option will show URLs. But you’d probably have an easier time with unearth (https://pypi.org/project/unearth/), which by default will just print the URL and other metadata without downloading the wheel file. Or, the API offered by PyPI is a published standard (https://packaging.python.org/en/latest/specifications/simple-repository-api/), so in principle you can get wheel URLs by just following the instructions in that standard - you can even open some (rather large) HTML pages in your browser and click a few links and get the wheel URLs that way.
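For example, something along these lines should list every wheel/sdist URL for one project straight from the JSON form of that index (curl and jq assumed installed; the Accept header is the one from the spec):

$ curl -s -H 'Accept: application/vnd.pypi.simple.v1+json' \
      https://pypi.org/simple/argostranslate/ | jq -r '.files[].url'

That only covers a single project, though; resolving the whole dependency tree is still on you (or on a tool like unearth).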
Thanks! I noticed python3-unearth is a Debian pkg, so that looks like a good first approach for me to try.
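If it behaves the way the project page suggests, a sketch like this could build a URL list for aria2 (the .link.url JSON path is my assumption about unearth’s output, and jq does the extraction):

# print metadata for the best matching file of one requirement, without downloading it
$ unearth "argostranslate" | jq -r '.link.url' >> urls.txt
# then hand the accumulated list to aria2 for robust, resumable fetching
$ aria2c -c -i urls.txt -d ./wheel_cache/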
can’t you just like patch pip to output/log the url?
Sounds feasible for a Python dev, which I am not. The situation is that end users of Python apps (who are not necessarily programmers of any kind) are put in the position of grappling with developer tools.
I’m not only looking for a better approach for my own installation of argostranslate; I also intend to publish the improved approach. Although I could probably work out how to patch pip, it gets messy when that patch has to go into an argostranslate installation guide for end users. So patching pip would be high effort with low return (collectively, unless the patch gets accepted as a PR, but then only an MS GitHub user could submit the PR).


