Python Package Management
The addition of Python to the MultiValue (MV) platform is an exciting innovation from Rocket and promises to open up great opportunities for old and new MV applications. There’s a lot to like about Python, but one of the nicest things is its extensibility, which it achieves via additional packages. These packages are available for a wide range of uses, such as data science, artificial intelligence, graphing, data transformation, web service interaction and many more. There appears to be no limit to what these packages can offer. At a recent MultiValue Technology Day, I was asked a question about deploying and managing extension packages that are used as part of an MV Python solution. With that in mind, let’s look at some of the deployment and management options that are available.
Many of these additional packages can be found on the Python Package Index (PYPI), which is the main repository of third-party packages for the Python language. A new version of the PYPI went live on April 16, 2018, so if you have not been there before, or haven’t been for a while, you should go and browse around the site to see what is available. Currently (May 2018), it boasts over 137,000 projects!
The PYPI is maintained under the umbrella of the Python Packaging Authority (PYPA), which “is an independent group of developers whose goal is to improve and maintain many of the core projects related to Python packaging” (https://pypi.org/help/#maintainers).
So, that’s a bit about the PYPI, but how do you use it? Let’s assume that you need to extend your Python installation with some functionality that is not provided by the Python core, and you have decided that what you need is NumPy. I’m not going to go into NumPy itself here, as I’ll just be using it as an example of how to manage Python packages, but Mike Rajkowski has written a blog post on NumPy which covers it in some detail.
The tool used to manage Python packages is pip, which has been bundled with Python since version 3.4. To check if you have pip, start an operating system command prompt and run:
[box border=”full”]pip --version[/box]
If you get a response telling you what version of pip is installed, then you’re good to go!
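The same check can be made from inside Python, which is handy when you are not sure which Python installation a command prompt is pointing at. This is only a small sketch using the standard library; the function name pip_status is my own invention for illustration.

```python
import importlib.util
import shutil
import sys


def pip_status():
    """Report which Python is running and whether pip can be found."""
    return {
        # The interpreter that is running this script.
        "python": sys.executable,
        # True if this interpreter can locate the pip module
        # (i.e. "python -m pip" should work).
        "pip_module_found": importlib.util.find_spec("pip") is not None,
        # Full path of a "pip" executable on PATH, or None if there is none.
        "pip_on_path": shutil.which("pip"),
    }


for key, value in pip_status().items():
    print(key, "=", value)
```

If pip_module_found is True, running pip as "python -m pip" with that same interpreter guarantees that packages are installed into the Python you expect.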
Check what you have
You’ll want to know what packages are already installed on your Python installation and that is easy to do by running:
[box border=”full”]pip list[/box]
You’ll be shown a list of installed packages and their release levels.
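If you want the same information inside a Python program, the standard library can supply it. The sketch below assumes Python 3.8 or later (where importlib.metadata is available); the function name installed_packages is just an illustrative choice, and the output will not be formatted exactly like pip list.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs for installed packages."""
    pairs = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:  # skip damaged installs with missing metadata
            pairs.add((name, version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name:<30} {version}")
```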
Installing packages using PIP is done with the pip install command. To install NumPy from the PYPI, run:
[box border=”full”]pip install numpy[/box]
If all goes well, you should receive a message saying that numpy has been successfully installed, along with the release that was installed. Run pip list to check, and you should see numpy in the list.
To uninstall a package, use the pip uninstall command. To uninstall the NumPy package, run:
[box border=”full”]pip uninstall numpy[/box]
The pip uninstall command will ask for confirmation before actually uninstalling the package. To make sure the package has been uninstalled, check the installed packages with pip list.
Doing more with PIP
When packages are installed using the minimal pip install <package_name> command, a couple of important assumptions are made. Firstly, it is assumed that the latest release of the package is needed and secondly it is assumed that the package is to be downloaded from the PYPI.
That makes for an easy installation but may not be what you want. You might want a different release or you may not want, or be able, to have PIP download the file from the PYPI. Perhaps you already have the source file. Fortunately, there is a lot more you can do with PIP.
With PIP, a requirement specifier is a “format used by pip to install packages from a Package Index” (https://packaging.python.org/glossary/#term-requirement-specifier). The requirement specifier details both the package name and release level.
What this means for package installation is that not only the name of the package but also a particular version of the package can be specified for installation. Let’s have a closer look.
If we search the PYPI for the NumPy package, we’ll find the current (May 2018) version is 1.14.3. Now let’s assume that, for some reason, we actually want to install version 1.13.0 instead of the current release. Using a requirement specifier, we can achieve that with the following command:
[box border=”full”]pip install numpy==1.13.0[/box]
When that command is run, PIP will do the following:
- Collect (download) the package for Numpy at release 1.13.0
- Look for other versions of Numpy
- Uninstall other versions of Numpy
- Install the version required – in this case 1.13.0.
After doing that, run pip list to check what is now installed.
It has worked: NumPy 1.13.0 is now installed, which is not the latest release but is what we asked for.
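A script can make the same check without reading through the pip list output. This is a minimal sketch, again assuming Python 3.8 or later; installed_version is a hypothetical helper name, and 1.13.0 is simply the release used in the example above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


wanted = "1.13.0"
found = installed_version("numpy")
if found is None:
    print("numpy is not installed")
elif found == wanted:
    print("numpy is at the requested release", wanted)
else:
    print("numpy is installed at", found, "but", wanted, "was wanted")
```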
There is a lot more that can be done with requirement specifiers regarding the version of a package that is required. Not only can a version be explicitly specified but also we can specify:
- Compatible versions
- Versions to exclude
- Inclusive version ranges, e.g. install a version between version 1 and version 2
The PIP documentation describes requirement specifiers in more detail (https://pip.pypa.io/en/stable/reference/pip_install/#requirement-specifiers).
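To get a feel for how these specifiers behave, here is a deliberately simplified matcher. It is not pip’s implementation: it only understands plain dotted numeric versions such as 1.14.3, and it ignores pre-releases, wildcards and the other cases that the real rules cover. The function name matches and the helper names are my own. The operators shown (==, !=, >=, <=, >, < and the “compatible release” operator ~=) are the ones pip accepts.

```python
import operator
import re

_COMPARE = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def _parse(text):
    """Turn '1.14.3' into (1, 14, 3)."""
    return tuple(int(part) for part in text.split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so that 1.13 compares equal to 1.13.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def matches(version, specifier):
    """Return True if version satisfies every comma-separated clause."""
    have = _parse(version)
    for clause in specifier.split(","):
        found = re.fullmatch(
            r"\s*(~=|==|!=|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*", clause
        )
        if found is None:
            raise ValueError("unsupported clause: " + repr(clause))
        op, wanted_text = found.groups()
        wanted = _parse(wanted_text)
        have_p, wanted_p = _pad(have, wanted)
        if op == "~=":
            # Compatible release: ~=1.13.0 means >=1.13.0 and ==1.13.*
            if len(wanted) < 2:
                raise ValueError("~= needs at least two version parts")
            prefix = wanted[:-1]
            ok = have_p >= wanted_p and have_p[: len(prefix)] == prefix
        else:
            ok = _COMPARE[op](have_p, wanted_p)
        if not ok:
            return False
    return True


print(matches("1.13.5", "~=1.13.0"))      # compatible version
print(matches("1.14.0", "!=1.14.0"))      # version to exclude
print(matches("1.14.3", ">=1.13,<1.15"))  # version range
```

On the command line, the equivalent would be something like pip install "numpy>=1.13,<1.15"; the quotes stop the shell from treating < and > as redirection.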
Installing from a wheel file
Sometimes, it might be useful to install a package from a file that is kept locally on disk. Being able to do this helps you control the installation process, because you know exactly which wheel file is being used to install the package.
But before we get into it, what is a ‘wheel’ file? According to the glossary (https://packaging.python.org/glossary/#term-wheel), it is a “Built Distribution format introduced by PEP 427, which is intended to replace the Egg format”. So, it’s the file format used to distribute pre-built Python packages, and it has the file extension “.whl”.
So, how do we install from a wheel file? Well, with PIP of course! But before we can install a package from a wheel file, first we will need to obtain the wheel file. In this case you’ll have to download it from the PYPI first, or obtain it from some other source. More on that a bit later.
Installing from a local wheel file is done simply by supplying the location of the wheel file to PIP. Let’s assume that the Numpy wheel file has been downloaded into the /root/downloads/python/wheels folder as we see below:
[root@rrlinuxvm wheels]# ls -l
-rw-r--r--. 1 root root 12132820 Apr 27 16:15 numpy-1.14.2-cp34-cp34m-manylinux1_x86_64.whl
To install from there, the PIP command is simply:
[box border=”full”]pip install ~/downloads/python/wheels/numpy-1.14.2-cp34-cp34m-manylinux1_x86_64.whl[/box]
PIP will read the wheel file from disk, process it and install the package just as if it had downloaded the package first. At the end of that process, NumPy release 1.14.2 will be installed because, according to the wheel file name (numpy-1.14.2-cp34-cp34m-manylinux1_x86_64.whl), that is the version that is on disk. And indeed it is.
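The wheel file name carries more than the version. PEP 427 defines it as distribution-version, an optional build tag, then the Python tag, ABI tag and platform tag. The sketch below splits a name into those parts; parse_wheel_filename and WheelName are names I have made up for the example, and the code does no validation beyond counting the parts.

```python
import os
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_filename(path):
    """Split a wheel file name into the parts defined by PEP 427."""
    filename = os.path.basename(path)
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None  # the build tag is optional and usually absent
    elif len(parts) == 6:
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected wheel file name: " + filename)
    return WheelName(distribution, version, build, python, abi, platform)


info = parse_wheel_filename("numpy-1.14.2-cp34-cp34m-manylinux1_x86_64.whl")
print(info.distribution, info.version)  # package name and release
print(info.python, info.abi)            # built for CPython 3.4
print(info.platform)                    # 64-bit Linux
```

This is why a wheel has to match your Python version and operating system: a cp34 manylinux1_x86_64 wheel is built for CPython 3.4 on 64-bit Linux.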
So far, we have looked at installing a package both from a PYPI download and from a locally stored wheel file. Excellent! There is, however, more that can be done.
Imagine this scenario. You have completed writing an application with Python and your application relies on a specific set of packages and specific release levels for those packages. Furthermore, some of those packages require other packages to be installed. Having completed the application, you need a repeatable way of installing only the packages you need at the release level you require. This is where a requirements file (https://pip.pypa.io/en/stable/user_guide/#requirements-files) is useful.
A requirements file is simply a text file containing a list of packages, possibly including release levels, which need to be installed. These requirements can be specified using the requirement specifiers we discussed before.
It’s easier to understand with an example, so let’s see one:
####### req.txt #######
beautifulsoup4==4.6.0
bs4==0.0.1
lxml==4.1.1
numpy==1.14.2
pygal==2.4.0
In this sample requirements file (req.txt), it’s simple to see that we want to install:
- beautifulsoup4 at release 4.6.0
- bs4 at release 0.0.1
- lxml at release 4.1.1
- numpy at release 1.14.2
- pygal at release 2.4.0
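Because a requirements file is plain text, it is easy to read from a program, for example to report which releases an application expects. The sketch below handles only the simple name==version lines used in this article, plus blank lines and # comments; real requirements files allow much more (other operators, options, URLs), so treat parse_pins as a hypothetical helper for illustration only.

```python
def parse_pins(text):
    """Return {package: version} for the name==version lines in text."""
    pins = {}
    for raw in text.splitlines():
        # Drop comments (simplified: anything after a '#') and whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, version = line.partition("==")
        name, version = name.strip(), version.strip()
        if not separator or not name or not version:
            raise ValueError("not a simple pinned requirement: " + repr(raw))
        pins[name] = version
    return pins


REQ_TXT = """\
####### req.txt #######
beautifulsoup4==4.6.0
bs4==0.0.1
lxml==4.1.1
numpy==1.14.2
pygal==2.4.0
"""

for package, release in parse_pins(REQ_TXT).items():
    print(package, "at release", release)
```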
Installing packages using a requirements file
To install packages using a requirements file, use the -r option on pip install to specify the requirements file to use.
[box border=”full”]pip install -r ~/downloads/python/req.txt[/box]
pip will download the release specified for each package and install it.
Running pip list afterwards shows what we now have installed.
The packages we need have been installed at the release levels we require.
The -r option can be used multiple times to specify multiple requirements files.
Note that if any package or specified release does not exist, pip will display an error and the package installation process will be aborted.
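If the installation is driven from a deployment script, the script can build the command with one -r per file and then check pip’s exit code, which is non-zero when the installation is aborted. This is a sketch only: the helper names are mine, and the file names req.txt and more-req.txt are placeholders. Nothing is installed until install_from is actually called.

```python
import subprocess
import sys


def build_install_command(requirements_files):
    """Build a pip install command with one -r option per requirements file."""
    command = [sys.executable, "-m", "pip", "install"]
    for path in requirements_files:
        command += ["-r", path]
    return command


def install_from(requirements_files):
    """Run pip and return True on success, False if pip reported an error."""
    result = subprocess.run(build_install_command(requirements_files))
    return result.returncode == 0


# Show the command that would be run, without running it.
print(" ".join(build_install_command(["req.txt", "more-req.txt"])))
```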
Uninstalling with a requirements file
The pip uninstall command we looked at earlier can also accept a requirements file which, as with installation, is specified with the -r option. Let’s do an uninstallation using the same requirements file (req.txt):
[box border=”full”]pip uninstall -y -r ~/downloads/python/req.txt[/box]
Note the use of the -y option to suppress the deletion confirmation. That allows all the packages to be uninstalled with a minimum of user input. Running that command will uninstall the packages specified by the requirements file and gives the following output:
Successfully uninstalled beautifulsoup4-4.6.0
Successfully uninstalled lxml-4.1.1
Successfully uninstalled numpy-1.14.2
Successfully uninstalled bs4-0.0.1
Successfully uninstalled pygal-2.4.0
Gathering the requirements
So, using a requirements file can greatly help manage the installation and uninstallation of required packages on your Python installation but how can a requirements file be generated? Of course, being a text file, it can be generated using a text editor and assuming the file is formatted correctly, it will work correctly.
There is another way, however, and that is with pip freeze. Running pip list (as we now well know!) will list the installed packages, but not in the format needed by a requirements file. The pip freeze command, on the other hand, outputs the list of installed packages in exactly that format, with one name==version line per package.
Creating the requirements file is simply a matter of redirecting the output to a file, like this:
[box border=”full”]pip freeze > req.txt[/box]
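To show what that format looks like, here is a rough approximation of pip freeze written with the standard library (Python 3.8 or later assumed). It is not a replacement for the real command, which treats some kinds of installation differently; freeze_lines and write_requirements are hypothetical names.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' lines for the installed packages."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:  # skip installs with missing metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


def write_requirements(path):
    """Write the lines to a requirements file, one package per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in freeze_lines():
            handle.write(line + "\n")


for line in freeze_lines():
    print(line)
```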
Downloading wheel files
We saw earlier that it is simple to install a package from a local wheel file, but how can you get those files? The most obvious answer is to use your browser to download the required wheel file(s) from the PYPI. It’s a simple matter of searching the PYPI for your package and using the ‘Download files’ link. But you can’t do that if you don’t have a browser (e.g. on an OS with no GUI installed), so what to do?
PIP makes it simple to download the package files too, by providing the pip download command (https://pip.pypa.io/en/stable/reference/pip_download/).
The -d option specifies the directory that the downloaded wheel files will be saved in. Let’s download numpy at release 1.14.2 and save it in ~/downloads/python/wheels/:
[box border=”full”]pip download numpy==1.14.2 -d ~/downloads/python/wheels/[/box]
The pip download command has similar syntax and options to the pip install command, which means that requirement specifiers may be used to specify a particular release to download. Having created your requirements file to provide for more controlled application deployment, you can then download the wheel files it needs as well. That means the correct release of each package can be shipped with the application, so deployment won’t rely on PIP downloading them from the PYPI. Let’s use the requirements file we created earlier to download the wheel files for the packages we need:
[box border=”full”]pip download -r ~/downloads/python/req.txt -d ~/downloads/python/wheels/[/box]
Note the use here of the -r option to specify the requirements file to use and the -d option to specify the save location of the wheel files.
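Once the wheel files are on disk, the deployment side can install from that folder alone. The sketch below builds such a command using two pip install options not covered above: --no-index, which tells pip not to contact the PYPI, and --find-links, which points it at a local directory of files. The helper name and the paths are only examples, and the command is printed rather than run.

```python
import os
import sys


def build_offline_install_command(requirements_file, wheel_dir):
    """Build a pip command that installs only from a local wheel folder."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # do not look on the package index
        "--find-links", wheel_dir,  # look in this local directory instead
        "-r", requirements_file,
    ]


# "~" is expanded here because no shell is involved to do it for us.
command = build_offline_install_command(
    os.path.expanduser("~/downloads/python/req.txt"),
    os.path.expanduser("~/downloads/python/wheels/"),
)
print(" ".join(command))
```

Remember that the wheels must have been downloaded for the same Python version and platform as the machine they will be installed on.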
One of the great things about Python is its easy extensibility through the large number of packages. Most commonly, those packages will be available from the Python Package Index (PYPI).
Once an application has been built using some additional packages, the issue of package management becomes important. We need a tool that gives the ability to manage the installation, uninstallation, upgrading and listing of packages. PIP is the tool we need.
In this article, I have tried to provide an introduction to the capabilities of the PIP tool but have really only scratched the surface of what it can do. To find out more about installing Python packages, refer to the Python documentation on Installing Python Modules.
For full details of the PIP tool, refer to the documentation for the PIP tool on the PYPA site.
I hope this has been helpful but if you can add any more to the topic, I would love to hear from you. I would also love to hear from you if you want to start a conversation about the addition of Python to the MultiValue database platforms from Rocket Software.
Meier Business Systems