To use some of the great Python libraries out there with Google App Engine, it's best to package them into a ZIP archive and distribute it alongside the other files under development. It's a bit like JARing things up in Java: you never have to touch the individual library packages, and you get one stable library base to work with until you decide to update it all again.
I haven’t been able to find a useful guide to do this via the following searches (which I’m including so Google might pick up on them):
“bundling multiple python modules”
“bundling python packages into zip”
“using pip to create zip packages”
“building library package pip”
The standard pip installer program, which is used to distribute and install so many Python packages, must have a way of doing this.
First, let’s see what is currently installed in the virtualenv that I’m using to do my development. In this case, I’m using Flask and a number of items related to it:
$ pip freeze > requirements.txt
$ cat requirements.txt
Flask==0.9
Flask-WTF==0.8
WTForms==1.0.2
Werkzeug==0.8.3
pystache==0.5.3
Now, how do we bundle these up?
It looks something like this (copied from my project’s requirements.txt):
# http://www.pip-installer.org/en/latest/
#
# Create a new virtualenv with no access to site-packages.
# $ virtualenv --no-site-packages packageenv
#
# Switch to the proper virtualenv, if you have one.
# $ . packageenv/bin/activate
#
# Install the required packages into the packages directory.
# (packageenv)$ pip install -r requirements.txt --target ./packages
#
# Use the find command (this is the syntax on OS X, and probably BSD)
# to wipe out all precompiled Python files.
# (packageenv)$ find packages -name "*.pyc" -delete
#
# And wipe out all egg-info folders (-delete doesn't work here, because
# it cannot remove directories that still have files in them).
# (packageenv)$ find packages -name "*.egg-info" | xargs rm -rf
#
# Zip up the packages directory, which is now the requirements bundle.
# (packageenv)$ cd packages
# (packageenv)$ zip -9mrv packages.zip .
# (packageenv)$ mv packages.zip ..
# (packageenv)$ cd ..
# (packageenv)$ rm -rf packages
#
# The great thing here is that we only need to list a few modules;
# the rest of the dependencies are pulled in automatically.
#
# Now, just make sure to add packages.zip to sys.path early on in
# the Python sources that need it.
Flask>=0.9
Flask-WTF>=0.8
pystache>=0.5.3
First, make sure to create a new virtualenv and activate it before using the Bash script. If you then run pip freeze in the activated packageenv, you should see a minimal number of packages, possibly none at all. In my case a few still show up, because VirtualBox pushed its packages into the global Python site-packages folder.
$ virtualenv --no-site-packages packageenv
$ . packageenv/bin/activate
(packageenv)$
(packageenv)$ pip freeze
vboxapi==1.0
virtualenv==1.8.2
wsgiref==0.1.2
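If you want a script to double-check that it really is running inside a virtualenv before it installs anything, a minimal sketch could look like the following. The helper name in_virtualenv is my own invention, and the check is a heuristic: older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtualenv."""
    # Older virtualenv releases expose the original prefix as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the original prefix in
    # sys.base_prefix; outside an environment both prefixes are equal.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("inside a virtualenv:", in_virtualenv())
```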
Then, inside of the clean virtualenv, the Bash script (packages.sh) to run would look like this (on OS X):
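#!/bin/bash
# Install everything listed in requirements.txt into ./packages.
pip install -r requirements.txt --target ./packages
if [ -d packages ]; then
    cd packages || exit 1
    # Remove precompiled files and egg-info metadata.
    find . -name "*.pyc" -delete
    find . -name "*.egg-info" | xargs rm -rf
    # -9 is maximum compression; -m moves the files into the archive.
    zip -9mrv packages.zip .
    mv packages.zip ..
    cd ..
    rm -rf packages
fi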
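If find and zip aren't available (on Windows, say), the clean-up and zip steps can probably be done with Python's standard zipfile module instead. This is only a rough sketch, not a drop-in replacement: bundle_directory is a name I made up, it assumes pip has already installed into the source directory, it also skips __pycache__ folders (which the script above doesn't handle), and unlike zip -m it leaves the source directory in place. The zip file should be written outside the directory being bundled.

```python
import os
import zipfile


def bundle_directory(src_dir, zip_path):
    """Zip the contents of src_dir, skipping *.pyc files and *.egg-info entries.

    Returns the number of files written to the archive.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for root, dirs, files in os.walk(src_dir):
            # Prune metadata and bytecode-cache directories so os.walk
            # never descends into them.
            dirs[:] = [d for d in dirs
                       if not d.endswith(".egg-info") and d != "__pycache__"]
            for name in files:
                if name.endswith(".pyc") or name.endswith(".egg-info"):
                    continue
                full_path = os.path.join(root, name)
                # Store paths relative to src_dir so that the packages
                # sit at the root of the archive.
                bundle.write(full_path, os.path.relpath(full_path, src_dir))
                count += 1
    return count
```

Calling bundle_directory("packages", "packages.zip") after the pip install step should give roughly the same bundle as the Bash script.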
When the script is finished, the packages.zip would look something like:
$ unzip -l packages.zip
Archive:  packages.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  11-22-12 23:39   flask/
     1418  11-22-12 23:39   flask/__init__.py
[...]
        0  11-22-12 23:39   flask_wtf/
     1835  11-22-12 23:39   flask_wtf/__init__.py
[...]
        0  11-22-12 23:39   jinja2/
     2268  11-22-12 23:39   jinja2/__init__.py
[...]
        0  11-22-12 23:39   pystache/
      265  11-22-12 23:39   pystache/__init__.py
[...]
        0  11-22-12 23:39   werkzeug/
     7097  11-22-12 23:39   werkzeug/__init__.py
[...]
        0  11-22-12 23:39   wtforms/
      405  11-22-12 23:39   wtforms/__init__.py
 --------                   -------
  2683277                   445 files
The packages.zip file will probably contain the test suites of the included libraries as well. I haven’t made any provisions to delete these files in the Bash script.
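If the extra size bothers you, one way to drop the test suites after the fact might be to copy the bundle into a new archive and leave out anything under a test directory. This is a sketch only: strip_tests is a hypothetical helper, and the directory names it looks for ("tests", "test", "testsuite") are my guess at what the bundled libraries use, so check the listing of your own packages.zip first.

```python
import zipfile


def strip_tests(src_zip, dst_zip, test_dirs=("tests", "test", "testsuite")):
    """Copy src_zip to dst_zip, leaving out entries under test directories.

    Returns the list of entry names that were kept.
    """
    kept = []
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Look only at the directory components, so a module such as
            # flask/testing.py is kept.
            parents = info.filename.split("/")[:-1]
            if any(part in test_dirs for part in parents):
                continue
            dst.writestr(info, src.read(info.filename))
            kept.append(info.filename)
    return kept
```

Something like strip_tests("packages.zip", "packages-slim.zip") would then write the smaller bundle next to the original.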
Now, in the Python file that wants access to these libraries, you insert the packages.zip bundle into sys.path early enough, before the first import that depends on it, so that Python uses it to resolve imports. With that done, you have access to all the packages necessary to run the full Flask WSGI-compliant server, its plugins, and whatever else you want to bundle. This helps keep your dev environment clean, and you can just re-run the Bash script to update the entire bundle.
For example (packages.py):
import sys

# Make the bundle importable before anything tries to import from it.
if 'packages.zip' not in sys.path:
    sys.path.insert(0, 'packages.zip')

from flask import Flask

app = Flask(__name__)
app.debug = True


@app.route('/')
def root():
    return 'Hello, packages.zip!'


if __name__ == "__main__":
    app.run()
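One caveat with a relative 'packages.zip' entry is that it only resolves when the process is started from the directory that holds the bundle. A possible variation, sketched here with a made-up helper name (add_bundle_to_path), builds an absolute path from the location of the source file instead. It assumes the bundle sits next to that file; pass base_dir explicitly when __file__ isn't defined, for example in an interactive shell.

```python
import os
import sys


def add_bundle_to_path(bundle_name="packages.zip", base_dir=None):
    """Put the bundle's absolute path at the front of sys.path, once."""
    if base_dir is None:
        # Default to the directory containing this source file.
        base_dir = os.path.dirname(os.path.abspath(__file__))
    bundle_path = os.path.join(base_dir, bundle_name)
    # Avoid adding the same entry twice when the module is reloaded.
    if bundle_path not in sys.path:
        sys.path.insert(0, bundle_path)
    return bundle_path
```

Call add_bundle_to_path() before the first import that should come from the bundle, such as the from flask import Flask line.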
You could now actually use a clean virtualenv with this packages.zip bundle to do your development, and consolidate all of the development into a single standalone directory, with no dependencies on the user or global site-packages.
This doesn't seem to work for complex Python packages such as numpy. I tried to bundle celery and tensorflow. Even after setting PYTHONPATH to the bundled zip file, importing tensorflow failed in the Python shell.
In [2]: import tensorflow
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
in ()
----> 1 import tensorflow
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/tensorflow/__init__.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/tensorflow/python/__init__.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/numpy/__init__.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/numpy/add_newdocs.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/numpy/lib/__init__.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/numpy/lib/type_check.py in ()
/home/vikram/Downloads/package-fe6283ca561c7bf0d233939cc63dbf98-deps.zip/numpy/core/__init__.py in ()
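I could imagine Tensorflow or numpy being a bit harder to package, as they probably have native code components. Python's zip importer can't load compiled extension modules (.so or .pyd files) straight out of a ZIP archive, so a bundle like this really only works for pure-Python packages.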
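A quick way to check whether a bundle is likely to hit this problem is to scan it for compiled files. The sketch below is a heuristic with names of my own choosing (find_native_modules, NATIVE_SUFFIXES): it matches only on file suffix, so it can miss oddly named libraries.

```python
import zipfile

# File suffixes that usually indicate compiled code (a heuristic list).
NATIVE_SUFFIXES = (".so", ".pyd", ".dylib", ".dll")


def find_native_modules(zip_path):
    """Return the names of entries in the zip that look like compiled code."""
    with zipfile.ZipFile(zip_path) as bundle:
        return [name for name in bundle.namelist()
                if name.lower().endswith(NATIVE_SUFFIXES)]
```

If find_native_modules("packages.zip") returns anything, those packages probably need to be installed the regular way, or unpacked to a directory, rather than imported from the zip.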