Adding Python Twisted module to a virtualenv on Windows

I’m trying to keep my Python system-level install as clean as possible on Windows, but also trying to get Buildbot set up in a virtualenv. Unfortunately, some of the Buildbot dependencies want to build native extensions, and the most important — Twisted — just fails.

This is all I have in my system-level site-packages:
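For instance, a pip freeze from the system Python shows almost nothing (the packages and versions here are illustrative; yours will vary):

C:\> pip freeze
virtualenv==1.10.1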

If I create a new virtualenv with --no-site-packages, then try to install the buildbot-slave package, I get the dreaded "Unable to find vcvarsall.bat" when Twisted is being installed:
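The sequence looks roughly like this (the virtualenv path is illustrative):

C:\> virtualenv --no-site-packages c:\virtualenv
C:\> c:\virtualenv\Scripts\activate
(virtualenv) C:\> pip install buildbot-slave
...
error: Unable to find vcvarsall.bat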

Unfortunately, if you download and attempt to install the Twisted redistributable package from their website directly into a virtualenv, the installer package will register itself with the system, even if you select the “Install just for me” option. In other words, you can’t just have the package install itself portably into multiple virtualenvs:

  • Select “Install just for me”
    (screenshot: twisted-install-01)
  • Set “Python 2.7 from registry” to “Entire feature will be unavailable”
  • Set “Python from another location” to “Will be installed on local hard drive”
  • Set “Provide an alternate Python location” to the location of your virtualenv "c:\virtualenv"
    (screenshot: twisted-install-02)

The User Account Control (UAC) dialog will pop up, and after the install, you’ll be stuck with a single install of Twisted which can only be Repaired or Removed:

(screenshot: twisted-install-04)

This situation kind of sucks. Unfortunately, the .exe version of the installer also won’t let you choose where to install its contents. It gets stuck on the dialog where it autodetects your existing Python 2.7 installation, and won’t let you choose a different site-packages folder as the destination. It only does a global install.

(screenshot: twisted-install-03)

The solution is to download the .exe version of the installer, unpack it using something like 7-Zip, and then copy the files from the PLATLIB folder to your virtualenv’s Lib\site-packages folder and the files from the SCRIPTS folder to your virtualenv’s Scripts folder.
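The unpack-and-copy dance looks something like this (the installer filename is illustrative):

C:\> 7z x Twisted-13.1.0.win32-py2.7.exe -oTwistedUnpacked
C:\> xcopy /E /I TwistedUnpacked\PLATLIB\* c:\virtualenv\Lib\site-packages\
C:\> xcopy /E /I TwistedUnpacked\SCRIPTS\* c:\virtualenv\Scripts\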

Then, the next time you run pip freeze, you should see something like:
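For instance (the version number is illustrative):

(virtualenv) C:\> pip freeze
Twisted==13.1.0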

Then you can install the rest of the Buildbot packages:
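Since Twisted is now already present in the virtualenv, pip no longer tries to build it:

(virtualenv) C:\> pip install buildbot-slave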

dockarchive

I’ve spent the past two days working on a way to improve Docker image generation by using the C preprocessor, which allows me to include Dockerfiles in other Dockerfiles. Part of this was to eliminate redundant typing, with the bonus that I’ve been able to code best practices directly into the includable Dockerfiles.

I’ve checked all this into this github repository — https://github.com/nuket/dockarchive — to keep track of things.

For example, let’s say you want to run an SSH server (sshd) in a Docker container. One of the ways to do this is to use supervisord to start it and keep it running. So there’s an implicit dependency, which can be captured in the following way:

Dockerfile.supervisord

Dockerfile.ssh

Dockerfile.supervisord.run
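Sketches of the shape these fragments take; the package names, paths, and config filenames here are guesses (the real files are in the repository). Dockerfile.supervisord installs the process supervisor:

RUN apt-get -y install supervisor

Dockerfile.ssh pulls in that dependency and registers sshd with it:

#include "Dockerfile.supervisord"
RUN apt-get -y install openssh-server
ADD sshd.conf /etc/supervisor/conf.d/sshd.conf

Dockerfile.supervisord.run goes at the very end of an image and actually starts supervisord:

CMD ["/usr/bin/supervisord", "-n"]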

The sshd installation can pull in its dependency and add a configuration to be automatically started by supervisord. Docker images can then be built up in an orderly way and complexity cut into smaller pieces, as each piece only needs to worry about its own particular configuration. In this way, the pieces can be combined into full images.

As an example, the source for one of my Docker images is very simple now, since the dependency tree is taken care of:

buildbot-master/Dockerfile.in
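A sketch of its shape, assuming the fragments live one directory up (the real file is in the repository):

#include "../Dockerfile.ubuntu"
#include "../Dockerfile.supervisord"
RUN apt-get -y install python-pip
RUN pip install buildbot
ADD buildbot-master.conf /etc/supervisor/conf.d/buildbot-master.conf
#include "../Dockerfile.supervisord.run"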

Then you just use the C preprocessor to create the actual Dockerfile and then build that:
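With GNU cpp, something like this (-P suppresses the linemarkers cpp normally emits; the tag name is illustrative):

cpp -P Dockerfile.in > Dockerfile
docker build -t nuket/buildbot-master .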

Setting Up Docker and Buildbot

One of the newest players in the field of increasing server density and utilization is a piece of software called Docker. The idea is solid. Rather than automate the creation and management of resource-heavy virtual machines, Docker automates the management of lightweight Linux Containers (LXC containers), which allow for process and resource isolation backed by kernel guarantees on a single system. This means you have one system kernel, instead of dozens, and you don’t have to waste resources on duplicated pieces of code like system libraries, daemons, and other things that every server will always load into memory.

The ability to create a uniform and repeatable software environment inside of a container is worthy of attention, since it directly relates to getting both development and continuous integration environments running cleanly across a variety of systems.

It’s a problem I’m having right now: I need a continuous integration setup that can poll the master branch of a git repository and trigger a build on Windows, OS X, Linux, and Android. I have limited physical resources, but at least one multicore machine with 8GB of RAM running a 64-bit host OS.

Getting Started with Docker

Without further ado, here’s how I got going with Docker, after getting a clean 64-bit Ubuntu 12.04.3 system installed inside of VirtualBox.

Purge the old kernel (3.2):
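Something like this, with the version pattern matched to the installed 3.2 kernel:

sudo apt-get purge linux-image-3.2.0-*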

Install the new kernel (3.8):
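At the time, the documented route to a 3.8 kernel on 12.04 was the LTS backport packages:

sudo apt-get install linux-image-generic-lts-raring linux-headers-generic-lts-raring
sudo reboot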

Install Docker (instructions from here):
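Reconstructing from the official instructions of the time, this adds Docker’s apt repository and installs the lxc-docker package:

sudo sh -c "curl https://get.docker.io/gpg | apt-key add -"
sudo sh -c "echo deb http://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
sudo apt-get update
sudo apt-get install lxc-docker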

That last command spits out an interesting list of dependencies, which I’m capturing here in case I need to look up the associated manpages later:

$ sudo apt-get install lxc-docker
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  aufs-tools bridge-utils cgroup-lite cloud-utils debootstrap euca2ools libapparmor1 libyaml-0-2 lxc lxc-docker-0.6.1
  python-boto python-m2crypto python-paramiko python-yaml
Suggested packages:
  btrfs-tools lvm2 qemu-user-static
The following NEW packages will be installed:
  aufs-tools bridge-utils cgroup-lite cloud-utils debootstrap euca2ools libapparmor1 libyaml-0-2 lxc lxc-docker
  lxc-docker-0.6.1 python-boto python-m2crypto python-paramiko python-yaml
0 upgraded, 15 newly installed, 0 to remove and 0 not upgraded.
Need to get 3,817 kB of archives.
After this operation, 20.6 MB of additional disk space will be used.
Do you want to continue [Y/n]? 

With Docker installed, but before running the “Hello World” Docker example, I took a snapshot of the virtual machine. Now that I think about it, though, that’s the last snapshot I’ll need, since Docker is itself a snapshottable container organizer.

$ sudo docker run -i -t ubuntu /bin/bash
[sudo] password for nuket: 
Unable to find image 'ubuntu' (tag: latest) locally
Pulling repository ubuntu
8dbd9e392a96: Download complete
b750fe79269d: Download complete
27cf78414709: Download complete
WARNING: Docker detected local DNS server on resolv.conf. Using default external servers: [8.8.8.8 8.8.4.4]
WARNING: IPv4 forwarding is disabled.
root@e8c30f41da03:/# 

No More sudo

I got a little tired of typing sudo in front of everything, so I used the instructions here to add a docker group to the system and restart the daemon with those credentials.

# Add the docker group
sudo groupadd docker

# Add the ubuntu user to the docker group
# You may have to logout and log back in again for
# this to take effect
sudo gpasswd -a ubuntu docker

# Restart the docker daemon
sudo service docker restart

Then I log out and log back into my desktop session, to gain the group permissions.

Getting Buildbot Installed

Someone beat me to it and uploaded a Dockerfile describing both a buildbot-master and buildbot-slave configuration:

$ docker search buildbot
Found 6 results matching your query ("buildbot")
NAME                             DESCRIPTION
mzdaniel/buildbot                
mdaniel/buildbot                 
ehazlett/buildbot-master         Buildbot Master See full description for available environment variables to customize.
ehazlett/buildbot-slave          
ehazlett/buildbot                Standalone buildbot with master/slave.  See full description for available environment variables.
mzdaniel/buildbot-tutorial     

Pull buildbot-master

docker pull ehazlett/buildbot-master

According to the Docker Index entry for buildbot-master, there are a handful of environment variables available to be passed into docker run. (It’s a bit of a kicker for me that, at the moment, you have to pass these environment variables in via the command line, but I’m guessing they’ll eventually support reading them from a file.)

CONFIG_URL: URL to buildbot config (overrides all other vars)
PROJECT_NAME: Name of project (shown in UI)
PROJECT_URL: URL of project (shown in UI)
REPO_PATH: Path to code repo (buildbot watches -- i.e. git://github.com:ehazlett/shipyard.git)
TEST_CMD: Command to run as test
BUILDBOT_USER: Buildbot username (UI)
BUILDBOT_PASS: Buildbot password (UI)
BUILDSLAVE_USER: Buildbot slave username
BUILDSLAVE_PASS: Buildbot slave password

Start buildbot-master

The documentation isn’t super-clear on how to pass these multiple environment variables into the docker container, but it looks something like this:

$ docker run -e="foo=bar" -e="bar=baz" -i -t ubuntu /bin/bash
WARNING: Docker detected local DNS server on resolv.conf. Using default external servers: [8.8.8.8 8.8.4.4]
root@1f357c1e17b4:/# echo $foo
bar
root@1f357c1e17b4:/# echo $bar
baz

For the time being, I’ll just run the container with its default parameters.

CID=$(docker run -d ehazlett/buildbot-master)

But I’m also curious as to how I’m supposed to communicate with it. So I inspect the docker configuration for the buildbot-master:
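The command itself is just:

docker inspect ehazlett/buildbot-master

Among other things, the output lists the ports the image exposes.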

So what happens when you run the container? You have to find out how the Buildbot ports inside the container are mapped to ports on your host system.

Port 9989 is the communications port the Buildbot systems use to talk to one another. Port 8010 is the Buildbot web interface, which you can open in a browser, like so:
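docker port does the per-container lookup; the host-side numbers below are illustrative, since Docker assigns them at run time:

$ docker port $CID 8010
49153
$ docker port $CID 9989
49154

The web interface would then be at http://localhost:49153.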

(screenshot: buildbot-running)

Of course, you can access this from outside the VM (in the top-most, non-virtualized host OS) as well:

(screenshot: buildbot-running-chrome)

Docker Subnetwork

It’s also not entirely clear from the basic Docker instructions that Docker creates an internal private network that is NATed to the host, so when you run ifconfig on the Docker host, you’ll see:
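Something like this, abridged; 172.17.42.1/16 was the default docker0 bridge address at the time:

$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr aa:bb:cc:dd:ee:ff
          inet addr:172.17.42.1  Bcast:0.0.0.0  Mask:255.255.0.0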

And if you’ve attached to a Docker container, and run ip addr show, you’ll see:
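Again abridged, with an illustrative container address from the same 172.17.0.0/16 range:

$ ip addr show eth0
    inet 172.17.0.2/16 scope global eth0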

Which you’ll also see if you run docker inspect $CID, which returns useful runtime information about the specific container instance:
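The relevant part is the NetworkSettings block, which looks roughly like this (abridged, field names approximate for this Docker version):

$ docker inspect $CID
    "NetworkSettings": {
        "IPAddress": "172.17.0.2",
        "Gateway": "172.17.42.1"
    }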

So now the question is how to get the buildbot-slaves on other VMs to talk to the buildbot-master, and how to configure the buildbot-master itself. I’m also considering getting an instance of CoreOS running, as it seems to have a mechanism for handling global configuration within a cluster, which would be one way to provide master.cfg to the buildbot-master.

Updates to this post to follow.

Update: Easier Overview

The easier way to see your container runtime configurations at a glance is to use the docker ps command (Duh!). Particularly nice is the port-mapping list in the rightmost column:
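An illustrative reconstruction (container ID, command, timestamps, and host ports will vary):

$ docker ps
ID                  IMAGE                             COMMAND             CREATED             STATUS              PORTS
a1b2c3d4e5f6        ehazlett/buildbot-master:latest   supervisord -n      2 hours ago         Up 2 hours          49153->8010, 49154->9989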

Update: Configuring the Buildbot Master Itself

You can just jump into the container and edit the master.cfg, like so:
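Docker of this era has no exec command, so this sketch assumes the container runs sshd (plausible, given the supervisord setup) and that you have its IP from docker inspect; the config path is a guess:

$ ssh root@172.17.0.2
# vi /opt/buildbot/master/master.cfg
# buildbot reconfig /opt/buildbot/master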

Update: Getting the list of all Containers

It’s not entirely intuitive, but each time you docker run an Image, you get a new Container, and these Containers don’t generally show up as you might expect. (I was wondering how the diff command was supposed to work, for instance.)

You have to use the docker ps -a command to see all of the Container IDs, which you can then start and stop.

In other words, using docker run image-name creates the new Container. But for subsequent calls, you should use docker start container-id.
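In shell terms:

# The first run creates a brand-new Container from the Image
CID=$(docker run -d ehazlett/buildbot-master)

# From then on, reuse that same Container by its ID
docker stop $CID
docker start $CID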

This also clarifies why there are separate docker rm (remove a Container) and docker rmi (remove an Image) commands.

An easy way to remove unused Docker containers
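The idiom I know of passes every Container ID to docker rm, which errors harmlessly on the ones still running:

docker rm $(docker ps -a -q)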

Update: Using cpp to make Dockerfiles uniform

This section could also be called “The horror, the horror”. Following the best practices list mentioned here, I decided to create a uniform set of include files to pull into my Dockerfiles, which I then generate using the C preprocessor (pretty much because it’s cheap and available).

So the idea I had was to put common Dockerfile instructions into separate files. I’m guessing the Docker devs might build an INCLUDE instruction into the Dockerfile syntax at some point. The benefit to doing this is that you can take advantage of the docker image cache, which stores incremental versions of your build-images based on the instructions and base-image used to create them. In other words, you don’t have to keep rebuilding the common parts of various Docker images. And you’re less likely to mistype common lines across files, which could be a source of inefficiency.

Dockerfile.ubuntu

Dockerfile.run

In a clean subdirectory, below where the .ubuntu and .run files are located:

Dockerfile.in
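Sketches of the shape these take. Dockerfile.ubuntu, the common base:

FROM ubuntu
RUN apt-get update

Dockerfile.run, the common tail that starts the services:

CMD ["/usr/bin/supervisord", "-n"]

Dockerfile.in, in the clean subdirectory, pulling both in:

#include "../Dockerfile.ubuntu"
RUN apt-get -y install supervisor openssh-server
#include "../Dockerfile.run"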

To create the custom Dockerfile:
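Something like:

cpp -P Dockerfile.in > Dockerfile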

Which generates something like:
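Given the sketches above, the flattened result would be:

FROM ubuntu
RUN apt-get update
RUN apt-get -y install supervisor openssh-server
CMD ["/usr/bin/supervisord", "-n"]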

Then just docker build . like usual.

Update: Now with github!

I’ve created a repository on github to play around with includable Dockerfiles.

The github repository currently has a few images in it, which are related to one another in a tree that looks like this:
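Based on the files mentioned in this post, something along these lines:

Dockerfile.ubuntu
└── Dockerfile.supervisord
    ├── Dockerfile.ssh
    └── buildbot-master/Dockerfile.in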

The idea, then, is to eliminate redundant text by including Dockerfiles in other Dockerfiles, and to organize this hierarchically, such that images further down the hierarchy are just combinations of their parents + some differentiation.

apt-get install package caching using squid-deb-proxy

If you’re actively developing Docker images, one thing that slows you down a lot and puts considerable load on Ubuntu’s mirror network is the redundant downloading of software packages.

To speed up your builds and save bandwidth, install squid-deb-proxy and squid-deb-proxy-client on the Docker container host (in my case, the outermost Ubuntu VM):
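That is:

sudo apt-get install squid-deb-proxy squid-deb-proxy-client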

And make sure you add ppa.launchpad.net (and any other PPA archive sources) to /etc/squid-deb-proxy/mirror-dstdomain.acl:
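One domain per line, appended to the acl file:

echo "ppa.launchpad.net" | sudo tee -a /etc/squid-deb-proxy/mirror-dstdomain.acl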

And restart the proxy:
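On Ubuntu 12.04:

sudo service squid-deb-proxy restart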

In your Dockerfile, set the proxy configuration file to point to the default route address (which is where squid-deb-proxy will be running):
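squid-deb-proxy listens on port 8000, and from inside a container the default route is the docker0 address, so the line is something like:

RUN echo 'Acquire::http::Proxy "http://172.17.42.1:8000";' > /etc/apt/apt.conf.d/30proxy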

Once the caching is set up, you can monitor accesses via the logfiles in /var/log/squid-deb-proxy.

The first time you build an image, the log file has lots of cache misses:
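Entries along these lines (abridged and illustrative):

1379431627.312    837 172.17.0.7 TCP_MISS/200 158614 GET http://archive.ubuntu.com/ubuntu/pool/... - DIRECT/... application/x-debian-package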

The second time, you’ll see a number of hits (TCP_REFRESH_UNMODIFIED), saving you bandwidth and time:
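Again illustrative:

1379435101.854     12 172.17.0.8 TCP_REFRESH_UNMODIFIED/200 158614 GET http://archive.ubuntu.com/ubuntu/pool/... - DIRECT/... application/x-debian-package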

Update: Using sshfs to mount Docker container folders

Your Docker containers might not actually have any editors installed. One way to easily get around this is to just mount folders inside the Docker container using the user-mode SSHFS:
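A sketch, assuming the container runs sshd and its IP (from docker inspect) is 172.17.0.2:

mkdir -p ~/containers/buildbot-master
sshfs root@172.17.0.2:/ ~/containers/buildbot-master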

Further Appification Of Full-Powered Computers

Ubuntu’s Unity launcher has two stupid usability flaws that I can’t believe I ran into today:

1. Auto-hiding the Dock in a virtual machine just does not work. I don’t care why; maybe the VM can’t detect when the mouse cursor has reached column 0 of the virtual screen. But come on.

2. Adding a new application to the Dash. How do you do it? Please, can someone give me a hint as to who the idiot UI designer was, who decided that a simple “add” button and shortcut-adding workflow would be too much to ask for?

Drag and drop doesn’t work:

(screenshot: add-a-shortcut)

And there’s nothing in the Dash launcher that indicates how to do it:

(screenshot: no-add-button)

Trying to launch Qt Creator via alacarte didn’t work either, for whatever reason.

I guess I’m just supposed to run packaged apps, Canonical forbid I actually want to add a shortcut to something else.

Unfortunately, this kind of hassle just makes me do:

sudo apt-get install gnome-session-fallback

Also, why does Unity feel so slow compared to the old Gnome Classic? It feels positively sluggish when rendering. Too much compositing, perhaps? The other weird thing is that while running in a virtual machine, resizing the virtual desktop size (by resizing the virtual machine window) takes forever under Unity, and no time at all under Gnome.

Using Selenium WebDriver with PhantomJS on a shared host

I’m currently trying to set up a cronjob to do end-to-end autotesting of the OAuth code I wrote for Tandem Exchange, which is occasionally failing these days due to the OAuth provider systems being unavailable or very, very slow (which is the same thing). It’s actually one of the biggest problems in web authentication: once you start relying on OAuth providers to authenticate users on your website, you inherit a critical dependency on their servers always being reachable. (In our case, I’ve been seeing stupid downtimes from QQ, one of the largest Chinese social networks.) And it’s not something you can test with mock objects.

Set up a virtualenv:
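Something like this (the path is arbitrary):

virtualenv ~/venv/selenium
. ~/venv/selenium/bin/activate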

Install Selenium in the virtualenv:
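That is just:

pip install selenium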

Download and untar the PhantomJS executable somewhere on the PATH:
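The version here is illustrative; substitute whatever is current, and ~/bin stands in for any directory on your PATH:

wget https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.2-linux-x86_64.tar.bz2
tar xjf phantomjs-1.9.2-linux-x86_64.tar.bz2
cp phantomjs-1.9.2-linux-x86_64/bin/phantomjs ~/bin/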

During the first attempts at using PhantomJS, I had problems with SSL-based addresses always returning a .current_url of “about:blank”, which you have to fix using a somewhat obscure flag in the webdriver.PhantomJS() constructor.

The fix looks like this, in Python:
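Presumably along these lines; --ignore-ssl-errors is the PhantomJS flag usually cited for the about:blank symptom:

from selenium import webdriver

# service_args passes command-line flags straight through to the
# PhantomJS binary when the webdriver starts it
driver = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true'])
driver.get('https://www.facebook.com/')
print(driver.current_url)  # should no longer be "about:blank"
driver.quit()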

And when the unit test runs (via nosetests):
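A typical nose summary (test filename and timing are illustrative):

$ nosetests test_oauth.py
..
----------------------------------------------------------------------
Ran 2 tests in 41.337s

OK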

For the most part the Python-based selenium module works, but it is pretty verbose, as it sticks to the original Webdriver API very closely. There are higher-level abstractions, such as Splinter, but I get the impression that making sure it starts PhantomJS properly will be an ordeal in and of itself. I’ve gotten Facebook and Google OAuth testing working headlessly, which is pretty cool, but the next step of getting the QZone OAuth test working is getting jammed up on the fact that QZone is behaving exactly as users see it behave (which is to say, problematically).

But then, that’s the objective of the unit testing, to reveal the source of the latest OAuth problems.

Actually, I’ve also noticed that it’s pretty slow not only from the shared-hosting server we’re using, but also from where I am currently located. So two different points on the globe. I can’t help but wonder how much the Great Firewall of China is slowing things down.