July 2013 - Page 2 of 2

Python Code Coverage and cron

Every now and then, it’s useful to get a sense of assurance about the code you’re writing. In fact, it might be a primary goal of your organization to have functional code. Who knows?

Although I began development of Tandem Exchange following a test-first development process, the pace of change was too rapid. It’s not that I didn’t appreciate the value of testing. At the very beginning, I did implement a large number of tests. It’s just that those tests were written against soon-to-be-obsolete code and I didn’t have the time to develop new functionality and write unit tests simultaneously. Before the prototyping phase had ended, I learned the hard way that it didn’t really make sense to write many of those tests, when such a huge fraction of early functional code ended up in the dustbin.

Once things settled down, I started to leverage the Python code-coverage module alongside newly-written unit tests, made simple by using the nose test runner, which is a fantastic tool for test auto-discovery.

I then added the nose test runs to the development-site crontab, to generate coverage and unit test statistics on a regular basis:

@daily  /usr/local/bin/python2.7 /path/to/nosetests -v --with-coverage \
        --cover-package=exchange --cover-erase \ 
        --cover-html --cover-html-dir=/path/to/webdir/coverage --cover-branches \
        exchange.search_tests exchange.models_tests

All you have to do is specify a handful of extra options to the nosetests command line, it’s practically a freebie. Especially useful are the --cover-html and --cover-html-dir options, which tells nosetests to place the coverage reports in a specific directory.

In our case, I created a directory on the webhost, where I can log in and check the report results, which look something like:

The coverage reports show which Python statements (lines) have been exercised by the unit tests that have been run. Green lines have been run at least once, red lines have not been run, yellow lines indicate that not all of the conditions of a branch have been tested. (i.e. If you have an “if” statement, you have to test it for both True and False conditions, otherwise known as Modified Condition / Decision Coverage.) Note, however, that a coverage test does not prove that a piece of code behaves the way you expect, only that it has been run. The unit tests are the bits exclusively responsible for proving behavior.

In any case, I’ve already isolated two issues through the unit tests and am now assured that they will never come back. And as the percentage of statements covered by unit tests continues to increase, I’m sure any remaining issues will shake out. Which is the whole point, isn’t it?

Terrible Connectors

Pretend you’re the biggest manufacturer of gearshifts and mechanical accessories for bicycles. You invent a fantastic range of hub-mounted electric generators for bicycles, which are intended to power the lights as a person pedals along. Your products are reliable, long-lasting, and mostly free of required maintenance. But you decide to skimp on a sensible mechanical connector for the electrical output from your generator products, instead asking the dumbest junior engineer in the office to design a connector for you. What would that look like?

Probably something like this (from source):

A friction-fit cable connection where you pray that the wires don’t cross and that the wires are thick enough to rub against the generator contacts tightly enough.

Are you kidding me, Shimano? Please, please, please, for the love of God, talk to these people.

Genius-Level Software Developers

Unfortunately, there aren’t more of them around. For anyone who has ever wondered why their data plan gets chewed up so quickly, look no further than the idiots who program smartphone apps that force you to opt-out of 3G / 4G data synchronization over the phone networks:

Because I really wanted my multi-year-old photos from Picasa, which I’m about to delete, to sync down onto my phone over HSPA. Thanks guys. I really like it when a program punishes me for its bad behavior. Of the top-three data consumers on my phone, the top two had better just be email and web, because that is the stuff I actually care about and need immediate, high-speed access to. Also: Why do I want to sync these photos anyway? Isn’t that what the cloud is for? So I don’t have to lug around a full-sized (!) local copy, if I don’t want?

So they’re smart enough to know that I might mind, but not smart enough to explicitly ask me whether or not I want to burn up my high-speed data quota first.

Django Testing: Creating And Removing Test Users

During the development of Tandem Exchange, I wanted to write some test routines to check the validity of the core search functions that figure out which of the users would be good matches for one another as language tandem partners.

There are a number of ways to do this, Django’s built-in unit test infrastructure is a bit limited in that it attempts to create and drop a new database each time, to let the test have an independent and clean database in which to work. But this isn’t something that’s possible with most shared webhosters, for obvious reasons, so the Django unit test infrastructure is less than useful in these cases.

But you don’t really need to create/drop databases all the time. In my case, the schema stays the same throughout, all I want to do is add and remove users on the non-production database and check that the search functions work.

Here’s how I am planning to do it, via the unittest.TestCase setUp() and tearDown() methods:

setUp():
users_before = User.objects.values_list('id', flat=True).order_by('id')
[2, 8221, 8222, 8732, 8734, 8735, 8750, 8758, 8760, 8777, 8780, 8791]

tearDown():
users_after  = User.objects.values_list('id', flat=True).order_by('id')
[2, 8221, 8222, 8732, 8734, 8735, 8750, 8758, 8760, 8777, 8780, 8791, 8805]

users_to_remove = list(set(users_before) - set(users_after))
[8805]

setUp():

users_before = User.objects.values_list('id', flat=True).order_by('id')

[2, 8221, 8222, 8732, 8734, 8735, 8750, 8758, 8760, 8777, 8780, 8791]

tearDown():

users_after = User.objects.values_list('id', flat=True).order_by('id')

[2, 8221, 8222, 8732, 8734, 8735, 8750, 8758, 8760, 8777, 8780, 8791, 8805]

users_to_remove = list(set(users_before) - set(users_after))

[8805]

Then all you need to do for any particular unit test is to remove those users created in the duration. This isn’t something you can use in a production database, since there will be a natural race condition between the setUp() and tearDown() calls. But this should work just fine in a non-production environment, where no one’s signing up while you’re running tests.

Update: Here’s what the unittest.TestCase code looked like, in the end. Note that you must evaluate the QuerySet expression immediately in setUp() and tearDown() as failure to do so causes them to both be lazily-evaluated at the users_to_remove assignment, which gives you an empty set.

    def setUp(self):
        # Before performing any tests, record the existing
        # user IDs:
        # 1. So we know which users we created during the test
        # 2. So we can remove just those fake users.
        print 'setUp()'

        # Get the list of all users before the tests.
        # Must evaluate the QuerySet or it will be lazily-evaluated later, which is wrong.
        self.users_before = list(User.objects.values_list('id', flat=True).order_by('id'))
        print self.users_before

    def tearDown(self):
        print 'tearDown()'

        # Get the list of all users after the tests.
        users_after = list(User.objects.values_list('id', flat=True).order_by('id'))
        print users_after

        # Calculate the set difference.
        users_to_remove = sorted(list(set(users_after) - set(self.users_before)))
        print users_to_remove

        # Delete that difference from the database.
        User.objects.filter(id__in=users_to_remove).delete()

def setUp(self):

# Before performing any tests, record the existing

# user IDs:

# 1. So we know which users we created during the test

# 2. So we can remove just those fake users.

print 'setUp()'

# Get the list of all users before the tests.

# Must evaluate the QuerySet or it will be lazily-evaluated later, which is wrong.

self.users_before = list(User.objects.values_list('id', flat=True).order_by('id'))

print self.users_before

def tearDown(self):

print 'tearDown()'

# Get the list of all users after the tests.

users_after = list(User.objects.values_list('id', flat=True).order_by('id'))

print users_after

# Calculate the set difference.

users_to_remove = sorted(list(set(users_after) - set(self.users_before)))

print users_to_remove

# Delete that difference from the database.

User.objects.filter(id__in=users_to_remove).delete()

A Lack Of Negative Reinforcement

When I was growing up, my parents taught me that if you couldn’t say something nice, you shouldn’t say anything at all. When I grew up, I figured out that that was bullshit, but I still tend to hold the line.

Google, Facebook, and others don’t seem to understand that the lack of a negative reinforcement signal does not help to generate results that users want.

I’m tired, namely, of this appearing in various, completely unrelated search results on YouTube:

How about a “never show me this again” option? Or an Unlike button. Without Unlike, all of the possible Likes in the universe are biased in such a way that you have only two choices, with the first being a conflated form of “I dislike it and would gladly never see it again / I am ambivalent about it and couldn’t care less” and the second being “Like”.

On YouTube, I believe you can downvote a video, after you’ve clicked on it, which seems kind of stupid, since it gives the uploader the view they so desperately want. There should be an option to remove items you find stupid when you’re hovering over suggestions, and that ought to count in some way against them.

I suppose the only saving grace is that it’s a good thing that the social network operators of the world only know my Likes, but not yet my Dislikes. The higher their signal to noise ratio gets, the creepier the online experience becomes.