Localization / Translation Using Google Spreadsheets

So I spent a little time over the weekend hacking together a piece of code to help export usable localization / translation files from a simple spreadsheet in Google Docs. With a little extra effort setting up Protected Ranges and giving other users editing permissions, team translation using the same spreadsheet should be pretty easy. Currently, it can export Django-style gettext .po files and jquery.localize-compatible .json translation files.

This was an experiment in learning how to use Google Spreadsheets in combination with Google App Scripts to create a localization table that can be accessed via JSON/JSONP. This is useful because there’s still no useful built-in JSON publishing option for individual spreadsheets.

So here is the localization spreadsheet URL that I’m using as my source (spreadsheet key highlighted):
https://docs.google.com/spreadsheet/ccc?key=0AqrUvD5TZZs3dF9ULUh5X1JlakVJRGFHaWRZQmFuZEE

It looks like this:

And here is the Google App Script URL that will generate localization string tables from that spreadsheet:
https://script.google.com/macros/s/AKfycbxLnEUyElPtL01qHnL7pD2hmTmaO7Tc1yLhjJzQpitpuBfxxBU/exec

If you put the two together, with the querystring sheet_id set to the spreadsheet key and sheet_name set to an appropriate sheet name inside that spreadsheet, you get the following:
https://script.google.com/macros/s/AKfycbxLnEUyElPtL01qHnL7pD2hmTmaO7Tc1yLhjJzQpitpuBfxxBU/exec?sheet_id=0AqrUvD5TZZs3dF9ULUh5X1JlakVJRGFHaWRZQmFuZEE&sheet_name=Main

And when you retrieve that link, it generates the following (folded for brevity):

{
    "de": {
        "string_with_quotes": "Mit \"\"", 
        "potato": "Kartoffel", 
        "language_code": "de", 
        "hello": "Guten Tag!", 
        "chanterelle_mushroom": "Pfifferlinge", 
        "how_are_you": "Wie geht's?", 
        "string_with_comma": "Mit, Komma", 
        "language_name": "Deutsch", 
        "string_with_colon": "With:", 
        "text_direction": "ltr", 
        "string_with_newlines": "Mit \n\n"
    }, 
    "zh-Hant": {},
    "zh-Hans": {},
    "de-AT": {
        "string_with_quotes": "Mit \"\"", 
        "potato": "Erdapfel", 
        "language_code": "de-AT", 
        "hello": "Guten Tag!", 
        "chanterelle_mushroom": "Eierschwammerl", 
        "how_are_you": "Wie geht's?", 
        "string_with_comma": "Mit, Komma", 
        "language_name": "Deutsch (Österreich)", 
        "string_with_colon": "With:", 
        "text_direction": "ltr", 
        "string_with_newlines": "Mit \n\n"
    }, 
    "fr": {},
    "en": {
        "string_with_quotes": "With \"\"", 
        "potato": "potato", 
        "language_code": "en", 
        "hello": "Hello!", 
        "chanterelle_mushroom": "chanterelle mushroom", 
        "how_are_you": "How are you?", 
        "string_with_comma": "With, comma", 
        "language_name": "English", 
        "string_with_colon": "With:", 
        "text_direction": "ltr", 
        "string_with_newlines": "With\n\n"
    }, 
    "ja": {
        "hello": "こんにちは!",
        "language_code: "ja",
        "text_direction: "ltr",
        "string_with_comma: "With, comma",
        "string_with_newlines: "With \n\n",
        "how_are_you: "お元気ですか?",
        "string_with_quotes": "With \"\"",
        "string_with_colon": "With:",
        "potato": "potato",
        "language_name": "日本語",
        "chanterelle_mushroom": "chanterelle mushroom"
    }
}

The spreadsheet format follows a few conventions: By convention, the left-most column (Column 0) is the keystring, which you use to access the translation value later. The next column to the right (Column 1) is the source language or default-language column, in this case English, which should contain all of the original strings you need to localize. By convention, the language_code row is the top-most row (Row 0). Note that the export script will replace untranslated strings in a target language first with the translations from the closest base-language language_code, so in the case of Austrian German “de-AT”, it pulls in the translations from the generic German “de” language_code column; then, for anything missing in that column, it pulls from the default-language column.

I’ve set up a github repository to capture further development, and have been looking at the various file formats supported by Transifex, to see if it would be possible to generate output from the spreadsheet for some of them. Unfortunately, Google App Scripts doesn’t let you generate and immediately return a ZIP file, for which I filed a bug. To be exact, you can generate the ZIP file, but Google provides no good way to return the raw bytes and it’s unclear whether they ever will. Nonetheless, there are two Python scripts in to github repo that generate useful gettext and JSON files. I think the most important (which I’ll add eventually unless someone beats me to it) formats that need support would be Android, Windows, and OSX/iOS string resource files.

Helper Scripts: jquery.localize

Looking at the github repository, there’s one script under the “jquery.localize” folder called “generate-language-pack.py”, which generates files that can be used with the jquery.localize plugin:

translation-de-AT.json
translation-de.json
translation-en.json
translation-fr.json
translation-ja.json

The file contents look like:

{
    "string_with_quotes": "Mit \"\"",
    "potato": "Kartoffel",
    "language_code": "de",
    "hello": "Guten Tag!",
    "chanterelle_mushroom": "Pfifferlinge",
    "how_are_you": "Wie geht's?",
    "string_with_comma": "Mit, Komma",
    "language_name": "Deutsch",
    "string_with_colon": "Mit:",
    "text_direction": "ltr",
    "string_with_newlines": "Mit \n\n"
}

I’ve set up an example website using jquery.localize and a simple $.click() handler, to show the translations in action.

Helper Scripts: gettext

Under the “gettext” subdirectory in the github repository, there’s a file called “localize-django-app.py”, which, when run inside a Django app directory, prints out the following:

Markdown unavailable.
Creating locale/de/LC_MESSAGES/django.po
Creating locale/de_AT/LC_MESSAGES/django.po
Creating locale/en/LC_MESSAGES/django.po
Creating locale/fr/LC_MESSAGES/django.po
Creating locale/ja/LC_MESSAGES/django.po
Creating locale/zh_Hans/LC_MESSAGES/django.po
Creating locale/zh_Hant/LC_MESSAGES/django.po

And generates gettext catalogs that look like:

If you have Markdown installed, the script will run markdown.markdown() on the msgstr translation values before outputting them.

Google App Script Source

I’ve also set the export script to be viewable via the following link: https://script.google.com/d/167d-d6YtX74ZOZ0auMlPdd6emqusmWy5wUqhruKo9uu8AQoaoc3yvZsP/edit?usp=sharing, but I think you might have to log in to view it. It doesn’t seem like you can just see the script w/o having a Google Account. If you’d like to collaborate on making this better, let me know and I can grant edit access.

One thought on “Localization / Translation Using Google Spreadsheets”

  1. I’ve been doing my own thing but always ended up having to regex the resulting “file” output to fix Google’s double escapes.

    Getting stuff through the URL call will be great!

    Thank you!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.