============
Localization
============

Kitsune is localized with `gettext <http://www.gnu.org/software/gettext/>`_.
User-facing strings in the code or templates need to be marked for gettext
localization.

We use `Verbatim <http://localize.mozilla.org/>`_ to provide an easy interface
to localizing these files. Localizers are also free to download the PO files
and use whatever tool they are comfortable with.


Making Strings Localizable
==========================

Making strings in templates localizable is exceptionally easy. Making strings
in Python localizable is a little more complicated. The short answer, though,
is to just wrap the string in ``_()``.


Interpolation
-------------

A string is often a combination of a fixed string and something changing. For
example, ``Welcome, James`` is a combination of the fixed part ``Welcome,``
and the changing part ``James``. The naive solution is to localize the first
part and then follow it with the name::

    _('Welcome, ') + username

This is **wrong!**

In some locales, the word order may be different. Use Python string formatting
to interpolate the changing part into the string::

    _('Welcome, {name}').format(name=username)

Python gives you a lot of ways to interpolate strings. The best way is to use
Py3k-style formatting (``str.format()``) and kwargs. That's the clearest for
localizers.

The worst way is to use ``%(label)s``, as localizers seem to have all manner
of trouble with it. Options like ``%s`` and ``{0}`` are somewhere in the
middle, and generally OK if it's clear from context what they will be.


Localization Comments
---------------------

Sometimes, it can help localizers to describe where a string comes from,
particularly if it can be difficult to find in the interface, or is not very
self-descriptive (e.g. very short strings). If you immediately precede the
string with a comment that starts ``L10n:``, the comment will be added to the
PO file, and visible to localizers.


Adding Context with msgctxt
---------------------------

Strings may be the same in English, but different in other languages. English,
for example, has no grammatical gender, and sometimes the noun and verb forms
of a word are identical.

To make it possible to localize these correctly, we can add "context" (known in
gettext as "msgctxt") to differentiate two otherwise identical strings.

For example, the string "Search" may be a noun or a verb in English. In a
heading, it may be considered a noun, but on a button, it may be a verb. It's
appropriate to add a context (like "button") to one of them.

Generally, we should only add context if we are sure the strings aren't used in
the same way, or if localizers ask us to.


Plurals
-------

"You have 1 new messages" grates on discerning ears. Fortunately, gettext gives
us a way to fix that in English *and* other locales, the ``ngettext``
function::

    ngettext('singular', 'plural', count)

A more realistic example might be::

    ngettext('Found {count} result.',
             'Found {count} results.',
             len(results)).format(count=len(results))

This function takes three arguments (the singular form, the plural form, and
the count) because English needs only two forms, i.e., zero is considered
"plural" for English. Other locales may have different plural rules, and
require different phrases for, say, 0, 1, 2-3, 4-10, and >10. That's
absolutely fine, and gettext makes it possible.


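As an illustration of a locale with more than two forms, the sketch below
(Python 3) writes out the three-form plural rule that gettext catalogs
commonly declare for Russian. The function name and the placeholder labels
in ``FORMS`` are hypothetical. In a real catalog the rule lives in the PO
file's ``Plural-Forms`` header and ``ngettext`` applies it for you:

```python
def russian_plural_index(n):
    """Return 0, 1, or 2: which of the three plural forms applies to n."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ...
    return 2      # 0, 5-20, 25-30, ...


# Placeholder labels standing in for the three translated phrases.
FORMS = [
    'form used for 1, 21, 31, ...',
    'form used for 2-4, 22-24, ...',
    'form used for 0, 5-20, 25-30, ...',
]

for count in (0, 1, 3, 11, 21, 102):
    print(count, FORMS[russian_plural_index(count)])
```

The calling code stays the same for every locale: it passes the English
singular, the English plural, and the count, and the catalog decides how
many forms exist and which one to use.

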
Strings in Templates
--------------------

When putting new text into a template, all you need to do is wrap it in a
``_()`` call::

    <h1>{{ _('Heading') }}</h1>

Adding context is easy, too::

    <h1>{{ _('Heading', 'context') }}</h1>

L10n comments need to be Jinja2 comments::

    {# L10n: Describes this heading #}
    <h1>{{ _('Heading') }}</h1>

Note that Jinja2 escapes all content output through ``{{ }}`` by default. To
put HTML in a string, you'll need to add the ``|safe`` filter::

    <h1>{{ _('Firefox <span>Help</span>')|safe }}</h1>

To interpolate, you should use one of two Jinja2 filters: ``|f()`` or, in some
cases, ``|fe()``. ``|f()`` has exactly the same arguments as
``u''.format()``::

    {{ _('Welcome, {name}!')|f(name=request.user.username) }}

The ``|fe()`` filter is exactly like the ``|f()`` filter, but escapes its
arguments before interpolating, then returns a "safe" object. Use it when the
localized string contains HTML::

    {{ _('Found <strong>{0}</strong> results.')|fe(num_results) }}

Note that you *do not need* to use ``|safe`` with ``|fe()``. Also note that
while it may look similar, the following is *not* safe::

    {{ _('Found <strong>{0}</strong> results.')|f(num_results)|safe }}

Here the argument is interpolated without escaping, and ``|safe`` then marks
the whole result, argument included, as safe.

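The difference between ``|f()|safe`` and ``|fe()`` can be sketched in plain
Python 3. The ``f`` and ``fe`` functions below are simplified approximations
written for this example, not the actual filter implementations, and the
standard library's ``html.escape`` stands in for Jinja2's escaping. The real
``|fe()`` also returns a "safe" object rather than a plain string:

```python
from html import escape


def f(string, *args, **kwargs):
    """Approximation of |f(): plain str.format(), no escaping."""
    return string.format(*args, **kwargs)


def fe(string, *args, **kwargs):
    """Approximation of |fe(): escape every argument, then format."""
    args = [escape(str(a)) for a in args]
    kwargs = {k: escape(str(v)) for k, v in kwargs.items()}
    return string.format(*args, **kwargs)


template = 'Found <strong>{0}</strong> results.'
hostile = '<script>alert(1)</script>'

# What |f()|safe would send to the page: the argument is not escaped.
unsafe_html = f(template, hostile)
# What |fe() would send: the template's markup survives, the argument
# is escaped.
safe_html = fe(template, hostile)

print(unsafe_html)
print(safe_html)
```

Only the interpolated values are escaped, so the ``<strong>`` tags written
by the developer (and kept by the localizer) still render as markup.
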
The ``ngettext`` function is also available::

    {{ ngettext('Found {0} result.',
                'Found {0} results.',
                num_results)|f(num_results) }}


Using ``{% trans %}`` Blocks for Long Strings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When a string is very long, i.e. long enough to make GitHub scroll sideways, it
should be line-broken and put in a ``{% trans %}`` block. ``{% trans %}``
blocks work like other block-level tags in Jinja2, except that they cannot
have other tags inside them.

The only thing that should be inside a ``{% trans %}`` block, apart from the
text itself, is printing a variable with ``{{ variable }}``. These variables
are defined in the opening ``{% trans %}`` tag::

    {% trans user=request.user.username %}
        Thanks for registering, {{ user }}! We're so...
        hope that you'll...
    {% endtrans %}


Strings in Python
-----------------

*NB: Whenever you are adding a string in Python, ask yourself if it really
needs to be there, or if it should be in the template. Keep logic and
presentation separate!*

Strings in Python are more complex for two reasons:

#. We need to make sure we're always using Unicode strings and the
   Unicode-friendly versions of the functions.

#. If you use the ``ugettext`` function in the wrong place, the string may end
   up in the wrong locale!

Here's how you might localize a string in a view::

    from tower import ugettext as _

    def my_view(request):
        if request.user.is_superuser:
            msg = _(u'Oh hi, staff!')
        else:
            msg = _(u'You are not staff!')

Interpolation is done through normal Python string formatting::

    msg = _(u'Oh, hi, {user}').format(user=request.user.username)

``ugettext`` supports context, too::

    msg = _('Search', 'context')

L10n comments are normal one-line Python comments::

    # L10n: A message to users.
    msg = _(u'Oh, hi there!')

If you need to use plurals, import the function ``ungettext`` from Tower::

    from tower import ungettext, ugettext as _

    n = len(results)
    msg = ungettext('Found {0} result', 'Found {0} results', n).format(n)


Lazily Translated Strings
^^^^^^^^^^^^^^^^^^^^^^^^^

You can use ``ugettext`` or ``ungettext`` only in views or functions called
from views. If the function will be evaluated when the module is loaded, then
the string may end up in English or the locale of the last request! (We're
tracking down that issue.)

Examples include strings in module-level code, arguments to functions in class
definitions, and strings in functions called from outside the context of a
view. To localize these strings, you need to use the ``_lazy`` versions of the
above functions, ``ugettext_lazy`` and ``ungettext_lazy``. The result doesn't
get translated until it is evaluated as a string, for example by being output
or passed to ``unicode()``::

    from tower import ugettext_lazy as _lazy

    PAGE_TITLE = _lazy(u'Page Title')

``ugettext_lazy`` also supports context.

It is very important to pass Unicode objects to the ``_lazy`` versions of these
functions. Failure to do so results in significant issues when they are
evaluated as strings.

If you need to work with a lazily-translated string, you'll first need to
convert it to a ``unicode`` object::

    from tower import ugettext_lazy as _lazy

    WELCOME = _lazy(u'Welcome, %s')

    def my_view(request):
        # Fails:
        WELCOME % request.user.username

        # Works:
        unicode(WELCOME) % request.user.username


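The reason module-level strings need the lazy functions can be shown with a
small Python 3 sketch. ``CATALOGS``, ``current_locale``, the toy ``_``
function, and the ``LazyString`` class are all hypothetical and much simpler
than what Tower and Django actually do. They only illustrate the timing
problem:

```python
# Hypothetical per-locale catalogs and a global "active locale".
CATALOGS = {
    'en-US': {},
    'fr': {'Page Title': 'Titre de la page'},
}
current_locale = 'en-US'


def _(msgid):
    """Eager lookup: translates immediately, in the active locale."""
    return CATALOGS[current_locale].get(msgid, msgid)


class LazyString:
    """Defers the lookup until the object is converted to a string."""

    def __init__(self, msgid):
        self.msgid = msgid

    def __str__(self):
        return _(self.msgid)


# Both lines run at import time, while the locale is still 'en-US'.
EAGER_TITLE = _('Page Title')
LAZY_TITLE = LazyString('Page Title')

# Later, a request arrives for a French-speaking user.
current_locale = 'fr'
print(EAGER_TITLE)      # translated once, at import time
print(str(LAZY_TITLE))  # looked up now, in the request's locale
```

The eager string was fixed when the module loaded, so it ignores the locale
of the request. The lazy one is only resolved when something asks for its
string value.

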
Getting the Localizations
=========================

Localizations are not stored in this repository, but are in Mozilla's SVN::

    http://svn.mozilla.org/projects/sumo/locales

You don't need the localization files for general development. However, if
you need them for something, they're pretty easy to get::

    $ cd kitsune
    $ svn checkout https://svn.mozilla.org/projects/sumo/locales locale

(Alternatively, you can do yourself a favor and use::

    $ git svn clone -r HEAD https://svn.mozilla.org/projects/sumo/locales locale

if you're a git fan.)


Updating the Localizations
==========================

When strings are added or updated, we need to update the templates and PO files
for localizers. This needs to be coordinated with someone who has rights to
update the data on `Verbatim <http://localize.mozilla.org/>`_. If you commit
new strings to SVN and they are not updated right away on Verbatim, there will
be big merging headaches.

Updating strings is pretty easy. Check out the localizations as above, then::

    $ python manage.py extract
    $ python manage.py verbatimize --rename
    $ python manage.py merge

Congratulations! You've now updated the POT and PO files.

Sometimes this can leave a bunch of garbage files with ``.po~`` extensions.
Delete these; never commit them::

    $ find . -name "*.po~" -delete


Adding a New Locale
-------------------

Adding a new locale is even easier than updating the templates. Say you wanted
to add ``fa-IR``::

    $ mkdir -p locale/fa-IR/LC_MESSAGES
    $ python manage.py merge

Done!


Compiling MO Files
==================

gettext is so fast for localization because it doesn't parse text files; it
reads a binary format. You can easily compile that binary file from the PO
files in the repository.

We don't store MO files in the repository because they need to change every
time the corresponding PO file changes, so storing them is not worth it. They
are ignored by ``svn:ignore``, but please make sure you don't forcibly add them
to the repository.

There is a shell script to compile the MO files for you::

    $ ./locale/compile-mo.sh locale

Done!