bods-dev-handbook

Translations

The processes, roles and mechanisms detailed below implement the BODS Language Support Policy.

Note: This process is still in development, and improvements or clarifications are welcome.

The following instructions cover the translation of:

Locations of source files to be translated

BODS schema and codelists exist under the schema folder within the BODS Github repository. The content for the BODS documentation website exists under the docs folder within the same BODS Github repository. The documentation website’s theme has its own Github repository - data-standard-sphinx-theme.

Translating the three components listed above allows the publicly available website at https://standard.openownership.org to be published in different languages. This is the aim of the translation work.

Scope of translation work

Things that are in scope for translation are:

Things that are not in scope for the translation are:

Things that need to become part of the translation workflow but are not yet:

Overview of the translation workflow

The workflow for doing the translation is:

The diagram below provides an extremely high-level overview of the workflow. It excludes the final step of making the translation live.

Translation Workflow

BODS on Transifex

BODS translations currently live under the Open Data Services Transifex account. The BODS docs and schema live in BODS-main (for the latest in-development version), or in versioned projects (for versioned releases of the standard), e.g. the ‘project’ for v0.1 is bods-v01. Each project contains ‘resources’, each of which corresponds to a page of documentation (an individual RST file), plus one each for the schema, codelists and .svg files. These may also be referred to as ‘source files’.

The documentation sphinx theme translations live under bods-theme, which contains only one resource for all the strings in the theme templates.

Setting up your local machine

Complete the following steps to set up your system to carry out the workflow outlined above (assumes Ubuntu 18.04.2 LTS or similar):

Setting up an account and joining the ODSC organization in Transifex

Create a free Transifex account on their sign up page.

In Transifex, an organization is the home of all the translation projects that it runs.

Open Data Services Co-operative is already set up as an organization in Transifex and all of our projects and teams are managed from within it. A list of our translation projects is visible at the Open Data Services Co-operative public page.

ODSC Organisation in Transifex

Once you have signed up to Transifex you should ask an administrator of ODSC to make you an administrator too. We don’t maintain a list of those administrators here, but there is a considerable overlap with the people who contribute towards the BODS Standard repository.

Cloning the repositories to manage the workflow

Follow the instructions in the BODS data-standard-sphinx-theme README.md. These instructions will clone both the data-standard-sphinx-theme and data-standard repositories to your local machine.

Installing and configuring the Transifex client

Note that if you installed the data standard repo, the Transifex client is one of its dependencies, so it will already be installed.

Installing the Transifex client

Transifex offers a number of different options for uploading content to be translated. In this documentation we will describe the process for setting up the Command Line Interface (CLI) Client.

The commands for installing the Transifex CLI client on Ubuntu are:

$ sudo apt-get install python-pip
$ sudo pip install transifex-client

Installing other dependencies

You also need to make sure you have gettext, pybabel and (for SVGs) itstool installed in whatever environment you’re running this in:

$ apt-get install gettext
$ apt-get install python3-babel
$ apt-get install itstool

Configuring the Transifex client

Transifex configuration involves the creation of two files:

~/.transifexrc, which stores your Transifex host configuration in your home directory, including your API key.

.tx/config, which stores the mappings between your local files and Transifex in a .tx folder in your repo’s root directory.

Transifex’s own documentation for initialising the client involves running the tx init command. We are going to edit the files more directly.

Creating and storing your API key

You’ll need a Transifex API key to push to and pull from the BODS project. Click on the “Generate a token” button on the right hand side. Click on “Copy and Close”. Your API token has been created. You only need this locally; don’t commit it or share it or store it anywhere public.

So that you don’t have to enter it at the command line every time, you can store it in .transifexrc in your home directory (~/), which looks like:

[https://www.transifex.com]
api_hostname = https://api.transifex.com
hostname = https://www.transifex.com
password = YOUR-API-KEY-SHOULD-GO-HERE
username = api

Enter the lines as shown above, replacing YOUR-API-KEY-SHOULD-GO-HERE with your newly generated API key.

This sets you up with access to add and retrieve files to and from Transifex. See Transifex client configuration for more details.
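If you would rather not edit the file by hand, the same layout can be written with a short script. This is a minimal sketch using only the Python standard library; the function name write_transifexrc and the choice to restrict the file permissions are assumptions for illustration, not part of the Transifex client.

```python
import configparser
import os
from pathlib import Path

TRANSIFEX_HOST = "https://www.transifex.com"


def write_transifexrc(api_key, path=None):
    """Write a .transifexrc with the layout shown above and return its path.

    Hypothetical helper: the section and option names are copied from the
    example file in this handbook.
    """
    path = Path(path) if path is not None else Path.home() / ".transifexrc"
    # interpolation=None so that a '%' in the key is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    config[TRANSIFEX_HOST] = {
        "api_hostname": "https://api.transifex.com",
        "hostname": TRANSIFEX_HOST,
        "password": api_key,
        "username": "api",
    }
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)
    # Owner read/write only, because the file contains a secret.
    os.chmod(path, 0o600)
    return path
```

Calling write_transifexrc("YOUR-API-KEY-SHOULD-GO-HERE") with no path writes to ~/.transifexrc and overwrites any existing file, so back up an existing configuration first.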

.tx/config

The .tx/config file is used to map files in a local repo/directory to resources in Transifex. This file is stored in the .tx folder in the repo’s root directory.

Although it is possible to pre-configure your .tx/config file in advance, there are a number of ways in which the configuration can change, right up until the moment that you extract and push your projects’ strings up to Transifex. In particular, the .tx/config file maps:

For that reason, we often recreate the .tx/config file as part of the workflow.

The diagram below shows the state of the .tx/config file after extracting the strings from the 0.3 dev branch of the data-standard repo, ready to push those strings up to the BODS v0.3 project on Transifex.

Github-Transifex config

Instructions to create an initial .tx/config file are provided below.

Integrating translations

Whenever any strings are changed that are in scope for translation (see list above) they need to be ‘extracted’, pushed to Transifex, translated, and the translated strings pulled back down. Updates to the documentation and schema should not be released until the necessary translations are in place.

The steps for doing this should be done by the person making the changes to the schema and docs, and are documented below. There are separate steps for the docs, schema and codelists, and you only need to carry out the steps applicable to the changes you made. For example, if you only updated the schema, you don’t need to execute commands to extract strings from the docs or codelists.

If you are working on a development branch, you should not push source file changes to Transifex. Instead, wait until your changes have been merged into the main branch. Source files should only ever be pushed to Transifex from the main branch (currently main) to ensure conflicts do not occur in Transifex between multiple people working on different branches simultaneously.
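The main-branch rule can be enforced with a small guard around the push. The sketch below is illustrative only: the function names are hypothetical, it assumes git and the tx client are on your PATH, and it assumes the main branch is called main.

```python
import subprocess


def current_branch(repo_dir="."):
    """Return the name of the branch currently checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def may_push_sources(branch, main_branch="main"):
    """Source files may only be pushed to Transifex from the main branch."""
    return branch == main_branch


def push_sources(repo_dir="."):
    """Run `tx push -s`, but only when the main branch is checked out."""
    branch = current_branch(repo_dir)
    if not may_push_sources(branch):
        raise SystemExit(
            f"On '{branch}': merge into main before running tx push -s"
        )
    subprocess.run(["tx", "push", "-s"], cwd=repo_dir, check=True)
```

Keeping the branch comparison in its own function means the rule can be checked without touching git or Transifex.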

Note that ‘extracted’ (English) strings (.pot files) are not pushed to the Github repo, but translated strings (.po files) are. This lets readthedocs find them so it can build everything in other languages. For a clean commit history, it’s helpful to make separate commits for your changes to the source (docs or schema) and the translation files subsequently pulled from Transifex.

The steps for the Sphinx theme are in the sphinx theme README.

Note for developers: .po files from the Sphinx theme are included when you build the docs from data-standard thanks to the following line in docs/conf.py:

locale_dirs = ['locale/', os.path.join(oods.sphinxtheme.get_html_theme_path(), 'locale')]

So make sure the latest version of the theme is being installed in case expected translations aren’t showing up.
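When translations are missing from a build, it can help to see which of the configured locale directories actually contain catalogs for the language in question. The following is a rough diagnostic sketch, assuming the usual gettext layout of <locale_dir>/<language>/LC_MESSAGES/; the function name find_catalogs is hypothetical and is not part of Sphinx or the theme.

```python
from pathlib import Path


def find_catalogs(locale_dirs, language):
    """Map each locale directory to the .po/.mo files found for `language`."""
    found = {}
    for locale_dir in locale_dirs:
        messages = Path(locale_dir) / language / "LC_MESSAGES"
        names = []
        if messages.is_dir():
            # Only gettext catalogs are of interest here.
            names = sorted(
                p.name for p in messages.iterdir()
                if p.suffix in {".po", ".mo"}
            )
        found[str(locale_dir)] = names
    return found
```

Passing the same list that is assigned to locale_dirs in docs/conf.py, together with a language code such as "ru", returns an empty list for any directory that has no catalogs for that language.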

Creating and configuring a new project in Transifex

Translations for the current latest version of BODS are found in the BODS-main Transifex project. These may not be updated until it is time for a new versioned release of BODS, meaning translations in Transifex may be lagging behind the latest text in the BODS Github repository. However, when changes to the docs, schema or codelists are merged into the main branch, these changes should always be pushed to Transifex, so that - assuming translators are available - translations can be brought up to date at any time. See integrating translations.

If you need to create a new Transifex project that contains the latest available source files and translations, do the following:

Translation workflow

To run the steps in the translation workflow, ensure that you have followed the installation and setup instructions above.

Run the following commands from the root directory unless otherwise specified (e.g. sometimes it’s less complicated to run them from docs).

  1. Before you start, run tx pull -a to make sure you have the most up to date translations in your local environment.

When you change text in the docs you need to do the following so that they can be translated:

If you modified the schema also:

If you modified the codelists also:

If you modified an SVG diagram also:

If you added, deleted or renamed files, or you want to use a different Transifex project, run (from root, i.e. cd ../):

rm -f .tx/config
sphinx-intl create-txconfig
sphinx-intl update-txconfig-resources --pot-dir docs/_build/gettext --locale-dir docs/locale --transifex-project-name bods-test

(Replacing bods-test with a different Transifex project name.)
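Because the project name is the only part of these commands that changes, they can be wrapped in a small script. This is a sketch under the assumption that rm and sphinx-intl are available on your PATH; the function names are hypothetical and the arguments are copied from the commands above.

```python
import subprocess


def txconfig_commands(project_name):
    """Return the commands that recreate .tx/config for project_name."""
    return [
        ["rm", "-f", ".tx/config"],
        ["sphinx-intl", "create-txconfig"],
        [
            "sphinx-intl", "update-txconfig-resources",
            "--pot-dir", "docs/_build/gettext",
            "--locale-dir", "docs/locale",
            "--transifex-project-name", project_name,
        ],
    ]


def recreate_txconfig(project_name, repo_root="."):
    """Run the commands in order from the repository root."""
    for command in txconfig_commands(project_name):
        # check=True stops at the first command that fails.
        subprocess.run(command, cwd=repo_root, check=True)
```

For example, recreate_txconfig("bods-test") runs the same three commands shown above.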

And then:

  1. Run tx push -s to push to Transifex.

Now the files are ready to be translated in Transifex.

  1. To fetch new translations when they’re done, you need to run tx pull -a to fetch all, or tx pull -l ru to fetch a particular language.

  2. If you are still on the main branch, check out a new development branch from which you will make a PR with the updated translations. Commit the new or updated .po files in docs/locale.

  3. Build translated SVGs for each language using itstool, and commit these (because we can’t easily install itstool on readthedocs):

pybabel compile --use-fuzzy -d docs/locale -D svg

Replacing `<LANG>` with the language code, e.g. `ru` (run this once per language):

itstool -m docs/locale/<LANG>/LC_MESSAGES/svg.mo -o docs/_build_svgs/<LANG> docs/_assets/*.svg

  1. Make a PR with the new translation files and SVGs (if applicable).
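Since the itstool command has to be repeated once per language, the SVG build step lends itself to a loop. The sketch below is an illustration only: the function names are hypothetical, it assumes pybabel and itstool are on your PATH, and it expands docs/_assets/*.svg in Python because no shell is involved.

```python
import glob
import subprocess


def svg_build_commands(languages, svg_files):
    """Return the pybabel command followed by one itstool command per language."""
    commands = [
        ["pybabel", "compile", "--use-fuzzy", "-d", "docs/locale", "-D", "svg"]
    ]
    for lang in languages:
        commands.append([
            "itstool",
            "-m", f"docs/locale/{lang}/LC_MESSAGES/svg.mo",
            "-o", f"docs/_build_svgs/{lang}",
            *svg_files,
        ])
    return commands


def build_translated_svgs(languages):
    """Compile the svg catalogs, then build the SVGs for each language."""
    svg_files = sorted(glob.glob("docs/_assets/*.svg"))
    for command in svg_build_commands(languages, svg_files):
        subprocess.run(command, check=True)
```

Run build_translated_svgs(["ru"]) from the repository root, listing every language code you have pulled translations for.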

Teams and Roles

Teams are the groups of people who do the translations. Each project has just one team allocated to it, although a team can be allocated to more than one project. To illustrate this, below is a diagram showing the first ten projects listed under Open Data Services Co-operative (as at 2019-06-19) and the teams that are allocated to them.

Transifex: projects and teams

A team can consist of the following roles:

The BODS team consists of a team manager, translators and reviewers, with the team manager taking on the role of coordinator. We also use subject matter experts to maintain the glossary. They do not have to use Transifex. Their work can be done in a spreadsheet that is uploaded to Transifex by the team manager. In the future we intend to host the glossary in the data-standard repository.

The BODS team manager allocates the translators and reviewers to a specific language. As at the time of writing we have only set up a team consisting of the Members translating to Russian.

Transifex: BODS Team

NOTE: The OCDS handbook specifies different roles. It separates out the “team manager” role into a “Release Manager” and a ‘coordinator’. We should review these two different ways of working to see if we can agree a common standard. It also details a proofreader role which is not supported under the Transifex free plan.

Team manager

Tasks:

Therefore they need:

Skills:

Translator

Tasks:

Therefore they need:

Skills:

Reviewer

Tasks:

Therefore they need:

Skills:

Subject matter expert

Tasks:

Therefore they need:

Skills:

Access for translators

Translators should be given access to translate the main BODS project (documentation, schema, codelists), as well as the theme.

Translators and reviewers can follow the instructions here to sign up to Transifex: Transifex docs for translators.

Once a translator or reviewer has signed up to Transifex, an administrator can add them to the BODS Team through the BODS team > Members translating to Russian page. Clicking on “Add translators” or “Add reviewers” will bring up this form

Add collaborator

which can be completed to add the person to the project.

Instructions for translators

Translators should be given access to the project on Transifex and also a link to the latest version of the data standard website for context.

Translators do not have to translate every word in the Transifex project. Any text wrapped in backticks (e.g. `address`, `JSON document https://tools.ietf.org/html/rfc8259`) should not be translated. Pay special attention to this in the schema, schema-reference and concepts resources, where backticks are used most. In the svg resource, the names of objects and codes from a codelist are not to be translated. As a guide, provide a link to a translated version of the Key Concepts page (e.g. https://standard.openownership.org/es/latest/schema/concepts.html).
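The rule about backtick-wrapped text can be checked mechanically once a translation comes back. This is a minimal sketch and not a Transifex feature: the function name missing_literals is hypothetical, and it simply compares the backtick-wrapped spans in the source string with those in the translated string.

```python
import re
from collections import Counter

# Any run of non-backtick characters wrapped in single backticks.
BACKTICKED = re.compile(r"`[^`]+`")


def missing_literals(source, translation):
    """Return backtick-wrapped spans of `source` that `translation` lacks."""
    expected = Counter(BACKTICKED.findall(source))
    actual = Counter(BACKTICKED.findall(translation))
    # Counter subtraction keeps only spans that occur fewer times
    # in the translation than in the source.
    return sorted((expected - actual).elements())
```

An empty list means every backtick-wrapped span survived unchanged; anything returned is a span that was altered or dropped and should be looked at by a reviewer.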

The translation and review process

NOTE: This section describes a process that is different to how we have worked to date. As such it should be seen as a suggestion that is open to discussion.

Translators

Once the strings for the schema release have been uploaded to Transifex, the translator should be given access to the project and asked to begin the translation.

Translators should be aware that they will be required to take part in the review of their work. Details of this are in the following section.

Reviewers

There are two inter-linked tasks for a reviewer. A reviewer can choose to do these separately or together as they work through the project.

  1. Review all of the translated strings in the project
  2. Check all of the warnings* against the translated strings in the project

*Transifex will warn users when certain translation checks fail. This includes cases when a term in the source file is translated to something other than the translation in the glossary.
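To make the idea of a glossary warning concrete, here is a deliberately simple sketch of such a check. It is not how Transifex implements its checks, which this handbook does not describe; the function name and the example glossary entry are hypothetical, and plain substring matching will produce false warnings in languages where terms are inflected.

```python
def glossary_warnings(source, translation, glossary):
    """Return (term, expected) pairs where a glossary term appears in the
    source but its agreed translation is absent from the translated string."""
    warnings = []
    lowered_source = source.lower()
    lowered_translation = translation.lower()
    for term, expected in glossary.items():
        term_in_source = term.lower() in lowered_source
        expected_in_translation = expected.lower() in lowered_translation
        if term_in_source and not expected_in_translation:
            warnings.append((term, expected))
    return warnings


# Illustrative glossary entry only; the real glossary is maintained
# by the subject matter experts.
example_glossary = {"beneficial owner": "beneficiario final"}
```

A reviewer would still need to judge each warning, since a translation can be correct without matching the glossary entry character for character.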

Reviewers should make comments against the translation, which are then resolved between them and the translator.

Comments are made against a string. Because a string can consist of an entire paragraph it is necessary to quote the part of the string that a comment is made against. Because a reviewer might query more than one part of the string it is necessary to create a reference for the comment.

A comment template is as follows:

#1 "selection-of-text-being-commented-on"
- Description of the problem that the reviewer sees in the translation
- Suggestion how this can be resolved

The translator can then accept the suggestion by editing the translated string or they can reply to the reviewer with an alternative suggestion or a request for clarification.
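Reviewers who leave many comments may find it convenient to generate them in a consistent shape. The helper below is a hypothetical convenience that follows the comment template above; it is not part of Transifex.

```python
def format_review_comment(reference, quoted_text, problem, suggestion):
    """Build a review comment in the template's three-line layout."""
    return "\n".join([
        # The reference number lets one string carry several comments.
        f'#{reference} "{quoted_text}"',
        f"- {problem}",
        f"- {suggestion}",
    ])
```

The returned text can be pasted directly into the comment box for the string under review.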

A template for a response to a comment is as follows:

#1 "selection-of-text-being-commented-on"
- Response to the comment

Resolving differences

Where agreement cannot be reached by the translator and the reviewer it is the Team Manager’s role to decide what should be done. They may take a decision themselves, or seek external advice.

Adding new languages on readthedocs

Once you’ve got all your translations, you need to publish them. The process for adding a new language version of the docs on readthedocs is as follows.

These instructions were summarised from Localization of Documentation in the readthedocs docs.

Previewing on readthedocs

When work is in progress on a branch, you can build this branch in readthedocs to preview it before publishing.

Additional resources