The processes, roles and mechanisms detailed below implement the BODS Language Support Policy.
Note: This process is still in development, and improvements or clarifications are welcome.
The following instructions cover the translation of:
BODS schema and codelists exist under the schema folder within the BODS Github repository. The content for the BODS documentation website exists under the docs folder within the same BODS Github repository. The documentation website’s theme has its own Github repository - data-standard-sphinx-theme.
Translating the three components listed above allows the publicly available website at https://standard.openownership.org to be published in different languages. This is the aim of the translation work.
Things that are in scope for translation are:
title and description.
title, description and technical note.
Things that are not in scope for the translation are:
type and required
Things that need to become part of the translation workflow but are not yet:
The workflow for doing the translation is:
The diagram below provides an extremely high-level overview of the workflow. It excludes the final step of making the translation live.
BODS translations currently live under the Open Data Services Transifex account. The BODS docs and schema live in BODS-main (for the latest in-development version), or in versioned projects (for versioned releases of the standard), e.g. the v0.1 ‘project’ is bods-v01. Each project contains ‘resources’, each of which corresponds to a page of documentation (an individual RST file), plus one each for the schema, codelists and .svg files. These may also be referred to as ‘source files’.
The documentation sphinx theme translations live under bods-theme, which contains only one resource for all the strings in the theme templates.
Complete the following steps to set up your system to carry out the workflow outlined above (assumes Ubuntu 18.04.2 LTS or similar):
Create a free Transifex account on their sign up page.
The organization is the home of all the translation projects that an organization runs on Transifex.
Open Data Services Co-operative is already set up as an organization in Transifex and all of our projects and teams are managed from within it. A list of our translation projects is visible at the Open Data Services Co-operative public page.
Once you have signed up to Transifex you should ask an administrator of ODSC to make you an administrator too. We don’t maintain a list of those administrators here, but there is a considerable overlap with the people who contribute towards the BODS Standard repository.
Follow the instructions in the BODS data-standard-sphinx-theme README.md. These instructions will clone both the data-standard-sphinx-theme and data-standard repositories to your local machine.
Note that if you installed the data standard repo, the Transifex client is a dependency, so it will already be installed.
Transifex offers a number of different options for uploading content to be translated. In this documentation we will describe the process for setting up the Command Line Interface (CLI) Client.
The commands for installing the Transifex CLI client on Ubuntu are:
$ sudo apt-get install python-pip
$ sudo pip install transifex-client
You also need to make sure you have gettext, pybabel and (for SVGs) itstool installed in whatever environment you’re running this in:
$ sudo apt-get install gettext
$ sudo apt-get install python3-babel
$ sudo apt-get install itstool
Transifex configuration involves the creation of two files:
~/.transifexrc, which stores your Transifex host configuration in your home directory, including your API key.
.tx/config, which stores the mappings between your local files and Transifex in a .tx folder in your repo’s root directory.
Transifex’s own documentation for initialising the client involves running the tx init command. We are going to edit the files more directly.
You’ll need a Transifex API key to push to and pull from the BODS project. Click on the “Generate a token” button on the right hand side. Click on “Copy and Close”. Your API token has been created. You only need this locally; don’t commit it or share it or store it anywhere public.
So that you don’t have to enter it at the command line every time, you can store it in .transifexrc in your home directory (~/), which looks like:
[https://www.transifex.com]
api_hostname = https://api.transifex.com
hostname = https://www.transifex.com
password = YOUR-API-KEY-SHOULD-GO-HERE
username = api
Enter the lines as shown above, replacing YOUR-API-KEY-SHOULD-GO-HERE with your newly generated API key.
This sets you up with access to add and retrieve files to and from Transifex. See Transifex client configuration for more details.
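As an alternative to typing the file by hand, the sketch below renders the same four settings with Python's standard configparser module. The function names are illustrative and not part of the BODS tooling or the Transifex client; the host and key names are the ones shown in the example file above.

```python
import configparser
import io
import os

TRANSIFEX_HOST = "https://www.transifex.com"


def render_transifexrc(api_token: str) -> str:
    """Return the text of a .transifexrc file for the given API token."""
    # interpolation=None stops configparser treating '%' in a token specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser[TRANSIFEX_HOST] = {
        "api_hostname": "https://api.transifex.com",
        "hostname": TRANSIFEX_HOST,
        "password": api_token,
        "username": "api",
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def write_transifexrc(api_token: str, path: str) -> None:
    """Write the file, requesting owner-only permissions if it is created."""
    # The 0o600 mode only takes effect when the file does not exist yet.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(render_transifexrc(api_token))
```

A call such as write_transifexrc(token, os.path.expanduser("~/.transifexrc")) would then create the file. Where the token comes from (for example an environment variable of your own choosing) is left to you, as long as it is never committed.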
The .tx/config file is used to map files in a local repo/directory to resources in Transifex. This file is stored in the .tx folder in the repo’s root directory.
Although it is possible to pre-configure your .tx/config file in advance, there are a number of ways in which the configuration can change, right up until the moment that you extract and push your project’s strings up to Transifex. In particular, the .tx/config file maps:
For that reason, we often recreate the .tx/config file as part of the workflow.
The diagram below shows the state of the .tx/config file after extracting the strings from the 0.3 dev branch of the data-standard repo, ready to push those strings up to the BODS v0.3 project on Transifex.
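To make the mapping concrete, the following sketch builds the kind of content a .tx/config file holds for the legacy Transifex client: one [main] section plus one section per resource, each pointing at a source .pot file and a per-language .po path. It is an illustration only. In the workflow described here the real file is generated by sphinx-intl, the project name bods-v03 is a placeholder, and the slug rule (replacing "/" with "--") is a simplifying assumption rather than sphinx-intl's exact behaviour.

```python
import configparser
import io
from pathlib import PurePosixPath


def build_tx_config(project: str, pot_files: list[str],
                    pot_dir: str = "docs/_build/gettext",
                    locale_dir: str = "docs/locale") -> str:
    """Return .tx/config text mapping each .pot file to a Transifex resource."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["main"] = {"host": "https://www.transifex.com"}
    for pot in pot_files:
        # "schema/concepts.pot" -> "schema/concepts"
        relative = PurePosixPath(pot).with_suffix("")
        # Assumed slug rule: path separators become "--".
        slug = str(relative).replace("/", "--")
        parser[f"{project}.{slug}"] = {
            # <lang> is filled in by the client for each language.
            "file_filter": f"{locale_dir}/<lang>/LC_MESSAGES/{relative}.po",
            "source_file": f"{pot_dir}/{pot}",
            "source_lang": "en",
            "type": "PO",
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_tx_config("bods-v03", ["schema.pot", "codelist.pot"]))
```

Because each section name starts with the project name, pointing the same local files at a different Transifex project means regenerating the file, which is why the workflow recreates it rather than editing it.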
Instructions to create an initial .tx/config file are provided below.
Whenever any strings are changed that are in scope for translation (see list above) they need to be ‘extracted’, pushed to Transifex, translated, and the translated strings pulled back down. Updates to the documentation and schema should not be released until the necessary translations are in place.
The steps for doing this should be done by the person making the changes to the schema and docs, and are documented below. There are separate steps for the docs, schema and codelists, and you only need to carry out the steps applicable to the changes you made. For example, if you only updated the schema, you don’t need to execute commands to extract strings from the docs or codelists.
If you are working on a development branch, you should not push source file changes to Transifex. Instead, wait until your changes have been merged into the main branch. Source files should only ever be pushed to Transifex from the main branch (currently main) to ensure conflicts do not occur in Transifex between multiple people working on different branches simultaneously.
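One way to make this rule harder to break by accident is a small wrapper that checks the current Git branch before pushing source files. The sketch below is a suggestion, not part of the BODS tooling. The function names are hypothetical, and it assumes git and the tx client are on your PATH.

```python
import subprocess


def current_branch() -> str:
    """Return the name of the branch currently checked out."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def may_push_sources(branch: str, main_branch: str = "main") -> bool:
    """Source files may only be pushed from the main branch."""
    return branch == main_branch


def push_sources() -> None:
    """Run `tx push -s`, but only when on the main branch."""
    branch = current_branch()
    if not may_push_sources(branch):
        raise SystemExit(
            f"Refusing to push source files from '{branch}'; "
            "merge into main and push from there."
        )
    subprocess.run(["tx", "push", "-s"], check=True)
```

Calling push_sources() from a development branch exits with a message instead of contacting Transifex.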
Note that ‘extracted’ (English) strings (.pot files) are not pushed to the Github repo, but translated strings (.po files) are. This lets readthedocs find them so it can build everything in other languages. For a clean commit history, it’s helpful to make separate commits for your changes to the source (docs or schema) and the translation files subsequently pulled from Transifex.
The steps for the Sphinx theme are in the sphinx theme README.
Note for developers: .po files from the Sphinx theme are included when you build the docs from data-standard thanks to the following line in docs/conf.py:
locale_dirs = ['locale/', os.path.join(oods.sphinxtheme.get_html_theme_path(), 'locale')]
So if expected translations aren’t showing up, make sure the latest version of the theme is being installed.
Translations for the current latest version of BODS are found in the BODS-main Transifex project. These may not be updated until it is time for a new versioned release of BODS, meaning translations in Transifex may be lagging behind the latest text in the BODS Github repository. However, when changes to the docs, schema or codelists are merged into the main branch, these changes should always be pushed to Transifex, so that - assuming translators are available - translations can be brought up to date at any time. See integrating translations.
If you need to create a new Transifex project that contains the latest available source files and translations, do the following:
bods-v02 for BODS version 0.2.
Run tx push -s to push the source files to Transifex.
Run tx push -t to push the translation files to Transifex.
tx push -t -f - you will have to confirm (press y and
To run the steps in the translation workflow, ensure that you have followed the installation and setup instructions above.
Run the following commands from the root directory unless otherwise specified (e.g. sometimes it’s less complicated to run them from docs).
Run tx pull -a to make sure you have the most up to date translations in your local environment.
When you change text in the docs you need to do the following so that they can be translated:
In the docs directory, run make gettext to extract translatable English strings from the docs. (This generates .pot files into docs/_build/gettext/.)
If you modified the schema also:
Run pybabel extract -F babel_bods_schema.cfg . -o docs/_build/gettext/schema.pot to extract translatable English strings from the schema.
If you modified the codelists also:
Run pybabel extract -F babel_bods_codelist.cfg . -o docs/_build/gettext/codelist.pot to extract translatable English strings from the codelists.
babel_bods_codelist.cfg file to match.
If you modified an SVG diagram also:
Run itstool -i svg-its-rules.xml -o docs/_build/gettext/svg.pot docs/_assets/*.svg to extract translatable English strings from the SVGs.
If you added, deleted or renamed files or you want to use a different Transifex project, run (from root, i.e. cd ../):
rm -f .tx/config
sphinx-intl create-txconfig
sphinx-intl update-txconfig-resources --pot-dir docs/_build/gettext --locale-dir docs/locale --transifex-project-name bods-test
(Replacing bods-test with a different Transifex project name.)
And then:
Run tx push -s to push to Transifex.
Now the files are ready to be translated in Transifex.
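The extraction steps above depend on what you changed, so they can be written down as a small lookup. The sketch below only assembles the commands already listed in this section; the function names and the "docs", "schema", "codelists" and "svg" labels are illustrative choices. It does not cover regenerating .tx/config, and it assumes you are on the main branch and have already run tx pull -a.

```python
import glob
import subprocess

GETTEXT_DIR = "docs/_build/gettext"


def extraction_steps(changed: set[str]) -> list[tuple[str, list[str]]]:
    """Return (working directory, command) pairs for the changed components."""
    steps: list[tuple[str, list[str]]] = []
    if "docs" in changed:
        # make gettext is run from inside the docs directory.
        steps.append(("docs", ["make", "gettext"]))
    if "schema" in changed:
        steps.append((".", ["pybabel", "extract", "-F", "babel_bods_schema.cfg",
                            ".", "-o", f"{GETTEXT_DIR}/schema.pot"]))
    if "codelists" in changed:
        steps.append((".", ["pybabel", "extract", "-F", "babel_bods_codelist.cfg",
                            ".", "-o", f"{GETTEXT_DIR}/codelist.pot"]))
    if "svg" in changed:
        # No shell is involved, so the *.svg pattern is expanded here.
        svgs = sorted(glob.glob("docs/_assets/*.svg"))
        steps.append((".", ["itstool", "-i", "svg-its-rules.xml",
                            "-o", f"{GETTEXT_DIR}/svg.pot", *svgs]))
    if steps:
        # Push the freshly extracted source strings last.
        steps.append((".", ["tx", "push", "-s"]))
    return steps


def run_steps(steps: list[tuple[str, list[str]]]) -> None:
    """Run each command from the repository root, stopping at the first failure."""
    for cwd, command in steps:
        subprocess.run(command, cwd=cwd, check=True)
```

For example, run_steps(extraction_steps({"schema"})) would extract only the schema strings and then push the source files.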
To fetch new translations when they’re done, you need to run tx pull -a to fetch all, or tx pull -l ru to fetch a particular language.
If you are still on the main branch, check out a new development branch from which you will make a PR with the updated translations. Commit the new or updated .po files in docs/locale.
Build translated SVGs for each language using itstool, and commit these (because we can’t easily install itstool on readthedocs):
pybabel compile --use-fuzzy -d docs/locale -D svg
Replacing <LANG> with the relevant language code:
itstool -m docs/locale/<LANG>/LC_MESSAGES/svg.mo -o docs/_build_svgs/<LANG> docs/_assets/*.svg
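Since the itstool command has to be repeated for every language, it can help to generate the commands in a loop. The sketch below assumes the directory layout used in the commands above. The function names are illustrative, and creating the output directory beforehand is a precaution rather than a documented requirement of itstool.

```python
import os
import subprocess


def svg_build_commands(languages: list[str], svg_files: list[str],
                       locale_dir: str = "docs/locale",
                       out_dir: str = "docs/_build_svgs") -> list[list[str]]:
    """Return the pybabel command followed by one itstool command per language."""
    # Compile svg.po into svg.mo for every language under locale_dir.
    commands = [["pybabel", "compile", "--use-fuzzy", "-d", locale_dir, "-D", "svg"]]
    for lang in languages:
        mo_file = f"{locale_dir}/{lang}/LC_MESSAGES/svg.mo"
        commands.append(["itstool", "-m", mo_file,
                         "-o", f"{out_dir}/{lang}", *svg_files])
    return commands


def build_translated_svgs(languages: list[str], svg_files: list[str]) -> None:
    """Run the commands from the repository root."""
    for lang in languages:
        # Precaution: make sure each per-language output directory exists.
        os.makedirs(f"docs/_build_svgs/{lang}", exist_ok=True)
    for command in svg_build_commands(languages, svg_files):
        subprocess.run(command, check=True)
```

For instance, build_translated_svgs(["ru"], sorted(glob.glob("docs/_assets/*.svg"))) would build the Russian SVGs, given an import of glob.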
Teams are the groups of people who do the translations. Each project has just one team allocated to it, although a team can be allocated to more than one project. To illustrate this, below is a diagram showing the first ten projects listed under Open Data Services Co-operative (as at 2019-06-19) and the teams that are allocated to them.
A team can consist of the following roles:
The BODS team consists of a team manager, translators and reviewers, with the team manager taking on the role of coordinator. We also use subject matter experts to maintain the glossary. They do not have to use Transifex. Their work can be done in a spreadsheet that is uploaded to Transifex by the team manager. In the future we intend to host the glossary in the data-standard repository.
The BODS team manager allocates the translators and reviewers to a specific language. As at the time of writing we have only set up a team consisting of the Members translating to Russian.
NOTE: The OCDS handbook specifies different roles. It separates out the “team manager” role into a “Release Manager” and a ‘coordinator’. We should review these two different ways of working to see if we can agree a common standard. It also details a proofreader role which is not supported under the Transifex free plan.
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Translators should be given access to translate the main BODS project (documentation, schema, codelists), as well as the theme.
Translators and reviewers can follow the instructions here to sign up to Transifex: Transifex docs for translators.
Once a translator or reviewer has signed up to Transifex, an administrator can add them to the BODS Team through the BODS team > Members translating to Russian page. Clicking on “Add translators” or “Add reviewers” will bring up this form, which can be completed to add the person to the project.
Translators should be given access to the project on Transifex and also a link to the latest version of the data standard website for context.
Translators do not have to translate every word in the Transifex project. Any text wrapped in backticks (e.g. `address`, `JSON document https://tools.ietf.org/html/rfc8259`) should not be translated. Special attention should be paid to this in the schema, schema-reference and concepts resources, where backticks are used most. In the svg resource, the names of objects and codes from a codelist are not to be translated. As a guide, a link to a translated version of the Key Concepts page should be provided (e.g. https://standard.openownership.org/es/latest/schema/concepts.html).
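A reviewer or team manager could check the backtick rule mechanically. The sketch below is a suggested helper, not an existing BODS or Transifex feature: it collects every backtick-wrapped span in the source string and reports any that do not appear unchanged in the translation.

```python
import re
from collections import Counter

# Matches text wrapped in one or more backticks, e.g. `address` or ``type``.
LITERAL = re.compile(r"`+[^`]+`+")


def literal_spans(text: str) -> Counter:
    """Count each backtick-wrapped span in the text."""
    return Counter(LITERAL.findall(text))


def missing_literals(source: str, translation: str) -> list[str]:
    """Return source spans that are absent (or altered) in the translation."""
    missing = literal_spans(source) - literal_spans(translation)
    return sorted(missing.elements())


if __name__ == "__main__":
    source = "The `address` field is required."
    translation = "Поле `адрес` является обязательным."
    # Lists `address`, because the code term was translated.
    print(missing_literals(source, translation))
```

An empty list means every backtick-wrapped term survived translation intact. It does not say anything about the quality of the surrounding text.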
NOTE: This section describes a process that is different to how we have worked to date. As such it should be seen as a suggestion that is open to discussion.
Once the strings for the schema release have been uploaded to Transifex, the translator should be given access to the project and asked to begin the translation.
Translators should be aware that they will be required to take part in the review of their work. Details of this are in the following section.
There are two inter-linked tasks for a reviewer. A reviewer can choose to do these separately or together as they work through the project.
*Transifex will warn users when certain translation checks fail. This includes cases when a term in the source file is translated to something other than the translation in the glossary.
Reviewers should make comments against the translation, which are then resolved between them and the translator.
Comments are made against a string. Because a string can consist of an entire paragraph it is necessary to quote the part of the string that a comment is made against. Because a reviewer might query more than one part of the string it is necessary to create a reference for the comment.
A comment template is as follows:
#1 "selection-of-text-being-commented-on"
- Description of the problem that the reviewer sees in the translation
- Suggestion how this can be resolved
The translator can then accept the suggestion by editing the translated string or they can reply to the reviewer with an alternative suggestion or a request for clarification.
A template for a response to a comment is as follows:
#1 "selection-of-text-being-commented-on"
- Response to the comment
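If comments are drafted outside Transifex (for example in a spreadsheet) before being pasted in, the two templates can be produced consistently with a pair of small helpers. The sketch below simply mirrors the templates above; the function names are illustrative.

```python
def format_review_comment(number: int, quoted_text: str,
                          problem: str, suggestion: str) -> str:
    """Format a reviewer's comment using the reference-and-quote template."""
    return "\n".join([
        f'#{number} "{quoted_text}"',
        f"- {problem}",
        f"- {suggestion}",
    ])


def format_response(number: int, quoted_text: str, response: str) -> str:
    """Format a translator's reply, reusing the comment's reference and quote."""
    return "\n".join([
        f'#{number} "{quoted_text}"',
        f"- {response}",
    ])
```

Reusing the same number and quoted text in the response keeps the thread readable when one string has several comments.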
Where agreement cannot be reached by the translator and the reviewer it is the Team Manager’s role to decide what should be done. They may take a decision themselves, or seek external advice.
Once you’ve got all your translations, you need to publish them. The process for adding a new language version of the docs on readthedocs is as follows.
These instructions were summarised from Localization of Documentation in the readthedocs docs.
When work is in progress on a branch, you can build this branch in readthedocs to preview it before publishing.