The processes, roles and mechanisms detailed below implement the BODS Language Support Policy.
Note: This process is still in development, and improvements or clarifications are welcome.
The following instructions cover the translation of:
The BODS schema and codelists live under the schema folder within the BODS GitHub repository. The content for the BODS documentation website lives under the docs folder within the same repository. The documentation website’s theme has its own GitHub repository: data-standard-sphinx-theme.
By translating the three components listed above, the publicly available website at https://standard.openownership.org can be published in different languages. This is the aim of the translation work.
Things that are in scope for translation are:
title and description.
title, description and technical note.
Things that are not in scope for the translation are:
type and required
Things that need to become part of the translation workflow but are not yet:
The workflow for doing the translation is:
The diagram below provides an extremely high-level overview of the workflow. It excludes the final step of making the translation live.
BODS translations are found under the Open Data Services Transifex account.
The BODS documentation and schema are in BODS-main (for the latest in-development version), or versioned projects (for versioned releases of the standard), e.g. v0.1 ‘project’ is bods-v01. Once a translation has been completed in BODS-main a snapshot should be taken and renamed under the relevant project name or version, see steps to snapshot a translated release.
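The naming pattern for versioned projects can be sketched in a few lines of Python. This is a minimal illustration, assuming the pattern seen in the examples in this document (0.1 becomes bods-v01, 0.5 becomes bods-v05) holds for every MAJOR.MINOR release; the function name transifex_project_slug is hypothetical and is not part of any BODS tooling.

```python
import re


def transifex_project_slug(version: str) -> str:
    """Map a BODS version such as '0.1' or 'v0.5' to a slug like 'bods-v01'.

    Assumption: only MAJOR.MINOR versions are handled, because those are the
    only examples the documentation gives.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"expected a MAJOR.MINOR version, got {version!r}")
    major, minor = match.groups()
    return f"bods-v{major}{minor}"
```

For example, transifex_project_slug("0.5") gives the project name to pass to --transifex-project-name when snapshotting a release.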
A Transifex project contains ‘resources’, each of which corresponds to a page of documentation (an individual RST file), plus one each for the schema, codelists and SVG files. These may also be referred to as ‘source files’.
The translations for the documentation theme (menu items, footer text, etc) are in the bods-theme Transifex project, which contains only one resource for all the strings in the theme templates.
The following steps get your system set up to follow the translation workflow (assumes Ubuntu 22.04 LTS or similar).
Summary:
These steps are explained in more detail in the next sections.
A list of translation projects Open Data Services manage via Transifex is visible at the Open Data Services Co-operative public page.
Create a free Transifex account on their sign up page.
Ask an administrator of ODS to make you an administrator for the BODS projects. We don’t maintain a list of those administrators here, but there is a considerable overlap with the people who contribute towards the BODS repository.
If you are using the BODS development environment, this has already been installed and you can skip to Configuration.
curl -o- https://raw.githubusercontent.com/transifex/cli/master/install.sh | bash
For other methods of installing, or to get a specific version of the client, follow the instructions for installing the Transifex client for your system here.
Install gettext (for extracting source strings from the documentation), pybabel (for extracting from the schema and codelists) and itstool (for extracting from SVGs):
$ apt-get install gettext
$ apt-get install python3-babel
$ apt-get install itstool
Create a .transifexrc file in your home directory (~/) with the following contents, replacing YOUR-API-KEY-SHOULD-GO-HERE with your newly generated API key:
[https://www.transifex.com]
api_hostname = https://api.transifex.com
hostname = https://www.transifex.com
password = YOUR-API-KEY-SHOULD-GO-HERE
username = api
See Transifex client configuration for more details.
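Before running any tx command it can help to check that the configuration file has the expected shape. The following is a minimal sketch using Python's standard configparser module; the function name transifexrc_problems and the list of required keys are assumptions taken from the example contents above, not from the Transifex client itself.

```python
import configparser

# Assumption: these names mirror the example .transifexrc shown above.
SECTION = "https://www.transifex.com"
REQUIRED_KEYS = ("api_hostname", "hostname", "password", "username")
PLACEHOLDER = "YOUR-API-KEY-SHOULD-GO-HERE"


def transifexrc_problems(text: str) -> list[str]:
    """Return a list of human-readable problems found in .transifexrc text."""
    # Interpolation is disabled so that a '%' in an API key is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(SECTION):
        return [f"missing section [{SECTION}]"]
    problems = [
        f"missing key: {key}"
        for key in REQUIRED_KEYS
        if not parser.has_option(SECTION, key)
    ]
    if parser.get(SECTION, "password", fallback="") == PLACEHOLDER:
        problems.append("password still contains the placeholder API key")
    return problems
```

To check a real file, pass it the text read from the .transifexrc file in your home directory; an empty list means no problems were found by this check.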
The BODS-main Transifex project holds the latest source files and the most up to date translations available.
Translations may not be available for the latest English text because translation happens in batches when the source files are stable (not under active development).
When changes to the docs, schema or codelists that are in scope for translation are merged into the main branch, and a phase of translation is set to begin, these changes should be pushed to the BODS-main project on Transifex.
The steps to do this are:
These steps happen after changes have been approved and merged into the main branch on GitHub. Never push to Transifex from a development branch. Note that locally ‘extracted’ (English) strings (stored in .pot files) are ignored: they do not get pushed to the remote GitHub repository.
Updates to the documentation and schema should not be released until the translations are complete.
After translations have been added in Transifex, the translated strings (.po files) do need to be added to the GitHub repository so that ReadTheDocs can build everything in other languages. The steps to do this are:
main.
Stages 4 and 5 may need to be repeated several times.
Finally
To run the steps in the translation workflow, ensure that you have followed the installation and setup instructions.
Before you start, run tx pull -a to make sure you have the most up to date translations in your local environment.
Run the following commands from the root directory of the repository unless otherwise specified.
If you modified the schema:
pybabel extract -F babel_bods_schema.cfg . -o docs/_build/gettext/schema.pot
If you modified the codelists:
babel_bods_codelist.cfg file to match.
pybabel extract -F babel_bods_codelist.cfg . -o docs/_build/gettext/codelist.pot
If you modified an SVG diagram:
itstool -i svg-its-rules.xml -o docs/_build/gettext/svg.pot docs/_assets/*.svg
If you changed the documentation:
Change into the docs directory (cd docs).
Run make gettext to extract translatable English strings. This generates .pot files into docs/_build/gettext/.
If you added, deleted or renamed files, or you want to use a different Transifex project, run (from root):
rm -f .tx/config to delete the old config file
sphinx-intl create-txconfig to create a new empty config file
sphinx-intl update-txconfig-resources --pot-dir docs/_build/gettext --locale-dir docs/locale --transifex-organization-name OpenDataServices --transifex-project-name bods-main (replacing bods-main with a different Transifex project name if necessary) to fill the config file with the file paths for the source strings.
Run tx push -s to push the source files to Transifex.
Now the files are ready to be translated in Transifex. See ‘Teams and roles’ for project managing the translation process in Transifex.
main.
Run tx pull -f -a to fetch all, or tx pull -f -l ru to fetch a particular language (Russian in this case). (We force pull to ensure that local .po files are always overwritten with translations from Transifex.)
pybabel compile --use-fuzzy -d docs/locale -D svg
Run the following once per language, replacing <LANG> with the language code (e.g. ru):
itstool -m docs/locale/<LANG>/LC_MESSAGES/svg.mo -o docs/_build_svgs/<LANG> docs/_assets/*.svg
Commit the changes under docs/locale, e.g.:
git add docs/locale
git add docs/_build_svgs/
git commit -m "Translations: Add latest translations for the schema"
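Because the itstool step has to be repeated for each language, the pull-and-build sequence can be sketched as a Python helper that expands the per-language commands. This is a hedged illustration: the function name pull_and_build_commands is hypothetical, and the command strings are taken from the steps above with <LANG> filled in.

```python
def pull_and_build_commands(languages):
    """Return the shell commands to pull translations and build translated SVGs.

    'languages' is a list of language codes such as ['ru'].
    """
    if not languages:
        raise ValueError("at least one language code is required")
    commands = [f"tx pull -f -l {lang}" for lang in languages]
    # Compile the svg domain once; it covers every language under docs/locale.
    commands.append("pybabel compile --use-fuzzy -d docs/locale -D svg")
    for lang in languages:
        commands.append(
            f"itstool -m docs/locale/{lang}/LC_MESSAGES/svg.mo "
            f"-o docs/_build_svgs/{lang} docs/_assets/*.svg"
        )
    return commands
```

The git add and git commit commands shown above would follow once these commands have been run.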
The steps for the Sphinx theme are in the sphinx theme README.
Once you have extracted the strings, you can follow the instructions to Update the configuration before pushing to transifex.
Note for developers: .po files from the Sphinx theme are included when you build the docs from data-standard thanks to the following line in docs/conf.py:
locale_dirs = ['locale/', os.path.join(oods.sphinxtheme.get_html_theme_path(), 'locale')]
So make sure the latest version of the theme is being installed if expected translations aren’t showing up.
When a new version of BODS has been released, and the translation completed, we snapshot the translations in a new Transifex project to match the frozen git branch for the version.
bods-v05 for BODS version 0.5.
git branch 0.5.0).
Run tx push -s to push the source files to Transifex.
Run tx push -a to push the translation files to Transifex. Use the -l flag if you only want to push certain languages (e.g. tx push -l fr,ru).
tx push -t -f - you will have to confirm (press y and
Transifex pre-fills translations for phrases which have previously been translated using Translation Memory, but this doesn’t work across projects. We can reuse the Translation Memory from another project when creating a new project with the following steps:
docs/_build directory.
Run tx push -s to push the source files to Transifex.
Once you’ve got all your translations, you need to publish them. The process for adding a new language version of the docs on readthedocs is as follows.
These instructions were summarised from Localization of Documentation in the readthedocs docs.
During the translation process, there will be points where it will be helpful to generate a preview to allow the translators and/or reviewers to see the translations in context.
You can build this branch in readthedocs to preview it before publishing.
Teams are the groups of people who do the translations. Each project has just one team allocated to it, although a team can be allocated to more than one project. To illustrate this, below is a diagram showing the first ten projects listed under Open Data Services Co-operative (as at 2019-06-19) and the teams that are allocated to them.
A team can consist of the following roles:
The BODS team consists of a team manager, translators and reviewers, with the team manager taking on the role of coordinator. We also use subject matter experts to maintain the glossary. They do not have to use Transifex. Their work can be done in a spreadsheet that is uploaded to Transifex by the team manager. In the future we intend to host the glossary in the data-standard repository.
The BODS team manager allocates the translators and reviewers to a specific language. At the time of writing, we have only set up a team consisting of the Members translating to Russian.
NOTE: The OCDS handbook specifies different roles. It separates out the “team manager” role into a “Release Manager” and a ‘coordinator’. We should review these two different ways of working to see if we can agree a common standard. It also details a proofreader role which is not supported under the Transifex free plan.
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Tasks:
Therefore they need:
Skills:
Translators should be given access to translate the main BODS project (documentation, schema, codelists), as well as the theme.
Translators and reviewers can follow the instructions here to sign up to Transifex: Transifex docs for translators.
Once a translator or reviewer has signed up to Transifex, an administrator can add them to the BODS Team through the BODS team > Members translating to Russian page. Clicking on “Add translators” or “Add reviewers” will bring up a form which can be completed to add the person to the project.
Translators should be given access to the project on Transifex and also a link to the latest version of the data standard website for context.
Translators do not have to translate every word in the Transifex project. Any text wrapped in `s (e.g. `address`, `JSON document https://tools.ietf.org/html/rfc8259`) should not be translated. Special attention to this should be paid in the schema, schema-reference and concepts resources where they are used most. In the svg resource the names of objects and codes from a codelist are not to be translated. As a guide a link to a translated version of the Key Concepts page should be provided (e.g. https://standard.openownership.org/es/latest/schema/concepts.html).
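A reviewer or team manager could check the backtick rule mechanically. The following Python sketch lists every backtick-wrapped span in a source string that does not appear unchanged in its translation. It is an assumption-laden illustration: the function name missing_backtick_spans is hypothetical, the regular expression only handles simple single-backtick spans, and it is not part of Transifex's own translation checks.

```python
import re

# Matches a simple span such as `address`; nested backticks are not handled.
BACKTICK_SPAN = re.compile(r"`[^`]+`")


def missing_backtick_spans(source: str, translation: str) -> list[str]:
    """Return backtick-wrapped spans from 'source' that are absent from 'translation'."""
    return [
        span
        for span in BACKTICK_SPAN.findall(source)
        if span not in translation
    ]
```

An empty result means every backtick-wrapped span was carried over verbatim; any span returned is a candidate for a review comment.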
NOTE: This section describes a process that is different to how we have worked to date. As such it should be seen as a suggestion that is open to discussion.
Once the strings for the schema release have been uploaded to Transifex, the translator should be given access to the project and asked to begin the translation.
Translators should be aware that they will be required to take part in the review of their work. Details of this are in the following section.
There are two inter-linked tasks for a reviewer. A reviewer can choose to do these separately or together as they work through the project.
Transifex will warn users when certain translation checks fail. This includes cases when a term in the source file is translated to something other than the translation in the glossary.
Reviewers should make comments against the translation, which are then resolved between the translator and the reviewer.
Comments are made against a string. Because a string can consist of an entire paragraph it is necessary to quote the part of the string that a comment is made against. Because a reviewer might query more than one part of the string it is necessary to create a reference for the comment.
A comment template is as follows:
#1 "selection-of-text-being-commented-on"
- Description of the problem that the reviewer sees in the translation
- Suggestion how this can be resolved
The translator can then accept the suggestion by editing the translated string or they can reply to the reviewer with an alternative suggestion or a request for clarification.
A template for a response to a comment is as follows:
#1 "selection-of-text-being-commented-on"
- Response to the comment
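The two templates can be generated consistently with a pair of small Python helpers. This is only a sketch of the format described above; the function names format_review_comment and format_review_response are hypothetical, and nothing here talks to Transifex.

```python
def format_review_comment(ref: int, quoted: str, problem: str, suggestion: str) -> str:
    """Build a reviewer comment: reference number, quoted text, problem, suggestion."""
    return "\n".join([f'#{ref} "{quoted}"', f"- {problem}", f"- {suggestion}"])


def format_review_response(ref: int, quoted: str, response: str) -> str:
    """Build a translator response that reuses the reviewer's reference number."""
    return "\n".join([f'#{ref} "{quoted}"', f"- {response}"])
```

Reusing the same reference number and quoted text in the response keeps the thread readable when a reviewer has queried more than one part of a string.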
Where agreement cannot be reached by the translator and the reviewer it is the Team Manager’s role to decide what should be done. They may take a decision themselves, or seek external advice.