Extract locales automatically in CI
KevinMind committed Jul 12, 2024
1 parent c5e0e9e commit 9df4428
Showing 9 changed files with 218 additions and 248 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/localization.yml
@@ -0,0 +1,32 @@
name: GitHub Actions Demo
run-name: ${{ github.actor }} is testing out GitHub Actions 🚀
on: [push]
jobs:
  Explore-GitHub-Actions:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
          cache: 'yarn'
      - name: Install gettext
        run: sudo apt-get install gettext
      - name: Yarn install
        run: yarn install --frozen-lockfile --prefer-offline

      - name: Extract locales
        run: yarn run zx ./bin/locales.mjs
      - name: Commit locales
        run: |
          git config --global user.name "Kevin Meinhardt"
          git config --global user.email "kmeinhardt@mozilla.com"
          git checkout -b localization-ci
          git add .
          git commit -m 'test: localization'
          git push -u origin localization-ci
          gh pr create -B master -H localization-ci --title 'Locale BOT destroy!' --body 'Created by Github action'
41 changes: 41 additions & 0 deletions babel.config.locales.js
@@ -0,0 +1,41 @@
// Create UTC creation date in the correct format.
const potCreationDate = new Date()
  .toISOString()
  .replace('T', ' ')
  .replace(/:\d{2}.\d{3}Z/, '+0000');

module.exports = {
  extends: './babel.config.js',
  plugins: [
    [
      'module:babel-gettext-extractor',
      {
        headers: {
          'Project-Id-Version': 'amo',
          'Report-Msgid-Bugs-To': 'EMAIL@ADDRESS',
          'POT-Creation-Date': potCreationDate,
          'PO-Revision-Date': 'YEAR-MO-DA HO:MI+ZONE',
          'Last-Translator': 'FULL NAME <EMAIL@ADDRESS>',
          'Language-Team': 'LANGUAGE <LL@li.org>',
          'MIME-Version': '1.0',
          'Content-Type': 'text/plain; charset=utf-8',
          'Content-Transfer-Encoding': '8bit',
          'plural-forms': 'nplurals=2; plural=(n!=1);',
        },
        functionNames: {
          gettext: ['msgid'],
          dgettext: ['domain', 'msgid'],
          ngettext: ['msgid', 'msgid_plural', 'count'],
          dngettext: ['domain', 'msgid', 'msgid_plural', 'count'],
          pgettext: ['msgctxt', 'msgid'],
          dpgettext: ['domain', 'msgctxt', 'msgid'],
          npgettext: ['msgctxt', 'msgid', 'msgid_plural', 'count'],
          dnpgettext: ['domain', 'msgctxt', 'msgid', 'msgid_plural', 'count'],
        },
        fileName: './locale/templates/LC_MESSAGES/amo.pot',
        baseDirectory: process.cwd(),
        stripTemplateLiteralIndent: true,
      },
    ],
  ],
};
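
Note on `potCreationDate`: the expression in this config turns an ISO 8601 timestamp into the `YYYY-MM-DD HH:MM+0000` form that the `POT-Creation-Date` header uses. The sketch below shows the same transformation as a standalone function so it can be checked in isolation. The helper name `formatPotCreationDate` is an assumption for illustration and is not part of the commit; the only intended difference is that the `.` is escaped and the pattern is anchored, so it can only match the literal seconds-and-milliseconds suffix.

```javascript
// Hypothetical helper (not part of the commit): formats a Date the way
// babel.config.locales.js builds the POT-Creation-Date header value.
function formatPotCreationDate(date) {
  return date
    .toISOString() // e.g. "2017-06-08T14:43:12.345Z"
    .replace('T', ' ') // "2017-06-08 14:43:12.345Z"
    .replace(/:\d{2}\.\d{3}Z$/, '+0000'); // drop seconds and millis, add UTC offset
}

console.log(formatPotCreationDate(new Date('2017-06-08T14:43:12.345Z')));
// prints "2017-06-08 14:43+0000"
```

Because `toISOString()` always reports UTC, the fixed `+0000` offset stays correct regardless of the time zone of the machine running the extraction.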
3 changes: 0 additions & 3 deletions bin/extract-locales

This file was deleted.

59 changes: 59 additions & 0 deletions bin/locales.mjs
@@ -0,0 +1,59 @@
#!/usr/bin/env zx

import {$, path, echo, within, glob} from 'zx';

const root = path.join(__dirname, '..');
const localeDir = path.join(root, 'locale');
const templateFile = path.join(localeDir, '/templates/LC_MESSAGES/amo.pot');

within(async () => {
  echo('Extracting locales...');

  const sourceDir = path.join(root, 'src', 'amo');
  const outputDir = path.join(root, 'dist', 'locales');
  const localesConfig = path.join(root, 'babel.config.locales.js');

  await $`babel ${sourceDir} \
    --out-dir ${outputDir} \
    --config-file ${localesConfig} \
    --verbose \
  `;

  const {stdout: output} = await $`git diff --numstat -- ${templateFile}`;

  // git diff --numstat returns the number of insertions and deletions for each file
  // this regex extracts the numbers from the output
  const regex = /([0-9]+).*([0-9]+)/;

  const [, insertions = 0, deletions = 0] = output.match(regex) || [];

  const isLocaleClean = insertions < 2 && deletions < 2;

  if (isLocaleClean) {
    return echo('No locale changes, nothing to update, ending process');
  }

  echo(`Found ${insertions} insertions and ${deletions} deletions in ${templateFile}.`);

  const poFiles = await glob(`${localeDir}/**/amo.po`);

  echo(`Merging ${poFiles.length} translation files.`);

  for await (const poFile of poFiles) {
    const dir = path.dirname(poFile);
    const stem = path.basename(poFile, '.po');
    const tempFile = path.join(dir, `${stem}.po.tmp`);
    echo(`merging: ${poFile}`);

    try {
      await $`msgmerge --no-fuzzy-matching -q -o ${tempFile} ${poFile} ${templateFile}`
      await $`mv ${tempFile} ${poFile}`
    } catch (error) {
      await $`rm ${tempFile}`;
      throw new Error(`Error merging ${poFile}`);
    }
  }

  return true;
});
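
Note on the `--numstat` parsing: `git diff --numstat` prints one line per file in the form `<insertions>\t<deletions>\t<path>`. In the pattern `/([0-9]+).*([0-9]+)/` the `.*` is greedy, so the second group appears to capture only the last digit on the line; for a path without digits that is the final digit of the deletion count (`34` would be read as `4`, and `10` as `0`). The sketch below is one possible stricter parse, not the commit's implementation. The helper names `parseNumstat` and `isLocaleClean` are assumptions for illustration, and the threshold mirrors the script's `< 2` check, which presumably exists because a timestamp-only change to the `.pot` file counts as one insertion and one deletion.

```javascript
// Hypothetical helper (not part of the commit): parses the first line of
// `git diff --numstat` output, which looks like "<insertions>\t<deletions>\t<path>".
// Empty output (no diff) and binary entries ("-\t-\t<path>") yield zeros.
function parseNumstat(output) {
  const match = output.match(/^(\d+)\s+(\d+)\s/);
  if (!match) {
    return { insertions: 0, deletions: 0 };
  }
  return { insertions: Number(match[1]), deletions: Number(match[2]) };
}

// Mirrors the script's threshold: at most one inserted and one deleted line
// is treated as "no real locale changes".
function isLocaleClean({ insertions, deletions }) {
  return insertions < 2 && deletions < 2;
}

const stats = parseNumstat('12\t34\tlocale/templates/LC_MESSAGES/amo.pot\n');
console.log(stats.insertions, stats.deletions, isLocaleClean(stats));
// prints "12 34 false"
```

Converting the captured groups with `Number` also keeps the comparison numeric instead of relying on string-to-number coercion.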

24 changes: 0 additions & 24 deletions bin/merge-locales

This file was deleted.

100 changes: 0 additions & 100 deletions bin/run-l10n-extraction

This file was deleted.

93 changes: 5 additions & 88 deletions docs/i18n.md
@@ -20,96 +20,13 @@ NODE_PATH='./:./src' bin/create-locales

## Updating locales

-TL;DR: run the following script from the `master` branch: `./bin/run-l10n-extraction`
+Locales are updated automatically as a part of our CI.
+On every push to master `yarn extract-locales` is run which extracts locale strings from our codebase,
+merges any changes to the source language files and commits the changes.

-### The long story
+You can run this command manually on your local environment any time to check the output strings.

-Once a week right after the forthcoming release [is tagged](http://addons.readthedocs.io/en/latest/server/push-duty.html), the locales for each app must be generated.
-
-This is a semi-automated process: a team member must create a pull request with the following commits:
-
-1. A commit containing the extraction of newly added strings
-2. A commit containing a merge of localizations
-
-Each one of these steps are detailed in the sections below. Let's begin...
-
-#### Extracting newly added strings
-
-Start the process by creating a git branch and extracting the locales.
-
-```
-git checkout master
-git pull
-git checkout -b amo-locales
-bin/extract-locales
-```
-
-This extracts all strings wrapped with `i18n.gettext()` or any other function supported by [Jed][jed] (the library we use in JavaScript to carry out replacements for the string keys in the current locale).
-
-The strings are extracted using a babel plugin via webpack. Extracted strings are added to a pot template file. This file is used to seed the po for each locale with the strings needing translating when merging locales.
-
-Run `git diff` to see what the extraction did. **If no strings were updated then you do not have to continue creating the pull request. You can revert the changes made to the `pot` timestamp.** Here is an example of a diff where no strings were changed. It just shows a single change to the timestamp:
-
-```diff
-diff --git a/locale/templates/LC_MESSAGES/amo.pot b/locale/templates/LC_MESSAGES/amo.pot
-index 31e113f2..c7da4e34 100644
---- a/locale/templates/LC_MESSAGES/amo.pot
-+++ b/locale/templates/LC_MESSAGES/amo.pot
-@@ -2,7 +2,7 @@ msgid ""
-msgstr ""
-"Project-Id-Version: amo\n"
-"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
--"POT-Creation-Date: 2017-06-08 14:01+0000\n"
-+"POT-Creation-Date: 2017-06-08 14:43+0000\n"
-"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
-"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
-"Language-Team: LANGUAGE <LL@li.org>\n"
-```
-
-When the application is under active development it's more likely that you will see a diff containing new strings or at least strings that have shifted to different line numbers in the source. If so, commit your change and continue to the next step:
-
-```
-git commit -a -m "Extract AMO locales"
-```
-
-#### Merging locale files
-
-After extracting new strings, you have to merge them into the existing locale files. Do this in your branch and commit:
-
-```
-bin/merge-locales
-```
-
-Keep an eye out for [fuzzy strings](https://www.gnu.org/software/gettext/manual/html_node/Fuzzy-Entries.html) by running `git diff` and searching for a comment that looks like `# fuzzy`. This comment means the localization may not exactly match the source text; a localizer needs to review it. As per our configuration, the application will not display fuzzy translations. These strings will fall back to English.
-
-In some rare cases you may wish to remove the `fuzzy` marker to prevent falling back to English. Discuss it with a team member before removing `fuzzy` markers.
-
-Commit and continue to the next step:
-
-```
-git commit -a -m "Merged AMO locales"
-```
-
-#### Finalizing the extract/merge process
-
-Now that you have extracted and merged locales for one application, it's time to create a pull request for your branch. For example:
-
-```
-git push origin amo-locales
-```
-
-If the pull request passes all of our CI tests it is likely good to merge. You don't need to ask for a review unless you're unsure of something because often locale updates will be thousands of lines of minor diffs that can't be reasonably reviewed by a human. 🙂 If the pull request passes all of our CI tests it is likely good to merge.
-
-#### Building the JS locale files
-
-This command creates the JSON files which are then built into JS bundles by webpack when the build step is run. This happens automatically as part of the deployment process.
-
-Since dist files are created when needed you only need to build and commit the JSON to the repo.
-
-```
-# build the JSON.
-bin/build-locales
-```
+Github actions internally prevent infinite loops by default.

## Setting up translations

6 changes: 4 additions & 2 deletions package.json
@@ -92,7 +92,7 @@
}
},
"extract-locales": {
-"command": "webpack --progress --color --config webpack.l10n.config.babel.js",
+"command": "zx ./bin/locales.mjs",
"env": {
"NODE_ENV": "production",
"NODE_ICU_DATA": "./node_modules/full-icu",
@@ -253,6 +253,7 @@
"webpack-isomorphic-tools": "4.0.0"
},
"devDependencies": {
+"@babel/cli": "7.23.4",
"@babel/core": "^7.24.7",
"@babel/eslint-parser": "^7.24.7",
"@babel/preset-env": "^7.24.7",
@@ -329,7 +330,8 @@
"webpack-cli": "^4.0.0",
"webpack-dev-middleware": "^6.1.2",
"webpack-hot-middleware": "^2.26.1",
-"webpack-subresource-integrity": "5.1.0"
+"webpack-subresource-integrity": "5.1.0",
+"zx": "8.1.4"
},
"bundlewatch": [
{