Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New find_xgettext method to i18n module #13900

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

bruchar1
Copy link
Member

This method return a PotExtractor object used to extract translatable
string from source files into a .pot translation template.

It differs from a plain CustomTarget in three ways:

  • It accepts build targets as sources, and automatically resolves source
    files from those build targets;
  • It detects command lines that are too long, and writes, at config
    time, the list of source files into a text file to be consumed by the
    xgettext command;
  • It detects dependencies between pot extraction targets, based on the
    dependencies between source targets.

This does the specific thing I need for my project, that I was trying to implement in a more generic way in #12272 and #11822.

@bruchar1 bruchar1 requested a review from jpakkane as a code owner November 12, 2024 13:45
@bruchar1 bruchar1 force-pushed the i18nextract branch 4 times, most recently from 9ebc09e to 4f5283b Compare November 12, 2024 16:23
This was referenced Nov 14, 2024
docs/markdown/i18n-module.md Outdated Show resolved Hide resolved
@@ -74,3 +74,72 @@ for normal keywords. In addition it accepts these keywords:
* `mo_targets` *required*: mo file generation targets as returned by `i18n.gettext()`.

*Added 0.62.0*


### i18n.pot_extractor()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a more specific name like find_xgettext for this function and XgettextProgram for the object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I'm open changing the names, I'm not sure about find_xgettext, because I think it would suggest that the returned object is compatible with was is returned by find_program. Or should it be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think this is a neat idea. I modified it to make the returned object a subclass of ExternalProgram.

@bruchar1 bruchar1 force-pushed the i18nextract branch 2 times, most recently from b35d6b7 to 9347ef2 Compare December 6, 2024 16:51
@@ -115,6 +134,161 @@
}


class XgettextProgram(ExternalProgram):

Check failure

Code scanning / CodeQL

Missing call to `__init__` during object initialization Error

Class XgettextProgram may not be initialized properly as
method ExternalProgram.__init__
is not called from its
__init__ method
.
@bruchar1 bruchar1 changed the title New pot_extrator method to i18n module New find_xgettext method to i18n module Dec 6, 2024
This method return a XgettextProgram object used to extract translatable
string from source files into a .pot translation template.

It differs from a plain CustomTarget in three ways:
- It accepts build targets as sources, and automatically resolves source
  files from those build targets;
- It detects command lines that are too long, and writes, at config
  time, the list of source files into a text file to be consumed by the
  xgettext command;
- It detects dependencies between pot extraction targets, based on the
  dependencies between source targets.
@typed_kwargs(
'i18n.find_xgettext',
_ARGS,
KwargInfo('required', bool, default=True),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the shared option that also handles UserFeatureOptions? Or add the disabler argument. There may be something I'm missing here.

for dep in depends:
source_list += '\n' + path.join(dep.subdir, dep.get_filename())

estimated_cmdline_length = len(source_list) + len(arguments) + sum(len(arg) for arg in arguments) + 1
Copy link
Member

@dcbaker dcbaker Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code is correct, but it looks really odd because of the way your count the spaces in arguments, I think:

# count the space between each argument as well
estimated_cmdline_length = len(source_list) + sum(len(arg) + 1 for arg in arguments) + 1

would be more clear


estimated_cmdline_length = len(source_list) + len(arguments) + sum(len(arg) for arg in arguments) + 1
if estimated_cmdline_length < 8000:
# Maximum command line length on Windows is 8191
Copy link
Member

@dcbaker dcbaker Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a helper function to calculate an rsp threshold in the ninja backend, which includes an environment variable to tune whether we use an rsp file or not. I wonder if it would be appropriate to use that function here (maybe moved to utils.universal)? (or possibly it would make more sense to have a way for custom_targets to specify that they can use an RSP file, but that seems like a large out of scope project.

class XgettextProgram(ExternalProgram):

def __init__(self, xgettext: ExternalProgram, args: T.List[str]):
self.__dict__.update(xgettext.__dict__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this, but I understand why you did this.

Comment on lines +113 to +117
- merge `bool`:
if `true`, will merge the resulting pot file with extracted pot files
related to dependencies of the given source targets. For instance,
if you build an executable, then you may want to merge the executable
translations with the translations from the dependent libraries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it should be called "recursive" instead of "merge".

Comment on lines +121 to +126
- alias `list[build_tgt]`:
a list of build targets using the same resulting pot file. For
instance, if you build both a static and a shared library from the
same sources, you may specify the static library target as an alias,
to use the generated pot file when that static library is used as a
dependency of another target.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what this is doing. But it seems backwards somehow.

Comment on lines +1 to +12
## i18n module find_xgettext

There is a new `find_xgettext` function in `i18n` module that acts as a
wrapper around `xgettext`. It allows to extract strings to translate from
source files.

This function is convenient, because:
- It can find the sources files from a build target;
- It will use an intermediate file when the number of source files is too
big to be handled directly from the command line;
- It is able to get strings to translate from the dependencies of the given
targets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately the goal here is just to avoid the i18n.gettext() assumption that POTFILES is the source of truth for xgettext, right?

It feels like this function adds complexity which doesn't directly serve that. In particular, it seems odd to have a function which finds the program, and then have that yield an object with a single method. We could instead directly provide an i18n.xgettext() function with the same API contract which internally looks up xgettext, no?

@eli-schwartz
Copy link
Member

Since the underlying goal as I understand it is essentially "i18n.gettext() is not suitable for my use case because it depends on POTFILES but I would like to calculate the list from meson.build", it feels a bit like a more appropriate solution here could be to teach i18n.gettext() to take an optional override of files to extract from (and e.g. resolve build targets by recursing into their source filelist, if desirable) -- and then generate a POTFILES for you, basically.

@bonzini
Copy link
Contributor

bonzini commented Dec 10, 2024

and then generate a POTFILES for you, basically.

I understood that he also wants to generate multiple .pot files, so that would not be a single POTFILES.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants