-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New 'build' and 'source' bool args for include_directories() #11984
base: master
Are you sure you want to change the base?
Conversation
fdef3a0
to
1ba64b2
Compare
mesonbuild/build.py
Outdated
# 'incdirs', if not absolute paths, are to be used as both src-relative and | ||
# build-relative. Absolute paths are allowed and will just result in one | ||
# include dir. | ||
src_and_build_relative = 0 | ||
|
||
# 'incdirs' will only be used as src-relative (or absolute) include directories. | ||
src_relative = 1 | ||
|
||
# 'incdirs' are expected to be build-relative include dirs and will error if | ||
# absolute paths are used. | ||
build_relative = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general we use ALL_CAPS
names for enum fields, and I'd prefer to keep it that way.
These are also really verbose names, could we just have SRC
, BUILD
, BOTH
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that an enum is very wrong here, because it fundamentally ties to the meson.build API contract being offered, which is string-based and should not be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect an API at the meson.build level to look like this:
incdirs = include_directories('headers', build: false, source: true)
This is much better than:
incdirs: include_directories('headers', use_type: 'src-and-build-relative')
It's easier for users to understand, and it's also easier for the interpreter to validate because we don't need an enum of magic strings, we just need two boolean properties...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's very easy to validate and convert a value from a set of 3 allowed strings.
Also, I'd argue that accepting 2 separate boolean arguments is a less strict, slightly sloppier interface because it allows/invites the user to try to express -
incdirs = include_directories('headers', build: false, source: false)
which is a meaningless combination of boolean args and which now needs a new bit of validation + user error message. Ultimately, at the lowest level (i.e. within class IncludeDirs
) we should express only the set of acceptable/meaningful values, which the enum describes perfectly, but your suggestion raises the following questions -
- Should we keep this kind of enum within
IncludeDirs
? Doing so while having a '2 bools'-style user interface for the parser/interpreter would now require us to additionally translate from 2 bools to the stricter, more precise enum used byIncludeDirs
. - Should we keep the '2 bools'-style within
IncludeDirs
too, instead of a more exact enum? In that case, we not only need to do validation of the invalid combination in the interpreter, we also need additional validation withinIncludeDirs
to ensure any other internal use ofIncludeDirs
isn't setting the invalid combination.
Happy to make the ALL_CAPS and shorter enum suggestions, provided I can point out that I copied this enum style from existing code: See BuildStep(Enum)
, Quoting(Enum)
, and WrapMode(Enum)
... but it does look like the predominant style is the ALL_CAPS gang.
But before I do any changes, I'd just like to get thoughts on my concerns with the '2 separate bools'-style interface, above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer the two bools option. Having both values to false
is not a problem, as you can always add a validator an issue and error if this is the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's not too a big deal either way and you say you prefer the two bools style in response to my points above, I've added a separate commit for now (just in case anyone wants to compare the two alternatives) that uses that approach. This approach keeps 2 bools throughout the internal structures and so, as mentioned above, needs to double up the validation for the invalid combination that this style now invites.
mesonbuild/build.py
Outdated
include_dirs_use_kwargs_to_enum: T.Dict[str, IncludeDirsUse] = { | ||
'src-and-build-relative': IncludeDirsUse.src_and_build_relative, | ||
'src-relative': IncludeDirsUse.src_relative, | ||
'build-relative': IncludeDirsUse.build_relative | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is module constant data, so to fit our style it should be INCLUDE_DIRS_USE_KWARGS_TO_ENUM
On the other hand, if we just made the values match the names, then you can do
IncludeDirsUse[name]
and avoid this, ie e = IncludeDirsUse['BOTH']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having names match values would then either be either imposing the internal ALL_CAPS-ness style of the enum values to the parser/interperter's public interface -
incdirs = include_directories('headers', use_type: 'BOTH')
just to avoid this little translation mapping from the user meson script values to the internal enum, which seems a little unfortunate/clunky... Or we still need an arguably more involved way to translate the user script strings to internal enum vals. Something like this lambda with conditional generator and 'next'? -
class IncludeDirsUse(Enum):
BOTH = 'both'
SRC = 'src'
BUILD = 'build'
...
KwargInfo(
'use_type',
...
validator=in_set_validator(set((e.value for e in IncludeDirsUse))),
convertor=lambda x: next( (e for e in IncludeDirsUse if x == e.value) )
I'm OK with this (and the all caps style) but a str -> enum dict seems marginally preferable to me, but maybe you know of a nicer way to convert from Enum assigned values ('both', 'src', 'build') to an actual enum type (IncludeDirsUse.BOTH, IncludeDirsUse.SRC, etc), where the string values differ from the enum member values, than this 'next' + generator fiddlyness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just do lambda name: IncludeDirUse[name.upper()]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
mesonbuild/build.py
Outdated
def _validate_no_build_relative_abs_paths(self): | ||
if self.use_type == IncludeDirsUse.build_relative: | ||
for d in self.incdirs: | ||
if os.path.isabs(d): | ||
raise MesonException(f'IncludeDirs, with \'build_relative\', using absolute path ({d}), is not expected. This is likely a user error.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the wrong place to do use input validation, this should all be happening in the Interpreter/module layer, and the build layer should assume what it's getting is valid already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have instantiation of IncludeDirs in several places -
- modules/windows.py,
compile_resources(...)
- modules/gnome.py,
generate_vapi(...)
- interpreter/interpreterobjects.py,
private_dir_include_method(...)
- build.py,
add_include_dirs(...)
- interpereter/interpreter.py,
build_incdir_object(...)
I think the last 2 of these are the main instances of user-script-supplied include dirs, which would most benefit from validating that the user isn't likely confused by using an absolute path with the 'build'-relative use_type.
Are you suggesting I move _validate_no_build_relative_abs_paths(...)
outside of IncludeDirs
to a free function that is then explicitly called prior to each of the instantiation scenarios above (or perhaps just the last one)?
And then are there any concerns with the possibility of missing this bit of user sanity checking if some future bit of interpreter functionality decides to construct more new IncludeDirs
instances while negecting to perform this bit of validation?
I ask these because, with the current state of these changes, this validation is very difficult to avoid all current and possible future use of IncludeDirs
, which seems a reasonable thing to do, so I'd like to better understand the actual problems with being done where it is as opposed to in one or more places elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the build layer should assume what it's getting is valid already
Are you suggesting [...] a free function that is then explicitly called prior to each of the instantiation scenarios above (or perhaps just the last one)?
No, the suggestion is that instantiation scenarios do not need any validation at all, because it's assumed to be valid.
There's a conflict of understanding here, in that:
- @dcbaker is assuming the point of validation is to check that users give valid inputs to meson.build, but the meson code itself contains no bugs, for example because it's tested using unittests
- you are assuming the point of validation is to check that future contributors to the meson codebase don't violate the invariants of the API.
My hot take is that doing sanity checking like this for the latter case is fundamentally incorrect and wrong, in all domains and all languages, not because of the reasons one might expect, but because it should be using an assert for code that is only meant to run in debug builds to verify that other coders working on the codebase didn't write incorrect code.
:P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. In which case, we end up with one assertion validation internal to IncludeDirs, similar to what's already there but an assert, in 'post_init', instead of a MesonException. Then another check, doing exactly the same validation logic, only ending up with a user-facing MesonException message, to be done only on the interpreter's processing of the user's dirs listed in their include_directories(...)
input, right?
Now, because both uses want to perform exactly the same checks under slightly different circumstances, instead of duplicating the same checking logic in 2 difeerent places, it would seem reasonable to add a IncludeDirs method (maybe something called validate_no_build_relative_abs_paths()
or has_no_build_relative_abs_paths()
) that's used in both cases.
This is very similar to what's already there, seems reasonable, and I'm fine with making this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I mean, my main focus here wasn't to take a side and say that it either should or should not check in both places... I'm currently just focused on helping everyone to better understand what the proposal is. That being said, I don't personally mind an assert, assuming no other devs have a strong objection.
So in the interest of better understanding what the proposal is, I think it's very beneficial for code review in general to be able to look at a piece of code and say "this is an assert, clearly the author of the changeset intends for this to be testing an invariant, the violation of which constitutes a meson bug", and distinguish that from "this code appears to be testing user input".
Reviewers might then respond to the assert by saying "I don't think this assert is needed", or they might respond by saying "okay now I get it, that's fine to have there". But they won't respond by saying "user input can't get here why are you checking it again". :)
So I think always being clear about what's an invariant check and what's a user input validation check, is important.
mesonbuild/build.py
Outdated
strlist.append(os.path.join(sourcedir, self.curdir, idir)) | ||
if builddir: | ||
strlist.append(os.path.join(builddir, self.curdir, idir)) | ||
if self.use_type != IncludeDirsUse.build_relative: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be using is
and is not
with enum values, rather than ==
and !=
, which are just wrappers around is
and is not
anyway, but provide an extra layer of indirection.
KwargInfo( | ||
'use_type', | ||
str, | ||
default='src-and-build-relative', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these names are also rather long and verbose src
, build
, both
are much shorter and easier to remember.
I'm also not sure about the name. use_type
isn't wrong, but maybe we can come up with something that is more self-descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Use what? -isystem?"
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with both
, build
, src
.
I also wasn't sure about use_type
but couldn't come up with anything better and barring something really long and descriptive, which is itself still quite difficult to encapsulate the meaning of this field, I couldn't think of anything better. If either of you can, I'm all ears 😄
I've updated the changes with all but one of the above suggestions. On the suggestion of is / is not instead of ==/!= used for enum value comparisons: Simple value equivalence is often what we mean but 'is' being 'actual objects are the same' and concerning yourself with underlying object instantiation is something rather specific and niche.
vs -
is rarely something the author or the reader wants to concern themselves with but the author's use of 'is' is much more an explicit signal that they really are interested specifically in object identity equality, more than just value equivalence, requiring the reader to stop and dig a little deeper into their understanding of the underlying objects in play... but often when unnecessary and when value equivalence is really all that was intended. I think your point is a concern about performance on the different language mechanisms involved but if actual performance is equal (which I'm prepared to accept they may not be), I tend to prefer language that expresses the strictest intent/meaning without exposing concepts/internals that aren't necessary or fundamentally intended. Regarding my concern over using double boolean args instead of a strict, exact set of string values: I'm not someone who will pointlessly dig their heels in and double-down on a position if I can be persuaded of a better alternative and, likewise, I expect neither of you have fragile egos and would stubbornly refuse to change your position when there's some debate over an issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates so far. I gave this another look, more at the implementation details this time.
use_type: | ||
type: str | ||
default: "'both'" | ||
since: 1.2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.3.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mesonbuild/build.py
Outdated
for idir in self.incdirs: | ||
strlist.append(os.path.join(sourcedir, self.curdir, idir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way, but maybe with comprehensions?
strlist.extend([os.path.join(sourcedir, self.curdir, idir) for idr in self.incdirs])
I don't have a strong opinion. either way though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, except with a slight tweak: Since extend takes an iterable, it seemed better to pass a generator, rather than instantiate a list.
'use_type', | ||
str, | ||
default='both', | ||
since='1.2.0', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'1.3.0'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
validator=in_set_validator(set(e.value for e in build.IncludeDirsUse)), | ||
convertor=lambda x: next(e for e in build.IncludeDirsUse if x == e.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be simpler to write as:
validator=in_set_validator({e.value for e in build.IncludeDirUse}),
convertor=lambda x: build.IncludeDirUse[x.upper()]
The convertor can definitely be changed, since we've already been assured that x is Literal['src', 'build', 'both']
. And the validator can uset a set comprehension instead of a calling set on a generator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is a bit neater. Thanks. Done.
mesonbuild/backend/ninjabackend.py
Outdated
srctreedir = os.path.normpath(os.path.join(self.build_to_src, expdir)) | ||
sargs = compiler.get_include_args(srctreedir, is_system) | ||
|
||
inc_dirs: T.List['ImmutableListProtocol[str]'] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be Immutable yet. ImmutableList's don't define mutation methods. So you'll need to make this
inc_dirs: T.List[T.List[str]] = []
and then (possibly) cast it in the return. The case may not be required since ImmutableListProtocol[str]
is fulfilled by list[str]
, and mypy might be smart enough to figure that out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
validator=in_set_validator(set(e.value for e in build.IncludeDirsUse)), | ||
convertor=lambda x: next(e for e in build.IncludeDirsUse if x == e.value) | ||
), | ||
) | ||
def func_include_directories(self, node: mparser.BaseNode, args: T.Tuple[T.List[str]], | ||
kwargs: 'kwtypes.FuncIncludeDirectories') -> build.IncludeDirs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see mesonbuild.interpreter.kwargs.FuncIncludeDirectories
updated with the new keyword argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. It was missing, but its absence caused no problems. Still, I've now added it.
Is the intention that those kwargs.py classes get used just for type checking? Based on that use, would you have expected that mypy should have picked up my omission from FuncIncludeDirectories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intention that those kwargs.py classes get used just for type checking?
Yes.
Based on that use, would you have expected that mypy should have picked up my omission from FuncIncludeDirectories?
No, because interpreter.py is incomplete and thus ends up excluded from coverage.
For me when I see enums being compared with >> e = Enum('e', ['a', 'b, 'c])
>> e.a == 1
False
>> e = IntEnum('e', ['a', 'b', 'c'])
>> e.a == 1
True And, you really do want to do an identity rather than value check with Enums, they're distinct types |
In other words:
given the use case of an enum is specifically that it is a kind of types-based named sentinel where the type is the sentinel, we are in the unusual case where we in fact are interested specifically in object identity equality. It would be appropriate to use |
94e1cfe
to
be03712
Compare
I noticed that, while this has been waiting, a conflicting Is there anything else anyone wants me to do? |
62ad8a0
to
94f1791
Compare
fc57187
to
edeaf98
Compare
edeaf98
to
0fef0b6
Compare
mesonbuild/build.py
Outdated
build_relative: bool = field(default=True) | ||
source_relative: bool = field(default=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build_relative: bool = field(default=True) | |
source_relative: bool = field(default=True) | |
build_relative: bool = True | |
source_relative: bool = True |
While not wrong, you don't really need field
here, since you only provide the default value
|
||
def __repr__(self) -> str: | ||
r = '<{} {}/{}>' | ||
return r.format(self.__class__.__name__, self.curdir, self.incdirs) | ||
|
||
def __post_init__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this validation step is really necessary. User arguments are already checked in the interpreter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this overlaps with what I mentioned above: There are at least 5 different places that create IncludeDirs instances: Some come through the interpreter (which we expect to do user-facing validation) but the other instances are from meson internals, which, along with any new future additional internal use, can benefit from the assert catching future internal/meson-programmer misuse.
i = build.IncludeDirs(self.subdir, incdir_strings, is_system) | ||
|
||
if not (build_relative or source_relative): | ||
raise InvalidArguments('include_directories must use \'build\'-relative, ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the -relative
. From the user perspective, there are only build
and source
arguments to provide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I think what's going on is a couple of things that just might not fit entirely perfectly together, at least for you (and maybe others?) -
- I generally lean towards longer names when they can convey a little more meaningful information but I get the impression that others have a low tolerance to them and much prefer shorter names, even when there may sometimes be a little information lost).
- An error message presented to the user seems like a good opportunity to convey as much information and clarity to the user as possible; there's no benefit to being econoimical with error messages.
To me, an IncludeDirs parameter called 'build_relative' is just that little bit more immediately obvious as to what it does than one just called 'build'. However, 'build' and 'source' had already been suggested and so, rather than risk the ire of any meson gatekeepers from using longer names 😱 , it seemed best to go with what's been suggested, while taking the opportunity to be a bit more verbose/meaningful in the associated error message... but that would appear to have backfired with you 😆
Out of interest, if a variable is called 'build' and you understand it to mean 'build-relative', is it jarring to refer to it as "'build'-relative"? Also, would you prefer to name it 'build_relative' in all cases or only ever 'build' as its parameter name and in error messages, omitting the opporunity to explicitly clarify that we're really dealing with build-relative paths?
At this point, I really don't mind; whatever helps get this merged.
22d96cf
to
937f74b
Compare
I'm just wondering: Is it a problem that subsequent feedback and changes on this PR have led it to diverge a little from original description, which mentions adding a 'use_type' arg, despite the fact that preference in feedback has been for 2 bools, which is what we've ended up with? Would it help if I go back and edit the original description to better reflect this latest state? Also, it'd be good to know if I've overlooked any issues/objections people have with this so that I can address them or whether it should all be good to go and I just need to wait and keep on top of any master branch merge conflicts... or whether it ain't happening (and why) so we can close this. |
2bd05d8
to
858f3d9
Compare
Motivation: Frequent, unnecessary (and maybe even problematic) doubling up of both build and src-relative include search paths as discussed here: mesonbuild#11919 'build' specifies whether the include dirs should be treated as build- relative and 'src', whether source-relative. They both default to true to keep previous behaviour when unspecified. Added new 'test cases/common/267 include_directories use types' test (should these live under 'common'?). Added documentation and new feature snippet. Respect the new bools in vs2010backend (and adding more type annotations to help readability).
858f3d9
to
09d1676
Compare
(Updated this PR title and description to better reflect the current state through changes from review feedback.)
include_directories(...)
automatically and forcibly doubles up all include search dirs used (one for source-relative and one for build-relative), which is often (perhaps mostly) not what the user intends. I can't speak for others but I always know whether I intend my include_directories() to be either source or build-relative. For compilation to quietly double up the include dirs compile args with the complementary location of the one I intend is a source of potential problems as well as unnecessarily adding to include dir searching when compiling.This adds two new 'build' and 'source' bool args to include_directories() so the user can be more explicit and strict.
The default, when neither bools are specified, is the same, to add both build and src-relative, except with the added benefit that trivially duplicated directories (i.e. from using absolute paths) are now avoided.
Original description below -
Motivation: Frequent, unnecessary (and maybe even problematic) doubling up of both build and src-relative include search paths as discussed here: #11919
New 'use_type' keyword arg accepts 'src-and-build-relative' (default), 'src-relative', and 'build-relative' values, where the type appropriately restricts the include search paths used in compilation.
Added new 'test cases/common/264 include_directories use types' test (should these live under 'common' or somewhere else?). Added 'use_type' documentation and new feature snippet.
Respect IncludeDirsUse in vs2010backend (and adding more type annotations to help a little readability [and any type checking]).
I'm guessing I have to leave the 3 version number instances for this new feature in the hands of anyone who merges this PR, if it's taken, but have just gone with 1.2.0 for now. See -
As for possible performance impact of reduced number of include path compile args: On a medium/large sized project (40-something lib, executable, and custom targets comprising just under 3000 src and header files), even though there's a significant reduction in many compile include search path args, the total compile/build time is only 0.1s faster on a 68s total build time (i.e. negligible; within timing noise).