GNU expr syntax #1126
Hi! Over at uutils, we are happily using this library and highly appreciate all the work you all put in! However, we have a bit of a problem: GNU expr, which we try to be compatible with, uses a custom syntax (explained in the manual, see details element below). Of course, we don't expect this syntax to be supported by the

Many thanks!

Quote from the GNU manual about the syntax
Source: https://www.gnu.org/software/coreutils/manual/html_node/String-expressions.html
Replies: 1 comment 1 reply
Unfortunately, it's way worse than what you're picturing here. In theory, if it were just a matter of syntax, then you could write your own parser and lower it to `regex-syntax`'s `Hir` type (skipping its `Ast`). From there, you could build a regex from the `Hir` and you'd be good to go.

But "GNU expr" is, AIUI, built on top of POSIX regexes, and POSIX regexes have different match semantics for the same regexes. For example:
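A minimal, std-only sketch of the difference for an alternation of literals such as `sam|samwise`. This is not how the regex crate or any POSIX engine is implemented; the function names `leftmost_first` and `leftmost_longest` are hypothetical and exist only for illustration.

```rust
/// Leftmost-first: at the earliest position where any alternative matches,
/// take the alternative that appears first in the pattern.
fn leftmost_first<'h>(alternatives: &[&str], haystack: &'h str) -> Option<&'h str> {
    for (start, _) in haystack.char_indices() {
        let rest = &haystack[start..];
        // `find` returns the first alternative, in pattern order, that matches here.
        if let Some(alt) = alternatives.iter().find(|alt| rest.starts_with(**alt)) {
            return Some(&rest[..alt.len()]);
        }
    }
    None
}

/// Leftmost-longest (POSIX style): at the earliest position where any
/// alternative matches, take the longest one regardless of pattern order.
fn leftmost_longest<'h>(alternatives: &[&str], haystack: &'h str) -> Option<&'h str> {
    for (start, _) in haystack.char_indices() {
        let rest = &haystack[start..];
        let longest = alternatives
            .iter()
            .filter(|alt| rest.starts_with(**alt))
            .map(|alt| alt.len())
            .max();
        if let Some(len) = longest {
            return Some(&rest[..len]);
        }
    }
    None
}
```

With the alternatives `["sam", "samwise"]` and the haystack `samwise`, `leftmost_first` returns `sam` while `leftmost_longest` returns `samwise`. Swapping the order of the alternatives changes the leftmost-first result but not the leftmost-longest one.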
This is because POSIX uses leftmost-longest semantics, whereas this crate uses leftmost-first. You can see a description of these modes here: https://docs.rs/aho-corasick/latest/aho_corasick/enum.MatchKind.html

And note that it isn't limited to just literals. It applies to arbitrary regexes too:
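To make the non-literal case concrete, here is a toy, std-only matcher for the pattern `[a-z]{3}|[a-z]+[0-9]+`. Everything in it (the `Re` enum, `ends`, `find_with`, `demo_pattern`) is a hypothetical illustration, not the regex crate's API or algorithm. It lists every possible match end in the order a backtracking matcher would try them, so that leftmost-first is "take the first candidate" and leftmost-longest is "take the largest candidate".

```rust
/// A tiny regex AST, just large enough for this illustration.
enum Re {
    Range(char, char),
    Concat(Vec<Re>),
    Alt(Vec<Re>),
    /// Greedy repetition with `min..=max` iterations (`None` means unbounded).
    Repeat { sub: Box<Re>, min: usize, max: Option<usize> },
}

/// All end offsets at which `re` can match when started at `pos`,
/// in the order a backtracking matcher would try them.
fn ends(re: &Re, text: &[char], pos: usize) -> Vec<usize> {
    match re {
        Re::Range(lo, hi) => match text.get(pos) {
            Some(c) if lo <= c && c <= hi => vec![pos + 1],
            _ => vec![],
        },
        Re::Concat(parts) => parts.iter().fold(vec![pos], |starts, part| {
            starts.iter().flat_map(|&p| ends(part, text, p)).collect()
        }),
        Re::Alt(branches) => branches.iter().flat_map(|b| ends(b, text, pos)).collect(),
        Re::Repeat { sub, min, max } => repeat(sub, *min, *max, text, pos, 0),
    }
}

fn repeat(sub: &Re, min: usize, max: Option<usize>, text: &[char], pos: usize, count: usize) -> Vec<usize> {
    let mut out = Vec::new();
    // Greedy: try one more iteration before considering stopping here.
    if max.map_or(true, |m| count < m) {
        for end in ends(sub, text, pos) {
            if end > pos {
                // Skip empty iterations so the recursion always terminates.
                out.extend(repeat(sub, min, max, text, end, count + 1));
            }
        }
    }
    if count >= min {
        out.push(pos);
    }
    out
}

/// Scan start positions left to right; `pick` chooses among the candidate ends.
fn find_with(re: &Re, haystack: &str, pick: fn(Vec<usize>) -> Option<usize>) -> Option<String> {
    let text: Vec<char> = haystack.chars().collect();
    (0..=text.len()).find_map(|start| {
        pick(ends(re, &text, start)).map(|end| text[start..end].iter().collect())
    })
}

/// Leftmost-first: the first candidate in backtracking order wins.
fn leftmost_first(re: &Re, haystack: &str) -> Option<String> {
    find_with(re, haystack, |ends| ends.first().copied())
}

/// Leftmost-longest: the candidate with the greatest end offset wins.
fn leftmost_longest(re: &Re, haystack: &str) -> Option<String> {
    find_with(re, haystack, |ends| ends.into_iter().max())
}

/// Builds the AST for `[a-z]{3}|[a-z]+[0-9]+`.
fn demo_pattern() -> Re {
    let lower = || Re::Range('a', 'z');
    Re::Alt(vec![
        Re::Repeat { sub: Box::new(lower()), min: 3, max: Some(3) },
        Re::Concat(vec![
            Re::Repeat { sub: Box::new(lower()), min: 1, max: None },
            Re::Repeat { sub: Box::new(Re::Range('0', '9')), min: 1, max: None },
        ]),
    ])
}
```

On the haystack `samwise42`, `leftmost_first(&demo_pattern(), ...)` yields `sam` because the first branch succeeds at offset 0, while `leftmost_longest` yields `samwise42`. The exhaustive enumeration is exponential in the worst case, so this is only suitable for demonstrating the semantics, not for real matching.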
In theory, I want to support leftmost-longest semantics in the regex engine. But it's a herculean task, and unfortunately I have the time neither to do it myself nor to mentor it. There may be other differences in match semantics that I'm unaware of. You also have the whole locale mess to contend with, which this crate won't help you with.

IMO, your quickest and easiest path is to fork