Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830) #788

dpvc · 2022-02-23T17:49:45Z

This PR improves the implementation of \operatorname (and \DeclareMathOperator) to work more like in true LaTeX. In the past, the macro replaced - and * by \text{-} and \text{*} in order to handle what LaTeX dues using \catcode. Here, we take advantage of features added to ParseMethods.variable to get \mathrm{} to produce multi-character mi elements to get \operatorname to do the same. The multiLetterIdentifiers environment value has been replaced by a regular expression for the characters to use (and EnvProp is extended to allow that). Then we make a new map that matches - and * and checks whether we are in an \operatorname, and if so, parses the characters like variables, otherwise parses them as usual.

The \operatorname macro is rewritten to parse the arguments itself, rather than using a replacement macro internally, so that we can set the multiLetterIdentifiers regular expression and the operatorLetters environment property. This also means we can handle the following \limits directly rather than needing the extra \SkipLimits macro to do that.

Finally, \DeclareMathOperator is set up to define the macro in terms of \operatorname rather than \mathop.

Resolves issue mathjax/MathJax#2830

dpvc · 2022-02-23T18:23:22Z

PS, this PR is to the html-in-tex branch because it needs to fixes to the TeX parser that allows returning false to continue looking through the symbol maps. I could try to factor that out into a separate PR, but that would require rebasing the html-in-mml and html-in-tex branches, and I didn't want to do that in case you have already been commenting on them.

zorkow · 2022-04-08T08:42:53Z

PS, this PR is to the html-in-tex branch because it needs to fixes to the TeX parser that allows returning false to continue looking through the symbol maps. I could try to factor that out into a separate PR, but that would require rebasing the html-in-mml and html-in-tex branches, and I didn't want to do that in case you have already been commenting on them.

No worries, I think it is easier to review like that.

zorkow

Looks pretty good, but I have one major issue with the PR:
The extension of the EnvProp type to include a non-primitive type. Can't we pass the multiLetterIdentfiers as a string from which the RegExp is created? In addition, can we spell out the regular expression. I find the i modifier very dodgy as it is implementation dependent what it does on non-ASCII characters representing letters. As an example

         'a𝐚A𝐀'.match(/[a𝐚]+/)

vs

         'a𝐚A𝐀'.match(/[a𝐚]+/i)

The latter leads to very odd behaviour.

zorkow · 2022-04-08T08:44:56Z

ts/input/tex/ParseMethods.ts

-      c = parser.string.substr(parser.i - 1).match(/^[a-z]+/i)[0];
+    const env = parser.stack.env;
+    if (env.multiLetterIdentifiers && env.font !== '') {
+      c = parser.string.substr(parser.i - 1).match(env.multiLetterIdentifiers as RegExp)[0];


multiLetterIdentifiers could be a non-empty string.

zorkow · 2022-04-08T09:02:07Z

ts/input/tex/StackItem.ts

@@ -28,7 +28,7 @@ import TexError from './TexError.js';
 import StackItemFactory from './StackItemFactory.js';

 // Union types for abbreviation.
-export type EnvProp = string | number | boolean;
+export type EnvProp = string | number | boolean | RegExp;


I don't like that. I'd like to keep that to primitive types.

Other than aesthetic reasons, is there a practical advantage to that? I'm not sure why the environment properties can't be any type of data that is needed.

dpvc · 2022-04-08T12:28:31Z

Can't we pass the multiLetterIdentfiers as a string from which the RegExp is created?

We can, but that means the RegExp has to be re-compiled every time it is used, whereas with a constant RegExp, it is only compiled once, so that is more efficient. That was the main reason to do it this way.

In addition, can we spell out the regular expression.

I'm not sure what that means.

I find the i modifier very dodgy as it is implementation dependent what it does on non-ASCII characters representing letters

But the examples you give aren't the RegExp being used. That is just /^[a-z]+/i, which doesn't have those issues, as it only specifies Ascii characters. In any case, if you want to handle Unicode better, you would want to use u in addition to i. I can add the u if you like.

zorkow · 2022-04-16T08:42:24Z

As per f2f I am happy with extending the type by RegExp.

In addition, can we spell out the regular expression.

I'm not sure what that means.

For the record: That just means I prefer /a-zA-Z/ over /a-z/,i.
I generally prefer to avoid i as it makes the meaning of regexps unclear.
If u is now consistently implemented everywhere, that is helpful, although I really believe there should not even be a need for u.

dpvc · 2022-04-18T21:50:00Z

Thank for the clarification. I'm adjusting the RegExp.

Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830)

9a7304c

dpvc requested a review from zorkow February 23, 2022 17:49

dpvc added this to the 3.2.1 milestone Apr 6, 2022

zorkow requested changes Apr 8, 2022

View reviewed changes

zorkow approved these changes Apr 16, 2022

View reviewed changes

Spell out regular expression, as per Volker's request

c93d799

dpvc merged commit 4a6ad26 into html-in-tex Apr 18, 2022

dpvc deleted the issue2830 branch April 18, 2022 21:50

dpvc mentioned this pull request May 11, 2022

Allow return value of false to be used in SymbolMap parsers to continue the lookup process. #805

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830) #788

Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830) #788

dpvc commented Feb 23, 2022 •

edited

Loading

dpvc commented Feb 23, 2022

zorkow commented Apr 8, 2022

zorkow left a comment

zorkow Apr 8, 2022

zorkow Apr 8, 2022

dpvc Apr 8, 2022

dpvc commented Apr 8, 2022

zorkow commented Apr 16, 2022

dpvc commented Apr 18, 2022

Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830) #788

Update \operatorname to work more like in LaTeX. (mathjax/MathJax#2830) #788

Conversation

dpvc commented Feb 23, 2022 • edited Loading

dpvc commented Feb 23, 2022

zorkow commented Apr 8, 2022

zorkow left a comment

Choose a reason for hiding this comment

zorkow Apr 8, 2022

Choose a reason for hiding this comment

zorkow Apr 8, 2022

Choose a reason for hiding this comment

dpvc Apr 8, 2022

Choose a reason for hiding this comment

dpvc commented Apr 8, 2022

zorkow commented Apr 16, 2022

dpvc commented Apr 18, 2022

dpvc commented Feb 23, 2022 •

edited

Loading