Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intarray #2151

Merged
merged 8 commits into from
Jul 20, 2023
Merged

Intarray #2151

merged 8 commits into from
Jul 20, 2023

Conversation

brucemiller
Copy link
Owner

This PR provides the intarray infrastructure needed by the latest (2023) TeXLive versions of expl3. It no longer fails with errors, but still takes way too long to run (working on that).

This partially, but not completely, addresses #2044 and #2064.

@brucemiller brucemiller requested a review from dginev July 14, 2023 01:11
@dginev
Copy link
Collaborator

dginev commented Jul 14, 2023

I would also like to invite @teepeemm to offer a review here (if he's willing and interested), given that he did very relevant overlapping work in #2109 . No obligation to do so, but he's welcome to!

Copy link
Collaborator

@dginev dginev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on a surface read! I left some low-level perl comments.

my ($stomach, $pos, $table, $value, $ifmissing) = @_;
my $length = scalar(@$table);
$pos = $pos->valueOf;
if (($pos > 0) && ($pos < $length)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an off-by-one error here, or am I misreading the bounds?

Say, if $table = [SomeSingleElement]; and $pos is 1, then $length is 1 as well, but $pos == $length. I think we won't be able to assign an element in that slot?

So maybe $pos < $length should be $pos <= $length ?

There are a number of other subs with this $$table[$pos - 1] kind of indexing, so posing the question in general for all of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be easier to use 1-indexed arrays throughout the file (along with the comment to that affect). We'd need scalar(@$table)-1 or $#table when we need the length, and I don't think $table[0] being undef would cause problems.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But all that wasted space!?!?! I think I'll be better off keeping the Perl arrays 0-based, but do need to be careful about the limits. Patch coming.

my $f = ($at ? $at->divide(Dimension('1em'))->valueOf
: ($scaled ? $scaled->valueOf / 1000
: 1));
my $fontinfo = {%props};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why not my $fontinfo = \%props; ?
I think that avoids cloning the hash, and if I'm reading the code right %props has no further uses that may cause data race conflicts.

lib/LaTeXML/Package/TeX.pool.ltxml Outdated Show resolved Hide resolved
my $tok = $gullet->readXToken;
if (my $defn = LookupDefinition($tok)) { # token may be \let to the identifier!!!
$tok = $defn->getCS; }
if ($tok && $tok->equals(T_CS('\__intarray:w'))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, the 3 lines here seem to be a good possible use of the new $tok->defined_as(T_CS('\__intarray:w')), right?

The current code looks correct too, just marking the opportunity.

DefMacroI('\__intarray_range_to_clist:w', 'Intarray Number Number', sub {
my ($stomach, $table, $from, $to) = @_;
$from = $from->valueOf - 1;
$to = $to->valueOf - 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a rare edge case, but if either of the input numbers happens to be (erroneously?) zero 0, the $from or $to index will be -1, which will lead to a rather peculiar indexing in perl. It may be worth guarding them to be within the allowable range of $table here as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the expl3 code usually has the public functions do the input validation, and then the __private functions don't need to worry about that. The corresponding lua code doesn't validate the inputs. (I've also not noticed a public version of this function, so it might all be moot.)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto.

return; },
protected => 1);

DefPrimitiveI('\intarray_gzero:w', 'Intarray', sub {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My dtx file is defining \intarray_gzero:N at this point, and no \intarray_gzero:w.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't completely clear on the lua interface, so I just implemented (or didn't) everything related to intarray.

$to = $to->valueOf - 1;
return Tokenize(join(', ', @$table[$from .. $to])); });

DefPrimitiveI('\__intarray_gset_range:w', 'Number Intarray', sub {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since \__kernel_intarray_gset_range_from_clist:Nnn calls \__intarray_gset_range:w \int_eval:w #2 #1 #3 , , \scan_stop:, I think this will leave #3 , , \scan_stop: in the token stream. Should this be something like

DefPrimitiveI('\__intarray_gset_range:w',
  LaTeXML::Core::Parameters->new(
    LaTeXML::Core::Parameter->new('Number'),
    LaTeXML::Core::Parameter->new('Intarray'),
    LaTeXML::Core::Parameter->new('Until','Until:\scan_stop:',extra=>[T_CS('\scan_stop:')]) ),
  sub {
    ... } );

On the other hand, there doesn't appear to be a public version of the function, so I'm not sure how often this will get called (also, being unimplemented will throw an error anyway).

$tok = $defn->getCS; }
if ($tok && $tok->equals(T_CS('\__intarray:w'))) {
my $n = $gullet->readNumber;
my $address = '__intarray' . '_' . ToString($tok) . '_' . $n->valueOf;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really a bug, but having every $address begin with __intarray_\__intarray:w_ seems a bit redundant. (Unless there's some other way to get an intarray that I wasn't finding.)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I wasn't completely clear on exactly how these were being marked from the TeX end (or lua end), so I went with redundant just in case.

Copy link
Collaborator

@dginev dginev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@brucemiller
Copy link
Owner Author

I think this one's ready, if you guys agree.

@teepeemm
Copy link
Contributor

We do still need \intarray_gzero:N, which can use the :w definition. I'm moderately certain that we don't need \intarray_gzero:w.

I'm still trying to dig into why it takes so long to go through UnicodeData, but that shouldn't stop this from merging.

@brucemiller
Copy link
Owner Author

Thanks for the reviews. I still would like to come back later to #2109 to see if we can speed things up, but given the evolving nature of latex3, I'd want to avoid building in too much expl3 code (even translated into perl); it'll likely be different tomorrow.

@brucemiller brucemiller merged commit 202aced into master Jul 20, 2023
@brucemiller brucemiller deleted the intarray branch July 20, 2023 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants