-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intarray #2151
Intarray #2151
Conversation
…red font metric data, since 2023's expl3 code is using it for intarray storage
…xtensions (in Perl)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on a surface read! I left some low-level perl comments.
lib/LaTeXML/Package/expl3.lua.ltxml
Outdated
my ($stomach, $pos, $table, $value, $ifmissing) = @_; | ||
my $length = scalar(@$table); | ||
$pos = $pos->valueOf; | ||
if (($pos > 0) && ($pos < $length)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an off-by-one error here, or am I misreading the bounds?
Say, if $table = [SomeSingleElement];
and $pos
is 1
, then $length
is 1
as well, but $pos == $length
. I think we won't be able to assign an element in that slot?
So maybe $pos < $length
should be $pos <= $length
?
There are a number of other subs with this $$table[$pos - 1]
kind of indexing, so posing the question in general for all of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be easier to use 1-indexed arrays throughout the file (along with the comment to that affect). We'd need scalar(@$table)-1
or $#table
when we need the length, and I don't think $table[0]
being undef
would cause problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But all that wasted space!?!?! I think I'll be better off keeping the Perl arrays 0-based, but do need to be careful about the limits. Patch coming.
lib/LaTeXML/Package/TeX.pool.ltxml
Outdated
my $f = ($at ? $at->divide(Dimension('1em'))->valueOf | ||
: ($scaled ? $scaled->valueOf / 1000 | ||
: 1)); | ||
my $fontinfo = {%props}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why not my $fontinfo = \%props;
?
I think that avoids cloning the hash, and if I'm reading the code right %props
has no further uses that may cause data race conflicts.
lib/LaTeXML/Package/expl3.lua.ltxml
Outdated
my $tok = $gullet->readXToken; | ||
if (my $defn = LookupDefinition($tok)) { # token may be \let to the identifier!!! | ||
$tok = $defn->getCS; } | ||
if ($tok && $tok->equals(T_CS('\__intarray:w'))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, the 3 lines here seem to be a good possible use of the new $tok->defined_as(T_CS('\__intarray:w'))
, right?
The current code looks correct too, just marking the opportunity.
lib/LaTeXML/Package/expl3.lua.ltxml
Outdated
DefMacroI('\__intarray_range_to_clist:w', 'Intarray Number Number', sub { | ||
my ($stomach, $table, $from, $to) = @_; | ||
$from = $from->valueOf - 1; | ||
$to = $to->valueOf - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rare edge case, but if either of the input numbers happens to be (erroneously?) zero 0
, the $from
or $to
index will be -1
, which will lead to a rather peculiar indexing in perl. It may be worth guarding them to be within the allowable range of $table
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the expl3 code usually has the public functions do the input validation, and then the __private
functions don't need to worry about that. The corresponding lua code doesn't validate the inputs. (I've also not noticed a public version of this function, so it might all be moot.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
lib/LaTeXML/Package/expl3.lua.ltxml
Outdated
return; }, | ||
protected => 1); | ||
|
||
DefPrimitiveI('\intarray_gzero:w', 'Intarray', sub { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My dtx file is defining \intarray_gzero:N
at this point, and no \intarray_gzero:w
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't completely clear on the lua interface, so I just implemented (or didn't) everything related to intarray
.
$to = $to->valueOf - 1; | ||
return Tokenize(join(', ', @$table[$from .. $to])); }); | ||
|
||
DefPrimitiveI('\__intarray_gset_range:w', 'Number Intarray', sub { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since \__kernel_intarray_gset_range_from_clist:Nnn
calls \__intarray_gset_range:w \int_eval:w #2 #1 #3 , , \scan_stop:
, I think this will leave #3 , , \scan_stop:
in the token stream. Should this be something like
DefPrimitiveI('\__intarray_gset_range:w',
LaTeXML::Core::Parameters->new(
LaTeXML::Core::Parameter->new('Number'),
LaTeXML::Core::Parameter->new('Intarray'),
LaTeXML::Core::Parameter->new('Until','Until:\scan_stop:',extra=>[T_CS('\scan_stop:')]) ),
sub {
... } );
On the other hand, there doesn't appear to be a public version of the function, so I'm not sure how often this will get called (also, being unimplemented will throw an error anyway).
$tok = $defn->getCS; } | ||
if ($tok && $tok->equals(T_CS('\__intarray:w'))) { | ||
my $n = $gullet->readNumber; | ||
my $address = '__intarray' . '_' . ToString($tok) . '_' . $n->valueOf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really a bug, but having every $address
begin with __intarray_\__intarray:w_
seems a bit redundant. (Unless there's some other way to get an intarray that I wasn't finding.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I wasn't completely clear on exactly how these were being marked from the TeX end (or lua end), so I went with redundant just in case.
…iven file if (& before) it is loaded
…r with expl3's \@expl@push@filename@aux@@ dangerous way of knowing the file that is about to be loaded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
I think this one's ready, if you guys agree. |
We do still need I'm still trying to dig into why it takes so long to go through UnicodeData, but that shouldn't stop this from merging. |
Thanks for the reviews. I still would like to come back later to #2109 to see if we can speed things up, but given the evolving nature of latex3, I'd want to avoid building in too much expl3 code (even translated into perl); it'll likely be different tomorrow. |
This PR provides the
intarray
infrastructure needed by the latest (2023) TeXLive versions ofexpl3
. It no longer fails with errors, but still takes way too long to run (working on that).This partially, but not completely, addresses #2044 and #2064.