Plain fonts #2411

Merged
merged 17 commits into from
Sep 18, 2024
Changes from 16 commits
Commits
bb30324
New Definition class FontDef for font selecting commands defined by \…
brucemiller Sep 3, 2024
71218a7
Rename parameter type FontToken to FontDef for cs explicitly defined …
brucemiller Sep 3, 2024
1badc03
Update \meaning to recognize new FontDef font commands
brucemiller Sep 3, 2024
35c2a2e
Update \the to recognize FontDef commands defined by \font and proces…
brucemiller Sep 3, 2024
6f7f868
Moved decodeMathChar to Package.pm, updating it to more correctly dec…
brucemiller Sep 3, 2024
b27e41a
Note that \mit doesn't REQUIRE math, but only has effect in math (sets…
brucemiller Sep 3, 2024
91ef537
Consistent use of font decoding makes apparent misuse of T_OTHER when…
brucemiller Sep 3, 2024
73fe017
Fix mangled renesting of if/else
brucemiller Sep 3, 2024
e9bf750
\cal also does not require math and does nothing in text
brucemiller Sep 3, 2024
c764042
Add test case for plain style font manipulations
brucemiller Sep 3, 2024
f6afd64
Improve decoding of font filenames into family/series/shape and IMPOR…
brucemiller Sep 16, 2024
af96721
Update FontMap to provide options for alphanumerics to remain ASCII i…
brucemiller Sep 16, 2024
8d586d5
Make FontDecode in math keep alphanumerics in math as ASCII w/font ch…
brucemiller Sep 16, 2024
3f6ab14
Update all callers of FontDecode
brucemiller Sep 16, 2024
1a27ba6
Make \cal return a Box so that it can revert
brucemiller Sep 16, 2024
6317280
Enhance and correct plain fonts test cases
brucemiller Sep 16, 2024
2a0bd12
Code cleanup suggested by D.Ginev
brucemiller Sep 18, 2024
4 changes: 4 additions & 0 deletions MANIFEST
@@ -54,6 +54,7 @@ lib/LaTeXML/Core/Definition.pm
lib/LaTeXML/Core/Definition/Expandable.pm
lib/LaTeXML/Core/Definition/Conditional.pm
lib/LaTeXML/Core/Definition/Primitive.pm
lib/LaTeXML/Core/Definition/FontDef.pm
lib/LaTeXML/Core/Definition/Register.pm
lib/LaTeXML/Core/Definition/CharDef.pm
lib/LaTeXML/Core/Definition/Constructor.pm
@@ -1352,6 +1353,9 @@ t/fonts/mixed.xml
t/fonts/omencodings.pdf
t/fonts/omencodings.tex
t/fonts/omencodings.xml
t/fonts/plainfonts.pdf
t/fonts/plainfonts.tex
t/fonts/plainfonts.xml
t/fonts/sizes.pdf
t/fonts/sizes.tex
t/fonts/sizes.xml
151 changes: 87 additions & 64 deletions lib/LaTeXML/Common/Font.pm
@@ -52,6 +52,7 @@ my $FLAG_EMPH = 0x10;
# Mappings from various forms of names or component names in TeX
# Given a font, we'd like to map it to the "logical" names derived from LaTeX,
# (w/ loss of fine-grained control),
# and (importantly) the encoding needed to look up Unicode in a FontMap!
# I'd like to use Karl Berry's font naming scheme
# (See http://www.tug.org/fontname/html/)
# but it seems to be a one-way mapping, and moreover, doesn't even fit CM fonts!
@@ -60,61 +61,58 @@ my $FLAG_EMPH = 0x10;
# NOTE: This probably doesn't really belong in here...

my %font_family = (
cmr => { family => 'serif' },
cmss => { family => 'sansserif' },
cmssq => { family => 'sansserif' }, # quote style?
cmssqi => { family => 'sansserif', shape => 'italic' }, # quote style?
cmtt => { family => 'typewriter' }, cmvtt => { family => 'typewriter' },
cmt => { family => 'serif' }, # for cmti "text italic"
cmfib => { family => 'serif' },
cmfr => { family => 'serif' },
cm => { family => 'serif' },
cmdh => { family => 'serif' },
cmr => { family => 'serif' },
cmdunh => { family => 'serif' }, # like cmr10 but with tall body heights
cmu => { family => 'serif' }, # unslanted italic ??
ptm => { family => 'serif' }, ppl => { family => 'serif' },
pnc => { family => 'serif' }, pbk => { family => 'serif' },
phv => { family => 'sansserif' }, pag => { family => 'serif' },
pcr => { family => 'typewriter' }, pzc => { family => 'script' },
put => { family => 'serif' }, bch => { family => 'serif' },
psy => { family => 'symbol' }, pzd => { family => 'dingbats' },
ccr => { family => 'serif' }, ccy => { family => 'symbol' },
cmbr => { family => 'sansserif' }, cmtl => { family => 'typewriter' },
cmbrs => { family => 'symbol' }, ul9 => { family => 'typewriter' },
txr => { family => 'serif' }, txss => { family => 'sansserif' },
txtt => { family => 'typewriter' }, txms => { family => 'symbol' },
txsya => { family => 'symbol' }, txsyb => { family => 'symbol' },
pxr => { family => 'serif' }, pxms => { family => 'symbol' },
pxsya => { family => 'symbol' }, pxsyb => { family => 'symbol' },
futs => { family => 'serif' },
uaq => { family => 'serif' }, ugq => { family => 'sansserif' },
eur => { family => 'serif' }, eus => { family => 'script' },
euf => { family => 'fraktur' }, euex => { family => 'symbol' },
# The following are actually math fonts.
ms => { family => 'symbol' },
ccm => { family => 'serif', shape => 'italic' },
cmm => { family => 'math', shape => 'italic', encoding => 'OML' },
cmex => { family => 'symbol', encoding => 'OMX' }, # Not really symbol, but...
cmsy => { family => 'symbol', encoding => 'OMS' },
ccitt => { family => 'typewriter', shape => 'italic' },
cmsltt => { family => 'typewriter', shape => 'slanted' },
cmbrm => { family => 'sansserif', shape => 'italic' },
futm => { family => 'serif', shape => 'italic' },
futmi => { family => 'serif', shape => 'italic' },
txmi => { family => 'serif', shape => 'italic' },
pxmi => { family => 'serif', shape => 'italic' },
bbm => { family => 'blackboard' },
bbold => { family => 'blackboard' },
bbmss => { family => 'blackboard' },
# some ams fonts
cmmib => { family => 'italic', series => 'bold' },
cmbsy => { family => 'symbol', series => 'bold' },
msa => { family => 'symbol', encoding => 'AMSa' },
msb => { family => 'symbol', encoding => 'AMSb' },
# Are these really the same?
msx => { family => 'symbol', encoding => 'AMSa' },
msy => { family => 'symbol', encoding => 'AMSb' },
# Computer Modern
cm => { family => 'serif' }, # base for synthesizing cmbx, cmsl ...
cmr => { family => 'serif' },
cmm => { family => 'math', shape => 'italic', encoding => 'OML' }, # cmmi
cmsy => { encoding => 'OMS' },
cmex => { encoding => 'OMX' },
cmss => { family => 'sansserif' },
cmtt => { family => 'typewriter' },
cmvtt => { family => 'typewriter' },
cmssq => { family => 'sansserif' }, # quote style?
cmssqi => { family => 'sansserif', shape => 'italic' }, # quote style?
cmt => { family => 'serif' }, # for cmti "text italic"
cmmib => { family => 'italic', series => 'bold' },
cmbsy => { series => 'bold', encoding => 'OMS' },
cmfib => { family => 'serif' },
cmfr => { family => 'serif' },
cmdh => { family => 'serif' },
cmdunh => { family => 'serif' }, # like cmr10 but with tall body heights
cmu => { family => 'serif' }, # unslanted italic ??
cmsltt => { family => 'typewriter', shape => 'slanted' },
cmbrm => { family => 'sansserif', shape => 'italic' },
# Some Blackboard Bold fonts
bbm => { family => 'blackboard' },
bbold => { family => 'blackboard' },
bbmss => { family => 'blackboard' },
# Computer Concrete
ccr => { family => 'serif' },
ccm => { family => 'serif', shape => 'italic' },
cct => { family => 'serif' },
ccitt => { family => 'typewriter', shape => 'italic' },
# AMS fonts
msa => { encoding => 'AMSa' },
msb => { encoding => 'AMSb' },
msx => { encoding => 'AMSa' }, # Are these really the same? (or even real?)
msy => { encoding => 'AMSb' },
# Euler
eur => { family => 'serif' },
eus => { family => 'script' },
euf => { family => 'fraktur' },
euex => { encoding => 'OMX' },
# TX Fonts (Times Roman)
txr => { family => 'serif' },
txmi => { family => 'serif', shape => 'italic' },
txss => { family => 'sansserif' },
txtt => { family => 'typewriter' },
txsya => { encoding => 'AMSa' },
txsyb => { encoding => 'AMSb' },
# PX Fonts (Palladio)
pxr => { family => 'serif' },
pxmi => { family => 'serif', shape => 'italic' },
pxsya => { encoding => 'AMSa' },
pxsyb => { encoding => 'AMSb' },
# Pretend to recognize xy's fonts
xydash => { family => 'graphic' },
xyatip => { family => 'graphic' },
@@ -125,17 +123,44 @@ my %font_family = (
xycmbt => { family => 'graphic' },
xyluat => { family => 'graphic' },
xylubt => { family => 'graphic' },
# Fourier
futm => { family => 'serif', shape => 'italic' },
futmi => { family => 'serif', shape => 'italic' },
# More fonts that need to be better sorted, classified & labelled
# family symbol, dingbats are nonsense: We need an encoding and FontMap!!!
ptm => { family => 'serif' }, ppl => { family => 'serif' },
pnc => { family => 'serif' }, pbk => { family => 'serif' },
phv => { family => 'sansserif' }, pag => { family => 'serif' },
pcr => { family => 'typewriter' }, pzc => { family => 'script' },
put => { family => 'serif' }, bch => { family => 'serif' },
psy => { family => 'symbol' }, pzd => { family => 'dingbats' },
cmbr => { family => 'sansserif' }, cmtl => { family => 'typewriter' },
cmbrs => { family => 'symbol' }, ul9 => { family => 'typewriter' },
futs => { family => 'serif' },
uaq => { family => 'serif' }, ugq => { family => 'sansserif' },
);

# Maps the "series code" to an abstract font series name
my %font_series = (
'' => { series => 'medium' }, m => { series => 'medium' }, mc => { series => 'medium' },
b => { series => 'bold' }, bc => { series => 'bold' }, bx => { series => 'bold' },
sb => { series => 'bold' }, sbc => { series => 'bold' }, bm => { series => 'bold' });
'' => {}, # default medium
m => { series => 'medium' },
mc => { series => 'medium' },
b => { series => 'bold' },
bc => { series => 'bold' },
bx => { series => 'bold' },
sb => { series => 'bold' },
sbc => { series => 'bold' },
bm => { series => 'bold' });

# Maps the "shape code" to an abstract font shape name.
my %font_shape = ('' => { shape => 'upright' }, n => { shape => 'upright' }, i => { shape => 'italic' }, it => { shape => 'italic' },
sl => { shape => 'slanted' }, sc => { shape => 'smallcaps' }, csc => { shape => 'smallcaps' });
my %font_shape = (
'' => {}, # default upright
n => { shape => 'upright' },
i => { shape => 'italic' },
it => { shape => 'italic' },
sl => { shape => 'slanted' },
sc => { shape => 'smallcaps' },
csc => { shape => 'smallcaps' });

# These could be exported...
sub lookupFontFamily {
@@ -181,7 +206,7 @@ my $FONTREGEXP
sub decodeFontname {
my ($name, $at, $scaled) = @_;
if ($name =~ /^$FONTREGEXP$/o) {
my %props;
my %props = (series => 'medium', shape => 'upright', encoding => 'OT1');
my ($fam, $ser, $shp, $size) = ($1, $2, $3, $4);
if (my $ffam = lookupFontFamily($fam)) { map { $props{$_} = $$ffam{$_} } keys %$ffam; }
if (my $fser = lookupFontSeries($ser)) { map { $props{$_} = $$fser{$_} } keys %$fser; }
@@ -191,8 +216,6 @@ sub decodeFontname {
$size = $size * $scaled if defined $scaled;
$props{name} = $name;
$props{size} = $size;
# Experimental Hack !?!?!?
$props{encoding} = 'OT1' unless defined $props{encoding};
return %props; }
else {
Info('unrecognized', 'font', undef, "Unrecognized fontname '$name'");
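
Note on the decodeFontname change: the defaults (series, shape, encoding) now seed %props before any table is consulted, and the '' entries of %font_series and %font_shape are empty, so an absent series or shape code no longer overwrites what the family entry supplied. The following is a minimal, self-contained sketch of that ordering. It assumes the lookups are applied in the order family, series, shape (the shape lookup falls between the two hunks and is not shown), and it uses a hypothetical pattern $SKETCH_REGEXP and a hypothetical sub decode_fontname_sketch, since the real $FONTREGEXP is not part of this diff. The tables are small excerpts, not the full ones.

```perl
use strict;
use warnings;

# Small excerpts of the lookup tables shown in the diff (not the full tables).
my %font_family = (
  cm   => { family => 'serif' },
  cmr  => { family => 'serif' },
  cmm  => { family => 'math', shape => 'italic', encoding => 'OML' },
  cmsy => { encoding => 'OMS' },
  cmtt => { family => 'typewriter' },
);
my %font_series = ('' => {}, b => { series => 'bold' }, bx => { series => 'bold' });
my %font_shape  = ('' => {}, i => { shape => 'italic' }, sl => { shape => 'slanted' });

# HYPOTHETICAL pattern for illustration only; the real $FONTREGEXP is not shown here.
my $SKETCH_REGEXP = qr/(cmsy|cmtt|cmm|cmr|cm)(bx|b|)(sl|i|)(\d*)/;

# HYPOTHETICAL helper mirroring the order of operations in the patched decodeFontname.
sub decode_fontname_sketch {
  my ($name) = @_;
  return unless $name =~ /^$SKETCH_REGEXP$/;
  my ($fam, $ser, $shp, $size) = ($1, $2, $3, $4);
  # Defaults come first ...
  my %props = (series => 'medium', shape => 'upright', encoding => 'OT1');
  # ... then each table entry may override them; empty entries override nothing.
  for my $entry ($font_family{$fam}, $font_series{$ser}, $font_shape{$shp}) {
    %props = (%props, %$entry) if $entry;
  }
  $props{name} = $name;
  $props{size} = $size if length $size;
  return %props;
}

my %bold = decode_fontname_sketch('cmbx10');
print join(', ', map { "$_=$bold{$_}" } sort keys %bold), "\n";
```

Under these assumptions, 'cmbx10' keeps the default OT1 encoding and upright shape while picking up series bold, and 'cmsy10' gets encoding OMS with no family at all, which matches the intent of dropping the 'symbol' pseudo-family from the table.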
@@ -251,7 +274,7 @@ sub textDefault {
sub mathDefault {
my ($self) = @_;
return $self->new_internal('math', $DEFSERIES, 'italic', DEFSIZE(),
$DEFCOLOR, $DEFBACKGROUND, $DEFOPACITY, undef, $DEFLANGUAGE, 'text', 0); }
$DEFCOLOR, $DEFBACKGROUND, $DEFOPACITY, 'OT1', $DEFLANGUAGE, 'text', 0); }

# Accessors
# Using an array here is getting ridiculous!
4 changes: 4 additions & 0 deletions lib/LaTeXML/Core/Definition.pm
@@ -23,6 +23,7 @@ use base qw(LaTeXML::Common::Object);
require LaTeXML::Core::Definition::Expandable;
require LaTeXML::Core::Definition::Conditional;
require LaTeXML::Core::Definition::Primitive;
require LaTeXML::Core::Definition::FontDef;
require LaTeXML::Core::Definition::Register;
require LaTeXML::Core::Definition::CharDef;
require LaTeXML::Core::Definition::Constructor;
@@ -52,6 +53,9 @@ sub isExpandable {
sub isRegister {
return ''; }

sub isFontDef { # ONLY FontDef handles this!
return ''; }

sub isPrefix {
return 0; }

7 changes: 4 additions & 3 deletions lib/LaTeXML/Core/Definition/CharDef.pm
@@ -54,10 +54,11 @@ sub invoke {
($local ? Tokens(T_CS('\mathchar'), $value->revert, T_CS('\relax')) : $$self{cs}),
role => $$self{role}); }
else { # else text; but note deferred font/encoding till digestion!
my ($char, %props) = LaTeXML::Package::FontDecode($value->valueOf);
return Box($char, undef, undef,
# Decode the codepoint using current font & encoding
my ($glyph, $adjfont) = LaTeXML::Package::FontDecode($value->valueOf);
return Box($glyph, $adjfont, undef,
($local ? Tokens(T_CS('\char'), $value->revert, T_CS('\relax')) : $$self{cs}),
%props); } }
); } }

sub equals {
my ($self, $other) = @_;
74 changes: 74 additions & 0 deletions lib/LaTeXML/Core/Definition/FontDef.pm
@@ -0,0 +1,74 @@
# /=====================================================================\ #
# | LaTeXML::Core::Definition::FontDef | #
# | Representation of definitions of Fonts | #
# |=====================================================================| #
# | Part of LaTeXML: | #
# | Public domain software, produced as part of work done by the | #
# | United States Government & not subject to copyright in the US. | #
# |---------------------------------------------------------------------| #
# | Bruce Miller <bruce.miller@nist.gov> #_# | #
# | http://dlmf.nist.gov/LaTeXML/ (o o) | #
# \=========================================================ooo==U==ooo=/ #
package LaTeXML::Core::Definition::FontDef;
use strict;
use warnings;
use LaTeXML::Global;
use LaTeXML::Common::Object;
use LaTeXML::Common::Error;
use LaTeXML::Core::Token;
use LaTeXML::Core::Tokens;
use LaTeXML::Core::Box;
use base qw(LaTeXML::Core::Definition::Primitive);

# A FontDef is a specialized primitive for a control sequence defined by \font.
# You can't assign it; when you invoke the control sequence, it merges the
# font info stored under $fontid into the current font (see invoke below).
sub new {
my ($class, $cs, $fontid, %traits) = @_;
return bless { cs => $cs, parameters => undef,
fontID => $fontid,
locator => $STATE->getStomach->getGullet->getMouth->getLocator,
%traits }, $class; }

# Return the "font info" associated with the (TeX) font that this command selects (See \font)
sub isFontDef {
my ($self) = @_;
return $STATE->lookupValue($$self{fontID}); }

sub invoke {
my ($self, $stomach) = @_;
my $current = $STATE->lookupValue('font');
[Review comment, Collaborator] $current is never used?
if (my $fontinfo = $STATE->lookupValue($$self{fontID})) {
# Temporary hack for \the\font; remember the last font def executed
$STATE->assignValue(current_FontDef => $$self{cs}, 'local');
$STATE->assignValue(font => $STATE->lookupValue('font')->merge(%$fontinfo), 'local');
[Review comment, Collaborator] Maybe use $current here, or better yet keep as-is and remove the outer lookup (since it's only needed if $fontinfo has a value).
[Reply, Owner/Author] Done! Thanks!
}
return Box(undef, undef, undef, $$self{cs}); }

#===============================================================================
1;

__END__

=pod

=head1 NAME

C<LaTeXML::Core::Definition::FontDef> - Control sequence definitions for font symbols defined by \font.

=head1 DESCRIPTION

Representation for control sequences defined by \font.
It extends L<LaTeXML::Core::Definition::Primitive>.

=head1 AUTHOR

Bruce Miller <bruce.miller@nist.gov>

=head1 COPYRIGHT

Public domain software, produced as part of work done by the
United States Government & not subject to copyright in the US.

=cut
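
Note on the review thread in invoke: the suggestion was to drop the unused outer lookup of the current font and fetch it only inside the branch that needs it. The toy below sketches that shape. It is an assumption-laden illustration, not the merged code: %state is a hypothetical plain hash standing in for $STATE, fonts are plain hashes standing in for LaTeXML font objects, the hash merge stands in for the merge method, and the key 'font@tenbf' is an invented fontID.

```perl
use strict;
use warnings;

# HYPOTHETICAL toy state: a plain hash standing in for $STATE.
my %state = (
  font         => { family => 'serif', series => 'medium', size => 10 },
  'font@tenbf' => { series => 'bold' },    # invented fontID key and font info
);

# HYPOTHETICAL stand-in for FontDef::invoke after the review change.
sub invoke_sketch {
  my ($cs, $font_id) = @_;
  if (my $fontinfo = $state{$font_id}) {
    # Remember the last font def executed (the "\the\font" hack in the diff).
    $state{current_FontDef} = $cs;
    # The current font is looked up only here, where it is actually needed.
    $state{font} = { %{ $state{font} }, %$fontinfo };
  }
  return $cs;
}

invoke_sketch('\tenbf', 'font@tenbf');
print "series is now $state{font}{series}\n";
```

When no font info is stored under the given fontID, nothing is assigned and the current font is never touched, which is why the outer lookup was redundant.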
20 changes: 14 additions & 6 deletions lib/LaTeXML/Core/Stomach.pm
@@ -242,8 +242,13 @@ sub invokeToken_simple {
return LaTeXML::Core::Comment->new($comment); }
else {
$STATE->clearPrefixes; # prefixes shouldn't apply here.
return Box(LaTeXML::Package::FontDecodeString($meaning->toString, undef, 1),
undef, undef, $meaning); } }
if (my $mathcode = $STATE->lookupValue('IN_MATH')
&& $STATE->lookupMathcode($meaning->toString)) {
my ($role, $glyph, $f, $reversion) = LaTeXML::Package::decodeMathChar($mathcode, $meaning);
return Box($glyph, $f, undef, $reversion, role => $role); }
else {
return Box(LaTeXML::Package::FontDecodeString($meaning->toString, undef, 1),
undef, undef, $meaning); } } }
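
Note on the condition in this hunk: in Perl, && binds tighter than =, so $mathcode receives the value of the whole conjunction. It is the mathcode when IN_MATH is true and a mathcode exists, and a false value otherwise. The sketch below illustrates that with hypothetical stand-ins (%mathcodes, lookup_mathcode, mathcode_if_in_math) in place of the $STATE calls; the single table entry is the plain TeX mathcode of '+'.

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-in for $STATE->lookupMathcode(...).
my %mathcodes = ('+' => 0x202B);
sub lookup_mathcode { my ($char) = @_; return $mathcodes{$char}; }

# HYPOTHETICAL helper with the same condition shape as the hunk above.
sub mathcode_if_in_math {
  my ($in_math, $char) = @_;
  # Parsed as: my $mathcode = ($in_math && lookup_mathcode($char))
  if (my $mathcode = $in_math && lookup_mathcode($char)) {
    return $mathcode;
  }
  return;    # not in math, or no mathcode: caller falls back to font decoding
}

my $code = mathcode_if_in_math(1, '+');
printf "mathcode of '+' in math: \"%04X\n", $code;
```

One consequence of testing truthiness rather than definedness is that a mathcode of 0 would take the fallback branch; whether that case matters in practice is not something this diff settles.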

# Regurgitate: steal the previously digested boxes from the current level.
sub regurgitate {
@@ -359,10 +364,13 @@ sub setMode {
# and save the text font for any embedded text.
$STATE->assignValue(savedfont => $curfont, 'local');
$STATE->assignValue(script_base_level => scalar(@{ $$self{boxing} })); # See getScriptLevel
$STATE->assignValue(font => $STATE->lookupValue('mathfont')->merge(
color => $curfont->getColor, background => $curfont->getBackground,
size => $curfont->getSize,
mathstyle => ($mode =~ /^display/ ? 'display' : 'text')), 'local'); }
my $mathfont = $STATE->lookupValue('mathfont')->merge(
color => $curfont->getColor, background => $curfont->getBackground,
size => $curfont->getSize,
mathstyle => ($mode =~ /^display/ ? 'display' : 'text'));
$STATE->assignValue(font => $mathfont, 'local');
$STATE->assignValue(initial_math_font => $mathfont, 'local');
$STATE->assignValue(fontfamily => -1, 'local'); }
else {
# When entering text mode, we should set the font to the text font in use before the math
# but inherit color and size
28 changes: 0 additions & 28 deletions lib/LaTeXML/Engine/Base_ParameterTypes.pool.ltxml
@@ -283,34 +283,6 @@ DefParameterType('Variable', sub {
my $params = $defn->getParameters;
return Tokens($defn->getCS, ($params ? $params->revertArguments(@args) : ())); });

# Same, but not necessarily writable
DefParameterType('Register', sub {
my ($gullet) = @_;
my $token = $gullet->readXToken;
my $defn = $token && LookupDefinition($token);
if ((defined $defn) && $defn->isRegister) {
[$defn, ($$defn{parameters} ? $$defn{parameters}->readArguments($gullet) : ())]; }
else {
if ($token && ($token->getCatcode == CC_CS)) {
if ($token->getString eq '\font') {
# \font is a bit of a register-like exception
return [$defn]; }
Error('expected', '<register>', $gullet,
"A <register> was supposed to be here", "Got " . Stringify($token),
"Defining it now.");
DefRegisterI($token, undef, Dimension(0)); # Dimension, or what?
return [LookupDefinition($token)]; }
else {
Error('expected', '<register>', $gullet,
"A <register> was supposed to be here", "Got " . Stringify($token),
"But it is not even definable.");
return [LookupDefinition(T_CS('\lx@DUMMY@REGISTER'))]; } } },
reversion => sub {
my ($var) = @_;
my ($defn, @args) = @$var;
my $params = $defn->getParameters;
return Tokens($defn->getCS, ($params ? $params->revertArguments(@args) : ())); });

DefParameterType('TeXFileName', sub {
my ($gullet) = @_;
my ($token, $cc, @tokens) = ();