Add interpunct (·) as a keyword for dot product, etc. #3584

Open
wants to merge 3 commits into
base: development

d-torrance
Member

For this to work, we no longer count Unicode characters whose UTF-8 encoding begins with byte 194 as alphabetic (just as we already don't count characters beginning with byte 226 as alphabetic). Lead byte 194 covers U+0080 through U+00BF, which includes the "Latin-1 Punctuation and Symbols".

So afterwards, we can use this new keyword to define things like dot products without having to worry about surrounding it with spaces:

i1 : Vector · Vector := (v, w) -> ((transpose v#0) * w#0)_(0, 0);

i2 : v = vector {1, 2, 3}; w = vector {4, 5, 6};

       3
o2 : ZZ

       3
o3 : ZZ

i4 : v·w

o4 = 32

Closes: #3434

@mahrud
Member

mahrud commented Nov 18, 2024

A couple of questions:

  1. While you're doing this, could you also add exceptions for U+00D7 and U+00F7 for \times and \div?
  2. I don't understand where 194 is coming from. Latin-1 punctuation and symbols starts at U+00A0 which is 160, no?

@mahrud
Member

mahrud commented Nov 18, 2024

Also, this doesn't need to happen right now, but texMath symbol · should give \cdot (and maybe the same for a handful of others that have TeX names).

@d-torrance
Member Author

  1. While you're doing this, could you also add exceptions for U+00D7 and U+00F7 for \times and \div?

The problem with those two is that they're in with a bunch of characters that I think are definitely alphabetic:

i1 : apply(splice(150..160,180..190), x -> ascii {195, x})

o1 = (Ö, ×, Ø, Ù, Ú, Û, Ü, Ý, Þ, ß, à, ô, õ, ö, ÷, ø, ù, ú, û, ü, ý, þ)

I'm not sure which is more important: allowing characters like à in symbol names, or allowing × and ÷ as keywords.
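The byte layout behind this trade-off can be checked outside Macaulay2. The following is a minimal Python sketch, not part of the PR: it rebuilds the same two ranges of second bytes after lead byte 195 and uses Python's `str.isalpha` as a stand-in for "alphabetic", which is an assumption and not how the Macaulay2 lexer decides.

```python
# Same ranges as splice(150..160, 180..190) in the session above.
second_bytes = list(range(150, 161)) + list(range(180, 191))

# Decode each two-byte sequence (195, b) as UTF-8.
chars = [bytes([195, b]).decode("utf-8") for b in second_bytes]

# Assumption: Python's notion of "alphabetic" approximates the intent here.
symbols = {c: tuple(c.encode("utf-8")) for c in chars if not c.isalpha()}

print(chars)
print(symbols)
```

Under that assumption, only × and ÷ come out as non-alphabetic, so the two exceptions can be identified by their full two-byte sequences rather than by the lead byte alone.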

  2. I don't understand where 194 is coming from. Latin-1 punctuation and symbols starts at U+00A0 which is 160, no?

From what I understand from Wikipedia, since A0 is between 80 and 7FF (and can be represented in 11 bits), it gets encoded in two bytes. The first byte is 110 followed by the first 5 bits, and the second byte is 10 followed by the last 6 bits. In this case, A0 -> 00010100000, and so the first byte is 11000010, or 194.
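The bit arithmetic above can be written out as a short Python sketch. The helper name `two_byte_utf8` is made up for illustration; the result is checked against Python's built-in UTF-8 encoder.

```python
from typing import Tuple


def two_byte_utf8(code_point: int) -> Tuple[int, int]:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point does not use the two-byte form")
    first = 0b11000000 | (code_point >> 6)         # 110 + top 5 bits
    second = 0b10000000 | (code_point & 0b111111)  # 10 + low 6 bits
    return first, second


# U+00A0 (start of the block), U+00B7 (interpunct), U+00D7 (×), U+00F7 (÷)
for cp in (0xA0, 0xB7, 0xD7, 0xF7):
    pair = two_byte_utf8(cp)
    # Cross-check against the built-in encoder.
    assert bytes(pair) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X} -> {pair}")
```

The first two code points get lead byte 194, while × and ÷ get lead byte 195, which is why they share a lead byte with letters such as à.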

@mahrud
Member

mahrud commented Nov 18, 2024

The problem with those two is that they're in with a bunch of characters that I think are definitely alphabetic:

Can't we single out those specific characters rather than the whole range?

@pzinn
Contributor

pzinn commented Nov 19, 2024

tangentially related, while I was testing this:

i1 : getSymbol "⟎"

o1 = ⟎

o1 : Symbol

i2 : getSymbol "⟎⟎"
stdio:2:9:(3): error: invalid symbol

i3 : getGlobalSymbol "⟎"

o3 = ⟎

o3 : Symbol

i4 : getGlobalSymbol "⟎⟎"
stdio:4:15:(3): error: attempted to create symbol in protected dictionary

Clearly the second error message is incorrect. (edited: and yes, it's a moot point because you can't create a new symbol if the dictionary is protected anyway.) This can be traced back to actors5.d

getglobalsym(d:Dictionary,s:string):Expr := (
     w := makeUniqueWord(s,parseWORD);
     when lookup(w,d.symboltable) is x:Symbol do Expr(SymbolClosure(globalFrame,x))
     is null do (
          if !isvalidsymbol(s) then return buildErrorPacket("invalid symbol");
	  if d.Protected then return buildErrorPacket("attempted to create symbol in protected dictionary");
	  t := makeSymbol(w,tempPosition,d);
	  globalFrame.values.(t.frameindex)));

getglobalsym(s:string):Expr := (
     w := makeUniqueWord(s,parseWORD);
     when globalLookup(w)
     is x:Symbol do Expr(SymbolClosure(if x.thread then threadFrame else globalFrame,x))
     is null do (
	  if globalDictionary.Protected then return buildErrorPacket("attempted to create symbol in protected dictionary");
	  t := makeSymbol(w,tempPosition,globalDictionary);
	  globalFrame.values.(t.frameindex)));

The second definition of getglobalsym is missing the isvalidsymbol check that produces "invalid symbol".
But why can't the second definition just call the first with globalDictionary? The only difference is this thread thing, which I don't fully understand.

@pzinn
Contributor

pzinn commented Nov 19, 2024

also, I'm confused about help getGlobalSymbol:

If dict is omitted, then the first symbol found in the dictionaries listed in [dictionaryPath](http://localhost:8002/home/pzinn/M2/M2/BUILD/fedora/usr-dist/common/share/doc/Macaulay2/Macaulay2Doc/html/_dictionary__Path.html) will be returned. If none is found, one will be created in the first dictionary listed in dictionaryPath, unless it is not mutable, in which case an error will be signalled; perhaps that behavior should be changed.

first dictionary listed in dictionaryPath??? it's Varieties.Dictionary...

Instead of just characters whose first byte is 226, we add 194 and a
couple (multiplication/division symbols) that start w/ 195.
@d-torrance
Member Author

Can't we single out those specific characters rather than the whole range?

We definitely can! I've pushed an updated version with support for this.
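As a rough illustration of singling out specific characters, here is a Python sketch of the rule described in this thread: lead bytes 226 and 194 are excluded wholesale, and only × and ÷ are excluded under lead byte 195. The names `EXCLUDED_LEAD_BYTES`, `EXCLUDED_PAIRS`, and `excluded_from_alphabetic` are hypothetical, and this is not the actual Macaulay2 lexer code.

```python
# Hypothetical names; a sketch of the rule, not the Macaulay2 implementation.
EXCLUDED_LEAD_BYTES = {194, 226}            # whole ranges treated as non-alphabetic
EXCLUDED_PAIRS = {(195, 151), (195, 183)}   # × (U+00D7) and ÷ (U+00F7) only


def excluded_from_alphabetic(ch: str) -> bool:
    """Return True if a non-ASCII character should not count as alphabetic."""
    encoded = ch.encode("utf-8")
    if len(encoded) < 2:
        return False  # ASCII is outside the scope of this sketch
    if encoded[0] in EXCLUDED_LEAD_BYTES:
        return True
    return (encoded[0], encoded[1]) in EXCLUDED_PAIRS


for ch in "·×÷àÖ⟎":
    print(ch, excluded_from_alphabetic(ch))
```

With this shape, à and Ö remain usable in symbol names while ·, ×, and ÷ can act as keywords.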


3 participants