Add operator name canonicalization #54

GeorgeR227 · 2024-06-25T18:48:41Z

This addresses #34 by improving support for both Unicode and ASCII operator names. The idea is to let the user choose from a range of supported operator names while the underlying codebase works only with the canon name.

For example, a type inference rule would carry a single canon name instead of an array of supported names, and would simply check that its name matches the canon form of the name the user wrote.

These canon names should be chosen carefully so that they are easy to work with and parse. Some rules are included but are liable to change.
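A minimal sketch of the idea described above, assuming a Dict that maps each supported spelling to one canon name. The names `ALIAS_TO_CANON`, `canon_name`, `Op1Rule`, and `rule_applies`, and the table entries, are illustrative assumptions and not the definitions used in this PR.

```julia
# Hypothetical alias table: each supported spelling maps to one canon name.
const ALIAS_TO_CANON = Dict{Symbol, Symbol}(
  :d₀ => :d_0,
  :d₁ => :d_1,
)

# Names that are absent from the table are treated as already canonical.
canon_name(name::Symbol) = get(ALIAS_TO_CANON, name, name)

# A rule carries a single canon name rather than an array of supported names.
struct Op1Rule
  src_type::Symbol
  tgt_type::Symbol
  op::Symbol
end

# The rule applies when its name equals the canon form of the user's name.
rule_applies(rule::Op1Rule, user_op::Symbol) = rule.op == canon_name(user_op)

rule = Op1Rule(:Form0, :Form1, :d_0)
rule_applies(rule, :d₀)   # Unicode spelling matches
rule_applies(rule, :d_0)  # ASCII spelling matches
rule_applies(rule, :d_1)  # a different operator does not match
```

With this shape, supporting a new spelling only requires a new table entry; the rules themselves stay unchanged.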

These tables are for both canon op1 operator names and their formless counterparts

codecov · 2024-06-25T18:52:13Z

Codecov Report

Attention: Patch coverage is 81.81818% with 4 lines in your changes missing coverage. Please review.

Project coverage is 85.96%. Comparing base (12edc3a) to head (df5d2c5).
Report is 3 commits behind head on main.

Files	Patch %	Lines
src/deca/deca_op_names.jl	75.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #54      +/-   ##
==========================================
+ Coverage   84.68%   85.96%   +1.28%     
==========================================
  Files          12       14       +2     
  Lines         764      905     +141     
==========================================
+ Hits          647      778     +131     
- Misses        117      127      +10


Works so far but more testing needed. Also believe that the Halfar test in language is broken.

The HeatXfer test is confirmed broken. The only reason why it failed now and not before is because negation inference broke slightly and is fixed now.

The problem now is that operators will always be overloaded with our Unicode names.

Before, the code would remove any form information from an operator name. This meant that if a user mistyped an operator, type inference might have silently succeeded if another overloading of that operator was valid. The code now prevents that by respecting the given type of the operator.

The problem right now is that overloaded operators resolve into our Unicode operators. However, already typed operators that aren't in the Unicode set stay untouched. This is intended behavior, but it means downstream packages need to use canon names to work with all supported names.

This needs to be tested first

If the generic type is unicode then use unicode name, if ascii then use ascii. Some special cases still apply for consistency with previous versions but should be changed. Also moved canon into its own file.

Also update rest of resolution rules

These describe how to add proper inference and resolution rules for new DEC operators

lukem12345 · 2024-07-01T19:18:14Z

test/op_naming.jl

+end
+
+@testset "Typed Function Observance" begin
+  let # Typing respects typed exterior derivative


What is this test supposed to show?

When I first implemented canon name typing, if a user wrote A == d_0(B) but declared B::Form1, which is wrong because the operator doesn't match its input, infer would see d_0 as d, the canon generic name, and type A as a Form2.

When I realized that, I changed the code so that an operator that is already typed can only use its own specific rule and not the generic rules. These tests enforce this behavior, because what was occurring above felt like a silent failure that would lead to bad errors down the line.

Ok. We should have a comment that explains this situation using e.g. that above summary. The "respects" feels a little overloaded.

Do we need to test this behavior for all the other operators as well, or does this single test suffice?
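The behavior discussed in this thread can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `TYPED_RULES`, `GENERIC_OVERLOADS`, and `infer_target` are hypothetical names, and the rule entries are chosen only to reproduce the d_0 example.

```julia
# Hypothetical rule table: (typed operator, source type) => target type.
const TYPED_RULES = Dict{Tuple{Symbol, Symbol}, Symbol}(
  (:d_0, :Form0) => :Form1,
  (:d_1, :Form1) => :Form2,
)

# Hypothetical map from a generic name to its typed overloads.
const GENERIC_OVERLOADS = Dict{Symbol, Vector{Symbol}}(
  :d => [:d_0, :d_1],
)

# Return the inferred target type, or `nothing` when no rule applies.
function infer_target(op::Symbol, src::Symbol)
  # A generic operator may use the rule of any of its overloads.
  # An operator that is already typed may only use its own rule.
  candidates = get(GENERIC_OVERLOADS, op, [op])
  for candidate in candidates
    tgt = get(TYPED_RULES, (candidate, src), nothing)
    tgt === nothing || return tgt
  end
  return nothing
end

infer_target(:d, :Form1)    # generic d picks the overload that accepts a Form1
infer_target(:d_0, :Form1)  # d_0 does not accept a Form1, so nothing is inferred
```

Stripping the form information from d_0 before the lookup would make the second call behave like the first, which is the silent failure described above.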

lukem12345 · 2024-07-01T19:25:59Z

test/op_naming.jl

+
+@testset "Canon Inference and Overloading" begin
+  function check_canontyping(control::SummationDecapode, test::SummationDecapode)
+    infer_test = infer_types!(deepcopy(test))


Can we use infer_types!(deepcopy(test)) == infer_types!(deepcopy(control))

Technically, the scope of this test is just to check types and that names don't change. I guess we could change it to that, but I want to limit the test to what we actually want to check.

lukem12345 · 2024-07-01T19:27:16Z

test/op_naming.jl

+    @test get_canon_name(over_test[:op1]) == get_canon_name(over_control[:op1])
+  end
+
+  setup_basecase(d::SummationDecapode) = resolve_overloads!(infer_types!(deepcopy(d)))


"Setup" is too generic of a function name here, and it looks like deepcopy, infer, resolve is used in the function above this.

Yeah I can change the name. I'm not too attached to it since they're only valid within the test anyway.

lukem12345 · 2024-07-01T19:54:59Z

src/deca/deca_op_names.jl

This file should get renamed to theory.jl. Later, we can silo all of the information that would normally incorporate @theory or "Model" information. The reason we won't use @theory outright for the time being is that we want to have support for multiple names referring to the same operator, etc.

lukem12345 · 2024-07-01T20:05:36Z

This PR just uses the global names Dict for the type inference and function overloading, but it should be used throughout this codebase and Decapodes.jl wherever explicit symbols appear. Can you spot check the rest of this codebase for such updates? Where it is too tedious (e.g. unicode!), we can make an issue to refactor.

lukem12345 · 2024-07-01T20:13:58Z

src/deca/deca_op_names.jl

+const DUALDERIV_0 = :dual_d_0
+const DUALDERIV_1 = :dual_d_1
+
+const HODGE_0 = :hdg_0


The ASCII-Unicode equivalents that we have now use star and star_inv instead of hdg.

DiagrammaticEquations.jl/src/deca/deca_acset.jl

Line 281 in e21e6c7

ascii_to_unicode_op1 = Pair{Symbol, Any}[

lukem12345 · 2024-07-01T20:14:58Z

src/deca/deca_op_names.jl

+
+CANON_NAMES = Dict{Symbol, Symbol}(
+  # Partial time derivative
+  :∂ₜ => PARTIAL_T,


Why is this not given a const UNICODE_PARTIALT

Because the unicode consts are only needed for overloading and this should never be overloaded.

Really, Unicode names should just be aliases for ASCII names, but I keep the Unicode consts for nicer function naming in overloading. For example, if the op is ∧ it resolves to ∧₀₁, but if it is wdg it resolves to wdg_01.
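A small sketch of that naming convention, assuming the resolved name is looked up from the spelling the user wrote together with the argument types. `RESOLVED_NAMES` and `resolve_name` are hypothetical names used only for illustration.

```julia
# Hypothetical table: (generic spelling, first type, second type) => resolved name.
const RESOLVED_NAMES = Dict{Tuple{Symbol, Symbol, Symbol}, Symbol}(
  (:∧,   :Form0, :Form1) => :∧₀₁,     # Unicode in, Unicode out
  (:wdg, :Form0, :Form1) => :wdg_01,  # ASCII in, ASCII out
)

# Operators without an entry, such as already typed ones, are left untouched.
resolve_name(op::Symbol, t1::Symbol, t2::Symbol) =
  get(RESOLVED_NAMES, (op, t1, t2), op)

resolve_name(:∧, :Form0, :Form1)       # resolves to the Unicode name
resolve_name(:wdg, :Form0, :Form1)     # resolves to the ASCII name
resolve_name(:wdg_01, :Form0, :Form1)  # already typed, returned unchanged
```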

GeorgeR227 · 2024-07-01T20:15:47Z

I've skimmed through the rest of the code and it seems like really the only other function that should use canon names, in DiagrammaticEquations, would be the find_chains code. So in this case, we can just check that the canon name of an operator matches the canon name of an operator in a black/whitelist.

Everything else seems to just edit names or copy raw names over, so it doesn't need this.
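A sketch of the membership check suggested for find_chains, assuming both the operator and the list entries are converted to canon names before comparison. `CHAIN_ALIASES`, `to_canon`, and `in_op_list` are hypothetical names, and the function does not reproduce find_chains itself.

```julia
# Hypothetical alias table, standing in for the canon name lookup.
const CHAIN_ALIASES = Dict{Symbol, Symbol}(:d₀ => :d_0, :d₁ => :d_1)

to_canon(name::Symbol) = get(CHAIN_ALIASES, name, name)

# True when `op` names the same operator as some entry of `list`,
# regardless of which supported spelling either side uses.
in_op_list(op::Symbol, list) = to_canon(op) in Set(to_canon(x) for x in list)

in_op_list(:d₀, [:d_0])   # Unicode operator, ASCII list entry
in_op_list(:d_1, [:d₁])   # ASCII operator, Unicode list entry
in_op_list(:d_1, [:d_0])  # different operators
```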

lukem12345 · 2024-07-01T20:16:14Z

src/deca/deca_op_names.jl

+const UNICODE_AVG_01 = :avg₀₁
+
+# Op2 names
+const UNICODE_WEDGE_00 = :∧₀₀


I would group these blocks of operators such that all forms of the wedge product (i.e., including ASCII) are grouped together.

lukem12345 · 2024-07-01T20:50:56Z

AlgebraicJulia/Decapodes.jl#142 is a spiritual predecessor of this feature. The main distinctions are that the old PR "canonicalized" on Unicode whereas this PR canonicalizes on ASCII, and that the old PR added a subroutine that edits the ACSet to use the canonicalized names.

lukem12345 · 2024-07-01T20:56:13Z

We note that incident(decapode, :L_1, :op1) needs special treatment if it is to be inter-operable with canonical representations. Of course, features such as incident(decapode, :L, :op1) (the intention being to get all Lie derivatives, typed or untyped, Unicode or ASCII) require more engineering anyway.

lukem12345 · 2024-07-01T21:01:16Z

As discussed off-line, we should:

Have a way of getting the Unicode representation of an operator. (Currently partially handled by the Dict from Decapodes.jl PR 142.)
Have a way of getting the ASCII representation of an operator. (Currently handled by this PR.)
Have a way of getting the set of all aliases of an operator (e.g. for using incident in a more pain-free way). This information is currently stored only implicitly.
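The third point could be served by inverting the alias table, so that the alias sets become explicit. This is only a sketch: `CANON_OF` and `alias_sets` are hypothetical names, and the table entries are illustrative.

```julia
# Hypothetical alias table: supported spelling => canon name.
const CANON_OF = Dict{Symbol, Symbol}(:d₀ => :d_0, :d₁ => :d_1)

# Invert the table into canon name => set of all spellings of that operator.
function alias_sets(table::Dict{Symbol, Symbol})
  sets = Dict{Symbol, Set{Symbol}}()
  for (alias, canon) in table
    # Seed each set with the canon name itself, then add the alias.
    push!(get!(sets, canon, Set{Symbol}([canon])), alias)
  end
  return sets
end

sets = alias_sets(CANON_OF)
sets[:d_0]  # contains both :d_0 and :d₀
```

A query for an operator could then iterate over every spelling in the set instead of a single symbol.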

Added lookup table

25a52e1

These tables are for both canon op1 operator names and their formless counterparts

GeorgeR227 added 12 commits June 25, 2024 16:02

Added canon name to op1 infer

7f6a3f7

Works so far but more testing needed. Also believe that the Halfar test in language is broken.

Added tests for op1 canon infer

bc13754

The HeatXfer test is confirmed broken. The only reason why it failed now and not before is because negation inference broke slightly and is fixed now.

Starting support for overloading

8605f8a

The problem now is that operators will always be overloaded with our Unicode names.

Added support for canon op2

94a0428

This needs to be tested first

Added tests for wedge

c13c537

More tests for op2

d8ef661

Tests user typed op2s respected

d8f6182

Nicer name resolving

21e0d4e

If the generic type is unicode then use unicode name, if ascii then use ascii. Some special cases still apply for consistency with previous versions but should be changed. Also moved canon into its own file.

Move tests to own file

003e8af

Also update rest of resolution rules

Add some dev instructions

e1fa772

These describe how to add proper inference and resolution rules for new DEC operators

GeorgeR227 marked this pull request as ready for review July 1, 2024 15:20

Remove old rules

97a4759

GeorgeR227 requested a review from lukem12345 July 1, 2024 15:25

lukem12345 reviewed Jul 1, 2024

View reviewed changes

Add Unicode operators into new dictionary scheme

df5d2c5

lukem12345 mentioned this pull request Aug 15, 2024

Interop with SymbolicUtils #64

Merged
