Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State of inner products in Base #25565

Closed
juliohm opened this issue Jan 15, 2018 · 146 comments
Closed

State of inner products in Base #25565

juliohm opened this issue Jan 15, 2018 · 146 comments
Labels
linear algebra Linear algebra

Comments

@juliohm
Copy link
Contributor

juliohm commented Jan 15, 2018

If a user needs to define custom inner products for general Hilbert spaces, what is the current state in Base for this type of generalization? #16573 is a related, but less general issue. My concern is with new types that aren't arrays.

I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y, including the array case:

inner(x::AbstractVector, y::AbstractVector) = dot(x, y)

In case the change is reasonable, could it be part of Julia v1.0?

@andreasnoack
Copy link
Member

Could you explain a little bit more about your use case and why having it in Base instead of just defining it in your package is beneficial? A concrete example would be best. Do you expect several inner definitions across packages loaded simultaneously?

@juliohm
Copy link
Contributor Author

juliohm commented Jan 15, 2018

I think having a formal interface for these mathematical spaces will help users exploit the type system better. For example, I would expect clustering methods to work on any metric space. If I could define my type with an inner product, I would benefit from Clustering.jl out of the box (after the package is fixed accordingly). Many other distance-based or projection-based algorithms could also be generalized.

As a concrete example, I came across this limitation today trying to define a geometry for compositional data: https://github.com/juliohm/CoDa.jl I'd rather specialize on a well-known inner function defined in Base than define my own interface that no one else will be aware of.

@JeffBezanson JeffBezanson added the linear algebra Linear algebra label Jan 15, 2018
@andyferris
Copy link
Member

Why not extend dot for your Hilbert space types? I’m pretty sure it’s designed with being the generic inner product in mind.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

The concept of dot product is more strict than the concept of inner product. Whereas the latter is defined for general spaces, a dot product is only defined when there is the notion of a coordinate system defined by a finite basis. The semantics of dot(x,y) is x'*y where x and y represent coordinates of the objects in a Cartesian world. Mathematical textbooks will rarely mention the term dot product as the authors are usually interested treating the material in more general (not necessarily finite nor Euclidean) spaces.

To distinguish further, in a Hilbert space with inner product <x,y> (or inner(x,y)) objects can be infinite and the semantics x'*y doesn't apply. For example, in functional data analysis the objects are functions f and g and the inner product is usually obtained by numerical integration: inner(f,g) = quadrature(f*g). Calling this operation a dot product is misleading.

Another example as I pointed out in my CoDa.jl package is compositional data. Composition objects lie in a simplex world for which the operation x'*y doesn't make any sense. However, there exists a isometric transformation (the log-ratio transformation) that one can use to map compositions into another geometry where one can then apply the dot product with coordinates. Working with coordinates is not necessary, but it is common practice in this field. The result can be back transformed to the original space where the objects exist.

I don't see benefit in maintaining the term dot in the language, but if one asks for backward compatibility, the generalization inner(x::AbstractVector, y::AbstractVector) = dot(x,y) works perfectly.

Can you elaborate on the objections for this change?

@StefanKarpinski
Copy link
Member

Can you elaborate on the objections for this change?

We generally require a fair amount of justification for adding new public functions to Base, that's the objection. This could be provided by an InnerProducts package. Why does it need to be built into the language itself? This was the first question that @andreasnoack asked above – it got a somewhat vague answer of "I think having a formal interface for these mathematical spaces will help users exploit the type system better". There's no reason that an interface defined in a package is any less formal than one in Base. What does having Base.inner offer that InnerProducts.inner doesn't? This is a genuine question that could have a convincing answer, but I don't know what that answer might be, which is why the question is being asked.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

I don't see a good argument to define a basic mathematical concept like inner products elsewhere that is not in Base. A language whose main audience is scientific computing folks would benefit from correct terminology. Why the concept of norm is defined in Base.LinAlg and inner, which is on the same cohort, should be defined in a package? Besides this inconsistency, the language already has dot, which makes me wonder why it should have something so specific rather than a more general concept?

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jan 16, 2018

So you want all possible mathematical concepts in the base language? Not having something defined in Base doesn't force people to use the wrong terminology. The norm function is exported from LinAlg because it is defined and used in LinAlg. Similar for dot. Are you proposing that dot should be renamed to inner?

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

So you want all possible mathematical concepts in the base language?

I never said that.

Not having something defined in Base doesn't force people to use the wrong terminology.

I am sure that it doesn't. Promoting the wrong terminology is the issue. People coming from a less mathematical background will adopt the usage of dot because they see it in Base. Usage of the term "dot" product to represent the concept of inner product is incorrect. It is also harmful to the mathematical community, which struggles every now and then to fix these scars that wrong terminology has left. Students from my generation are consistently having to refer to old books to get the terminology right, this shouldn't be the case.

Are you proposing that dot should be renamed to inner

That would already be a major improvement in my opinion. See all the examples I gave above on functional and compositional data. People in these communities would never use the term dot in their work. "dot" is more like a computer science term than anything else.

@StefanKarpinski
Copy link
Member

Renaming dot to inner is quite a different proposal than adding inner to Base in addition to dot. That's more of a "correct terminology" question, which you and other linalg folks will have to hash out, although I seem to recall we bikeshedded this once and concluded that dot was the correct name for what this function implements.

@andreasnoack
Copy link
Member

There was a little discussion of this in #22227 and #22220

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

Renaming dot to inner is quite a different proposal than adding inner to Base in addition to dot.

This is what I proposed in my first message in this thread:

I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y

I repeat dot product is the incorrect term for the operation I am discussing here. Inner, outer, scalar product... these are mathematical objects. "dot product" is a computational object: it gets two sequences of numbers and performs x1*y1 + x2*y2 + ... xn*yn, a useless operation in other mathematical spaces.

@StefanKarpinski
Copy link
Member

I had focused on the second option you proposed, which seems to have been adding Base.inner with a fallback to call Base.dot. Either option is possible, but both require some justification: to add a new operation, one needs a reason why it can't just be in a package (what the initial part of this discussion was about); to rename, it needs to be decided that dot is the wrong name and inner is the correct one (what the conversation seems to have turned to).

@andyferris
Copy link
Member

@juliohm It's probably worth (re)stating that there is an active effort currently trying to shrink Base and encourage the use of packages. In this case dot seems to be correct for all the types participating in linear algebra provided in standard Julia (i.e. Number and Array - so yes, there is a definite, known, finite basis in all cases - thus I don't think we've made a mistake in terminology, though there may be better choices). I'm not against this proposal - but wanted to point this to clarify why you might be experiencing some "latent" resistance to change.

Also worth keeping in mind that a fair number of Julia newcomers may be familiar with a dot product but not an inner product (say, they did a bit of physics at university, but aren't math majors) so there are also some reasons to keep dot (not to mention that we have an infix operator that it corresponds with - we could just map it to inner I suppose but that's slightly less obvious). We also don't have an outer function, or a variety of other possible operations.

Thus, there is a burden to make a reasonable case for how putting this in Base (or LinAlg) is strictly better than putting this in a user package. The primary reason seems to be to provide an interface that can be shared and extended by others - is that a reasonable summary? The argument about letting generic code from Clustering.jl work with your inner product seems pretty compelling. Also, in the context that we seem to be splitting LinAlg into a stdlib package - I was thinking that if I were to author a package called LinearAlgebra I'd probably be happy to include an inner function for others to extend.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

Thank you @andyferris for sharing your thoughts. I see the resistance very clearly, which is something that I am not very excited about. Nevertheless, I am curious about how this specific proposal leads to code increase? To me, it seems like a trivial change in code with major improvement in abstraction. The example with Clustering.jl is just one of many, think of any kernel-based method that can be made to work with arbitrary Julia types for which the notion of inner product exists. MultivariateStats.jl has plenty of them.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

Regarding the comment about LinAlg split into a separate package, I agree that it seems like a good place to encapsulate mathematical products. I am assuming that this LinearAlgebra package of the future would be imported in a Julia session by default and so all users would have access to the concept of inner, outer, etc right away.

@andyferris
Copy link
Member

andyferris commented Jan 16, 2018

Yes, the standard libraries are all built together with the Julia system image and available by default. At least for the v1.x series no-one will need to type using LinAlg (I don't think it will be renamed LinearAlgbebra, btw, I just made that up as a hypothetical competitor).

@StefanKarpinski
Copy link
Member

To clarify, it would be loaded with standard Julia so you don't have to install anything, but you would still have to write using LinAlg to get the names it exports.

@andyferris
Copy link
Member

This is where it gets odd, right, since we'll get the * methods and so-on without using LinAlg? (in other terms, LinAlg is a type pirate).

@StefanKarpinski
Copy link
Member

Yes, that's basically where we'll have to draw the line: Base must define as much linear algebra functionality as needed to make LinAlg not a pirate, so matmul is defined in Base because Array and * both are. Funky matrix types and non-base operations live there though.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 16, 2018

Let me give you a concrete example and ask you how would you solve it with the current interface, maybe this can clarify things for me.

The goal is to perform factor analysis with compositional data. I have a type called a Composition and an inner product in the space of compositions. I collect many samples (e.g. rock samples) and put all of them into a big Vector{Composition} (e.g. composition = %water, %grain, %air). Now I want to call a factor analysis algorithm implemented in another package (e.g. MultivariateStats.jl) on this vector of data. How would you implement that generically without having an inner product imported by default?

What I understood from the last comments is that both MultivariateStats.jl and CoDa.jl would have to depend on LinAlg.jl. The dependency in MultivariateStats.jl is just to bring name inner into scope. The dependency in CoDa.jl is to define a method for inner that can be called by MultivariateStats.jl. Is that what you are suggesting?

@andyferris
Copy link
Member

It seems Composition{D} is a D dimensional vector space under + and *.

I would be quite tempted to define the dual vector space.

So, you could define adjoint(::Composition) -> DualComposition and *(::DualComposition, ::Composition) -> scalar (currently inner). DualComposition wouldn't have to do much except hold a Composition inside.

@andyferris
Copy link
Member

The first sentence in https://en.wikipedia.org/wiki/Dot_product does seem to suggest that dot could be an operation on any two iterables. We could make it recursive and define it for Number, and define inner as the abstract linear algebra function, which happens to overlap for Number and AbstractArray.

@juliohm
Copy link
Contributor Author

juliohm commented Jan 17, 2018

Thank you @andyferris, I appreciate your thoughts on the dual space. I'd rather not rely on a new type for this task though. The final solution is unnecessarily complex.

What I am interested in understanding is why something like:

inner(x,y) = sum(x.*y)
norm(x) = sqrt(inner(x,x))

export inner, norm

not welcome in Base? I am assuming this is all that is required to define the function names generically for users of the language to specialize on. Keep in mind I am asking these questions with the genuine interest of understanding the point of view of the core devs. I want to say this before the conversation goes into the wrong direction again.

From the perspective of someone interested in math in general, it feels unnatural to have these concepts not exported by default, and instead have them defined inside of LinAlg. I think of LinAlg as implementations of these high-level concepts for array types. Perhaps my entire work does not require linear algebra on arrays, but I could still benefit from the concept of inner product across packages (e.g. MultivariateStats.jl, Clustering.jl). Also, I may not want to have LinAlg as a dependency in my package because it is not.

If I can emphasize it further, there is the concept of inner product, which is independent of arrays. This concept is represented by the statement export inner in Base. There is the implementation of inner products for array-like objects representing coordinates inner(x,y) = sum(x.*y). This operation can be defined as a fallback method in Base like above, if necessary.

@Jutho
Copy link
Contributor

Jutho commented Jan 17, 2018

Another example of a use case is Krylov methods. If you have e.g. function spaces with inner products, then you could use Krylov methods to approximate a linear problem or eigenproblem in a small finite-dimensional subspace of that infinite dimensional function space.

I too have my own objects which form a vector/Hilbert space but are not part of <: AbstractArray. From the analogy that also arrays with rank N>1 form vector spaces and can be used as 'vectors' in Krylov methods, I've come to rely on using vecdot and vecnorm being the generalized notion of inner product and norm. So I've been developing a package with Krylov methods that uses functions as linear operators and where the 'vector's can be of any type, provided objects of that type support vecdot, vecnorm and a few other things (scale!, zero, ...). But maybe that is abusing what was meant by these concepts in Base, so it would be good to straighten out the correct interface here.

@andyferris
Copy link
Member

Right - vecdot could be renamed inner.

(Now I’m vaguely wondering if norm should actually be called matrixnorm for matrices with norm always matching inner. It seems that maybe there are two distinct things going on with norm which is causing some difficulties with generalising it)

@Jutho
Copy link
Contributor

Jutho commented Jan 25, 2018

In fact, for general vector-like objects, it's also useful to query the dimension of the vector space (e.g. to verify that Krylov dimension should not be larger than dimension of the full space in my example use case). The example of nested arrays shows that length is not the right concept here, i.e. there would need to be some recursive notion of length for those cases.

Now for the example of using nested arrays as a general vector, vecdot and vecnorm are in some cases not even the correct notion of inner product and norm, as discussed in #25093, i.e. they are not recursively calling vecdot and vecnorm. My interpretation of these functions as a generic inner product and norm function is what triggered #25093, but it seems that this might not be how these functions were intended (not sure what they were intended to do instead).

So I do agree that we need a consistent interface here to be used across packages, that would therefore belong in a central location (probably not in Base but certainly in a Standard Library, e.g. such that one has to do using VectorSpaces). As for naming, I see two options:

Option 1 (my interpretation so far):
the prefix vec indicates the property of that object when interpreting it as a generic vector, hence

Option 2 (probably better): use more mathematically correct names

  • inner
  • dimension
  • But what to do with norm?

And finally, just pinging @stevengj as he will certainly have some useful comments; my apologies if this is inconvenient.

@andyferris
Copy link
Member

Yes, congrats!

@jebej
Copy link
Contributor

jebej commented Jun 3, 2018

It seems like there might be some consensus forming around the idea of having both dot and inner, where:

  1. inner is a true recursive inner product
  2. dot = dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x)) conjugated or not, and would therefore overlap with dot for Vector{<:Number} or Vector{<:Real}

Regarding:

Many people would write dot when they meant inner, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices.

I don't believe that would be an issue. Since this is a fairly uncommon operation, I would expect people to at the very least try it to see what it does and/or go look at the documentation.

I think that the above would be a great change, and not very disruptive since the semantics of dot are not changed in most cases.

@Sacha0
Copy link
Member

Sacha0 commented Jun 3, 2018

It seems like there might be some consensus forming around the idea of having both dot and inner

To the contrary, the discussion from #25565 (comment) on appears to favor having one or the other but not both, as e.g. laid out in #25565 (comment) and well supported with reactions.

@ranocha
Copy link
Member

ranocha commented Jun 3, 2018

Maybe inner (and also dot) should be recursive inner/dot/scalar products and the old behaviour could be implemented in functions such as dotc(x,y) = sum(x[i]' * y[i] for i in eachindex(x)) and dotu(x,y) = sum(transpose(x[i]) * y[i] for i in eachindex(x))? The names dotu and dotc would match corresponding BLAS names.

@juliohm
Copy link
Contributor Author

juliohm commented Jun 3, 2018

(I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces—they are what I typically use myself—but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)

@stevengj, when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?

@juliohm
Copy link
Contributor Author

juliohm commented Jun 10, 2018

Any chance this feature will make it to Julia v1.0? I have so many ideas and packages that depend on the notion of general inner products. Please let me know if I should lower my expectations. Sorry for the constant reminder.

@jebej
Copy link
Contributor

jebej commented Jun 10, 2018

Have you not seen #27401?

@juliohm
Copy link
Contributor Author

juliohm commented Jun 10, 2018

Thank you @jebej and thank you @ranocha for taking the lead ❤️

@stevengj
Copy link
Member

stevengj commented Jun 11, 2018

when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?

Not technically hard to add to the parser, but it has proven difficult to come up with a consensus on how (and whether) to represent custom brackets in the language. See the discussion at #8934, which went nowhere 4 years ago and has not been revived since. (Add that to the fact that in different fields people use the same brackets for many different things, e.g. ⟨u⟩ is used for ensemble averages in statistical physics.) Another issue, raised in #8892, is the visual similarity of many different Unicode brackets.

@juliohm
Copy link
Contributor Author

juliohm commented Jun 11, 2018

Thank you @stevengj, I appreciate the clarifications. I am already very excited that we are gonna have general inner products standardized across packages. 💯 Maybe the angle bracket notation could shine in another release cycle in the future. Not as important, but quite convenient to be able to write code that is literally like the math in our publications.

@StefanKarpinski
Copy link
Member

If ⟨args...⟩ is a valid syntax for calling the anglebrackets operator or something (what to call the function this syntax calls is actually kind of tricky since we don't have any precedent), then people could opt into whatever meaning they want for the syntax.

@stevengj
Copy link
Member

stevengj commented Jun 11, 2018

@StefanKarpinski, the argument in #8934 was that it should be a macro. I don't think we ever came to a consensus.

(If we decide in Base that anglebrackets(a,b) means inner(a,b), that will discourage people from "opting into whatever meaning they want" because the decision will have already been made.
It's not a terrible choice, of course, but it may be unnecessary to assign this a meaning in Base as long as they are parsed.)

@StefanKarpinski
Copy link
Member

I don't recall the details of that discussion but making a macro seems like obviously a bad idea to me.

@StefanKarpinski
Copy link
Member

With #27401 I think we can consider inner products to have been taking seriously.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Jun 14, 2018
@stevengj
Copy link
Member

Traditionally, an issue is closed only when the relevant PR is merged...

@StefanKarpinski
Copy link
Member

Sure, we can leave it open I guess. Just wanted to get it off the triage label.

@ranocha
Copy link
Member

ranocha commented Jun 19, 2018

Should this be closed since #27401 is merged now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

No branches or pull requests