make numbers non-iterable? #7903
Comments
The issue is not specific to I do think it would be good to make see also #5844 |
So I guess the problem is not specifically with 'in' but with ==. I raised the issue that apparently meaningless uses of 'in' do not generate errors because I made a silly blunder with 'in' in my project that did not generate an error message. For a similar reason (ease of debugging), the following uses of == should also generate errors, no?
julia> [3,4,5] == ["x","y"]
julia> IntSet() == ["a"=>7]
julia> 9 == Int64
|
Our == is total, defined on all pairs of values. I find this convenient,
|
Jeff, I guess it's OK to make == total, but there also ought to be a restricted -- Steve On Thu, 7 Aug 2014, Jeff Bezanson wrote:
|
Three julia> 5 == 5.0
true
julia> 5 === 5.0
false |
However that is for object identity, not equality. I.e. (zeros(3) === On Thursday, August 7, 2014, Tim Holy notifications@github.com wrote:
|
One option is you can pick your favorite unicode equality-resembling symbol from this list Line 13 in 31eb8d4
== but open for user definitions, and make that operator error on inputs of different types.
|
Earlier I wrote that it seems OK for == to work for all possible operands, but now I changed my mind. The (small) increase in expressive power does not offset the potential for enabling programmer blunders. One important mission of a programming language is to help prevent the programmer from shooting himself/herself in the foot, and in this case Julia needlessly fails to close a loophole. About 25 years ago when I was a CS prof at Cornell, the issue came up (again) whether to switch our introductory programming course to C, and our faculty unanimously rejected the idea (again) for many reasons. One of the reasons was that we did not savor the possibility of undergrads lined up outside the TA office for help with their assignments because they wrote |
It seems it would be better to add restrictions to I also agree that |
I find the case for restricting With In other words, it would be too difficult to get back the current behavior. You would need something like |
Another (unrelated) issue Please let me know what you think about this, and feel free to move this elsewhere if appropriate. Here's an example:
julia> VERSION
v"0.2.1"
julia> immutable Edge # edges of an undirected graph
a :: Integer
b :: Integer
end
julia> ==(a::Edge, b::Edge) = (a.a == b.a && a.b == b.b) || (a.a == b.b && a.b == b.a)
== (generic function with 47 methods)
julia> import Base.isequal
julia> isequal(a::Edge, b::Edge) = a == b
isequal (generic function with 34 methods)
julia> edges = [Edge(1,2), Edge(2,3), Edge(3,1), Edge(1,3)]
4-element Array{Edge,1}:
Edge(1,2)
Edge(2,3)
Edge(3,1)
Edge(1,3)
Now, compare
julia> in(Edge(2,1), edges)
true
and
julia> in(Edge(2,1), Set(edges))
false
I did not expect this. This also breaks In this particular case a simple workaround is to define the (inner) constructor
Edge(a::Integer, b::Integer) = a < b ? new(a,b) : new(b,a)
and use default PS: As far as I can tell recent code in the |
You also need to implement |
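The `Set` lookup in the example above goes through hashing as well as equality, so a custom `==` alone is not enough. Here is a minimal sketch in current Julia syntax (`struct` rather than the 0.2-era `immutable`, and `Int` fields for brevity), assuming the fix is a `hash` method that agrees with the order-insensitive `==`. The `minmax` normalization is an illustrative choice, not something taken from the thread.

```julia
# Undirected edge; `struct` replaces the 0.2-era `immutable`.
struct Edge
    a::Int
    b::Int
end

# Two edges are equal regardless of endpoint order.
Base.:(==)(x::Edge, y::Edge) =
    (x.a == y.a && x.b == y.b) || (x.a == y.b && x.b == y.a)

# The hash must agree with ==, so hash the endpoints in sorted order.
Base.hash(e::Edge, h::UInt) = hash(minmax(e.a, e.b), h)

edges = [Edge(1, 2), Edge(2, 3), Edge(3, 1), Edge(1, 3)]

in_vector = Edge(2, 1) in edges        # linear scan using ==
in_set    = Edge(2, 1) in Set(edges)   # hash lookup, then isequal
n_unique  = length(Set(edges))         # Edge(3,1) and Edge(1,3) collapse
```

With both methods defined, the vector and the `Set` give the same membership answer, and the two orientations of the same edge are stored only once.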
Hm, perhaps |
Could you elaborate on what is a "mathy" type, and what the problems are? |
Let me make the following proposal: instead of redefining ==, how about if you redefine isequal(.,.) so that it is valid only when its operands are the same type (of course extensible by the programmer if necessary). Then you redefine 'in' to apply isequal instead of ==. Finally, you make it clear in the documentation that isequal(.,.) is specifically intended for use with containers (the documentation already says that it is useful for sorting). Furthermore, containers that use isequal might impose additional restrictions on keys. For example, Dict() and Set() impose the restriction that isequal(a,b) implies isequal(hash(a),hash(b)). For the containers I am developing in my current project (OrderedDict, MultiMap and OrderedSet), the restriction is that isequal(x,y) is true if and only if isless(x,y) and isless(y,x) are both false (i.e., isless defines a total order on the keys). |
Isn't == defined in terms of isequal()? Forcing same type means you lose the ability to compare numbers of differing types but the same numeric values as equal. |
isequal is specifically for hashing, which has to support cross-type comparisons, so that's a nonstarter. |
We have actually put a lot of time and thought into equality, and we've tried a couple variants. Having too many different kinds of equality (common lisp has at least 5) gets difficult to manage. Stefan developed a clever hash function that's able to efficiently hash equal numbers equally, even if they're of different types. Since then we've enjoyed the luxury of |
Steven, It would actually be pretty trivial to implement what you're after if you'd
typesequal{T}(x::T, y::T) = x == y
-Jacob On Sat, Aug 9, 2014 at 10:00 PM, Jeff Bezanson notifications@github.com
|
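For readers on a post-1.0 Julia, the one-liner above would use `where` syntax instead of `typesequal{T}(...)`. A small sketch, keeping the `typesequal` name from the comment and assuming no fallback method is defined, so that mixed-type calls fail with a `MethodError`:

```julia
# Same-type equality: only defined when both arguments share a concrete type.
typesequal(x::T, y::T) where {T} = x == y

same_type = typesequal(5, 5)      # both Int, so the method applies

# Int vs Float64 has no matching method, so the call throws.
threw = try
    typesequal(5, 5.0)
    false
catch err
    err isa MethodError
end
```

This gives an opt-in strict comparison without changing the behavior of `==` itself.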
Jacob, I understand that I could implement this myself, but that is not the point I'm trying to make. I'm trying to say that Julia is flawed if the statement in(x,3) does not cause an error message, where x is an IntSet, because it is failing its mission to catch programmer blunders. (See my earlier posting on this thread.) I am working on a Julia project and would like to 'buy in' to Julia, but obviously I am concerned that it might not catch on, especially at the level of university instruction, if it has flaws like this. Furthermore, I think that this conversation reveals a problem with Set and in(): they are trying to solve too many problems at once. The application of Set mentioned by Constantine in an earlier message in this thread is more likely to appear in a scientific code than an application in which Set holds objects of varying types. Therefore, either Set should be redesigned to be more appropriate for Constantine's example, or else it should be split into two separate container types, say ArbitrarySet and TypedSet. |
Blowing this one relatively minor issue up to the success or failure of the entire project is a bit hyperbolic. But yes, it would be nice to catch common usage mistakes better here – that's why I opened the issue to discuss it. Making isequal error on different types is not good, but I think that throwing an error for |
Checking However, in my view it's always at least valid to ask whether an item is in some collection. It shouldn't be ok to ask Python doesn't even have typed containers (except in numpy), and it has certainly caught on for university teaching. It deals with What happened in julia is that integers can be used as indexes, of course, and we describe indexing as iterating over all of the indexes. Therefore integers became iterable. The rest is a natural consequence. If I had a value Therefore any realistic "fix" for these things needs to go back and revise the underlying ideas. For example, we could entertain a proposal where numbers are not iterable, and indexing works on some other principle. Maybe scalars should be promoted to 0-d arrays before being used as indexes, or maybe we iterate over not an index itself, but over an iterator returned by |
Stefan, Yes, I agree, I exaggerated the significance of one issue. But it is not an exaggeration to say that preventing programmer blunders is a more important goal for a programming language than maximizing generality. Second, I would like to point out that the following test appears to indicate that Julia is following the
julia> s = Set([5.0, -0.0])
Set{Float64}({-0.0,5.0})
julia> in(5.0,s)
true
julia> in(0.0,s)
false
julia> 0.0 == -0.0
true
julia> isequal(-0.0,0.0)
false
Finally, let me put forward yet another proposal for fixing this problem: make an additional, more complicated constructor for
Set(my_isequal, [ <initial entries> ])
or perhaps even two functions:
Set(my_isequal, my_hash, [<initial entries>])
C++ lets you do something analogous with containers. Then The [edit formatting: @StefanKarpinski] |
Sets and Dicts with custom comparison and hash functions would be a fine feature to have. |
I disagree. Getting the comparison and hash functions right is incredibly hard. If someone wants to do some different kind of comparison, the way to do it is to transform the keys explicitly before hashing or storing them and then use the normal comparison and hashing. This is simpler, more explicit, and doesn't fail in subtle and confusing ways, which is what would happen with custom comparison and hashing functions. |
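The key-transformation approach described above can be sketched as follows. The case-insensitive use case and the `normalize_key` helper are hypothetical, chosen only to illustrate the idea of normalizing keys and then relying on the default comparison and hashing.

```julia
# Hypothetical normalization: treat keys as case-insensitive by
# lowercasing them before they are stored or looked up.
normalize_key(s::AbstractString) = lowercase(s)

seen = Set{String}()
for name in ["Alice", "BOB"]
    push!(seen, normalize_key(name))   # store the normalized form only
end

has_alice = normalize_key("ALICE") in seen   # looked up as "alice"
has_carol = normalize_key("Carol") in seen   # never inserted
```

The `Set` itself stays an ordinary `Set{String}`, so nothing about its equality or hashing needs to be customized.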
I agree that transforming keys is better than using a custom comparison function and I would tend to steer people towards that, but I'm not totally against the custom comparison approach. Maybe it could go in a package, along with things like OrderedDict. |
A subtler check for the element type is similar to what we do to check if a key is valid for a typed Dict – check that |
Using the same check that Maybe a more surgical approach is needed: only allow the second argument to Taken a bit further, this could argue for removing the fully-generic |
As I wrote on the mailing list, I suspect that a lot of the need for iterable/indexable numbers should be gone now with 0.5's dot-call syntax. In the cases where you would previously have written a generic vector/scalar function, you should now just write the scalar function |
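A brief illustration of the dot-call style mentioned above, with a hypothetical function name: only the scalar method is written, and the caller decides whether to broadcast it.

```julia
# Scalar-only definition; no vector method is written.
square_plus_one(x::Number) = x^2 + 1

on_scalar = square_plus_one(3)            # plain call on a number
on_vector = square_plus_one.([1, 2, 3])   # dot call applies it elementwise
```

Here the function never needs to iterate or index its argument, which is the kind of code that no longer relies on numbers being iterable.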
It's instructive to try to patch Base to make numbers non-iterable. I'm finding various cases where removing iterability requires much uglier code. For example:
|
On the other hand, making numbers non-indexable (removing |
The converse argument: if it is so useful to make numbers iterable, maybe everything should be iterable? i.e. just define fallback |
This would be fixed by #10593 (see this Julep): you'd call |
And note that we could use #19730 to wrap all numbers in a specialized |
I increasingly think we're not going to do this. We could make a lint warning that pesters you if you write |
I run into this phenomenon often. Since it does not raise an error (e.g. a syntax error), it is hard to find |
Yes, that was one of the motivations cited in this issue when it was opened. |
I've just posted my question on the Julia Discourse (that is why I commented on this issue). It was not clear to me why a number, e.g. |
Was the conclusion that this definitely isn't happening even for a Julia 2.0? I've seen several complaints/confusions about it in various places over the past few months. |
It's gotta be rather convincing. We tried removing both iterability and indexability pre-1.0, but:
and
Some of these things have indeed changed, so it's certainly possible that the balance has shifted... but has it shifted enough? I'd bet not. It's quite a bit of churn. |
I'm generally a supporter of iterability of numbers, but I have seen people get bit by it. To play devil's advocate, would it be so bad to change
iterable(x) = x
iterable(x::Number) = (x,)
? |
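A minimal sketch of how the two `iterable` definitions quoted above might be used. The `total` consumer is hypothetical and only shows generic code iterating through the wrapper rather than relying on numbers being iterable themselves.

```julia
# The two-line wrapper from the comment above.
iterable(x) = x
iterable(x::Number) = (x,)

# Hypothetical generic consumer: always iterates via the wrapper.
total(xs) = sum(v for v in iterable(xs))

from_number = total(4)           # iterates the one-element tuple (4,)
from_vector = total([1, 2, 3])   # iterates the vector unchanged
```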
... or just toss whatever wrapper we end up using for broadcasting on a number and iterate that |
Right, that's what makes this different — that |
My experience in trying to implement even a small piece of this pre-1.0 (#19700) leads me to believe that changing this would lead to a huge amount of code churn over the whole ecosystem. i.e. it wouldn't be worth it without huge benefits, which I haven't seen anyone articulate beyond "slightly confusing to some newcomers". |
One admittedly minor problem is that I cannot use Julia to teach discrete mathematics as this goes against what I teach my students:
|
@StephenVavasis has pointed out some rather confusing behavior of the `in` operator, including: Worse still is this:
This issue is to discuss what, if anything, we can do to reduce some of this confusion.
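The snippets from the original post did not survive here, so the following is not a reconstruction of them. It is only a short sketch, in current Julia, of the behavior the thread title refers to: a number acts like a one-element collection.

```julia
n = 5

# Iterating over a number visits it exactly once.
visited = Int[]
for x in n
    push!(visited, x)
end

member = 5 in n      # `in` iterates n and compares with ==
len    = length(n)   # numbers report length 1
idx    = n[1]        # scalar indexing is also accepted
```

None of these calls raises an error, which is why a mistaken `x in n` with a numeric `n` can go unnoticed.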