Sorted containers -> new iteration protocol
 On branch newiterationsortedcontainers
 Changes to be committed:
	modified:   ../docs/src/sorted_containers.md
	modified:   DataStructures.jl
	modified:   balanced_tree.jl
	modified:   container_loops.jl
	modified:   ../test/test_sorted_containers.jl

This commit updates container_loops.jl to use the new iteration protocol
(introduced in 0.7.0-DEV). It should be backwards compatible with 0.6.2.
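
For readers unfamiliar with the new protocol, here is a rough sketch of what a `for` loop does in terms of `iterate`. The function name `sum_with_iterate` is hypothetical and chosen only for this illustration; it is not part of the commit.

```julia
# Roughly what `for x in itr; total += x; end` does under the new protocol:
# `iterate` returns either `nothing` or an (item, state) tuple.
function sum_with_iterate(itr)
    total = zero(eltype(itr))
    next = iterate(itr)              # replaces start + first next
    while next !== nothing           # replaces done
        x, state = next
        total += x
        next = iterate(itr, state)   # replaces subsequent next calls
    end
    return total
end

sum_with_iterate([10, 20, 30])
```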

In addition, it fixes a bug in container_loops.jl in which the length()
function, when applied to subranges (i.e., inclusive(a,b,c) or
exclusive(a,b,c)), returned the length of the whole container instead
of the length of the subrange. (No value should be returned for the
length of a subrange, since the data structure supports neither an O(1)
nor an O(log n) algorithm to compute it.)
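
A minimal sketch of how an iterator can declare that its length is unknown, assuming a plain sorted vector as the backing store. The type `InclusiveRange` below is hypothetical and is not the implementation in container_loops.jl; it only illustrates the `IteratorSize` trait that the commit imports.

```julia
# Hypothetical wrapper (not from DataStructures.jl): iterates over the
# entries of a sorted vector whose values lie in [lo, hi].
struct InclusiveRange{T}
    data::Vector{T}   # assumed to be sorted
    lo::T
    hi::T
end

# The subrange length is not cheaply available, so declare it unknown
# instead of defining a `length` method that could mislead callers.
Base.IteratorSize(::Type{<:InclusiveRange}) = Base.SizeUnknown()
Base.IteratorEltype(::Type{<:InclusiveRange}) = Base.HasEltype()
Base.eltype(::Type{InclusiveRange{T}}) where {T} = T

# A single `iterate` method pair covers the whole protocol; the state is
# the index of the next candidate element.
function Base.iterate(r::InclusiveRange, i::Int = searchsortedfirst(r.data, r.lo))
    (i > length(r.data) || r.data[i] > r.hi) && return nothing
    return (r.data[i], i + 1)
end

r = InclusiveRange([1, 3, 5, 7, 9], 3, 7)
collect(r)   # gathers 3, 5, 7 without ever calling length(r)
```

With `SizeUnknown`, generic functions such as `collect` grow their result incrementally rather than preallocating from `length`.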

Some other smaller changes in this commit are as follows.
 - IntSet was changed to BitSet (name change in 0.7.0-DEV)
 - Small updates to documentation
 - Some assert statements in balanced_tree.jl that were present
   during development are deleted.
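
As a small illustration of the renamed type, the following sketch uses `BitSet` to track which cells are in use, in the spirit of the `useddatacells` field. The variable name `used` is made up for this example.

```julia
used = BitSet()
push!(used, 1, 2)    # mark cells 1 and 2 as taken
delete!(used, 2)     # cell 2 is freed again
1 in used            # true
2 in used            # false
```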

Add more tests for new length and eltype methods
StephenVavasis committed May 25, 2018
1 parent a8d9477 commit 43425a7
Showing 6 changed files with 423 additions and 180 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
doc/build
docs/build/
docs/site/
.gitignore
*~

24 changes: 13 additions & 11 deletions docs/src/sorted_containers.md
@@ -71,7 +71,7 @@ then there may be a loss of performance compared to:
k,v = deref((sc,st))
```

because the former needs an extra heap allocation step for `tok`.
because the former may need an extra heap allocation step for `tok`.

The notion of token is similar to the concept of iterators used by C++
standard containers. Tokens can be explicitly advanced or regressed
@@ -406,8 +406,8 @@ past-end token. Time: O(1)
## Iteration Over Sorted Containers

As is standard in Julia, iteration over the containers is implemented
via calls to three functions, `start`, `next` and `done`. It is usual
practice, however, to call these functions implicitly with a for-loop
via calls to the function `Base.iterate`. It is usual
practice, however, to call this function implicitly with a for-loop
rather than explicitly, so they are presented here in for-loop notation.
Internally, all of these iterations are implemented with semitokens that
are advanced via the `advance` operation. Each iteration of these loops
@@ -454,11 +454,13 @@ end
```

Here, `st1` and `st2` are semitokens that refer to the container `sc`.
Token `(sc,st1)` may not be the before-start token and
token `(sc,st2)` may not be the past-end token.
It is acceptable for `(sc,st1)` to be the past-end token or `(sc,st2)`
to be the before-start token (in these cases, the body is not executed).
to be the before-start token or both (in these cases, the body is not executed).
If `compare(sc,st1,st2)==1` then the body is not executed. A second
calling format for `inclusive` is `inclusive(sc,(st1,st2))`. One purpose
for second format is so that the return value of `searchequalrange` may
calling format for `inclusive` is `inclusive(sc,(st1,st2))`. With
the second format, the return value of `searchequalrange` may
be used directly as the second argument to `inclusive`.

One can also define a loop that excludes the final item:
@@ -771,10 +773,10 @@ Lt((x,y) -> isless(lowercase(x),lowercase(y)))
The ordering object is indicated in the above list of constructors in
the `o` position (see above for constructor syntax).

This approach suffers from a performance hit (10%-50% depending on the
container) because the compiler cannot inline or compute the correct
dispatch for the function in parentheses, so the dispatch takes place at
run-time. A more complicated but higher-performance method to implement
This approach may suffer from a performance hit because
higher performance may be possible if equality is available
as well as less-than.
A more complicated but higher-performance method to implement
a custom ordering is as follows. First, the user creates a singleton
type that is a subtype of `Ordering` as follows:

@@ -798,7 +800,7 @@ container also needs an equal-to function; the default is:
eq(o::Ordering, a, b) = !lt(o, a, b) && !lt(o, b, a)
```

For a further slight performance boost, the user can also customize this
The user can also customize this
function with a more efficient implementation. In the above example, an
appropriate customization would be:

6 changes: 6 additions & 0 deletions src/DataStructures.jl
@@ -16,6 +16,12 @@ module DataStructures
union, intersect, symdiff, setdiff, issubset,
searchsortedfirst, searchsortedlast, in

if VERSION >= v"0.7.0-DEV.5126"
import Base: iterate, IteratorSize, HasLength, SizeUnknown,
IteratorEltype, HasEltype
end


using Compat
using Compat.InteractiveUtils # for methodswith
import Compat: lastindex, pushfirst!, popfirst!
28 changes: 14 additions & 14 deletions src/balanced_tree.jl
@@ -17,7 +17,7 @@
## d: the data of the node
## parent: the tree leaf that is the parent of this
## node. Parent pointers are needed in order
## to implement indices.
## to implement tokens.
## There are two constructors, the standard one (first)
## and the incomplete one (second). The incomplete constructor
## is needed because when the data structure is first created,
@@ -80,14 +80,14 @@ end


## Type BalancedTree23{K,D,Ord} is 'base class' for
## SortedDict.
## SortedDict, SortedMultiDict and SortedSet.
## K = key type, D = data type
## Key type must support an ordering operation defined by Ordering
## object Ord.
## The default is Forward which implies that the ordering function
## is isless (see ordering.jl)
## The fields are as follows.
## ord:: The ordering object. Often the ordering type
## ord: The ordering object. Often the ordering type
## is a singleton type, so this field is empty, but it
## is still necessary to direct the multiple dispatch.
## data: the (key,data) pairs of the tree.
@@ -104,10 +104,10 @@ end
## tree array (locations are freed due to deletion)
## freedatainds: Array of indices of free locations in the
## data array (locations are freed due to deletion)
## useddatacells: IntSet (i.e., bit vector) showing which
## useddatacells: BitSet (i.e., bit vector) showing which
## data cells are taken. The complementary positions are
## exactly those stored in freedatainds. This array is
## used only for error checking (only present at debug level 1 and 2)
## used only for error checking.
## deletionchild and deletionleftkey are two work-arrays
## for the delete function.

@@ -119,7 +119,7 @@ mutable struct BalancedTree23{K, D, Ord <: Ordering}
depth::Int
freetreeinds::Array{Int,1}
freedatainds::Array{Int,1}
useddatacells::IntSet
useddatacells::BitSet
# The next two arrays are used as a workspace by the delete!
# function.
deletionchild::Array{Int,1}
@@ -129,7 +129,7 @@ mutable struct BalancedTree23{K, D, Ord <: Ordering}
initializeTree!(tree1)
data1 = Vector{KDRec{K,D}}(undef, 2)
initializeData!(data1)
u1 = IntSet()
u1 = BitSet()
push!(u1, 1, 2)
new{K,D,Ord}(ord1, data1, tree1, 1, 1, Vector{Int}(), Vector{Int}(),
u1,
@@ -631,30 +631,30 @@ function compareInd(t::BalancedTree23, i1::Int, i2::Int)
i2a = i2
p1 = t.data[i1].parent
p2 = t.data[i2].parent
curdepth = t.depth
# curdepth = t.depth
while true
@assert(curdepth > 0)
# @assert(curdepth > 0)
if p1 == p2
if i1a == t.tree[p1].child1
@assert(t.tree[p1].child2 == i2a || t.tree[p1].child3 == i2a)
# @assert(t.tree[p1].child2 == i2a || t.tree[p1].child3 == i2a)
return -1
end
if i1a == t.tree[p1].child2
if (t.tree[p1].child1 == i2a)
return 1
end
@assert(t.tree[p1].child3 == i2a)
# @assert(t.tree[p1].child3 == i2a)
return -1
end
@assert(i1a == t.tree[p1].child3)
@assert(t.tree[p1].child1 == i2a || t.tree[p1].child2 == i2a)
# @assert(i1a == t.tree[p1].child3)
# @assert(t.tree[p1].child1 == i2a || t.tree[p1].child2 == i2a)
return 1
end
i1a = p1
i2a = p2
p1 = t.tree[i1a].parent
p2 = t.tree[i2a].parent
curdepth -= 1
# curdepth -= 1
end
end
