[Chunked] Add chunks(of:) variant that divides a collection in chunks of a given size #54

Merged
merged 10 commits into from
Jan 16, 2021
30 changes: 24 additions & 6 deletions Guides/Chunked.md
@@ -4,7 +4,10 @@
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/ChunkedTests.swift)]

Break a collection into subsequences where consecutive elements pass a binary
predicate, or where all elements in each chunk project to the same value.

It also includes a `chunks(ofCount:)` method that breaks a collection into
subsequences of a given `count`.

There are two variations of the `chunked` method: `chunked(by:)` and
`chunked(on:)`. `chunked(by:)` uses a binary predicate to test consecutive
@@ -26,17 +29,32 @@ let chunks = names.chunked(on: \.first!)
// [["David"], ["Kyle", "Karoy"], ["Nate"]]
```
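
To make the `chunked(on:)` behavior above concrete, here is a minimal standard-library-only sketch. The helper name `chunkedOnSketch` is hypothetical and exists only for illustration; it is eager and is not the package's implementation.

```swift
// Hypothetical helper (an assumption for illustration, not the package API):
// groups consecutive elements whose projected values are equal.
func chunkedOnSketch<C: Collection, Subject: Equatable>(
  _ base: C, on projection: (C.Element) -> Subject
) -> [C.SubSequence] {
  var result: [C.SubSequence] = []
  var start = base.startIndex
  while start != base.endIndex {
    let subject = projection(base[start])
    // Advance `end` while the projected value stays the same.
    var end = base.index(after: start)
    while end != base.endIndex, projection(base[end]) == subject {
      end = base.index(after: end)
    }
    result.append(base[start..<end])
    start = end
  }
  return result
}

let sketchNames = ["David", "Kyle", "Karoy", "Nate"]
let byInitial = chunkedOnSketch(sketchNames, on: { $0.first! })
  .map { Array($0) }
// [["David"], ["Kyle", "Karoy"], ["Nate"]]
```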

These methods are related to the [existing SE proposal][proposal] for chunking a
collection into subsequences of a particular size, potentially named something
like `chunked(length:)`. Unlike the `split` family of methods, the entire
collection is included in the chunked result — joining the resulting chunks
recreates the original collection.
The `chunks(ofCount:)` method takes a `count` parameter (required to be > 0) and
separates the collection into chunks of that count. If the count of the base
`Collection` is evenly divisible by the `count` parameter, every chunk has a
count equal to the parameter. Otherwise, the last chunk contains the
remaining elements.

```swift
let names = ["David", "Kyle", "Karoy", "Nate"]
let evenly = names.chunks(ofCount: 2)
// equivalent to [["David", "Kyle"], ["Karoy", "Nate"]]

let remaining = names.chunks(ofCount: 3)
// equivalent to [["David", "Kyle", "Karoy"], ["Nate"]]
```
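
The splitting rule described above can be sketched with the standard library alone. The function name `chunksSketch` is hypothetical, and this eager version only illustrates the rule; the method in this PR returns a lazy `ChunkedByCount` view instead of an array.

```swift
// Hypothetical eager stand-in for `chunks(ofCount:)`, for illustration only.
func chunksSketch<C: Collection>(
  _ base: C, ofCount count: Int
) -> [C.SubSequence] {
  precondition(count > 0, "Cannot chunk with count <= 0!")
  var result: [C.SubSequence] = []
  var start = base.startIndex
  while start != base.endIndex {
    // Step `count` elements forward, stopping at `endIndex` so that the
    // last chunk holds whatever elements remain.
    let end = base.index(
      start, offsetBy: count, limitedBy: base.endIndex
    ) ?? base.endIndex
    result.append(base[start..<end])
    start = end
  }
  return result
}

let chunkNames = ["David", "Kyle", "Karoy", "Nate"]
let evenlySketch = chunksSketch(chunkNames, ofCount: 2).map { Array($0) }
// [["David", "Kyle"], ["Karoy", "Nate"]]
let remainingSketch = chunksSketch(chunkNames, ofCount: 3).map { Array($0) }
// [["David", "Kyle", "Karoy"], ["Nate"]]
```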

The `chunks(ofCount:)` method is the one described in the [existing SE proposal][proposal].
Unlike the `split` family of methods, the entire collection is included in the
chunked result — joining the resulting chunks recreates the original collection.

```swift
c.elementsEqual(c.chunked(...).joined())
// true
```
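
As a rough check of the round-trip property, the following sketch builds the chunks with plain array slicing (a stand-in assumed here in place of `chunks(ofCount: 3)`, so that it runs without the package) and contrasts the result with `split`, which drops the separator element.

```swift
let c = Array(1...10)

// Stand-in for `c.chunks(ofCount: 3)`: slice the array at every third offset.
let pieces = stride(from: 0, to: c.count, by: 3).map { start in
  c[start..<min(start + 3, c.count)]
}
// Joining the chunks recreates the original collection.
let roundTrips = c.elementsEqual(pieces.joined())
// true

// `split` omits the separator, so joining its result loses an element.
let splitPieces = c.split(separator: 5)
let splitRoundTrips = c.elementsEqual(splitPieces.joined())
// false
```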

See the detailed design section of the [proposal][proposal] for more information.

[proposal]: https://github.com/apple/swift-evolution/pull/935

## Detailed Design
2 changes: 1 addition & 1 deletion README.md
@@ -34,7 +34,7 @@ Read more about the package, and the intent behind it, in the [announcement on s

#### Other useful operations

- [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.
- [`chunked(by:)`, `chunked(on:)`, `chunks(ofCount:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on a binary predicate, on when the result of a projection changes, or into chunks of a given count.
- [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements.
- [`trimming(where:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start and end.

304 changes: 304 additions & 0 deletions Sources/Algorithms/Chunked.swift
@@ -246,3 +246,307 @@ extension Collection {
try chunked(on: projection, by: ==)
}
}

//===----------------------------------------------------------------------===//
// chunks(ofCount:)
//===----------------------------------------------------------------------===//

/// A collection that presents the elements of its base collection
/// in `SubSequence` chunks of any given count.
///
/// A `ChunkedByCount` is a lazy view on the base Collection, but it does not implicitly confer
/// laziness on algorithms applied to its result. In other words, for ordinary collections `c`:
///
/// * `c.chunks(ofCount: 3)` does not create new storage
/// * `c.chunks(ofCount: 3).map(f)` maps eagerly and returns a new array
/// * `c.lazy.chunks(ofCount: 3).map(f)` maps lazily and returns a `LazyMapCollection`
public struct ChunkedByCount<Base: Collection> {

public typealias Element = Base.SubSequence

@usableFromInline
internal let base: Base

@usableFromInline
internal let chunkCount: Int

@usableFromInline
internal var startUpperBound: Base.Index

/// Creates a view instance that presents the elements of `base`
/// in `SubSequence` chunks of the given count.
///
/// - Complexity: O(n)
@inlinable
internal init(_base: Base, _chunkCount: Int) {
self.base = _base
self.chunkCount = _chunkCount

// Compute the upper bound of the first chunk up front so that
// `startIndex` is an O(1) lookup.
self.startUpperBound = _base.index(
_base.startIndex, offsetBy: _chunkCount,
limitedBy: _base.endIndex
) ?? _base.endIndex
}
}

extension ChunkedByCount: Collection {
public struct Index {
@usableFromInline
internal let baseRange: Range<Base.Index>

@usableFromInline
internal init(_baseRange: Range<Base.Index>) {
self.baseRange = _baseRange
}
}

/// - Complexity: O(1)
@inlinable
public var startIndex: Index {
Index(_baseRange: base.startIndex..<startUpperBound)
}
@inlinable
public var endIndex: Index {
Index(_baseRange: base.endIndex..<base.endIndex)
}

/// - Complexity: O(1)
public subscript(i: Index) -> Element {
precondition(i < endIndex, "Index out of range")
return base[i.baseRange]
}

@inlinable
public func index(after i: Index) -> Index {
precondition(i < endIndex, "Advancing past end index")
let baseIdx = base.index(
i.baseRange.upperBound, offsetBy: chunkCount,
limitedBy: base.endIndex
) ?? base.endIndex
return Index(_baseRange: i.baseRange.upperBound..<baseIdx)
}
}

extension ChunkedByCount.Index: Comparable {
@inlinable
public static func == (lhs: ChunkedByCount.Index,
rhs: ChunkedByCount.Index) -> Bool {
lhs.baseRange.lowerBound == rhs.baseRange.lowerBound
}

@inlinable
public static func < (lhs: ChunkedByCount.Index,
rhs: ChunkedByCount.Index) -> Bool {
lhs.baseRange.lowerBound < rhs.baseRange.lowerBound
}
}

extension ChunkedByCount:
BidirectionalCollection, RandomAccessCollection
where Base: RandomAccessCollection {
@inlinable
public func index(before i: Index) -> Index {
precondition(i > startIndex, "Advancing past start index")

var offset = chunkCount
if i.baseRange.lowerBound == base.endIndex {
let remainder = base.count % chunkCount
if remainder != 0 {
offset = remainder
}
}

let baseIdx = base.index(
i.baseRange.lowerBound, offsetBy: -offset,
limitedBy: base.startIndex
) ?? base.startIndex
return Index(_baseRange: baseIdx..<i.baseRange.lowerBound)
}
}

extension ChunkedByCount {
@inlinable
public func distance(from start: Index, to end: Index) -> Int {
let distance =
base.distance(from: start.baseRange.lowerBound,
to: end.baseRange.lowerBound)
let (quotient, remainder) =
distance.quotientAndRemainder(dividingBy: chunkCount)
return quotient + remainder.signum()
}

@inlinable
public var count: Int {
let (quotient, remainder) =
base.count.quotientAndRemainder(dividingBy: chunkCount)
return quotient + remainder.signum()
}

@inlinable
public func index(
_ i: Index, offsetBy offset: Int, limitedBy limit: Index
) -> Index? {
guard offset != 0 else { return i }
guard limit != i else { return nil }

if offset > 0 {
return limit > i
? offsetForward(i, offsetBy: offset, limit: limit)
: offsetForward(i, offsetBy: offset)
} else {
return limit < i
? offsetBackward(i, offsetBy: offset, limit: limit)
: offsetBackward(i, offsetBy: offset)
}
}

@inlinable
public func index(_ i: Index, offsetBy distance: Int) -> Index {
guard distance != 0 else { return i }

let idx = distance > 0
? offsetForward(i, offsetBy: distance)
: offsetBackward(i, offsetBy: distance)
guard let index = idx else {
fatalError("Out of bounds")
}
return index
}

@usableFromInline
internal func offsetForward(
_ i: Index, offsetBy distance: Int, limit: Index? = nil
) -> Index? {
assert(distance > 0)

return makeOffsetIndex(
from: i, baseBound: base.endIndex,
distance: distance, baseDistance: distance * chunkCount,
limit: limit, by: >
)
}

// Convenience to compute offset backward base distance.
@inline(__always)
private func computeOffsetBackwardBaseDistance(
_ i: Index, _ distance: Int
) -> Int {
if i == endIndex {
let remainder = base.count % chunkCount
// We have to take it into account when calculating offsets.
if remainder != 0 {
// Distance "minus" one (at this point distance is negative)
// because we need to adjust for the last position, which has
// a variable (remainder) number of elements.
return ((distance + 1) * chunkCount) - remainder
}
}
return distance * chunkCount
}

@usableFromInline
internal func offsetBackward(
_ i: Index, offsetBy distance: Int, limit: Index? = nil
) -> Index? {
assert(distance < 0)
let baseDistance =
computeOffsetBackwardBaseDistance(i, distance)
return makeOffsetIndex(
from: i, baseBound: base.startIndex,
distance: distance, baseDistance: baseDistance,
limit: limit, by: <
)
}

// Helper to compute index(offsetBy:) index.
@inline(__always)
private func makeOffsetIndex(
from i: Index, baseBound: Base.Index, distance: Int, baseDistance: Int,
limit: Index?, by limitFn: (Base.Index, Base.Index) -> Bool
) -> Index? {
let baseIdx = base.index(
i.baseRange.lowerBound, offsetBy: baseDistance,
limitedBy: baseBound
)

if let limit = limit {
if baseIdx == nil {
// If we passed the bounds while advancing forward and the
// limit is the `endIndex`, the computation on base doesn't
// take the remainder into account, so we have to make sure
// that passing the bound was because of the distance and
// not just because of a remainder. Special casing is less
// expensive than always using count (which could be O(n) for
// a non-random-access base collection) to compute the base
// distance taking the remainder into account.
if baseDistance > 0 && limit == endIndex {
if self.distance(from: i, to: limit) < distance {
return nil
}
} else {
return nil
}
}

// Checks for the limit.
let baseStartIdx = baseIdx ?? baseBound
if limitFn(baseStartIdx, limit.baseRange.lowerBound) {
return nil
}
}

let baseStartIdx = baseIdx ?? baseBound
let baseEndIdx = base.index(
baseStartIdx, offsetBy: chunkCount, limitedBy: base.endIndex
) ?? base.endIndex

return Index(_baseRange: baseStartIdx..<baseEndIdx)
}
}

extension Collection {
/// Returns a `ChunkedByCount<Self>` view presenting the elements
/// in chunks whose count is the given `count` parameter.
///
/// - Parameter count: The size of the chunks. If the count of the base
/// `Collection` is evenly divisible by `count`, all the chunks will
/// have a count equal to `count`.
/// Otherwise, the last chunk will contain the remaining elements.
///
/// let c = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
/// print(c.chunks(ofCount: 5).map(Array.init))
/// // [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
///
/// print(c.chunks(ofCount: 3).map(Array.init))
/// // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
///
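/// - Complexity: O(1) if the collection conforms to
/// `RandomAccessCollection`, otherwise O(*k*), where *k* is equal to
/// `count`, because the first chunk's upper bound is computed up front.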
@inlinable
public func chunks(ofCount count: Int) -> ChunkedByCount<Self> {
precondition(count > 0, "Cannot chunk with count <= 0!")
return ChunkedByCount(_base: self, _chunkCount: count)
}
}

// Conditional conformances.
extension ChunkedByCount: Equatable where Base: Equatable {}

// Since we have another stored property of type `Index` on the
// collection, synthesis of `Hashable` conformance would require
// a `Base.Index: Hashable` constraint, so we implement the hasher
// only in terms of `base`. Since the computed index is based on it,
// it should not make a difference here.
extension ChunkedByCount: Hashable where Base: Hashable {
public func hash(into hasher: inout Hasher) {
hasher.combine(base)
}
}
extension ChunkedByCount.Index: Hashable where Base.Index: Hashable {}

// Lazy conditional conformance.
extension ChunkedByCount: LazySequenceProtocol
where Base: LazySequenceProtocol {}
extension ChunkedByCount: LazyCollectionProtocol
where Base: LazyCollectionProtocol {}
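
The index arithmetic in `count` and `computeOffsetBackwardBaseDistance` above can be checked in isolation with plain integers. The sketch below mirrors those two computations; the free functions `chunkCountSketch` and `backwardBaseDistanceFromEnd` are hypothetical names introduced only for this illustration.

```swift
// Number of chunks: one per full chunk, plus one more if any elements remain.
// Mirrors `quotient + remainder.signum()` from the `count` property.
func chunkCountSketch(baseCount: Int, chunkSize: Int) -> Int {
  let (quotient, remainder) =
    baseCount.quotientAndRemainder(dividingBy: chunkSize)
  return quotient + remainder.signum()
}

// Distance to move in the base collection when stepping `distance` chunks
// backward (distance < 0) from the chunked collection's `endIndex`.
// The last chunk may be short, so it contributes `remainder` elements
// instead of a full `chunkSize`.
func backwardBaseDistanceFromEnd(
  baseCount: Int, chunkSize: Int, distance: Int
) -> Int {
  let remainder = baseCount % chunkSize
  if remainder != 0 {
    return (distance + 1) * chunkSize - remainder
  }
  return distance * chunkSize
}

// Ten elements in chunks of three: [1,2,3], [4,5,6], [7,8,9], [10].
let sketchChunkTotal = chunkCountSketch(baseCount: 10, chunkSize: 3)
// 4
let oneBack = backwardBaseDistanceFromEnd(
  baseCount: 10, chunkSize: 3, distance: -1)
// -1: the last chunk starts one element before the end
let twoBack = backwardBaseDistanceFromEnd(
  baseCount: 10, chunkSize: 3, distance: -2)
// -4: the chunk [7, 8, 9] starts four elements before the end
```

When the base count divides evenly, the remainder is zero and the backward distance reduces to `distance * chunkSize`, which is the non-special-cased path in the source.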