[Chunked] Add chunks(of:) variant that divides a collection in chunks of a given size #54

Merged
merged 10 commits into from
Jan 16, 2021
30 changes: 24 additions & 6 deletions Guides/Chunked.md
@@ -4,7 +4,10 @@
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/ChunkedTests.swift)]

Break a collection into subsequences where consecutive elements pass a binary
predicate, or where all elements in each chunk project to the same value.

Also includes a `chunks(ofCount:)` method that breaks a collection into
subsequences of a given `count`.

There are two variations of the `chunked` method: `chunked(by:)` and
`chunked(on:)`. `chunked(by:)` uses a binary predicate to test consecutive
@@ -26,17 +29,32 @@ let chunks = names.chunked(on: \.first!)
// [["David"], ["Kyle", "Karoy"], ["Nate"]]
```

These methods are related to the [existing SE proposal][proposal] for chunking a
collection into subsequences of a particular size, potentially named something
like `chunked(length:)`. Unlike the `split` family of methods, the entire
collection is included in the chunked result — joining the resulting chunks
recreates the original collection.
The `chunks(ofCount:)` method takes a `count` parameter (required to be > 0) and
separates the collection into chunks of that count. If the count of the base
`Collection` is evenly divisible by the `count` parameter, every chunk has a
count equal to the parameter. Otherwise, the last chunk contains the
remaining elements.

```swift
let names = ["David", "Kyle", "Karoy", "Nate"]
let evenly = names.chunks(ofCount: 2)
// equivalent to [["David", "Kyle"], ["Karoy", "Nate"]]

let remaining = names.chunks(ofCount: 3)
// equivalent to [["David", "Kyle", "Karoy"], ["Nate"]]
```

The `chunks(ofCount:)` method is the one described in the [existing SE proposal][proposal].
Unlike the `split` family of methods, the entire collection is included in the
chunked result — joining the resulting chunks recreates the original collection.

```swift
c.elementsEqual(c.chunked(...).joined())
// true
```

Check the [proposal][proposal] detailed design section for more info.
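
As a rough illustration of the behavior described above, the following stdlib-only sketch chunks an array eagerly. The `eagerChunks(ofCount:)` helper is a hypothetical name invented for this example; it is not the `chunks(ofCount:)` API, which returns a lazy view rather than an array of slices.

```swift
// Hypothetical stdlib-only helper for illustration; it is NOT the
// `chunks(ofCount:)` API, which returns a lazy view over the base.
extension Array {
  func eagerChunks(ofCount count: Int) -> [ArraySlice<Element>] {
    precondition(count > 0, "Cannot chunk with count <= 0!")
    // Step through the array in strides of `count`. The final slice is
    // clamped to `endIndex`, so it holds whatever elements remain.
    return stride(from: startIndex, to: endIndex, by: count).map { start in
      self[start..<Swift.min(start + count, endIndex)]
    }
  }
}

let names = ["David", "Kyle", "Karoy", "Nate"]
let evenly = names.eagerChunks(ofCount: 2)
let remaining = names.eagerChunks(ofCount: 3)

print(evenly.map { Array($0) })
// [["David", "Kyle"], ["Karoy", "Nate"]]
print(remaining.map { Array($0) })
// [["David", "Kyle", "Karoy"], ["Nate"]]

// Joining the chunks recreates the original collection.
print(names.elementsEqual(remaining.joined()))
// true
```

The slices share storage with the original array, so only the outer array of slices is allocated here.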

[proposal]: https://github.com/apple/swift-evolution/pull/935

## Detailed Design
174 changes: 174 additions & 0 deletions Sources/Algorithms/Chunked.swift
@@ -246,3 +246,177 @@ extension Collection {
try chunked(on: projection, by: ==)
}
}

//===----------------------------------------------------------------------===//
// chunks(ofCount:)
//===----------------------------------------------------------------------===//

/// A collection that presents the elements of its base collection
/// in `SubSequence` chunks of any given count.
///
/// A `ChunkedByCount` is a lazy view on the base Collection, but it does not implicitly confer
/// laziness on algorithms applied to its result. In other words, for ordinary collections `c`:
///
/// * `c.chunks(ofCount: 3)` does not create new storage
/// * `c.chunks(ofCount: 3).map(f)` maps eagerly and returns a new array
/// * `c.lazy.chunks(ofCount: 3).map(f)` maps lazily and returns a `LazyMapCollection`
public struct ChunkedByCount<Base: Collection> {

public typealias Element = Base.SubSequence

@usableFromInline
internal let base: Base

@usableFromInline
internal let chunkCount: Int

@usableFromInline
internal var computedStartIndex: Index

/// Creates a view instance that presents the elements of `base`
/// in `SubSequence` chunks of the given count.
///
/// - Complexity: O(n)
@inlinable
internal init(_base: Base, _chunkCount: Int) {
self.base = _base
self.chunkCount = _chunkCount

// Compute the start index upfront in order to make
// start index a O(1) lookup.
let baseEnd = _base.index(
_base.startIndex, offsetBy: _chunkCount,
limitedBy: _base.endIndex
) ?? _base.endIndex

self.computedStartIndex =
Index(_baseRange: _base.startIndex..<baseEnd)
}
}

extension ChunkedByCount: Collection {
public struct Index {
@usableFromInline
internal let baseRange: Range<Base.Index>

@usableFromInline
internal init(_baseRange: Range<Base.Index>) {
self.baseRange = _baseRange
}
}

/// - Complexity: O(1)
public var startIndex: Index { computedStartIndex }
public var endIndex: Index {
Index(_baseRange: base.endIndex..<base.endIndex)
}

/// - Complexity: O(1)
public subscript(i: Index) -> Element {
base[i.baseRange]
}

@inlinable
public func index(after i: Index) -> Index {
let baseIdx = base.index(
i.baseRange.upperBound, offsetBy: chunkCount,
limitedBy: base.endIndex
) ?? base.endIndex
return Index(_baseRange: i.baseRange.upperBound..<baseIdx)
}
}

extension ChunkedByCount.Index: Comparable {
@inlinable
public static func < (lhs: ChunkedByCount.Index,
rhs: ChunkedByCount.Index) -> Bool {
lhs.baseRange.lowerBound < rhs.baseRange.lowerBound
}
}

extension ChunkedByCount:
BidirectionalCollection, RandomAccessCollection
where Base: RandomAccessCollection {
@inlinable
public func index(before i: Index) -> Index {
var offset = chunkCount
if i.baseRange.lowerBound == base.endIndex {
let remainder = base.count % chunkCount
if remainder != 0 {
offset = remainder
}
}

let baseIdx = base.index(
i.baseRange.lowerBound, offsetBy: -offset,
limitedBy: base.startIndex
) ?? base.startIndex
return Index(_baseRange: baseIdx..<i.baseRange.lowerBound)
}

@inlinable
public func distance(from start: Index, to end: Index) -> Int {
let distance =
base.distance(from: start.baseRange.lowerBound,
to: end.baseRange.lowerBound)
let (quotient, remainder) =
distance.quotientAndRemainder(dividingBy: chunkCount)
// Increment should account for negative distances.
if remainder < 0 {
return quotient - 1
}
return quotient + (remainder == 0 ? 0 : 1)
}

@inlinable
public var count: Int {
let (quotient, remainder) =
base.count.quotientAndRemainder(dividingBy: chunkCount)
return quotient + (remainder == 0 ? 0 : 1)
}
}

extension Collection {
/// Returns a `ChunkedByCount<Self>` view presenting the elements
/// in chunks of the given count.
///
/// - Parameter count: The size of the chunks. If the count of the base
///   `Collection` is evenly divisible by `count`, all the chunks will
///   have a count equal to `count`.
///   Otherwise, the last chunk will contain the remaining elements.
///
/// let c = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
/// print(c.chunks(ofCount: 5).map(Array.init))
/// // [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
///
/// print(c.chunks(ofCount: 3).map(Array.init))
/// // [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
///
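/// - Complexity: O(1) if the collection conforms to
///   `RandomAccessCollection`; otherwise O(*k*), where *k* is `count`.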
@inlinable
public func chunks(ofCount count: Int) -> ChunkedByCount<Self> {
precondition(count > 0, "Cannot chunk with count <= 0!")
return ChunkedByCount(_base: self, _chunkCount: count)
}
}

// Conditional conformances.
extension ChunkedByCount: Equatable where Base: Equatable {}

// Since we have another stored property of type `Index` on the
// collection, synthesis of the `Hashable` conformance would require
// a `Base.Index: Hashable` constraint, so we implement the hasher
// only in terms of `base`. Since the computed index is derived from it,
// leaving it out should not make a difference here.
extension ChunkedByCount: Hashable where Base: Hashable {
public func hash(into hasher: inout Hasher) {
hasher.combine(base)
}
}
extension ChunkedByCount.Index: Hashable where Base.Index: Hashable {}

// Lazy conditional conformance.
extension ChunkedByCount: LazySequenceProtocol
where Base: LazySequenceProtocol {}
extension ChunkedByCount: LazyCollectionProtocol
where Base: LazyCollectionProtocol {}
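
To make the lazy-view idea in the source above easier to follow, here is a simplified, stdlib-only sketch. `ChunkSequenceSketch` is a hypothetical type invented for this note; unlike `ChunkedByCount` it is only a `Sequence`, with no indices and no bidirectional traversal. It does use the same `index(_:offsetBy:limitedBy:)` clamping to produce a short final chunk.

```swift
// Hypothetical, simplified stand-in for `ChunkedByCount`: a Sequence
// (not a Collection) that slices `base` on demand.
struct ChunkSequenceSketch<Base: Collection>: Sequence {
  let base: Base
  let chunkCount: Int

  struct Iterator: IteratorProtocol {
    let base: Base
    let chunkCount: Int
    var lower: Base.Index

    mutating func next() -> Base.SubSequence? {
      guard lower != base.endIndex else { return nil }
      // Advance by `chunkCount`; if that would pass `endIndex`, the call
      // returns nil and we clamp, which yields the shorter last chunk.
      let upper = base.index(
        lower, offsetBy: chunkCount, limitedBy: base.endIndex
      ) ?? base.endIndex
      let chunk = base[lower..<upper]
      lower = upper
      return chunk
    }
  }

  func makeIterator() -> Iterator {
    Iterator(base: base, chunkCount: chunkCount, lower: base.startIndex)
  }
}

// Works on a non-random-access collection such as String.
let text = "abcdefgh"
let pieces = ChunkSequenceSketch(base: text, chunkCount: 3).map { String($0) }
print(pieces)
// ["abc", "def", "gh"]
```

Each chunk is a `SubSequence` of the base, so no element storage is copied until the caller converts a chunk (here, to `String`).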
63 changes: 63 additions & 0 deletions Tests/SwiftAlgorithmsTests/ChunkedTests.swift
@@ -76,4 +76,67 @@ final class ChunkedTests: XCTestCase {
XCTAssertLazySequence(fruits.lazy.chunked(by: { $0.first == $1.first }))
XCTAssertLazySequence(fruits.lazy.chunked(on: { $0.first }))
}


//===----------------------------------------------------------------------===//
// Tests for `chunks(ofCount:)`
//===----------------------------------------------------------------------===//
func testChunksOfCount() {
XCTAssertEqualSequences([Int]().chunks(ofCount: 1), [])
XCTAssertEqualSequences([Int]().chunks(ofCount: 5), [])

let collection = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
XCTAssertEqualSequences(collection.chunks(ofCount: 1),
[[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
XCTAssertEqualSequences(collection.chunks(ofCount: 3),
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]])
XCTAssertEqualSequences(collection.chunks(ofCount: 5),
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
XCTAssertEqualSequences(collection.chunks(ofCount: 11),
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
}

func testChunksOfCountBidirectional() {
let collection = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

XCTAssertEqualSequences(collection.chunks(ofCount: 1).reversed(),
[[10], [9], [8], [7], [6], [5], [4], [3], [2], [1]])
XCTAssertEqualSequences(collection.chunks(ofCount: 3).reversed(),
[[10], [7, 8, 9], [4, 5, 6], [1, 2, 3]])
XCTAssertEqualSequences(collection.chunks(ofCount: 5).reversed(),
[[6, 7, 8, 9, 10], [1, 2, 3, 4, 5]])
XCTAssertEqualSequences(collection.chunks(ofCount: 11).reversed(),
[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
}

func testChunksOfCountCount() {
XCTAssertEqual([Int]().chunks(ofCount: 1).count, 0)
XCTAssertEqual([Int]().chunks(ofCount: 5).count, 0)

let collection = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
XCTAssertEqual(collection.chunks(ofCount: 1).count, 10)
XCTAssertEqual(collection.chunks(ofCount: 3).count, 4)
XCTAssertEqual(collection.chunks(ofCount: 5).count, 2)
XCTAssertEqual(collection.chunks(ofCount: 11).count, 1)
}

func testEmptyChunksTraversal() {
let emptyChunks = [Int]().chunks(ofCount: 1)

validateIndexTraversals(emptyChunks)
}

func testChunksOfCountTraversal() {
let collection = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let chunks = collection.chunks(ofCount: 2)

validateIndexTraversals(chunks)
}

func testChunksOfCountWithRemainderTraversal() {
let collection = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let chunks = collection.chunks(ofCount: 3)

validateIndexTraversals(chunks)
}
}
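
The expected counts in the tests above follow from ceiling division. The sketch below mirrors the arithmetic used by `count` and `distance(from:to:)` as free functions; `chunkTotal` and `chunkDistance` are hypothetical names chosen for this illustration and are not part of the PR.

```swift
// Hypothetical free functions mirroring the arithmetic in `count` and
// `distance(from:to:)`; the names are illustrative only.
func chunkTotal(baseCount: Int, chunkCount: Int) -> Int {
  let (quotient, remainder) =
    baseCount.quotientAndRemainder(dividingBy: chunkCount)
  // A nonzero remainder means one extra, shorter chunk at the end.
  return quotient + (remainder == 0 ? 0 : 1)
}

func chunkDistance(baseDistance: Int, chunkCount: Int) -> Int {
  let (quotient, remainder) =
    baseDistance.quotientAndRemainder(dividingBy: chunkCount)
  // Integer division truncates toward zero, so for a negative distance
  // the partial chunk is accounted for by subtracting one.
  if remainder < 0 { return quotient - 1 }
  return quotient + (remainder == 0 ? 0 : 1)
}

// Ten base elements, as in the test collection.
for size in [1, 3, 5, 11] {
  print(size, chunkTotal(baseCount: 10, chunkCount: size))
}
// 1 10
// 3 4
// 5 2
// 11 1

// Walking back from the end index to the start index with chunks of 3.
print(chunkDistance(baseDistance: -10, chunkCount: 3))
// -4
```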