Add BidirectionalCollection.trimming #4

Merged
merged 1 commit into from
Oct 30, 2020
114 changes: 114 additions & 0 deletions Guides/Trim.md
@@ -0,0 +1,114 @@
# Trim

[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/Trim.swift) |
[Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/TrimTests.swift)]

Returns a `SubSequence` formed by discarding all elements at the start and end of the collection
which satisfy the given predicate.

This example uses `trimming(where:)` to get a substring without the whitespace at the beginning and end of a string, and an array slice without the even numbers at either end.

```swift
let myString = " hello, world "
print(myString.trimming(where: \.isWhitespace)) // "hello, world"

let results = [2, 10, 11, 15, 20, 21, 100].trimming(where: { $0.isMultiple(of: 2) })
print(results) // [11, 15, 20, 21]
```

## Detailed Design

A new method is added to `BidirectionalCollection`:

```swift
extension BidirectionalCollection {

public func trimming(where predicate: (Element) throws -> Bool) rethrows -> SubSequence
}
```

This method requires `BidirectionalCollection` for an efficient implementation which visits as few elements as possible.

A less-efficient implementation is _possible_ for any `Collection`, which would involve always traversing the
entire collection. This implementation is not provided, as it would mean developers of generic algorithms who forget
to add the `BidirectionalCollection` constraint will receive that inefficient implementation:

```swift
func myAlgorithm<Input>(input: Input) where Input: Collection {

let trimmedInput = input.trimming(where: { ... }) // Uses least-efficient implementation.
}

func myAlgorithm2<Input>(input: Input) where Input: BidirectionalCollection {

let trimmedInput = input.trimming(where: { ... }) // Uses most-efficient implementation.
}
```

Swift provides the `BidirectionalCollection` protocol for marking types which support reverse traversal,
and generic types and algorithms which want to make use of that should add it to their constraints.
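
For illustration, a forward-only version might look like the sketch below. The name `forwardOnlyTrimming(where:)` is hypothetical and is not part of this change; the point is that, without `index(before:)`, the end of the slice can only be found by visiting every element after the first one kept.

```swift
extension Collection {
  // Hypothetical sketch; this method is NOT provided by the package.
  func forwardOnlyTrimming(
    where predicate: (Element) throws -> Bool
  ) rethrows -> SubSequence {
    // Find the first element that should be kept.
    guard let sliceStart = try firstIndex(where: { try !predicate($0) }) else {
      // Every element matched the predicate (or the collection is empty).
      return self[endIndex..<endIndex]
    }
    // Walk forward through all remaining elements, remembering the position
    // just past the most recent element that should be kept.
    var sliceEnd = index(after: sliceStart)
    var current = sliceEnd
    while current != endIndex {
      let next = index(after: current)
      if try !predicate(self[current]) {
        sliceEnd = next
      }
      current = next
    }
    return self[sliceStart..<sliceEnd]
  }
}

let trimmed = [2, 10, 11, 15, 20, 21, 100].forwardOnlyTrimming { $0.isMultiple(of: 2) }
print(Array(trimmed)) // [11, 15, 20, 21]
```

The predicate runs on every element here, even when only one element at the tail needs to be discarded.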

### Complexity

Calling this method is O(_n_), where _n_ is the length of the collection.
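
One way to see this bound is to count predicate calls. The sketch below uses a hypothetical stand-in, `trimmedSketch(where:)`, built only from the standard library's `firstIndex(where:)` and `lastIndex(where:)`. It is assumed to visit elements in the same order as the method added here, but it is not the package's implementation.

```swift
extension BidirectionalCollection {
  // Hypothetical stand-in built from standard library calls only.
  func trimmedSketch(
    where predicate: (Element) throws -> Bool
  ) rethrows -> SubSequence {
    // Scan from the front until an element should be kept.
    let start = try firstIndex(where: { try !predicate($0) }) ?? endIndex
    // Scan the remainder from the back until an element should be kept.
    let tail = self[start...]
    guard let lastKept = try tail.lastIndex(where: { try !predicate($0) }) else {
      return self[start..<start] // Everything was trimmed.
    }
    return self[start...lastKept]
  }
}

// Counts how many times the predicate runs while trimming even numbers.
func predicateCalls(trimming numbers: [Int]) -> Int {
  var calls = 0
  _ = numbers.trimmedSketch { value in
    calls += 1
    return value.isMultiple(of: 2)
  }
  return calls
}

print(predicateCalls(trimming: [2, 10, 11, 15, 20, 21, 100])) // 5: stops at 11 and at 21
print(predicateCalls(trimming: [1, 3, 5]))                    // 2: one call at each end
print(predicateCalls(trimming: [2, 4, 6, 8]))                 // 4: every element is visited
```

Only the all-match case touches every element; when little needs trimming, the interior of the collection is never inspected.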

### Naming

The name `trim` has precedent in other programming languages. Another popular alternative might be `strip`.

| Example usage | Languages |
|-|-|
| ''String''.Trim([''chars'']) | C#, VB.NET, Windows PowerShell |
| ''string''.strip(); | D |
| (.trim ''string'') | Clojure |
| ''sequence'' [ predicate? ] trim | Factor |
| (string-trim '(#\Space #\Tab #\Newline) ''string'') | Common Lisp |
| (string-trim ''string'') | Scheme |
| ''string''.trim() | Java, JavaScript (1.8.1+), Rust |
| Trim(''String'') | Pascal, QBasic, Visual Basic, Delphi |
| ''string''.strip() | Python |
| strings.Trim(''string'', ''chars'') | Go |
| LTRIM(RTRIM(''String'')) | Oracle SQL, T-SQL |
| string:strip(''string'' [,''option'', ''char'']) | Erlang |
| ''string''.strip or ''string''.lstrip or ''string''.rstrip | Ruby |
| trim(''string'') | PHP, Raku |
| [''string'' stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] | Objective-C/Cocoa |
| ''string'' withBlanksTrimmed ''string'' withoutSpaces ''string'' withoutSeparators | Smalltalk |
| string trim ''$string'' | Tcl |
| TRIM(''string'') or TRIM(ADJUSTL(''string'')) | Fortran |
| TRIM(''string'') | SQL |
| String.trim ''string'' | OCaml 4+ |

Note: This is an abbreviated list from Wikipedia. [Full table](https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(string_functions)#trim)

The standard library includes a variety of methods which perform similar operations:

- Firstly, there are `dropFirst(Int)` and `dropLast(Int)`. These return slices but do not support user-defined predicates.
If the collection's `count` is less than the number of elements to drop, they return an empty slice.
- Secondly, there is `drop(while:)`, which also returns a slice and is equivalent to a 'left-trim' (trimming from the head but not the tail).
If the entire collection is dropped, this method returns an empty slice.
- Thirdly, there are `removeFirst(Int)` and `removeLast(Int)`, which do not return slices and actually mutate the collection.
If the collection's `count` is less than the number of elements to remove, these methods trigger a runtime error.
- Lastly, there are the `popFirst()` and `popLast()` methods, which work like `removeFirst()` and `removeLast()`,
except they do not trigger a runtime error for empty collections.
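
The differences listed above can be seen in a short sketch that uses only standard library calls. The sample values are arbitrary.

```swift
let numbers = [2, 4, 5, 6, 8]

// dropFirst(_:) and dropLast(_:) return slices and tolerate over-long counts.
let withoutFirstTwo = numbers.dropFirst(2)   // [5, 6, 8]
let overDropped = numbers.dropLast(10)       // [] (no runtime error)

// drop(while:) is a 'left-trim': the trailing 6 and 8 are kept.
let leftTrimmed = numbers.drop(while: { $0.isMultiple(of: 2) })  // [5, 6, 8]

// removeFirst(_:) mutates the collection in place and returns nothing.
var mutable = numbers
mutable.removeFirst(2)                       // mutable is now [5, 6, 8]
// mutable.removeFirst(10) would trigger a runtime error at this point.

// popLast() returns nil for an empty collection instead of trapping.
var empty: [Int] = []
let nothing = empty.popLast()                // nil

// popFirst() needs a collection that is its own SubSequence, such as ArraySlice.
var slice = numbers[...]
let first = slice.popFirst()                 // Optional(2); slice is now [4, 5, 6, 8]
```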

The closest neighbours to this function would be the `drop` family of methods. Unfortunately, unlike `dropFirst(Int)`,
the name `drop(while:)` does not specify which end(s) of the collection it operates on. Moreover, one could easily
mistake code such as:

```swift
let result = myString.drop(while: \.isWhitespace)
```

for a lazy filter that drops _all_ whitespace characters, regardless of where they are in the string.
Besides that, the root `trim` leads to clearer, more concise code, which is more aligned with other programming
languages:

```swift
// Does `result` contain the input, trimmed of certain elements?
// Or does this code mutate `input` in-place and return the elements which were dropped?
let result = input.dropFromBothEnds(where: { ... })

// No such ambiguity here.
let result = input.trimming(where: { ... })
```
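
To make the earlier concern about `drop(while:)` concrete, this short sketch contrasts it with `filter`, using only standard library calls. The input string is made up for illustration.

```swift
let padded = "  hello, world  "

// drop(while:) removes the leading run only; the trailing whitespace is kept.
let headTrimmed = padded.drop(while: \.isWhitespace)
print("[\(headTrimmed)]")  // [hello, world  ]

// filter removes every whitespace character, including the one inside the text.
let filtered = padded.filter { !$0.isWhitespace }
print("[\(filtered)]")     // [hello,world]
```

Neither result matches what a reader would expect from a trim, which removes matching elements from both ends and nothing from the middle.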
1 change: 1 addition & 0 deletions README.md
@@ -32,6 +32,7 @@ Read more about the package, and the intent behind it, in the [announcement on s

- [`chunked(by:)`, `chunked(on:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Chunked.md): Eager and lazy operations that break a collection into chunks based on either a binary predicate or when the result of a projection changes.
- [`indexed()`](https://github.com/apple/swift-algorithms/blob/main/Guides/Indexed.md): Iterate over tuples of a collection's indices and elements.
- [`trimming(where:)`](https://github.com/apple/swift-algorithms/blob/main/Guides/Trim.md): Returns a slice by trimming elements from a collection's start and end.


## Adding Swift Algorithms as a Dependency
50 changes: 50 additions & 0 deletions Sources/Algorithms/Trim.swift
@@ -0,0 +1,50 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

extension BidirectionalCollection {

/// Returns a `SubSequence` formed by discarding all elements at the start and end of the collection
/// which satisfy the given predicate.
///
/// This example uses `trimming(where:)` to get a substring without the white space at the
/// beginning and end of the string:
///
/// ```
/// let myString = " hello, world "
/// print(myString.trimming(where: \.isWhitespace)) // "hello, world"
/// ```
///
/// - parameters:
/// - predicate: A closure which determines if the element should be omitted from the
/// resulting slice.
///
/// - complexity: `O(n)`, where `n` is the length of this collection.
///
@inlinable
public func trimming(
where predicate: (Element) throws -> Bool
) rethrows -> SubSequence {

// Consume elements from the front.
let sliceStart = try firstIndex { try predicate($0) == false } ?? endIndex
// sliceEnd is the index _after_ the last index to match the predicate.
var sliceEnd = endIndex
while sliceStart != sliceEnd {
let idxBeforeSliceEnd = index(before: sliceEnd)
guard try predicate(self[idxBeforeSliceEnd]) else {
return self[sliceStart..<sliceEnd]
Contributor:

Q: can these bounds be safely unchecked too?

@karwa (Contributor Author), Oct 28, 2020:

I'm not 100% sure. Since sliceEnd is formed by walking backwards from endIndex and has not yet tested equal to sliceStart, the only way this bounds check will fail is if either your collection or the index's Comparable implementation is broken.

For the empty result, we already checked that sliceStart == sliceEnd as part of the while loop condition, so it is safe to omit the bounds check. A broken Equatable implementation would never get to that part.

}
sliceEnd = idxBeforeSliceEnd
}
// Trimmed everything.
return self[Range(uncheckedBounds: (sliceStart, sliceStart))]
}
}
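
As context for the review thread above: the `..<` operator checks that its lower bound does not exceed its upper bound and traps otherwise, while `Range(uncheckedBounds:)` skips that check. A minimal sketch with plain `Int` bounds (the values are arbitrary and stand in for collection indices):

```swift
let lower = 3
let upper = 3

// Checked construction: verifies lower <= upper at runtime.
let checked = lower..<upper

// Unchecked construction: the caller promises that lower <= upper.
let unchecked = Range(uncheckedBounds: (lower: lower, upper: upper))

print(checked == unchecked)  // true
print(unchecked.isEmpty)     // true

// `5..<3` would trap at runtime. Range(uncheckedBounds: (5, 3)) performs no
// such check, so it is only appropriate when the ordering is already known.
```

In the empty-result path, both bounds are the same index, so the ordering requirement holds trivially.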
59 changes: 59 additions & 0 deletions Tests/SwiftAlgorithmsTests/TrimTests.swift
@@ -0,0 +1,59 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

import Algorithms
import XCTest

final class TrimTests: XCTestCase {

func testEmpty() {
let results_empty = ([] as [Int]).trimming { $0.isMultiple(of: 2) }
XCTAssertEqual(results_empty, [])
}

func testNoMatch() {
// No match (nothing trimmed).
let results_nomatch = [1, 3, 5, 7, 9, 11, 13, 15].trimming {
$0.isMultiple(of: 2)
}
XCTAssertEqual(results_nomatch, [1, 3, 5, 7, 9, 11, 13, 15])
}

func testNoTailMatch() {
// No tail match (only trim head).
let results_notailmatch = [1, 3, 5, 7, 9, 11, 13, 15].trimming { $0 < 10 }
XCTAssertEqual(results_notailmatch, [11, 13, 15])
}

func testNoHeadMatch() {
// No head match (only trim tail).
let results_noheadmatch = [1, 3, 5, 7, 9, 11, 13, 15].trimming { $0 > 10 }
XCTAssertEqual(results_noheadmatch, [1, 3, 5, 7, 9])
}

func testBothEndsMatch() {
// Both ends match, some string of >1 elements do not (return that string).
let results = [2, 10, 11, 15, 20, 21, 100].trimming { $0.isMultiple(of: 2) }
XCTAssertEqual(results, [11, 15, 20, 21])
}

func testEverythingMatches() {
// Everything matches (trim everything).
let results_allmatch = [1, 3, 5, 7, 9, 11, 13, 15].trimming { _ in true }
XCTAssertEqual(results_allmatch, [])
}

func testEverythingButOneMatches() {
// Both ends match, one element does not (trim all except that element).
let results_one = [2, 10, 12, 15, 20, 100].trimming { $0.isMultiple(of: 2) }
XCTAssertEqual(results_one, [15])
}
}