Slice generator #1927

Closed
148 changes: 148 additions & 0 deletions proposals/p1927.md
@@ -0,0 +1,148 @@
# Slice generator

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/1927)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
- [Library API](#library-api)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)

<!-- tocstop -->

## Problem

Slicing is a popular way to extract a subarray from an array. Slices may also
be used to generate arrays.

## Background

There are no background requirements.
Contributor

It would be useful to describe what other languages do in this space here.


## Proposal

Introduce slice generators.

## Details

The slice syntax is `a:b:s`. Here, `a` is the beginning of the slice, `b` is the
end of the slice, and `s` is the step. In array indexing, `:s` can be omitted, then
Comment on lines +40 to +41
Contributor

Is b included or excluded? The examples below suggest it's included -- eg, (1:5:2) includes 5 -- but for example in Python and Rust it's excluded, and Swift provides two operators for closed and half-open ranges. I'd like to see the proposal discuss this question; maybe closed ranges are the right answer, but given that we presumably will use 0-based indexing by default, half-open ranges seem likely to fit better in that context.

Author

b is included. For having something like half-open range, I think that end keyword proposed by you is fine.

it means that the step is 1. If `a` is omitted, the slice starts from the
beginning of the array. Likewise, if `b` is omitted, the slice ends at the end of the array. `:`
is the same as `::`. If `s` is negative and `a`, `b`, or both are omitted,
the upper bound is used in place of `a` and the lower bound in place of `b`. A slice has a
Comment on lines +44 to +45
Contributor

I don't understand this sentence. I think what you mean is that if s is negative, then if a is omitted the slice starts from the end of the array, and if b is omitted then the slice ends at the start of the array. Is that right?

closed range: both `a` and `b` are included.

A compile-time or run-time error occurs:

- if `s` is zero;
- if `a` is greater than `b` when `s` is positive;
- if `a` is less than `b` when `s` is negative.
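
The rules above can be modelled with a short Python sketch. This is not Carbon and not part of the proposal: the name `slice_values` and the use of `ValueError` are assumptions made for illustration, and the proposal leaves open whether each error is reported at compile time or at run time.

```python
from typing import Iterator


def slice_values(a: int, b: int, s: int = 1) -> Iterator[int]:
    """Lazily yield a, a + s, ... up to and including b (closed range).

    Hypothetical helper; validation runs when iteration starts.
    """
    if s == 0:
        raise ValueError("step must not be zero")
    if s > 0 and a > b:
        raise ValueError("a must not exceed b when the step is positive")
    if s < 0 and a < b:
        raise ValueError("a must not be less than b when the step is negative")
    value = a
    # The end bound is inclusive, so compare with <= (or >= when counting down).
    while (value <= b) if s > 0 else (value >= b):
        yield value
        value += s


print(list(slice_values(1, 5, 2)))  # [1, 3, 5]
print(list(slice_values(1, 5)))     # [1, 2, 3, 4, 5]
```

For comparison, Python's built-in `range(1, 5)` is half-open and stops at 4, which is the behavior the review comments contrast with the closed range proposed here.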

`:` has the lowest precedence, which makes `a[1:N-1]` possible without any parentheses.

A slice generator may return `i32`, `i64`, `i128`, and their unsigned counterparts.
Contributor

Can user-defined types be customized to work with this syntax? If so, how does that customization work? See also #1885.


A slice as a generator of arrays:

```carbon
var array1: auto = (1:5:2); // array1 = (1,3,5)
var array2: auto = (1:5); // array2 = (1,2,3,4,5)
```

Parentheses, `a`, and `b` must be present; `s` may be omitted,
Contributor

Hm, I think this syntax is still problematic. It doesn't look like we can disambiguate between this and a binding in a pattern:

```carbon
fn F(a: i32, b: i32) {
  match ((1:2)) {
    case (a:b) => { ... }
  }
}
// vs
fn G(b:! Type, n: b) {
  match (n) {
    case (a:b) => { ... }
  }
}
```

We could maybe use a named constructor, eg Slice(a, b, s) for the unusual case of constructing a slice anywhere other than in [], and specify x[a:b:s] as desugaring to x[Slice(a,b,s)].


In an array's `[]` operator, slices can be used in the following ways:

```carbon
var array: auto = (1:5:2); // array = (1,3,5)
Contributor

This seems likely to be an efficiency surprise. As with python's range, I think we'll want a slice generator to lazily produce the elements of the slice. Having a conversion from slice generator to array seems like it might be reasonable, but only if the developer somehow asks for it (eg, by writing an array type).

var a1: auto = array[:]; // a1 = (1,3,5)
var a2: auto = array[::]; // a2 = (1,3,5)
var a3: auto = array[:1]; // a3 = (1,3)
var a4: auto = array[::-1]; // a4 = (5,3,1)
var a5: auto = array[:2:-1]; // a5 = (5)
```

Note that `auto` for a slice of one element gives an array containing one element.
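
The indexing examples can be checked against a small Python model of how omitted bounds are resolved. The function name `index_with_slice`, the use of `None` for an omitted bound, and 0-based indexing are assumptions for illustration; the proposal does not spell them out.

```python
from typing import Optional, Sequence, Tuple


def index_with_slice(
    array: Sequence[int],
    a: Optional[int] = None,
    b: Optional[int] = None,
    s: int = 1,
) -> Tuple[int, ...]:
    """Model of array[a:b:s] with closed bounds (hypothetical helper)."""
    if s == 0:
        raise ValueError("step must not be zero")
    last = len(array) - 1  # assumption: 0-based indexing
    if s > 0:
        # Omitted bounds default to the beginning and the end of the array.
        start = 0 if a is None else a
        stop = last if b is None else b
    else:
        # With a negative step the defaults swap: start at the end, stop at the beginning.
        start = last if a is None else a
        stop = 0 if b is None else b
    if (s > 0 and start > stop) or (s < 0 and start < stop):
        raise ValueError("bounds are inconsistent with the sign of the step")
    picked = []
    i = start
    while (i <= stop) if s > 0 else (i >= stop):  # stop is inclusive
        picked.append(array[i])
        i += s
    return tuple(picked)


array = (1, 3, 5)
print(index_with_slice(array))             # (1, 3, 5)
print(index_with_slice(array, b=1))        # (1, 3)
print(index_with_slice(array, s=-1))       # (5, 3, 1)
print(index_with_slice(array, b=2, s=-1))  # (5,)
```

Under these assumptions the model reproduces `a1` through `a5` above, including the one-element result for `array[:2:-1]`.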

### Library API

Internally, a slice is represented as `Slice(low, high, step)`, which may return
arrays or a `Slice`. During construction, `low` and `high` are calculated by
querying the array's properties.
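
One possible shape for such a `Slice` value, modelled in Python rather than Carbon, is sketched below. The field names come from the text above; the `to_array` method, the validation, and reading `low` as the first element produced are assumptions. Iteration is lazy, so an array is only materialized on explicit request.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Slice:
    """Hypothetical model of Slice(low, high, step) with closed bounds."""

    low: int
    high: int
    step: int = 1

    def __post_init__(self) -> None:
        if self.step == 0:
            raise ValueError("step must not be zero")
        if self.step > 0 and self.low > self.high:
            raise ValueError("low exceeds high with a positive step")
        if self.step < 0 and self.low < self.high:
            raise ValueError("low is below high with a negative step")

    def __len__(self) -> int:
        # Both bounds are inclusive, hence the + 1.
        return abs(self.high - self.low) // abs(self.step) + 1

    def __iter__(self) -> Iterator[int]:
        # Elements are produced one at a time; nothing is stored.
        value = self.low
        for _ in range(len(self)):
            yield value
            value += self.step

    def to_array(self) -> List[int]:
        # Explicit conversion, so the allocation is visible at the call site.
        return list(self)


s = Slice(1, 5, 2)
print(len(s), s.to_array())  # 3 [1, 3, 5]
```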

## Rationale
jonmeow marked this conversation as resolved.

- [Performance-critical software](/docs/project/goals.md#performance-critical-software)
    - In performance-critical software, such as modelling, slices of arrays
      are sometimes used, for example to update sub-arrays. There is no need
      to drop to low-level code to select several elements.
- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
    - Simplifies access to sub-ranges of arrays.
- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
    - Less code will be written for working with ranges.
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
    - Here, there is a problem with interoperating with `std::span`/`std::mdspan`
      from C++.

## Alternatives considered
Contributor

I think it's important to consider the alternative of having this feature be provided by the library, with no special core-language syntax for it.

Contributor

Please talk about the alternative of using half-open ranges rather than closed ranges here. This seems like a significant divergence from what I think at least C++ and Python programmers would expect, so if it's the right choice then we need rationale for that.


Most programming languages have syntax that is the same as or close to the proposed one.

In Pascal, [subrange types](https://wiki.freepascal.org/subrange_types) are available.
Using similar syntax, a slice may produce arrays:

```carbon
var arr: auto = 1 .. 4;
```

will generate `[1, 2, 3, 4]` (round brackets may be omitted).

Advantages:

- Clear and popular for ranges in mathematical notation
- Close to Rust and Pascal

Disadvantage:

- Unfortunately, this syntax does not allow generating an array with a specific
step between elements.

Another way is a function call. It is the simplest approach and can be
implemented in the standard library.

```carbon
var arr: auto = slice(/*low=*/ 1, /*high=*/ 4, /*step=*/ 1);
```

Here, `step` is an optional argument.

Advantages:

- Close to the internal representation
- May work similarly to `std::span`

Disadvantage:

- This approach makes it impossible to specify only the low or high bound, or
to omit both. It leads to the following code for slicing a whole array, which
is not simple to read, understand, and write:

```carbon
var arr: auto = (1, 2, 3);
var arr_copy: auto = arr[slice(lbound(arr), ubound(arr))];
```

`lbound` and `ubound` return the lower and upper bounds of `arr`, respectively.
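
The function-call alternative can be modelled in Python as a rough sketch. `lbound` and `ubound` follow the names used above; `make_slice` stands in for the proposed `slice` function (renamed here to avoid shadowing Python's built-in `slice`), and 0-based indexing and positive steps are assumptions.

```python
from typing import List, Sequence


def lbound(arr: Sequence[int]) -> int:
    """Lowest valid index (assumes 0-based indexing)."""
    return 0


def ubound(arr: Sequence[int]) -> int:
    """Highest valid index."""
    return len(arr) - 1


def make_slice(low: int, high: int, step: int = 1) -> List[int]:
    """Indices from low to high inclusive; this sketch handles positive steps only."""
    return list(range(low, high + 1, step))


arr = (1, 2, 3)
# Copying the whole array requires spelling out both bounds.
arr_copy = tuple(arr[i] for i in make_slice(lbound(arr), ubound(arr)))
print(arr_copy)  # (1, 2, 3)
```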