Array Types Design

Requirements

We need to be able to represent roughly what Fortran allows. It is known that the Fortran array design allows compilers to deliver excellent performance, Fortran arrays are versatile, and there is a lot of experience with them from 1990 onward. We can go beyond Fortran and try to simplify or abstract some of the concepts, but we need to do at least what Fortran does.
We should follow Python's canonical approach where possible, using Python's typing and be consistent with the following documents:

Survey of Fortran arrays

These examples show all that Fortran allows, and we will try to come up with the most natural Python equivalent in the next section.

Explicit-shape arrays

These arrays do not need an array descriptor. Their lower and upper bounds are always passed in as arguments (or the lower bound is implicitly 1), and they are always contiguous, so we know all the information about them.

Note that the compiler can still pass these using an array descriptor, but it doesn't have to. It can also pass them as just a pointer, because no other information is needed.

Default lower bound

subroutine f(n, r)
integer, intent(in) :: n
real(dp), intent(out) :: r(n)
integer :: i
do i = 1, n
    r(i) = 1.0_dp / i**2
enddo
end subroutine

subroutine g(m, n, A)
integer, intent(in) :: m, n
real(dp), intent(in) :: A(m, n)
...
end subroutine

These are shorthand for:

subroutine f(n, r)
integer, intent(in) :: n
real(dp), intent(out) :: r(1:n)
integer :: i
do i = 1, n
    r(i) = 1.0_dp / i**2
enddo
end subroutine

subroutine g(m, n, A)
integer, intent(in) :: m, n
real(dp), intent(in) :: A(1:m, 1:n)
...
end subroutine

Custom lower bounds

subroutine print_eigenvalues(kappa_min, kappa_max, lam)
integer, intent(in) :: kappa_min, kappa_max
real(dp), intent(in) :: lam(kappa_min:kappa_max)

integer :: kappa
do kappa = kappa_min, ubound(lam, 1)
    print *, kappa, lam(kappa)
end do
end subroutine
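To illustrate why a pointer plus the bounds is enough for explicit-shape arrays, here is a small Python sketch of the address arithmetic for a contiguous array in Fortran's column-major order. The function name column_major_offset and the representation of bounds as (lower, upper) tuples are assumptions made for this illustration only; they are not part of any proposed design.

```python
def column_major_offset(index, bounds):
    """Return the zero-based element offset of `index` in a contiguous,
    column-major array whose per-dimension bounds are (lower, upper) pairs."""
    if len(index) != len(bounds):
        raise ValueError("rank mismatch between index and bounds")
    offset = 0
    stride = 1  # the first dimension varies fastest in column-major order
    for i, (lower, upper) in zip(index, bounds):
        if not lower <= i <= upper:
            raise IndexError(f"index {i} outside bounds {lower}:{upper}")
        offset += (i - lower) * stride
        stride *= upper - lower + 1  # multiply by the extent of this dimension
    return offset


# A(1:3, 1:4): element A(2, 3) sits at offset (2-1) + (3-1)*3 = 7
print(column_major_offset((2, 3), [(1, 3), (1, 4)]))
# lam(-2:2): element lam(0) sits at offset 0 - (-2) = 2
print(column_major_offset((0,), [(-2, 2)]))
```

Nothing beyond the bounds (which arrive as ordinary arguments) is consulted, which is why no descriptor is required in this case.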

Assumed-shape arrays

These arrays do need an array descriptor. Their lower and upper bounds, as well as their strides, are passed in at runtime as part of the array (descriptor).
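As a rough picture of what such a descriptor has to carry, the following Python sketch models one with lower bounds, extents and strides, and uses it to index a non-contiguous section. The class ArrayDescriptor and all of its field names are hypothetical; real compilers use their own layouts, and this is only a sketch of the idea.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ArrayDescriptor:
    """Hypothetical descriptor: what a callee needs in order to index the array."""
    data: List[float]        # stands in for the base pointer
    offset: int              # position of the first element within `data`
    lower: Tuple[int, ...]   # lower bound per dimension
    extent: Tuple[int, ...]  # number of elements per dimension
    stride: Tuple[int, ...]  # step (in elements) between neighbors per dimension

    def position(self, *index: int) -> int:
        """Translate a Fortran-style index into a position within `data`."""
        if len(index) != len(self.lower):
            raise ValueError("rank mismatch")
        pos = self.offset
        for i, lb, n, s in zip(index, self.lower, self.extent, self.stride):
            if not lb <= i < lb + n:
                raise IndexError(f"index {i} outside bounds {lb}:{lb + n - 1}")
            pos += (i - lb) * s
        return pos

    def get(self, *index: int) -> float:
        return self.data[self.position(*index)]


storage = [float(k) for k in range(10)]
# Describes every second element starting at the second one, which in
# Fortran terms is the section storage(2:10:2) of a 1-based array.
section = ArrayDescriptor(data=storage, offset=1, lower=(1,), extent=(5,), stride=(2,))
print([section.get(i) for i in range(1, 6)])  # [1.0, 3.0, 5.0, 7.0, 9.0]
```

The callee sees a 5-element array with lower bound 1, even though the underlying storage is strided, which is the information an explicit-shape argument cannot convey.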

Default lower bounds

subroutine f(r)
real(dp), intent(out) :: r(:)
integer :: n, i
n = size(r)
do i = 1, n
    r(i) = 1.0_dp / i**2
enddo
end subroutine

subroutine g(A)
real(dp), intent(in) :: A(:, :)
...
end subroutine

These are shorthand for:

subroutine f(r)
real(dp), intent(out) :: r(1:)
integer :: n, i
n = size(r)
do i = 1, n
    r(i) = 1.0_dp / i**2
enddo
end subroutine

subroutine g(A)
real(dp), intent(in) :: A(1:, 1:)
...
end subroutine

Custom lower bounds

subroutine print_eigenvalues(kappa_min, lam)
integer, intent(in) :: kappa_min
real(dp), intent(in) :: lam(kappa_min:)

integer :: kappa
do kappa = kappa_min, ubound(lam, 1)
    print *, kappa, lam(kappa)
end do
end subroutine

Generalization / Abstraction of Assumed-shape arrays

The assumed-shape arrays are a subset of the following generalization:

subroutine f(r)
integer, dim :: n
real(dp), intent(out) :: r(n)
integer :: i
do i = 1, n
    r(i) = 1.0_dp / i**2
enddo
end subroutine

subroutine g(A)
integer, dim :: m, n
real(dp), intent(in) :: A(m, n)
...
end subroutine

subroutine print_eigenvalues(kappa_min, lam)
integer, dim :: n
integer, intent(in) :: kappa_min
real(dp), intent(in) :: lam(kappa_min:n)

integer :: kappa
do kappa = kappa_min, ubound(lam, 1)
    print *, kappa, lam(kappa)
end do
end subroutine

The dim :: n variable means "infer n at runtime from the actual size of the array that gets passed in". The assumed-shape array A(:,:,:) then becomes syntactic sugar for dim :: l, m, n; A(l,m,n), where all three dimensions may differ. However, one can declare them to be the same as follows:

subroutine g(A)
integer, dim :: n
real(dp), intent(in) :: A(n, n, n)
...
end subroutine

Furthermore, one can use the dim parameter in an expression such as:

function f(A) result(r)
integer, dim :: n
real(dp), intent(in) :: A(n, n)
real(dp) :: r(n**2)
...
end function

At the ASR level, there should be an explicit expression for how to compute n at runtime, so the above case is syntactic sugar for:

function f(A) result(r)
integer, dim :: n = size(A,1)
real(dp), intent(in) :: A(n, n)
real(dp) :: r(n**2)
...
end function

One can do more complicated examples, such as:

function f(A) result(r)
integer, dim :: n = (size(A,1)-1)/2
real(dp), intent(in) :: A(2*n+1, n**2)
real(dp) :: r(n**2)
...
end function

The compiler would check at runtime (if bounds checking is enabled) that the actual size of the array agrees with the computed n. For example, A(5, 4) (n=2) and A(7, 9) (n=3) will pass, but A(4, 1) will fail: integer division gives n = (size(A,1)-1)/2 = (4-1)/2 = 3/2 = 1, so the declared shape is A(2*n+1, n**2) = A(3, 1), which differs from A(4, 1), and the array size is therefore incompatible with the specification. One could use any runtime expression, including a user-defined function:

function f(A) result(r)
integer, dim :: n = get_dimension_parameter(A)
real(dp), intent(in) :: A(2*n+1, n**2)
real(dp) :: r(n**2)
...
end function

integer pure function get_dimension_parameter(A) result(n)
real(dp), intent(in) :: A(:,:)
n = (size(A,1)-1)/2
end function

You can use n inside the function just like any other variable.
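The runtime check for the A(2*n+1, n**2) example can be sketched in Python as follows. The function name infer_n_and_check is hypothetical, and the shape is passed as a plain tuple; this only mirrors the arithmetic described above, not an actual compiler implementation.

```python
def infer_n_and_check(shape):
    """Infer n for A(2*n+1, n**2) from n = (size(A,1)-1)/2, then verify
    that the declared shape matches the actual one."""
    # Python's // matches Fortran's integer division here because the
    # numerator is non-negative for any array with size(A,1) >= 1.
    n = (shape[0] - 1) // 2
    expected = (2 * n + 1, n ** 2)
    if tuple(shape) != expected:
        raise ValueError(
            f"actual shape {tuple(shape)} does not match declared shape {expected}"
        )
    return n


print(infer_n_and_check((5, 4)))  # 2
print(infer_n_and_check((7, 9)))  # 3
try:
    infer_n_and_check((4, 1))
except ValueError as err:
    print("rejected:", err)
```

The last call is rejected because n evaluates to 1, which implies the shape (3, 1) rather than (4, 1).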

At the ASR level, it seems we can thus always define the lower and upper bounds of an array as expressions. Such an expression can contain pure (user or intrinsic) function calls, arguments of the function, as well as the internal integer, dim variables. Each integer, dim variable in the local symbol table will always have an initializer expression, so that it is known how to compute it at runtime. The frontend can infer this initializer in many common, simpler cases (but perhaps not all), and the user can always specify it explicitly.
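One possible way to picture this is the Python sketch below, where each dim variable has an initializer and each bound is an expression over the function arguments and the dim variables. The names resolve_dims and check_bounds, and the use of lambdas as expressions, are assumptions for illustration and do not reflect how ASR is actually structured. The example is lam(kappa_min:n), for which the initializer n = kappa_min + size(lam,1) - 1 follows from requiring the extent n - kappa_min + 1 to equal the actual size.

```python
def resolve_dims(shape, args, initializers):
    """Evaluate dim initializers in order. Each one sees the actual shape,
    the function arguments, and any previously resolved dim variables."""
    env = dict(args)
    for name, init in initializers:
        env[name] = init(shape, env)
    return env


def check_bounds(shape, env, bounds):
    """Verify that the extents implied by the (lower, upper) bound
    expressions agree with the actual shape."""
    expected = tuple(upper(env) - lower(env) + 1 for lower, upper in bounds)
    if expected != tuple(shape):
        raise ValueError(
            f"declared extents {expected} do not match actual shape {tuple(shape)}"
        )


# lam(kappa_min:n), with n inferred as kappa_min + size(lam, 1) - 1
initializers = [("n", lambda shape, env: env["kappa_min"] + shape[0] - 1)]
bounds = [(lambda env: env["kappa_min"], lambda env: env["n"])]

env = resolve_dims(shape=(5,), args={"kappa_min": -2}, initializers=initializers)
check_bounds((5,), env, bounds)
print(env["n"])  # 2, so the array is indexed as lam(-2:2)
```

In this picture the frontend's job is to fill in the initializer when the user has not written one, and the bounds check is the same comparison regardless of how the initializer was obtained.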

Rosetta Stone

Open questions

How to design Python syntax / typing
How to design ASR in the most natural and abstract way
Should ASR allow custom lower bounds?
