Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient way to get/set the diagonal of a matrix #14

Open
certik opened this issue Oct 16, 2019 · 9 comments
Open

Efficient way to get/set the diagonal of a matrix #14

certik opened this issue Oct 16, 2019 · 9 comments
Labels
Clause 16 Standard Clause 16: Intrinsic procedures and modules unsubmitted Has not been submitted to the committee yet

Comments

@certik
Copy link
Member

certik commented Oct 16, 2019

It would be really nice to have a built in (and so highly optimized) operation, as it is in Matlab: diag(A) to get or set the diagonal of a matrix A. Or, more generally, diagonal slices to get any diagonal (whether main or other) or even group of diagonals. Would be the perfect complement to current square submatrix slices.

See also the Matlab's diag function and Python Numpy's diag function.

Use cases:

  1. Getting the diagonal
real(dp) :: A(n, n)
print *, diag(A)
  1. Setting the diagonal
real(dp) :: A(n, n)
A = 0
diag(A) = 1

Note: The diag(A) = 1 syntax is new, as we are setting something to a result of a function. The way it should be read is that diag(A) is not a function but rather an intrinsic operation that gives access to the main diagonal. An alternative notation that could be considered:

  1. Getting the diagonal
real(dp) :: A(n, n)
print *, A(\)
  1. Setting the diagonal
real(dp) :: A(n, n)
A = 0
A(\) = 1

Or more generally A(start\end\skip) for the main diagonal and A(start/end/skip) for the anti-diagonal. In such case, an optional parameter could occur to denote which diagonal, as in Matlab, e.g., A(2\4,0) gets 2nd thru 4th elements of the main diagonal, A(2\4,1) gets 2nd thru 4th elements of diagonal above the main, etc.

@certik certik added the unsubmitted Has not been submitted to the committee yet label Oct 16, 2019
@certik
Copy link
Member Author

certik commented Oct 16, 2019

There is a way to do this using a pointer assignment statement, see the section 10.2.2.5 in the Fortran 2018 Standard, Note 2:

It is possible to obtain different-rank views of parts of an object by specifying upper bounds in pointer
assignment statements. This requires that the object be either rank one or contiguous. Consider the
following example, in which a matrix is under consideration. The matrix is stored as a rank-one object in
MYDATA because its diagonal is needed for some reason – the diagonal cannot be gotten as a single object from a rank-two representation. The matrix is represented as a rank-two view of MYDATA.

real, target :: MYDATA ( NR*NC )  ! An automatic array
real, pointer :: MATRIX ( :, : )  ! A rank-two view of MYDATA      
real, pointer :: VIEW_DIAG ( : )
MATRIX (1:NR, 1:NC) => MYDATA     ! The MATRIX view of the data
VIEW_DIAG => MYDATA (1::NR+1)     ! The diagonal of MATRIX

@zbeekman
Copy link

I'm not sure I see how this is/would be optimized more than direct references to diagonal elements. I agree that it would be useful/convenient, but you're still effectively fetching an entire cache line for just a single element. And the access pattern is the same, unless you do some really fancy gather-scatter with a packed caching buffer.

@certik
Copy link
Member Author

certik commented Oct 17, 2019

I agree the compilers might not be able to optimize this more than developers already can by explicit for loop, although they might. I think the stronger argument is that both Matlab and NumPy have this function and it does simplify code.

@FortranFan
Copy link
Member

FortranFan commented Oct 19, 2019

ASSOCIATE facility in Fortran can be an option to consider for certain use cases involving operations on the diagonal elements of a matrix. One advantage of this option is the ability to work with objects that do NOT have the TARGET attribute, as TARGET can possibly hinder optimization:

   integer, parameter :: N = 3
   integer :: i
   blk1: block
      integer, target :: dat(N*N)  !<-- reguires TARGET attributel might hinder optimization
      integer, pointer :: mat(:,:)
      integer, pointer :: diag(:)
      dat = [( i, i=1,size(dat) )]
      mat(1:N,1:N) => dat
      diag => dat(1::N+1)
      print *, "Approach 1: Matrix:"
      do i = 1, size(mat,dim=1)
         print "(*(g0,1x))", mat(:,i)
      end do
      print *, "Diagonal elements: ", diag
      mat => null()
      diag => null()
   end block blk1
   blk2: block
      integer :: dat(1:N*N)        !<-- Note no TARGET attribute
      dat = [( i, i=1,size(dat) )]
      print *, "Approach 2: Matrix:"
      do i = 1, N
         print "(*(g0,1x))", dat((i-1)*N+1:(i-1)*N+N)
      end do
      associate ( diag => dat(1::N+1) )
         print *, "Diagonal elements: ", diag
      end associate
   end block blk2
   stop
end

Upon compilation and execution, the above should output:

Approach 1: Matrix:
1 2 3
4 5 6
7 8 9
 Diagonal elements:  1 5 9
 Approach 2: Matrix:
1 2 3
4 5 6
7 8 9
 Diagonal elements:  1 5 9

@certik
Copy link
Member Author

certik commented Oct 19, 2019

@FortranFan thanks! I was hoping that associate might be able to do it without the target attribute. However, your example is using a 1D array. If I do 2D array:

integer, parameter :: N = 3
integer :: dat(N,N), i
dat = reshape([( i, i=1,size(dat) )], [N,N])
print *, "Approach 2: Matrix:"
do i = 1, 3
    print "(*(g0,1x))", dat(i,:)
end do
associate ( diag => dat(1::N+1) )
    print *, "Diagonal elements: ", diag
end associate
end

Then gfortran gives an error:

a.f90:8:23:

 associate ( diag => dat(1::N+1) )
                       1
Error: Rank mismatch in array reference at (1) (1/2)

@certik
Copy link
Member Author

certik commented Oct 19, 2019

Here is one way to do it without pointers:

integer, parameter :: N = 3
integer :: dat(N,N), i
dat = reshape([( i, i=1,size(dat) )], [N,N])
print *, "Approach 2: Matrix:"
do i = 1, 3
    print "(*(g0,1x))", dat(i,:)
end do
print *, "Diagonal elements: ", diag(dat)

contains

    function diag(A) result(d)
    integer, intent(in) :: A(:,:)
    integer :: d(size(A,1))
    d = diag1d(size(A,1), A)
    end function

    function diag1d(n, A) result(d)
    integer, intent(in) :: n
    integer, intent(in) :: A(n*n)
    integer :: d(n)
    d = A(1::n+1)
    end function

end

But at this point, even simpler and more straightforward is to just do:

    function diag(A) result(d)
    integer, intent(in) :: A(:,:)
    integer :: d(size(A,1))
    integer :: i
    do i = 1, size(A,1)
        d(i) = A(i,i)
    end do
    end function

@FortranFan
Copy link
Member

@FortranFan thanks! I was hoping that associate might be able to do it without the target attribute. However, your example is using a 1D array. If I do 2D array:
..
Then gfortran gives an error:
..
Error: Rank mismatch in array reference at (1) (1/2)

Agree, there are limitations with ASSOCIATE. That's why I wrote "an option to consider for certain use cases"!

With so many aspects in Fortran (or for that matter, with other programming languages or paradigms), there simply are no "perfect solutions" for some coding needs. A "diag" function might end up requiring expensive data copy operations and given how Fortran is structured, there can even be two sets of such operations, one to fetch the data from the matrix source in the first place followed then by another one during the assignment on the caller side! So coders are left to pick the "poison" of their choice.

If better support for parameterized derived types (PDTs) toward Fortran 2003 and later revisions of the standard were available widely, coders wanting to work a lot with matrices and having to "slice and dice" the matrix "data" heavily might benefit from pursuing a coding design that uses a PDT as a 'class' for a matrix and having type-bound procedures (TBPs) to operate efficiently on the data. For "sliced and diced" reference to matrix data, TBPs returning data pointers can prove really handy given the improvement introduced in the language starting with Fortran 2008 where pointers can be present in any variable-definition context including on the left-hand side of operations.

Readers can note I point out different options due to the fact any changes in the Fortran language - say a DIAG(ONAL) utility, almost any simple trivial syntactic "sugar" for a coder to a semantic improvement in language which improves coder productivity - essentially take eons to take effect: first, an eon or more to influence and convince powers-that-be to even agree to add to the language standard, then a eon or more to work out the details, another X years for the standard containing that feature to go into "official" publication, followed by a decade or more for the facility to start appearing in enough compiler implementations so coders can confidently pursue the new facility in their code, but add another few years for the implementation bugs to be worked out for the facility to be robust from the most lax robustness and quality considerations for compilers (leave alone Six-sigma)!!! Such considerations, unfortunately, need to be contemplated for all the "wonderful" ideas for Fortran that are going to pop up in this new GitHub issues respository!

@zbeekman
Copy link

Just to riff on the ideas here, you can use associate with multidimensional arrays, so long as you are happy with the result, diag being immutable and essentially a copy since it's an expression.

program main
  use iso_fortran_env, only : dp => real64
  implicit none
  integer, parameter :: N=9
  integer :: i, j
  real(dp), allocatable :: A(:,:)

  A = reshape([ ( (10*j + i, i=1,N ), j=1,N) ], [N, N])

  associate( diag => [ (A(i,i), i=1,N) ] )
    write(*,'("diag: ",*(g0,:,", "))') nint(diag)
    A(1,1) = 1010
    write(*,'("diag: ",*(g0,:,", "))') nint(diag)
  end associate
  write(*,'(A)') "A:"
  do j = 1, N
     write(*,'(*(g0,:,", "))') nint(A(:,j))
  end do
end program

This gives the result:

diag: 11, 22, 33, 44, 55, 66, 77, 88, 99
diag: 11, 22, 33, 44, 55, 66, 77, 88, 99
A:
1010, 12, 13, 14, 15, 16, 17, 18, 19
21, 22, 23, 24, 25, 26, 27, 28, 29
31, 32, 33, 34, 35, 36, 37, 38, 39
41, 42, 43, 44, 45, 46, 47, 48, 49
51, 52, 53, 54, 55, 56, 57, 58, 59
61, 62, 63, 64, 65, 66, 67, 68, 69
71, 72, 73, 74, 75, 76, 77, 78, 79
81, 82, 83, 84, 85, 86, 87, 88, 89
91, 92, 93, 94, 95, 96, 97, 98, 99

Notice that diag did not update with A.

@milancurcic milancurcic added Clause 10 Standard Clause 10: Expressions and assignment Clause 16 Standard Clause 16: Intrinsic procedures and modules and removed Clause 10 Standard Clause 10: Expressions and assignment labels Oct 27, 2021
@milancurcic
Copy link
Member

I marked this as Chapter 16: Intrinsic procedures and modules, as I think this is most likely way to add it to the language. Stdlib now provides diag to get a diagnoal, and a setter could be implemented there as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clause 16 Standard Clause 16: Intrinsic procedures and modules unsubmitted Has not been submitted to the committee yet
Projects
None yet
Development

No branches or pull requests

4 participants