Skip to content

Working with arrays

Elias Rabel edited this page Jan 4, 2019 · 7 revisions

Working with arrays

Forpy offers interoperability of Fortran arrays and numpy arrays through the type ndarray. In the following examples, you will see various ways to create a numpy array.

Creating a numpy array from a Fortran array

The simplest way to create a numpy array is with ndarray_create. This function creates a numpy array with the same content as the Fortran array that is passed to the function. For example:

program ndarray01
  use forpy_mod
  implicit none

  integer, parameter :: NROWS = 2
  integer, parameter :: NCOLS = 3
  integer :: ierror, ii, jj
  
  real :: matrix(NROWS, NCOLS)
  
  type(ndarray) :: arr

  ierror = forpy_initialize()

  do jj = 1, NCOLS
    do ii = 1, NROWS
      matrix(ii, jj) = real(ii) * jj
    enddo
  enddo

  ! creates a numpy array with the same content as 'matrix'
  ierror = ndarray_create(arr, matrix)
  
  ierror = print_py(arr)

  call arr%destroy
  call forpy_finalize

end program

When arrays get very large, creating a copy might not be what you want. The next section describes how to wrap a Fortran array with forpy without making a copy.

Creating a numpy wrapper for a Fortran array

When creating a numpy array with ndarray_create_nocopy, no copy of the Fortran array is made and storage has to be managed by the Fortran programmer. Changes to the Fortran array affect the numpy array and vice versa. This is more efficient than ndarray_create, however there are some things to consider:

Since the Fortran array can now be modified not only directly but also indirectly by the ndarray, it is necessary to add the asynchronous attribute to the Fortran array declaration, since without it compiler optimization related bugs can occur (depending on code, compiler and compiler options). Alternatively you could also use the volatile attribute.

program ndarray02
  use forpy_mod
  implicit none

  integer, parameter :: NROWS = 2
  integer, parameter :: NCOLS = 3
  integer :: ierror, ii, jj
  
  ! add the asynchronous attribute to the Fortran array that is wrapped
  ! as ndarray to avoid bugs caused by compiler optimizations
  real, asynchronous :: matrix(NROWS, NCOLS)
  
  type(ndarray) :: arr

  ierror = forpy_initialize()

  do jj = 1, NCOLS
    do ii = 1, NROWS
      matrix(ii, jj) = real(ii) * jj
    enddo
  enddo

  ! creates a numpy array that refers to 'matrix'
  ierror = ndarray_create_nocopy(arr, matrix)
  ierror = print_py(arr)

  matrix(1,1) = 1234.0 ! Change also affects 'arr'

  ierror = print_py(arr)

  call arr%destroy
  call forpy_finalize

end program

Be aware that as long as references to the ndarray are in use, the Fortran data has to be valid. (not deallocated or out of scope,...). Also make sure that not a compiler-created temporary array is passed to ndarray_create_nocopy.

Alternatively you can create ndarrays with storage managed by Python as explained in one of the following sections.

The Fortran array has to be contiguous in memory (this is not checked). Contiguous in memory means that successive array elements are adjacent in memory. For more information, see also Caveats.

Creating a numpy array in forpy

We can also create a new ndarray with the function ndarray_create_empty, specifying the shape of the array. In this case storage is allocated and managed by Python. Memory is freed, when there is no reference to the ndarray anymore (don't forget to call the destroy method).

You can also create an array of zeros with ndarray_create_zeros and an array of ones with ndarray_create_ones.

The following example shows how to access the data of a ndarray with the method ndarray%get_data. It also shows how to efficiently return a ndarray from a subroutine - without making unnecessary copies.

To edit the values of the array, use the Fortran pointer returned from ndarray%get_data.

! Example of how to return a ndarray from a subroutine
program ndarray03
  use forpy_mod
  use iso_fortran_env, only: real64
  implicit none

  integer :: ierror
  type(ndarray) :: arr

  ierror = forpy_initialize()

  call create_matrix(arr)
  ierror = print_py(arr)

  call arr%destroy
  call forpy_finalize

  CONTAINS

  subroutine create_matrix(arr)
    type(ndarray), intent(out) :: arr
    integer :: ierror, ii, jj
    integer, parameter :: NROWS = 2
    integer, parameter :: NCOLS = 3
    real(kind=real64), dimension(:,:), pointer :: matrix    

    ierror = ndarray_create_empty(arr, [NROWS, NCOLS], dtype="float64")
    
    ! Use ndarray%getdata to access the content of a numpy array
    ! from Fortran
    
    ! type of matrix must be compatible with dtype of ndarray 
    ! (here: real(kind=real64) and dtype="float64") 
    ierror = arr%get_data(matrix) 

    do jj = 1, NCOLS
      do ii = 1, NROWS
        matrix(ii, jj) = real(ii, kind=real64) * jj
      enddo
    enddo

  end subroutine

end program

Note that if you create an array with ndarray_create_empty, its values are undefined. Consider using ndarray_create_zeros or ndarray_create_ones instead. These creation functions take an optional argument dtype (default "float") to specify the type of the array's value. You have to use one of numpy's dtypes, such as float64, int32, ...

By default ndarrays are created with Fortran storage order. With the optional 4th argument to one of the creation functions, you can create an array with C storage order:

ierror = ndarray_create_zeros(arr, [NROWS, NCOLS], dtype="float64", order="C")

Accessing array elements with ndarray%get_data

ndarray%get_data fails in each of these cases:

  1. dtype of ndarray and Fortran type of pointer do not match (TypeError raised)
  2. dimension of ndarray and pointer do not match (BufferError raised)
  3. storage order of ndarray does not match the order expected by get_data (by default Fortran order is expected) (BufferError raised)
  4. ndarray is not contiguous in memory (BufferError raised)

Changing the expected storage order

For multidimensional arrays one has to take into account the storage order. By default ndarray%get_data expects a Fortran ordered (column-major) array. Use the optional 2nd argument to change this behaviour:

ierror = arr%get_data(matrix, order='C')

is successful only if arr has C storage order. Attention: Since Fortran pointers always use Fortran storage order, the data it points to will be the transpose of the array.

If you specify:

ierror = arr%get_data(matrix, order='A')

get_data will accept C or Fortran storage order. Again in case of C storage order, the data pointed to will be the transpose. Most likely you will use order='A' only if you expect the array to be symmetric or if you are interested only in diagonal elements. With ndarray%is_ordered you can check if an array has a certain order (see below).

Note that contiguous 1D arrays are both Fortran and C contiguous and therefore you do not need to provide the order argument in that case.

Non-contiguous arrays

ndarray%get_data does not support non-contiguous arrays (yet?). In that case, you can create a copy of the array (which will be contiguous) and use get_data on the copy:

ierror = arr%copy(copy_of_arr) ! copy_of_arr will have Fortran order by default
ierror = copy_of_arr%get_data(matrix)

Checking properties of a ndarray

You can use ndarray%is_ordered to check if a ndarray is contiguous and in a certain storage order:

flag = nd_arr%is_ordered('F') ! .true. if nd_arr Fortran-contiguous
flag = nd_arr%is_ordered('C') ! .true. if nd_arr C-contiguous
flag = nd_arr%is_ordered('A') ! .true. if nd_arr Fortran- or C-contiguous

You can check the type of an array with ndarray%get_dtype_name. The resulting type string are numpy dtype names, not Fortran type names:

integer ierror
type(ndarray) :: nd_arr
real(kind=real64) :: testarr(1) = [42.0_real64]
character(kind=C_CHAR, len=:), allocatable :: dname
  
ierror = ndarray_create(nd_arr, testarr)
ierror = nd_arr%get_dtype_name(dname)
write(*,*) dname
    
call nd_arr%destroy

You can get the dimensionality of a ndarray with ndarray%ndim:

integer :: num_dims
ierror = nd_arr%ndim(num_ndims)

Creating a numpy array from an assumed size array

In Fortran it is possible to leave the size of the last dimension of an array unspecified. This often occurs in classic Fortran codes. Use Fortran array slice notation to give ndarray_create/ndarray_create_nocopy the necessary shape information:

subroutine sub(x, n)
  integer n
  real x(*)
  
  integer :: ierror
  type(ndarray) :: arr
  
  ierror = ndarray_create(arr, x(1:n))
  
  ! more code ...
  
  call arr%destroy
end subroutine

or (even better), change the subroutine arguments declaration into:

subroutine sub(x, n)
  integer n
  real x(n)
  
  integer :: ierror
  type(ndarray) :: arr
  
  ierror = ndarray_create(arr, x)
  
  ! more code ...
  
  call arr%destroy
end subroutine

Caveats

This section discusses some points that you should consider when using ndarray_create_nocopy.

ndarray_create_nocopy does not make a copy of the Fortran array's data, but refers to the memory region that the Fortran array uses directly. While this is performance- and memorywise ideal, there are some caveats:

Bad input to ndarray_create_nocopy

Make sure that the array argument to ndarray_create_nocopy is not a compiler created temporary array, since the lifetime of the temporary is system dependent (it might get deallocated after the call to ndarray_create_nocopy):

! BAD: temporary array created by arithmetic expression
ierror = ndarray_create_nocopy(nd_arr, array*23 + 1)

do not do this either:

! do not use array literal, since lifetime depends on implementation and
! compiler options
ierror = ndarray_create_nocopy(nd_arr, [1, 2, 3, 4, 5])

Also make sure that the array input to ndarray_create_nocopy is contiguous in memory. This is not checked by ndarray_create_nocopy. (One could check this using the Fortran 2008 intrinsic is_contiguous, but not all compilers support it.) Note: ndarray_create supports non-contiguous input.

program bad_dont_do_this
use forpy_mod
implicit none

integer :: ierror
integer, target :: array(10)
integer, dimension(:), pointer :: ptr
type(ndarray) :: nd_arr

ierror = forpy_initialize()

array = [1,2,3,4,5,6,7,8,9,10]
ptr => array(1:10:2)

! BAD: 'ptr' points to a non-contiguous slice of 'array'
ierror = ndarray_create_nocopy(nd_arr, ptr)

! Solution: use ndarray_create instead

write(*,*) "ptr = ", ptr
write(*,*) "nd_arr = "
ierror = print_py(nd_arr)

call nd_arr%destroy
call forpy_finalize

end program

Using a ndarray that refers to deallocated memory

When using arrays created with ndarray_create_nocopy keep in mind that as long as references to the ndarray are in use, the Fortran data has to be valid. (not deallocated or out of scope,...). Especially when returning an ndarray from a subroutine, this can lead to subtle errors. For example:

program bad_dont_do_this
use forpy_mod
implicit none

integer :: ierror
type(ndarray) :: nd_arr
integer, allocatable, dimension(:) :: some_array

ierror = forpy_initialize()

call return_bad_array(nd_arr)

allocate(some_array(4))
some_array = 99

! Depending on your system, this might even print [99, 99, 99, 99] !
ierror = print_py(nd_arr)

call nd_arr%destroy
call forpy_finalize

CONTAINS

subroutine return_bad_array(nd_arr)
  type(ndarray) :: nd_arr
  integer, allocatable, dimension(:) :: array
  integer :: ierror

  allocate(array(4))
  array = [1, 2, 3, 4]

  ierror = ndarray_create_nocopy(nd_arr, array)
  ierror = print_py(nd_arr)

  ! BAD: array goes out of scope and is deallocated...
  !      nd_arr then refers to deallocated memory

end subroutine

end program

Solutions: 1) As shown above, use ndarray_create_empty, ndarray_create_ones,... and get_data to fill the array.

  1. OR: create a copy of the ndarray or use ndarray_create

Compiler optimization related bugs

Since the Fortran array that has been wrapped by ndarray_create_nocopy can now be modified not only directly but also indirectly by the ndarray, you should add the asynchronous attribute to the Fortran array declaration, since without it compiler optimization related bugs can occur (depending on code, compiler and compiler options).

Alternatively you could also use the volatile attribute. For an example that demonstrates the issue see issue #3.

If you do not want to add the asynchronous attribute to the array declaration, you can pass the array to a subroutine with an asynchronous dummy argument:

real, dimension(n) :: array ! declared without asynchronous
 
! ... some code

call sub(array)

subroutine sub(array)
  real, dimension(:), asynchronous, intent(inout) :: array
  ! ...
  ierror = ndarray_create_nocopy(nd_arr, array)
  ! ...
end subroutine

Note that the asynchronous attribute should only be necessary when the array could be modified.