Add OpenACC 2.6 `serial' construct support

The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard) is equivalent to a `parallel' construct with clauses `num_gangs(1) num_workers(1) vector_length(1)' implied. These clauses are therefore not supported with the `serial' construct. All the remaining clauses accepted with `parallel' are also accepted with `serial'. The `serial' construct is implemented like `parallel', except for hardcoding dimensions rather than taking them from the relevant clauses, in `expand_omp_target'. Separate codes are used to denote the `serial' construct throughout the middle end, even though the mapping of `serial' to an equivalent `parallel' construct could have been done in the individual language frontends. In particular, this allows to distinguish between compute constructs in warnings, error messages, dumps etc. 2019-11-12 Maciej W. Rozycki <macro@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> Frederik Harwath <frederik@codesourcery.com> Thomas Schwinge <thomas@codesourcery.com> gcc/ * gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL enumeration constant. (is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (is_gimple_omp_offloaded): Likewise. * gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration constant. Adjust the value of ORT_NONE accordingly. (is_gimple_stmt): Handle OACC_SERIAL. (oacc_default_clause): Handle ORT_ACC_SERIAL. (gomp_needs_data_present): Likewise. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_workshare): Handle OACC_SERIAL. (gimplify_expr): Likewise. * omp-expand.c (expand_omp_target): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (build_omp_regions_1, omp_make_gimple_edges): Likewise. * omp-low.c (is_oacc_parallel): Rename function to... (is_oacc_parallel_or_serial): ... this. Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (scan_sharing_clauses): Adjust accordingly. (scan_omp_for): Likewise. (lower_oacc_head_mark): Likewise. (convert_from_firstprivate_int): Likewise. (lower_omp_target): Likewise. (check_omp_nesting_restrictions): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (lower_oacc_reductions): Likewise. (lower_omp_target): Likewise. * tree.def (OACC_SERIAL): New tree code. * tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL. * doc/generic.texi (OpenACC): Document OACC_SERIAL. gcc/c-family/ * c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration constant. * c-pragma.c (oacc_pragmas): Add "serial" entry. gcc/c/ * c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro. (c_parser_oacc_kernels_parallel): Rename function to... (c_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL. (c_parser_omp_construct): Update accordingly. gcc/cp/ * constexpr.c (potential_constant_expression_1): Handle OACC_SERIAL. * parser.c (OACC_SERIAL_CLAUSE_MASK): New macro. (cp_parser_oacc_kernels_parallel): Rename function to... (cp_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL. (cp_parser_omp_construct): Update accordingly. (cp_parser_pragma): Handle PRAGMA_OACC_SERIAL. Fix alphabetic order. * pt.c (tsubst_expr): Handle OACC_SERIAL. gcc/fortran/ * gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP, ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL enumeration constants. (gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL enumeration constants. * match.h (gfc_match_oacc_serial): New prototype. (gfc_match_oacc_serial_loop): Likewise. * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP. * openmp.c (OACC_SERIAL_CLAUSES): New macro. (gfc_match_oacc_serial_loop): New function. (gfc_match_oacc_serial): Likewise. (oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP. (resolve_omp_clauses): Handle EXEC_OACC_SERIAL. (oacc_code_to_statement): Handle EXEC_OACC_SERIAL and EXEC_OACC_SERIAL_LOOP. (gfc_resolve_oacc_directive): Likewise. * parse.c (decode_oacc_directive) <'s'>: Add case for "serial" and "serial loop". (next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL. (gfc_ascii_statement): Likewise. Handle ST_OACC_END_SERIAL_LOOP and ST_OACC_END_SERIAL. (parse_oacc_structured_block): Handle ST_OACC_SERIAL. (parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and ST_OACC_END_SERIAL_LOOP. (parse_executable): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL. (is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_oacc_construct): Handle EXEC_OACC_SERIAL. (gfc_trans_oacc_combined_directive): Handle EXEC_OACC_SERIAL_LOOP. (gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * trans.c (trans_code): Likewise. gcc/testsuite/ * c-c++-common/goacc/parallel-dims.c: New test. * gfortran.dg/goacc/parallel-dims.f90: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test. * testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test. * testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test. * testsuite/libgomp.oacc-fortran/parallel-dims-2.f90: New test. Reviewed-by: Thomas Schwinge <thomas@codesourcery.com> git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@278082 138bc75d-0d04-0410-961f-82ee72b054a4
author: Maciej W. Rozycki <macro@codesourcery.com> 2019-11-12 08:45:35 +0000
committer: Frederik Harwath <frederik@codesourcery.com> 2019-11-12 08:45:35 +0000
commit: 7ecaaf503a77d44bd85500ad8f926f31dc4509a7 (patch)
tree: 083108bb2b5a367f3ad3ecc198ecf2273eb26661 /libgomp
parent: 0f7f7e95c664def5d8a83c3518ab61bf4d43db88 (diff)
4 files changed, 248 insertions, 0 deletions
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 2f60d606a88..734395936f0 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,13 @@
+2019-11-12  Maciej W. Rozycki  <macro@codesourcery.com>
+	    Tobias Burnus  <tobias@codesourcery.com>
+	    Frederik Harwath  <frederik@codesourcery.com>
+	    Thomas Schwinge  <thomas@codesourcery.com>
+
+	libgomp/
+	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test.
+	* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test.
+	* testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test.
+
 2019-11-11  Tobias Burnus  <tobias@codesourcery.com>
 	    Kwok Cheung Yeung  <kcy@codesourcery.com>
 
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index 7e699f476b2..a5edfc6ca16 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -1,6 +1,8 @@
 /* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
    vector_length.  */
 
+/* See also '../libgomp.oacc-fortran/parallel-dims.f90'.  */
+
 #include <limits.h>
 #include <openacc.h>
 #include <gomp-constants.h>
@@ -45,6 +47,8 @@ int main ()
 {
   acc_init (acc_device_default);
 
+  /* OpenACC parallel construct.  */
+
   /* Non-positive value.  */
 
   /* GR, WS, VS.  */
@@ -478,6 +482,8 @@ int main ()
   }
 
 
+  /* OpenACC kernels construct.  */
+
   /* We can't test parallelized OpenACC kernels constructs in this way: use of
      the acc_gang, acc_worker, acc_vector functions will make the construct
      unparallelizable.  */
@@ -544,5 +550,72 @@ int main ()
   }
 
 
+  /* OpenACC serial construct.  */
+
+  /* GR, WS, VS.  */
+  {
+    int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+  reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+    {
+      for (int i = 100; i > -100; i--)
+	{
+	  gangs_min = gangs_max = acc_gang ();
+	  workers_min = workers_max = acc_worker ();
+	  vectors_min = vectors_max = acc_vector ();
+	}
+    }
+    if (gangs_min != 0 || gangs_max != 1 - 1
+	|| workers_min != 0 || workers_max != 1 - 1
+	|| vectors_min != 0 || vectors_max != 1 - 1)
+      __builtin_abort ();
+  }
+
+  /* Composition of GP, WP, VP.  */
+  {
+    int vectors_actual = 1;  /* Implicit 'vector_length (1)' clause.  */
+    int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+    gangs_min = workers_min = vectors_min = INT_MAX;
+    gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+  copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max)
+    {
+      if (acc_on_device (acc_device_nvidia))
+	{
+	  /* The GCC nvptx back end enforces vector_length (32).  */
+	  /* It's unclear if that's actually permissible here;
+	     <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC
+	     'serial' construct might not actually be serial".  */
+	  vectors_actual = 32;
+	}
+#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+      for (int i = 100; i > -100; i--)
+#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	for (int j = 100; j > -100; j--)
+#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+	  for (int k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
+	    {
+	      gangs_min = gangs_max = acc_gang ();
+	      workers_min = workers_max = acc_worker ();
+	      vectors_min = vectors_max = acc_vector ();
+	    }
+    }
+    if (acc_get_device_type () == acc_device_nvidia)
+      {
+	if (vectors_actual != 32)
+	  __builtin_abort ();
+      }
+    else
+      if (vectors_actual != 1)
+	__builtin_abort ();
+    if (gangs_min != 0 || gangs_max != 1 - 1
+	|| workers_min != 0 || workers_max != 1 - 1
+	|| vectors_min != 0 || vectors_max != vectors_actual - 1)
+      __builtin_abort ();
+  }
+
+
   return 0;
 }
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
new file mode 100644
index 00000000000..b5986f4afef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
@@ -0,0 +1,45 @@
+/* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+   vector_length.  */
+
+/* Copied from '../libgomp.oacc-c-c++-common/parallel-dims.c'.  */
+
+/* Used by 'parallel-dims.f90'.  */
+
+#include <limits.h>
+#include <openacc.h>
+#include <gomp-constants.h>
+
+/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
+   not behaving as expected for -O0.  */
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+  else
+    __builtin_abort ();
+}
+
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+{
+  if (acc_on_device ((int) acc_device_host))
+    return 0;
+  else if (acc_on_device ((int) acc_device_nvidia))
+    return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+  else
+    __builtin_abort ();
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
new file mode 100644
index 00000000000..1bfcd6ce099
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -0,0 +1,120 @@
+! OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+! vector_length.
+
+! { dg-additional-sources parallel-dims-aux.c }
+! { dg-do run }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
+
+! See also '../libgomp.oacc-c-c++-common/parallel-dims.c'.
+
+module acc_routines
+  implicit none (type, external)
+
+  interface
+    integer function acc_gang() bind(C)
+      !$acc routine seq
+    end function acc_gang
+
+    integer function acc_worker() bind(C)
+      !$acc routine seq
+    end function acc_worker
+
+    integer function acc_vector() bind(C)
+      !$acc routine seq
+    end function acc_vector
+  end interface
+end module acc_routines
+
+program main
+  use iso_c_binding
+  use openacc
+  use acc_routines
+  implicit none (type, external)
+
+  integer :: gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max
+  integer :: vectors_actual
+  integer :: i, j, k
+
+  call acc_init (acc_device_default)
+
+  ! OpenACC parallel construct.
+
+  !TODO
+
+
+  ! OpenACC kernels construct.
+
+  !TODO
+
+
+  ! OpenACC serial construct.
+
+  ! GR, WS, VS.
+
+  gangs_min = huge(gangs_min) ! INT_MAX
+  workers_min = huge(workers_min) ! INT_MAX
+  vectors_min = huge(vectors_min) ! INT_MAX
+  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
+  workers_max = -huge(gangs_max) - 1 ! INT_MIN
+  vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+  !$acc serial &
+  !$acc   reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+  do i = 100, -99, -1
+     gangs_min = acc_gang ();
+     gangs_max = acc_gang ();
+     workers_min = acc_worker ();
+     workers_max = acc_worker ();
+     vectors_min = acc_vector ();
+     vectors_max = acc_vector ();
+  end do
+  !$acc end serial
+  if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+      .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+      .or. vectors_min /= 0 .or. vectors_max /= 1 - 1) &
+    stop 1
+
+  ! Composition of GP, WP, VP.
+
+  vectors_actual = 1 ! Implicit 'vector_length (1)' clause.
+  gangs_min = huge(gangs_min) ! INT_MAX
+  workers_min = huge(workers_min) ! INT_MAX
+  vectors_min = huge(vectors_min) ! INT_MAX
+  gangs_max = -huge(gangs_max) - 1  ! INT_MIN
+  workers_max = -huge(gangs_max) - 1 ! INT_MIN
+  vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+  !$acc serial copy (vectors_actual) &
+  !$acc   copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+  if (acc_on_device (acc_device_nvidia)) then
+     ! The GCC nvptx back end enforces vector_length (32).
+     ! It's unclear if that's actually permissible here;
+     ! <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC 'serial'
+     ! construct might not actually be serial".
+   vectors_actual = 32
+  end if
+  !$acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+  do i = 100, -99, -1
+     !$acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+     do j = 100, -99, -1
+        !$acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+        do k = 100 * vectors_actual, -99 * vectors_actual, -1
+           gangs_min = acc_gang ();
+           gangs_max = acc_gang ();
+           workers_min = acc_worker ();
+           workers_max = acc_worker ();
+           vectors_min = acc_vector ();
+           vectors_max = acc_vector ();
+        end do
+     end do
+  end do
+  !$acc end serial
+  if (acc_get_device_type () .eq. acc_device_nvidia) then
+     if (vectors_actual /= 32) stop 2
+  else
+     if (vectors_actual /= 1) stop 3
+  end if
+  if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+      .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+      .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &
+    stop 4
+
+end program main
author	Maciej W. Rozycki <macro@codesourcery.com>	2019-11-12 08:45:35 +0000
committer	Frederik Harwath <frederik@codesourcery.com>	2019-11-12 08:45:35 +0000
commit	7ecaaf503a77d44bd85500ad8f926f31dc4509a7 (patch)
tree	083108bb2b5a367f3ad3ecc198ecf2273eb26661 /libgomp
parent	0f7f7e95c664def5d8a83c3518ab61bf4d43db88 (diff)