aboutsummaryrefslogtreecommitdiff
path: root/libgomp
diff options
context:
space:
mode:
authorMaciej W. Rozycki <macro@codesourcery.com>2019-11-12 08:45:35 +0000
committerFrederik Harwath <frederik@codesourcery.com>2019-11-12 08:45:35 +0000
commit7ecaaf503a77d44bd85500ad8f926f31dc4509a7 (patch)
tree083108bb2b5a367f3ad3ecc198ecf2273eb26661 /libgomp
parent0f7f7e95c664def5d8a83c3518ab61bf4d43db88 (diff)
Add OpenACC 2.6 `serial' construct support
The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard) is equivalent to a `parallel' construct with clauses `num_gangs(1) num_workers(1) vector_length(1)' implied. These clauses are therefore not supported with the `serial' construct. All the remaining clauses accepted with `parallel' are also accepted with `serial'. The `serial' construct is implemented like `parallel', except for hardcoding dimensions rather than taking them from the relevant clauses, in `expand_omp_target'. Separate codes are used to denote the `serial' construct throughout the middle end, even though the mapping of `serial' to an equivalent `parallel' construct could have been done in the individual language frontends. In particular, this allows to distinguish between compute constructs in warnings, error messages, dumps etc. 2019-11-12 Maciej W. Rozycki <macro@codesourcery.com> Tobias Burnus <tobias@codesourcery.com> Frederik Harwath <frederik@codesourcery.com> Thomas Schwinge <thomas@codesourcery.com> gcc/ * gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL enumeration constant. (is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (is_gimple_omp_offloaded): Likewise. * gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration constant. Adjust the value of ORT_NONE accordingly. (is_gimple_stmt): Handle OACC_SERIAL. (oacc_default_clause): Handle ORT_ACC_SERIAL. (gomp_needs_data_present): Likewise. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_workshare): Handle OACC_SERIAL. (gimplify_expr): Likewise. * omp-expand.c (expand_omp_target): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (build_omp_regions_1, omp_make_gimple_edges): Likewise. * omp-low.c (is_oacc_parallel): Rename function to... (is_oacc_parallel_or_serial): ... this. Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (scan_sharing_clauses): Adjust accordingly. (scan_omp_for): Likewise. (lower_oacc_head_mark): Likewise. (convert_from_firstprivate_int): Likewise. (lower_omp_target): Likewise. (check_omp_nesting_restrictions): Handle GF_OMP_TARGET_KIND_OACC_SERIAL. (lower_oacc_reductions): Likewise. (lower_omp_target): Likewise. * tree.def (OACC_SERIAL): New tree code. * tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL. * doc/generic.texi (OpenACC): Document OACC_SERIAL. gcc/c-family/ * c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration constant. * c-pragma.c (oacc_pragmas): Add "serial" entry. gcc/c/ * c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro. (c_parser_oacc_kernels_parallel): Rename function to... (c_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL. (c_parser_omp_construct): Update accordingly. gcc/cp/ * constexpr.c (potential_constant_expression_1): Handle OACC_SERIAL. * parser.c (OACC_SERIAL_CLAUSE_MASK): New macro. (cp_parser_oacc_kernels_parallel): Rename function to... (cp_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL. (cp_parser_omp_construct): Update accordingly. (cp_parser_pragma): Handle PRAGMA_OACC_SERIAL. Fix alphabetic order. * pt.c (tsubst_expr): Handle OACC_SERIAL. gcc/fortran/ * gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP, ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL enumeration constants. (gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL enumeration constants. * match.h (gfc_match_oacc_serial): New prototype. (gfc_match_oacc_serial_loop): Likewise. * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP. * openmp.c (OACC_SERIAL_CLAUSES): New macro. (gfc_match_oacc_serial_loop): New function. (gfc_match_oacc_serial): Likewise. (oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP. (resolve_omp_clauses): Handle EXEC_OACC_SERIAL. (oacc_code_to_statement): Handle EXEC_OACC_SERIAL and EXEC_OACC_SERIAL_LOOP. (gfc_resolve_oacc_directive): Likewise. * parse.c (decode_oacc_directive) <'s'>: Add case for "serial" and "serial loop". (next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL. (gfc_ascii_statement): Likewise. Handle ST_OACC_END_SERIAL_LOOP and ST_OACC_END_SERIAL. (parse_oacc_structured_block): Handle ST_OACC_SERIAL. (parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and ST_OACC_END_SERIAL_LOOP. (parse_executable): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL. (is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_oacc_construct): Handle EXEC_OACC_SERIAL. (gfc_trans_oacc_combined_directive): Handle EXEC_OACC_SERIAL_LOOP. (gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL. * trans.c (trans_code): Likewise. gcc/testsuite/ * c-c++-common/goacc/parallel-dims.c: New test. * gfortran.dg/goacc/parallel-dims.f90: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test. * testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test. * testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test. * testsuite/libgomp.oacc-fortran/parallel-dims-2.f90: New test. Reviewed-by: Thomas Schwinge <thomas@codesourcery.com> git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@278082 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'libgomp')
-rw-r--r--libgomp/ChangeLog10
-rw-r--r--libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c73
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c45
-rw-r--r--libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90120
4 files changed, 248 insertions, 0 deletions
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 2f60d606a88..734395936f0 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,13 @@
+2019-11-12 Maciej W. Rozycki <macro@codesourcery.com>
+ Tobias Burnus <tobias@codesourcery.com>
+ Frederik Harwath <frederik@codesourcery.com>
+ Thomas Schwinge <thomas@codesourcery.com>
+
+ libgomp/
+ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test.
+ * testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test.
+ * testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test.
+
2019-11-11 Tobias Burnus <tobias@codesourcery.com>
Kwok Cheung Yeung <kcy@codesourcery.com>
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index 7e699f476b2..a5edfc6ca16 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -1,6 +1,8 @@
/* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
vector_length. */
+/* See also '../libgomp.oacc-fortran/parallel-dims.f90'. */
+
#include <limits.h>
#include <openacc.h>
#include <gomp-constants.h>
@@ -45,6 +47,8 @@ int main ()
{
acc_init (acc_device_default);
+ /* OpenACC parallel construct. */
+
/* Non-positive value. */
/* GR, WS, VS. */
@@ -478,6 +482,8 @@ int main ()
}
+ /* OpenACC kernels construct. */
+
/* We can't test parallelized OpenACC kernels constructs in this way: use of
the acc_gang, acc_worker, acc_vector functions will make the construct
unparallelizable. */
@@ -544,5 +550,72 @@ int main ()
}
+ /* OpenACC serial construct. */
+
+ /* GR, WS, VS. */
+ {
+ int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+ gangs_min = workers_min = vectors_min = INT_MAX;
+ gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+ reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ {
+ for (int i = 100; i > -100; i--)
+ {
+ gangs_min = gangs_max = acc_gang ();
+ workers_min = workers_max = acc_worker ();
+ vectors_min = vectors_max = acc_vector ();
+ }
+ }
+ if (gangs_min != 0 || gangs_max != 1 - 1
+ || workers_min != 0 || workers_max != 1 - 1
+ || vectors_min != 0 || vectors_max != 1 - 1)
+ __builtin_abort ();
+ }
+
+ /* Composition of GP, WP, VP. */
+ {
+ int vectors_actual = 1; /* Implicit 'vector_length (1)' clause. */
+ int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
+ gangs_min = workers_min = vectors_min = INT_MAX;
+ gangs_max = workers_max = vectors_max = INT_MIN;
+#pragma acc serial copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+ copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max)
+ {
+ if (acc_on_device (acc_device_nvidia))
+ {
+ /* The GCC nvptx back end enforces vector_length (32). */
+ /* It's unclear if that's actually permissible here;
+ <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC
+ 'serial' construct might not actually be serial". */
+ vectors_actual = 32;
+ }
+#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ for (int i = 100; i > -100; i--)
+#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ for (int j = 100; j > -100; j--)
+#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ for (int k = 100 * vectors_actual; k > -100 * vectors_actual; k--)
+ {
+ gangs_min = gangs_max = acc_gang ();
+ workers_min = workers_max = acc_worker ();
+ vectors_min = vectors_max = acc_vector ();
+ }
+ }
+ if (acc_get_device_type () == acc_device_nvidia)
+ {
+ if (vectors_actual != 32)
+ __builtin_abort ();
+ }
+ else
+ if (vectors_actual != 1)
+ __builtin_abort ();
+ if (gangs_min != 0 || gangs_max != 1 - 1
+ || workers_min != 0 || workers_max != 1 - 1
+ || vectors_min != 0 || vectors_max != vectors_actual - 1)
+ __builtin_abort ();
+ }
+
+
return 0;
}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
new file mode 100644
index 00000000000..b5986f4afef
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c
@@ -0,0 +1,45 @@
+/* OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+ vector_length. */
+
+/* Copied from '../libgomp.oacc-c-c++-common/parallel-dims.c'. */
+
+/* Used by 'parallel-dims.f90'. */
+
+#include <limits.h>
+#include <openacc.h>
+#include <gomp-constants.h>
+
+/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
+ not behaving as expected for -O0. */
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
+{
+ if (acc_on_device ((int) acc_device_host))
+ return 0;
+ else if (acc_on_device ((int) acc_device_nvidia))
+ return __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+ else
+ __builtin_abort ();
+}
+
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
+{
+ if (acc_on_device ((int) acc_device_host))
+ return 0;
+ else if (acc_on_device ((int) acc_device_nvidia))
+ return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+ else
+ __builtin_abort ();
+}
+
+#pragma acc routine seq
+/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
+{
+ if (acc_on_device ((int) acc_device_host))
+ return 0;
+ else if (acc_on_device ((int) acc_device_nvidia))
+ return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+ else
+ __builtin_abort ();
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
new file mode 100644
index 00000000000..1bfcd6ce099
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -0,0 +1,120 @@
+! OpenACC parallelism dimensions clauses: num_gangs, num_workers,
+! vector_length.
+
+! { dg-additional-sources parallel-dims-aux.c }
+! { dg-do run }
+! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
+
+! See also '../libgomp.oacc-c-c++-common/parallel-dims.c'.
+
+module acc_routines
+ implicit none (type, external)
+
+ interface
+ integer function acc_gang() bind(C)
+ !$acc routine seq
+ end function acc_gang
+
+ integer function acc_worker() bind(C)
+ !$acc routine seq
+ end function acc_worker
+
+ integer function acc_vector() bind(C)
+ !$acc routine seq
+ end function acc_vector
+ end interface
+end module acc_routines
+
+program main
+ use iso_c_binding
+ use openacc
+ use acc_routines
+ implicit none (type, external)
+
+ integer :: gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max
+ integer :: vectors_actual
+ integer :: i, j, k
+
+ call acc_init (acc_device_default)
+
+ ! OpenACC parallel construct.
+
+ !TODO
+
+
+ ! OpenACC kernels construct.
+
+ !TODO
+
+
+ ! OpenACC serial construct.
+
+ ! GR, WS, VS.
+
+ gangs_min = huge(gangs_min) ! INT_MAX
+ workers_min = huge(workers_min) ! INT_MAX
+ vectors_min = huge(vectors_min) ! INT_MAX
+ gangs_max = -huge(gangs_max) - 1 ! INT_MIN
+ workers_max = -huge(gangs_max) - 1 ! INT_MIN
+ vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+ !$acc serial &
+ !$acc reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+ do i = 100, -99, -1
+ gangs_min = acc_gang ();
+ gangs_max = acc_gang ();
+ workers_min = acc_worker ();
+ workers_max = acc_worker ();
+ vectors_min = acc_vector ();
+ vectors_max = acc_vector ();
+ end do
+ !$acc end serial
+ if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+ .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+ .or. vectors_min /= 0 .or. vectors_max /= 1 - 1) &
+ stop 1
+
+ ! Composition of GP, WP, VP.
+
+ vectors_actual = 1 ! Implicit 'vector_length (1)' clause.
+ gangs_min = huge(gangs_min) ! INT_MAX
+ workers_min = huge(workers_min) ! INT_MAX
+ vectors_min = huge(vectors_min) ! INT_MAX
+ gangs_max = -huge(gangs_max) - 1 ! INT_MIN
+ workers_max = -huge(gangs_max) - 1 ! INT_MIN
+ vectors_max = -huge(gangs_max) - 1 ! INT_MIN
+ !$acc serial copy (vectors_actual) &
+ !$acc copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } }
+ if (acc_on_device (acc_device_nvidia)) then
+ ! The GCC nvptx back end enforces vector_length (32).
+ ! It's unclear if that's actually permissible here;
+ ! <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC 'serial'
+ ! construct might not actually be serial".
+ vectors_actual = 32
+ end if
+ !$acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ do i = 100, -99, -1
+ !$acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ do j = 100, -99, -1
+ !$acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max)
+ do k = 100 * vectors_actual, -99 * vectors_actual, -1
+ gangs_min = acc_gang ();
+ gangs_max = acc_gang ();
+ workers_min = acc_worker ();
+ workers_max = acc_worker ();
+ vectors_min = acc_vector ();
+ vectors_max = acc_vector ();
+ end do
+ end do
+ end do
+ !$acc end serial
+ if (acc_get_device_type () .eq. acc_device_nvidia) then
+ if (vectors_actual /= 32) stop 2
+ else
+ if (vectors_actual /= 1) stop 3
+ end if
+ if (gangs_min /= 0 .or. gangs_max /= 1 - 1 &
+ .or. workers_min /= 0 .or. workers_max /= 1 - 1 &
+ .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) &
+ stop 4
+
+end program main