diff options
author | Maciej W. Rozycki <macro@codesourcery.com> | 2019-11-12 08:45:35 +0000 |
---|---|---|
committer | Frederik Harwath <frederik@codesourcery.com> | 2019-11-12 08:45:35 +0000 |
commit | 7ecaaf503a77d44bd85500ad8f926f31dc4509a7 (patch) | |
tree | 083108bb2b5a367f3ad3ecc198ecf2273eb26661 /libgomp | |
parent | 0f7f7e95c664def5d8a83c3518ab61bf4d43db88 (diff) |
Add OpenACC 2.6 `serial' construct support
The `serial' construct (cf. section 2.5.3 of the OpenACC 2.6 standard)
is equivalent to a `parallel' construct with clauses `num_gangs(1)
num_workers(1) vector_length(1)' implied.
These clauses are therefore not supported with the `serial'
construct. All the remaining clauses accepted with `parallel' are also
accepted with `serial'.
The `serial' construct is implemented like `parallel', except for
hardcoding dimensions rather than taking them from the relevant
clauses, in `expand_omp_target'.
Separate codes are used to denote the `serial' construct throughout the
middle end, even though the mapping of `serial' to an equivalent
`parallel' construct could have been done in the individual language
frontends. In particular, this allows to distinguish between compute
constructs in warnings, error messages, dumps etc.
2019-11-12 Maciej W. Rozycki <macro@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_SERIAL
enumeration constant.
(is_gimple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(is_gimple_omp_offloaded): Likewise.
* gimplify.c (omp_region_type): Add ORT_ACC_SERIAL enumeration
constant. Adjust the value of ORT_NONE accordingly.
(is_gimple_stmt): Handle OACC_SERIAL.
(oacc_default_clause): Handle ORT_ACC_SERIAL.
(gomp_needs_data_present): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_workshare): Handle OACC_SERIAL.
(gimplify_expr): Likewise.
* omp-expand.c (expand_omp_target):
Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(build_omp_regions_1, omp_make_gimple_edges): Likewise.
* omp-low.c (is_oacc_parallel): Rename function to...
(is_oacc_parallel_or_serial): ... this.
Handle GF_OMP_TARGET_KIND_OACC_SERIAL.
(scan_sharing_clauses): Adjust accordingly.
(scan_omp_for): Likewise.
(lower_oacc_head_mark): Likewise.
(convert_from_firstprivate_int): Likewise.
(lower_omp_target): Likewise.
(check_omp_nesting_restrictions): Handle
GF_OMP_TARGET_KIND_OACC_SERIAL.
(lower_oacc_reductions): Likewise.
(lower_omp_target): Likewise.
* tree.def (OACC_SERIAL): New tree code.
* tree-pretty-print.c (dump_generic_node): Handle OACC_SERIAL.
* doc/generic.texi (OpenACC): Document OACC_SERIAL.
gcc/c-family/
* c-pragma.h (pragma_kind): Add PRAGMA_OACC_SERIAL enumeration
constant.
* c-pragma.c (oacc_pragmas): Add "serial" entry.
gcc/c/
* c-parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(c_parser_oacc_kernels_parallel): Rename function to...
(c_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(c_parser_omp_construct): Update accordingly.
gcc/cp/
* constexpr.c (potential_constant_expression_1): Handle
OACC_SERIAL.
* parser.c (OACC_SERIAL_CLAUSE_MASK): New macro.
(cp_parser_oacc_kernels_parallel): Rename function to...
(cp_parser_oacc_compute): ... this. Handle PRAGMA_OACC_SERIAL.
(cp_parser_omp_construct): Update accordingly.
(cp_parser_pragma): Handle PRAGMA_OACC_SERIAL. Fix alphabetic
order.
* pt.c (tsubst_expr): Handle OACC_SERIAL.
gcc/fortran/
* gfortran.h (gfc_statement): Add ST_OACC_SERIAL_LOOP,
ST_OACC_END_SERIAL_LOOP, ST_OACC_SERIAL and ST_OACC_END_SERIAL
enumeration constants.
(gfc_exec_op): Add EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL
enumeration constants.
* match.h (gfc_match_oacc_serial): New prototype.
(gfc_match_oacc_serial_loop): Likewise.
* dump-parse-tree.c (show_omp_node, show_code_node): Handle
EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* match.c (match_exit_cycle): Handle EXEC_OACC_SERIAL_LOOP.
* openmp.c (OACC_SERIAL_CLAUSES): New macro.
(gfc_match_oacc_serial_loop): New function.
(gfc_match_oacc_serial): Likewise.
(oacc_is_loop): Handle EXEC_OACC_SERIAL_LOOP.
(resolve_omp_clauses): Handle EXEC_OACC_SERIAL.
(oacc_code_to_statement): Handle EXEC_OACC_SERIAL and
EXEC_OACC_SERIAL_LOOP.
(gfc_resolve_oacc_directive): Likewise.
* parse.c (decode_oacc_directive) <'s'>: Add case for "serial"
and "serial loop".
(next_statement): Handle ST_OACC_SERIAL_LOOP and ST_OACC_SERIAL.
(gfc_ascii_statement): Likewise. Handle ST_OACC_END_SERIAL_LOOP
and ST_OACC_END_SERIAL.
(parse_oacc_structured_block): Handle ST_OACC_SERIAL.
(parse_oacc_loop): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_END_SERIAL_LOOP.
(parse_executable): Handle ST_OACC_SERIAL_LOOP and
ST_OACC_SERIAL.
(is_oacc): Handle EXEC_OACC_SERIAL_LOOP and EXEC_OACC_SERIAL.
* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_oacc_construct): Handle
EXEC_OACC_SERIAL.
(gfc_trans_oacc_combined_directive): Handle
EXEC_OACC_SERIAL_LOOP.
(gfc_trans_oacc_directive): Handle EXEC_OACC_SERIAL_LOOP and
EXEC_OACC_SERIAL.
* trans.c (trans_code): Likewise.
gcc/testsuite/
* c-c++-common/goacc/parallel-dims.c: New test.
* gfortran.dg/goacc/parallel-dims.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test.
* testsuite/libgomp.oacc-fortran/parallel-dims-2.f90: New test.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@278082 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'libgomp')
-rw-r--r-- | libgomp/ChangeLog | 10 | ||||
-rw-r--r-- | libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c | 73 | ||||
-rw-r--r-- | libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c | 45 | ||||
-rw-r--r-- | libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 | 120 |
4 files changed, 248 insertions, 0 deletions
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog index 2f60d606a88..734395936f0 100644 --- a/libgomp/ChangeLog +++ b/libgomp/ChangeLog @@ -1,3 +1,13 @@ +2019-11-12 Maciej W. Rozycki <macro@codesourcery.com> + Tobias Burnus <tobias@codesourcery.com> + Frederik Harwath <frederik@codesourcery.com> + Thomas Schwinge <thomas@codesourcery.com> + + libgomp/ + * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: New test. + * testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: New test. + * testsuite/libgomp.oacc-fortran/parallel-dims.f89: New test. + 2019-11-11 Tobias Burnus <tobias@codesourcery.com> Kwok Cheung Yeung <kcy@codesourcery.com> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c index 7e699f476b2..a5edfc6ca16 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c @@ -1,6 +1,8 @@ /* OpenACC parallelism dimensions clauses: num_gangs, num_workers, vector_length. */ +/* See also '../libgomp.oacc-fortran/parallel-dims.f90'. */ + #include <limits.h> #include <openacc.h> #include <gomp-constants.h> @@ -45,6 +47,8 @@ int main () { acc_init (acc_device_default); + /* OpenACC parallel construct. */ + /* Non-positive value. */ /* GR, WS, VS. */ @@ -478,6 +482,8 @@ int main () } + /* OpenACC kernels construct. */ + /* We can't test parallelized OpenACC kernels constructs in this way: use of the acc_gang, acc_worker, acc_vector functions will make the construct unparallelizable. */ @@ -544,5 +550,72 @@ int main () } + /* OpenACC serial construct. */ + + /* GR, WS, VS. */ + { + int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max; + gangs_min = workers_min = vectors_min = INT_MAX; + gangs_max = workers_max = vectors_max = INT_MIN; +#pragma acc serial /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \ + reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + { + for (int i = 100; i > -100; i--) + { + gangs_min = gangs_max = acc_gang (); + workers_min = workers_max = acc_worker (); + vectors_min = vectors_max = acc_vector (); + } + } + if (gangs_min != 0 || gangs_max != 1 - 1 + || workers_min != 0 || workers_max != 1 - 1 + || vectors_min != 0 || vectors_max != 1 - 1) + __builtin_abort (); + } + + /* Composition of GP, WP, VP. */ + { + int vectors_actual = 1; /* Implicit 'vector_length (1)' clause. */ + int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max; + gangs_min = workers_min = vectors_min = INT_MAX; + gangs_max = workers_max = vectors_max = INT_MIN; +#pragma acc serial copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \ + copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max) + { + if (acc_on_device (acc_device_nvidia)) + { + /* The GCC nvptx back end enforces vector_length (32). */ + /* It's unclear if that's actually permissible here; + <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC + 'serial' construct might not actually be serial". */ + vectors_actual = 32; + } +#pragma acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + for (int i = 100; i > -100; i--) +#pragma acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + for (int j = 100; j > -100; j--) +#pragma acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + for (int k = 100 * vectors_actual; k > -100 * vectors_actual; k--) + { + gangs_min = gangs_max = acc_gang (); + workers_min = workers_max = acc_worker (); + vectors_min = vectors_max = acc_vector (); + } + } + if (acc_get_device_type () == acc_device_nvidia) + { + if (vectors_actual != 32) + __builtin_abort (); + } + else + if (vectors_actual != 1) + __builtin_abort (); + if (gangs_min != 0 || gangs_max != 1 - 1 + || workers_min != 0 || workers_max != 1 - 1 + || vectors_min != 0 || vectors_max != vectors_actual - 1) + __builtin_abort (); + } + + return 0; } diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c new file mode 100644 index 00000000000..b5986f4afef --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims-aux.c @@ -0,0 +1,45 @@ +/* OpenACC parallelism dimensions clauses: num_gangs, num_workers, + vector_length. */ + +/* Copied from '../libgomp.oacc-c-c++-common/parallel-dims.c'. */ + +/* Used by 'parallel-dims.f90'. */ + +#include <limits.h> +#include <openacc.h> +#include <gomp-constants.h> + +/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper + not behaving as expected for -O0. */ +#pragma acc routine seq +/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_gang () +{ + if (acc_on_device ((int) acc_device_host)) + return 0; + else if (acc_on_device ((int) acc_device_nvidia)) + return __builtin_goacc_parlevel_id (GOMP_DIM_GANG); + else + __builtin_abort (); +} + +#pragma acc routine seq +/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_worker () +{ + if (acc_on_device ((int) acc_device_host)) + return 0; + else if (acc_on_device ((int) acc_device_nvidia)) + return __builtin_goacc_parlevel_id (GOMP_DIM_WORKER); + else + __builtin_abort (); +} + +#pragma acc routine seq +/* static */ unsigned int __attribute__ ((optimize ("O2"))) acc_vector () +{ + if (acc_on_device ((int) acc_device_host)) + return 0; + else if (acc_on_device ((int) acc_device_nvidia)) + return __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR); + else + __builtin_abort (); +} diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 new file mode 100644 index 00000000000..1bfcd6ce099 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 @@ -0,0 +1,120 @@ +! OpenACC parallelism dimensions clauses: num_gangs, num_workers, +! vector_length. + +! { dg-additional-sources parallel-dims-aux.c } +! { dg-do run } +! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" } + +! See also '../libgomp.oacc-c-c++-common/parallel-dims.c'. + +module acc_routines + implicit none (type, external) + + interface + integer function acc_gang() bind(C) + !$acc routine seq + end function acc_gang + + integer function acc_worker() bind(C) + !$acc routine seq + end function acc_worker + + integer function acc_vector() bind(C) + !$acc routine seq + end function acc_vector + end interface +end module acc_routines + +program main + use iso_c_binding + use openacc + use acc_routines + implicit none (type, external) + + integer :: gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max + integer :: vectors_actual + integer :: i, j, k + + call acc_init (acc_device_default) + + ! OpenACC parallel construct. + + !TODO + + + ! OpenACC kernels construct. + + !TODO + + + ! OpenACC serial construct. + + ! GR, WS, VS. + + gangs_min = huge(gangs_min) ! INT_MAX + workers_min = huge(workers_min) ! INT_MAX + vectors_min = huge(vectors_min) ! INT_MAX + gangs_max = -huge(gangs_max) - 1 ! INT_MIN + workers_max = -huge(gangs_max) - 1 ! INT_MIN + vectors_max = -huge(gangs_max) - 1 ! INT_MIN + !$acc serial & + !$acc reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } + do i = 100, -99, -1 + gangs_min = acc_gang (); + gangs_max = acc_gang (); + workers_min = acc_worker (); + workers_max = acc_worker (); + vectors_min = acc_vector (); + vectors_max = acc_vector (); + end do + !$acc end serial + if (gangs_min /= 0 .or. gangs_max /= 1 - 1 & + .or. workers_min /= 0 .or. workers_max /= 1 - 1 & + .or. vectors_min /= 0 .or. vectors_max /= 1 - 1) & + stop 1 + + ! Composition of GP, WP, VP. + + vectors_actual = 1 ! Implicit 'vector_length (1)' clause. + gangs_min = huge(gangs_min) ! INT_MAX + workers_min = huge(workers_min) ! INT_MAX + vectors_min = huge(vectors_min) ! INT_MAX + gangs_max = -huge(gangs_max) - 1 ! INT_MIN + workers_max = -huge(gangs_max) - 1 ! INT_MIN + vectors_max = -huge(gangs_max) - 1 ! INT_MIN + !$acc serial copy (vectors_actual) & + !$acc copy (gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max) ! { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } + if (acc_on_device (acc_device_nvidia)) then + ! The GCC nvptx back end enforces vector_length (32). + ! It's unclear if that's actually permissible here; + ! <https://github.com/OpenACC/openacc-spec/issues/238> "OpenACC 'serial' + ! construct might not actually be serial". + vectors_actual = 32 + end if + !$acc loop gang reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + do i = 100, -99, -1 + !$acc loop worker reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + do j = 100, -99, -1 + !$acc loop vector reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) + do k = 100 * vectors_actual, -99 * vectors_actual, -1 + gangs_min = acc_gang (); + gangs_max = acc_gang (); + workers_min = acc_worker (); + workers_max = acc_worker (); + vectors_min = acc_vector (); + vectors_max = acc_vector (); + end do + end do + end do + !$acc end serial + if (acc_get_device_type () .eq. acc_device_nvidia) then + if (vectors_actual /= 32) stop 2 + else + if (vectors_actual /= 1) stop 3 + end if + if (gangs_min /= 0 .or. gangs_max /= 1 - 1 & + .or. workers_min /= 0 .or. workers_max /= 1 - 1 & + .or. vectors_min /= 0 .or. vectors_max /= vectors_actual - 1) & + stop 4 + +end program main |