path: root/libomptarget/deviceRTLs/nvptx/src/libcall.cu
Age | Commit message | Author
2019-05-13 | [OPENMP][NVPTX]Simplify handling of thread limit, NFC. | Alexey Bataev
Summary: Patch improves performance of the full runtime mode by moving the thread limit counter to shared memory. It also saves global memory.
Reviewers: grokos, kkwli0, gtbercea
Subscribers: guansong, jdoerfert, openmp-commits, caomhin
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61801
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@360584 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-10 | [OPENMP][NVPTX]Improve number of threads counter, NFC. | Alexey Bataev
Summary: Patch improves performance of the full runtime mode by moving the number-of-threads counter to shared memory. It also saves global memory.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jfb, jdoerfert, openmp-commits, caomhin
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61785
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@360457 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 | [OPENMP][NVPTX]Improve thread limit counter, NFC. | Alexey Bataev
Summary: Patch improves performance of the full runtime mode by moving the thread-limit counter to shared memory. It also saves global memory.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jdoerfert, caomhin, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61526
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359922 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 | [OPENMP][NVPTX]Improved several standard OpenMP functions, NFC. | Alexey Bataev
Summary: Used the parallelLevel[] counter to simplify and improve the implementation of the existing standard OpenMP functions. The functions are already covered by several tests; the patch is NFC.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jdoerfert, caomhin, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61459
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359892 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-02 | [OPENMP][NVPTX]Improve code by using parallel level counter. | Alexey Bataev
Summary: Previously, for various purposes we needed to get the active/common parallel level, and with the full runtime we iterated over all the records to calculate it. Instead, we can use the warp-based parallel level counters used in no-runtime mode.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jfb, jdoerfert, caomhin, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61395
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359822 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-02 | [OPENMP][NVPTX]Improve omp_get_max_threads() function. | Alexey Bataev
Summary: Function omp_get_max_threads() can always return 1 if the current execution mode is SPMD.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jdoerfert, caomhin, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61379
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359792 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-02 | [OPENMP][NVPTX]Improved omp_get_thread_limit() function. | Alexey Bataev
Summary: Function omp_get_thread_limit() in SPMD mode can return the maximum available number of threads as a result.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jdoerfert, openmp-commits, caomhin
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D61378
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359790 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 | [OPENMP][NVPTX]Correctly handle L2 parallelism in SPMD mode. | Alexey Bataev
Summary: The parallelLevel counter must be on a per-thread basis to fully support L2+ parallelism; otherwise we may end up with undefined behavior. Introduce the parallelLevel on a per-warp basis using shared memory. This avoids the synchronization problems and allows full support for L2+ parallelism in SPMD mode with no runtime.
Reviewers: gtbercea, grokos
Subscribers: guansong, jdoerfert, caomhin, kkwli0, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D60918
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@359341 91177308-0d34-0410-b5e6-96231b3b80d8
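Note: the per-warp bookkeeping described in the commit above can be illustrated with a small host-side C++ sketch. It is not the code from libcall.cu. The names TeamParallelLevels, enterParallel, exitParallel and get, and the 1024-thread team bound, are assumptions made for illustration, and a plain std::array stands in for the shared-memory array. Only the warp size of 32 is a property of NVIDIA hardware.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kWarpSize = 32;             // NVIDIA warp size
constexpr int kMaxThreadsPerTeam = 1024;  // assumed upper bound, for illustration
constexpr int kWarpsPerTeam = kMaxThreadsPerTeam / kWarpSize;

// Hypothetical host-side model: one nesting counter per warp of a team.
// On a device this array would live in shared memory.
struct TeamParallelLevels {
  std::array<std::uint8_t, kWarpsPerTeam> level{};  // zero-initialized

  // Map a thread id within the team to the index of its warp.
  static int warpOf(int threadId) {
    assert(threadId >= 0 && threadId < kMaxThreadsPerTeam);
    return threadId / kWarpSize;
  }

  // Called once per warp when that warp enters a parallel region.
  void enterParallel(int threadId) { ++level[warpOf(threadId)]; }

  // Called once per warp when that warp leaves a parallel region.
  void exitParallel(int threadId) {
    const int w = warpOf(threadId);
    assert(level[w] > 0);
    --level[w];
  }

  // Any thread can read the nesting level recorded for its own warp.
  int get(int threadId) const { return level[warpOf(threadId)]; }
};
```

Because every thread of a warp maps to the same slot, warps never touch each other's counters. The model assumes the caller performs each update once per warp, as a single lane would on a device.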
2019-04-15 | [OPENMP][NVPTX]Fix dynamic scheduling in L2+ SPMD parallel regions. | Alexey Bataev
Summary: If the kernel is executed in SPMD mode and an L2+ parallel for region with dynamic scheduling is executed, the dynamic scheduling functions are called. They expect full runtime support, but SPMD kernels may be executed without the full runtime. This leads to a runtime crash of the compiled program. The patch fixes this problem and also fixes the handling of the parallelism level in SPMD mode, which is required as part of this patch.
Reviewers: gtbercea, kkwli0, grokos
Subscribers: guansong, jdoerfert, openmp-commits, caomhin
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D60578
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@358442 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 | Update more file headers across all of the LLVM projects in the monorepo to reflect the new license. | Chandler Carruth
These used slightly different spellings that defeated my regular expressions.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@351648 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-09 | [OpenMP][libomptarget] Use shared memory variable for tracking parallel level | Gheorghe-Teodor Bercea
Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases.
Reviewers: ABataev, caomhin
Reviewed By: ABataev
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D55773
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@350747 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-04 | [OPENMP][NVPTX]Improve performance + reduce number of used registers. | Alexey Bataev
Summary: Reduced the number of used registers and improved performance by propagating the information about the current execution/data sharing mode directly from the compiler, where possible. In some cases, this requires new or reworked interfaces for the runtime external functions. The old functions are marked as deprecated.
Reviewers: grokos, gtbercea, kkwli0
Subscribers: guansong, jfb, openmp-commits, caomhin
Differential Revision: https://reviews.llvm.org/D56278
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@350405 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-06 | [OPENMP][NVPTX]Correct type casting for printf args + simplified shfl64 function. | Alexey Bataev
Summary: Explicitly cast printf's args to the required types + simplified shfl64 function.
Reviewers: gtbercea, kkwli0
Subscribers: guansong, jfb, caomhin, openmp-commits
Differential Revision: https://reviews.llvm.org/D55379
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@348521 91177308-0d34-0410-b5e6-96231b3b80d8
2018-11-20 | [OPENMP][NVPTX]Improved lock/critical constructs. | Alexey Bataev
Summary: Improved support for critical constructs + omp_..._lock... constructs.
Reviewers: gtbercea, kkwli0, caomhin
Subscribers: guansong, jfb, openmp-commits
Differential Revision: https://reviews.llvm.org/D54766
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@347342 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 | [libomptarget-nvptx] Fix ancestor_thread_num and team_size (non-SPMD) | Jonas Hahnfeld
According to OpenMP 4.5, p250:12-14: If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1.
The SPMD code path will need a similar fix.
Differential Revision: https://reviews.llvm.org/D51787
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@343401 91177308-0d34-0410-b5e6-96231b3b80d8
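Note: the range rule quoted from the OpenMP 4.5 specification can be sketched in host-side C++ as follows. This is an illustration under stated assumptions, not the code from libcall.cu: NestingState, ancestorThreadNum and teamSize are hypothetical names, and the vectors stand in for whatever per-thread records the runtime actually keeps.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one thread's nesting state (not the runtime's layout).
struct NestingState {
  // threadNums[k]: number of this thread's ancestor (or itself) at nest level k.
  // Level 0 is the initial thread, whose number is 0.
  std::vector<int> threadNums;
  // teamSizes[k]: size of the team at nest level k; the level-0 team has size 1.
  std::vector<int> teamSizes;

  // Counterpart of omp_get_level(): number of enclosing parallel regions.
  int level() const { return static_cast<int>(threadNums.size()) - 1; }
};

// Sketch of omp_get_ancestor_thread_num(): -1 outside [0, omp_get_level()].
int ancestorThreadNum(const NestingState &s, int level) {
  assert(!s.threadNums.empty() && s.threadNums.size() == s.teamSizes.size());
  if (level < 0 || level > s.level())
    return -1;
  return s.threadNums[level];
}

// Sketch of omp_get_team_size(): the same range check applies.
int teamSize(const NestingState &s, int level) {
  assert(!s.teamSizes.empty() && s.threadNums.size() == s.teamSizes.size());
  if (level < 0 || level > s.level())
    return -1;
  return s.teamSizes[level];
}
```

For a thread numbered 3 in a team of 8 at nest level 1, the state would be `NestingState{{0, 3}, {1, 8}}`: a query for level 1 yields 3 and 8, while a query for level 2 or -1 yields -1 from both functions.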
2018-09-29 | [libomptarget-nvptx] Ignore calls to dynamic API | Jonas Hahnfeld
There is no support, and according to OpenMP 4.5, p238:7-9: For implementations that do not support dynamic adjustment of the number of threads this routine has no effect: the value of dyn-var remains false.
Add a test that cancellation and nested parallelism aren't supported either.
Differential Revision: https://reviews.llvm.org/D51785
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@343381 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 | [libomptarget-nvptx] Fix number of threads in parallel | Jonas Hahnfeld
If there is no num_threads() clause we must consider the nthreads-var ICV. Its value is set by omp_set_num_threads() and can be queried using omp_get_max_threads(). The rewritten code now closely resembles the algorithm given in the OpenMP standard.
Differential Revision: https://reviews.llvm.org/D51783
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@343380 91177308-0d34-0410-b5e6-96231b3b80d8
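Note: the selection rule described in the commit above can be sketched in host-side C++ for the simplest case, a non-nested parallel region with dyn-var false. This is a simplified illustration, not the runtime's code and not the full algorithm from the OpenMP standard. The function name threadsForParallel, the convention that 0 means "no num_threads() clause", and the choice to clamp an oversized request to the thread limit are all assumptions.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of the libomptarget device runtime.
//   numThreadsClause: value of the num_threads() clause, or 0 if the clause
//                     is absent (assumed convention).
//   nthreadsVar:      nthreads-var ICV, as set by omp_set_num_threads().
//   threadLimit:      number of threads available to the region.
int threadsForParallel(int numThreadsClause, int nthreadsVar, int threadLimit) {
  assert(numThreadsClause >= 0 && nthreadsVar >= 1 && threadLimit >= 1);
  // The clause wins when present; otherwise fall back to the ICV.
  const int requested = numThreadsClause > 0 ? numThreadsClause : nthreadsVar;
  // Assumed policy: never hand out more threads than are available.
  return std::min(requested, threadLimit);
}
```

With an ICV of 64 and a limit of 128, a region without a clause gets 64 threads, `num_threads(32)` gets 32, and `num_threads(256)` is clamped to 128 under the assumed policy.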
2018-08-29 | [OPENMP][NVPTX] Replace assert() by ASSERT0() macro, NFC. | Alexey Bataev
Required to fix the buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@340956 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-29 | [OPENMP][NVPTX] Lightweight runtime support for SPMD mode. | Alexey Bataev
Summary: Implemented simple and lightweight runtime support for SPMD mode-based constructs. It adds support for L2 sequential parallelism without full runtime support. The patch also fixes some use cases for uninitialized|lightweight runtime.
Reviewers: grokos, kkwli0, Hahnfeld, gtbercea
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D51222
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@340944 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-26 | [OpenMP] Remove compilation warning when using clang to compile bc files. | Guansong Zhang
Summary: Minor printf format correction. NVCC ignores these; Clang warns about them if debug is enabled.
Reviewers: grokos
Reviewed By: grokos
Subscribers: openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D45528
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@330944 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-29 | [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs. | George Rokos
This patch implements the device runtime library whose interface is used in the code generation for OpenMP offloading devices. Currently there is a single device RTL, written in CUDA and meant for CUDA-enabled GPUs. The interface is a variation of the kmpc interface that includes some extra calls to do thread and storage management that only make sense for a GPU target.
Differential revision: https://reviews.llvm.org/D14254
git-svn-id: https://llvm.org/svn/llvm-project/openmp/trunk@323649 91177308-0d34-0410-b5e6-96231b3b80d8