aboutsummaryrefslogtreecommitdiff
path: root/openmp
AgeCommit message (Collapse)Author
2019-01-15[OpenMP] Fix for nested proc_bind affinity bugJonathan Peyton
Using proc_bind clause on a nested #pragma omp parallel region with KMP_AFFINITY set causes an assertion error. This assertion occurs because the place-partition-var is not properly initialized in the nested master threads. Trying to get an intuitive result with KMP_AFFINITY + proc_bind is difficult because of how the KMP_AFFINITY gtid-to-place mapping occurs. This patch creates an initial place list no matter what affinity mechanism is used. For KMP_AFFINITY, the place-partition-var is initialized to all the places. Differential Revision: https://reviews.llvm.org/D55795 llvm-svn: 351227
2019-01-15[OpenMP] Add lock function definitions to fix Bug 40042Jonathan Peyton
This change fixes the sanity issue reported in Bug 40042. Lock function definitions for the three lock kinds were added to disambiguate calls to the lock functions done directly and indirectly. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40042 Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D56103 llvm-svn: 351224
2019-01-15[OpenMP][Cmake] Allowed OpenMP testing detect test compiler with same generatorJonathan Peyton
Fix ninja build detect test compiler failed under windows. Patch by Peiyuan Song Differential Revision: https://reviews.llvm.org/D53479 llvm-svn: 351223
2019-01-15[OpenMP] Fix performance regression in SPEC kdtree testJonathan Peyton
Make __ompt_implicit_task_end a static function and remove the inline part. Remove pId variable that is unused. This fixes small regression in SPEC kdtree benchmark. Also reformat some of __ompt_implicit_task_end. Differential Revision: https://reviews.llvm.org/D55788 llvm-svn: 351221
2019-01-15[OMPT] Second chunk of final OMPT 5.0 interface updatesJoachim Protze
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface is implemented as specified. The other changes are necessary to update the interface implementation to the final version as published in 5.0. The omp-tools.h header was previously called ompt.h, currently a copy under this name is installed for legacy tools. Patch partially perpared by @sconvent Reviewers: AndreyChurbanov, hbae, Hahnfeld Reviewed By: hbae Tags: #openmp, #ompt Differential Revision: https://reviews.llvm.org/D55579 llvm-svn: 351197
2019-01-15Update year in license filesHans Wennborg
In last year's update (D48219) it was suggested that the release manager might want to do this, so here we go. llvm-svn: 351194
2019-01-13[OpenMP] Fix LIBOMP_USE_DEBUGGER=ON build (PR38612)Roman Lebedev
Summary: Two things: 1. Those two variables had the wrong sigdness, which was resulting in "sign mismatch in comparison" warning. 2. The whole `kmp_debugger.cpp` wasn't being built, or rather, it was being built as-if `USE_DEBUGGER` was off, thus, nothing provided the definition of `__kmp_omp_debug_struct_info`, `__kmp_debugging`. Makes sense, because `USE_DEBUGGER` is set in `kmp_config.h`, which is not included explicitly. It is included by `kmp.h`, but that one is only included inside of the `#if USE_DEBUGGER` block.. I *think* this is the only source file with this issue, everything else seem to `#include` either `kmp.h` or `kmp_config.h`. The alternative solution would be to add `add_compile_options(-include kmp_config.h)` in CMake. I did verify that `__kmp_omp_debug_struct_info` becomes available with this patch. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=38612 | PR38612 ]]. Reviewers: AndreyChurbanov, jlpeyton, Hahnfeld Reviewed By: jlpeyton Subscribers: guansong, jfb, openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D55783 llvm-svn: 351019
2019-01-09[OpenMP][libomptarget] Use shared memory variable for tracking parallel levelGheorghe-Teodor Bercea
Summary: Replace existing infrastructure for tracking parallel level using global memory with a per-team shared memory variable. This minimizes the impact of the overhead of tracking the parallel level for non-nested cases. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55773 llvm-svn: 350747
2019-01-09Doc: fixed description of a parameter of the __kmpc_taskloopAndrey Churbanov
Patch by sergi.mateo.bellido@gmail.com Differential Revision: https://reviews.llvm.org/D56432 llvm-svn: 350713
2019-01-07[OPENMP][NVPTX]Fix dynamic scheduling.Alexey Bataev
Summary: Previous implementation may cause the runtime crash when the number of teams is > 1024. Patch fixes this problem + reduces number of the atomic operations by 32 times. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56332 llvm-svn: 350524
2019-01-04[OPENMP][NVPTX]General formatting/code improvement, NFC.Alexey Bataev
Summary: Formatting. Reviewers: gtbercea, grokos, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56290 llvm-svn: 350431
2019-01-04[OPENMP][NVPTX]Improve performance + reduce number of used registers.Alexey Bataev
Summary: Reduced number of the used register + improved performance propagating the information about current execution/data sharing mode directly from the compiler, where it is possible. In some cases, it requires new/reworked interfaces of the runtime external functions. Old functions are marked as deprecated. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, jfb, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56278 llvm-svn: 350405
2019-01-04[OpenMP] Fix nvidia-cuda-toolkit detection on Debian/UbuntuJoel E. Denny
The OpenMP runtime's cmake scripts do not correctly locate the libdevice that the Debian/Ubuntu package nvidia-cuda-toolkit currently includes, at least on my Ubuntu 18.04.1 installation. This patch fixes that for me. This problem was discussed at length in D55269. D40453 added a similar adjustment in clang, but reviewers of D55269 concluded that, for the OpenMP runtime, the right place to address this problem is in cmake's CUDA support. However, it was also suggested we could add a workaround to OpenMP's cmake scripts now. This patch contains such a workaround, which I've tried to design so that it will have no harmful effect if cmake improves in the future. nvidia-cuda-toolkit also needs improvements because its intended monolithic CUDA tree shim, /usr/lib/cuda, has many empty directories, such as bin. I reported that at: <https://bugs.launchpad.net/ubuntu/+source/nvidia-cuda-toolkit/+bug/1808999> Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D55588 llvm-svn: 350377
2019-01-03[OpenMP] Add omp_get_device_num() and update several other device API functionsJonathan Peyton
Add omp_get_device_num() function for 5.0 which returns the number of the device the current thread is running on. Currently, we are leaving it to the compiler to handle this properly if it is called inside target. Also, did some cleanup and updating of duplicate device API functions (in both libomp and libomptarget) to make them into weak functions that check for the symbol from libomptarget, and will call the version in libomptarget if it is present. If any additional device API functions are implemented also in libomptarget in the future, we should add the dlsym calls to the host functions. Also, if the omp_target_* functions are to be implemented for the host (this has been requested), they should attempt to call the libomptarget versions as well. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D55578 llvm-svn: 350352
2019-01-03[OPENMP][NVPTX]Fix incompatibility of __syncthreads with LLVM, NFC.Alexey Bataev
Summary: One of the LLVM optimizations, split critical edges, also clones tail instructions. This is a dangerous operation for __syncthreads() functions and this transformation leads to undefined behavior or incorrect results. Patch fixes this problem by replacing __syncthreads() function with the assembler instruction, which cost is too high and wich cannot be copied. Reviewers: grokos, gtbercea, kkwli0 Subscribers: guansong, openmp-commits, caomhin Differential Revision: https://reviews.llvm.org/D56274 llvm-svn: 350333
2019-01-02[libomptarget] Added install component for libomptargetVyacheslav Zakharin
Differential Revision: https://reviews.llvm.org/D56108 llvm-svn: 350254
2018-12-28[OPENMP][NVPTX]Added/fixed debugging messages, NFC.Alexey Bataev
Summary: Added or fixed new/old debugging messages for the better diagnostics. Reviewers: gtbercea, kkwli0, grokos Reviewed By: grokos Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D56102 llvm-svn: 350137
2018-12-28[OPENMP][NVPTX]Fixed initialization of the data-sharing interface.Alexey Bataev
Summary: Avoid using of the atomic loop to wait for the completion of the data-sharing interface initialization, use __shfl_sync instead for the communication within the warp to signal other threads in the warp about completion of the initialization. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56100 llvm-svn: 350129
2018-12-28[OPENMP][NVPTX]Outline assert into noinline function, NFC.Alexey Bataev
Summary: At high optimization level asserts lead to some unexpected results because of auto-inserted unreachable instructions. This outlining prevents some of such dangerous optimizations and leads to better stability. Reviewers: gtbercea, kkwli0, grokos Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D56101 llvm-svn: 350128
2018-12-22[runtime] [test] Fix using %python pathMichal Gorny
Fix the newly-added tests to use %python substitution in order to use the correct path to Python interpreter. Otherwise, they fail on NetBSD where there is no 'python', just 'pythonX.Y'. Differential Revision: https://reviews.llvm.org/D56048 llvm-svn: 350001
2018-12-18[Tests] [OpenMP] XFAIL also for ppc64le.Stefan Pintilie
Two tests were XFAILed for powerpc64le in r349512. They should have also been XFAILed for ppc64le. llvm-svn: 349521
2018-12-18XFAIL Pair of OpenMP Tests for PowerPC LE LinuxStefan Pintilie
XFAIL two tests that fail on PowerPC LE Linux due to the change of default from PIC to no-PIC on that platform. A Bug has been opened for this: https://bugs.llvm.org/show_bug.cgi?id=40082 The tests are: runtime/test/ompt/misc/control_tool.c runtime/test/ompt/synchronization/taskwait.c llvm-svn: 349512
2018-12-18[Tests] fix non-determinism failure in testcaseJoachim Protze
llvm-svn: 349460
2018-12-18[OMPT] First chunk of final OMPT 5.0 interface updatesJoachim Protze
This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t" for these types. Furthermore the structure for ompt_frame_t changed and allows to specify that the reenter frame belongs to the runtime. Patch partially prepared by Simon Convent Reviewers: hbae llvm-svn: 349458
2018-12-18[OMPT] Add testcase for thread_num provided by implicit task eventsJoachim Protze
llvm-svn: 349457
2018-12-17[OpenMP] version the affinity format tests and fix one testJonathan Peyton
llvm-svn: 349412
2018-12-17[OpenMP] Add affinity format testsJonathan Peyton
llvm-svn: 349411
2018-12-15[OpenMP] Fixes for LIBOMP_OMP_VERSION=45/40Roman Lebedev
Summary: I have discovered this because i wanted to experiment with building static libomp (with openmp-4.0 support only) for debugging purposes. There are three kinds of problems here: 1. `__kmp_compare_and_store_acq()` simply does not exist. It was added in D47903 by @jlpeyton. I'm guessing `__kmp_atomic_compare_store_acq()` was meant. 2. In `__kmp_is_ticket_lock_initialized()`, `lck->lk.initialized` is `std::atomic<bool>`, while `lck` is `kmp_ticket_lock_t *`. Naturally, they can't be equality-compared. Either, it should return the value read from `lck->lk.initialized`, or do what `__kmp_is_queuing_lock_initialized()` does, compare the passed pointer with the field in the struct pointed by the pointer. I think the latter is correct-er choice here. 3. Tests were not versioned. They assume that `LIBOMP_OMP_VERSION` is at the latest version. This does not touch LIBOMP_OMP_VERSION=30. That is still broken. Reviewers: jlpeyton, Hahnfeld, AndreyChurbanov Reviewed By: AndreyChurbanov Subscribers: guansong, jfb, openmp-commits, jlpeyton Tags: #openmp Differential Revision: https://reviews.llvm.org/D55496 llvm-svn: 349260
2018-12-13[OpenMP] Fix transient divide by zero bug in 32-bit codeJonathan Peyton
The value returned by __kmp_now_nsec() can overflow 32-bit values causing incorrect values to be returned. The overflow can end up causing a divide by zero error because in __kmp_initialize_system_tick(), the value (__kmp_now_nsec() - nsec) can end up being much larger than the numerator: 1e6 * (delay + (now - goal)) during a pathological timing where the current time calculated is much larger than nsec. When this happens, the value of __kmp_ticks_per_msec is set to zero which is then used as the denominator in the KMP_NOW_MSEC() macro leading to the divide by zero error. Differential Revision: https://reviews.llvm.org/D55300 llvm-svn: 349090
2018-12-13[OpenMP] Implement OpenMP 5.0 affinity format functionalityJonathan Peyton
This patch adds the affinity format functionality introduced in OpenMP 5.0. This patch adds: Two new environment variables: OMP_DISPLAY_AFFINITY=TRUE|FALSE OMP_AFFINITY_FORMAT=<string> and Four new API: 1) omp_set_affinity_format() 2) omp_get_affinity_format() 3) omp_display_affinity() 4) omp_capture_affinity() The affinity format functionality has two ICV's associated with it: affinity-display-var (bool) and affinity-format-var (string). The affinity-display-var enables/disables the functionality through the envirable OMP_DISPLAY_AFFINITY. The affinity-format-var is a formatted string with the special field types beginning with a '%' character similar to printf For example, the affinity-format-var could be: "OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}" The affinity-format-var is displayed by every thread implicitly at the beginning of a parallel region when any thread's affinity has changed (including a brand new thread being spawned), or explicitly using the omp_display_affinity() API. The omp_capture_affinity() function can capture the affinity-format-var in a char buffer. And omp_set|get_affinity_format() allow the user to set|get the affinity-format-var explicitly at runtime. omp_capture_affinity() and omp_get_affinity_format() both return the number of characters needed to hold the entire string it tried to make (not including NULL character). If not enough buffer space is available, both these functions truncate their output. Differential Revision: https://reviews.llvm.org/D55148 llvm-svn: 349089
2018-12-13Fix for bugzilla https://bugs.llvm.org/show_bug.cgi?id=39970Andrey Churbanov
Broken tests fixed Differential Revision: https://reviews.llvm.org/D55598 llvm-svn: 349017
2018-12-11[runtime] Disable KMP_HAVE_QUAD on NetBSD gccMichal Gorny
Disable KMP_HAVE_QUAD when building via gcc on NetBSD system, as the build fails due to unimplemented builtins: .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_mul': .../kmp_atomic.cpp:1332: undefined reference to `__multc3' .../kmp_atomic.cpp.o: In function `__kmpc_atomic_cmplx16_div': .../kmp_atomic.cpp:1334: undefined reference to `__divtc3' ... Differential Revision: https://reviews.llvm.org/D55478 llvm-svn: 348886
2018-12-11[runtime] Use getloadavg() on NetBSD as wellMichal Gorny
Switch NetBSD from reading /proc (which is broken) to getloadavg() (which is already used by Darwin). NetBSD discourages using procfs in favor of system API calls. Differential Revision: https://reviews.llvm.org/D55486 llvm-svn: 348885
2018-12-11Implement __kmp_is_address_mapped() for NetBSDKamil Rytarowski
Summary: Use the sysctl(3) function to check whether an address is mapped into the address space. Reviewers: mgorny, joerg, #openmp Reviewed By: mgorny Subscribers: openmp-commits Tags: #openmp Differential Revision: https://reviews.llvm.org/D55549 llvm-svn: 348874
2018-12-11Implement __kmp_gettid() for NetBSDKamil Rytarowski
Summary: _lwp_self() returns current Thread Id in a numeric version on NetBSD. Reviewers: joerg, mgorny, #openmp Reviewed By: mgorny Subscribers: llvm-commits, openmp-commits, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D55497 llvm-svn: 348873
2018-12-11[test] [runtime] Permit omp_get_wtick() to return 0.01Michal Gorny
Increase the range for omp_get_wtick() test to allow for 0.01 (from <0.01). This is needed for NetBSD where it returns exactly that value due to CLOCKS_PER_SEC being 100. This should not cause a significant difference from e.g. FreeBSD where it is 128, and especially from Linux where CLOCKS_PER_SEC is apparently meaningless and sysconf(_SC_CLK_TCK) gives 100 as well. Differential Revision: https://reviews.llvm.org/D55493 llvm-svn: 348857
2018-12-11[test] [runtime] Do not include alloca.h on NetBSDMichal Gorny
On NetBSD, alloca() is in stdlib.h and there is no alloca.h. Adjust the includes appopriately. Differential Revision: https://reviews.llvm.org/D55487 llvm-svn: 348856
2018-12-11[runtime] [test] Use more portable short options to sort(1)Michal Gorny
Pass `-n -s` instead of `--numeric --stable` to sort(1), as long options are not supported by NetBSD sort implementation. `-n` is defined by POSIX, so it should be fully portable. `-s` is used consistently at least in GNU sort and FreeBSD sort, and I honestly doubt it would cause issues with any other implementation supporting `--stable`. Differential Revision: https://reviews.llvm.org/D55479 llvm-svn: 348855
2018-12-11[cmake] Use -std=gnu++11 to fix alloca() on NetBSDMichal Gorny
Prefer using '-std=gnu++11' over '-std=c++11' when available, as NetBSD exposes the correct alloca() implementation only with gnu* C/C++ standards. Differential Revision: https://reviews.llvm.org/D55477 llvm-svn: 348854
2018-12-10[OpenMP] Fix a few build issuesJonathan Peyton
Fix two build issues: 1) Recent commit 348756 accidentally included Unix clang compilers to use immintrin.h when only clang-cl should be using it leading to the following error: openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_ inline function '_xbegin' requires target feature 'rtm', but would be inlined into function '__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm' kmp_uint32 status = _xbegin(); This patch changes the guard to use immintrin.h to only use clang-cl instead of all clang 2) gcc-8 gives a warning about multiline comment in kmp_runtime.cpp: This patch just changes it to a two line comment openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment] #endif // KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD \ llvm-svn: 348783
2018-12-10[OPENMP][NVPTX]Revert __kmpc_shuffle_int64 to its original form.Alexey Bataev
Summary: Use the original shuffle implementation for __kmpc_shuffle_int64 since default implementation uses the same implementation. Reviewers: gtbercea Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55514 llvm-svn: 348772
2018-12-10[OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9.Alexey Bataev
Summary: Shuffle on 64bit data is allowed only for CUDA >= 9.0. Also, fixed the constant for the mask, need one extra L in the end. Reviewers: gtbercea, kkwli0 Subscribers: guansong, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55440 llvm-svn: 348758
2018-12-10Support clang compiling under windows-gnu and windows-msvcAndrey Churbanov
Patch by Peiyuan Song <squallatf@gmail.com> Differential Revision: https://reviews.llvm.org/D53422 llvm-svn: 348756
2018-12-09Add OpenBSD support to OpenMPKamil Rytarowski
Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp Tags: #openmp Differential Revision: https://reviews.llvm.org/D34280 llvm-svn: 348726
2018-12-09Add DragonFlyBSD support to OpenMPKamil Rytarowski
Summary: Additions mostly follow FreeBSD and NetBSD and are not intrusive. There is similar patch for OpenBSD: https://reviews.llvm.org/D34280 The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port. Simple OpenMP programs compile and work as expected: $ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include $ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches. Reviewers: #openmp, krytarowski Reviewed By: krytarowski Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini Differential Revision: https://reviews.llvm.org/D35129 llvm-svn: 348725
2018-12-07[OPENMP][NVPTX]Save registers for optimized builds with enabled logging.Alexey Bataev
Summary: Introduced special noinline function log that allows to save some registers for optimized builds but with enabled logging. Also, it increases the stability of the optimized builds with inlined runtime. Reviewers: gtbercea, kkwli0 Reviewed By: gtbercea Subscribers: caomhin, guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D55436 llvm-svn: 348606
2018-12-06[OPENMP][NVPTX]Correct type casting for printf args + simplified shfl64 ↵Alexey Bataev
function. Summary: Explicitly casted printf's args to the required types + simplified shfl64 function. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55379 llvm-svn: 348521
2018-12-06[OPENMP][NVPTX]Fix __kmpc_flush to flush the memory per system, not per block.Alexey Bataev
Summary: According to the standard, after memory flushing the changes in the memory must be visible to all the threads in all teams. Patch fixes this. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55370 llvm-svn: 348491
2018-12-03[OpenMP][libomptarget] Flush intermediate values during team reduction Gheorghe-Teodor Bercea
Summary: Ensure intermediate values of a team reduction are flushed to memory. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, jfb, openmp-commits Differential Revision: https://reviews.llvm.org/D55219 llvm-svn: 348148
2018-11-30[OPENMP][NVPTX]Make runtime compatible with the original runtime.Alexey Bataev
Summary: Reworked runtime to make it compatible with the requirements of the original runtime library. Also, simplified some code to reduce number of function calls. Reviewers: gtbercea, kkwli0 Subscribers: guansong, jfb, caomhin, openmp-commits Differential Revision: https://reviews.llvm.org/D55130 llvm-svn: 348003