AgeCommit message (Collapse)Author
2021-01-05Update GNU/Hurd configure supportSamuel Thibault
ChangeLog: * libtool.m4: Match gnu* along other GNU systems. * libgo/config/libtool.m4: Match gnu* along other GNU systems. * libgo/configure: Re-generate. libffi/ * configure: Re-generate. libgomp/ * configure: Re-generate. gcc/ * configure: Re-generate. libatomic/ * configure: Re-generate. libbacktrace/ * configure: Re-generate. libcc1/ * configure: Re-generate. libgfortran/ * configure: Re-generate. libgomp/ * configure: Re-generate. libhsail-rt/ * configure: Re-generate. libitm/ * configure: Re-generate. libobjc/ * configure: Re-generate. liboffloadmic/ * configure: Re-generate. * plugin/configure: Re-generate. libphobos/ * configure: Re-generate. libquadmath/ * configure: Re-generate. libsanitizer/ * configure: Re-generate. libssp/ * configure: Re-generate. libstdc++-v3/ * configure: Re-generate. libvtv/ * configure: Re-generate. lto-plugin/ * configure: Re-generate. zlib/ * configure: Re-generate.
2021-01-05IBM Z: Fix check_effective_target_s390_z14_hwIlya Leoshkevich
Commit 2f473f4b065d ("IBM Z: Do not run long double tests on old machines") introduced a predicate for tests that must run only on z14+. However, due to a syntax error, the predicate always returns false. gcc/testsuite/ChangeLog: 2020-12-10 Ilya Leoshkevich <iii@linux.ibm.com> * gcc.target/s390/s390.exp: Replace %% with %.
2021-01-05xfail test that will never pass on i?86 FreeBSDSteve Kargl
gcc/testsuite * gfortran.dg/dec_math.f90: xfail on i?86-*-freebsd*
2021-01-05syscall: don't define sys_SETREUID and friendsIan Lance Taylor
We don't use them, since we always call the C library functions which do the right thing anyhow. And they aren't defined on all GNU/Linux variants. Fixes PR go/98510 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281473
2021-01-05internal/cpu: more build fixes for Go1.16beta1 releaseIan Lance Taylor
Some files were missing from the libgo copy of internal/cpu, because they used to only declare CacheLinePadSize which libgo gets from goarch.sh. Now they also declare doinit, so copy them over. Adjust cpu_other.go. Fix the amd64p32 build by adding a build constraint to cpu_no_name.go. Fixes PR go/98493 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/281472
2021-01-05doc: reflect the publication of C++20 in invoke.texi and standards.texiJakub Jelinek
Jonathan mentioned on IRC that ISO/IEC 14882:2020 has been published yesterday (and indeed it appears on www.iso.org for sale). I think we should reflect that in our documentation and in cxx-status.html, patches attached. I understand we want to keep C++20 support experimental even in GCC 11, though not sure if we should still talk about "almost certainly change in incompatible ways" rather than that it might change in incompatible ways. 2021-01-05 Jakub Jelinek <jakub@redhat.com> * doc/invoke.texi (-std=c++20): Adjust for the publication of ISO 14882:2020 standard. * doc/standards.texi: Likewise.
2021-01-05d: Merge upstream dmd a5c86f5b9Iain Buclaw
Adds the following new `__traits' to the D language. - isDeprecated: used to detect if a function is deprecated. - isDisabled: used to detect if a function is marked with @disable. - isFuture: used to detect if a function is marked with @__future. - isModule: used to detect if a given symbol represents a module, this enhancement also adds support using `is(sym == module)'. - isPackage: used to detect if a given symbol represents a package, this enhancement also adds support using `is(sym == package)'. - child: takes two arguments. The first must be a symbol or expression and the second must be a symbol, such as an alias to a member of the first 'parent' argument. The result is the second 'member' argument interpreted with its 'this' context set to 'parent'. This is the inverse of `__traits(parent, member)'. - isReturnOnStack: determines if a function's return value is placed on the stack, or is returned via registers. - isZeroInit: used to detect if a type's default initializer has no non-zero bits. - getTargetInfo: used to query features of the target being compiled for, the back-end can expand this to register any key to handle the given argument, however a reliable subset exists which includes "cppRuntimeLibrary", "cppStd", "floatAbi", and "objectFormat". - getLocation: returns a tuple whose entries correspond to the filename, line number, and column number of where the argument was declared. - hasPostblit: used to detect if a type is a struct with a postblit. - isCopyable: used to detect if a type allows copying its value. - getVisibility: an alias for the getProtection trait. Reviewed-on: https://github.com/dlang/dmd/pull/12093 gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd a5c86f5b9. * d-builtins.cc (d_eval_constant_expression): Handle ADDR_EXPR trees created by build_string_literal. * d-frontend.cc (retStyle): Remove function. * d-target.cc (d_language_target_info): New variable. (d_target_info_table): Likewise. 
(Target::_init): Initialize d_target_info_table. (Target::isReturnOnStack): New function. (d_add_target_info_handlers): Likewise. (d_handle_target_cpp_std): Likewise. (d_handle_target_cpp_runtime_library): Likewise. (Target::getTargetInfo): Likewise. * d-target.h (struct d_target_info_spec): New type. (d_add_target_info_handlers): Declare.
2021-01-05Add <source_location> to the precompiled header.Ed Smith-Rowland
2021-01-05 Ed Smith-Rowland <3dw4rd@verizon.net> * include/precompiled/stdc++.h: Add <source_location> to C++20 section.
2021-01-05x86: Use unsigned short to compute pextrw resultH.J. Lu
Use unsigned short to compute the zero-extended pextrw result. PR target/98495 * gcc.target/i386/sse2-mmx-pextrw.c (compute_correct_result): Use unsigned short to compute pextrw result.
2021-01-05c++: Fix deduction from the type of an NTTPPatrick Palka
In the testcase nontype-auto17.C below, the calls to f and g are invalid because neither deduction nor defaulting of the template parameter T yields a valid specialization. Deducing T doesn't work because T is used only in a non-deduced context, and defaulting T doesn't work because its default argument makes the type of M invalid. But with -std=c++17 or later, we incorrectly accept both calls. Starting with C++17 (specifically P0127R2), during deduction we're allowed to try to deduce T from the argument '42' that's been tentatively deduced for M. The problem is that when unify walks into the type of M (a TYPENAME_TYPE), it immediately gives up without performing any new unifications (so the type of M is still unknown) -- and then we go on to unify M with '42' anyway. Later in type_unification_real, we complete the template argument vector using T's default template argument, and end up forming the bogus specializations f<void, 42> and g<S, 42>. This patch fixes this issue by checking whether the type of an NTTP is still dependent after walking into its type during unification. If it is, it means we couldn't deduce all the template parameters used in its type, and so we shouldn't yet unify the NTTP. (The new testcase ttp33.C demonstrates the need for the TEMPLATE_PARM_LEVEL check; without it, we would ICE on this testcase from the call to tsubst.) gcc/cp/ChangeLog: * pt.c (unify) <case TEMPLATE_PARM_INDEX>: After walking into the type of the NTTP, substitute into the type again. If the type is still dependent, don't unify the NTTP. gcc/testsuite/ChangeLog: * g++.dg/template/partial5.C: Adjust directives to expect the same errors across all dialects. * g++.dg/cpp1z/nontype-auto17.C: New test. * g++.dg/cpp1z/nontype-auto18.C: New test. * g++.dg/template/ttp33.C: New test.
2021-01-05expand: Fold x - y < 0 to x < y during expansion [PR94802]Jakub Jelinek
My earlier patch to simplify x - y < 0 etc. for signed subtraction with undefined overflow into x < y in match.pd regressed some tests, even when it was guarded to be post-IPA. The following patch thus attempts to optimize that during expansion instead, which is the last time we can do it: afterwards we lose the information whether it was x - y < 0 or (int) ((unsigned) x - y) < 0, for which we couldn't optimize it. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/94802 * expr.h (maybe_optimize_sub_cmp_0): Declare. * expr.c: Include tree-pretty-print.h and flags.h. (maybe_optimize_sub_cmp_0): New function. (do_store_flag): Use it. * cfgexpand.c (expand_gimple_cond): Likewise. * gcc.target/i386/pr94802.c: New test. * gcc.dg/Wstrict-overflow-25.c: Remove xfail.
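The difference between the two source forms named above can be sketched in plain C. This is only an illustration, not code from the patch: the function names are hypothetical, and the sketch assumes `long long` is wider than `int`.

```c
#include <assert.h>
#include <limits.h>

/* Signed form: overflow of x - y is undefined, so a compiler may treat the
   comparison as x < y.  The assert restricts callers to inputs whose
   difference is representable (hypothetical helper, assumes long long is
   wider than int). */
int sub_is_negative(int x, int y)
{
    long long wide = (long long)x - (long long)y;
    assert(wide >= INT_MIN && wide <= INT_MAX);
    return x - y < 0;
}

/* Wrapping form: the subtraction is done in unsigned arithmetic, so it is
   defined for all inputs and is NOT equivalent to x < y.  The sign bit of
   the wrapped difference is tested without converting back to int. */
int wrapped_sub_is_negative(int x, int y)
{
    unsigned diff = (unsigned)x - (unsigned)y;
    return diff > (unsigned)INT_MAX;
}
```

For INT_MIN and 1 the wrapping form reports "not negative" although INT_MIN < 1 holds, which is why the two forms have to be told apart before the fold is applied.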
2021-01-05nvptx: Cache stacks block for OpenMP kernel launchJulian Brown
2021-01-05 Julian Brown <julian@codesourcery.com> libgomp/ * plugin/plugin-nvptx.c (SOFTSTACK_CACHE_LIMIT): New define. (struct ptx_device): Add omp_stacks struct. (nvptx_open_device): Initialise cached-stacks housekeeping info. (nvptx_close_device): Free cached stacks block and mutex. (nvptx_stacks_free): New function. (nvptx_alloc): Add SUPPRESS_ERRORS parameter. (GOMP_OFFLOAD_alloc): Add strategies for freeing soft-stacks block. (nvptx_stacks_alloc): Rename to... (nvptx_stacks_acquire): This. Cache stacks block between runs if same size or smaller is required. (nvptx_stacks_free): Remove. (GOMP_OFFLOAD_run): Call nvptx_stacks_acquire and lock stacks block during kernel execution.
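The caching rule in the ChangeLog above (keep the block between runs if the same size or smaller is required) might look roughly like the following. This is a host-side sketch under stated assumptions: `struct stack_cache` and both functions are hypothetical, `malloc`/`free` stand in for device allocation, and the locking done by the real plugin is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache holding one reusable block (illustration only). */
struct stack_cache {
    void  *block;
    size_t size;
};

/* Return a block of at least `size` bytes, reusing the cached block when it
   is large enough; otherwise replace it.  Returns NULL on allocation
   failure. */
void *stack_cache_acquire(struct stack_cache *c, size_t size)
{
    assert(c != NULL);
    if (c->block != NULL && c->size >= size)
        return c->block;              /* same size or smaller: reuse */
    free(c->block);                   /* free(NULL) is a no-op */
    c->block = malloc(size);
    c->size = c->block ? size : 0;
    return c->block;
}

/* Drop the cached block, e.g. when the device is closed. */
void stack_cache_release(struct stack_cache *c)
{
    assert(c != NULL);
    free(c->block);
    c->block = NULL;
    c->size = 0;
}
```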
2021-01-05A couple of comment tweaksRichard Sandiford
Tweak a couple of comments added in the RTL-SSA series in response to reviewer feedback. gcc/ * mux-utils.h (pointer_mux::m_ptr): Tweak description of contents. * rtlanal.c (simple_regno_set): Tweak description to clarify the RMW condition.
2021-01-05Don't link cc1 etc. against libcody.aJakub Jelinek
Richi complained on IRC that cc1 is linked against libcody.a. From my understanding, it is just the cc1plus and cc1objplus binaries that need it, so this patch links only those against it. > this is already part of my Solaris libcody patch The following updated patch contains the incremental changes between what Rainer has committed and what I've posted. 2021-01-05 Jakub Jelinek <jakub@redhat.com> gcc/cp/ * Make-lang.in (cc1plus-checksum, cc1plus$(exeext): Add $(CODYLIB) after $(BACKEND). gcc/objcp/ * Make-lang.in (cc1objplus-checksum, cc1objplus$(exeext): Add $(CODYLIB) after $(BACKEND).
2021-01-05tree-optimization/98516 - fix SLP permute opt materializationRichard Biener
When materializing on a VEC_PERM node we have to permute the incoming vectors, not the outgoing one. 2021-01-05 Richard Biener <rguenther@suse.de> PR tree-optimization/98516 * tree-vect-slp.c (vect_optimize_slp): Permute the incoming lanes when materializing on a VEC_PERM node. (vectorizable_slp_permutation): Dump the permute properly. * gcc.dg/vect/bb-slp-pr98516-1.c: New testcase. * gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
2021-01-05c++: Fix ICE with __builtin_bit_cast [PR98469]Jakub Jelinek
On the following testcase we ICE during constexpr evaluation (for warnings), because the IL has ADDR_EXPR of BIT_CAST_EXPR and ADDR_EXPR case asserts the result is not a CONSTRUCTOR. The patch punts on lval BIT_CAST_EXPR folding. > This change is OK, but part of the problem is that we're trying to do > overload resolution for an S copy/move constructor, which we shouldn't be > because bit_cast is a prvalue, so in C++17 and up we should use it to > directly initialize the target without any implied constructor call. This version therefore wraps it into a TARGET_EXPR then, it alone fixes the bug, but I've kept the constexpr.c change too. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR c++/98469 * constexpr.c (cxx_eval_constant_expression) <case BIT_CAST_EXPR>: Punt if lval is true. * semantics.c (cp_build_bit_cast): Call get_target_expr_sfinae on the result if it has a class type. * g++.dg/cpp2a/bit-cast8.C: New test. * g++.dg/cpp2a/bit-cast9.C: New test.
2021-01-05c++: ICE with deferred noexcept when deducing targs [PR82099]Marek Polacek
In this test we ICE in type_throw_all_p because it got a deferred noexcept which it shouldn't. Here's the story: In noexcept61.C, we call bar, so we perform overload resolution. When adding the (only) candidate, we need to deduce template arguments, so call fn_type_unification as usual. That deduces U to void (*) (int &, int &) which is correct, but its noexcept-spec is deferred_noexcept. Then we call add_function_candidate (bar), wherein we try to create an implicit conversion sequence for every argument. Since baz<int> is of unknown type, we instantiate_type it; it is a TEMPLATE_ID_EXPR so that calls resolve_address_of_overloaded_function. But we crash there, because target_type contains the deferred_noexcept. So we need to maybe_instantiate_noexcept before we can compare types. resolve_overloaded_unification seemed like the appropriate spot; now fn_type_unification produces the function type with its noexcept-spec instantiated. This shouldn't go against CWG 1330 because here we really need to instantiate the noexcept-spec. This also fixes class-deduction76.C, a dg-ice test I recently added, therefore this fix also fixes c++/90799, yay. gcc/cp/ChangeLog: PR c++/82099 * pt.c (resolve_overloaded_unification): Call maybe_instantiate_noexcept after instantiating the function decl. gcc/testsuite/ChangeLog: PR c++/82099 * g++.dg/cpp1z/class-deduction76.C: Remove dg-ice. * g++.dg/cpp0x/noexcept61.C: New test.
2021-01-05move SLP debug counterRichard Biener
This moves it to catch individual SLP subgraphs 2021-01-05 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_slp_region): Move debug counter to cover individual subgraphs.
2021-01-05tree-optimization/98428 - avoid pre-existing vectors for loop SLPRichard Biener
It wasn't supposed to be enabled and apparently copying around the checking messed up the condition. 2021-01-05 Richard Biener <rguenther@suse.de> PR tree-optimization/98428 * tree-vect-slp.c (vect_build_slp_tree_1): Properly reject vector lane extracts for loop vectorization.
2021-01-05reassoc: Fix reassociation on 32-bit hosts with > 32767 bbs [PR98514]Jakub Jelinek
Apparently reassoc ICEs on large functions (more than 32767 basic blocks with something to reassociate in those). The problem is that the pass uses long type to store the ranks, and the bb ranks are (number of SSA_NAMEs with default defs + 2 + bb->index) << 16, so with many basic blocks we overflow the ranks and then trip assertions that the rank is not negative. The following patch just uses int64_t instead of long in the pass. Yes, it means slightly higher memory consumption (one array indexed by bb->index is twice as large, and one hash_map from trees to the ranks will grow by 50%), but I think it is better than punting on reassociation of large functions on 32-bit hosts and making it inconsistent e.g. when cross-compiling. Given vec.h uses unsigned for vect element counts, we don't really support more than 4G of SSA_NAMEs or more than 2G of basic blocks in a function, so even with the << 16 we can't really overflow the int64_t rank counters. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/98514 * tree-ssa-reassoc.c (bb_rank): Change type from long * to int64_t *. (operand_rank): Change type from hash_map<tree, long> to hash_map<tree, int64_t>. (phi_rank): Change return type from long to int64_t. (loop_carried_phi): Change block_rank variable type from long to int64_t. (propagate_rank): Change return type, rank parameter type and op_rank variable type from long to int64_t. (find_operand_rank): Change return type from long to int64_t and change slot variable type from long * to int64_t *. (insert_operand_rank): Change rank parameter type from long to int64_t. (get_rank): Change return type and rank variable type from long to int64_t. Use PRId64 instead of ld to print the rank. (init_reassoc): Change rank variable type from long to int64_t and adjust correspondingly bb_rank and operand_rank initialization.
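The arithmetic behind the 32767 limit can be checked with a small sketch. The helper names below are hypothetical and this is not the pass's code; it only evaluates the rank formula quoted above in 64 bits and tests whether the result would still fit a 32-bit signed long.

```c
#include <assert.h>
#include <stdint.h>

/* Rank formula quoted in the message, evaluated in 64 bits
   (hypothetical helper). */
int64_t bb_rank64(int64_t num_default_defs, int64_t bb_index)
{
    assert(num_default_defs >= 0 && bb_index >= 0);
    return (num_default_defs + 2 + bb_index) << 16;
}

/* Nonzero if the rank would still fit a 32-bit signed long. */
int rank_fits_32bit_long(int64_t rank)
{
    return rank >= INT32_MIN && rank <= INT32_MAX;
}
```

With no default definitions, basic block index 32765 gives (32765 + 2) << 16 = 2147418112, which fits, while index 32766 gives 2^31, one past INT32_MAX.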
2021-01-05phiopt: Optimize x < 0 ? ~y : y to (x >> 31) ^ y [PR96928]Jakub Jelinek
As requested in the PR, the one's complement abs can be done more efficiently without cmov or branching.

Had to change the ifcvt-onecmpl-abs-1.c testcase, we no longer optimize it in ifcvt, on x86_64 with -m32 we generate in the end the exact same code, but with -m64:
        movl    %edi, %eax
-       notl    %eax
-       cmpl    %edi, %eax
-       cmovl   %edi, %eax
+       sarl    $31, %eax
+       xorl    %edi, %eax
        ret

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

        PR tree-optimization/96928
        * tree-ssa-phiopt.c (xor_replacement): New function.
        (tree_ssa_phiopt_worker): Call it.

        * gcc.dg/tree-ssa/pr96928.c: New test.
        * gcc.target/i386/ifcvt-onecmpl-abs-1.c: Remove -fdump-rtl-ce1, instead of scanning rtl dump for ifcvt message check assembly for xor instruction.
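The identity in the subject line, x < 0 ? ~y : y versus (x >> 31) ^ y, can be sketched at the source level. The function names are hypothetical, and the sketch assumes that right-shifting a negative int is an arithmetic shift, which ISO C leaves implementation-defined but GCC provides.

```c
#include <assert.h>
#include <limits.h>

/* Conditional form as written in the subject line. */
int select_form(int x, int y)
{
    return x < 0 ? ~y : y;
}

/* Branchless form.  Assumes an arithmetic right shift of negative values,
   so mask is -1 when x < 0 and 0 otherwise. */
int xor_form(int x, int y)
{
    int mask = x >> (int)(sizeof(int) * CHAR_BIT - 1);
    return mask ^ y;
}
```

Passing the same value for both arguments gives the one's complement abs mentioned in the message: xor_form(-3, -3) yields ~(-3), which is 2.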
2021-01-05match.pd: Improve (A / (1 << B)) -> (A >> B) optimization [PR96930]Jakub Jelinek
The following patch improves the A / (1 << B) -> A >> B simplification, as seen in the testcase, if there is unnecessary widening for the division, we just optimize it into a shift on the widened type, but if the lshift is widened too, there is no reason to do that, we can just shift it in the original type and convert after. The tree_nonzero_bits & wi::mask check already ensures it is fine even for signed values. I've split the vr-values optimization into a separate patch as it causes a small regression on two testcases, but this patch fixes what has been reported in the PR alone. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/96930 * match.pd ((A / (1 << B)) -> (A >> B)): If A is extended from narrower value which has the same type as 1 << B, perform the right shift on the narrower value followed by extension. * g++.dg/tree-ssa/pr96930.C: New test.
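For unsigned operands the underlying equivalence A / (1 << B) == A >> B is easy to check directly. The sketch below covers only that basic identity, not the widening case the patch handles, and the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Division by a power of two, written the long way. */
unsigned div_by_pow2(unsigned a, unsigned b)
{
    assert(b < sizeof(unsigned) * CHAR_BIT);  /* keep the shift defined */
    return a / (1u << b);
}

/* The form the simplification produces. */
unsigned shift_right(unsigned a, unsigned b)
{
    assert(b < sizeof(unsigned) * CHAR_BIT);
    return a >> b;
}
```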
2021-01-05store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]Jakub Jelinek
I've tried to add such helper, but handling over just analysis and letting each pass handle it differently seems complicated given the limitations of the bswap infrastructure. So, this patch just hooks the optimization also into store-merging so that the original testcase from the PR can be fixed. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/96239 * gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New function. (get_status_for_store_merging): Don't return BB_INVALID for blocks with potential bswap optimizable CONSTRUCTORs. (pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap if possible. * gcc.dg/tree-ssa/pr96239.c: New test.
2021-01-05go: Fix -fgo-embedcfg= option description.Jakub Jelinek
Description of options should be . terminated, the: FAIL: compiler driver --help=go option(s): "^ +-.*[^:.]$" absent from output: " -fgo-embedcfg=<file> List embedded files via go:embed" test even reports that. 2021-01-05 Jakub Jelinek <jakub@redhat.com> * lang.opt (fgo-embedcfg=): Add full stop at the end of description.
2021-01-05tree-optimization/98381 - fix live bool vector extractRichard Biener
This fixes extraction of live bool vector results for the case of integer mode vectors. 2021-01-05 Richard Biener <rguenther@suse.de> PR tree-optimization/98381 * tree.c (vector_element_bits): Properly compute bool vector element size. * tree-vect-loop.c (vectorizable_live_operation): Properly compute the last lane bit offset.
2021-01-05i386: Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 [PR98522]Uros Bizjak
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE by clearing the top 64 bits of the input XMM register. 2021-01-05 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/98522 * config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split. Clear the top 64 bits of the input XMM register. (sse_cvttps2pi): Ditto. gcc/testsuite PR target/98522 * gcc.target/i386/pr98522.c: New test.
2021-01-05i386: Add _mm256_cmov_si256 [PR98521]Uros Bizjak
Add missing _mm256_cmov_si256 intrinsic to xopintrin.h. 2021-01-05 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/98521 * config/i386/xopintrin.h (_mm256_cmov_si256): New.
2021-01-05[c++]: Improve module-decl diagnostics [PR 98327]Nathan Sidwell
The diagnostic for a misplaced module decl was essentially 'computer says no', which isn't the most helpful. This adjusts it to indicate what would be acceptable. gcc/cp/ * parser.c (cp_parser_module_declaration): Alter diagnostic text to say where it is permissible. gcc/testsuite/ * g++.dg/modules/mod-decl-1.C: Adjust. * g++.dg/modules/p0713-2.C: Adjust. * g++.dg/modules/p0713-3.C: Adjust.
2021-01-05x86: Cast to unsigned short first for _mm_extract_pi16H.J. Lu
_mm_extract_pi16 is intrinsic for pextrw, which should be zero-extended, not sign-extended. gcc/ PR target/98495 * config/i386/xmmintrin.h (_mm_extract_pi16): Cast to unsigned short first. gcc/testsuite/ PR target/98495 * gcc.target/i386/pr98495-1.c: New test. * gcc.target/i386/pr98495-2.c: New test. * gcc.target/i386/pr98495-3.c: New test. * gcc.target/i386/pr98495-4.c: New test. * gcc.target/i386/pr98495-5.c: New test.
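The effect of casting to an unsigned 16-bit type first can be modelled without any intrinsics. The sketch below is a scalar stand-in, not the xmmintrin.h code: the function names are hypothetical, a `uint64_t` plays the role of the MMX register, and the sign-extending variant relies on GCC's wrapping conversion to `int16_t`, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 16-bit lane n (0..3) and zero-extend it, as pextrw does. */
int extract16_zero_ext(uint64_t v, int n)
{
    assert(n >= 0 && n < 4);
    return (int)(uint16_t)(v >> (n * 16));
}

/* Sign-extending variant, shown only for contrast.  The conversion to
   int16_t is implementation-defined for values above 32767; GCC wraps. */
int extract16_sign_ext(uint64_t v, int n)
{
    assert(n >= 0 && n < 4);
    return (int)(int16_t)(uint16_t)(v >> (n * 16));
}
```

For a lane holding 0x8000 the zero-extended result is 32768, while the sign-extended one is -32768 under the stated assumption.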
2021-01-05arc: fix accumulator first register.Claudiu Zissulescu
gcc/ 2021-01-05 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.md (maddsidi4_split): Use ACC_REG_FIRST. (umaddsidi4_split): Likewise. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2021-01-05i386: Optimize pmovskb on zero_extend of subreg HI of pmovskb result [PR98461]liuhongt
The following patch adds define_insn_and_split to optimize

        vpmovmskb       %xmm0, %eax
-       movzwl  %ax, %eax
        notl    %eax

and combine splitter to optimize

        pmovmskb        %xmm0, %eax
-       notl    %eax
-       movzwl  %ax, %eax
+       xorl    $65535, %eax

gcc/ChangeLog

        PR target/98461
        * config/i386/sse.md (*sse2_pmovskb_zexthisi): New define_insn_and_split for zero_extend of subreg HI of pmovskb result.
        (*sse2_pmovskb_zexthisi): Add new combine splitters for zero_extend of not of subreg HI of pmovskb result.

gcc/testsuite/ChangeLog

        * gcc.target/i386/sse2-pr98461-2.c: New test.
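The second rewrite rests on the fact that a 128-bit pmovmskb result has only its low 16 bits set, so "not, then zero-extend the low half" equals a single xor with 65535. A scalar sketch with hypothetical names, where the mask value stands in for the instruction's result:

```c
#include <assert.h>
#include <stdint.h>

/* notl followed by movzwl: complement, then keep the low 16 bits. */
uint32_t not_then_zext16(uint32_t mask)
{
    assert(mask <= 0xFFFFu);   /* only 16 mask bits for an XMM source */
    return (uint16_t)~mask;
}

/* The single-instruction replacement: xorl $65535. */
uint32_t xor_65535(uint32_t mask)
{
    assert(mask <= 0xFFFFu);
    return mask ^ 65535u;
}
```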
2021-01-05explow, aarch64: Fix force-Pmode-to-mem for ILP32 [PR97269]Richard Sandiford
This patch fixes a mode/rtx mismatch for ILP32 targets in: mem = force_const_mem (ptr_mode, imm); where imm can be Pmode rather than ptr_mode. The patch uses convert_memory_address to convert the Pmode address to ptr_mode before the call. However, immediate addresses can in general contain unspecs, and convert_memory_address wasn't set up to handle those. The patch therefore adds some generic unspec handling to convert_memory_address_addr_space_1. As the comment says, we can add a target hook if this behaviour turns out to be wrong for some targets. But I think what the patch does is a strict improvement over the status quo: without it, we would try to force the unspec into a register, but nevertheless wrap the result in a (const ...). That in turn would be invalid rtl and seems bound to generate an ICE later. I tested the explow.c part using -fstack-protector with local hacks to force SYMBOL_FORCE_TO_MEM for UNSPEC_SALT_ADDR. Fixes c-c++-common/torture/pr57945.c and various other tests. gcc/ PR target/97269 * explow.c (convert_memory_address_addr_space_1): Handle UNSPECs nested in CONSTs. * config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Use convert_memory_address to convert symbolic immediates to ptr_mode before forcing them to memory.
2021-01-05recog: Fix a constrain_operands corner case [PR97144]Richard Sandiford
aarch64's *add<mode>3_poly_1 has a pattern with the constraints: "=...,r,&r" "...,0,rk" "...,Uai,Uat" i.e. the penultimate alternative requires operands 0 and 1 to match, but the final alternative does not allow them to match. The register allocators dealt with this correctly, and so used different input and output registers for instructions with Uat operands. However, constrain_operands carried the penultimate alternative's matching rule over to the final alternative, so it would essentially ignore the earlyclobber. This in turn allowed postreload to convert a correct Uat pairing into an incorrect one. The fix is simple: recompute the matching information for each alternative. gcc/ PR rtl-optimization/97144 * recog.c (constrain_operands): Initialize matching_operand for each alternative, rather than only doing it once. gcc/testsuite/ PR rtl-optimization/97144 * gcc.c-torture/compile/pr97144.c: New test. * gcc.target/aarch64/sve/pr97144.c: Likewise.
2021-01-05rtl-ssa: Fix updates to call clobbers [PR98403]Richard Sandiford
In the PR, fwprop was changing a call instruction and tripped an assert when trying to update a list of call clobbers. There are two ways we could handle this: remove the call clobber and then add it back, or assume that the clobber will stay in its current place. At the moment we don't have enough information to safely move calls around, so the second approach seems simpler and more efficient. gcc/ PR rtl-optimization/98403 * rtl-ssa/changes.cc (function_info::finalize_new_accesses): Explain why we don't remove call clobbers. (function_info::apply_changes_to_insn): Don't attempt to add call clobbers here. gcc/testsuite/ PR rtl-optimization/98403 * g++.dg/opt/pr98403.C: New test.
2021-01-05vect: Fix missing alias checks for 128-bit SVE [PR98371]Richard Sandiford
On AArch64, the vectoriser tries various ways of vectorising with both SVE and Advanced SIMD and picks the best one. All other things being equal, it prefers earlier attempts over later attempts. The way this works currently is that, once it has a successful vectorisation attempt A, it analyses all other attempts as epilogue loops of A: /* When pick_lowest_cost_p is true, we should in principle iterate over all the loop_vec_infos that LOOP_VINFO could replace and try to vectorize LOOP_VINFO under the same conditions. E.g. when trying to replace an epilogue loop, we should vectorize LOOP_VINFO as an epilogue loop with the same VF limit. When trying to replace the main loop, we should vectorize LOOP_VINFO as a main loop too. However, autovectorize_vector_modes is usually sorted as follows: - Modes that naturally produce lower VFs usually follow modes that naturally produce higher VFs. - When modes naturally produce the same VF, maskable modes usually follow unmaskable ones, so that the maskable mode can be used to vectorize the epilogue of the unmaskable mode. This order is preferred because it leads to the maximum epilogue vectorization opportunities. Targets should only use a different order if they want to make wide modes available while disparaging them relative to earlier, smaller modes. The assumption in that case is that the wider modes are more expensive in some way that isn't reflected directly in the costs. There should therefore be few interesting cases in which LOOP_VINFO fails when treated as an epilogue loop, succeeds when treated as a standalone loop, and ends up being genuinely cheaper than FIRST_LOOP_VINFO. */ However, the vectoriser can normally elide alias checks for epilogue loops, on the basis that the main loop should do them instead. Converting an epilogue loop to a main loop can therefore cause the alias checks to be skipped. 
(It probably also unfairly penalises the original loop in the cost comparison, given that one loop will have alias checks and the other won't.) As the comment says, we should in principle analyse each vector mode twice: once as a main loop and once as an epilogue. However, doing that up-front would be quite expensive. This patch instead goes for a compromise: if an epilogue loop for mode M2 seems better than a main loop for mode M1, re-analyse with M2 as the main loop. The patch fixes dg.torture.exp=pr69719.c when testing with -msve-vector-bits=128. gcc/ PR tree-optimization/98371 * tree-vect-loop.c (vect_reanalyze_as_main_loop): New function. (vect_analyze_loop): If an epilogue loop appears to be cheaper than the main loop, re-analyze it as a main loop before adopting it as a main loop.
2021-01-05build: libcody: Link with -lsocket -lnsl if necessary [PR98316]Rainer Orth
With the introduction of C++20 modules and libcody, cc1plus and cc1objplus gained a dependency on the socket functions. Before those were merged into libc in Solaris 11.4, one needed to link with -lsocket -lnsl on Solaris, so that merge broke the Solaris 11.3 build. While we already have 4 different checks for those libraries in the tree, I decided to import autoconf-archive's AX_LIB_SOCKET_NSL macro instead. At the same time, the patch only links libcody and the networking libs where needed (cc1plus, cc1objplus). Bootstrapped without regressions on i386-pc-solaris2.11 (Solaris 11.3 and 11.4), sparc-sun-solaris2.11, and x86_64-pc-linux-gnu. 2020-12-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> c++tools: PR c++/98316 * configure.ac: Include ../config/ax_lib_socket_nsl.m4. (NETLIBS): Determine using AX_LIB_SOCKET_NSL. * configure: Regenerate. * Makefile.in (NETLIBS): Define. (g++-mapper-server$(exeext)): Add $(NETLIBS). gcc/objcp: PR c++/98316 * Make-lang.in (cc1objplus$(exeext)): Add $(CODYLIB), $(NETLIBS). gcc/cp: PR c++/98316 * Make-lang.in (cc1plus$(exeext)): Add $(CODYLIB), $(NETLIBS). gcc: PR c++/98316 * configure.ac (NETLIBS): Determine using AX_LIB_SOCKET_NSL. * aclocal.m4, configure: Regenerate. * Makefile.in (NETLIBS): Define. (BACKEND): Remove $(CODYLIB). config: PR c++/98316 * ax_lib_socket_nsl.m4: Import from autoconf-archive.
2021-01-05simplify-rtx: Optimize (x - 1) * y + y [PR98334]Jakub Jelinek
We don't try to optimize for signed x, y (int) (x - 1U) * y + y into x * y, we can't do that with signed x * y, because the former is well defined for INT_MIN and -1, while the latter is not. We could perhaps optimize it during isel or some very late optimization where we'd turn magically flag_wrapv, but we don't do that yet. This patch optimizes it in simplify-rtx.c, such that we can optimize it during combine. 2021-01-05 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/98334 * simplify-rtx.c (simplify_context::simplify_binary_operation_1): Optimize (X - 1) * Y + Y to X * Y or (X + 1) * Y - Y to X * Y. * gcc.target/i386/pr98334.c: New test.
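In wrapping (unsigned) arithmetic the identity (x - 1) * y + y == x * y holds for every input, which is the setting in which the RTL simplification operates. A sketch with hypothetical function names:

```c
#include <assert.h>
#include <limits.h>

/* Expression before simplification, in wrapping unsigned arithmetic. */
unsigned mul_before(unsigned x, unsigned y)
{
    return (x - 1u) * y + y;
}

/* Expression after simplification. */
unsigned mul_after(unsigned x, unsigned y)
{
    return x * y;
}

/* Check the identity for one pair of inputs. */
void check_identity(unsigned x, unsigned y)
{
    assert(mul_before(x, y) == mul_after(x, y));
}
```

The case x == 0 is the interesting one: x - 1 wraps to UINT_MAX, and UINT_MAX * y + y wraps back to 0, matching 0 * y.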
2021-01-05Restore input_location after recursive expand_call_inlineBernd Edlinger
This is just a precautionary fix. 2021-01-05 Bernd Edlinger <bernd.edlinger@hotmail.de> * tree-inline.c (expand_call_inline): Restore input_location. Return result from recursive call.
2021-01-05Fix testsuite/g++.dg/cpp1y/constexpr-66093.C execution failure...Jerome Lambourg
The constexpr iteration dereferenced an array element past the end of the array. for gcc/testsuite/ChangeLog * g++.dg/cpp1y/constexpr-66093.C: Fix bounds issue.
2021-01-04Go frontend: add -fgo-embedcfg optionIan Lance Taylor
This option will be used by the go command to implement go:embed directives, which are new with the upcoming Go 1.16 release. * lang.opt (fgo-embedcfg): New option. * go-c.h (struct go_create_gogo_args): Add embedcfg field. * go-lang.c (go_embedcfg): New static variable. (go_langhook_init): Set go_create_gogo_args embedcfg field. (go_langhook_handle_option): Handle OPT_fgo_embedcfg_. * gccgo.texi (Invoking gccgo): Document -fgo-embedcfg.
2021-01-04analyzer: fix ICE with -fsanitize=undefined [PR98293]David Malcolm
-fsanitize=undefined with calls to nonnull functions creates struct __ubsan_nonnull_arg_data instances with CONSTRUCTORs for RECORD_TYPEs with NULL index values. The analyzer was mistakenly using INTEGER_CST for these fields, leading to ICEs. Fix the issue by iterating through the fields in the type for such cases, imitating similar logic in varasm.c's output_constructor. gcc/analyzer/ChangeLog: PR analyzer/98293 * store.cc (binding_map::apply_ctor_to_region): When "index" is NULL, iterate through the fields for RECORD_TYPEs, rather than creating an INTEGER_CST index. gcc/testsuite/ChangeLog: PR analyzer/98293 * gcc.dg/analyzer/pr98293.c: New test.
2021-01-05Daily bump.GCC Administrator
2021-01-04C: Add test for incorrect warning for assignment of certain volatile expressions fixed by commit 58a45ce [PR98029]Martin Uecker
2021-01-04 Martin Uecker <muecker@gwdg.de> gcc/testsuite/ PR c/98029 * gcc.dg/pr98029.c: New test.
2021-01-04MAINTAINERS: Update my email address.Philipp Tomsich
2021-01-04 Philipp Tomsich <philipp.tomsich@vrull.eu> * MAINTAINERS: Update my email address.
2021-01-04c++: Add stdlib module test casesNathan Sidwell
The remaining modules tests use the std library. These are those. gcc/testsuite/ * g++.dg/modules/binding-1_a.H: New. * g++.dg/modules/binding-1_b.H: New. * g++.dg/modules/binding-1_c.C: New. * g++.dg/modules/binding-2.H: New. * g++.dg/modules/builtin-3_a.C: New. * g++.dg/modules/global-2_a.C: New. * g++.dg/modules/global-2_b.C: New. * g++.dg/modules/global-3_a.C: New. * g++.dg/modules/global-3_b.C: New. * g++.dg/modules/hello-1_a.C: New. * g++.dg/modules/hello-1_b.C: New. * g++.dg/modules/iostream-1_a.H: New. * g++.dg/modules/iostream-1_b.C: New. * g++.dg/modules/part-5_a.C: New. * g++.dg/modules/part-5_b.C: New. * g++.dg/modules/part-5_c.C: New. * g++.dg/modules/stdio-1_a.H: New. * g++.dg/modules/stdio-1_b.C: New. * g++.dg/modules/string-1_a.H: New. * g++.dg/modules/string-1_b.C: New. * g++.dg/modules/string-view1.C: New. * g++.dg/modules/string-view2.C: New. * g++.dg/modules/tinfo-1.C: New. * g++.dg/modules/tinfo-2_a.H: New. * g++.dg/modules/tinfo-2_b.C: New. * g++.dg/modules/tname-spec-1_a.H: New. * g++.dg/modules/tname-spec-1_b.C: New. * g++.dg/modules/xtreme-header-1.h: New. * g++.dg/modules/xtreme-header-1_a.H: New. * g++.dg/modules/xtreme-header-1_b.C: New. * g++.dg/modules/xtreme-header-1_c.C: New. * g++.dg/modules/xtreme-header-2.h: New. * g++.dg/modules/xtreme-header-2_a.H: New. * g++.dg/modules/xtreme-header-2_b.C: New. * g++.dg/modules/xtreme-header-2_c.C: New. * g++.dg/modules/xtreme-header-3.h: New. * g++.dg/modules/xtreme-header-3_a.H: New. * g++.dg/modules/xtreme-header-3_b.C: New. * g++.dg/modules/xtreme-header-3_c.C: New. * g++.dg/modules/xtreme-header-4.h: New. * g++.dg/modules/xtreme-header-4_a.H: New. * g++.dg/modules/xtreme-header-4_b.C: New. * g++.dg/modules/xtreme-header-4_c.C: New. * g++.dg/modules/xtreme-header-5.h: New. * g++.dg/modules/xtreme-header-5_a.H: New. * g++.dg/modules/xtreme-header-5_b.C: New. * g++.dg/modules/xtreme-header-5_c.C: New. * g++.dg/modules/xtreme-header-6.h: New. * g++.dg/modules/xtreme-header-6_a.H: New. 
* g++.dg/modules/xtreme-header-6_b.C: New. * g++.dg/modules/xtreme-header-6_c.C: New. * g++.dg/modules/xtreme-header.h: New. * g++.dg/modules/xtreme-header_a.H: New. * g++.dg/modules/xtreme-header_b.C: New. * g++.dg/modules/xtreme-tr1.h: New. * g++.dg/modules/xtreme-tr1_a.H: New. * g++.dg/modules/xtreme-tr1_b.C: New.
2021-01-04vect, aarch64: Fix alignment units for IFN_MASK* [PR95401]Richard Sandiford
The IFN_MASK* functions take two leading arguments: a load or store pointer and a “cookie”. The type of the cookie is the type of the access for TBAA purposes (like for MEM_REFs) while the value of the cookie is the alignment of the access. This PR was caused by a disagreement about whether the alignment is measured in bits or bytes. It looks like this goes back to PR68786, which made the vectoriser create its own cookie argument rather than reusing the one created by ifcvt. The alignment value of the new cookie was measured in bytes (as needed by set_ptr_info_alignment) while the existing code expected it to be measured in bits. The folds I added for IFN_MASK_LOAD and STORE then made things worse. gcc/ PR tree-optimization/95401 * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::load_store_cookie): Use bits rather than bytes for the alignment argument to IFN_MASK_LOAD and IFN_MASK_STORE. * gimple-fold.c (gimple_fold_mask_load_store_mem_ref): Likewise. * tree-vect-stmts.c (vectorizable_store): Likewise. (vectorizable_load): Likewise. gcc/testsuite/ PR tree-optimization/95401 * g++.dg/vect/pr95401.cc: New test. * g++.dg/vect/pr95401a.cc: Likewise.
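The unit mismatch described above can be sketched with plain arithmetic. The helper names are hypothetical, and CHAR_BIT stands in for GCC's BITS_PER_UNIT; the sketch assumes 8-bit bytes. It shows only why a byte count read as a bit count understates the alignment by a factor of eight, not how the vectoriser builds the cookie.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: what a producer should store in the cookie (bits).
constexpr unsigned bytes_to_bits(unsigned align_bytes)
{
  return align_bytes * CHAR_BIT;
}

// Hypothetical helper: how a consumer that expects bits reads the cookie.
constexpr unsigned cookie_as_bytes(unsigned cookie_bits)
{
  return cookie_bits / CHAR_BIT;
}

void demo_alignment_units()
{
  const unsigned align_bytes = 16;
  // Producer and consumer agree on bits: the 16-byte alignment survives.
  assert(cookie_as_bytes(bytes_to_bits(align_bytes)) == 16);
  // Producer stores bytes, consumer reads bits: only 2 bytes are assumed.
  assert(cookie_as_bytes(align_bytes) == 2);
}
```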
2021-01-04[libcody] Remove some std::move [PR 98368]Nathan Sidwell
Compiling on clang showed a couple of pessimizations. Fixed thusly. libcody/ * client.cc (Client::ProcessResponse): Remove std::move inside ?: c++tools/ * resolver.cc (module_resolver::cmi_response): Remove std::move of temporary.
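A minimal sketch of why std::move on a temporary is a pessimization, assuming C++17 or later. The Tracker type and both factory functions are hypothetical and are not the libcody code. Returning a prvalue directly is elided, while wrapping it in std::move forces a move construction. Compilers may warn about the first function (for example with -Wpessimizing-move), which is the point of the example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is move-constructed.
struct Tracker {
  static int moves;
  Tracker() = default;
  Tracker(const Tracker &) = default;
  Tracker(Tracker &&) noexcept { ++moves; }
};
int Tracker::moves = 0;

// std::move turns the temporary into an xvalue, so elision cannot apply.
Tracker make_pessimized() { return std::move(Tracker()); }

// Returning the prvalue directly lets the compiler elide the move.
Tracker make_elided() { return Tracker(); }

void demo_pessimizing_move()
{
  Tracker::moves = 0;
  Tracker a = make_elided();
  assert(Tracker::moves == 0);
  Tracker b = make_pessimized();
  assert(Tracker::moves == 1);
  (void)a;
  (void)b;
}
```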
2021-01-04[libcody] Windows absdir fixMateusz Wajchęprzełóż
An obvious thinko in drive name check :( libcody/ * resolver.cc (IsAbsDir): Fix string indexing. Signed-off-by: Nathan Sidwell <nathan@acm.org>
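For orientation, a Windows-style absolute-directory check has to index three distinct characters: the drive letter, the colon, and the separator. The sketch below is hypothetical and is not the libcody implementation; all names are assumptions. It shows one way to index the string so that each position is read only after the previous one has matched, which also keeps short strings in bounds.

```cpp
#include <cassert>

// Hypothetical helpers; not the functions used in resolver.cc.
static bool is_dir_sep(char c) { return c == '/' || c == '\\'; }

static bool is_drive_letter(char c)
{
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

// True for "/x", "\\x" and "C:\\x"; false for "C:x", "x" and "".
bool is_abs_dir_windows(const char *str)
{
  if (is_dir_sep(str[0]))
    return true;
  // Short-circuit evaluation stops at the terminating NUL, so str[1] and
  // str[2] are read only when the earlier characters matched.
  return is_drive_letter(str[0]) && str[1] == ':' && is_dir_sep(str[2]);
}
```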
2021-01-04tree-optimization/98308 - set vector type for mask of masked loadRichard Biener
This makes sure to set the vector type on an invariant mask argument for a masked load and SLP. 2021-01-04 Richard Biener <rguenther@suse.de> PR tree-optimization/98308 * tree-vect-stmts.c (vectorizable_load): Set invariant mask SLP vectype. * gcc.dg/vect/pr98308.c: New testcase.
2021-01-04loop-niter: Recognize popcount idioms even with char, short and __int128 ↵Jakub Jelinek
[PR95771] As the testcase shows, we punt unnecessarily on popcount loop idioms if the type is smaller than int or larger than long long. Types smaller than int can be handled by zero-extending the argument to unsigned int, and types twice as wide as long long by calling __builtin_popcountll on both halves of the __int128. 2021-01-04 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/95771 * tree-ssa-loop-niter.c (number_of_iterations_popcount): Handle types with precision smaller than int's precision and types with precision twice as large as long long. Formatting fixes. * gcc.target/i386/pr95771.c: New test.
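The two strategies named above can be written out by hand. This is a sketch, not the code in tree-ssa-loop-niter.c or the new test; the function names are hypothetical, while __builtin_popcount, __builtin_popcountll and unsigned __int128 are GCC/Clang extensions (the latter available on 64-bit targets).

```cpp
#include <cassert>

// The loop idiom on a narrow type: clear the lowest set bit until none remain.
int popcount_loop(unsigned char x)
{
  int n = 0;
  while (x) {
    x = static_cast<unsigned char>(x & (x - 1));
    ++n;
  }
  return n;
}

// Narrow type: zero-extend to unsigned int, then use the int-sized builtin.
int popcount_narrow(unsigned char x)
{
  return __builtin_popcount(static_cast<unsigned int>(x));
}

// Type twice as wide as long long: count both 64-bit halves separately.
int popcount_wide(unsigned __int128 x)
{
  return __builtin_popcountll(static_cast<unsigned long long>(x >> 64))
         + __builtin_popcountll(static_cast<unsigned long long>(x));
}
```

For any input, popcount_loop and popcount_narrow return the same count, which is the equivalence the loop idiom recognition relies on.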