Merge up to 236941.

git-svn-id: https://gcc.gnu.org/svn/gcc/branches/ibm/power9-gcc6@236943 138bc75d-0d04-0410-961f-82ee72b054a4
author: Michael Meissner <meissner@linux.vnet.ibm.com> 2016-05-31 19:32:29 +0000
committer: Michael Meissner <meissner@linux.vnet.ibm.com> 2016-05-31 19:32:29 +0000
commit: bacd566421379a78d1fcd79fc7ccad3dde0ab185
tree: bbb7d524571cef68f87e373685160dcf9dd7610b
parent: 2f4ed407112e38c9ff43ceb61570fe760987e4fb
parent: 741da3c33f74f41851aa32437020a8fd1fc350a4
54 files changed, 2085 insertions, 155 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 021ece15ba9..4286aad58b3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,152 @@
+2016-05-31  Richard Biener  <rguenther@suse.de>
+
+	Backport from mainline
+	2016-05-11  Richard Biener  <rguenther@suse.de>
+
+	PR debug/71057
+	* dwarf2out.c (retry_incomplete_types): Set early_dwarf.
+	(dwarf2out_finish): Move retry_incomplete_types call ...
+	(dwarf2out_early_finish): ... here.
+
+2016-05-31  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
+
+	PR target/71056
+	* config/arm/arm-builtins.c (arm_builtin_vectorized_function): Return
+	NULL_TREE early if NEON is not available.  Remove now redundant check
+	in ARM_CHECK_BUILTIN_MODE.
+
+2016-05-31  Tom de Vries  <tom@codesourcery.com>
+
+	backport:
+	2016-05-31  Tom de Vries  <tom@codesourcery.com>
+
+	PR tree-optimization/69068
+	* graphite-isl-ast-to-gimple.c (copy_bb_and_scalar_dependences): Handle
+	phis with more than two args.
+
+2016-05-30  Andreas Tobler  <andreast@gcc.gnu.org>
+
+	Backport from mainline
+        2016-05-30  Andreas Tobler  <andreast@gcc.gnu.org>
+
+	* config.gcc: Move hard float support for arm*hf*-*-freebsd* into
+	armv6*-*-freebsd* for FreeBSD 11. Eliminate the arm*hf*-*-freebsd*
+	target.
+
+2016-05-30  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
+
+	Backport from mainline
+	2016-04-29  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
+
+	* config/rs6000/altivec.h: Change definitions of vec_xl and
+	vec_xst.
+	* config/rs6000/rs6000-builtin.def (LD_ELEMREV_V2DF): New.
+	(LD_ELEMREV_V2DI): New.
+	(LD_ELEMREV_V4SF): New.
+	(LD_ELEMREV_V4SI): New.
+	(LD_ELEMREV_V8HI): New.
+	(LD_ELEMREV_V16QI): New.
+	(ST_ELEMREV_V2DF): New.
+	(ST_ELEMREV_V2DI): New.
+	(ST_ELEMREV_V4SF): New.
+	(ST_ELEMREV_V4SI): New.
+	(ST_ELEMREV_V8HI): New.
+	(ST_ELEMREV_V16QI): New.
+	(XL): New.
+	(XST): New.
+	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
+	descriptions for VSX_BUILTIN_VEC_XL and VSX_BUILTIN_VEC_XST.
+	* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Map from
+	TARGET_P9_VECTOR to RS6000_BTM_P9_VECTOR.
+	(altivec_expand_builtin): Add handling for
+	VSX_BUILTIN_ST_ELEMREV_<MODE> and VSX_BUILTIN_LD_ELEMREV_<MODE>.
+	(rs6000_invalid_builtin): Add error-checking for
+	RS6000_BTM_P9_VECTOR.
+	(altivec_init_builtins): Define builtins used to implement vec_xl
+	and vec_xst.
+	(rs6000_builtin_mask_names): Define power9-vector.
+	* config/rs6000/rs6000.h (MASK_P9_VECTOR): Define.
+	(RS6000_BTM_P9_VECTOR): Define.
+	(RS6000_BTM_COMMON): Include RS6000_BTM_P9_VECTOR.
+	* config/rs6000/vsx.md (vsx_ld_elemrev_v2di): New define_insn.
+	(vsx_ld_elemrev_v2df): Likewise.
+	(vsx_ld_elemrev_v4sf): Likewise.
+	(vsx_ld_elemrev_v4si): Likewise.
+	(vsx_ld_elemrev_v8hi): Likewise.
+	(vsx_ld_elemrev_v16qi): Likewise.
+	(vsx_st_elemrev_v2df): Likewise.
+	(vsx_st_elemrev_v2di): Likewise.
+	(vsx_st_elemrev_v4sf): Likewise.
+	(vsx_st_elemrev_v4si): Likewise.
+	(vsx_st_elemrev_v8hi): Likewise.
+	(vsx_st_elemrev_v16qi): Likewise.
+	* doc/extend.texi: Add prototypes for vec_xl and vec_xst.  Correct
+	grammar.
+
+2016-05-30  Richard Biener  <rguenther@suse.de>
+
+	Backport from mainline
+	2016-05-11  Richard Biener  <rguenther@suse.de>
+
+	PR middle-end/71002
+	* alias.c (reference_alias_ptr_type): Preserve alias-set zero
+	if the langhook insists on it.
+	* fold-const.c (make_bit_field_ref): Add arg for the original
+	reference and preserve its alias-set.
+	(decode_field_reference): Take exp by reference and adjust it
+	to the original memory reference.
+	(optimize_bit_field_compare): Adjust callers.
+	(fold_truth_andor_1): Likewise.
+
+	2016-05-13  Jakub Jelinek  <jakub@redhat.com>
+
+	PR bootstrap/71071
+	* fold-const.c (fold_checksum_tree): Allow modification
+	of TYPE_ALIAS_SET during folding.
+
+2016-05-30  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* config/visium/visium.c (visium_split_double_add): Minor tweaks.
+	(visium_expand_copysign): Use gen_int_mode directly.
+	(visium_compute_frame_size): Minor tweaks.
+
+2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	backport:
+	2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	PR tree-optimization/69067
+	* graphite-isl-ast-to-gimple.c (get_def_bb_for_const): Remove assert.
+
+2016-05-27  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* config/visium/visium-protos.h (split_double_move): Rename into...
+	(visium_split_double_move): ...this.
+	(visium_split_double_add): Declare.
+	* config/visium/visium.c (split_double_move): Rename into...
+	(visium_split_double_move): ...this.
+	(visium_split_double_add): New function.
+	(visium_expand_copysign): Renumber operands for consistency.
+	* config/visium/visium.md (DImode move splitter): Adjust to renaming.
+	(DFmode move splitter): Likewise.
+	(*addi3_insn): Split by means of visium_split_double_add.
+	(*adddi3_insn_flags): Delete.
+	(*plus_plus_sltu<subst_arith>): New insn.
+	(*subdi3_insn): Split by means of visium_split_double_add.
+	(subdi3_insn_flags): Delete.
+	(*minus_minus_sltu<subst_arith>): New insn.
+	(*negdi2_insn): Split by means of visium_split_double_add.
+	(*negdi2_insn_flags): Delete.
+
+2016-05-27  Ilya Enkovich  <ilya.enkovich@intel.com>
+
+	Backport from mainline r236810.
+	2016-05-27  Ilya Enkovich  <ilya.enkovich@intel.com>
+
+	PR middle-end/71279
+	* fold-const.c (fold_ternary_loc): Don't fold VEC_COND_EXPR
+	into comparison.
+
 2016-05-25  Eric Botcazou  <ebotcazou@adacore.com>
 
 	* tree-ssa-phiopt.c (factor_out_conditional_conversion): Remove
diff --git a/gcc/ChangeLog.meissner b/gcc/ChangeLog.meissner
index efd9f122b83..6a5b4b5d750 100644
--- a/gcc/ChangeLog.meissner
+++ b/gcc/ChangeLog.meissner
@@ -1,3 +1,8 @@
+2016-05-31  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	Merge up to 236941.
+	* REVISION: Update subversion id.
+
 2016-05-26   Michael Meissner  <meissner@linux.vnet.ibm.com>
 
 	Clone branch subversion id 236789
diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP
index a335ce99a97..9e5bee5c73a 100644
--- a/gcc/DATESTAMP
+++ b/gcc/DATESTAMP
@@ -1 +1 @@
-20160526
+20160531
diff --git a/gcc/REVISION b/gcc/REVISION
index e989a4c770b..bae04b99e91 100644
--- a/gcc/REVISION
+++ b/gcc/REVISION
@@ -1 +1 @@
-power9-gcc6 branch, based on subversion id 236789
+power9-gcc6 branch, based on subversion id 236941.
diff --git a/gcc/ada/ChangeLog b/gcc/ada/ChangeLog
index 9856530ff4e..891c73d1401 100644
--- a/gcc/ada/ChangeLog
+++ b/gcc/ada/ChangeLog
@@ -1,3 +1,14 @@
+2016-05-31  Eric Botcazou  <ebotcazou@adacore.com>
+
+	* s-osinte-kfreebsd-gnu.ads (clock_getres): Define.
+	(Get_Page_Size): Remove duplicate and return int.
+
+2016-05-31  Jan Sommer  <soja-lists@aries.uberspace.de>
+
+	PR ada/71317
+	* s-osinte-rtems.ads (clock_getres): Define.
+	(Get_Page_Size): Remove duplicate and return int.
+
 2016-05-06  Eric Botcazou  <ebotcazou@adacore.com>
 
 	PR ada/70969
diff --git a/gcc/ada/s-osinte-kfreebsd-gnu.ads b/gcc/ada/s-osinte-kfreebsd-gnu.ads
index 3f6ef9bb409..647778bb053 100644
--- a/gcc/ada/s-osinte-kfreebsd-gnu.ads
+++ b/gcc/ada/s-osinte-kfreebsd-gnu.ads
@@ -7,7 +7,7 @@
 --                                  S p e c                                 --
 --                                                                          --
 --               Copyright (C) 1991-1994, Florida State University          --
---            Copyright (C) 1995-2015, Free Software Foundation, Inc.       --
+--            Copyright (C) 1995-2016, Free Software Foundation, Inc.       --
 --                                                                          --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -216,6 +216,11 @@ package System.OS_Interface is
       return int;
    pragma Import (C, clock_gettime, "clock_gettime");
 
+   function clock_getres
+     (clock_id : clockid_t;
+      res      : access timespec) return int;
+   pragma Import (C, clock_getres, "clock_getres");
+
    function To_Duration (TS : timespec) return Duration;
    pragma Inline (To_Duration);
 
@@ -330,8 +335,7 @@ package System.OS_Interface is
    --  returns the stack base of the specified thread. Only call this function
    --  when Stack_Base_Available is True.
 
-   function Get_Page_Size return size_t;
-   function Get_Page_Size return Address;
+   function Get_Page_Size return int;
    pragma Import (C, Get_Page_Size, "getpagesize");
    --  Returns the size of a page
 
diff --git a/gcc/ada/s-osinte-rtems.ads b/gcc/ada/s-osinte-rtems.ads
index 5a143cc666a..a658bbe8b0d 100644
--- a/gcc/ada/s-osinte-rtems.ads
+++ b/gcc/ada/s-osinte-rtems.ads
@@ -6,7 +6,7 @@
 --                                                                          --
 --                                   S p e c                                --
 --                                                                          --
---          Copyright (C) 1997-2011 Free Software Foundation, Inc.          --
+--          Copyright (C) 1997-2016 Free Software Foundation, Inc.          --
 --                                                                          --
 -- GNARL is free software; you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -188,6 +188,11 @@ package System.OS_Interface is
       tp       : access timespec) return int;
    pragma Import (C, clock_gettime, "clock_gettime");
 
+   function clock_getres
+     (clock_id : clockid_t;
+      res      : access timespec) return int;
+   pragma Import (C, clock_getres, "clock_getres");
+
    function To_Duration (TS : timespec) return Duration;
    pragma Inline (To_Duration);
 
@@ -291,8 +296,7 @@ package System.OS_Interface is
    --  These two functions are only needed to share s-taprop.adb with
    --  FSU threads.
 
-   function Get_Page_Size return size_t;
-   function Get_Page_Size return Address;
+   function Get_Page_Size return int;
    pragma Import (C, Get_Page_Size, "getpagesize");
    --  Returns the size of a page
 
diff --git a/gcc/alias.c b/gcc/alias.c
index a0e25dcce06..ea226fcf0ea 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -769,6 +769,10 @@ reference_alias_ptr_type_1 (tree *t)
 tree
 reference_alias_ptr_type (tree t)
 {
+  /* If the frontend assigns this alias-set zero, preserve that.  */
+  if (lang_hooks.get_alias_set (t) == 0)
+    return ptr_type_node;
+
   tree ptype = reference_alias_ptr_type_1 (&t);
   /* If there is a given pointer type for aliasing purposes, return it.  */
   if (ptype != NULL_TREE)
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index 1e7b27b265d..73a882c9e91 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,11 @@
+2016-05-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/71349
+	* c-omp.c (c_omp_split_clauses): Put OMP_CLAUSE_DEPEND to
+	C_OMP_CLAUSE_SPLIT_TARGET.  Put OMP_CLAUSE_NOWAIT to
+	C_OMP_CLAUSE_SPLIT_TARGET if combined with target construct,
+	instead of C_OMP_CLAUSE_SPLIT_FOR.
+
 2016-04-29  Cesar Philippidis  <cesar@codesourcery.com>
 
 	PR middle-end/70626
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index be401bbb6b4..1691c40f11a 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -983,6 +983,7 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
 	case OMP_CLAUSE_MAP:
 	case OMP_CLAUSE_IS_DEVICE_PTR:
 	case OMP_CLAUSE_DEFAULTMAP:
+	case OMP_CLAUSE_DEPEND:
 	  s = C_OMP_CLAUSE_SPLIT_TARGET;
 	  break;
 	case OMP_CLAUSE_NUM_TEAMS:
@@ -998,7 +999,6 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
 	  s = C_OMP_CLAUSE_SPLIT_PARALLEL;
 	  break;
 	case OMP_CLAUSE_ORDERED:
-	case OMP_CLAUSE_NOWAIT:
 	  s = C_OMP_CLAUSE_SPLIT_FOR;
 	  break;
 	case OMP_CLAUSE_SCHEDULE:
@@ -1333,6 +1333,18 @@ c_omp_split_clauses (location_t loc, enum tree_code code,
 	  else
 	    s = C_OMP_CLAUSE_SPLIT_FOR;
 	  break;
+	case OMP_CLAUSE_NOWAIT:
+	  /* Nowait clause is allowed on target, for and sections, but
+	     is not allowed on parallel for or parallel sections.  Therefore,
+	     put it on target construct if present, because that can only
+	     be combined with parallel for{, simd} and not with for{, simd},
+	     otherwise to the worksharing construct.  */
+	  if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_MAP))
+	      != 0)
+	    s = C_OMP_CLAUSE_SPLIT_TARGET;
+	  else
+	    s = C_OMP_CLAUSE_SPLIT_FOR;
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index d7c0c023dc2..fc8921678ba 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,9 @@
+2016-05-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/71349
+	* c-parser.c (c_parser_omp_for): Don't disallow nowait clause
+	when combined with target construct.
+
 2016-05-19  David Malcolm  <dmalcolm@redhat.com>
 
 	Backport from trunk r236488.
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 1a47d0e9a3e..77c49a1e42e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -15094,7 +15094,9 @@ c_parser_omp_for (location_t loc, c_parser *parser,
 
   strcat (p_name, " for");
   mask |= OMP_FOR_CLAUSE_MASK;
-  if (cclauses)
+  /* parallel for{, simd} disallows nowait clause, but for
+     target {teams distribute ,}parallel for{, simd} it should be accepted.  */
+  if (cclauses && (mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_MAP)) == 0)
     mask &= ~(OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NOWAIT);
   /* Composite distribute parallel for{, simd} disallows ordered clause.  */
   if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DIST_SCHEDULE)) != 0)
diff --git a/gcc/config.gcc b/gcc/config.gcc
index f66e48cd1ca..beb50faec22 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1058,11 +1058,9 @@ arm*-*-freebsd*)                # ARM FreeBSD EABI
 	case $target in
 	armv6*-*-freebsd*)
 	    tm_defines="${tm_defines} TARGET_FREEBSD_ARMv6=1"
-	    ;;
-	esac
-	case $target in
-	arm*hf-*-freebsd*)
-	    tm_defines="${tm_defines} TARGET_FREEBSD_ARM_HARD_FLOAT=1"
+            if test $fbsd_major -ge 11; then
+               tm_defines="${tm_defines} TARGET_FREEBSD_ARM_HARD_FLOAT=1"
+            fi
 	    ;;
 	esac
 	with_tls=${with_tls:-gnu}
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 90fb40fed24..68b2839879f 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -2861,6 +2861,10 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
   int in_n, out_n;
   bool out_unsigned_p = TYPE_UNSIGNED (type_out);
 
+  /* Can't provide any vectorized builtins when we can't use NEON.  */
+  if (!TARGET_NEON)
+    return NULL_TREE;
+
   if (TREE_CODE (type_out) != VECTOR_TYPE
       || TREE_CODE (type_in) != VECTOR_TYPE)
     return NULL_TREE;
@@ -2875,7 +2879,7 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
    NULL_TREE is returned if no such builtin is available.  */
 #undef ARM_CHECK_BUILTIN_MODE
 #define ARM_CHECK_BUILTIN_MODE(C)    \
-  (TARGET_NEON && TARGET_FPU_ARMV8   \
+  (TARGET_FPU_ARMV8   \
    && flag_unsafe_math_optimizations \
    && ARM_CHECK_BUILTIN_MODE_1 (C))
 
diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index ea6af8d192d..5fc1cce0165 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -327,8 +327,8 @@
 #define vec_sqrt __builtin_vec_sqrt
 #define vec_vsx_ld __builtin_vec_vsx_ld
 #define vec_vsx_st __builtin_vec_vsx_st
-#define vec_xl __builtin_vec_vsx_ld
-#define vec_xst __builtin_vec_vsx_st
+#define vec_xl __builtin_vec_xl
+#define vec_xst __builtin_vec_xst
 
 /* Note, xxsldi and xxpermdi were added as __builtin_vsx_<xxx> functions
    instead of __builtin_vec_<xxx>  */
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 891d2402676..6f332788684 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1398,6 +1398,18 @@ BU_VSX_X (STXVW4X_V4SF,	      "stxvw4x_v4sf",	MEM)
 BU_VSX_X (STXVW4X_V4SI,	      "stxvw4x_v4si",	MEM)
 BU_VSX_X (STXVW4X_V8HI,	      "stxvw4x_v8hi",	MEM)
 BU_VSX_X (STXVW4X_V16QI,      "stxvw4x_v16qi",	MEM)
+BU_VSX_X (LD_ELEMREV_V2DF,    "ld_elemrev_v2df",  MEM)
+BU_VSX_X (LD_ELEMREV_V2DI,    "ld_elemrev_v2di",  MEM)
+BU_VSX_X (LD_ELEMREV_V4SF,    "ld_elemrev_v4sf",  MEM)
+BU_VSX_X (LD_ELEMREV_V4SI,    "ld_elemrev_v4si",  MEM)
+BU_VSX_X (LD_ELEMREV_V8HI,    "ld_elemrev_v8hi",  MEM)
+BU_VSX_X (LD_ELEMREV_V16QI,   "ld_elemrev_v16qi", MEM)
+BU_VSX_X (ST_ELEMREV_V2DF,    "st_elemrev_v2df",  MEM)
+BU_VSX_X (ST_ELEMREV_V2DI,    "st_elemrev_v2di",  MEM)
+BU_VSX_X (ST_ELEMREV_V4SF,    "st_elemrev_v4sf",  MEM)
+BU_VSX_X (ST_ELEMREV_V4SI,    "st_elemrev_v4si",  MEM)
+BU_VSX_X (ST_ELEMREV_V8HI,    "st_elemrev_v8hi",  MEM)
+BU_VSX_X (ST_ELEMREV_V16QI,   "st_elemrev_v16qi", MEM)
 BU_VSX_X (XSABSDP,	      "xsabsdp",	CONST)
 BU_VSX_X (XSADDDP,	      "xsadddp",	FP)
 BU_VSX_X (XSCMPODP,	      "xscmpodp",	FP)
@@ -1455,6 +1467,8 @@ BU_VSX_OVERLOAD_1 (DOUBLE,   "double")
 /* VSX builtins that are handled as special cases.  */
 BU_VSX_OVERLOAD_X (LD,	     "ld")
 BU_VSX_OVERLOAD_X (ST,	     "st")
+BU_VSX_OVERLOAD_X (XL,	     "xl")
+BU_VSX_OVERLOAD_X (XST,	     "xst")
 
 /* 1 argument VSX instructions added in ISA 2.07.  */
 BU_P8V_VSX_1 (XSCVSPDPN,      "xscvspdpn",	CONST,	vsx_xscvspdpn)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index ceb80b216ba..0985bb706fd 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -2726,6 +2726,49 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
   { ALTIVEC_BUILTIN_VEC_SUMS, ALTIVEC_BUILTIN_VSUMSWS,
     RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DF,
+    RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DF,
+    RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_double, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DI,
+    RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DI,
+    RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_long_long, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DI,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V2DI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V2DI,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_long_long, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SF,
+    RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SF,
+    RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SI,
+    RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SI,
+    RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SI,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V4SI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V4SI,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V8HI,
+    RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V8HI,
+    RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V8HI,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V8HI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V8HI,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V16QI,
+    RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V16QI,
+    RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V16QI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V16QI, 0 },
+  { VSX_BUILTIN_VEC_XL, VSX_BUILTIN_LD_ELEMREV_V16QI,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
   { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR,
     RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { ALTIVEC_BUILTIN_VEC_XOR, ALTIVEC_BUILTIN_VXOR,
@@ -3475,6 +3518,55 @@ const struct altivec_builtin_types altivec_overloaded_builtins[] = {
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
   { ALTIVEC_BUILTIN_VEC_STVRXL, ALTIVEC_BUILTIN_STVRXL,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DF,
+    RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DF,
+    RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_double },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DI,
+    RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DI,
+    RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_long_long },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V2DI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V2DI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_long_long },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SF,
+    RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SF,
+    RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SI,
+    RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SI,
+    RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V4SI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V4SI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_UINTSI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V8HI,
+    RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V8HI,
+    RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V8HI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V8HI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V8HI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_UINTHI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V16QI,
+    RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V16QI,
+    RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V16QI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_unsigned_V16QI },
+  { VSX_BUILTIN_VEC_XST, VSX_BUILTIN_ST_ELEMREV_V16QI,
+    RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI,
+    ~RS6000_BTI_UINTQI },
   { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_NOT_OPAQUE },
   { VSX_BUILTIN_VEC_XXSLDWI, VSX_BUILTIN_XXSLDWI_16QI,
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a22123ab7e4..9837aac0c2f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3633,6 +3633,7 @@ rs6000_builtin_mask_calculate (void)
 	  | ((TARGET_POPCNTD)		    ? RS6000_BTM_POPCNTD   : 0)
 	  | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL      : 0)
 	  | ((TARGET_P8_VECTOR)		    ? RS6000_BTM_P8_VECTOR : 0)
+	  | ((TARGET_P9_VECTOR)		    ? RS6000_BTM_P9_VECTOR : 0)
 	  | ((TARGET_CRYPTO)		    ? RS6000_BTM_CRYPTO	   : 0)
 	  | ((TARGET_HTM)		    ? RS6000_BTM_HTM	   : 0)
 	  | ((TARGET_DFP)		    ? RS6000_BTM_DFP	   : 0)
@@ -14159,6 +14160,47 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
     case VSX_BUILTIN_STXVW4X_V16QI:
       return altivec_expand_stv_builtin (CODE_FOR_vsx_store_v16qi, exp);
 
+    /* For the following on big endian, it's ok to use any appropriate
+       unaligned-supporting store, so use a generic expander.  For
+       little-endian, the exact element-reversing instruction must
+       be used.  */
+    case VSX_BUILTIN_ST_ELEMREV_V2DF:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2df
+			       : CODE_FOR_vsx_st_elemrev_v2df);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+    case VSX_BUILTIN_ST_ELEMREV_V2DI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2di
+			       : CODE_FOR_vsx_st_elemrev_v2di);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+    case VSX_BUILTIN_ST_ELEMREV_V4SF:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4sf
+			       : CODE_FOR_vsx_st_elemrev_v4sf);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+    case VSX_BUILTIN_ST_ELEMREV_V4SI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4si
+			       : CODE_FOR_vsx_st_elemrev_v4si);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+    case VSX_BUILTIN_ST_ELEMREV_V8HI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v8hi
+			       : CODE_FOR_vsx_st_elemrev_v8hi);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+    case VSX_BUILTIN_ST_ELEMREV_V16QI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v16qi
+			       : CODE_FOR_vsx_st_elemrev_v16qi);
+	return altivec_expand_stv_builtin (code, exp);
+      }
+
     case ALTIVEC_BUILTIN_MFVSCR:
       icode = CODE_FOR_altivec_mfvscr;
       tmode = insn_data[icode].operand[0].mode;
@@ -14353,6 +14395,46 @@ altivec_expand_builtin (tree exp, rtx target, bool *expandedp)
     case VSX_BUILTIN_LXVW4X_V16QI:
       return altivec_expand_lv_builtin (CODE_FOR_vsx_load_v16qi,
 					exp, target, false);
+    /* For the following on big endian, it's ok to use any appropriate
+       unaligned-supporting load, so use a generic expander.  For
+       little-endian, the exact element-reversing instruction must
+       be used.  */
+    case VSX_BUILTIN_LD_ELEMREV_V2DF:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2df
+			       : CODE_FOR_vsx_ld_elemrev_v2df);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
+    case VSX_BUILTIN_LD_ELEMREV_V2DI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2di
+			       : CODE_FOR_vsx_ld_elemrev_v2di);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
+    case VSX_BUILTIN_LD_ELEMREV_V4SF:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4sf
+			       : CODE_FOR_vsx_ld_elemrev_v4sf);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
+    case VSX_BUILTIN_LD_ELEMREV_V4SI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4si
+			       : CODE_FOR_vsx_ld_elemrev_v4si);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
+    case VSX_BUILTIN_LD_ELEMREV_V8HI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v8hi
+			       : CODE_FOR_vsx_ld_elemrev_v8hi);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
+    case VSX_BUILTIN_LD_ELEMREV_V16QI:
+      {
+	enum insn_code code = (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v16qi
+			       : CODE_FOR_vsx_ld_elemrev_v16qi);
+	return altivec_expand_lv_builtin (code, exp, target, false);
+      }
       break;
     default:
       break;
@@ -14822,6 +14904,8 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
     error ("Builtin function %s requires the -mhard-dfp option", name);
   else if ((fnmask & RS6000_BTM_P8_VECTOR) != 0)
     error ("Builtin function %s requires the -mpower8-vector option", name);
+  else if ((fnmask & RS6000_BTM_P9_VECTOR) != 0)
+    error ("Builtin function %s requires the -mpower9-vector option", name);
   else if ((fnmask & (RS6000_BTM_HARD_FLOAT | RS6000_BTM_LDBL128))
 	   == (RS6000_BTM_HARD_FLOAT | RS6000_BTM_LDBL128))
     error ("Builtin function %s requires the -mhard-float and"
@@ -15846,10 +15930,44 @@ altivec_init_builtins (void)
 	       VSX_BUILTIN_STXVW4X_V8HI);
   def_builtin ("__builtin_vsx_stxvw4x_v16qi", void_ftype_v16qi_long_pvoid,
 	       VSX_BUILTIN_STXVW4X_V16QI);
+
+  def_builtin ("__builtin_vsx_ld_elemrev_v2df", v2df_ftype_long_pcvoid,
+	       VSX_BUILTIN_LD_ELEMREV_V2DF);
+  def_builtin ("__builtin_vsx_ld_elemrev_v2di", v2di_ftype_long_pcvoid,
+	       VSX_BUILTIN_LD_ELEMREV_V2DI);
+  def_builtin ("__builtin_vsx_ld_elemrev_v4sf", v4sf_ftype_long_pcvoid,
+	       VSX_BUILTIN_LD_ELEMREV_V4SF);
+  def_builtin ("__builtin_vsx_ld_elemrev_v4si", v4si_ftype_long_pcvoid,
+	       VSX_BUILTIN_LD_ELEMREV_V4SI);
+  def_builtin ("__builtin_vsx_st_elemrev_v2df", void_ftype_v2df_long_pvoid,
+	       VSX_BUILTIN_ST_ELEMREV_V2DF);
+  def_builtin ("__builtin_vsx_st_elemrev_v2di", void_ftype_v2di_long_pvoid,
+	       VSX_BUILTIN_ST_ELEMREV_V2DI);
+  def_builtin ("__builtin_vsx_st_elemrev_v4sf", void_ftype_v4sf_long_pvoid,
+	       VSX_BUILTIN_ST_ELEMREV_V4SF);
+  def_builtin ("__builtin_vsx_st_elemrev_v4si", void_ftype_v4si_long_pvoid,
+	       VSX_BUILTIN_ST_ELEMREV_V4SI);
+
+  if (TARGET_P9_VECTOR)
+    {
+      def_builtin ("__builtin_vsx_ld_elemrev_v8hi", v8hi_ftype_long_pcvoid,
+		   VSX_BUILTIN_LD_ELEMREV_V8HI);
+      def_builtin ("__builtin_vsx_ld_elemrev_v16qi", v16qi_ftype_long_pcvoid,
+		   VSX_BUILTIN_LD_ELEMREV_V16QI);
+      def_builtin ("__builtin_vsx_st_elemrev_v8hi",
+		   void_ftype_v8hi_long_pvoid, VSX_BUILTIN_ST_ELEMREV_V8HI);
+      def_builtin ("__builtin_vsx_st_elemrev_v16qi",
+		   void_ftype_v16qi_long_pvoid, VSX_BUILTIN_ST_ELEMREV_V16QI);
+    }
+
   def_builtin ("__builtin_vec_vsx_ld", opaque_ftype_long_pcvoid,
 	       VSX_BUILTIN_VEC_LD);
   def_builtin ("__builtin_vec_vsx_st", void_ftype_opaque_long_pvoid,
 	       VSX_BUILTIN_VEC_ST);
+  def_builtin ("__builtin_vec_xl", opaque_ftype_long_pcvoid,
+	       VSX_BUILTIN_VEC_XL);
+  def_builtin ("__builtin_vec_xst", void_ftype_opaque_long_pvoid,
+	       VSX_BUILTIN_VEC_XST);
 
   def_builtin ("__builtin_vec_step", int_ftype_opaque, ALTIVEC_BUILTIN_VEC_STEP);
   def_builtin ("__builtin_vec_splats", opaque_ftype_opaque, ALTIVEC_BUILTIN_VEC_SPLATS);
@@ -34555,6 +34673,7 @@ static struct rs6000_opt_mask const rs6000_builtin_mask_names[] =
   { "popcntd",		 RS6000_BTM_POPCNTD,	false, false },
   { "cell",		 RS6000_BTM_CELL,	false, false },
   { "power8-vector",	 RS6000_BTM_P8_VECTOR,	false, false },
+  { "power9-vector",	 RS6000_BTM_P9_VECTOR,	false, false },
   { "crypto",		 RS6000_BTM_CRYPTO,	false, false },
   { "htm",		 RS6000_BTM_HTM,	false, false },
   { "hard-dfp",		 RS6000_BTM_DFP,	false, false },
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 478826cfcff..a5f43879c9b 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -615,6 +615,7 @@ extern int rs6000_vector_align[];
 #define MASK_MULTIPLE			OPTION_MASK_MULTIPLE
 #define MASK_NO_UPDATE			OPTION_MASK_NO_UPDATE
 #define MASK_P8_VECTOR			OPTION_MASK_P8_VECTOR
+#define MASK_P9_VECTOR			OPTION_MASK_P9_VECTOR
 #define MASK_POPCNTB			OPTION_MASK_POPCNTB
 #define MASK_POPCNTD			OPTION_MASK_POPCNTD
 #define MASK_PPC_GFXOPT			OPTION_MASK_PPC_GFXOPT
@@ -2662,6 +2663,7 @@ extern int frame_pointer_needed;
 #define RS6000_BTM_ALTIVEC	MASK_ALTIVEC	/* VMX/altivec vectors.  */
 #define RS6000_BTM_VSX		MASK_VSX	/* VSX (vector/scalar).  */
 #define RS6000_BTM_P8_VECTOR	MASK_P8_VECTOR	/* ISA 2.07 vector.  */
+#define RS6000_BTM_P9_VECTOR	MASK_P9_VECTOR	/* ISA 3.00 vector.  */
 #define RS6000_BTM_CRYPTO	MASK_CRYPTO	/* crypto funcs.  */
 #define RS6000_BTM_HTM		MASK_HTM	/* hardware TM funcs.  */
 #define RS6000_BTM_SPE		MASK_STRING	/* E500 */
@@ -2679,6 +2681,7 @@ extern int frame_pointer_needed;
 #define RS6000_BTM_COMMON	(RS6000_BTM_ALTIVEC			\
 				 | RS6000_BTM_VSX			\
 				 | RS6000_BTM_P8_VECTOR			\
+				 | RS6000_BTM_P9_VECTOR			\
 				 | RS6000_BTM_CRYPTO			\
 				 | RS6000_BTM_FRE			\
 				 | RS6000_BTM_FRES			\
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 41f296f2d9f..1d6e4797d7c 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -887,6 +887,140 @@
   "VECTOR_MEM_VSX_P (<MODE>mode)"
   "")
 
+;; Explicit load/store expanders for the builtin functions for lxvd2x, etc.,
+;; when you really want their element-reversing behavior.
+(define_insn "vsx_ld_elemrev_v2di"
+  [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa")
+        (vec_select:V2DI
+	  (match_operand:V2DI 1 "memory_operand" "Z")
+	  (parallel [(const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN"
+  "lxvd2x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_ld_elemrev_v2df"
+  [(set (match_operand:V2DF 0 "vsx_register_operand" "=wa")
+        (vec_select:V2DF
+	  (match_operand:V2DF 1 "memory_operand" "Z")
+	  (parallel [(const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DFmode) && !BYTES_BIG_ENDIAN"
+  "lxvd2x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_ld_elemrev_v4si"
+  [(set (match_operand:V4SI 0 "vsx_register_operand" "=wa")
+        (vec_select:V4SI
+	  (match_operand:V4SI 1 "memory_operand" "Z")
+	  (parallel [(const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V4SImode) && !BYTES_BIG_ENDIAN"
+  "lxvw4x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_ld_elemrev_v4sf"
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wa")
+        (vec_select:V4SF
+	  (match_operand:V4SF 1 "memory_operand" "Z")
+	  (parallel [(const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V4SFmode) && !BYTES_BIG_ENDIAN"
+  "lxvw4x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_ld_elemrev_v8hi"
+  [(set (match_operand:V8HI 0 "vsx_register_operand" "=wa")
+        (vec_select:V8HI
+	  (match_operand:V8HI 1 "memory_operand" "Z")
+	  (parallel [(const_int 7) (const_int 6)
+	             (const_int 5) (const_int 4)
+		     (const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V8HImode) && !BYTES_BIG_ENDIAN && TARGET_P9_VECTOR"
+  "lxvh8x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_ld_elemrev_v16qi"
+  [(set (match_operand:V16QI 0 "vsx_register_operand" "=wa")
+        (vec_select:V16QI
+	  (match_operand:V16QI 1 "memory_operand" "Z")
+	  (parallel [(const_int 15) (const_int 14)
+	             (const_int 13) (const_int 12)
+		     (const_int 11) (const_int 10)
+		     (const_int  9) (const_int  8)
+		     (const_int  7) (const_int  6)
+	             (const_int  5) (const_int  4)
+		     (const_int  3) (const_int  2)
+	             (const_int  1) (const_int  0)])))]
+  "VECTOR_MEM_VSX_P (V16QImode) && !BYTES_BIG_ENDIAN && TARGET_P9_VECTOR"
+  "lxvb16x %x0,%y1"
+  [(set_attr "type" "vecload")])
+
+(define_insn "vsx_st_elemrev_v2df"
+  [(set (match_operand:V2DF 0 "memory_operand" "=Z")
+        (vec_select:V2DF
+	  (match_operand:V2DF 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DFmode) && !BYTES_BIG_ENDIAN"
+  "stxvd2x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "vsx_st_elemrev_v2di"
+  [(set (match_operand:V2DI 0 "memory_operand" "=Z")
+        (vec_select:V2DI
+	  (match_operand:V2DI 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V2DImode) && !BYTES_BIG_ENDIAN"
+  "stxvd2x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "vsx_st_elemrev_v4sf"
+  [(set (match_operand:V4SF 0 "memory_operand" "=Z")
+        (vec_select:V4SF
+	  (match_operand:V4SF 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V4SFmode) && !BYTES_BIG_ENDIAN"
+  "stxvw4x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "vsx_st_elemrev_v4si"
+  [(set (match_operand:V4SI 0 "memory_operand" "=Z")
+        (vec_select:V4SI
+	  (match_operand:V4SI 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V4SImode) && !BYTES_BIG_ENDIAN"
+  "stxvw4x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "vsx_st_elemrev_v8hi"
+  [(set (match_operand:V8HI 0 "memory_operand" "=Z")
+        (vec_select:V8HI
+	  (match_operand:V8HI 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 7) (const_int 6)
+	             (const_int 5) (const_int 4)
+		     (const_int 3) (const_int 2)
+	             (const_int 1) (const_int 0)])))]
+  "VECTOR_MEM_VSX_P (V8HImode) && !BYTES_BIG_ENDIAN && TARGET_P9_VECTOR"
+  "stxvh8x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "vsx_st_elemrev_v16qi"
+  [(set (match_operand:V16QI 0 "memory_operand" "=Z")
+        (vec_select:V16QI
+	  (match_operand:V16QI 1 "vsx_register_operand" "wa")
+	  (parallel [(const_int 15) (const_int 14)
+	             (const_int 13) (const_int 12)
+		     (const_int 11) (const_int 10)
+		     (const_int  9) (const_int  8)
+	             (const_int  7) (const_int  6)
+	             (const_int  5) (const_int  4)
+		     (const_int  3) (const_int  2)
+	             (const_int  1) (const_int  0)])))]
+  "VECTOR_MEM_VSX_P (V16QImode) && !BYTES_BIG_ENDIAN && TARGET_P9_VECTOR"
+  "stxvb16x %x1,%y0"
+  [(set_attr "type" "vecstore")])
+
 
 ;; VSX vector floating point arithmetic instructions.  The VSX scalar
 ;; instructions are now combined with the insn for the traditional floating
diff --git a/gcc/config/visium/visium-protos.h b/gcc/config/visium/visium-protos.h
index 484d01e477d..9dcbc67035f 100644
--- a/gcc/config/visium/visium-protos.h
+++ b/gcc/config/visium/visium-protos.h
@@ -49,7 +49,8 @@ extern void visium_split_cbranch (enum rtx_code, rtx, rtx, rtx);
 extern const char *output_ubranch (rtx, rtx_insn *);
 extern const char *output_cbranch (rtx, enum rtx_code, enum machine_mode, int,
 				   rtx_insn *);
-extern void split_double_move (rtx *, enum machine_mode);
+extern void visium_split_double_move (rtx *, enum machine_mode);
+extern void visium_split_double_add (enum rtx_code, rtx, rtx, rtx);
 extern void visium_expand_copysign (rtx *, enum machine_mode);
 extern void visium_expand_int_cstore (rtx *, enum machine_mode);
 extern void visium_expand_fp_cstore (rtx *, enum machine_mode);
diff --git a/gcc/config/visium/visium.c b/gcc/config/visium/visium.c
index cd28f9bf90a..6712fed72bc 100644
--- a/gcc/config/visium/visium.c
+++ b/gcc/config/visium/visium.c
@@ -2026,7 +2026,7 @@ visium_rtx_costs (rtx x, machine_mode mode, int outer_code ATTRIBUTE_UNUSED,
 /* Split a double move of OPERANDS in MODE.  */
 
 void
-split_double_move (rtx *operands, enum machine_mode mode)
+visium_split_double_move (rtx *operands, enum machine_mode mode)
 {
   bool swap = false;
 
@@ -2076,14 +2076,74 @@ split_double_move (rtx *operands, enum machine_mode mode)
     }
 }
 
+/* Split a double addition or subtraction of operands.  */
+
+void
+visium_split_double_add (enum rtx_code code, rtx op0, rtx op1, rtx op2)
+{
+  rtx op3 = gen_lowpart (SImode, op0);
+  rtx op4 = gen_lowpart (SImode, op1);
+  rtx op5;
+  rtx op6 = gen_highpart (SImode, op0);
+  rtx op7 = (op1 == const0_rtx ? op1 : gen_highpart (SImode, op1));
+  rtx op8;
+  rtx x, pat, flags;
+
+  /* If operand #2 is a small constant, then its high part is null.  */
+  if (CONST_INT_P (op2))
+    {
+      HOST_WIDE_INT val = INTVAL (op2);
+
+      if (val < 0)
+	{
+	  code = (code == MINUS ? PLUS : MINUS);
+	  val = -val;
+	}
+
+      op5 = gen_int_mode (val, SImode);
+      op8 = const0_rtx;
+    }
+  else
+    {
+      op5 = gen_lowpart (SImode, op2);
+      op8 = gen_highpart (SImode, op2);
+    }
+
+  /* This is the {add,sub,neg}si3_insn_set_flags pattern.  */
+  if (op4 == const0_rtx)
+    x = gen_rtx_NEG (SImode, op5);
+  else
+    x = gen_rtx_fmt_ee (code, SImode, op4, op5);
+  pat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
+  XVECEXP (pat, 0, 0) = gen_rtx_SET (op3, x);
+  flags = gen_rtx_REG (CC_NOOVmode, FLAGS_REGNUM);
+  x = gen_rtx_COMPARE (CC_NOOVmode, shallow_copy_rtx (x), const0_rtx);
+  XVECEXP (pat, 0, 1) = gen_rtx_SET (flags, x);
+  emit_insn (pat);
+
+  /* This is the plus_[plus_]sltu_flags or minus_[minus_]sltu_flags pattern.  */
+  if (op8 == const0_rtx)
+    x = op7;
+  else
+    x = gen_rtx_fmt_ee (code, SImode, op7, op8);
+  x = gen_rtx_fmt_ee (code, SImode, x, gen_rtx_LTU (SImode, flags, const0_rtx));
+  pat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
+  XVECEXP (pat, 0, 0) = gen_rtx_SET (op6, x);
+  flags = gen_rtx_REG (CCmode, FLAGS_REGNUM);
+  XVECEXP (pat, 0, 1) = gen_rtx_CLOBBER (VOIDmode, flags);
+  emit_insn (pat);
+
+  visium_flags_exposed = true;
+}
+
 /* Expand a copysign of OPERANDS in MODE.  */
 
 void
 visium_expand_copysign (rtx *operands, enum machine_mode mode)
 {
-  rtx dest = operands[0];
-  rtx op0 = operands[1];
-  rtx op1 = operands[2];
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  rtx op2 = operands[2];
   rtx mask = force_reg (SImode, GEN_INT (0x7fffffff));
   rtx x;
 
@@ -2091,37 +2151,37 @@ visium_expand_copysign (rtx *operands, enum machine_mode mode)
      the FPU on the MCM have a non-standard behavior wrt NaNs.  */
   gcc_assert (mode == SFmode);
 
-  /* First get all the non-sign bits of OP0.  */
-  if (GET_CODE (op0) == CONST_DOUBLE)
+  /* First get all the non-sign bits of op1.  */
+  if (GET_CODE (op1) == CONST_DOUBLE)
     {
-      if (real_isneg (CONST_DOUBLE_REAL_VALUE (op0)))
-	op0 = simplify_unary_operation (ABS, mode, op0, mode);
-      if (op0 != CONST0_RTX (mode))
+      if (real_isneg (CONST_DOUBLE_REAL_VALUE (op1)))
+	op1 = simplify_unary_operation (ABS, mode, op1, mode);
+      if (op1 != CONST0_RTX (mode))
 	{
 	  long l;
-	  REAL_VALUE_TO_TARGET_SINGLE (*CONST_DOUBLE_REAL_VALUE (op0), l);
-	  op0 = force_reg (SImode, GEN_INT (trunc_int_for_mode (l, SImode)));
+	  REAL_VALUE_TO_TARGET_SINGLE (*CONST_DOUBLE_REAL_VALUE (op1), l);
+	  op1 = force_reg (SImode, gen_int_mode (l, SImode));
 	}
     }
   else
     {
-      op0 = copy_to_mode_reg (SImode, gen_lowpart (SImode, op0));
-      op0 = force_reg (SImode, gen_rtx_AND (SImode, op0, mask));
+      op1 = copy_to_mode_reg (SImode, gen_lowpart (SImode, op1));
+      op1 = force_reg (SImode, gen_rtx_AND (SImode, op1, mask));
     }
 
-  /* Then get the sign bit of OP1.  */
+  /* Then get the sign bit of op2.  */
   mask = force_reg (SImode, gen_rtx_NOT (SImode, mask));
-  op1 = copy_to_mode_reg (SImode, gen_lowpart (SImode, op1));
-  op1 = force_reg (SImode, gen_rtx_AND (SImode, op1, mask));
+  op2 = copy_to_mode_reg (SImode, gen_lowpart (SImode, op2));
+  op2 = force_reg (SImode, gen_rtx_AND (SImode, op2, mask));
 
   /* Finally OR the two values.  */
-  if (op0 == CONST0_RTX (SFmode))
-    x = op1;
+  if (op1 == CONST0_RTX (SFmode))
+    x = op2;
   else
-    x = force_reg (SImode, gen_rtx_IOR (SImode, op0, op1));
+    x = force_reg (SImode, gen_rtx_IOR (SImode, op1, op2));
 
   /* And move the result to the destination.  */
-  emit_insn (gen_rtx_SET (dest, gen_lowpart (SFmode, x)));
+  emit_insn (gen_rtx_SET (op0, gen_lowpart (SFmode, x)));
 }
 
 /* Expand a cstore of OPERANDS in MODE for EQ/NE/LTU/GTU/GEU/LEU.  We generate
@@ -3537,18 +3597,15 @@ visium_compute_frame_size (int size)
 int
 visium_initial_elimination_offset (int from, int to ATTRIBUTE_UNUSED)
 {
-  const int frame_size = visium_compute_frame_size (get_frame_size ());
   const int save_fp = current_frame_info.save_fp;
   const int save_lr = current_frame_info.save_lr;
   const int lr_slot = current_frame_info.lr_slot;
-  const int local_frame_offset
-    = (save_fp + save_lr + lr_slot) * UNITS_PER_WORD;
   int offset;
 
   if (from == FRAME_POINTER_REGNUM)
-    offset = local_frame_offset;
+    offset = (save_fp + save_lr + lr_slot) * UNITS_PER_WORD;
   else if (from == ARG_POINTER_REGNUM)
-    offset = frame_size;
+    offset = visium_compute_frame_size (get_frame_size ());
   else
     gcc_unreachable ();
 
diff --git a/gcc/config/visium/visium.md b/gcc/config/visium/visium.md
index 09d136f5f0f..41e3e5c5719 100644
--- a/gcc/config/visium/visium.md
+++ b/gcc/config/visium/visium.md
@@ -627,7 +627,7 @@
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 5))]
 {
-  split_double_move (operands, DImode);
+  visium_split_double_move (operands, DImode);
 })
 
 ;;
@@ -726,7 +726,7 @@
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 5))]
 {
-  split_double_move (operands, DFmode);
+  visium_split_double_move (operands, DFmode);
 })
 
 ;;
@@ -815,31 +815,20 @@
 		 (match_operand:DI 2 "add_operand" "")))]
   "")
 
+; Disfavour the use of add.l because of the early clobber.
+
 (define_insn_and_split "*addi3_insn"
   [(set (match_operand:DI 0 "register_operand"          "=r,r,&r")
 	(plus:DI (match_operand:DI 1 "register_operand" "%0,0, r")
-		 (match_operand:DI 2 "add_operand"      " J,L, r")))]
+		 (match_operand:DI 2 "add_operand"      " L,J, r")))]
   "ok_for_simple_arith_logic_operands (operands, DImode)"
   "#"
   "reload_completed"
-  [(parallel [(set (match_dup 0)
-		   (plus:DI (match_dup 1) (match_dup 2)))
-	      (clobber (reg:CC R_FLAGS))])]
-  ""
-  [(set_attr "type" "arith2")])
-
-; Disfavour the use of add.l because of the early clobber.
-
-(define_insn "*adddi3_insn_flags"
-  [(set (match_operand:DI 0 "register_operand"          "=r,r,&r")
-	(plus:DI (match_operand:DI 1 "register_operand" "%0,0, r")
-		 (match_operand:DI 2 "add_operand"      " J,L, r")))
-   (clobber (reg:CC R_FLAGS))]
-  "reload_completed"
-  "@
-    addi    %d0,%2\n\tadc.l   %0,%0,r0
-    subi    %d0,%n2\n\tsubc.l  %0,%0,r0
-    add.l   %d0,%d1,%d2\n\tadc.l   %0,%1,%2"
+  [(const_int 0)]
+{
+  visium_split_double_add (PLUS, operands[0], operands[1], operands[2]);
+  DONE;
+}
   [(set_attr "type" "arith2")])
 
 ;;
@@ -847,7 +836,7 @@
 ;;
 ;; Integer Add with Carry
 ;;
-;; Only SI mode is supported as slt[u] for the sake of cstore.
+;; Only SI mode is supported.
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
@@ -869,6 +858,16 @@
   "adc.l   %0,%1,r0"
   [(set_attr "type" "arith")])
 
+(define_insn "*plus_plus_sltu<subst_arith>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(plus:SI (plus:SI (match_operand:SI 1 "register_operand" "r")
+			  (match_operand:SI 2 "register_operand" "r"))
+		 (ltu:SI (reg R_FLAGS) (const_int 0))))
+   (clobber (reg:CC R_FLAGS))]
+  "reload_completed"
+  "adc.l   %0,%1,%2"
+  [(set_attr "type" "arith")])
+
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
@@ -955,31 +954,20 @@
 		  (match_operand:DI 2 "add_operand" "")))]
   "")
 
+; Disfavour the use of sub.l because of the early clobber.
+
 (define_insn_and_split "*subdi3_insn"
   [(set (match_operand:DI 0 "register_operand"           "=r,r,&r")
 	(minus:DI (match_operand:DI 1 "register_operand" " 0,0, r")
-		  (match_operand:DI 2 "add_operand"      " J,L, r")))]
+		  (match_operand:DI 2 "add_operand"      " L,J, r")))]
   "ok_for_simple_arith_logic_operands (operands, DImode)"
   "#"
   "reload_completed"
-  [(parallel [(set (match_dup 0)
-		   (minus:DI (match_dup 1) (match_dup 2)))
-	      (clobber (reg:CC R_FLAGS))])]
- ""
-  [(set_attr "type" "arith2")])
-
-; Disfavour the use of the sub.l because of the early clobber.
-
-(define_insn "*subdi3_insn_flags"
-  [(set (match_operand:DI 0 "register_operand"           "=r,r,&r")
-	(minus:DI (match_operand:DI 1 "register_operand" " 0,0, r")
-		  (match_operand:DI 2 "add_operand"      " J,L, r")))
-   (clobber (reg:CC R_FLAGS))]
-  "reload_completed"
-  "@
-    subi    %d0,%2\n\tsubc.l  %0,%0,r0
-    addi    %d0,%n2\n\tadc.l   %0,%0,r0
-    sub.l   %d0,%d1,%d2\n\tsubc.l  %0,%1,%2"
+  [(const_int 0)]
+{
+  visium_split_double_add (MINUS, operands[0], operands[1], operands[2]);
+  DONE;
+}
   [(set_attr "type" "arith2")])
 
 ;;
@@ -987,7 +975,7 @@
 ;;
 ;; Integer Subtract with Carry
 ;;
-;; Only SI mode is supported as neg<slt[u]> for the sake of cstore.
+;; Only SI mode is supported.
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
@@ -1009,6 +997,16 @@
   "subc.l  %0,%1,r0"
   [(set_attr "type" "arith")])
 
+(define_insn "*minus_minus_sltu<subst_arith>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(minus:SI (minus:SI (match_operand:SI 1 "reg_or_0_operand" "rO")
+			    (match_operand:SI 2 "register_operand" "r"))
+		  (ltu:SI (reg R_FLAGS) (const_int 0))))
+   (clobber (reg:CC R_FLAGS))]
+  "reload_completed"
+  "subc.l  %0,%r1,%2"
+  [(set_attr "type" "arith")])
+
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
@@ -1054,17 +1052,11 @@
   "ok_for_simple_arith_logic_operands (operands, DImode)"
   "#"
   "reload_completed"
-  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
-	      (clobber (reg:CC R_FLAGS))])]
-  ""
-  [(set_attr "type" "arith2")])
-
-(define_insn "*negdi2_insn_flags"
-  [(set (match_operand:DI 0 "register_operand" "=&r")
-	(neg:DI (match_operand:DI 1 "register_operand" "r")))
-   (clobber (reg:CC R_FLAGS))]
-  "reload_completed"
-  "sub.l   %d0,r0,%d1\n\tsubc.l  %0,r0,%1"
+  [(const_int 0)]
+{
+  visium_split_double_add (MINUS, operands[0], const0_rtx, operands[1]);
+  DONE;
+}
   [(set_attr "type" "arith2")])
 
 ;;
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 1a421b62f04..9cb17b46c7d 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,7 +1,28 @@
+2016-05-31  Martin Sebor  <msebor@redhat.com>
+
+	PR c++/71306
+	* init.c (warn_placement_new_too_small): Handle placement new arguments
+	that are elements of arrays more carefully.  Remove a pointless loop.
+
+2016-05-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/71349
+	* parser.c (cp_parser_omp_for): Don't disallow nowait clause
+	when combined with target construct.
+	(cp_parser_omp_parallel): Pass cclauses == NULL as last argument
+	to cp_parser_omp_all_clauses.
+
+2016-05-29  Paolo Carlini  <paolo.carlini@oracle.com>
+
+	PR c++/71105
+	* lambda.c (maybe_add_lambda_conv_op): Early return also when
+	LAMBDA_EXPR_DEFAULT_CAPTURE_MODE != CPLD_NONE.
+
 2016-05-24  Martin Sebor  <msebor@redhat.com>
 
 	PR c++/71147
-	* decl.c (layout_var_decl, grokdeclarator): Use complete_or_array_type_p.
+	* decl.c (layout_var_decl, grokdeclarator): Use
+	complete_or_array_type_p.
 	* pt.c (instantiate_class_template_1): Try to complete the element
 	type of a flexible array member.
 	(can_complete_type_without_circularity): Handle arrays of unknown bound.
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 5997d53ddb5..5e2393b788e 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -2375,7 +2375,8 @@ warn_placement_new_too_small (tree type, tree nelts, tree size, tree oper)
 
   STRIP_NOPS (oper);
 
-  if (TREE_CODE (oper) == ARRAY_REF)
+  if (TREE_CODE (oper) == ARRAY_REF
+      && (addr_expr || TREE_CODE (TREE_TYPE (oper)) == ARRAY_TYPE))
     {
       /* Similar to the offset computed above, see if the array index
 	 is a compile-time constant.  If so, and unless the offset was
@@ -2404,8 +2405,8 @@ warn_placement_new_too_small (tree type, tree nelts, tree size, tree oper)
   bool compref = TREE_CODE (oper) == COMPONENT_REF;
 
   /* Descend into a struct or union to find the member whose address
-     is being used as the agument.  */
-  while (TREE_CODE (oper) == COMPONENT_REF)
+     is being used as the argument.  */
+  if (TREE_CODE (oper) == COMPONENT_REF)
     {
       tree op0 = oper;
       while (TREE_CODE (op0 = TREE_OPERAND (op0, 0)) == COMPONENT_REF);
diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index cdc11febcff..539a01a089f 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -871,8 +871,10 @@ maybe_add_lambda_conv_op (tree type)
   bool nested = (cfun != NULL);
   bool nested_def = decl_function_context (TYPE_MAIN_DECL (type));
   tree callop = lambda_function (type);
+  tree lam = CLASSTYPE_LAMBDA_EXPR (type);
 
-  if (LAMBDA_EXPR_CAPTURE_LIST (CLASSTYPE_LAMBDA_EXPR (type)) != NULL_TREE)
+  if (LAMBDA_EXPR_CAPTURE_LIST (lam) != NULL_TREE
+      || LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lam) != CPLD_NONE)
     return;
 
   if (processing_template_decl)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index d1015a63476..d5c2d1888ad 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -33885,7 +33885,9 @@ cp_parser_omp_for (cp_parser *parser, cp_token *pragma_tok,
 
   strcat (p_name, " for");
   mask |= OMP_FOR_CLAUSE_MASK;
-  if (cclauses)
+  /* parallel for{, simd} disallows nowait clause, but for
+     target {teams distribute ,}parallel for{, simd} it should be accepted.  */
+  if (cclauses && (mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_MAP)) == 0)
     mask &= ~(OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NOWAIT);
   /* Composite distribute parallel for{, simd} disallows ordered clause.  */
   if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DIST_SCHEDULE)) != 0)
@@ -34224,7 +34226,8 @@ cp_parser_omp_parallel (cp_parser *parser, cp_token *pragma_tok,
 	}
     }
 
-  clauses = cp_parser_omp_all_clauses (parser, mask, p_name, pragma_tok);
+  clauses = cp_parser_omp_all_clauses (parser, mask, p_name, pragma_tok,
+				       cclauses == NULL);
   if (cclauses)
     {
       cp_omp_split_clauses (loc, OMP_PARALLEL, mask, clauses, cclauses);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3c91b1d5758..02fdc1ea470 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -15937,6 +15937,18 @@ void vec_st (vector double, int, vector double *);
 void vec_st (vector double, int, double *);
 vector double vec_sub (vector double, vector double);
 vector double vec_trunc (vector double);
+vector double vec_xl (int, vector double *);
+vector double vec_xl (int, double *);
+vector long long vec_xl (int, vector long long *);
+vector long long vec_xl (int, long long *);
+vector unsigned long long vec_xl (int, vector unsigned long long *);
+vector unsigned long long vec_xl (int, unsigned long long *);
+vector float vec_xl (int, vector float *);
+vector float vec_xl (int, float *);
+vector int vec_xl (int, vector int *);
+vector int vec_xl (int, int *);
+vector unsigned int vec_xl (int, vector unsigned int *);
+vector unsigned int vec_xl (int, unsigned int *);
 vector double vec_xor (vector double, vector double);
 vector double vec_xor (vector double, vector bool long);
 vector double vec_xor (vector bool long, vector double);
@@ -15946,6 +15958,18 @@ vector long vec_xor (vector bool long, vector long);
 vector unsigned long vec_xor (vector unsigned long, vector unsigned long);
 vector unsigned long vec_xor (vector unsigned long, vector bool long);
 vector unsigned long vec_xor (vector bool long, vector unsigned long);
+void vec_xst (vector double, int, vector double *);
+void vec_xst (vector double, int, double *);
+void vec_xst (vector long long, int, vector long long *);
+void vec_xst (vector long long, int, long long *);
+void vec_xst (vector unsigned long long, int, vector unsigned long long *);
+void vec_xst (vector unsigned long long, int, unsigned long long *);
+void vec_xst (vector float, int, vector float *);
+void vec_xst (vector float, int, float *);
+void vec_xst (vector int, int, vector int *);
+void vec_xst (vector int, int, int *);
+void vec_xst (vector unsigned int, int, vector unsigned int *);
+void vec_xst (vector unsigned int, int, unsigned int *);
 int vec_all_eq (vector double, vector double);
 int vec_all_ge (vector double, vector double);
 int vec_all_gt (vector double, vector double);
@@ -16060,7 +16084,7 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
 @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
 If the ISA 2.07 additions to the vector/scalar (power8-vector)
-instruction set is available, the following additional functions are
+instruction set are available, the following additional functions are
 available for both 32-bit and 64-bit targets.  For 64-bit targets, you
 can use @var{vector long} instead of @var{vector long long},
 @var{vector bool long} instead of @var{vector bool long long}, and
@@ -16373,7 +16397,7 @@ vector unsigned long long vec_vupklsw (vector int);
 @end smallexample
 
 If the ISA 2.07 additions to the vector/scalar (power8-vector)
-instruction set is available, the following additional functions are
+instruction set are available, the following additional functions are
 available for 64-bit targets.  New vector types
 (@var{vector __int128_t} and @var{vector __uint128_t}) are available
 to hold the @var{__int128_t} and @var{__uint128_t} types to use these
@@ -16488,6 +16512,30 @@ The second argument to the @var{__builtin_crypto_vshasigmad} and
 integer that is 0 or 1.  The third argument to these builtin functions
 must be a constant integer in the range of 0 to 15.
 
+If the ISA 3.00 additions to the vector/scalar (power9-vector)
+instruction set are available, the following additional functions are
+available for both 32-bit and 64-bit targets.
+
+@smallexample
+vector short vec_xl (int, vector short *);
+vector short vec_xl (int, short *);
+vector unsigned short vec_xl (int, vector unsigned short *);
+vector unsigned short vec_xl (int, unsigned short *);
+vector char vec_xl (int, vector char *);
+vector char vec_xl (int, char *);
+vector unsigned char vec_xl (int, vector unsigned char *);
+vector unsigned char vec_xl (int, unsigned char *);
+
+void vec_xst (vector short, int, vector short *);
+void vec_xst (vector short, int, short *);
+void vec_xst (vector unsigned short, int, vector unsigned short *);
+void vec_xst (vector unsigned short, int, unsigned short *);
+void vec_xst (vector char, int, vector char *);
+void vec_xst (vector char, int, char *);
+void vec_xst (vector unsigned char, int, vector unsigned char *);
+void vec_xst (vector unsigned char, int, unsigned char *);
+@end smallexample
+
 @node PowerPC Hardware Transactional Memory Built-in Functions
 @subsection PowerPC Hardware Transactional Memory Built-in Functions
 GCC provides two interfaces for accessing the Hardware Transactional
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 3c1696b0a0c..f01c15bbe9e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19401,11 +19401,13 @@ gen_entry_point_die (tree decl, dw_die_ref context_die)
 static void
 retry_incomplete_types (void)
 {
+  set_early_dwarf s;
   int i;
 
   for (i = vec_safe_length (incomplete_types) - 1; i >= 0; i--)
     if (should_emit_struct_debug ((*incomplete_types)[i], DINFO_USAGE_DIR_USE))
       gen_type_die ((*incomplete_types)[i], comp_unit_die ());
+  vec_safe_truncate (incomplete_types, 0);
 }
 
 /* Determine what tag to use for a record type.  */
@@ -27382,10 +27384,6 @@ dwarf2out_finish (const char *filename)
   resolve_addr (comp_unit_die ());
   move_marked_base_types ();
 
-  /* Walk through the list of incomplete types again, trying once more to
-     emit full debugging info for them.  */
-  retry_incomplete_types ();
-
   if (flag_eliminate_unused_debug_types)
     prune_unused_types ();
 
@@ -27686,6 +27684,10 @@ dwarf2out_finish (const char *filename)
 static void
 dwarf2out_early_finish (void)
 {
+  /* Walk through the list of incomplete types again, trying once more to
+     emit full debugging info for them.  */
+  retry_incomplete_types ();
+
   /* The point here is to flush out the limbo list so that it is empty
      and we don't need to stream it for LTO.  */
   flush_limbo_die_list ();
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 0cb09f4736e..79ff3f92371 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -117,14 +117,8 @@ static enum tree_code compcode_to_comparison (enum comparison_code);
 static int operand_equal_for_comparison_p (tree, tree, tree);
 static int twoval_comparison_p (tree, tree *, tree *, int *);
 static tree eval_subst (location_t, tree, tree, tree, tree, tree);
-static tree make_bit_field_ref (location_t, tree, tree,
-				HOST_WIDE_INT, HOST_WIDE_INT, int, int);
 static tree optimize_bit_field_compare (location_t, enum tree_code,
 					tree, tree, tree);
-static tree decode_field_reference (location_t, tree, HOST_WIDE_INT *,
-				    HOST_WIDE_INT *,
-				    machine_mode *, int *, int *, int *,
-				    tree *, tree *);
 static int simple_operand_p (const_tree);
 static bool simple_operand_p_2 (tree);
 static tree range_binop (enum tree_code, tree, tree, int, tree, int);
@@ -3782,15 +3776,23 @@ distribute_real_division (location_t loc, enum tree_code code, tree type,
 
 /* Return a BIT_FIELD_REF of type TYPE to refer to BITSIZE bits of INNER
    starting at BITPOS.  The field is unsigned if UNSIGNEDP is nonzero
-   and uses reverse storage order if REVERSEP is nonzero.  */
+   and uses reverse storage order if REVERSEP is nonzero.  ORIG_INNER
+   is the original memory reference used to preserve the alias set of
+   the access.  */
 
 static tree
-make_bit_field_ref (location_t loc, tree inner, tree type,
+make_bit_field_ref (location_t loc, tree inner, tree orig_inner, tree type,
 		    HOST_WIDE_INT bitsize, HOST_WIDE_INT bitpos,
 		    int unsignedp, int reversep)
 {
   tree result, bftype;
 
+  if (get_alias_set (inner) != get_alias_set (orig_inner))
+    inner = fold_build2 (MEM_REF, TREE_TYPE (inner),
+			 build_fold_addr_expr (inner),
+			 build_int_cst
+			  (reference_alias_ptr_type (orig_inner), 0));
+
   if (bitpos == 0 && !reversep)
     {
       tree size = TYPE_SIZE (TREE_TYPE (inner));
@@ -3916,13 +3918,13 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
        and return.  */
     return fold_build2_loc (loc, code, compare_type,
 			fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-				     make_bit_field_ref (loc, linner,
+				     make_bit_field_ref (loc, linner, lhs,
 							 unsigned_type,
 							 nbitsize, nbitpos,
 							 1, lreversep),
 				     mask),
 			fold_build2_loc (loc, BIT_AND_EXPR, unsigned_type,
-				     make_bit_field_ref (loc, rinner,
+				     make_bit_field_ref (loc, rinner, rhs,
 							 unsigned_type,
 							 nbitsize, nbitpos,
 							 1, rreversep),
@@ -3967,8 +3969,8 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
   /* Make a new bitfield reference, shift the constant over the
      appropriate number of bits and mask it with the computed mask
      (in case this was a signed field).  If we changed it, make a new one.  */
-  lhs = make_bit_field_ref (loc, linner, unsigned_type, nbitsize, nbitpos, 1,
-			    lreversep);
+  lhs = make_bit_field_ref (loc, linner, lhs, unsigned_type,
+			    nbitsize, nbitpos, 1, lreversep);
 
   rhs = const_binop (BIT_AND_EXPR,
 		     const_binop (LSHIFT_EXPR,
@@ -4007,11 +4009,12 @@ optimize_bit_field_compare (location_t loc, enum tree_code code,
    do anything with.  */
 
 static tree
-decode_field_reference (location_t loc, tree exp, HOST_WIDE_INT *pbitsize,
+decode_field_reference (location_t loc, tree *exp_, HOST_WIDE_INT *pbitsize,
 			HOST_WIDE_INT *pbitpos, machine_mode *pmode,
 			int *punsignedp, int *preversep, int *pvolatilep,
 			tree *pmask, tree *pand_mask)
 {
+  tree exp = *exp_;
   tree outer_type = 0;
   tree and_mask = 0;
   tree mask, inner, offset;
@@ -4048,6 +4051,8 @@ decode_field_reference (location_t loc, tree exp, HOST_WIDE_INT *pbitsize,
       || TREE_CODE (inner) == PLACEHOLDER_EXPR)
     return 0;
 
+  *exp_ = exp;
+
   /* If the number of bits in the reference is the same as the bitsize of
      the outer type, then the outer type gives the signedness. Otherwise
      (in case of a small bitfield) the signedness is unchanged.  */
@@ -5656,19 +5661,19 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
 
   ll_reversep = lr_reversep = rl_reversep = rr_reversep = 0;
   volatilep = 0;
-  ll_inner = decode_field_reference (loc, ll_arg,
+  ll_inner = decode_field_reference (loc, &ll_arg,
 				     &ll_bitsize, &ll_bitpos, &ll_mode,
 				     &ll_unsignedp, &ll_reversep, &volatilep,
 				     &ll_mask, &ll_and_mask);
-  lr_inner = decode_field_reference (loc, lr_arg,
+  lr_inner = decode_field_reference (loc, &lr_arg,
 				     &lr_bitsize, &lr_bitpos, &lr_mode,
 				     &lr_unsignedp, &lr_reversep, &volatilep,
 				     &lr_mask, &lr_and_mask);
-  rl_inner = decode_field_reference (loc, rl_arg,
+  rl_inner = decode_field_reference (loc, &rl_arg,
 				     &rl_bitsize, &rl_bitpos, &rl_mode,
 				     &rl_unsignedp, &rl_reversep, &volatilep,
 				     &rl_mask, &rl_and_mask);
-  rr_inner = decode_field_reference (loc, rr_arg,
+  rr_inner = decode_field_reference (loc, &rr_arg,
 				     &rr_bitsize, &rr_bitpos, &rr_mode,
 				     &rr_unsignedp, &rr_reversep, &volatilep,
 				     &rr_mask, &rr_and_mask);
@@ -5830,12 +5835,14 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
       lr_mask = const_binop (BIT_IOR_EXPR, lr_mask, rr_mask);
       if (lnbitsize == rnbitsize && xll_bitpos == xlr_bitpos)
 	{
-	  lhs = make_bit_field_ref (loc, ll_inner, lntype, lnbitsize, lnbitpos,
+	  lhs = make_bit_field_ref (loc, ll_inner, ll_arg,
+				    lntype, lnbitsize, lnbitpos,
 				    ll_unsignedp || rl_unsignedp, ll_reversep);
 	  if (! all_ones_mask_p (ll_mask, lnbitsize))
 	    lhs = build2 (BIT_AND_EXPR, lntype, lhs, ll_mask);
 
-	  rhs = make_bit_field_ref (loc, lr_inner, rntype, rnbitsize, rnbitpos,
+	  rhs = make_bit_field_ref (loc, lr_inner, lr_arg,
+				    rntype, rnbitsize, rnbitpos,
 				    lr_unsignedp || rr_unsignedp, lr_reversep);
 	  if (! all_ones_mask_p (lr_mask, rnbitsize))
 	    rhs = build2 (BIT_AND_EXPR, rntype, rhs, lr_mask);
@@ -5857,11 +5864,11 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
 	{
 	  tree type;
 
-	  lhs = make_bit_field_ref (loc, ll_inner, lntype,
+	  lhs = make_bit_field_ref (loc, ll_inner, ll_arg, lntype,
 				    ll_bitsize + rl_bitsize,
 				    MIN (ll_bitpos, rl_bitpos),
 				    ll_unsignedp, ll_reversep);
-	  rhs = make_bit_field_ref (loc, lr_inner, rntype,
+	  rhs = make_bit_field_ref (loc, lr_inner, lr_arg, rntype,
 				    lr_bitsize + rr_bitsize,
 				    MIN (lr_bitpos, rr_bitpos),
 				    lr_unsignedp, lr_reversep);
@@ -5926,7 +5933,8 @@ fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type,
      reference we will make.  Unless the mask is all ones the width of
      that field, perform the mask operation.  Then compare with the
      merged constant.  */
-  result = make_bit_field_ref (loc, ll_inner, lntype, lnbitsize, lnbitpos,
+  result = make_bit_field_ref (loc, ll_inner, ll_arg,
+			       lntype, lnbitsize, lnbitpos,
 			       ll_unsignedp || rl_unsignedp, ll_reversep);
 
   ll_mask = const_binop (BIT_IOR_EXPR, ll_mask, rl_mask);
@@ -11632,9 +11640,9 @@ fold_ternary_loc (location_t loc, enum tree_code code, tree type,
       /* Convert A ? 0 : 1 to !A.  This prefers the use of NOT_EXPR
 	 over COND_EXPR in cases such as floating point comparisons.  */
       if (integer_zerop (op1)
-	  && (code == VEC_COND_EXPR ? integer_all_onesp (op2)
-				    : (integer_onep (op2)
-				       && !VECTOR_TYPE_P (type)))
+	  && code == COND_EXPR
+	  && integer_onep (op2)
+	  && !VECTOR_TYPE_P (type)
 	  && truth_value_p (TREE_CODE (arg0)))
 	return pedantic_non_lvalue_loc (loc,
 				    fold_convert_loc (loc, type,
@@ -12306,7 +12314,8 @@ fold_checksum_tree (const_tree expr, struct md5_ctx *ctx,
 	       || TYPE_REFERENCE_TO (expr)
 	       || TYPE_CACHED_VALUES_P (expr)
 	       || TYPE_CONTAINS_PLACEHOLDER_INTERNAL (expr)
-	       || TYPE_NEXT_VARIANT (expr)))
+	       || TYPE_NEXT_VARIANT (expr)
+	       || TYPE_ALIAS_SET_KNOWN_P (expr)))
     {
       /* Allow these fields to be modified.  */
       tree tmp;
@@ -12316,6 +12325,7 @@ fold_checksum_tree (const_tree expr, struct md5_ctx *ctx,
       TYPE_POINTER_TO (tmp) = NULL;
       TYPE_REFERENCE_TO (tmp) = NULL;
       TYPE_NEXT_VARIANT (tmp) = NULL;
+      TYPE_ALIAS_SET (tmp) = -1;
       if (TYPE_CACHED_VALUES_P (tmp))
 	{
 	  TYPE_CACHED_VALUES_P (tmp) = 0;
diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index aa16784c11b..78236544001 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -1,3 +1,10 @@
+2016-05-26  Jerry DeLisle  <jvdelisle@gcc.gnu.org>
+
+	Backport from trunk.
+	PR fortran/66461
+	* scanner.c (gfc_next_char_literal): Clear end_flag when adjusting
+	current locus back to old_locus.
+
 2016-05-20  Jakub Jelinek  <jakub@redhat.com>
 
 	PR fortran/71204
diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index f4dedd69757..6a7a5b68bb3 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -1556,6 +1556,7 @@ restart:
 not_continuation:
   c = '\n';
   gfc_current_locus = old_loc;
+  end_flag = 0;
 
 done:
   if (c == '\n')
diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 049a4c5ed3f..fb9c8468ebc 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -1075,9 +1075,7 @@ bb_contains_loop_close_phi_nodes (basic_block bb)
 static bool
 bb_contains_loop_phi_nodes (basic_block bb)
 {
-  gcc_assert (EDGE_COUNT (bb->preds) <= 2);
-
-  if (bb->preds->length () == 1)
+  if (EDGE_COUNT (bb->preds) != 2)
     return false;
 
   unsigned depth = loop_depth (bb->loop_father);
@@ -1792,7 +1790,6 @@ get_def_bb_for_const (basic_block bb, basic_block old_bb) const
 	b1 = b2;
     }
 
-  gcc_assert (b1);
   return b1;
 }
 
@@ -2481,13 +2478,14 @@ copy_cond_phi_nodes (basic_block bb, basic_block new_bb, vec<tree> iv_map)
 
   gcc_assert (!bb_contains_loop_close_phi_nodes (bb));
 
+  /* TODO: Handle cond phi nodes with more than 2 predecessors.  */
+  if (EDGE_COUNT (bb->preds) != 2)
+    return false;
+
   if (dump_file)
     fprintf (dump_file, "[codegen] copying cond phi nodes in bb_%d.\n",
 	     new_bb->index);
 
-  /* Cond phi nodes should have exactly two arguments.  */
-  gcc_assert (2 == EDGE_COUNT (bb->preds));
-
   for (gphi_iterator psi = gsi_start_phis (bb); !gsi_end_p (psi);
        gsi_next (&psi))
     {
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b777057c055..578b50f08ee 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,100 @@
+2016-05-31  Martin Sebor  <msebor@redhat.com>
+
+	PR c++/71306
+	* g++.dg/warn/Wplacement-new-size-3.C: New test.
+
+2016-05-31  Richard Biener  <rguenther@suse.de>
+
+	Backport from mainline
+	2016-05-11  Richard Biener  <rguenther@suse.de>
+
+	PR debug/71057
+	* g++.dg/debug/pr71057.C: New testcase.
+
+2016-05-31  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
+
+	PR target/71056
+	* gcc.target/arm/pr71056.c: New test.
+
+2016-05-31  Tom de Vries  <tom@codesourcery.com>
+
+	backport:
+	2016-05-31  Tom de Vries  <tom@codesourcery.com>
+
+	PR tree-optimization/69068
+	* gcc.dg/graphite/pr69068.c: New test.
+
+2016-05-30  Jakub Jelinek  <jakub@redhat.com>
+
+	PR c++/71349
+	* c-c++-common/gomp/clauses-1.c (bar): Add dd argument.  Add
+	nowait depend(inout: dd[0]) clauses where permitted.
+
+2016-05-30  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
+
+	Backport from mainline
+	2016-04-29  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
+
+	* gcc.target/powerpc/vsx-elemrev-1.c: New.
+	* gcc.target/powerpc/vsx-elemrev-2.c: New.
+	* gcc.target/powerpc/vsx-elemrev-3.c: New.
+	* gcc.target/powerpc/vsx-elemrev-4.c: New.
+
+2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	backport:
+	2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	* gcc.dg/graphite/pr69067.c (main): Remove superfluous argument in call
+	to ce.
+
+2016-05-30  Uros Bizjak  <ubizjak@gmail.com>
+
+	* gcc.target/i386/iamcu/args.h (clear_non_sret_int_hardware_registers):
+	Use correct register when clearing %edx.
+
+2016-05-30  Richard Biener  <rguenther@suse.de>
+
+	Backport from mainline
+	2016-05-11  Richard Biener  <rguenther@suse.de>
+
+	PR middle-end/71002
+	* g++.dg/torture/pr71002.C: New testcase.
+
+	2016-05-13  Jakub Jelinek  <jakub@redhat.com>
+
+	PR bootstrap/71071
+	* gcc.dg/pr71071.c: New test.
+
+2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	backport:
+	2016-05-30  Tom de Vries  <tom@codesourcery.com>
+
+	PR tree-optimization/69067
+	* gcc.dg/graphite/pr69067.c: New test.
+
+2016-05-29  Paolo Carlini  <paolo.carlini@oracle.com>
+
+	PR c++/71105
+	* g++.dg/cpp0x/lambda/lambda-conv11.C: New.
+	* g++.dg/cpp1y/lambda-conv1.C: Likewise.
+	* g++.dg/cpp1y/lambda-conv2.C: Likewise.
+
+2016-05-27  Ilya Enkovich  <ilya.enkovich@intel.com>
+
+	Backport from mainline r236810.
+	2016-05-27  Ilya Enkovich  <ilya.enkovich@intel.com>
+
+	PR middle-end/71279
+	* gcc.dg/pr71279.c: New test.
+
+2016-05-26  Jerry DeLisle  <jvdelisle@gcc.gnu.org>
+
+	Backport from trunk.
+	PR fortran/66461
+	* gfortran.dg/unexpected_eof.f: New test.
+
 2016-05-25  Eric Botcazou  <ebotcazou@adacore.com>
 
 	* gnat.dg/opt55.ad[sb]: New test.
diff --git a/gcc/testsuite/ChangeLog.meissner b/gcc/testsuite/ChangeLog.meissner
index 480a2628c39..dce0e82245b 100644
--- a/gcc/testsuite/ChangeLog.meissner
+++ b/gcc/testsuite/ChangeLog.meissner
@@ -1,3 +1,7 @@
+2016-05-31  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	Merge up to 236941.
+
 2016-05-02  Michael Meissner  <meissner@linux.vnet.ibm.com>
 
 	* gcc.target/powerpc/float128-complex-1.c: New tests for complex
diff --git a/gcc/testsuite/c-c++-common/gomp/clauses-1.c b/gcc/testsuite/c-c++-common/gomp/clauses-1.c
index 91aed3960f6..fe90c2428e0 100644
--- a/gcc/testsuite/c-c++-common/gomp/clauses-1.c
+++ b/gcc/testsuite/c-c++-common/gomp/clauses-1.c
@@ -34,7 +34,7 @@ foo (int d, int m, int i1, int i2, int p, int *idp, int s,
 
 void
 bar (int d, int m, int i1, int i2, int p, int *idp, int s,
-     int nte, int tl, int nth, int g, int nta, int fi, int pp, int *q)
+     int nte, int tl, int nth, int g, int nta, int fi, int pp, int *q, int *dd)
 {
   #pragma omp for simd \
     private (p) firstprivate (f) lastprivate (l) linear (ll:1) reduction(+:r) schedule(static, 4) collapse(1) nowait \
@@ -63,29 +63,30 @@ bar (int d, int m, int i1, int i2, int p, int *idp, int s,
   }
   #pragma omp target parallel \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
-    if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread)
+    if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \
+    nowait depend(inout: dd[0])
     ;
   #pragma omp target parallel for \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
     if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \
-    lastprivate (l) linear (ll:1) ordered schedule(static, 4) collapse(1)
+    lastprivate (l) linear (ll:1) ordered schedule(static, 4) collapse(1) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp target parallel for simd \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
     if (parallel: i2) default(shared) shared(s) reduction(+:r) num_threads (nth) proc_bind(spread) \
     lastprivate (l) linear (ll:1) schedule(static, 4) collapse(1) \
-    safelen(8) simdlen(4) aligned(q: 32)
+    safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp target teams \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
-    shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl)
+    shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) nowait depend(inout: dd[0])
     ;
   #pragma omp target teams distribute \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
     shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \
-    collapse(1) dist_schedule(static, 16)
+    collapse(1) dist_schedule(static, 16) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ;
   #pragma omp target teams distribute parallel for \
@@ -93,7 +94,7 @@ bar (int d, int m, int i1, int i2, int p, int *idp, int s,
     shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \
     collapse(1) dist_schedule(static, 16) \
     if (parallel: i2) num_threads (nth) proc_bind(spread) \
-    lastprivate (l) schedule(static, 4)
+    lastprivate (l) schedule(static, 4) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp target teams distribute parallel for simd \
@@ -102,19 +103,20 @@ bar (int d, int m, int i1, int i2, int p, int *idp, int s,
     collapse(1) dist_schedule(static, 16) \
     if (parallel: i2) num_threads (nth) proc_bind(spread) \
     lastprivate (l) schedule(static, 4) \
-    safelen(8) simdlen(4) aligned(q: 32)
+    safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp target teams distribute simd \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
     shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \
     collapse(1) dist_schedule(static, 16) \
-    safelen(8) simdlen(4) aligned(q: 32)
+    safelen(8) simdlen(4) aligned(q: 32) nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp target simd \
     device(d) map (tofrom: m) if (target: i1) private (p) firstprivate (f) defaultmap(tofrom: scalar) is_device_ptr (idp) \
-    safelen(8) simdlen(4) lastprivate (l) linear(ll: 1) aligned(q: 32) reduction(+:r)
+    safelen(8) simdlen(4) lastprivate (l) linear(ll: 1) aligned(q: 32) reduction(+:r) \
+     nowait depend(inout: dd[0])
   for (int i = 0; i < 64; i++)
     ll++;
   #pragma omp taskloop simd \
@@ -128,7 +130,7 @@ bar (int d, int m, int i1, int i2, int p, int *idp, int s,
     safelen(8) simdlen(4) linear(ll: 1) aligned(q: 32) reduction(+:r)
   for (int i = 0; i < 64; i++)
     ll++;
-  #pragma omp target
+  #pragma omp target nowait depend(inout: dd[0])
   #pragma omp teams distribute \
     private(p) firstprivate (f) shared(s) default(shared) reduction(+:r) num_teams(nte) thread_limit(tl) \
     collapse(1) dist_schedule(static, 16)
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv11.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv11.C
new file mode 100644
index 00000000000..4b8d6487f5c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-conv11.C
@@ -0,0 +1,10 @@
+// PR c++/71105
+// { dg-do compile { target c++11 } }
+
+void foo()
+{
+  int i;
+  static_cast<void(*)()>([i]{});  // { dg-error "invalid static_cast" }
+  static_cast<void(*)()>([=]{});  // { dg-error "invalid static_cast" }
+  static_cast<void(*)()>([&]{});  // { dg-error "invalid static_cast" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-conv1.C b/gcc/testsuite/g++.dg/cpp1y/lambda-conv1.C
new file mode 100644
index 00000000000..2e4ec4964d5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-conv1.C
@@ -0,0 +1,13 @@
+// PR c++/71105
+// { dg-do compile { target c++14 } }
+
+void foo()
+{
+  int i;
+  static_cast<void(*)(int)>([i](auto){});  // { dg-error "invalid static_cast" }
+  static_cast<void(*)(int)>([=](auto){});  // { dg-error "invalid static_cast" }
+  static_cast<void(*)(int)>([&](auto){});  // { dg-error "invalid static_cast" }
+  static_cast<float(*)(float)>([i](auto x){ return x; });  // { dg-error "invalid static_cast" }
+  static_cast<float(*)(float)>([=](auto x){ return x; });  // { dg-error "invalid static_cast" }
+  static_cast<float(*)(float)>([&](auto x){ return x; });  // { dg-error "invalid static_cast" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-conv2.C b/gcc/testsuite/g++.dg/cpp1y/lambda-conv2.C
new file mode 100644
index 00000000000..45c0f3fe186
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-conv2.C
@@ -0,0 +1,23 @@
+// PR c++/71105
+// { dg-do compile { target c++14 } }
+
+template <typename T> T declval();
+template <typename, typename> struct is_same
+{ static constexpr bool value = false; };
+template <typename T> struct is_same<T, T>
+{ static constexpr bool value = true; };
+
+template <class F>
+struct indirected : F {
+  indirected(F f) : F(f) {}
+  template <class I>
+  auto operator()(I i) -> decltype(declval<F&>()(*i)) {
+    return static_cast<F&>(*this)(*i);
+  }
+};
+
+int main() {
+  auto f = [=](auto i) { return i + i; };
+  auto i = indirected<decltype(f)>{f};
+  static_assert(is_same<decltype(i(declval<int*>())), int>::value, "");
+}
diff --git a/gcc/testsuite/g++.dg/debug/pr71057.C b/gcc/testsuite/g++.dg/debug/pr71057.C
new file mode 100644
index 00000000000..2ed1eed988e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/pr71057.C
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-options "-g" }
+template <typename _Tp> using decay_t = _Tp;
+template <typename> struct A;
+template <typename> struct B { B(A<int>); };
+template <typename> struct C {
+      template <typename U> using constructor = B<decay_t<U>>;
+        typedef constructor<int> dummy;
+};
+template <typename> struct D {};
+C<int> a;
+D<B<int>> fn1() { fn1, a; }
diff --git a/gcc/testsuite/g++.dg/torture/pr71002.C b/gcc/testsuite/g++.dg/torture/pr71002.C
new file mode 100644
index 00000000000..8a726809217
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr71002.C
@@ -0,0 +1,160 @@
+// { dg-do run }
+
+using size_t = __SIZE_TYPE__;
+
+inline void* operator new(size_t, void* p) noexcept
+{ return p; }
+
+inline void operator delete(void*, void*)
+{ }
+
+struct long_t
+{
+  size_t is_short : 1;
+  size_t length   : (__SIZEOF_SIZE_T__ * __CHAR_BIT__ - 1);
+  size_t capacity;
+  char* pointer;
+};
+
+union long_raw_t {
+  unsigned char data[sizeof(long_t)];
+  struct __attribute__((aligned(alignof(long_t)))) { } align;
+};
+
+struct short_header
+{
+  unsigned char is_short : 1;
+  unsigned char length   : (__CHAR_BIT__ - 1);
+};
+
+struct short_t
+{
+  short_header h;
+  char data[23];
+};
+
+union repr_t
+{
+  long_raw_t  r;
+  short_t     s;
+
+  const short_t& short_repr() const
+  { return s; }
+
+  const long_t& long_repr() const
+  { return *static_cast<const long_t*>(static_cast<const void*>(&r)); }
+
+  short_t& short_repr()
+  { return s;  }
+
+  long_t& long_repr()
+  { return *static_cast<long_t*>(static_cast<void*>(&r)); }
+};
+
+class string
+{
+public:
+  string()
+  {
+    short_t& s = m_repr.short_repr();
+    s.h.is_short = 1;
+    s.h.length = 0;
+    s.data[0] = '\0';
+  }
+
+  string(const char* str)
+  {
+    size_t length = __builtin_strlen(str);
+    if (length + 1 > 23) {
+      long_t& l = m_repr.long_repr();
+      l.is_short = 0;
+      l.length = length;
+      l.capacity = length + 1;
+      l.pointer = new char[l.capacity];
+      __builtin_memcpy(l.pointer, str, length + 1);
+    } else {
+      short_t& s = m_repr.short_repr();
+      s.h.is_short = 1;
+      s.h.length = length;
+      __builtin_memcpy(s.data, str, length + 1);
+    }
+  }
+
+  string(string&& other)
+    : string{}
+  {
+    swap_data(other);
+  }
+
+  ~string()
+  {
+    if (!is_short()) {
+      delete[] m_repr.long_repr().pointer;
+    }
+  }
+
+  size_t length() const
+  { return is_short() ? short_length() : long_length(); }
+
+private:
+  bool is_short() const
+  { return m_repr.s.h.is_short != 0; }
+
+  size_t short_length() const
+  { return m_repr.short_repr().h.length; }
+
+  size_t long_length() const
+  { return m_repr.long_repr().length; }
+
+  void swap_data(string& other)
+  {
+    if (is_short()) {
+      if (other.is_short()) {
+        repr_t tmp(m_repr);
+        m_repr = other.m_repr;
+        other.m_repr = tmp;
+      } else {
+        short_t short_backup(m_repr.short_repr());
+        m_repr.short_repr().~short_t();
+        ::new(&m_repr.long_repr()) long_t(other.m_repr.long_repr());
+        other.m_repr.long_repr().~long_t();
+        ::new(&other.m_repr.short_repr()) short_t(short_backup);
+      }
+    } else {
+      if (other.is_short()) {
+        short_t short_backup(other.m_repr.short_repr());
+        other.m_repr.short_repr().~short_t();
+        ::new(&other.m_repr.long_repr()) long_t(m_repr.long_repr());
+        m_repr.long_repr().~long_t();
+        ::new(&m_repr.short_repr()) short_t(short_backup);
+      } else {
+        long_t tmp(m_repr.long_repr());
+        m_repr.long_repr() = other.m_repr.long_repr();
+        other.m_repr.long_repr() = tmp;
+      }
+    }
+  }
+
+  repr_t m_repr;
+};
+
+struct foo
+{
+  __attribute__((noinline))
+  foo(string str)
+    : m_str{static_cast<string&&>(str)},
+      m_len{m_str.length()}
+  { }
+
+  string m_str;
+  size_t m_len;
+};
+
+int main()
+{
+  foo f{"the quick brown fox jumps over the lazy dog"};
+  if (f.m_len == 0) {
+    __builtin_abort();
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wplacement-new-size-3.C b/gcc/testsuite/g++.dg/warn/Wplacement-new-size-3.C
new file mode 100644
index 00000000000..c93e4e698a7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wplacement-new-size-3.C
@@ -0,0 +1,40 @@
+// PR c++/71306 - bogus -Wplacement-new with an array element
+// { dg-do compile }
+// { dg-options "-Wplacement-new" }
+
+void* operator new (__SIZE_TYPE__, void *p) { return p; }
+
+struct S64 { char c [64]; };
+
+S64 s2 [2];
+S64* ps2 [2];
+S64* ps2_2 [2][2];
+
+void* pv2 [2];
+
+void f ()
+{
+  char a [2][sizeof (S64)];
+
+  new (a) S64;
+  new (a [0]) S64;
+  new (a [1]) S64;
+
+  // Verify there is no warning with buffers of sufficient size.
+  new (&s2 [0]) S64;
+  new (&s2 [1]) S64;
+
+  // ..and no warning with pointers to buffers of unknown size.
+  new (ps2 [0]) S64;
+  new (ps2 [1]) S64;
+
+  // But a warning when using the ps2_2 array itself as opposed
+  // to the pointers its elements might point to.
+  new (ps2_2 [0]) S64;	// { dg-warning "placement new" }
+  new (ps2_2 [1]) S64;	// { dg-warning "placement new" }
+
+  // ..and no warning again with pointers to buffers of unknown
+  // size.
+  new (pv2 [0]) S64;
+  new (pv2 [1]) S64;
+}
diff --git a/gcc/testsuite/gcc.dg/graphite/pr69067.c b/gcc/testsuite/gcc.dg/graphite/pr69067.c
new file mode 100644
index 00000000000..145ac822907
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr69067.c
@@ -0,0 +1,28 @@
+/* { dg-do link } */
+/* { dg-options " -O1 -floop-nest-optimize" } */
+/* { dg-additional-options "-flto" { target lto } } */
+
+int a1, c1, cr, kt;
+int aa[2];
+
+int
+ce (void)
+{
+  while (a1 < 1)
+    {
+      int g8;
+      for (g8 = 0; g8 < 3; ++g8)
+	if (c1 != 0)
+	  cr = aa[a1 * 2] = kt;
+      for (c1 = 0; c1 < 2; ++c1)
+	aa[c1] = cr;
+      ++a1;
+    }
+  return 0;
+}
+
+int
+main (void)
+{
+  return ce ();
+}
diff --git a/gcc/testsuite/gcc.dg/graphite/pr69068.c b/gcc/testsuite/gcc.dg/graphite/pr69068.c
new file mode 100644
index 00000000000..0abea060025
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/graphite/pr69068.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fgraphite-identity" } */
+
+int qo;
+int zh[2];
+
+void
+td (void)
+{
+  int ly, en;
+  for (ly = 0; ly < 2; ++ly)
+    for (en = 0; en < 2; ++en)
+      zh[en] = ((qo == 0) || (((qo * 2) != 0))) ? 1 : -1;
+}
diff --git a/gcc/testsuite/gcc.dg/pr71071.c b/gcc/testsuite/gcc.dg/pr71071.c
new file mode 100644
index 00000000000..582f1f15a43
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr71071.c
@@ -0,0 +1,12 @@
+/* PR bootstrap/71071 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct S { unsigned b : 1; } a;
+
+void
+foo ()
+{
+  if (a.b)
+    ;
+}
diff --git a/gcc/testsuite/gcc.dg/pr71279.c b/gcc/testsuite/gcc.dg/pr71279.c
new file mode 100644
index 00000000000..4ecc84b6425
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr71279.c
@@ -0,0 +1,14 @@
+/* PR middle-end/71279 */
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-march=knl" { target { i?86-*-* x86_64-*-* } } } */
+
+extern int a, b;
+long c[1][1][1];
+long d[1][1];
+
+void fn1 ()
+{
+  for (int e = 0; e < b; e = e + 1)
+    *(e + **c) = (a && *d[1]) - 1;
+}
diff --git a/gcc/testsuite/gcc.target/arm/pr71056.c b/gcc/testsuite/gcc.target/arm/pr71056.c
new file mode 100644
index 00000000000..136754eb13c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr71056.c
@@ -0,0 +1,32 @@
+/* PR target/71056.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_vfp3_ok } */
+/* { dg-options "-O3 -mfpu=vfpv3" } */
+
+/* Check that compiling for a non-NEON target doesn't try to introduce
+   a NEON vectorized builtin.  */
+
+extern char *buff;
+int f2 ();
+struct T1
+{
+  int reserved[2];
+  unsigned int ip;
+  unsigned short cs;
+  unsigned short rsrv2;
+};
+void
+f3 (const char *p)
+{
+  struct T1 x;
+  __builtin_memcpy (&x, p, sizeof (struct T1));
+  x.reserved[0] = __builtin_bswap32 (x.reserved[0]);
+  x.reserved[1] = __builtin_bswap32 (x.reserved[1]);
+  x.ip = __builtin_bswap32 (x.ip);
+  x.cs = x.cs << 8 | x.cs >> 8;
+  x.rsrv2 = x.rsrv2 << 8 | x.rsrv2 >> 8;
+  if (f2 ())
+    {
+      __builtin_memcpy (buff, "\n", 1);
+    }
+}
diff --git a/gcc/testsuite/gcc.target/i386/iamcu/args.h b/gcc/testsuite/gcc.target/i386/iamcu/args.h
index f8abde40155..67808ffb565 100644
--- a/gcc/testsuite/gcc.target/i386/iamcu/args.h
+++ b/gcc/testsuite/gcc.target/i386/iamcu/args.h
@@ -30,7 +30,7 @@ extern void *iamcu_memset (void *, int, size_t);
 /* Clear all scratch integer registers, excluding the one used to return
    aggregate.  */
 #define clear_non_sret_int_hardware_registers \
-  asm __volatile__ ("xor %%edx, %%ebx\n\t" \
+  asm __volatile__ ("xor %%edx, %%edx\n\t" \
 		    "xor %%ecx, %%ecx\n\t" \
 		    ::: "edx", "ecx");
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-1.c b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-1.c
new file mode 100644
index 00000000000..7ab6d446a23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-1.c
@@ -0,0 +1,143 @@
+/* { dg-do compile { target { powerpc64le*-*-* } } } */
+/* { dg-skip-if "do not override mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O0" } */
+/* { dg-final { scan-assembler-times "lxvd2x" 18 } } */
+/* { dg-final { scan-assembler-times "lxvw4x" 6 } } */
+/* { dg-final { scan-assembler-times "stxvd2x" 18 } } */
+/* { dg-final { scan-assembler-times "stxvw4x" 6 } } */
+/* { dg-final { scan-assembler-times "xxpermdi" 24 } } */
+
+#include <altivec.h>
+
+extern vector double vd, *vdp;
+extern vector signed long long vsll, *vsllp;
+extern vector unsigned long long vull, *vullp;
+extern vector float vf, *vfp;
+extern vector signed int vsi, *vsip;
+extern vector unsigned int vui, *vuip;
+extern double *dp;
+extern signed long long *sllp;
+extern unsigned long long *ullp;
+extern float *fp;
+extern signed int *sip;
+extern unsigned int *uip;
+
+void foo0 (void)
+{
+  vd = vec_xl (0, vdp);
+}
+
+void foo1 (void)
+{
+  vsll = vec_xl (0, vsllp);
+}
+
+void foo2 (void)
+{
+  vull = vec_xl (0, vullp);
+}
+
+void foo3 (void)
+{
+  vf = vec_xl (0, vfp);
+}
+
+void foo4 (void)
+{
+  vsi = vec_xl (0, vsip);
+}
+
+void foo5 (void)
+{
+  vui = vec_xl (0, vuip);
+}
+
+void foo6 (void)
+{
+  vec_xst (vd, 0, vdp);
+}
+
+void foo7 (void)
+{
+  vec_xst (vsll, 0, vsllp);
+}
+
+void foo8 (void)
+{
+  vec_xst (vull, 0, vullp);
+}
+
+void foo9 (void)
+{
+  vec_xst (vf, 0, vfp);
+}
+
+void foo10 (void)
+{
+  vec_xst (vsi, 0, vsip);
+}
+
+void foo11 (void)
+{
+  vec_xst (vui, 0, vuip);
+}
+
+void foo20 (void)
+{
+  vd = vec_xl (0, dp);
+}
+
+void foo21 (void)
+{
+  vsll = vec_xl (0, sllp);
+}
+
+void foo22 (void)
+{
+  vull = vec_xl (0, ullp);
+}
+
+void foo23 (void)
+{
+  vf = vec_xl (0, fp);
+}
+
+void foo24 (void)
+{
+  vsi = vec_xl (0, sip);
+}
+
+void foo25 (void)
+{
+  vui = vec_xl (0, uip);
+}
+
+void foo26 (void)
+{
+  vec_xst (vd, 0, dp);
+}
+
+void foo27 (void)
+{
+  vec_xst (vsll, 0, sllp);
+}
+
+void foo28 (void)
+{
+  vec_xst (vull, 0, ullp);
+}
+
+void foo29 (void)
+{
+  vec_xst (vf, 0, fp);
+}
+
+void foo30 (void)
+{
+  vec_xst (vsi, 0, sip);
+}
+
+void foo31 (void)
+{
+  vec_xst (vui, 0, uip);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-2.c b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-2.c
new file mode 100644
index 00000000000..f1c44039c76
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-2.c
@@ -0,0 +1,234 @@
+/* { dg-do compile { target { powerpc64le*-*-* } } } */
+/* { dg-skip-if "do not override mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O0" } */
+/* { dg-final { scan-assembler-times "lxvd2x" 6 } } */
+/* { dg-final { scan-assembler-times "lxvw4x" 6 } } */
+/* { dg-final { scan-assembler-times "lxvh8x" 4 } } */
+/* { dg-final { scan-assembler-times "lxvb16x" 4 } } */
+/* { dg-final { scan-assembler-times "stxvd2x" 6 } } */
+/* { dg-final { scan-assembler-times "stxvw4x" 6 } } */
+/* { dg-final { scan-assembler-times "stxvh8x" 4 } } */
+/* { dg-final { scan-assembler-times "stxvb16x" 4 } } */
+
+#include <altivec.h>
+
+extern vector double vd, *vdp;
+extern vector signed long long vsll, *vsllp;
+extern vector unsigned long long vull, *vullp;
+extern vector float vf, *vfp;
+extern vector signed int vsi, *vsip;
+extern vector unsigned int vui, *vuip;
+extern vector signed short vss, *vssp;
+extern vector unsigned short vus, *vusp;
+extern vector signed char vsc, *vscp;
+extern vector unsigned char vuc, *vucp;
+extern double *dp;
+extern signed long long *sllp;
+extern unsigned long long *ullp;
+extern float *fp;
+extern signed int *sip;
+extern unsigned int *uip;
+extern signed short *ssp;
+extern unsigned short *usp;
+extern signed char *scp;
+extern unsigned char *ucp;
+
+void foo0 (void)
+{
+  vd = vec_xl (0, vdp);
+}
+
+void foo1 (void)
+{
+  vsll = vec_xl (0, vsllp);
+}
+
+void foo2 (void)
+{
+  vull = vec_xl (0, vullp);
+}
+
+void foo3 (void)
+{
+  vf = vec_xl (0, vfp);
+}
+
+void foo4 (void)
+{
+  vsi = vec_xl (0, vsip);
+}
+
+void foo5 (void)
+{
+  vui = vec_xl (0, vuip);
+}
+
+void foo6 (void)
+{
+  vss = vec_xl (0, vssp);
+}
+
+void foo7 (void)
+{
+  vus = vec_xl (0, vusp);
+}
+
+void foo8 (void)
+{
+  vsc = vec_xl (0, vscp);
+}
+
+void foo9 (void)
+{
+  vuc = vec_xl (0, vucp);
+}
+
+void foo10 (void)
+{
+  vec_xst (vd, 0, vdp);
+}
+
+void foo11 (void)
+{
+  vec_xst (vsll, 0, vsllp);
+}
+
+void foo12 (void)
+{
+  vec_xst (vull, 0, vullp);
+}
+
+void foo13 (void)
+{
+  vec_xst (vf, 0, vfp);
+}
+
+void foo14 (void)
+{
+  vec_xst (vsi, 0, vsip);
+}
+
+void foo15 (void)
+{
+  vec_xst (vui, 0, vuip);
+}
+
+void foo16 (void)
+{
+  vec_xst (vss, 0, vssp);
+}
+
+void foo17 (void)
+{
+  vec_xst (vus, 0, vusp);
+}
+
+void foo18 (void)
+{
+  vec_xst (vsc, 0, vscp);
+}
+
+void foo19 (void)
+{
+  vec_xst (vuc, 0, vucp);
+}
+
+void foo20 (void)
+{
+  vd = vec_xl (0, dp);
+}
+
+void foo21 (void)
+{
+  vsll = vec_xl (0, sllp);
+}
+
+void foo22 (void)
+{
+  vull = vec_xl (0, ullp);
+}
+
+void foo23 (void)
+{
+  vf = vec_xl (0, fp);
+}
+
+void foo24 (void)
+{
+  vsi = vec_xl (0, sip);
+}
+
+void foo25 (void)
+{
+  vui = vec_xl (0, uip);
+}
+
+void foo26 (void)
+{
+  vss = vec_xl (0, ssp);
+}
+
+void foo27 (void)
+{
+  vus = vec_xl (0, usp);
+}
+
+void foo28 (void)
+{
+  vsc = vec_xl (0, scp);
+}
+
+void foo29 (void)
+{
+  vuc = vec_xl (0, ucp);
+}
+
+void foo30 (void)
+{
+  vec_xst (vd, 0, dp);
+}
+
+void foo31 (void)
+{
+  vec_xst (vsll, 0, sllp);
+}
+
+void foo32 (void)
+{
+  vec_xst (vull, 0, ullp);
+}
+
+void foo33 (void)
+{
+  vec_xst (vf, 0, fp);
+}
+
+void foo34 (void)
+{
+  vec_xst (vsi, 0, sip);
+}
+
+void foo35 (void)
+{
+  vec_xst (vui, 0, uip);
+}
+
+void foo36 (void)
+{
+  vec_xst (vss, 0, ssp);
+}
+
+void foo37 (void)
+{
+  vec_xst (vus, 0, usp);
+}
+
+void foo38 (void)
+{
+  vec_xst (vsc, 0, scp);
+}
+
+void foo39 (void)
+{
+  vec_xst (vuc, 0, ucp);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-3.c
new file mode 100644
index 00000000000..2888c171c4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-3.c
@@ -0,0 +1,142 @@
+/* { dg-do compile { target { powerpc64-*-* } } } */
+/* { dg-skip-if "do not override mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O0" } */
+/* { dg-final { scan-assembler-times "lxvd2x" 16 } } */
+/* { dg-final { scan-assembler-times "lxvw4x" 8 } } */
+/* { dg-final { scan-assembler-times "stxvd2x" 16 } } */
+/* { dg-final { scan-assembler-times "stxvw4x" 8 } } */
+
+#include <altivec.h>
+
+extern vector double vd, *vdp;
+extern vector signed long long vsll, *vsllp;
+extern vector unsigned long long vull, *vullp;
+extern vector float vf, *vfp;
+extern vector signed int vsi, *vsip;
+extern vector unsigned int vui, *vuip;
+extern double *dp;
+extern signed long long *sllp;
+extern unsigned long long *ullp;
+extern float *fp;
+extern signed int *sip;
+extern unsigned int *uip;
+
+void foo0 (void)
+{
+  vd = vec_xl (0, vdp);
+}
+
+void foo1 (void)
+{
+  vsll = vec_xl (0, vsllp);
+}
+
+void foo2 (void)
+{
+  vull = vec_xl (0, vullp);
+}
+
+void foo3 (void)
+{
+  vf = vec_xl (0, vfp);
+}
+
+void foo4 (void)
+{
+  vsi = vec_xl (0, vsip);
+}
+
+void foo5 (void)
+{
+  vui = vec_xl (0, vuip);
+}
+
+void foo6 (void)
+{
+  vec_xst (vd, 0, vdp);
+}
+
+void foo7 (void)
+{
+  vec_xst (vsll, 0, vsllp);
+}
+
+void foo8 (void)
+{
+  vec_xst (vull, 0, vullp);
+}
+
+void foo9 (void)
+{
+  vec_xst (vf, 0, vfp);
+}
+
+void foo10 (void)
+{
+  vec_xst (vsi, 0, vsip);
+}
+
+void foo11 (void)
+{
+  vec_xst (vui, 0, vuip);
+}
+
+void foo20 (void)
+{
+  vd = vec_xl (0, dp);
+}
+
+void foo21 (void)
+{
+  vsll = vec_xl (0, sllp);
+}
+
+void foo22 (void)
+{
+  vull = vec_xl (0, ullp);
+}
+
+void foo23 (void)
+{
+  vf = vec_xl (0, fp);
+}
+
+void foo24 (void)
+{
+  vsi = vec_xl (0, sip);
+}
+
+void foo25 (void)
+{
+  vui = vec_xl (0, uip);
+}
+
+void foo26 (void)
+{
+  vec_xst (vd, 0, dp);
+}
+
+void foo27 (void)
+{
+  vec_xst (vsll, 0, sllp);
+}
+
+void foo28 (void)
+{
+  vec_xst (vull, 0, ullp);
+}
+
+void foo29 (void)
+{
+  vec_xst (vf, 0, fp);
+}
+
+void foo30 (void)
+{
+  vec_xst (vsi, 0, sip);
+}
+
+void foo31 (void)
+{
+  vec_xst (vui, 0, uip);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c
new file mode 100644
index 00000000000..ef84581bc0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-elemrev-4.c
@@ -0,0 +1,228 @@
+/* { dg-do compile { target { powerpc64-*-* } } } */
+/* { dg-skip-if "do not override mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O0" } */
+/* { dg-final { scan-assembler-times "lxvx" 40 } } */
+/* { dg-final { scan-assembler-times "stxvx" 40 } } */
+
+#include <altivec.h>
+
+extern vector double vd, *vdp;
+extern vector signed long long vsll, *vsllp;
+extern vector unsigned long long vull, *vullp;
+extern vector float vf, *vfp;
+extern vector signed int vsi, *vsip;
+extern vector unsigned int vui, *vuip;
+extern vector signed short vss, *vssp;
+extern vector unsigned short vus, *vusp;
+extern vector signed char vsc, *vscp;
+extern vector unsigned char vuc, *vucp;
+extern double *dp;
+extern signed long long *sllp;
+extern unsigned long long *ullp;
+extern float *fp;
+extern signed int *sip;
+extern unsigned int *uip;
+extern signed short *ssp;
+extern unsigned short *usp;
+extern signed char *scp;
+extern unsigned char *ucp;
+
+void foo0 (void)
+{
+  vd = vec_xl (0, vdp);
+}
+
+void foo1 (void)
+{
+  vsll = vec_xl (0, vsllp);
+}
+
+void foo2 (void)
+{
+  vull = vec_xl (0, vullp);
+}
+
+void foo3 (void)
+{
+  vf = vec_xl (0, vfp);
+}
+
+void foo4 (void)
+{
+  vsi = vec_xl (0, vsip);
+}
+
+void foo5 (void)
+{
+  vui = vec_xl (0, vuip);
+}
+
+void foo6 (void)
+{
+  vss = vec_xl (0, vssp);
+}
+
+void foo7 (void)
+{
+  vus = vec_xl (0, vusp);
+}
+
+void foo8 (void)
+{
+  vsc = vec_xl (0, vscp);
+}
+
+void foo9 (void)
+{
+  vuc = vec_xl (0, vucp);
+}
+
+void foo10 (void)
+{
+  vec_xst (vd, 0, vdp);
+}
+
+void foo11 (void)
+{
+  vec_xst (vsll, 0, vsllp);
+}
+
+void foo12 (void)
+{
+  vec_xst (vull, 0, vullp);
+}
+
+void foo13 (void)
+{
+  vec_xst (vf, 0, vfp);
+}
+
+void foo14 (void)
+{
+  vec_xst (vsi, 0, vsip);
+}
+
+void foo15 (void)
+{
+  vec_xst (vui, 0, vuip);
+}
+
+void foo16 (void)
+{
+  vec_xst (vss, 0, vssp);
+}
+
+void foo17 (void)
+{
+  vec_xst (vus, 0, vusp);
+}
+
+void foo18 (void)
+{
+  vec_xst (vsc, 0, vscp);
+}
+
+void foo19 (void)
+{
+  vec_xst (vuc, 0, vucp);
+}
+
+void foo20 (void)
+{
+  vd = vec_xl (0, dp);
+}
+
+void foo21 (void)
+{
+  vsll = vec_xl (0, sllp);
+}
+
+void foo22 (void)
+{
+  vull = vec_xl (0, ullp);
+}
+
+void foo23 (void)
+{
+  vf = vec_xl (0, fp);
+}
+
+void foo24 (void)
+{
+  vsi = vec_xl (0, sip);
+}
+
+void foo25 (void)
+{
+  vui = vec_xl (0, uip);
+}
+
+void foo26 (void)
+{
+  vss = vec_xl (0, ssp);
+}
+
+void foo27 (void)
+{
+  vus = vec_xl (0, usp);
+}
+
+void foo28 (void)
+{
+  vsc = vec_xl (0, scp);
+}
+
+void foo29 (void)
+{
+  vuc = vec_xl (0, ucp);
+}
+
+void foo30 (void)
+{
+  vec_xst (vd, 0, dp);
+}
+
+void foo31 (void)
+{
+  vec_xst (vsll, 0, sllp);
+}
+
+void foo32 (void)
+{
+  vec_xst (vull, 0, ullp);
+}
+
+void foo33 (void)
+{
+  vec_xst (vf, 0, fp);
+}
+
+void foo34 (void)
+{
+  vec_xst (vsi, 0, sip);
+}
+
+void foo35 (void)
+{
+  vec_xst (vui, 0, uip);
+}
+
+void foo36 (void)
+{
+  vec_xst (vss, 0, ssp);
+}
+
+void foo37 (void)
+{
+  vec_xst (vus, 0, usp);
+}
+
+void foo38 (void)
+{
+  vec_xst (vsc, 0, scp);
+}
+
+void foo39 (void)
+{
+  vec_xst (vuc, 0, ucp);
+}
diff --git a/gcc/testsuite/gfortran.dg/unexpected_eof.f b/gcc/testsuite/gfortran.dg/unexpected_eof.f
new file mode 100644
index 00000000000..d3cdb99596a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unexpected_eof.f
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR66461  ICE on missing end program in fixed source
+      program p
+         integer x(2)
+         x = -1
+         if ( x(1) < 0 .or.
+     &        x(2) < 0 ) print *, x
+! { dg-error "Unexpected end of file" "" { target *-*-* } 0 }
diff --git a/libgcc/ChangeLog.meissner b/libgcc/ChangeLog.meissner
index efd9f122b83..13fc4561bda 100644
--- a/libgcc/ChangeLog.meissner
+++ b/libgcc/ChangeLog.meissner
@@ -1,3 +1,7 @@
+2016-05-31  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	Merge up to 236941.
+
 2016-05-26   Michael Meissner  <meissner@linux.vnet.ibm.com>
 
 	Clone branch subversion id 236789