author    Richard Biener <rguenther@suse.de>  2024-05-27 16:04:35 +0200
committer Richard Biener <rguenther@suse.de>  2024-05-29 13:05:24 +0200
commit    f46eaad445e680034df51bd0dec4e6c7b1f372a4
tree      dd1c04eef158c554d4cf5cb9af6856b8573ca008 /gcc/testsuite/gcc.dg
parent    1065a7db6f2a69770a85b4d53b9123b090dd1771
tree-optimization/115252 - enhance peeling for gaps avoidance

Code generation for contiguous load vectorization can already deal
with generalized avoidance of loading from a gap.  The following
extends detection of peeling for gaps requirement with that, gets
rid of the old special casing of a half load and makes sure when
we do access the gap we have peeling for gaps enabled.

	PR tree-optimization/115252
	* tree-vect-stmts.cc (get_group_load_store_type): Enhance
	detecting the number of cases where we can avoid accessing a gap
	during code generation.
	(vectorizable_load): Remove old half-vector peeling for gap
	avoidance which is now redundant.  Add gap-aligned case where
	it's OK to access the gap.  Add assert that we have peeling for
	gaps enabled when we access a gap.

	* gcc.dg/vect/slp-gap-1.c: New testcase.

Diffstat (limited to 'gcc/testsuite/gcc.dg')
-rw-r--r-- gcc/testsuite/gcc.dg/vect/slp-gap-1.c 18
1 files changed, 18 insertions, 0 deletions
diff --git a/gcc/testsuite/gcc.dg/vect/slp-gap-1.c b/gcc/testsuite/gcc.dg/vect/slp-gap-1.c
new file mode 100644
index 00000000000..36463ca22c5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/slp-gap-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef unsigned char uint8_t;
+typedef short int16_t;
+void pixel_sub_wxh(int16_t * __restrict diff, uint8_t *pix1, uint8_t *pix2) {
+  for (int y = 0; y < 4; y++) {
+    for (int x = 0; x < 4; x++)
+      diff[x + y * 4] = pix1[x] - pix2[x];
+    pix1 += 16;
+    pix2 += 32;
+  }
+}
+
+/* We can vectorize this without peeling for gaps and thus without epilogue,
+   but the only thing we can reliably scan is the zero-padding trick for the
+   partial loads.  */
+/* { dg-final { scan-tree-dump-times "\{_\[0-9\]\+, 0" 6 "vect" { target vect64 } } } */
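
The access pattern the testcase exercises can be modelled outside the compiler. The sketch below is only an illustration, not what GCC emits. Each row reads 4 bytes from a group of 16 (pix1) or 32 (pix2), so the rest of each group is a gap. A full 8-byte vector load on the last row would read past a buffer sized to the minimum the scalar loop needs. A partial load that copies only the 4 live bytes and zero-fills the rest avoids touching the gap, which is presumably the zero-padding the dg-final pattern looks for. The helpers `load_half_zero_padded`, `pixel_sub_padded` and `gap_model_matches` are hypothetical names introduced for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: the same loop nest as the testcase kernel.  */
static void pixel_sub_ref(int16_t *restrict diff, const uint8_t *pix1,
                          const uint8_t *pix2)
{
  for (int y = 0; y < 4; y++) {
    for (int x = 0; x < 4; x++)
      diff[x + y * 4] = pix1[x] - pix2[x];
    pix1 += 16;
    pix2 += 32;
  }
}

/* Hypothetical model of a zero-padded partial load: only the 4 live
   bytes are read from SRC; the upper half of the 8-byte "vector" is
   filled with zeros instead of being loaded from the gap.  */
static void load_half_zero_padded(uint8_t lanes[8], const uint8_t *src)
{
  memset(lanes, 0, 8);
  memcpy(lanes, src, 4);
}

/* Same computation as the reference, but every row goes through the
   zero-padded partial load, so no byte inside a gap is ever read.  */
static void pixel_sub_padded(int16_t *restrict diff, const uint8_t *pix1,
                             const uint8_t *pix2)
{
  for (int y = 0; y < 4; y++) {
    uint8_t a[8], b[8];
    load_half_zero_padded(a, pix1 + 16 * y);
    load_half_zero_padded(b, pix2 + 32 * y);
    for (int x = 0; x < 4; x++)
      diff[x + y * 4] = (int16_t)(a[x] - b[x]);
  }
}

/* Returns 1 when both versions agree on buffers that end right after
   the last byte the scalar loop reads (3 * stride + 4 bytes).  */
int gap_model_matches(void)
{
  uint8_t pix1[3 * 16 + 4], pix2[3 * 32 + 4];
  int16_t want[16], got[16];

  for (size_t i = 0; i < sizeof pix1; i++)
    pix1[i] = (uint8_t)(i * 7 + 3);
  for (size_t i = 0; i < sizeof pix2; i++)
    pix2[i] = (uint8_t)(i * 13 + 1);

  pixel_sub_ref(want, pix1, pix2);
  pixel_sub_padded(got, pix1, pix2);
  return memcmp(want, got, sizeof want) == 0;
}
```

Building this with a sanitizer such as `-fsanitize=address` and calling `gap_model_matches()` is one way to confirm that the padded variant stays inside the tightly sized buffers.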