Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.
For #41184
Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Change-Id: Ib0469232a2b69a869e58d5d24990ad74ac96ea56
GitHub-Last-Rev: eb38e049ee1e773392ff3747e1eb2af20dd50dcd
GitHub-Pull-Request: golang/go#44805
Reviewed-on: https://go-review.googlesource.com/c/go/+/299109
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
This improves the performance of memmove for almost all moves <= 16 bytes
in the ppc64x assembly implementation, benefiting linux/ppc64le, linux/ppc64,
and aix/ppc64.
Only the forward moves were changed; the backward moves were left as is.
Additional macro definitions were added to improve the readability of the asm.
Results from power8:
name old time/op new time/op delta
Memmove/0 5.70ns ± 0% 5.69ns ± 0% -0.18% (p=0.029 n=4+4)
Memmove/1 5.54ns ± 0% 5.39ns ± 0% -2.71% (p=0.029 n=4+4)
Memmove/2 6.31ns ± 0% 5.55ns ± 0% -12.08% (p=0.029 n=4+4)
Memmove/3 7.41ns ± 0% 5.54ns ± 0% -25.24% (p=0.029 n=4+4)
Memmove/4 8.41ns ± 0% 5.56ns ± 0% -33.87% (p=0.029 n=4+4)
Memmove/5 10.1ns ± 5% 5.5ns ± 0% -45.30% (p=0.029 n=4+4)
Memmove/6 10.3ns ± 0% 5.6ns ± 0% -45.92% (p=0.029 n=4+4)
Memmove/7 11.4ns ± 0% 5.7ns ± 0% -50.33% (p=0.029 n=4+4)
Memmove/8 5.66ns ± 0% 5.54ns ± 0% -2.12% (p=0.029 n=4+4)
Memmove/9 5.66ns ± 0% 6.47ns ± 0% +14.31% (p=0.029 n=4+4)
Memmove/10 6.67ns ± 0% 6.22ns ± 0% -6.82% (p=0.029 n=4+4)
Memmove/11 7.83ns ± 0% 6.45ns ± 0% -17.60% (p=0.029 n=4+4)
Memmove/12 8.91ns ± 0% 6.25ns ± 0% -29.85% (p=0.029 n=4+4)
Memmove/13 9.81ns ± 0% 6.48ns ± 0% -33.94% (p=0.029 n=4+4)
Memmove/14 10.7ns ± 1% 6.4ns ± 0% -40.00% (p=0.029 n=4+4)
Memmove/15 11.8ns ± 0% 6.7ns ± 0% -42.84% (p=0.029 n=4+4)
Memmove/16 5.63ns ± 0% 5.56ns ± 0% -1.20% (p=0.029 n=4+4)
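These small-size paths are what ordinary Go code reaches through copy on byte
slices; a minimal sketch (the sizes below are illustrative, not from the CL):

```go
package main

import "fmt"

func main() {
	// copy on byte slices is implemented by runtime.memmove, so small
	// copies like these exercise the <= 16-byte forward paths measured above.
	src := []byte("0123456789abcdef")
	for _, n := range []int{1, 7, 8, 15, 16} {
		dst := make([]byte, n)
		copy(dst, src[:n])
		fmt.Printf("%2d bytes: %s\n", n, dst)
	}
}
```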
Change-Id: I2de434f543c5a017395e0850fb9b9f7219583bbb
Reviewed-on: https://go-review.googlesource.com/c/go/+/223317
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
|
|
Unlike C's memmove, Go's memmove must be careful to do indivisible
writes of pointer values because it may be racing with the garbage
collector reading the heap.
We've had various bugs related to this over the years (#36101, #13160,
#12552). Indeed, memmove is a great target for optimization and it's
easy to forget the special requirements of Go's memmove.
This CL documents these (currently unwritten!) requirements. We're also
adding a test that should hopefully keep everyone honest going
forward, though it's hard to be sure we're hitting all the cases
memmove must handle.
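Restated in ordinary Go terms (this sketch is illustrative; the actual test
added by the CL lives in the runtime package):

```go
package main

import "fmt"

func main() {
	// copy on a slice of pointers ends up in memmove. Because the GC
	// may be scanning dst concurrently, each pointer-sized word must be
	// written indivisibly: the GC must observe either the old pointer
	// or the new one, never a half-updated word.
	a, b := 1, 2
	src := []*int{&a, &b}
	dst := make([]*int, len(src))
	copy(dst, src)
	fmt.Println(*dst[0], *dst[1]) // 1 2
}
```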
Change-Id: I2f59f8d8d6fb42d2f10006b55d605b5efd8ddc24
Reviewed-on: https://go-review.googlesource.com/c/go/+/213418
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This improves the asm implementations of memmove and memclr on
ppc64x through the use of VSX loads and stores when the size is >= 32 bytes.
For memclr, dcbz is used when the size is >= 512 bytes and the address is
aligned to a 128-byte boundary.
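The memclr entry point is what the compiler emits for the slice-zeroing
idiom; a plain-Go sketch (the lowering itself is a compiler detail):

```go
package main

import "fmt"

func main() {
	buf := make([]byte, 1024)
	for i := range buf {
		buf[i] = 0xFF
	}
	// The compiler recognizes this loop and lowers it to a single
	// runtime memclr call, which is where the VSX and dcbz paths apply.
	for i := range buf {
		buf[i] = 0
	}
	sum := 0
	for _, v := range buf {
		sum += int(v)
	}
	fmt.Println(sum) // 0
}
```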
Memclr/64 13.3ns ± 0% 10.7ns ± 0% -19.55% (p=0.000 n=8+7)
Memclr/96 14.9ns ± 0% 11.4ns ± 0% -23.49% (p=0.000 n=8+8)
Memclr/128 16.3ns ± 0% 12.3ns ± 0% -24.54% (p=0.000 n=8+8)
Memclr/160 17.3ns ± 0% 13.0ns ± 0% -24.86% (p=0.000 n=8+8)
Memclr/256 20.0ns ± 0% 15.3ns ± 0% -23.62% (p=0.000 n=8+8)
Memclr/512 34.2ns ± 0% 10.2ns ± 0% -70.20% (p=0.000 n=8+8)
Memclr/4096 178ns ± 0% 23ns ± 0% -87.13% (p=0.000 n=8+8)
Memclr/65536 2.67µs ± 0% 0.30µs ± 0% -88.89% (p=0.000 n=7+8)
Memclr/1M 43.2µs ± 0% 10.0µs ± 0% -76.85% (p=0.000 n=8+8)
Memclr/4M 173µs ± 0% 40µs ± 0% -76.88% (p=0.000 n=8+8)
Memclr/8M 349µs ± 0% 82µs ± 0% -76.58% (p=0.000 n=8+8)
Memclr/16M 701µs ± 7% 672µs ± 0% -4.05% (p=0.040 n=8+7)
Memclr/64M 2.70ms ± 0% 2.67ms ± 0% -0.96% (p=0.000 n=8+7)
Memmove/32 6.59ns ± 0% 5.84ns ± 0% -11.34% (p=0.029 n=4+4)
Memmove/64 7.91ns ± 0% 6.97ns ± 0% -11.92% (p=0.029 n=4+4)
Memmove/128 10.5ns ± 0% 8.8ns ± 0% -16.24% (p=0.029 n=4+4)
Memmove/256 21.0ns ± 0% 12.9ns ± 0% -38.57% (p=0.029 n=4+4)
Memmove/512 28.4ns ± 0% 26.2ns ± 0% -7.75% (p=0.029 n=4+4)
Memmove/1024 48.2ns ± 1% 39.4ns ± 0% -18.26% (p=0.029 n=4+4)
Memmove/2048 85.4ns ± 0% 69.0ns ± 0% -19.20% (p=0.029 n=4+4)
Memmove/4096 159ns ± 0% 128ns ± 0% -19.50% (p=0.029 n=4+4)
Change-Id: I8c1adf88790845bf31444a15249456006eb5bf8b
Reviewed-on: https://go-review.googlesource.com/c/141217
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
|
|
The function signatures in the comments used a C-like style. Using
Go function signatures is cleaner.
Change-Id: I1a093ed8fe5df59f3697c613cf3fce58bba4f5c1
Reviewed-on: https://go-review.googlesource.com/113876
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This change improves the performance of memmove
on ppc64 & ppc64le, mainly for moves >= 32 bytes.
In addition, the test that detects backward moves
was enhanced to avoid backward moves when the source
and destination are in different types of storage, since
backward moves might not always be efficient.
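Backward moves arise when the regions overlap and the destination starts
inside the source; a quick illustration using copy, which has overlap-safe
memmove semantics:

```go
package main

import "fmt"

func main() {
	s := []byte("abcdef")
	// dst overlaps src with dst > src, so a correct move must copy
	// backwards (or otherwise avoid clobbering unread source bytes).
	copy(s[2:], s[:4])
	fmt.Println(string(s)) // ababcd
}
```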
Fixes #14507
The following shows some of the improvements from the test
in the runtime package:
BenchmarkMemmove32 4229.56 4717.13 1.12x
BenchmarkMemmove64 6156.03 7810.42 1.27x
BenchmarkMemmove128 7521.69 12468.54 1.66x
BenchmarkMemmove256 6729.90 18260.33 2.71x
BenchmarkMemmove512 8521.59 18033.81 2.12x
BenchmarkMemmove1024 9760.92 25762.61 2.64x
BenchmarkMemmove2048 10241.00 29584.94 2.89x
BenchmarkMemmove4096 10399.37 31882.31 3.07x
BenchmarkMemmoveUnalignedDst16 1943.69 2258.33 1.16x
BenchmarkMemmoveUnalignedDst32 3885.08 3965.81 1.02x
BenchmarkMemmoveUnalignedDst64 5121.63 6965.54 1.36x
BenchmarkMemmoveUnalignedDst128 7212.34 11372.68 1.58x
BenchmarkMemmoveUnalignedDst256 6564.52 16913.59 2.58x
BenchmarkMemmoveUnalignedDst512 8364.35 17782.57 2.13x
BenchmarkMemmoveUnalignedDst1024 9539.87 24914.72 2.61x
BenchmarkMemmoveUnalignedDst2048 9199.23 21235.11 2.31x
BenchmarkMemmoveUnalignedDst4096 10077.39 25231.99 2.50x
BenchmarkMemmoveUnalignedSrc32 3249.83 3742.52 1.15x
BenchmarkMemmoveUnalignedSrc64 5562.35 6627.96 1.19x
BenchmarkMemmoveUnalignedSrc128 6023.98 10200.84 1.69x
BenchmarkMemmoveUnalignedSrc256 6921.83 15258.43 2.20x
BenchmarkMemmoveUnalignedSrc512 8593.13 16541.97 1.93x
BenchmarkMemmoveUnalignedSrc1024 9730.95 22927.84 2.36x
BenchmarkMemmoveUnalignedSrc2048 9793.28 21537.73 2.20x
BenchmarkMemmoveUnalignedSrc4096 10132.96 26295.06 2.60x
Change-Id: I73af59970d4c97c728deabb9708b31ec7e01bdf2
Reviewed-on: https://go-review.googlesource.com/21990
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Change-Id: Icd06d99c42b8299fd931c7da821e1f418684d913
Reviewed-on: https://go-review.googlesource.com/19829
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
on ppc64x
Replace the confusing convention where a frame size of $-8 suppressed the
implicit setting up of a stack frame with an explicit flag.
The code that sets up the function prologue is still a little confusing, but
better than it was.
Change-Id: I1d49278ff42c6bc734ebfb079998b32bc53f8d9a
Reviewed-on: https://go-review.googlesource.com/15670
Reviewed-by: Minux Ma <minux@golang.org>
|
|
Issue #12552 can happen on ppc64 too, although much less frequently in my
testing. I'm fairly sure this fixes it (2 out of 200 runs of oracle.test failed
without this change and 0 of 200 failed with it). It's also a lot faster for
large moves/clears:
name old speed new speed delta
Memmove1-6 157MB/s ± 9% 144MB/s ± 0% -8.20% (p=0.004 n=10+9)
Memmove2-6 281MB/s ± 1% 249MB/s ± 1% -11.53% (p=0.000 n=10+10)
Memmove3-6 376MB/s ± 1% 328MB/s ± 1% -12.64% (p=0.000 n=10+10)
Memmove4-6 475MB/s ± 4% 345MB/s ± 1% -27.28% (p=0.000 n=10+8)
Memmove5-6 540MB/s ± 1% 393MB/s ± 0% -27.21% (p=0.000 n=10+10)
Memmove6-6 609MB/s ± 0% 423MB/s ± 0% -30.56% (p=0.000 n=9+10)
Memmove7-6 659MB/s ± 0% 468MB/s ± 0% -28.99% (p=0.000 n=8+10)
Memmove8-6 705MB/s ± 0% 1295MB/s ± 1% +83.73% (p=0.000 n=9+9)
Memmove9-6 740MB/s ± 1% 1241MB/s ± 1% +67.61% (p=0.000 n=10+8)
Memmove10-6 780MB/s ± 0% 1162MB/s ± 1% +48.95% (p=0.000 n=10+9)
Memmove11-6 811MB/s ± 0% 1180MB/s ± 0% +45.58% (p=0.000 n=8+9)
Memmove12-6 820MB/s ± 1% 1073MB/s ± 1% +30.83% (p=0.000 n=10+9)
Memmove13-6 849MB/s ± 0% 1068MB/s ± 1% +25.87% (p=0.000 n=10+10)
Memmove14-6 877MB/s ± 0% 911MB/s ± 0% +3.83% (p=0.000 n=10+10)
Memmove15-6 893MB/s ± 0% 922MB/s ± 0% +3.25% (p=0.000 n=10+9)
Memmove16-6 897MB/s ± 1% 2418MB/s ± 1% +169.67% (p=0.000 n=10+9)
Memmove32-6 908MB/s ± 0% 3927MB/s ± 2% +332.64% (p=0.000 n=10+8)
Memmove64-6 1.11GB/s ± 0% 5.59GB/s ± 0% +404.64% (p=0.000 n=9+9)
Memmove128-6 1.25GB/s ± 0% 6.71GB/s ± 2% +437.49% (p=0.000 n=9+10)
Memmove256-6 1.33GB/s ± 0% 7.25GB/s ± 1% +445.06% (p=0.000 n=10+10)
Memmove512-6 1.38GB/s ± 0% 8.87GB/s ± 0% +544.43% (p=0.000 n=10+10)
Memmove1024-6 1.40GB/s ± 0% 10.00GB/s ± 0% +613.80% (p=0.000 n=10+10)
Memmove2048-6 1.41GB/s ± 0% 10.65GB/s ± 0% +652.95% (p=0.000 n=9+10)
Memmove4096-6 1.42GB/s ± 0% 11.01GB/s ± 0% +675.37% (p=0.000 n=8+10)
Memclr5-6 269MB/s ± 1% 264MB/s ± 0% -1.80% (p=0.000 n=10+10)
Memclr16-6 600MB/s ± 0% 887MB/s ± 1% +47.83% (p=0.000 n=10+10)
Memclr64-6 1.06GB/s ± 0% 2.91GB/s ± 1% +174.58% (p=0.000 n=8+10)
Memclr256-6 1.32GB/s ± 0% 6.58GB/s ± 0% +399.86% (p=0.000 n=9+10)
Memclr4096-6 1.42GB/s ± 0% 10.90GB/s ± 0% +668.03% (p=0.000 n=8+10)
Memclr65536-6 1.43GB/s ± 0% 11.37GB/s ± 0% +697.83% (p=0.000 n=9+8)
GoMemclr5-6 359MB/s ± 0% 360MB/s ± 0% +0.46% (p=0.000 n=10+10)
GoMemclr16-6 750MB/s ± 0% 1264MB/s ± 1% +68.45% (p=0.000 n=10+10)
GoMemclr64-6 1.17GB/s ± 0% 3.78GB/s ± 1% +223.58% (p=0.000 n=10+9)
GoMemclr256-6 1.35GB/s ± 0% 7.47GB/s ± 0% +452.44% (p=0.000 n=10+10)
Update #12552
Change-Id: I7192e9deb9684a843aed37f58a16a4e29970e893
Reviewed-on: https://go-review.googlesource.com/14840
Reviewed-by: Minux Ma <minux@golang.org>
|
|
All of the architectures except ppc64 have only "RET" for the return
mnemonic. ppc64 used to have only "RETURN", but commit cf06ea6
introduced RET as a synonym for RETURN to make ppc64 consistent with
the other architectures. However, that commit was never followed up to
make the code itself consistent by eliminating uses of RETURN.
This commit replaces all uses of RETURN in the ppc64 assembly with
RET.
This was done with
sed -i 's/\<RETURN\>/RET/' **/*_ppc64x.s
plus one manual change to syscall/asm.s.
Change-Id: I3f6c8d2be157df8841d48de988ee43f3e3087995
Reviewed-on: https://go-review.googlesource.com/10672
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
|
|
Fixes #8654.
LGTM=austin
R=austin
CC=golang-codereviews
https://golang.org/cl/180600043
|