Age | Commit message (Collapse) | Author |
|
Fixes #47879.
Change-Id: I35efb5fc65c4f1eb1b45918f95bbe1ff4039950e
Reviewed-on: https://go-review.googlesource.com/c/go/+/344249
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
|
|
Change-Id: I83180c472db8795803c1b9be3a33f35959e4dcc2
Reviewed-on: https://go-review.googlesource.com/c/go/+/336889
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Change-Id: I0c2d26d6ede1452008992efbea7392162da65014
Reviewed-on: https://go-review.googlesource.com/c/go/+/331651
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
If someone sees "in [0,n)" it might look like a typo.
Saying "in the half-open interval [0,n)" will give people
something to search the web for (half-open interval).
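As a small illustration of the half-open interval [0,n): 0 is a possible result and n is not. The sketch below assumes the wording concerns functions such as rand.Intn, which is documented this way; the helper names inHalfOpen and allInRange are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// inHalfOpen reports whether v lies in the half-open interval [0,n):
// 0 is included, n is excluded.
func inHalfOpen(v, n int) bool {
	return 0 <= v && v < n
}

// allInRange draws count values from rand.Intn(n) and reports whether
// every one of them falls in [0,n).
func allInRange(n, count int) bool {
	for i := 0; i < count; i++ {
		if !inHalfOpen(rand.Intn(n), n) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allInRange(6, 1000))
	fmt.Println(inHalfOpen(0, 6), inHalfOpen(6, 6))
}
```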
Change-Id: I3c343f0a7171891e106e709ca77ab9db5daa5c84
Reviewed-on: https://go-review.googlesource.com/c/go/+/328210
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
The comments in the code refer to Knuth and to Burnikel and Ziegler,
but Knuth's presentation is inscrutable, and our recursive division
code does not bear much resemblance to Burnikel and Ziegler's paper
(which is fine, ours is nicer).
Add a standalone explanation of division instead of referring to
difficult or not-directly-used references.
Change-Id: Ic1b35dc167fb29a69ee00e0b4a768ac9cc9e1324
Reviewed-on: https://go-review.googlesource.com/c/go/+/321078
Trust: Russ Cox <rsc@golang.org>
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
|
|
Code moved and functions reordered to be in a consistent
top-down dependency order, but otherwise unchanged.
First step toward commenting division algorithms.
Change-Id: Ib5e604fb5b2867edff3a228ba4e57b5cb32c4137
Reviewed-on: https://go-review.googlesource.com/c/go/+/321077
Trust: Russ Cox <rsc@golang.org>
Trust: Katie Hockman <katie@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.
For #41184
Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Found by oss-fuzz https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=33284
Fixes #45910.
Change-Id: I61e7b04dbd80343420b57eede439e361c0f7b79c
Reviewed-on: https://go-review.googlesource.com/c/go/+/316149
Trust: Robert Griesemer <gri@golang.org>
Trust: Katie Hockman <katie@golang.org>
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
Since we have int8 to int64 min max and uint8 to uint64 max constants,
we should probably have some for the word size types too. This change
also adds tests to validate the correctness of all integer limit
values.
Fixes #28538
Change-Id: Idd25782e98d16c2abedf39959b7b66e9c4c0c98b
Reviewed-on: https://go-review.googlesource.com/c/go/+/247058
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Robert Griesemer <gri@golang.org>
|
|
Follow-up on https://golang.org/cl/315170.
Updates #44057.
Updates #44058.
Change-Id: I0b071e8ee7a1c97aae2436945cc9583cde3b40b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/315969
Trust: Robert Griesemer <gri@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The original value was rounded too early, which led to the
surprising behavior that float64(math.SmallestNonzeroFloat64 / 2)
wasn't 0. That is, the exact compile-time computation of
math.SmallestNonzeroFloat64 / 2 resulted in a value that was
rounded up when converting to float64. To address this, added 3
more digits to the mantissa, ending in a 0.
While at it, also slightly increased the precision of MaxFloat64
to end in a 0.
Computed exact values via https://play.golang.org/p/yt4KTpIx_wP.
Added a test to verify expected behavior.
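The expected behavior can be checked along these lines. This is a minimal sketch, not the test added by the CL; the helper name halfSmallest is hypothetical, and it assumes a toolchain that includes this fix.

```go
package main

import (
	"fmt"
	"math"
)

// halfSmallest converts the exact constant SmallestNonzeroFloat64 / 2
// to float64. The division is an exact compile-time computation; only
// the conversion to float64 rounds.
func halfSmallest() float64 {
	return float64(math.SmallestNonzeroFloat64 / 2)
}

func main() {
	fmt.Println(halfSmallest() == 0)
}
```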
In contrast to the other (irrational) constants, expanding these
extreme values to more digits is unlikely to be important as they
are not going to appear in numeric computations except for tests
verifying their correctness (as is the case here).
Re-enabled a disabled test in go/types and types2.
Updates #44057.
Fixes #44058.
Change-Id: I8f363155e02331354e929beabe993c8d8de75646
Reviewed-on: https://go-review.googlesource.com/c/go/+/315170
Trust: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This CL adds a new flag to the testing package and the go test command
which randomizes the execution order for tests and benchmarks.
This can be useful for identifying unwanted dependencies
between test or benchmark functions.
The flag is off by default. If `-shuffle` is set to `on` then the system
clock will be used as the seed value. If `-shuffle` is set to an integer
N, then N will be used as the seed value. In both cases, the seed will
be reported for failed runs so that they can be reproduced later on.
Fixes #28592
Change-Id: I62e7dfae5f63f97a0cbd7830ea844d9f7beac335
Reviewed-on: https://go-review.googlesource.com/c/go/+/310033
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Bryan C. Mills <bcmills@google.com>
|
|
Change-Id: Ibce07f8f36f7c64f7022ce656f8efbec5dff3f82
Reviewed-on: https://go-review.googlesource.com/c/go/+/313829
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
It is still a common misconception that math/rand can be used for
security-sensitive work if seeded with crypto/rand
(lazyledger/lazyledger-core#270). It cannot.
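For security-sensitive values, crypto/rand can be read directly instead of seeding math/rand from it. The following is a minimal sketch; the helper newToken and the 16-byte length are illustrative assumptions, not part of this change.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newToken returns n bytes from the cryptographically secure source
// provided by crypto/rand, hex-encoded.
func newToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	tok, err := newToken(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(tok)) // 16 bytes encode to 32 hex characters
}
```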
Change-Id: I8598c352d1750eabeada50be9976ab68cbb42cc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/310350
Trust: Filippo Valsorda <filippo@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
Change-Id: I6a4bd2544276d0638bddf07ebcf2ee636db30fea
Reviewed-on: https://go-review.googlesource.com/c/go/+/311009
Run-TryBot: Yury Smolsky <yury@smolsky.by>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
Currently almost all math functions have the following pattern:
    func Sin(x float64) float64
    func sin(x float64) float64 {
        // ... pure Go implementation ...
    }
Architectures that implement a function in assembly provide the
assembly implementation directly as the exported function (e.g., Sin),
and architectures that don't implement it in assembly use a small stub
to jump back to the Go code, like:
    TEXT ·Sin(SB), NOSPLIT, $0
        JMP ·sin(SB)
However, most functions are not implemented in assembly on most
architectures, so this jump through assembly is a waste. It defeats
compiler optimizations like inlining. And, with regabi, it actually
adds a small but non-trivial overhead because the jump from assembly
back to Go must go through an ABI0->ABIInternal bridge function.
Hence, this CL reorganizes this structure across the entire package.
It now leans on inlining to achieve peak performance, but allows the
compiler to see all the way through the pure Go implementation.
Now, functions follow this pattern:
    func Sin(x float64) float64 {
        if haveArchSin {
            return archSin(x)
        }
        return sin(x)
    }
    func sin(x float64) float64 {
        // ... pure Go implementation ...
    }
Architectures that have assembly implementations use build-tagged
files to set haveArchX to true and provide an archX implementation.
That implementation can also still call back into the Go
implementation (some of them do this).
Prior to this change, enabling ABI wrappers results in a geomean
slowdown of the math benchmarks of 8.77% (full results:
https://perf.golang.org/search?q=upload:20210415.6) and of the Tile38
benchmarks by ~4%. After this change, enabling ABI wrappers is
completely performance-neutral on Tile38 and all but one math
benchmark (full results:
https://perf.golang.org/search?q=upload:20210415.7). ABI wrappers slow
down SqrtIndirectLatency-12 by 2.09%, which makes sense because that
call must still go through an ABI wrapper.
With ABI wrappers disabled (which won't be an option on amd64 much
longer), on linux/amd64, this change is largely performance-neutral
and slightly improves the performance of a few benchmarks:
(Because there are so many benchmarks, I've applied the Šidák
correction to the alpha threshold. It makes relatively little
difference in which benchmarks are statistically significant.)
name old time/op new time/op delta
Acos-12 22.3ns ± 0% 18.8ns ± 1% -15.44% (p=0.000 n=18+16)
Acosh-12 28.2ns ± 0% 28.2ns ± 0% ~ (p=0.404 n=18+20)
Asin-12 18.1ns ± 0% 18.2ns ± 0% +0.20% (p=0.000 n=18+16)
Asinh-12 32.8ns ± 0% 32.9ns ± 1% ~ (p=0.891 n=18+20)
Atan-12 9.92ns ± 0% 9.90ns ± 1% -0.24% (p=0.000 n=17+16)
Atanh-12 27.7ns ± 0% 27.5ns ± 0% -0.72% (p=0.000 n=16+20)
Atan2-12 18.5ns ± 0% 18.4ns ± 0% -0.59% (p=0.000 n=19+19)
Cbrt-12 22.1ns ± 0% 22.1ns ± 0% ~ (p=0.804 n=16+17)
Ceil-12 0.84ns ± 0% 0.84ns ± 0% ~ (p=0.663 n=18+16)
Copysign-12 0.84ns ± 0% 0.84ns ± 0% ~ (p=0.762 n=16+19)
Cos-12 12.7ns ± 0% 12.7ns ± 1% ~ (p=0.145 n=19+18)
Cosh-12 22.2ns ± 0% 22.5ns ± 0% +1.60% (p=0.000 n=17+19)
Erf-12 11.1ns ± 1% 11.1ns ± 1% ~ (p=0.010 n=19+19)
Erfc-12 12.6ns ± 1% 12.7ns ± 0% ~ (p=0.066 n=19+15)
Erfinv-12 16.1ns ± 0% 16.1ns ± 0% ~ (p=0.462 n=17+20)
Erfcinv-12 16.0ns ± 1% 16.0ns ± 1% ~ (p=0.015 n=17+16)
Exp-12 16.3ns ± 0% 16.5ns ± 1% +1.25% (p=0.000 n=19+16)
ExpGo-12 36.2ns ± 1% 36.1ns ± 1% ~ (p=0.242 n=20+18)
Expm1-12 18.6ns ± 0% 18.7ns ± 0% +0.25% (p=0.000 n=16+19)
Exp2-12 34.7ns ± 0% 34.6ns ± 1% ~ (p=0.010 n=19+18)
Exp2Go-12 34.8ns ± 1% 34.8ns ± 1% ~ (p=0.372 n=19+19)
Abs-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.766 n=18+16)
Dim-12 0.84ns ± 1% 0.84ns ± 1% ~ (p=0.167 n=17+19)
Floor-12 0.84ns ± 0% 0.84ns ± 0% ~ (p=0.993 n=18+16)
Max-12 3.35ns ± 0% 3.35ns ± 0% ~ (p=0.894 n=17+19)
Min-12 3.35ns ± 0% 3.36ns ± 1% ~ (p=0.214 n=18+18)
Mod-12 35.2ns ± 0% 34.7ns ± 0% -1.45% (p=0.000 n=18+17)
Frexp-12 5.31ns ± 0% 4.75ns ± 0% -10.51% (p=0.000 n=19+18)
Gamma-12 14.8ns ± 0% 16.2ns ± 1% +9.21% (p=0.000 n=20+19)
Hypot-12 6.16ns ± 0% 6.17ns ± 0% +0.26% (p=0.000 n=19+20)
HypotGo-12 7.79ns ± 1% 7.78ns ± 0% ~ (p=0.497 n=18+17)
Ilogb-12 4.47ns ± 0% 4.47ns ± 0% ~ (p=0.167 n=19+19)
J0-12 76.0ns ± 0% 76.3ns ± 0% +0.35% (p=0.000 n=19+18)
J1-12 76.8ns ± 1% 75.9ns ± 0% -1.14% (p=0.000 n=18+18)
Jn-12 167ns ± 1% 168ns ± 1% ~ (p=0.038 n=18+18)
Ldexp-12 6.98ns ± 0% 6.43ns ± 0% -7.97% (p=0.000 n=17+18)
Lgamma-12 15.9ns ± 0% 16.0ns ± 1% ~ (p=0.011 n=20+17)
Log-12 13.3ns ± 0% 13.4ns ± 1% +0.37% (p=0.000 n=15+18)
Logb-12 4.75ns ± 0% 4.75ns ± 0% ~ (p=0.831 n=16+18)
Log1p-12 19.5ns ± 0% 19.5ns ± 1% ~ (p=0.851 n=18+17)
Log10-12 15.9ns ± 0% 14.0ns ± 0% -11.92% (p=0.000 n=17+16)
Log2-12 7.88ns ± 1% 8.01ns ± 0% +1.72% (p=0.000 n=20+20)
Modf-12 4.75ns ± 0% 4.34ns ± 0% -8.66% (p=0.000 n=19+17)
Nextafter32-12 5.31ns ± 0% 5.31ns ± 0% ~ (p=0.389 n=17+18)
Nextafter64-12 5.03ns ± 1% 5.03ns ± 0% ~ (p=0.774 n=17+18)
PowInt-12 29.9ns ± 0% 28.5ns ± 0% -4.69% (p=0.000 n=18+19)
PowFrac-12 91.0ns ± 0% 91.1ns ± 0% ~ (p=0.029 n=19+19)
Pow10Pos-12 1.12ns ± 0% 1.12ns ± 0% ~ (p=0.363 n=20+20)
Pow10Neg-12 3.90ns ± 0% 3.90ns ± 0% ~ (p=0.921 n=17+18)
Round-12 2.31ns ± 0% 2.31ns ± 1% ~ (p=0.390 n=18+18)
RoundToEven-12 0.84ns ± 0% 0.84ns ± 0% ~ (p=0.280 n=18+19)
Remainder-12 31.6ns ± 0% 29.6ns ± 0% -6.16% (p=0.000 n=18+17)
Signbit-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.385 n=19+18)
Sin-12 12.5ns ± 0% 12.5ns ± 0% ~ (p=0.080 n=18+18)
Sincos-12 16.4ns ± 2% 16.4ns ± 2% ~ (p=0.253 n=20+19)
Sinh-12 26.1ns ± 0% 26.1ns ± 0% +0.18% (p=0.000 n=17+19)
SqrtIndirect-12 3.91ns ± 0% 3.90ns ± 0% ~ (p=0.133 n=19+19)
SqrtLatency-12 2.79ns ± 0% 2.79ns ± 0% ~ (p=0.226 n=16+19)
SqrtIndirectLatency-12 6.68ns ± 0% 6.37ns ± 2% -4.66% (p=0.000 n=17+20)
SqrtGoLatency-12 49.4ns ± 0% 49.4ns ± 0% ~ (p=0.289 n=18+16)
SqrtPrime-12 3.18µs ± 0% 3.18µs ± 0% ~ (p=0.084 n=17+18)
Tan-12 13.8ns ± 0% 13.9ns ± 2% ~ (p=0.292 n=19+20)
Tanh-12 25.4ns ± 0% 25.4ns ± 0% ~ (p=0.101 n=17+17)
Trunc-12 0.84ns ± 0% 0.84ns ± 0% ~ (p=0.765 n=18+16)
Y0-12 75.8ns ± 0% 75.9ns ± 1% ~ (p=0.805 n=16+18)
Y1-12 76.3ns ± 0% 75.3ns ± 1% -1.34% (p=0.000 n=19+17)
Yn-12 164ns ± 0% 164ns ± 2% ~ (p=0.356 n=18+20)
Float64bits-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.383 n=18+18)
Float64frombits-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.066 n=18+19)
Float32bits-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.889 n=16+19)
Float32frombits-12 0.56ns ± 0% 0.56ns ± 0% ~ (p=0.007 n=18+19)
FMA-12 23.9ns ± 0% 24.0ns ± 0% +0.31% (p=0.000 n=16+17)
[Geo mean] 9.86ns 9.77ns -0.87%
(https://perf.golang.org/search?q=upload:20210415.5)
For #40724.
Change-Id: I44fbba2a17be930ec9daeb0a8222f55cd50555a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/310331
Trust: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Discovered by Junchen Li on CL 246858, the comparison before p and z are
swapped can be simplified from
    pe < ze || (pe == ze && (pm1 < zm1 || (pm1 == zm1 && pm2 < zm2)))
to
    pe < ze || pe == ze && pm1 < zm1
because zm2 is initialized to 0 before the branch.
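The equivalence can be checked by brute force over a small grid. This sketch assumes the mantissa words are unsigned (uint64) and the exponents are int32, which the message does not state; with zm2 == 0, an unsigned pm2 < zm2 can never be true, so the innermost clause drops out. The function names full, short, and agree are hypothetical.

```go
package main

import "fmt"

// full is the original comparison.
func full(pe, ze int32, pm1, pm2, zm1, zm2 uint64) bool {
	return pe < ze || (pe == ze && (pm1 < zm1 || (pm1 == zm1 && pm2 < zm2)))
}

// short is the simplified comparison, valid only when zm2 == 0.
func short(pe, ze int32, pm1, zm1 uint64) bool {
	return pe < ze || pe == ze && pm1 < zm1
}

// agree compares both forms over a small grid of inputs with zm2 fixed at 0.
func agree() bool {
	for pe := int32(-2); pe <= 2; pe++ {
		for ze := int32(-2); ze <= 2; ze++ {
			for pm1 := uint64(0); pm1 < 4; pm1++ {
				for zm1 := uint64(0); zm1 < 4; zm1++ {
					for pm2 := uint64(0); pm2 < 4; pm2++ {
						if full(pe, ze, pm1, pm2, zm1, 0) != short(pe, ze, pm1, zm1) {
							return false
						}
					}
				}
			}
		}
	}
	return true
}

func main() {
	fmt.Println(agree())
}
```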
Change-Id: Iee92d570038df2b0f8941ef6e422a022654ab2d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/247241
Run-TryBot: Akhil Indurti <aindurti@gmail.com>
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
I don't know why the test requires runtime.(*Frame).Next symbol
present in the binary under test. I assume it is just some
sanity check? With CL 268479 runtime.(*Frame).Next can be pruned
by the linker. Replace it with runtime.main which should always
be present.
May fix the longtest builders.
Change-Id: Id3104c058b2786057ff58be41b1d35aeac2f3073
Reviewed-on: https://go-review.googlesource.com/c/go/+/304431
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
linux/mips64le
name old time/op new time/op delta
Reverse 2.31ns ± 1% 2.27ns ± 1% -1.53% (p=0.001 n=10+10)
Reverse8 0.65ns ± 1% 0.65ns ± 1% -1.19% (p=0.000 n=9+10)
Reverse16 1.15ns ± 2% 1.14ns ± 2% ~ (p=0.062 n=9+10)
Reverse32 1.96ns ± 1% 1.94ns ± 1% -1.16% (p=0.000 n=10+9)
Reverse64 2.29ns ± 1% 2.26ns ± 0% -0.94% (p=0.000 n=9+9)
ReverseBytes 0.66ns ± 3% 0.65ns ± 1% -1.58% (p=0.006 n=9+10)
ReverseBytes16 0.66ns ± 2% 0.65ns ± 1% -2.05% (p=0.000 n=10+9)
ReverseBytes32 0.41ns ± 1% 0.40ns ± 0% -1.68% (p=0.000 n=10+10)
ReverseBytes64 0.66ns ± 1% 0.65ns ± 1% -1.50% (p=0.000 n=10+9)
cpu=1 benchtime=100ms count=100
name old time/op new time/op delta
Reverse 28.0ns ± 3% 27.7ns ± 3% -0.80% (p=0.000 n=100+98)
Reverse8 2.24ns ± 1% 2.24ns ± 1% ~ (p=0.142 n=98+100)
Reverse16 4.07ns ± 3% 4.05ns ± 3% -0.66% (p=0.000 n=99+99)
Reverse32 11.3ns ± 0% 11.3ns ± 0% ~ (p=0.283 n=94+97)
Reverse64 12.6ns ± 0% 12.6ns ± 0% +0.60% (p=0.000 n=100+98)
ReverseBytes 5.25ns ± 1% 5.24ns ± 1% -0.18% (p=0.000 n=100+100)
ReverseBytes16 2.00ns ± 0% 2.21ns ± 3% +10.07% (p=0.000 n=88+100)
ReverseBytes32 4.08ns ± 2% 4.13ns ± 2% +1.39% (p=0.000 n=99+99)
ReverseBytes64 5.48ns ± 1% 5.45ns ± 1% -0.50% (p=0.000 n=98+99)
Update #43403
Change-Id: I7e7e00bb17608739d9f6b927c6dfef2580493a0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/280645
Trust: Meng Zhuo <mzh@golangcn.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Change-Id: Id67d6ac856bd9271de99c3381bde910aa0c166e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/296011
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Make explicit a shrVU_g precondition.
Replace i with i+1 throughout the loop.
The resulting loop is functionally identical,
but the compiler can do better BCE without the i-1 slice offset.
Benchmarks results on amd64 with -tags=math_big_pure_go.
name old time/op new time/op delta
NonZeroShifts/1/shrVU-8 4.55ns ± 2% 4.45ns ± 3% -2.27% (p=0.000 n=28+30)
NonZeroShifts/1/shlVU-8 4.07ns ± 1% 4.13ns ± 4% +1.55% (p=0.000 n=26+29)
NonZeroShifts/2/shrVU-8 6.12ns ± 1% 5.55ns ± 1% -9.30% (p=0.000 n=28+28)
NonZeroShifts/2/shlVU-8 5.65ns ± 3% 5.70ns ± 2% +0.92% (p=0.008 n=30+29)
NonZeroShifts/3/shrVU-8 7.58ns ± 2% 6.79ns ± 2% -10.46% (p=0.000 n=28+28)
NonZeroShifts/3/shlVU-8 6.62ns ± 2% 6.69ns ± 1% +1.07% (p=0.000 n=29+28)
NonZeroShifts/4/shrVU-8 9.02ns ± 1% 7.79ns ± 2% -13.59% (p=0.000 n=27+30)
NonZeroShifts/4/shlVU-8 7.74ns ± 1% 7.82ns ± 1% +0.92% (p=0.000 n=26+28)
NonZeroShifts/5/shrVU-8 10.6ns ± 1% 8.9ns ± 3% -16.31% (p=0.000 n=25+29)
NonZeroShifts/5/shlVU-8 8.59ns ± 1% 8.68ns ± 1% +1.13% (p=0.000 n=27+29)
NonZeroShifts/10/shrVU-8 18.2ns ± 2% 14.4ns ± 1% -20.96% (p=0.000 n=27+28)
NonZeroShifts/10/shlVU-8 14.1ns ± 1% 14.1ns ± 1% +0.46% (p=0.001 n=26+28)
NonZeroShifts/100/shrVU-8 161ns ± 2% 118ns ± 1% -26.83% (p=0.000 n=29+30)
NonZeroShifts/100/shlVU-8 119ns ± 2% 120ns ± 2% +0.92% (p=0.000 n=29+29)
NonZeroShifts/1000/shrVU-8 1.54µs ± 1% 1.10µs ± 1% -28.63% (p=0.000 n=29+29)
NonZeroShifts/1000/shlVU-8 1.10µs ± 1% 1.10µs ± 2% ~ (p=0.701 n=28+29)
NonZeroShifts/10000/shrVU-8 15.3µs ± 2% 10.9µs ± 1% -28.68% (p=0.000 n=28+28)
NonZeroShifts/10000/shlVU-8 10.9µs ± 2% 10.9µs ± 2% -0.57% (p=0.003 n=26+29)
NonZeroShifts/100000/shrVU-8 154µs ± 1% 111µs ± 2% -28.04% (p=0.000 n=27+28)
NonZeroShifts/100000/shlVU-8 113µs ± 2% 113µs ± 2% ~ (p=0.790 n=30+30)
Change-Id: Ib6a621ee7c88b27f0f18121fb2cba3606c40c9b0
Reviewed-on: https://go-review.googlesource.com/c/go/+/297049
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Add a generic rule that rewrites a single-precision square root expression
into one single-precision instruction. The optimization eliminates the two
conversions between double precision and single precision.
On the arm64 platform:
previous:
    FCVTSD F0, F0
    FSQRTD F0, F0
    FCVTDS F0, F0
optimized:
    FSQRTS S0, S0
This patch also adds a test case to check correctness.
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
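For context, a single-precision square root is usually written in Go by round-tripping through float64. The sketch below assumes that this is the source expression the rule targets; the helper name sqrt32 is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// sqrt32 computes a single-precision square root through the float64
// API: widen to float64, take the square root, narrow back to float32.
func sqrt32(x float32) float32 {
	return float32(math.Sqrt(float64(x)))
}

func main() {
	fmt.Println(sqrt32(4), sqrt32(2.25))
}
```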
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The Surface Pro X's 386 simulator is not completely faithful to a real 387.
The most egregious problem is that it computes Log2(8) as 2.9999999999999996,
but it has some other subtler problems as well. All the problems occur in
routines that we don't even bother with assembly for on amd64.
If the speed of Go code is OK on amd64 it should be OK on 386 too.
Just remove all the 386-only assembly functions.
This leaves Ceil, Floor, Trunc, Hypot, and Sqrt in 386 assembly,
all of which are also in assembly on amd64 and all of which pass
their tests on Surface Pro X.
Compared to amd64, the 386 port omits assembly for Min, Max, and Log.
It never had Min and Max, and this CL deletes Log because Log2 wasn't
even correct. (None of the other architectures have assembly Log either.)
Change-Id: I5eb6c61084467035269d4098a36001447b7a0601
Reviewed-on: https://go-review.googlesource.com/c/go/+/291229
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
There appears to be a typo in the description of
the recursive division algorithm.
Two things seem suspicious with the original comment:
1. It is talking about choosing s, but s doesn't
appear anywhere in the equation.
2. The math in the equation is incorrect.
Where
    B = len(v)/2
    s = B - 1
Proof that it is incorrect:
    len(v) - B >= B + 1
    len(v) - len(v)/2 >= len(v)/2 + 1
This doesn't hold if len(v) is even, e.g. 10:
    10 - 10/2 >= 10/2 + 1
    10 - 5 >= 5 + 1
    5 >= 6 // this is false
The new equation will be the following,
which will be mathematically correct:
    len(v) - s >= B + 1
    len(v) - (len(v)/2 - 1) >= len(v)/2 + 1
    len(v) - len(v)/2 + 1 >= len(v)/2 + 1
    len(v) - len(v)/2 >= len(v)/2
This holds if len(v) is even or odd.
e.g. 10
    10 - 10/2 >= 10/2
    10 - 5 >= 5
    5 >= 5
e.g. 11
    11 - 11/2 >= 11/2
    11 - 5 >= 5
    6 >= 5
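The same arithmetic can be run mechanically, with Go's integer division standing in for len(v)/2. The function names oldHolds and newHolds are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// oldHolds evaluates the original inequality len(v) - B >= B + 1
// with B = len(v)/2, for len(v) = n.
func oldHolds(n int) bool {
	B := n / 2
	return n-B >= B+1
}

// newHolds evaluates the corrected inequality len(v) - s >= B + 1
// with B = len(v)/2 and s = B - 1, for len(v) = n.
func newHolds(n int) bool {
	B := n / 2
	s := B - 1
	return n-s >= B+1
}

func main() {
	for _, n := range []int{10, 11} {
		fmt.Println(n, oldHolds(n), newHolds(n))
	}
	// Prints:
	// 10 false true
	// 11 true true
}
```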
Change-Id: If77ce09286cf7038637b5dfd0fb7d4f828023f56
Reviewed-on: https://go-review.googlesource.com/c/go/+/287372
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
Trust: Katie Hockman <katie@golang.org>
|
|
"it does not necessary" -> "it is not necessary"
Change-Id: I66f9cf2670d76b3686badb4a537b3ec084447d62
GitHub-Last-Rev: 52a0f9993abf25369cdb6b31eaf476df1626cf87
GitHub-Pull-Request: golang/go#43935
Reviewed-on: https://go-review.googlesource.com/c/go/+/287052
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Robert Griesemer <gri@golang.org>
|
|
Change-Id: I57fbabf272bdfd61918db155ee6f7091f18e5979
GitHub-Last-Rev: e138804b1ab8086b3742861873b077d6cca8108a
GitHub-Pull-Request: golang/go#43495
Reviewed-on: https://go-review.googlesource.com/c/go/+/281373
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
|
|
The vulnerability that allowed this panic is
CVE-2020-28362 and has been fixed in a security
release, per #42552.
Change-Id: I774bcda2cc83cdd5a273d21c8d9f4b53fa17c88f
Reviewed-on: https://go-review.googlesource.com/c/go/+/277959
Run-TryBot: Katie Hockman <katie@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Katie Hockman <katie@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
|
|
As part of #42026, these helpers from io/ioutil were moved to os.
(ioutil.TempFile and TempDir became os.CreateTemp and MkdirTemp.)
Update the Go tree to use the preferred names.
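A short sketch of the preferred names in use. The helper writeScratch and the name patterns are illustrative assumptions; os.MkdirTemp, os.CreateTemp, and os.ReadFile are the Go 1.16 replacements for the ioutil helpers.

```go
package main

import (
	"fmt"
	"os"
)

// writeScratch creates a temporary directory and a temporary file in it,
// writes data to the file, reads it back, and removes the directory.
func writeScratch(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "example-*") // formerly ioutil.TempDir
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	f, err := os.CreateTemp(dir, "scratch-*.txt") // formerly ioutil.TempFile
	if err != nil {
		return nil, err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}
	return os.ReadFile(f.Name())
}

func main() {
	got, err := writeScratch([]byte("hello"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(got))
}
```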
As usual, code compiled with the Go 1.4 bootstrap toolchain
and code vendored from other sources is excluded.
ReadDir changes are in a separate CL, because they are not a
simple search and replace.
For #42026.
Change-Id: If318df0216d57e95ea0c4093b89f65e5b0ababb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/266365
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The s390x assembly for shlVU does a forward copy when the shift amount s
is 0. This causes corruption of the result z when z is aliased to the
input x.
This fix removes the s390x assembly for both shlVU and shrVU so the pure
Go implementations will be used.
Test cases have been added to the existing TestShiftOverlap test to
cover shift values of 0, 1 and (_W - 1).
Fixes #42838
Change-Id: I75ca0e98f3acfaa6366a26355dcd9dd82499a48b
Reviewed-on: https://go-review.googlesource.com/c/go/+/274442
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Trust: Robert Griesemer <gri@golang.org>
|
|
The previous s value could cause a crash
for certain inputs.
Will check in tests and documentation improvements later.
Thanks to the Go Ethereum team and the OSS-Fuzz project for reporting this.
Thanks to Rémy Oudompheng and Robert Griesemer for their help
developing and validating the fix.
Fixes CVE-2020-28362
Change-Id: Ibbf455c4436bcdb07c84a34fa6551fb3422356d3
Reviewed-on: https://team-review.git.corp.google.com/c/golang/go-private/+/899974
Reviewed-by: Roland Shoemaker <bracewell@google.com>
Reviewed-by: Filippo Valsorda <valsorda@google.com>
Reviewed-on: https://go-review.googlesource.com/c/go/+/269657
Trust: Katie Hockman <katie@golang.org>
Trust: Roland Shoemaker <roland@golang.org>
Run-TryBot: Katie Hockman <katie@golang.org>
Reviewed-by: Roland Shoemaker <roland@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Append operations in the decimal String function may cause several allocations.
Use make to preallocate slices in String with enough capacity to avoid
additional allocations in append operations.
name old time/op new time/op delta
DecimalConversion-8 139µs ± 7% 109µs ± 2% -21.06% (p=0.000 n=10+10)
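The general effect of preallocation can be seen by counting how often append has to reallocate. This is a generic sketch, not the code from the decimal String function; the helper countGrowths is hypothetical.

```go
package main

import "fmt"

// countGrowths appends n bytes to buf and returns how many times the
// capacity changed, i.e. how often append had to reallocate.
func countGrowths(buf []byte, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(buf)
		buf = append(buf, byte('0'+i%10))
		if cap(buf) != before {
			growths++
		}
	}
	return growths
}

func main() {
	// Starting from a nil slice, append must grow the backing array.
	fmt.Println(countGrowths(nil, 1000) > 0)
	// With enough capacity reserved up front, no growth is needed.
	fmt.Println(countGrowths(make([]byte, 0, 1000), 1000))
}
```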
Change-Id: Id0284d204918a179a0421c51c35d86a3408e1bd9
Reviewed-on: https://go-review.googlesource.com/c/go/+/233980
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Martin Möhrmann <moehrmann@google.com>
|
|
Division is much slower than multiplication. Replacing the division with
a multiplication by the reciprocal can speed up the divWVW algorithm by a
factor of three, and at the same time speeds up nat division.
The benchmark test on arm64 is as follows:
name old time/op new time/op delta
DivWVW/1-4 13.1ns ± 4% 13.3ns ± 4% ~ (p=0.444 n=5+5)
DivWVW/2-4 48.6ns ± 1% 51.2ns ± 2% +5.39% (p=0.008 n=5+5)
DivWVW/3-4 82.0ns ± 1% 69.7ns ± 1% -15.03% (p=0.008 n=5+5)
DivWVW/4-4 116ns ± 1% 71ns ± 2% -38.88% (p=0.008 n=5+5)
DivWVW/5-4 152ns ± 1% 84ns ± 4% -44.70% (p=0.008 n=5+5)
DivWVW/10-4 319ns ± 1% 155ns ± 4% -51.50% (p=0.008 n=5+5)
DivWVW/100-4 3.44µs ± 3% 1.30µs ± 8% -62.30% (p=0.008 n=5+5)
DivWVW/1000-4 33.8µs ± 0% 10.9µs ± 1% -67.74% (p=0.008 n=5+5)
DivWVW/10000-4 343µs ± 4% 111µs ± 5% -67.63% (p=0.008 n=5+5)
DivWVW/100000-4 3.35ms ± 1% 1.25ms ± 3% -62.79% (p=0.008 n=5+5)
QuoRem-4 3.08µs ± 2% 2.21µs ± 4% -28.40% (p=0.008 n=5+5)
ModSqrt225_Tonelli-4 444µs ± 2% 457µs ± 3% ~ (p=0.095 n=5+5)
ModSqrt225_3Mod4-4 136µs ± 1% 138µs ± 3% ~ (p=0.151 n=5+5)
ModSqrt231_Tonelli-4 473µs ± 3% 483µs ± 4% ~ (p=0.548 n=5+5)
ModSqrt231_5Mod8-4 164µs ± 9% 169µs ±12% ~ (p=0.421 n=5+5)
Sqrt-4 36.8µs ± 1% 28.6µs ± 0% -22.17% (p=0.016 n=5+4)
Div/20/10-4 50.0ns ± 3% 51.3ns ± 6% ~ (p=0.238 n=5+5)
Div/40/20-4 49.8ns ± 2% 51.3ns ± 6% ~ (p=0.222 n=5+5)
Div/100/50-4 85.8ns ± 4% 86.5ns ± 5% ~ (p=0.246 n=5+5)
Div/200/100-4 335ns ± 3% 296ns ± 2% -11.60% (p=0.008 n=5+5)
Div/400/200-4 442ns ± 2% 359ns ± 5% -18.81% (p=0.008 n=5+5)
Div/1000/500-4 858ns ± 3% 643ns ± 6% -25.06% (p=0.008 n=5+5)
Div/2000/1000-4 1.70µs ± 3% 1.28µs ± 4% -24.80% (p=0.008 n=5+5)
Div/20000/10000-4 45.0µs ± 5% 41.8µs ± 4% -7.17% (p=0.016 n=5+5)
Div/200000/100000-4 1.51ms ± 7% 1.43ms ± 3% -5.42% (p=0.016 n=5+5)
Div/2000000/1000000-4 57.6ms ± 4% 57.5ms ± 3% ~ (p=1.000 n=5+5)
Div/20000000/10000000-4 2.08s ± 3% 2.04s ± 1% ~ (p=0.095 n=5+5)
name old speed new speed delta
DivWVW/1-4 4.87GB/s ± 4% 4.80GB/s ± 4% ~ (p=0.310 n=5+5)
DivWVW/2-4 2.63GB/s ± 1% 2.50GB/s ± 2% -5.07% (p=0.008 n=5+5)
DivWVW/3-4 2.34GB/s ± 1% 2.76GB/s ± 1% +17.70% (p=0.008 n=5+5)
DivWVW/4-4 2.21GB/s ± 1% 3.61GB/s ± 2% +63.42% (p=0.008 n=5+5)
DivWVW/5-4 2.10GB/s ± 2% 3.81GB/s ± 4% +80.89% (p=0.008 n=5+5)
DivWVW/10-4 2.01GB/s ± 0% 4.13GB/s ± 4% +105.91% (p=0.008 n=5+5)
DivWVW/100-4 1.86GB/s ± 2% 4.95GB/s ± 7% +165.63% (p=0.008 n=5+5)
DivWVW/1000-4 1.89GB/s ± 0% 5.86GB/s ± 1% +209.96% (p=0.008 n=5+5)
DivWVW/10000-4 1.87GB/s ± 4% 5.76GB/s ± 5% +208.96% (p=0.008 n=5+5)
DivWVW/100000-4 1.91GB/s ± 1% 5.14GB/s ± 3% +168.85% (p=0.008 n=5+5)
Change-Id: I049f1196562b20800e6ef8a6493fd147f93ad830
Reviewed-on: https://go-review.googlesource.com/c/go/+/250417
Trust: Giovanni Bajo <rasky@develer.com>
Trust: Keith Randall <khr@golang.org>
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Use the const variable Ln2 in math/const.go for function acosh.
Change-Id: I5381d03dd3142c227ae5773ece9be6c8f377615e
Reviewed-on: https://go-review.googlesource.com/c/go/+/232517
Reviewed-by: Robert Griesemer <gri@golang.org>
Trust: Robert Griesemer <gri@golang.org>
Trust: Giovanni Bajo <rasky@develer.com>
|
|
Add an optimization for addVW and subVW over large vectors: once the carry
is exhausted, switch from add/sub with carry to copying the rest of the
vector. Consistent performance improvements are observed on various arm64
machines.
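The idea can be sketched in pure Go. This is not the arm64 assembly from the CL; it is a hypothetical addVW over uint64 words (least significant word first) that requires len(z) >= len(x) and stops doing carry arithmetic as soon as the carry becomes zero.

```go
package main

import (
	"fmt"
	"math/bits"
)

// addVW sets z = x + y for a multi-word x and a single word y and
// returns the final carry. Once the carry is zero, the remaining words
// of x are copied to z unchanged instead of being added one by one.
func addVW(z, x []uint64, y uint64) (c uint64) {
	c = y
	for i := range x {
		if c == 0 {
			copy(z[i:], x[i:])
			return 0
		}
		z[i], c = bits.Add64(x[i], c, 0)
	}
	return c
}

func main() {
	max := ^uint64(0)
	x := []uint64{max, max, 5, 7}
	z := make([]uint64, len(x))
	c := addVW(z, x, 1)
	fmt.Println(z, c) // [0 0 6 7] 0
}
```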
Add additional tests and benchmarks to increase the test coverage.
TestFunVWExt:
Tests various kinds of input vectors, using the results of the Go version
of addVW/subVW as the golden reference.
BenchmarkAddVWext and BenchmarkSubVWext:
Benchmark input vectors of all 1s or all 0s, to evaluate the
worst-case overhead.
1. Perf. comparison over randomly generated input vectors:
Server 1:
name old time/op new time/op delta
AddVW/1 12.3ns ± 3% 12.0ns ± 0% -2.60% (p=0.001 n=10+8)
AddVW/2 12.5ns ± 2% 12.3ns ± 0% -1.84% (p=0.001 n=10+8)
AddVW/3 12.6ns ± 2% 12.3ns ± 0% -1.91% (p=0.009 n=10+10)
AddVW/4 13.1ns ± 3% 12.7ns ± 0% -2.98% (p=0.006 n=10+8)
AddVW/5 14.4ns ± 1% 13.9ns ± 0% -3.81% (p=0.000 n=10+10)
AddVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal)
AddVW/100 47.8ns ± 0% 29.9ns ± 2% -37.38% (p=0.000 n=10+9)
AddVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10)
AddVW/10000 4.35µs ± 1% 2.92µs ± 0% -32.85% (p=0.000 n=10+10)
AddVW/100000 43.6µs ± 0% 29.7µs ± 0% -31.92% (p=0.000 n=8+10)
SubVW/1 12.6ns ± 0% 12.3ns ± 2% -2.22% (p=0.000 n=7+10)
SubVW/2 12.7ns ± 0% 12.6ns ± 1% -0.39% (p=0.046 n=8+10)
SubVW/3 12.7ns ± 1% 12.6ns ± 1% ~ (p=0.410 n=10+10)
SubVW/4 13.3ns ± 3% 13.1ns ± 3% ~ (p=0.072 n=10+10)
SubVW/5 14.2ns ± 0% 14.1ns ± 1% -0.63% (p=0.046 n=8+10)
SubVW/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal)
SubVW/100 47.8ns ± 0% 33.1ns ±19% -30.71% (p=0.000 n=10+10)
SubVW/1000 446ns ± 0% 207ns ± 0% -53.59% (p=0.000 n=10+10)
SubVW/10000 4.33µs ± 1% 2.92µs ± 0% -32.66% (p=0.000 n=10+6)
SubVW/100000 43.4µs ± 0% 29.6µs ± 0% -31.90% (p=0.000 n=10+9)
Server 2:
name old time/op new time/op delta
AddVW/1 5.49ns ± 0% 5.53ns ± 2% ~ (p=1.000 n=9+10)
AddVW/2 5.96ns ± 2% 5.92ns ± 1% -0.69% (p=0.039 n=10+10)
AddVW/3 6.72ns ± 0% 6.73ns ± 0% ~ (p=0.078 n=10+10)
AddVW/4 7.07ns ± 0% 6.75ns ± 2% -4.55% (p=0.000 n=10+10)
AddVW/5 8.14ns ± 0% 8.17ns ± 0% +0.46% (p=0.003 n=8+8)
AddVW/10 10.0ns ± 0% 10.1ns ± 1% +0.70% (p=0.003 n=10+10)
AddVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=9+9)
AddVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10)
AddVW/10000 4.18µs ± 0% 3.14µs ± 0% -24.81% (p=0.000 n=8+8)
AddVW/100000 68.3µs ± 3% 62.1µs ± 5% -9.13% (p=0.000 n=10+10)
SubVW/1 5.37ns ± 2% 5.42ns ± 1% ~ (p=0.990 n=10+10)
SubVW/2 5.89ns ± 0% 5.92ns ± 1% +0.58% (p=0.000 n=8+10)
SubVW/3 6.64ns ± 1% 6.82ns ± 3% +2.63% (p=0.000 n=9+10)
SubVW/4 7.17ns ± 0% 6.69ns ± 2% -6.74% (p=0.000 n=10+9)
SubVW/5 8.22ns ± 0% 8.18ns ± 0% -0.46% (p=0.001 n=8+9)
SubVW/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.341 n=10+10)
SubVW/100 43.0ns ± 0% 33.5ns ± 0% -22.09% (p=0.000 n=7+10)
SubVW/1000 394ns ± 0% 278ns ± 0% -29.44% (p=0.000 n=10+10)
SubVW/10000 4.18µs ± 0% 3.15µs ± 0% -24.62% (p=0.000 n=9+9)
SubVW/100000 67.7µs ± 4% 62.4µs ± 2% -7.92% (p=0.000 n=10+10)
2. Perf. comparison over input vectors of all 1s or all 0s:
Server 1:
name old time/op new time/op delta
AddVWext/1 12.6ns ± 0% 12.0ns ± 0% -4.76% (p=0.000 n=6+10)
AddVWext/2 12.7ns ± 0% 12.4ns ± 1% -2.52% (p=0.000 n=10+10)
AddVWext/3 12.7ns ± 0% 12.4ns ± 0% -2.36% (p=0.000 n=9+7)
AddVWext/4 13.2ns ± 4% 12.7ns ± 0% -3.71% (p=0.001 n=10+9)
AddVWext/5 14.6ns ± 0% 13.9ns ± 0% -4.79% (p=0.000 n=10+8)
AddVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal)
AddVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10)
AddVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10)
AddVWext/10000 4.34µs ± 1% 3.90µs ± 0% -10.12% (p=0.000 n=10+10)
AddVWext/100000 43.9µs ± 1% 39.4µs ± 0% -10.18% (p=0.000 n=10+10)
SubVWext/1 12.6ns ± 0% 12.3ns ± 2% -2.70% (p=0.000 n=7+10)
SubVWext/2 12.6ns ± 1% 12.6ns ± 2% ~ (p=0.234 n=10+10)
SubVWext/3 12.7ns ± 0% 12.6ns ± 2% -0.71% (p=0.033 n=10+10)
SubVWext/4 13.4ns ± 0% 13.1ns ± 3% -2.01% (p=0.006 n=8+10)
SubVWext/5 14.2ns ± 0% 14.1ns ± 1% -0.85% (p=0.003 n=10+10)
SubVWext/10 11.7ns ± 0% 11.7ns ± 0% ~ (all equal)
SubVWext/100 47.8ns ± 0% 47.4ns ± 0% -0.84% (p=0.000 n=10+10)
SubVWext/1000 446ns ± 0% 399ns ± 0% -10.54% (p=0.000 n=10+10)
SubVWext/10000 4.33µs ± 1% 3.90µs ± 0% -10.02% (p=0.000 n=10+10)
SubVWext/100000 43.5µs ± 0% 39.5µs ± 1% -9.16% (p=0.000 n=7+10)
Server 2:
name old time/op new time/op delta
AddVWext/1 5.48ns ± 0% 5.43ns ± 1% -0.97% (p=0.000 n=9+9)
AddVWext/2 5.99ns ± 2% 5.93ns ± 1% ~ (p=0.054 n=10+10)
AddVWext/3 6.74ns ± 0% 6.79ns ± 1% +0.80% (p=0.000 n=9+10)
AddVWext/4 7.18ns ± 0% 7.21ns ± 1% +0.36% (p=0.034 n=9+10)
AddVWext/5 7.93ns ± 3% 8.18ns ± 0% +3.18% (p=0.000 n=10+8)
AddVWext/10 10.0ns ± 0% 10.1ns ± 1% +0.60% (p=0.011 n=10+10)
AddVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=9+10)
AddVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10)
AddVWext/10000 4.18µs ± 0% 4.50µs ± 0% +7.73% (p=0.000 n=9+10)
AddVWext/100000 67.6µs ± 2% 68.4µs ± 3% ~ (p=0.139 n=9+8)
SubVWext/1 5.46ns ± 1% 5.43ns ± 0% -0.55% (p=0.002 n=9+9)
SubVWext/2 5.89ns ± 0% 5.93ns ± 1% +0.68% (p=0.000 n=8+10)
SubVWext/3 6.72ns ± 1% 6.79ns ± 1% +1.07% (p=0.000 n=10+10)
SubVWext/4 6.98ns ± 1% 7.21ns ± 0% +3.25% (p=0.000 n=10+10)
SubVWext/5 8.22ns ± 0% 7.99ns ± 3% -2.83% (p=0.000 n=8+10)
SubVWext/10 10.0ns ± 1% 10.1ns ± 1% ~ (p=0.239 n=10+10)
SubVWext/100 43.0ns ± 0% 47.7ns ± 0% +10.93% (p=0.000 n=8+10)
SubVWext/1000 394ns ± 0% 399ns ± 0% +1.27% (p=0.000 n=10+10)
SubVWext/10000 4.18µs ± 0% 4.51µs ± 0% +7.86% (p=0.000 n=8+8)
SubVWext/100000 68.3µs ± 2% 68.0µs ± 3% ~ (p=0.515 n=10+8)
Change-Id: I134a5194b8a2deaaebbaa2b771baf72846971d58
Reviewed-on: https://go-review.googlesource.com/c/go/+/229739
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Change-Id: I9ff5d1767cf70648c2251268e5e815944a7cb371
Reviewed-on: https://go-review.googlesource.com/c/go/+/233737
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
While reading the source code of the math/big package, I found that the Float type has no example for SetString.
Change-Id: Id8c16a58e2e24f9463e8ff38adbc98f8c418ab26
Reviewed-on: https://go-review.googlesource.com/c/go/+/232804
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Don't overwrite incoming test data.
The change uses copy instead of an assignment statement to avoid this.
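The difference between assigning a slice and copying it can be sketched as follows. The function names are hypothetical and the code is not taken from the test being fixed.

```go
package main

import "fmt"

// incrementAliased assigns the slice, so out shares in's backing array
// and every write also overwrites the caller's data.
func incrementAliased(in []uint) []uint {
	out := in
	for i := range out {
		out[i]++
	}
	return out
}

// incrementCopied copies the elements first, leaving in untouched.
func incrementCopied(in []uint) []uint {
	out := make([]uint, len(in))
	copy(out, in)
	for i := range out {
		out[i]++
	}
	return out
}

func main() {
	a := []uint{1, 2, 3}
	incrementAliased(a)
	fmt.Println(a) // [2 3 4]: the input was overwritten

	b := []uint{1, 2, 3}
	incrementCopied(b)
	fmt.Println(b) // [1 2 3]: the input is preserved
}
```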
Change-Id: Ib907101822d811de5c45145cb9d7961907e212c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/250137
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
This changes the assembly implementation on ppc64x
to improve performance by reordering some instructions.
It also eliminates an unnecessary move by changing an
ADDZE to use the correct target register.
Improvement on power9:
name old time/op new time/op delta
MulAddVWW/1 6.89ns ± 0% 7.30ns ± 0% +5.95% (p=1.000 n=1+1)
MulAddVWW/2 8.04ns ± 0% 8.06ns ± 0% +0.25% (p=1.000 n=1+1)
MulAddVWW/3 9.39ns ± 0% 9.39ns ± 0% ~ (all equal)
MulAddVWW/4 9.76ns ± 0% 9.48ns ± 0% -2.87% (p=1.000 n=1+1)
MulAddVWW/5 10.5ns ± 0% 10.3ns ± 0% -1.90% (p=1.000 n=1+1)
MulAddVWW/10 15.4ns ± 0% 14.9ns ± 0% -3.25% (p=1.000 n=1+1)
MulAddVWW/100 149ns ± 0% 125ns ± 0% -16.11% (p=1.000 n=1+1)
MulAddVWW/1000 1.42µs ± 0% 1.28µs ± 0% -9.74% (p=1.000 n=1+1)
MulAddVWW/10000 14.2µs ± 0% 12.8µs ± 0% -9.73% (p=1.000 n=1+1)
MulAddVWW/100000 144µs ± 0% 129µs ± 0% -10.10% (p=1.000 n=1+1)
Change-Id: I0ae7002a69783ca19d7a4e3e42042ae75dc60069
Reviewed-on: https://go-review.googlesource.com/c/go/+/248721
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
|
|
Simplify some code without compromising performance.
Test environment: Intel Xeon Gold 6161 CPU at 2.20GHz, 8GB of memory,
64-bit operating system.
Benchmark:
name old time/op new time/op delta
Log1p-4 21.8ns ± 5% 21.8ns ± 4% ~ (p=0.973 n=20+20)
Change-Id: Icd8f96f1325b00007602d114300b92d4c57de409
Reviewed-on: https://go-review.googlesource.com/c/go/+/233940
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Change-Id: Ie5fd026af45d2e7bc371a38d15dbb52a1b4958cd
Reviewed-on: https://go-review.googlesource.com/c/go/+/235717
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Updates #38850
Change-Id: I33f48762f5520eb0c0a841d8ca1ccdd65ecc20c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/234583
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Replaced almost every use of Bytes with FillBytes.
Note that the approved proposal was for
func (*Int) FillBytes(buf []byte)
while this implements
func (*Int) FillBytes(buf []byte) []byte
because the latter was far nicer to use in all callsites.
Fixes #35833
Change-Id: Ia912df123e5d79b763845312ea3d9a8051343c0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/230397
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Change-Id: If34422859d47bc8f44974a00c6b7908e7655ff41
Reviewed-on: https://go-review.googlesource.com/c/go/+/223561
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
The function Modf lacks corresponding examples.
Change-Id: Id93423500e87d35b0b6870882be1698b304797ae
Reviewed-on: https://go-review.googlesource.com/c/go/+/231097
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Implement special case handling and testing to ensure
conformance with the C99 standard annex G.6 Complex arithmetic.
Fixes #29320
Change-Id: Id72eb4c5a35d5a54b4b8690d2f7176ab11028f1b
Reviewed-on: https://go-review.googlesource.com/c/go/+/220689
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
While browsing the source code, I noticed that this function has no example. I am not sure whether one is needed; this is my first CL.
Change-Id: Idbf4e1e1ed2995176a76959d561e152263a2fd26
Reviewed-on: https://go-review.googlesource.com/c/go/+/230741
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Originally, we used an assembly function that returned a boolean result to
tell whether the machine has the vector facility. It is no longer
needed now that we can use the cpu.S390X.HasVX variable directly.
Change-Id: Ic1dae851982532bcfd9a9453416c112347f21d87
Reviewed-on: https://go-review.googlesource.com/c/go/+/230318
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Originally, we used an assembly function that returned a boolean result to
tell whether the machine has the vector facility. It is no longer
needed now that we can use the cpu.S390X.HasVX variable directly.
Change-Id: Ic3ffeb9e63238ef41406d97cdc42502145ddb454
Reviewed-on: https://go-review.googlesource.com/c/go/+/230319
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Everywhere else has used "cancellation" since 2019.
The reasoning is given in 170060.
> Though there is variation in the spelling of canceled,
> cancellation is always spelled with a double l.
>
> Reference: https://www.grammarly.com/blog/canceled-vs-cancelled/
Change-Id: I933ea68d7251986ce582b92c33b7cb13cee1d207
GitHub-Last-Rev: fc3d5ada2bd0087ea9cfb3f105689876e7a2ee4f
GitHub-Pull-Request: golang/go#38661
Reviewed-on: https://go-review.googlesource.com/c/go/+/230199
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|