path: root/src/runtime/cpuflags.go
Age | Commit message | Author
2020-03-26 | runtime: improve MIPS64x memclr | Meng Zhuo

Using MIPS MSA VLD/VST to improve mips64x large memclr.

name            old time/op    new time/op    delta
Memclr/5        23.2ns ± 0%    21.5ns ± 0%     -7.33%  (p=0.000 n=9+8)
Memclr/16       20.1ns ± 0%    17.1ns ± 0%    -14.93%  (p=0.000 n=10+10)
Memclr/64       27.2ns ± 0%    19.1ns ± 0%    -29.70%  (p=0.000 n=9+9)
Memclr/256      76.8ns ± 0%    24.1ns ± 0%    -68.66%  (p=0.000 n=10+10)
Memclr/4096     1.12µs ± 1%    0.18µs ± 0%    -84.32%  (p=0.000 n=10+8)
Memclr/65536    18.0µs ± 0%     2.8µs ± 0%    -84.29%  (p=0.000 n=10+10)
Memclr/1M        288µs ± 0%      45µs ± 0%    -84.20%  (p=0.000 n=10+10)
Memclr/4M       1.15ms ± 0%    0.18ms ± 0%    -84.21%  (p=0.000 n=9+10)
Memclr/8M       2.34ms ± 0%    1.39ms ± 0%    -40.55%  (p=0.000 n=10+8)
Memclr/16M      4.72ms ± 0%    4.74ms ± 0%     +0.52%  (p=0.000 n=9+10)
Memclr/64M      18.9ms ± 0%    18.9ms ± 0%       ~     (p=0.436 n=10+10)
GoMemclr/5      13.7ns ± 0%    16.9ns ± 0%    +23.36%  (p=0.000 n=10+10)
GoMemclr/16     14.3ns ± 0%     9.0ns ± 0%    -37.27%  (p=0.000 n=10+9)
GoMemclr/64     26.9ns ± 0%    13.7ns ± 0%    -49.07%  (p=0.000 n=10+10)
GoMemclr/256    77.8ns ± 0%    13.0ns ± 0%    -83.24%  (p=0.000 n=9+10)

name            old speed      new speed      delta
Memclr/5        215MB/s ± 0%   232MB/s ± 0%     +7.74%  (p=0.000 n=9+9)
Memclr/16       795MB/s ± 0%   935MB/s ± 0%    +17.60%  (p=0.000 n=10+10)
Memclr/64      2.35GB/s ± 0%  3.35GB/s ± 0%    +42.33%  (p=0.000 n=10+9)
Memclr/256     3.34GB/s ± 0% 10.65GB/s ± 0%   +219.16%  (p=0.000 n=10+10)
Memclr/4096    3.65GB/s ± 1% 23.30GB/s ± 0%   +538.36%  (p=0.000 n=10+10)
Memclr/65536   3.65GB/s ± 0% 23.21GB/s ± 0%   +536.59%  (p=0.000 n=10+10)
Memclr/1M      3.64GB/s ± 0% 23.07GB/s ± 0%   +532.96%  (p=0.000 n=10+10)
Memclr/4M      3.64GB/s ± 0% 23.08GB/s ± 0%   +533.36%  (p=0.000 n=9+10)
Memclr/8M      3.58GB/s ± 0%  6.02GB/s ± 0%    +68.20%  (p=0.000 n=10+8)
Memclr/16M     3.56GB/s ± 0%  3.54GB/s ± 0%     -0.51%  (p=0.000 n=9+10)
Memclr/64M     3.55GB/s ± 0%  3.55GB/s ± 0%       ~     (p=0.436 n=10+10)
GoMemclr/5      364MB/s ± 0%   296MB/s ± 0%    -18.76%  (p=0.000 n=9+10)
GoMemclr/16    1.12GB/s ± 0%  1.78GB/s ± 0%    +58.86%  (p=0.000 n=10+10)
GoMemclr/64    2.38GB/s ± 0%  4.66GB/s ± 0%    +96.27%  (p=0.000 n=10+9)
GoMemclr/256   3.29GB/s ± 0% 19.62GB/s ± 0%   +496.45%  (p=0.000 n=10+9)

Change-Id: I457858368f2875fd66818a41d2f0c190a850e8f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/218177
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
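For readers wondering how Go code reaches memclr at all, here is a minimal sketch. It assumes the GoMemclr rows measure a plain range loop that zeroes a byte slice, an idiom the gc compiler is known to recognize and lower to the runtime's memory-clearing routine. The function name `zero` is hypothetical and does not come from the commit.

```go
package main

import "fmt"

// zero clears b using the range-loop idiom. The gc compiler can recognize
// this exact shape and replace the loop with a call into the runtime's
// memclr routine, which is the code the commit above speeds up on mips64x.
// (Hypothetical helper name; not part of the runtime.)
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

func main() {
	buf := []byte{1, 2, 3, 4, 5}
	zero(buf)
	fmt.Println(buf) // prints [0 0 0 0 0]
}
```

Whether the loop is actually lowered depends on the compiler version and on the loop matching the recognized pattern, so treat this as an illustration rather than a guarantee.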
2020-02-26 | runtime: guard VZEROUPPER on CPU feature | Cherry Zhang

In CL 219131 we inserted a VZEROUPPER instruction on darwin/amd64. The instruction is not available on pre-AVX machines. Guard it with CPU feature.

Fixes #37459.

Change-Id: I9a064df277d091be4ee594eda5c7fd8ee323102b
Reviewed-on: https://go-review.googlesource.com/c/go/+/221057
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
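The real fix lives in assembly, but the guard pattern itself can be sketched in Go. Everything below is hypothetical: the `cpuFeatures` type, the `epilogue` function, and the instruction names as strings only model the idea that an instruction is emitted when the detected feature says it is safe.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical stand-in for detected CPU capabilities.
type cpuFeatures struct {
	HasAVX bool
}

// epilogue lists the instructions a routine would execute before returning.
// VZEROUPPER is included only when AVX is available, mirroring the guard
// described in the commit message; on pre-AVX machines it is skipped.
func epilogue(f cpuFeatures) []string {
	var ops []string
	if f.HasAVX {
		ops = append(ops, "VZEROUPPER")
	}
	return append(ops, "RET")
}

func main() {
	fmt.Println(epilogue(cpuFeatures{HasAVX: true}))  // prints [VZEROUPPER RET]
	fmt.Println(epilogue(cpuFeatures{HasAVX: false})) // prints [RET]
}
```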
2019-10-21 | cmd/compile: add fma intrinsic for arm | smasher164

This change introduces an arm intrinsic that generates the FMULAD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.ARM.HasVFPv4. A rewrite rule translates the generic intrinsic to FMULAD.

Updates #25819.

Change-Id: I8459e5dd1cdbdca35f88a78dbeb7d387f1e20efa
Reviewed-on: https://go-review.googlesource.com/c/go/+/142117
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2019-10-21 | cmd/compile: add fma intrinsic for amd64 | smasher164

To permit ssa-level optimization, this change introduces an amd64 intrinsic that generates the VFMADD231SD instruction for the fused-multiply-add operation on systems that support it. System support is detected via cpu.X86.HasFMA. A rewrite rule can then translate the generic ssa intrinsic ("Fma") to VFMADD231SD.

The benchmark compares the software implementation (old) with the intrinsic (new).

name     old time/op    new time/op    delta
Fma-4    27.2ns ± 1%     1.0ns ± 9%    -96.48%  (p=0.008 n=5+5)

Updates #25819.

Change-Id: I966655e5f96817a5d06dff5942418a3915b09584
Reviewed-on: https://go-review.googlesource.com/c/go/+/137156
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-11-14 | runtime: make processor capability variable naming platform specific | Martin Möhrmann

The current support_XXX variables are specific for the amd64 and 386 platforms. Prefix processor capability variables by architecture to have a consistent naming scheme and avoid reuse of the existing variables for new platforms. This also aligns naming of runtime variables closer with internal/cpu processor capability variable names.

Change-Id: I3eabb29a03874678851376185d3a62e73c1aff1d
Reviewed-on: https://go-review.googlesource.com/c/91435
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-24 | all: align cpu feature variable offset naming | Martin Möhrmann

Add an "offset_" prefix to all cpu feature variable offset constants to signify that they are not boolean cpu feature variables. Remove _ from offset constant names.

Change-Id: I6e22a79ebcbe6e2ae54c4ac8764f9260bb3223ff
Reviewed-on: https://go-review.googlesource.com/131215
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-08-24 | runtime: move arm hardware division support detection to internal/cpu | Martin Möhrmann

Assumes mandatory VFP and VFPv3 support to be present by default but not IDIVA if AT_HWCAP is not available.

Adds GODEBUGCPU options to disable the use of code paths in the runtime that use hardware support for division.

Change-Id: Ida02311bd9b9701de3fc120697e69445bf6c0853
Reviewed-on: https://go-review.googlesource.com/114826
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2018-08-24 | runtime: use internal/cpu variables in assembler code | Martin Möhrmann

Using internal/cpu variables has the benefit of avoiding false sharing (as those are padded) and allows memory and cache usage for these variables to be shared by multiple packages.

Change-Id: I2bf68d03091bf52b466cf689230d5d25d5950037
Reviewed-on: https://go-review.googlesource.com/126599
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>