aboutsummaryrefslogtreecommitdiff
path: root/test/live.go
AgeCommit message (Collapse)Author
2023-12-05math/rand, math/rand/v2: use ChaCha8 for global randRuss Cox
Move ChaCha8 code into internal/chacha8rand and use it to implement runtime.rand, which is used for the unseeded global source for both math/rand and math/rand/v2. This also affects the calculation of the start point for iteration over very very large maps (when the 32-bit fastrand is not big enough). The benefit is that misuse of the global random number generators in math/rand and math/rand/v2 in contexts where non-predictable randomness is important for security reasons is no longer a security problem, removing a common mistake among programmers who are unaware of the different kinds of randomness. The cost is an extra 304 bytes per thread stored in the m struct plus 2-3ns more per random uint64 due to the more sophisticated algorithm. Using PCG looks like it would cost about the same, although I haven't benchmarked that. Before this, the math/rand and math/rand/v2 global generator was wyrand (https://github.com/wangyi-fudan/wyhash). For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson ALFG was justifiable, since the latter was not any better. But for math/rand/v2, the global generator really should be at least as good as one of the well-studied, specific algorithms provided directly by the package, and it's not. (Wyrand is still reasonable for scheduling and cache decisions.) Good randomness does have a cost: about twice wyrand. Also rationalize the various runtime rand references. goos: linux goarch: amd64 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │ │ sec/op │ sec/op vs base │ ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20) PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20) SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20) GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20) GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20) GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20) GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20) Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20) Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20) GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20) IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20) Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20) Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20) Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20) Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20) Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20) Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20) Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20) Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20) Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20) Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20) Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20) Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20) Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20) ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20) NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20) Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20) Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20) Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20) ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20) Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20) goos: darwin goarch: arm64 pkg: math/rand/v2 │ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │ │ sec/op │ sec/op vs base │ ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20) PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20) SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20) GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20) GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20) GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20) GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20) Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20) Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20) GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20) IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20) Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20) Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20) Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20) Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20) Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20) Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20) Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20) Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20) Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20) Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20) Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20) Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20) ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20) NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20) Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20) Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20) Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20) ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20) Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20) goos: linux goarch: 386 pkg: math/rand/v2 cpu: AMD Ryzen 9 7950X 16-Core Processor │ bbb48afeb7.386 │ 5cf807d1ea.386 │ │ sec/op │ sec/op vs base │ ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20) PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20) SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20) GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20) GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20) GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20) GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20) Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20) Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20) GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20) IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20) Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20) Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20) Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20) Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20) Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20) Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20) Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20) Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20) Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20) Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20) Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20) Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20) Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20) ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20) NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20) Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20) Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20) Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20) ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20) Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20) For #61716. Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2 Reviewed-on: https://go-review.googlesource.com/c/go/+/516860 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Filippo Valsorda <filippo@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-19test: migrate remaining files to go:build syntaxDmitri Shuralyov
Most of the test cases in the test directory use the new go:build syntax already. Convert the rest. In general, try to place the build constraint line below the test directive comment in more places. For #41184. For #60268. Change-Id: I11c41a0642a8a26dc2eda1406da908645bbc005b Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest Reviewed-on: https://go-review.googlesource.com/c/go/+/536236 Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-08-21cmd/compile/internal/walk: reuse runtime.scaseMatthew Dempsky
Shaves ~1.5kB off cmd/go binary. Change-Id: I8ad85aa4a24bc197b009c8e1ea9201957222152a Reviewed-on: https://go-review.googlesource.com/c/go/+/521677 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Matthew Dempsky <mdempsky@google.com> Auto-Submit: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com>
2023-08-21cmd/compile/internal/reflectdata: share hmap and hiter typesMatthew Dempsky
There's no need for distinct hmap and hiter types for each map. Shaves 9kB off cmd/go binary size. Change-Id: I7bc3b2d8ec82e7fcd78c1cb17733ebd8b615990a Reviewed-on: https://go-review.googlesource.com/c/go/+/521615 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Matthew Dempsky <mdempsky@google.com>
2023-01-21cmd/compile: sort liveness variable reportsKeith Randall
Sort variables before display so that when there are multiple variables to report, they are in a consistent order. Otherwise they are ordered in the order they appear in the fn.Dcl list, which can vary. Particularly, they vary depending on regabi. Change-Id: I0db380f7cbe6911e87177503a4c3b39851ff1b5a Reviewed-on: https://go-review.googlesource.com/c/go/+/462898 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-08-18cmd/compile: stop using VARKILLKeith Randall
With the introduction of stack objects, VARKILL information is no longer needed. With stack objects, an object is dead when there are no more static references to it, and the stack scanner can't find any live pointers to it. VARKILL information isn't used to establish live ranges for address-taken variables any more. In effect, the last static reference *is* the VARKILL, and there's an additional dynamic liveness check during stack scanning. Next CL will actually rip out the VARKILL opcodes. Change-Id: I030a2ab867445cf4e0e69397911f8a2e2f0ed07b Reviewed-on: https://go-review.googlesource.com/c/go/+/419234 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Keith Randall <khr@golang.org>
2022-08-10test: relax live.go for GOEXPERIMENT=unifiedMatthew Dempsky
This CL applies the same change to test/live.go that was previously applied to test/live_regabi.go in golang.org/cl/415240. This wasn't noticed at the time though, because GOEXPERIMENT=unified was only being tested on linux-amd64, which is a regabi platform. Change-Id: I0c75c2b7097544305e4174c2f5ec6ec283c81a8e Reviewed-on: https://go-review.googlesource.com/c/go/+/422254 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com>
2021-08-09[dev.typeparams] cmd/compile: simplify interface conversionsKeith Randall
Simplify the implementation of interface conversions in the compiler. Don't pass fields that aren't needed (the data word, usually) to the runtime. For generics, we need to put a dynamic type in an interface. The new dataWord function is exactly what we need (the type word will come from a dictionary). Change-Id: Iade5de5c174854b65ad248f35c7893c603f7be3d Reviewed-on: https://go-review.googlesource.com/c/go/+/340029 Trust: Keith Randall <khr@golang.org> Trust: Dan Scales <danscales@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Dan Scales <danscales@google.com>
2021-06-24[dev.typeparams] cmd/compile: suppress liveness diagnostics of wrappersMatthew Dempsky
Similar to the previous CL to suppress escape analysis diagnostics for method wrappers, suppress liveness analysis diagnostics too. It's hardly useful to know that all of a wrapper method's arguments are live at entry. Change-Id: I0d1e44552c6334ee3b454adc107430232abcb56a Reviewed-on: https://go-review.googlesource.com/c/go/+/330749 Trust: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2021-06-08[dev.typeparams] cmd/compile, runtime: always enable defer/go wrappingCherry Mui
Hardwire regabidefers to true. Remove it from GOEXPERIMENTs. Fallback paths are not cleaned up in this CL. That will be done in later CLs. Change-Id: Iec1112a1e55d5f6ef70232a5ff6e702f649071c4 Reviewed-on: https://go-review.googlesource.com/c/go/+/325913 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Than McIntosh <thanm@google.com>
2021-06-04[dev.typeparams] test: test regabidefers in live.goCherry Mui
Previously, live.go is conditioned on not using regabidefers. Now we have regabidefers enabled by default everywhere, and we may remove the fallback path in the near future, test that configuration instead. Change-Id: Idf910aee323bdd6478bc7a2062b2052d82ce003f Reviewed-on: https://go-review.googlesource.com/c/go/+/325111 Trust: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>
2021-04-19test: add liveness test for regabiCherry Zhang
With defer/go wrapping and register arguments, some liveness info changed and live.go test was disabled for regabi. This CL adds a new one for regabi. Change-Id: I65f03a6ef156366d8b76c62a16251c3e818f4b02 Reviewed-on: https://go-review.googlesource.com/c/go/+/311369 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
2021-03-23cmd/compile: wrap/desugar defer calls for register abiThan McIntosh
Adds code to the compiler's "order" phase to rewrite go and defer statements to always be argument-less. E.g. defer f(x,y) => x1, y1 := x, y defer func() { f(x1, y1) } This transformation is not beneficial on its own, but it helps simplify runtime defer handling for the new register ABI (when invoking deferred functions on the panic path, the runtime doesn't need to manage the complexity of determining which args to pass in register vs memory). This feature is currently enabled by default if GOEXPERIMENT=regabi or GOEXPERIMENT=regabidefer is in effect. Included in this CL are some workarounds in the runtime to insure that "go" statement targets in the runtime are argument-less already (since wrapping them can potentially introduce heap-allocated closures, which are currently not allowed). The expectation is that these workarounds will be temporary, and can go away once we either A) change the rules about heap-allocated closures, or B) implement some other scheme for handling go statements. Change-Id: I01060d79a6b140c6f0838d6e6813f807ccdca319 Reviewed-on: https://go-review.googlesource.com/c/go/+/298669 Trust: Than McIntosh <thanm@google.com> Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
2020-12-01[dev.regabi] cmd/compile: only save ONAMEs on Curfn.DclMatthew Dempsky
There's not really any use to tracking function-scoped constants and types on Curfn.Dcl, and there's sloppy code that assumes all of the declarations are variables (e.g., cmpstackvarlt). Change-Id: I5d10dc681dac2c161c7b73ba808403052ca0608e Reviewed-on: https://go-review.googlesource.com/c/go/+/274436 Reviewed-by: Russ Cox <rsc@golang.org> Trust: Matthew Dempsky <mdempsky@google.com>
2020-01-29cmd/compile,cmd/link: fix and re-enable open-coded defers on riscv64Joel Sing
The R_CALLRISCV relocation marker is on the JALR instruction, however the actual relocation is currently two instructions previous for the AUIPC+ADDI sequence. Adjust the platform dependent offset accordingly and re-enable open-coded defers. Fixes #36786. Change-Id: I71597c193c447930fbe94ce44b7355e89ae877bb Reviewed-on: https://go-review.googlesource.com/c/go/+/216797 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-29test: disable the live test on riscv64Joel Sing
This test expects that open-coded defers are enabled, which is not currently the case on riscv64. Updates issue #27532 and #36786. Change-Id: I94bb558c5b0734b4cfe5ae12873be81026009bcf Reviewed-on: https://go-review.googlesource.com/c/go/+/216777 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-10-24cmd/compile, cmd/link, runtime: make defers low-cost through inline code and ↵Dan Scales
extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>
2019-10-16Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline ↵Bryan C. Mills
code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>
2019-10-16cmd/compile, cmd/link, runtime: make defers low-cost through inline code and ↵Dan Scales
extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>
2019-09-03cmd/compile: extend ssa.go to handle 1-element array and 1-field structCuong Manh Le
Assinging to 1-element array/1-field struct variable is considered clobbering the whole variable. By emitting OpVarDef in this case, liveness analysis can now know the variable is redefined. Also, the isfat is not necessary anymore, and will be removed in follow up CL. Fixes #33916 Change-Id: Iece0d90b05273f333d59d6ee5b12ee7dc71908c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/192979 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-08-28Revert "cmd/compile: make isfat handle 1-element array, 1-field struct"Matthew Dempsky
This reverts commit 53227762153afb39c979810bd59ec139e3c8127d. Reason for revert: broke js-wasm builder. Change-Id: If22762317c4a9e00f5060eb84377a4a52d601fca Reviewed-on: https://go-review.googlesource.com/c/go/+/192157 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-08-28cmd/compile: make isfat handle 1-element array, 1-field structLE Manh Cuong
This will improve liveness analysis slightly, the same logic as isdirectiface curently does. In: type T struct { m map[int]int } v := T{} v.m = make(map[int]int) T is considered "fat", now it is not. So assigning to v.m is considered to clobber the entire v. This is follow up of CL 179057. Change-Id: Id6b4807b8e8521ef5d8bcb14fedb6dceb9dbf18c Reviewed-on: https://go-review.googlesource.com/c/go/+/179578 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-06-10Revert "Revert "cmd/compile,runtime: allocate defer records on the stack""Keith Randall
This reverts CL 180761 Reason for revert: Reinstate the stack-allocated defer CL. There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues. Issue #32477: Fix has been submitted as CL 181258. Issue #32498: Possible fix is CL 181377 (not submitted yet). Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/181378 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-06-05Revert "cmd/compile,runtime: allocate defer records on the stack"Keith Randall
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6. Reason for revert: Seems to still have issues around GC. Fixes #32452 Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/180761 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-06-04cmd/compile,runtime: allocate defer records on the stackKeith Randall
When a defer is executed at most once in a function body, we can allocate the defer record for it on the stack instead of on the heap. This should make defers like this (which are very common) faster. This optimization applies to 363 out of the 370 static defer sites in the cmd/go binary. name old time/op new time/op delta Defer-4 52.2ns ± 5% 36.2ns ± 3% -30.70% (p=0.000 n=10+10) Fixes #6980 Update #14939 Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004 Reviewed-on: https://go-review.googlesource.com/c/go/+/171758 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-03-06cmd/compile: fix ordering for short-circuiting opsKeith Randall
Make sure the side effects inside short-circuited operations (&& and ||) happen correctly. Before this CL, we attached the side effects to the node itself using exprInPlace. That caused other side effects in sibling expressions to get reordered with respect to the short circuit side effect. Instead, rewrite a && b like: r := a if r { r = b } That code we can keep correctly ordered with respect to other side-effects extracted from part of a big expression. exprInPlace seems generally unsafe. But this was the only case where exprInPlace is called not at the top level of an expression, so I don't think the other uses can actually trigger an issue (there can't be a sibling expression). TODO: maybe those cases don't need "in place", and we can retire that function generally. This CL needed a small tweak to the SSA generation of OIF so that the short circuit optimization still triggers. The short circuit optimization looks for triangle but not diamonds, so don't bother allocating a block if it will be empty. Go 1 benchmarks are in the noise. Fixes #30566 Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610 Reviewed-on: https://go-review.googlesource.com/c/go/+/165617 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2018-10-15cmd/compile: provide types for all order-allocated temporariesKeith Randall
Ensure that we correctly type the stack temps for regular closures, method function closures, and slice literals. Then we don't need to override the dummy types later. Furthermore, this allows order to reuse temporaries of these types. OARRAYLIT doesn't need a temporary as far as I can tell, so I removed that case from order. Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9 Reviewed-on: https://go-review.googlesource.com/c/140306 Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14cmd/compile: reuse temporaries in order passKeith Randall
Instead of allocating a new temporary each time one is needed, keep a list of temporaries which are free (have already been VARKILLed on every path) and use one of them. Should save a lot of stack space. In a function like this: func main() { fmt.Printf("%d %d\n", 2, 3) fmt.Printf("%d %d\n", 4, 5) fmt.Printf("%d %d\n", 6, 7) } The three [2]interface{} arrays used to hold the ... args all use the same autotmp, instead of 3 different autotmps as happened previous to this CL. Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3 Reviewed-on: https://go-review.googlesource.com/c/140301 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14runtime,cmd/compile: pass strings and slices to convT2{E,I} by valueKeith Randall
When we pass these types by reference, we usually have to allocate temporaries on the stack, initialize them, then pass their address to the conversion functions. It's simpler to pass these types directly by value. This particularly applies to conversions needed for fmt.Printf (to interface{} for constructing a [...]interface{}). func f(a, b, c string) { fmt.Printf("%s %s\n", a, b) fmt.Printf("%s %s\n", b, c) } This function's stack frame shrinks from 200 to 136 bytes, and its code shrinks from 535 to 453 bytes. The go binary shrinks 0.3%. Update #24286 Aside: for this function f, we don't really need to allocate temporaries for the convT2E function. We could use the address of a, b, and c directly. That might get similar (or maybe better?) improvements. I investigated a bit, but it seemed complicated to do it safely. This change was much easier. Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d Reviewed-on: https://go-review.googlesource.com/c/135377 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-03cmd/compile,runtime: remove ambiguously live logicKeith Randall
The previous CL introduced stack objects. This CL removes the old ambiguously live liveness analysis. After this CL we're relying on stack objects exclusively. Update a bunch of liveness tests to reflect the new world. Fixes #22350 Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31 Reviewed-on: https://go-review.googlesource.com/c/134156 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-23all: fix typos detected by github.com/client9/misspellKazuhiro Sera
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661 GitHub-Last-Rev: ae85bcf82be8fee533e2b9901c6133921382c70a GitHub-Pull-Request: golang/go#26920 Reviewed-on: https://go-review.googlesource.com/128955 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22cmd/compile: enable stack maps everywhere except unsafe pointsAustin Clements
This modifies issafepoint in liveness analysis to report almost every operation as a safe point. There are four things we don't mark as safe-points: 1. Runtime code (other than at calls). 2. go:nosplit functions (other than at calls). 3. Instructions between the load of the write barrier-enabled flag and the write. 4. Instructions leading up to a uintptr -> unsafe.Pointer conversion. We'll optimize this in later CLs: name old time/op new time/op delta Template 185ms ± 2% 190ms ± 2% +2.95% (p=0.000 n=10+10) Unicode 96.3ms ± 3% 96.4ms ± 1% ~ (p=0.905 n=10+9) GoTypes 658ms ± 0% 669ms ± 1% +1.72% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.18s ± 1% +1.56% (p=0.000 n=9+10) SSA 7.41s ± 2% 7.59s ± 1% +2.48% (p=0.000 n=9+10) Flate 126ms ± 1% 128ms ± 1% +2.08% (p=0.000 n=10+10) GoParser 153ms ± 1% 157ms ± 2% +2.38% (p=0.000 n=10+10) Reflect 437ms ± 1% 442ms ± 1% +0.98% (p=0.001 n=10+10) Tar 178ms ± 1% 179ms ± 1% +0.67% (p=0.035 n=10+9) XML 223ms ± 1% 229ms ± 1% +2.58% (p=0.000 n=10+10) [Geo mean] 394ms 401ms +1.75% No effect on binary size because we're not yet emitting these extra safe points. For #24543. Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859 Reviewed-on: https://go-review.googlesource.com/109348 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2018-05-01cmd/compile: open code select{send,recv,default}Matthew Dempsky
Registration now looks like: var cases [4]runtime.scases var order [8]uint16 cases[0].kind = caseSend cases[0].c = c1 cases[0].elem = &v1 if raceenabled || msanenabled { selectsetpc(&cases[0]) } cases[1].kind = caseRecv cases[1].c = c2 cases[1].elem = &v2 if raceenabled || msanenabled { selectsetpc(&cases[1]) } ... Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48 Reviewed-on: https://go-review.googlesource.com/37934 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2018-05-01runtime: eliminate runtime.hselectMatthew Dempsky
Now the registration phase looks like: var cases [4]runtime.scases var order [8]uint16 selectsend(&cases[0], c1, &v1) selectrecv(&cases[1], c2, &v2, nil) selectrecv(&cases[2], c3, &v3, &ok) selectdefault(&cases[3]) chosen := selectgo(&cases[0], &order[0], 4) Primarily, this is just preparation for having the compiler open-code selectsend, selectrecv, and selectdefault. As a minor benefit, order can now be layed out separately on the stack in the pointer-free segment, so it won't take up space in the function's stack pointer maps. Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45 Reviewed-on: https://go-review.googlesource.com/37933 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2018-02-27cmd/compile: mark the first word of an interface as a uintptrKeith Randall
The first word of an interface is a pointer, but for the purposes of GC we don't need to treat it as such. 1. If it is a non-empty interface, the pointer points to an itab which is always in persistentalloc space. 2. If it is an empty interface, the pointer points to a _type. a. If it is a compile-time-allocated type, it points into the read-only data section. b. If it is a reflect-allocated type, it points into the Go heap. Reflect is responsible for keeping a reference to the underlying type so it won't be GCd. If we ever have a moving GC, we need to change this for 2b (as well as scan itabs to update their itab._type fields). Write barriers on the first word of interfaces have already been removed. Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959 Reviewed-on: https://go-review.googlesource.com/97518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-11-02cmd/compile: specialize map creation for small hint sizesMartin Möhrmann
Handle make(map[any]any) and make(map[any]any, hint) where hint <= BUCKETSIZE special to allow for faster map initialization and to improve binary size by using runtime calls with fewer arguments. Given hint is smaller or equal to BUCKETSIZE in which case overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap: * If hmap needs to be allocated on the stack then only hmap's hash0 field needs to be initialized and no call to makemap is needed. * If hmap needs to be allocated on the heap then a new special makehmap function will allocate hmap and intialize hmap's hash0 field. Reduces size of the godoc by ~36kb. AMD64 name old time/op new time/op delta NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10) NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10) Updates #6853 Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b Reviewed-on: https://go-review.googlesource.com/61190 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-10-13cmd/compile: simplify slice/array range loops for some element sizesMartin Möhrmann
In range loops over slices and arrays besides a variable to track the index an extra variable containing the address of the current element is used. To compute a pointer to the next element the elements size is added to the address. On 386 and amd64 an element of size 1, 2, 4 or 8 bytes can by copied from an array using a MOV instruction with suitable addressing mode that uses the start address of the array, the index of the element and element size as scaling factor. Thereby, for arrays and slices with suitable element size we can avoid keeping and incrementing an extra variable to compute the next elements address. Shrinks cmd/go by 4 kilobytes. AMD64: name old time/op new time/op delta BinaryTree17 2.66s ± 7% 2.54s ± 0% -4.53% (p=0.000 n=10+8) Fannkuch11 3.02s ± 1% 3.02s ± 1% ~ (p=0.579 n=10+10) FmtFprintfEmpty 45.6ns ± 1% 42.2ns ± 1% -7.46% (p=0.000 n=10+10) FmtFprintfString 69.8ns ± 1% 70.4ns ± 1% +0.84% (p=0.041 n=10+10) FmtFprintfInt 80.1ns ± 1% 79.0ns ± 1% -1.35% (p=0.000 n=10+10) FmtFprintfIntInt 127ns ± 1% 125ns ± 1% -1.00% (p=0.007 n=10+9) FmtFprintfPrefixedInt 158ns ± 2% 152ns ± 1% -4.11% (p=0.000 n=10+10) FmtFprintfFloat 218ns ± 1% 214ns ± 1% -1.61% (p=0.000 n=10+10) FmtManyArgs 508ns ± 1% 504ns ± 1% -0.93% (p=0.001 n=9+10) GobDecode 6.76ms ± 1% 6.78ms ± 1% ~ (p=0.353 n=10+10) GobEncode 5.84ms ± 1% 5.77ms ± 1% -1.31% (p=0.000 n=10+9) Gzip 223ms ± 1% 218ms ± 1% -2.39% (p=0.000 n=10+10) Gunzip 40.3ms ± 1% 40.4ms ± 3% ~ (p=0.796 n=10+10) HTTPClientServer 73.5µs ± 0% 73.3µs ± 0% -0.28% (p=0.000 n=10+9) JSONEncode 12.7ms ± 1% 12.6ms ± 8% ~ (p=0.173 n=8+10) JSONDecode 57.5ms ± 1% 56.1ms ± 2% -2.40% (p=0.000 n=10+10) Mandelbrot200 3.80ms ± 1% 3.86ms ± 6% ~ (p=0.579 n=10+10) GoParse 3.25ms ± 1% 3.23ms ± 1% ~ (p=0.052 n=10+10) RegexpMatchEasy0_32 74.4ns ± 1% 76.9ns ± 1% +3.39% (p=0.000 n=10+10) RegexpMatchEasy0_1K 243ns ± 2% 248ns ± 1% +1.86% (p=0.000 n=10+8) RegexpMatchEasy1_32 71.0ns ± 2% 72.8ns ± 1% +2.55% (p=0.000 n=10+10) RegexpMatchEasy1_1K 370ns ± 1% 383ns ± 0% +3.39% (p=0.000 n=10+9) RegexpMatchMedium_32 107ns ± 0% 113ns ± 1% +5.33% (p=0.000 n=6+10) RegexpMatchMedium_1K 35.0µs ± 1% 36.0µs ± 1% +3.13% (p=0.000 n=10+10) RegexpMatchHard_32 1.65µs ± 1% 1.69µs ± 1% +2.23% (p=0.000 n=10+9) RegexpMatchHard_1K 49.8µs ± 1% 50.6µs ± 1% +1.59% (p=0.000 n=10+10) Revcomp 398ms ± 1% 396ms ± 1% -0.51% (p=0.043 n=10+10) Template 63.4ms ± 1% 60.8ms ± 0% -4.11% (p=0.000 n=10+9) TimeParse 318ns ± 1% 322ns ± 1% +1.10% (p=0.005 n=10+10) TimeFormat 323ns ± 1% 336ns ± 1% +4.15% (p=0.000 n=10+10) Updates: #15809. Change-Id: I55915aaf6d26768e12247f8a8edf14e7630726d1 Reviewed-on: https://go-review.googlesource.com/38061 Run-TryBot: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2017-08-24cmd/compile: eliminate stores to unread auto variablesMichael Munday
This is a crude compiler pass to eliminate stores to auto variables that are only ever written to. Eliminates an unnecessary store to x from the following code: func f() int { var x := 1 return *(&x) } Fixes #19765. Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa Reviewed-on: https://go-review.googlesource.com/38746 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-03-21runtime: add mapdelete_fast*Hugues Bruant
Add benchmarks for map delete with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapDelete/Int32/1-8 151ns ± 8% 99ns ± 3% -34.39% (p=0.008 n=5+5) MapDelete/Int32/2-8 128ns ± 2% 111ns ±15% -13.40% (p=0.040 n=5+5) MapDelete/Int32/4-8 128ns ± 5% 114ns ± 2% -10.82% (p=0.008 n=5+5) MapDelete/Int64/1-8 144ns ± 0% 104ns ± 3% -27.53% (p=0.016 n=4+5) MapDelete/Int64/2-8 153ns ± 1% 126ns ± 3% -17.17% (p=0.008 n=5+5) MapDelete/Int64/4-8 178ns ± 3% 136ns ± 2% -23.60% (p=0.008 n=5+5) MapDelete/Str/1-8 187ns ± 3% 171ns ± 3% -8.54% (p=0.008 n=5+5) MapDelete/Str/2-8 221ns ± 3% 206ns ± 4% -7.18% (p=0.016 n=5+4) MapDelete/Str/4-8 256ns ± 5% 232ns ± 2% -9.36% (p=0.016 n=4+5) name old time/op new time/op delta BinaryTree17-8 2.78s ± 7% 2.70s ± 1% ~ (p=0.151 n=5+5) Fannkuch11-8 3.21s ± 2% 3.19s ± 1% ~ (p=0.310 n=5+5) FmtFprintfEmpty-8 49.1ns ± 3% 50.2ns ± 2% ~ (p=0.095 n=5+5) FmtFprintfString-8 78.6ns ± 4% 80.2ns ± 5% ~ (p=0.460 n=5+5) FmtFprintfInt-8 79.7ns ± 1% 81.0ns ± 3% ~ (p=0.103 n=5+5) FmtFprintfIntInt-8 117ns ± 2% 119ns ± 0% ~ (p=0.079 n=5+4) FmtFprintfPrefixedInt-8 153ns ± 1% 146ns ± 3% -4.19% (p=0.024 n=5+5) FmtFprintfFloat-8 239ns ± 1% 237ns ± 1% ~ (p=0.246 n=5+5) FmtManyArgs-8 506ns ± 2% 509ns ± 2% ~ (p=0.238 n=5+5) GobDecode-8 7.06ms ± 4% 6.86ms ± 1% ~ (p=0.222 n=5+5) GobEncode-8 6.01ms ± 5% 5.87ms ± 2% ~ (p=0.222 n=5+5) Gzip-8 246ms ± 4% 236ms ± 1% -4.12% (p=0.008 n=5+5) Gunzip-8 37.7ms ± 4% 37.3ms ± 1% ~ (p=0.841 n=5+5) HTTPClientServer-8 64.9µs ± 1% 64.4µs ± 0% -0.80% (p=0.032 n=5+4) JSONEncode-8 16.0ms ± 2% 16.2ms ±11% ~ (p=0.548 n=5+5) JSONDecode-8 53.2ms ± 2% 53.1ms ± 4% ~ (p=1.000 n=5+5) Mandelbrot200-8 4.33ms ± 2% 4.32ms ± 2% ~ (p=0.841 n=5+5) GoParse-8 3.24ms ± 2% 3.27ms ± 4% ~ (p=0.690 n=5+5) RegexpMatchEasy0_32-8 86.2ns ± 1% 85.2ns ± 3% ~ (p=0.286 n=5+5) RegexpMatchEasy0_1K-8 198ns ± 2% 199ns ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_32-8 82.6ns ± 2% 81.8ns ± 1% ~ (p=0.294 n=5+5) RegexpMatchEasy1_1K-8 359ns ± 2% 354ns ± 1% -1.39% (p=0.048 n=5+5) RegexpMatchMedium_32-8 123ns ± 2% 123ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchMedium_1K-8 38.2µs ± 2% 38.6µs ± 8% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 1.92µs ± 2% 1.91µs ± 5% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-8 57.6µs ± 1% 57.0µs ± 2% ~ (p=0.310 n=5+5) Revcomp-8 483ms ± 7% 441ms ± 1% -8.79% (p=0.016 n=5+4) Template-8 58.0ms ± 1% 58.2ms ± 7% ~ (p=0.310 n=5+5) TimeParse-8 324ns ± 6% 312ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 330ns ± 1% 329ns ± 1% ~ (p=0.968 n=5+5) name old speed new speed delta GobDecode-8 109MB/s ± 4% 112MB/s ± 1% ~ (p=0.222 n=5+5) GobEncode-8 128MB/s ± 5% 131MB/s ± 2% ~ (p=0.222 n=5+5) Gzip-8 78.9MB/s ± 4% 82.3MB/s ± 1% +4.25% (p=0.008 n=5+5) Gunzip-8 514MB/s ± 4% 521MB/s ± 1% ~ (p=0.841 n=5+5) JSONEncode-8 121MB/s ± 2% 120MB/s ±10% ~ (p=0.548 n=5+5) JSONDecode-8 36.5MB/s ± 2% 36.6MB/s ± 4% ~ (p=1.000 n=5+5) GoParse-8 17.9MB/s ± 2% 17.7MB/s ± 4% ~ (p=0.730 n=5+5) RegexpMatchEasy0_32-8 371MB/s ± 1% 375MB/s ± 3% ~ (p=0.310 n=5+5) RegexpMatchEasy0_1K-8 5.15GB/s ± 1% 5.13GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 387MB/s ± 2% 391MB/s ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_1K-8 2.85GB/s ± 2% 2.89GB/s ± 1% ~ (p=0.056 n=5+5) RegexpMatchMedium_32-8 8.07MB/s ± 2% 8.06MB/s ± 1% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 26.8MB/s ± 2% 26.6MB/s ± 7% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 16.7MB/s ± 2% 16.7MB/s ± 5% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-8 17.8MB/s ± 1% 18.0MB/s ± 2% ~ (p=0.310 n=5+5) Revcomp-8 527MB/s ± 6% 577MB/s ± 1% +9.44% (p=0.016 n=5+4) Template-8 33.5MB/s ± 1% 33.4MB/s ± 7% ~ (p=0.310 n=5+5) Updates #19495 Change-Id: Ib9ece1690813d9b4788455db43d30891e2138df5 Reviewed-on: https://go-review.googlesource.com/38172 Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13runtime: add mapassign_fast*Hugues Bruant
Add benchmarks for map assignment with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapAssignInt32_255-8 24.7ns ± 3% 17.4ns ± 2% -29.75% (p=0.000 n=10+10) MapAssignInt32_64k-8 45.5ns ± 4% 37.6ns ± 4% -17.18% (p=0.000 n=10+10) MapAssignInt64_255-8 26.0ns ± 3% 17.9ns ± 4% -31.03% (p=0.000 n=10+10) MapAssignInt64_64k-8 46.9ns ± 5% 38.7ns ± 2% -17.53% (p=0.000 n=9+10) MapAssignStr_255-8 47.8ns ± 3% 24.8ns ± 4% -48.01% (p=0.000 n=10+10) MapAssignStr_64k-8 83.0ns ± 3% 51.9ns ± 3% -37.45% (p=0.000 n=10+9) name old time/op new time/op delta BinaryTree17-8 3.11s ±19% 2.78s ± 3% ~ (p=0.095 n=5+5) Fannkuch11-8 3.26s ± 1% 3.21s ± 2% ~ (p=0.056 n=5+5) FmtFprintfEmpty-8 50.3ns ± 1% 50.8ns ± 2% ~ (p=0.246 n=5+5) FmtFprintfString-8 82.7ns ± 4% 80.1ns ± 5% ~ (p=0.238 n=5+5) FmtFprintfInt-8 82.6ns ± 2% 81.9ns ± 3% ~ (p=0.508 n=5+5) FmtFprintfIntInt-8 124ns ± 4% 121ns ± 3% ~ (p=0.111 n=5+5) FmtFprintfPrefixedInt-8 158ns ± 6% 160ns ± 2% ~ (p=0.341 n=5+5) FmtFprintfFloat-8 249ns ± 2% 245ns ± 2% ~ (p=0.095 n=5+5) FmtManyArgs-8 513ns ± 2% 519ns ± 3% ~ (p=0.151 n=5+5) GobDecode-8 7.48ms ±12% 7.11ms ± 2% ~ (p=0.222 n=5+5) GobEncode-8 6.25ms ± 1% 6.03ms ± 2% -3.56% (p=0.008 n=5+5) Gzip-8 252ms ± 4% 252ms ± 4% ~ (p=1.000 n=5+5) Gunzip-8 38.4ms ± 3% 38.6ms ± 2% ~ (p=0.690 n=5+5) HTTPClientServer-8 76.9µs ±41% 66.4µs ± 6% ~ (p=0.310 n=5+5) JSONEncode-8 16.5ms ± 3% 16.7ms ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 54.6ms ± 1% 54.3ms ± 2% ~ (p=0.548 n=5+5) Mandelbrot200-8 4.45ms ± 3% 4.47ms ± 1% ~ (p=0.841 n=5+5) GoParse-8 3.43ms ± 1% 3.32ms ± 2% -3.28% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 88.2ns ± 3% 89.4ns ± 2% ~ (p=0.333 n=5+5) RegexpMatchEasy0_1K-8 205ns ± 1% 206ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchEasy1_32-8 85.1ns ± 1% 85.5ns ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 365ns ± 1% 371ns ± 9% ~ (p=1.000 n=5+5) RegexpMatchMedium_32-8 129ns ± 2% 128ns ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 39.8µs ± 0% 39.7µs ± 4% ~ (p=0.730 n=4+5) RegexpMatchHard_32-8 1.99µs ± 3% 2.05µs ±16% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 59.3µs ± 1% 60.3µs ± 7% ~ (p=1.000 n=5+5) Revcomp-8 1.36s ±63% 0.52s ± 5% ~ (p=0.095 n=5+5) Template-8 62.6ms ±14% 60.5ms ± 5% ~ (p=0.690 n=5+5) TimeParse-8 330ns ± 2% 324ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 350ns ± 3% 340ns ± 1% -2.86% (p=0.008 n=5+5) name old speed new speed delta GobDecode-8 103MB/s ±11% 108MB/s ± 2% ~ (p=0.222 n=5+5) GobEncode-8 123MB/s ± 1% 127MB/s ± 2% +3.71% (p=0.008 n=5+5) Gzip-8 77.1MB/s ± 4% 76.9MB/s ± 3% ~ (p=1.000 n=5+5) Gunzip-8 505MB/s ± 3% 503MB/s ± 2% ~ (p=0.690 n=5+5) JSONEncode-8 118MB/s ± 3% 116MB/s ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 35.5MB/s ± 1% 35.8MB/s ± 2% ~ (p=0.397 n=5+5) GoParse-8 16.9MB/s ± 1% 17.4MB/s ± 2% +3.45% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 363MB/s ± 3% 358MB/s ± 2% ~ (p=0.421 n=5+5) RegexpMatchEasy0_1K-8 4.98GB/s ± 1% 4.97GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 376MB/s ± 1% 375MB/s ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 2.80GB/s ± 1% 2.76GB/s ± 9% ~ (p=0.841 n=5+5) RegexpMatchMedium_32-8 7.73MB/s ± 1% 7.76MB/s ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 25.8MB/s ± 0% 25.8MB/s ± 4% ~ (p=0.651 n=4+5) RegexpMatchHard_32-8 16.1MB/s ± 3% 15.7MB/s ±14% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 17.3MB/s ± 1% 17.0MB/s ± 7% ~ (p=0.984 n=5+5) Revcomp-8 273MB/s ±83% 488MB/s ± 5% ~ (p=0.095 n=5+5) Template-8 31.1MB/s ±13% 32.1MB/s ± 5% ~ (p=0.690 n=5+5) Updates #19495 Change-Id: I116e9a2a4594769318b22d736464de8a98499909 Reviewed-on: https://go-review.googlesource.com/38091 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-07cmd/compile, runtime: simplify multiway select implementationMatthew Dempsky
This commit reworks multiway select statements to use normal control flow primitives instead of the previous setjmp/longjmp-like behavior. This simplifies liveness analysis and should prevent issues around "returns twice" function calls within SSA passes. test/live.go is updated because liveness analysis's CFG is more representative of actual control flow. The case bodies are the only real successors of the selectgo call, but previously the selectsend, selectrecv, etc. calls were included in the successors list too. Updates #19331. Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83 Reviewed-on: https://go-review.googlesource.com/37661 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-02-28cmd/compile, runtime: specialize convT2x, don't alloc for zero valsJosh Bleecher Snyder
Prior to this CL, all runtime conversions from a concrete value to an interface went through one of two runtime calls: convT2E or convT2I. However, in practice, basic types are very common. Specializing convT2x for those basic types allows for a more efficient implementation for those types. For basic scalars and strings, allocation and copying can use the same methods as normal code. For pointer-free types, allocation can occur without zeroing, and copying can take place without GC calls. For slices, copying is cheaper and simpler. This CL adds twelve runtime routines: convT2E16, convT2I16 convT2E32, convT2I32 convT2E64, convT2I64 convT2Estring, convT2Istring convT2Eslice, convT2Islice convT2Enoptr, convT2Inoptr While compiling make.bash, 93% of all convT2x calls are now to one of these specialized convT2x call. Within specialized convT2x routines, it is cheap to check for a zero value, in a way that it is not in general. When we detect a zero value there, we return a pointer to zeroVal, rather than allocating. name old time/op new time/op delta ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56) ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60) ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57) ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60) ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59) ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57) ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60) ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57) ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58) ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55) ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52) ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59) Benchmarks on a real program (the compiler): name old time/op new time/op delta Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26) Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26) GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30) Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27) Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28) GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30) Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30) Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30) XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29) name old user-ns/op new user-ns/op delta Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28) Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27) GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30) Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28) Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29) GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30) Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30) Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30) XML 288user-ms ± 4% 288user-ms ± 4% ~ (p=0.633 n=30+30) name old alloc/op new alloc/op delta Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29) Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30) GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30) Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29) Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30) GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28) Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30) Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28) XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30) name old allocs/op new allocs/op delta Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30) Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30) GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30) Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29) Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30) GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28) Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30) Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29) XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30) Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f Reviewed-on: https://go-review.googlesource.com/36476 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-02-10cmd/compile/internal/syntax: removed gcCompat code needed to pass orig. testsRobert Griesemer
The gcCompat mode was introduced to match the new parser's node position setup exactly with the positions used by the original parser. Some of the gcCompat adjustments were required to satisfy syntax error test cases, and the rest were required to make toolstash cmp pass. This change removes the former gcCompat adjustments and instead adjusts the respective test cases as necessary. In some cases this makes the error lines consistent with the ones reported by gccgo. Where it has changed, the position associated with a given syntactic construct is the position (line/col number) of the left-most token belonging to the construct. Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c Reviewed-on: https://go-review.googlesource.com/36695 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-03cmd/compile: make sure output params are live if there is a deferKeith Randall
If there is a defer, and that defer recovers, then the caller can see all of the output parameters. That means that we must mark all the output parameters live at any point which might panic. If there is no defer then this is not necessary. This is implemented. We could also detect whether there is a recover in any of the defers. If not, we would need to mark only output params that the defer actually references (and the closure mechanism already does that). This is not implemented. Fixes #18860. Change-Id: If984fe6686eddce9408bf25e725dd17fc16b8578 Reviewed-on: https://go-review.googlesource.com/36030 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Russ Cox <rsc@golang.org>
2017-02-02cmd/compile: convert constants to interfaces without allocatingJosh Bleecher Snyder
The order pass is responsible for ensuring that values passed to runtime functions, including convT2E/convT2I, are addressable. Prior to this CL, this was always accomplished by creating a temp, which frequently escaped to the heap, causing allocations, perhaps most notably in code like: fmt.Println(1, 2, 3) // allocates three times None of the runtime routines modify the contents of the pointers they receive, so in the case of constants, instead of creating a temp value, we can create a static value. (Marking the static value as read-only provides protection against accidental attempts by the runtime to modify the constant data.) This improves code generation for code like: panic("abc") c <- 2 // c is a chan int which can now simply refer to "abc" and 2, rather than going by way of a temporary. It also allows us to optimize convT2E/convT2I, by recognizing static readonly values and directly constructing the interface. This CL adds ~0.5% to binary size, despite decreasing the size of many functions, because it also adds many static symbols. This binary size regression could be recovered in future (but currently unplanned) work. There is a lot of content-duplication in these symbols; this statement generates six new symbols, three containing an int 1 and three containing a pointer to the string "a": fmt.Println(1, 1, 1, "a", "a", "a") These symbols could be made content-addressable. Furthermore, these symbols are small, so the alignment and naming overhead is large. As with the go.strings section, these symbols could be hidden and have their alignment reduced. The changes to test/live.go make it impossible (at least with current optimization techniques) to place the values being passed to the runtime in static symbols, preserving autotmp creation. Fixes #18704 Benchmarks from fmt and go-kit's logging package: github.com/go-kit/kit/log name old time/op new time/op delta JSONLoggerSimple-8 1.91µs ± 2% 2.11µs ±22% ~ (p=1.000 n=9+10) JSONLoggerContextual-8 2.60µs ± 6% 2.43µs ± 2% -6.29% (p=0.000 n=9+10) Discard-8 101ns ± 2% 34ns ±14% -66.33% (p=0.000 n=10+9) OneWith-8 161ns ± 1% 102ns ±16% -36.78% (p=0.000 n=10+10) TwoWith-8 175ns ± 3% 106ns ± 7% -39.36% (p=0.000 n=10+9) TenWith-8 293ns ± 3% 227ns ±15% -22.44% (p=0.000 n=9+10) LogfmtLoggerSimple-8 704ns ± 2% 608ns ± 2% -13.65% (p=0.000 n=10+9) LogfmtLoggerContextual-8 962ns ± 1% 860ns ±17% -10.57% (p=0.003 n=9+10) NopLoggerSimple-8 188ns ± 1% 120ns ± 1% -36.39% (p=0.000 n=9+10) NopLoggerContextual-8 379ns ± 1% 243ns ± 0% -35.77% (p=0.000 n=9+10) ValueBindingTimestamp-8 577ns ± 1% 499ns ± 1% -13.51% (p=0.000 n=10+10) ValueBindingCaller-8 898ns ± 2% 844ns ± 2% -6.00% (p=0.000 n=10+10) name old alloc/op new alloc/op delta JSONLoggerSimple-8 904B ± 0% 872B ± 0% -3.54% (p=0.000 n=10+10) JSONLoggerContextual-8 1.20kB ± 0% 1.14kB ± 0% -5.33% (p=0.000 n=10+10) Discard-8 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10) OneWith-8 96.0B ± 0% 64.0B ± 0% -33.33% (p=0.000 n=10+10) TwoWith-8 160B ± 0% 128B ± 0% -20.00% (p=0.000 n=10+10) TenWith-8 672B ± 0% 640B ± 0% -4.76% (p=0.000 n=10+10) LogfmtLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10) LogfmtLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10) NopLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10) NopLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10) ValueBindingTimestamp-8 159B ± 0% 127B ± 0% -20.13% (p=0.000 n=10+10) ValueBindingCaller-8 112B ± 0% 80B ± 0% -28.57% (p=0.000 n=10+10) name old allocs/op new allocs/op delta JSONLoggerSimple-8 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) JSONLoggerContextual-8 25.0 ± 0% 21.0 ± 0% -16.00% (p=0.000 n=10+10) Discard-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) OneWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) TwoWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) TenWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) LogfmtLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10) LogfmtLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10) NopLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10) NopLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10) ValueBindingTimestamp-8 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10) ValueBindingCaller-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10) fmt name old time/op new time/op delta SprintfPadding-8 88.9ns ± 3% 79.1ns ± 1% -11.09% (p=0.000 n=10+7) SprintfEmpty-8 12.6ns ± 3% 12.8ns ± 3% ~ (p=0.136 n=10+10) SprintfString-8 38.7ns ± 5% 26.9ns ± 6% -30.65% (p=0.000 n=10+10) SprintfTruncateString-8 56.7ns ± 2% 47.0ns ± 3% -17.05% (p=0.000 n=10+10) SprintfQuoteString-8 164ns ± 2% 153ns ± 2% -7.01% (p=0.000 n=10+10) SprintfInt-8 38.9ns ±15% 26.5ns ± 2% -31.93% (p=0.000 n=10+9) SprintfIntInt-8 60.3ns ± 9% 38.2ns ± 1% -36.67% (p=0.000 n=10+8) SprintfPrefixedInt-8 58.6ns ±13% 51.2ns ±11% -12.66% (p=0.001 n=10+10) SprintfFloat-8 71.4ns ± 3% 64.2ns ± 3% -10.08% (p=0.000 n=8+10) SprintfComplex-8 175ns ± 3% 159ns ± 2% -9.03% (p=0.000 n=10+10) SprintfBoolean-8 33.5ns ± 4% 25.7ns ± 5% -23.28% (p=0.000 n=10+10) SprintfHexString-8 65.3ns ± 3% 51.7ns ± 5% -20.86% (p=0.000 n=10+9) SprintfHexBytes-8 67.2ns ± 5% 67.9ns ± 4% ~ (p=0.383 n=10+10) SprintfBytes-8 129ns ± 7% 124ns ± 7% ~ (p=0.074 n=9+10) SprintfStringer-8 127ns ± 4% 126ns ± 8% ~ (p=0.506 n=9+10) SprintfStructure-8 357ns ± 3% 359ns ± 3% ~ (p=0.469 n=10+10) ManyArgs-8 203ns ± 6% 126ns ± 3% -37.94% (p=0.000 n=10+10) FprintInt-8 119ns ±10% 74ns ± 3% -37.54% (p=0.000 n=10+10) FprintfBytes-8 122ns ± 4% 120ns ± 3% ~ (p=0.124 n=10+10) FprintIntNoAlloc-8 78.2ns ± 5% 74.1ns ± 3% -5.28% (p=0.000 n=10+10) ScanInts-8 349µs ± 1% 349µs ± 0% ~ (p=0.606 n=9+8) ScanRecursiveInt-8 43.8ms ± 7% 40.1ms ± 2% -8.42% (p=0.000 n=10+10) ScanRecursiveIntReaderWrapper-8 43.5ms ± 4% 40.4ms ± 2% -7.16% (p=0.000 n=10+9) name old alloc/op new alloc/op delta SprintfPadding-8 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.000 n=10+10) SprintfEmpty-8 0.00B 0.00B ~ (all equal) SprintfString-8 21.0B ± 0% 5.0B ± 0% -76.19% (p=0.000 n=10+10) SprintfTruncateString-8 32.0B ± 0% 16.0B ± 0% -50.00% (p=0.000 n=10+10) SprintfQuoteString-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10) SprintfInt-8 16.0B ± 0% 1.0B ± 0% -93.75% (p=0.000 n=10+10) SprintfIntInt-8 24.0B ± 0% 3.0B ± 0% -87.50% (p=0.000 n=10+10) SprintfPrefixedInt-8 72.0B ± 0% 64.0B ± 0% -11.11% (p=0.000 n=10+10) SprintfFloat-8 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.000 n=10+10) SprintfComplex-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10) SprintfBoolean-8 8.00B ± 0% 4.00B ± 0% -50.00% (p=0.000 n=10+10) SprintfHexString-8 96.0B ± 0% 80.0B ± 0% -16.67% (p=0.000 n=10+10) SprintfHexBytes-8 112B ± 0% 112B ± 0% ~ (all equal) SprintfBytes-8 96.0B ± 0% 96.0B ± 0% ~ (all equal) SprintfStringer-8 32.0B ± 0% 32.0B ± 0% ~ (all equal) SprintfStructure-8 256B ± 0% 256B ± 0% ~ (all equal) ManyArgs-8 80.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10) FprintInt-8 8.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10) FprintfBytes-8 32.0B ± 0% 32.0B ± 0% ~ (all equal) FprintIntNoAlloc-8 0.00B 0.00B ~ (all equal) ScanInts-8 15.2kB ± 0% 15.2kB ± 0% ~ (p=0.248 n=9+10) ScanRecursiveInt-8 21.6kB ± 0% 21.6kB ± 0% ~ (all equal) ScanRecursiveIntReaderWrapper-8 21.7kB ± 0% 21.7kB ± 0% ~ (all equal) name old allocs/op new allocs/op delta SprintfPadding-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfEmpty-8 0.00 0.00 ~ (all equal) SprintfString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfTruncateString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfQuoteString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfIntInt-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10) SprintfPrefixedInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfFloat-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfComplex-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfBoolean-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfHexString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) SprintfHexBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) SprintfBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal) SprintfStringer-8 4.00 ± 0% 4.00 ± 0% ~ (all equal) SprintfStructure-8 7.00 ± 0% 7.00 ± 0% ~ (all equal) ManyArgs-8 8.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) FprintInt-8 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10) FprintfBytes-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) FprintIntNoAlloc-8 0.00 0.00 ~ (all equal) ScanInts-8 1.60k ± 0% 1.60k ± 0% ~ (all equal) ScanRecursiveInt-8 1.71k ± 0% 1.71k ± 0% ~ (all equal) ScanRecursiveIntReaderWrapper-8 1.71k ± 0% 1.71k ± 0% ~ (all equal) Change-Id: I7ba72a25fea4140a0ba40a9f443103ed87cc69b5 Reviewed-on: https://go-review.googlesource.com/35554 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-01-09cmd/compile: insert scheduling checks on loop backedgesDavid Chase
Loop breaking with a counter. Benchmarked (see comments), eyeball checked for sanity on popular loops. This code ought to handle loops in general, and properly inserts phi functions in cases where the earlier version might not have. Includes test, plus modifications to test/run.go to deal with timeout and killing looping test. Tests broken by the addition of extra code (branch frequency and live vars) for added checks turn the check insertion off. If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule checks on every backedge of every reducible loop. Alternately, specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will enable it for a single compilation, but because the core Go libraries contain some loops that may run long, this is less likely to have the desired effect. This is intended as a tool to help in the study and diagnosis of GC and other latency problems, now that goal STW GC latency is on the order of 100 microseconds or less. Updates #17831. Updates #10958. Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639 Reviewed-on: https://go-review.googlesource.com/33910 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-10-31cmd/compile: mark temps with new AutoTemp flag, and use it.David Chase
This is an extension of https://go-review.googlesource.com/c/31662/ to mark all the temporaries, not just the ssa-generated ones. Before-and-after ls -l `go tool -n compile` shows a 3% reduction in size (or rather, a prior 3% inflation for failing to filter temps out properly.) Replaced name-dependent "is it a temp?" tests with calls to *Node.IsAutoTmp(), which depends on AutoTemp. Also replace calls to istemp(n) with n.IsAutoTmp(), to reduce duplication and clean up function name space. Generated temporaries now come with a "." prefix to avoid (apparently harmless) clashes with legal Go variable names. Fixes #17644. Fixes #17240. Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43 Reviewed-on: https://go-review.googlesource.com/32255 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-12cmd/compile,runtime: redo how map assignments workKeith Randall
To compile: m[k] = v instead of: mapassign(maptype, m, &k, &v), do do: *mapassign(maptype, m, &k) = v mapassign returns a pointer to the value slot in the map. It is just like mapaccess except that it will allocate a new slot if k is not already present in the map. This makes map accesses faster but potentially larger (codewise). It is faster because the write into the map is done when the compiler knows the concrete type, so it can be done with a few store instructions instead of calling typedmemmove. We also potentially avoid stack temporaries to hold v. The code can be larger when the map has pointers in its value type, since there is a write barrier call in addition to the mapassign call. That makes the code at the callsite a bit bigger (go binary is 0.3% bigger). This CL is in preparation for doing operations like m[k] += v with only a single runtime call. That will roughly double the speed of such operations. Update #17133 Update #5147 Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5 Reviewed-on: https://go-review.googlesource.com/30815 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-03cmd/compile: relax liveness restrictions on ambiguously liveThan McIntosh
Update gc liveness to remove special conservative treatment of ambiguously live vars, since there is no longer a need to protect against GCDEBUG=gcdead. Change-Id: Id6e2d03218f7d67911e8436d283005a124e6957f Reviewed-on: https://go-review.googlesource.com/24896 Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
2016-09-22test: errorcheck auto-generated functionsCherry Zhang
Add an "errorcheckwithauto" action which performs error check including lines with auto-generated functions (excluded by default). Comment "// ERRORAUTO" matches these lines. Add testcase for CL 29570 (as an example). Updates #16016, #17186. Change-Id: Iaba3727336cd602f3dda6b9e5f97dafe0848e632 Reviewed-on: https://go-review.googlesource.com/29652 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>