Age | Commit message (Collapse) | Author |
|
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Most of the test cases in the test directory use the new go:build syntax
already. Convert the rest. In general, try to place the build constraint
line below the test directive comment in more places.
For #41184.
For #60268.
Change-Id: I11c41a0642a8a26dc2eda1406da908645bbc005b
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/536236
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
|
|
Shaves ~1.5kB off cmd/go binary.
Change-Id: I8ad85aa4a24bc197b009c8e1ea9201957222152a
Reviewed-on: https://go-review.googlesource.com/c/go/+/521677
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
|
|
There's no need for distinct hmap and hiter types for each map.
Shaves 9kB off cmd/go binary size.
Change-Id: I7bc3b2d8ec82e7fcd78c1cb17733ebd8b615990a
Reviewed-on: https://go-review.googlesource.com/c/go/+/521615
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
|
|
Sort variables before display so that when there are multiple variables
to report, they are in a consistent order.
Otherwise they are ordered in the order they appear in the fn.Dcl list,
which can vary. Particularly, they vary depending on regabi.
Change-Id: I0db380f7cbe6911e87177503a4c3b39851ff1b5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/462898
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
With the introduction of stack objects, VARKILL information is
no longer needed.
With stack objects, an object is dead when there are no more static
references to it, and the stack scanner can't find any live pointers
to it. VARKILL information isn't used to establish live ranges for
address-taken variables any more. In effect, the last static reference
*is* the VARKILL, and there's an additional dynamic liveness check
during stack scanning.
Next CL will actually rip out the VARKILL opcodes.
Change-Id: I030a2ab867445cf4e0e69397911f8a2e2f0ed07b
Reviewed-on: https://go-review.googlesource.com/c/go/+/419234
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
|
|
This CL applies the same change to test/live.go that was previously
applied to test/live_regabi.go in golang.org/cl/415240. This wasn't
noticed at the time though, because GOEXPERIMENT=unified was only
being tested on linux-amd64, which is a regabi platform.
Change-Id: I0c75c2b7097544305e4174c2f5ec6ec283c81a8e
Reviewed-on: https://go-review.googlesource.com/c/go/+/422254
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
|
|
Simplify the implementation of interface conversions in the compiler.
Don't pass fields that aren't needed (the data word, usually) to the runtime.
For generics, we need to put a dynamic type in an interface. The new
dataWord function is exactly what we need (the type word will come
from a dictionary).
Change-Id: Iade5de5c174854b65ad248f35c7893c603f7be3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/340029
Trust: Keith Randall <khr@golang.org>
Trust: Dan Scales <danscales@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
|
|
Similar to the previous CL to suppress escape analysis diagnostics for
method wrappers, suppress liveness analysis diagnostics too. It's
hardly useful to know that all of a wrapper method's arguments are
live at entry.
Change-Id: I0d1e44552c6334ee3b454adc107430232abcb56a
Reviewed-on: https://go-review.googlesource.com/c/go/+/330749
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
Hardwire regabidefers to true. Remove it from GOEXPERIMENTs.
Fallback paths are not cleaned up in this CL. That will be done
in later CLs.
Change-Id: Iec1112a1e55d5f6ef70232a5ff6e702f649071c4
Reviewed-on: https://go-review.googlesource.com/c/go/+/325913
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
Previously, live.go is conditioned on not using regabidefers. Now
we have regabidefers enabled by default everywhere, and we may
remove the fallback path in the near future, test that
configuration instead.
Change-Id: Idf910aee323bdd6478bc7a2062b2052d82ce003f
Reviewed-on: https://go-review.googlesource.com/c/go/+/325111
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
With defer/go wrapping and register arguments, some liveness info
changed and live.go test was disabled for regabi. This CL adds a
new one for regabi.
Change-Id: I65f03a6ef156366d8b76c62a16251c3e818f4b02
Reviewed-on: https://go-review.googlesource.com/c/go/+/311369
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Adds code to the compiler's "order" phase to rewrite go and defer
statements to always be argument-less. E.g.
defer f(x,y) => x1, y1 := x, y
defer func() { f(x1, y1) }
This transformation is not beneficial on its own, but it helps
simplify runtime defer handling for the new register ABI (when
invoking deferred functions on the panic path, the runtime doesn't
need to manage the complexity of determining which args to pass in
register vs memory).
This feature is currently enabled by default if GOEXPERIMENT=regabi or
GOEXPERIMENT=regabidefer is in effect.
Included in this CL are some workarounds in the runtime to insure that
"go" statement targets in the runtime are argument-less already (since
wrapping them can potentially introduce heap-allocated closures, which
are currently not allowed). The expectation is that these workarounds
will be temporary, and can go away once we either A) change the rules
about heap-allocated closures, or B) implement some other scheme for
handling go statements.
Change-Id: I01060d79a6b140c6f0838d6e6813f807ccdca319
Reviewed-on: https://go-review.googlesource.com/c/go/+/298669
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
There's not really any use to tracking function-scoped constants and
types on Curfn.Dcl, and there's sloppy code that assumes all of the
declarations are variables (e.g., cmpstackvarlt).
Change-Id: I5d10dc681dac2c161c7b73ba808403052ca0608e
Reviewed-on: https://go-review.googlesource.com/c/go/+/274436
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
|
|
The R_CALLRISCV relocation marker is on the JALR instruction, however the actual
relocation is currently two instructions previous for the AUIPC+ADDI sequence.
Adjust the platform dependent offset accordingly and re-enable open-coded defers.
Fixes #36786.
Change-Id: I71597c193c447930fbe94ce44b7355e89ae877bb
Reviewed-on: https://go-review.googlesource.com/c/go/+/216797
Run-TryBot: Joel Sing <joel@sing.id.au>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This test expects that open-coded defers are enabled, which is not currently
the case on riscv64.
Updates issue #27532 and #36786.
Change-Id: I94bb558c5b0734b4cfe5ae12873be81026009bcf
Reviewed-on: https://go-review.googlesource.com/c/go/+/216777
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
With normal (stack-allocated) defers only: 35.4 ns/op
With open-coded defers: 5.6 ns/op
Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
Without open-coded defers: 62.0 ns/op
With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
Without open-coded defers: 443 ns/op
With open-coded defers: 347 ns/op
Updates #14939 (defer performance)
Updates #34481 (design doc)
Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/202340
Reviewed-by: Austin Clements <austin@google.com>
|
|
code and extra funcdata"
This reverts CL 190098.
Reason for revert: broke several builders.
Change-Id: I69161352f9ded02537d8815f259c4d391edd9220
Reviewed-on: https://go-review.googlesource.com/c/go/+/201519
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Dan Scales <danscales@google.com>
|
|
extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
With normal (stack-allocated) defers only: 35.4 ns/op
With open-coded defers: 5.6 ns/op
Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
Without open-coded defers: 62.0 ns/op
With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
Without open-coded defers: 443 ns/op
With open-coded defers: 347 ns/op
Updates #14939 (defer performance)
Updates #34481 (design doc)
Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28
Reviewed-on: https://go-review.googlesource.com/c/go/+/190098
Reviewed-by: Austin Clements <austin@google.com>
|
|
Assinging to 1-element array/1-field struct variable is considered clobbering
the whole variable. By emitting OpVarDef in this case, liveness analysis
can now know the variable is redefined.
Also, the isfat is not necessary anymore, and will be removed in follow up CL.
Fixes #33916
Change-Id: Iece0d90b05273f333d59d6ee5b12ee7dc71908c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/192979
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This reverts commit 53227762153afb39c979810bd59ec139e3c8127d.
Reason for revert: broke js-wasm builder.
Change-Id: If22762317c4a9e00f5060eb84377a4a52d601fca
Reviewed-on: https://go-review.googlesource.com/c/go/+/192157
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
This will improve liveness analysis slightly, the same logic as
isdirectiface curently does. In:
type T struct {
m map[int]int
}
v := T{}
v.m = make(map[int]int)
T is considered "fat", now it is not. So assigning to v.m is considered
to clobber the entire v.
This is follow up of CL 179057.
Change-Id: Id6b4807b8e8521ef5d8bcb14fedb6dceb9dbf18c
Reviewed-on: https://go-review.googlesource.com/c/go/+/179578
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This reverts CL 180761
Reason for revert: Reinstate the stack-allocated defer CL.
There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues.
Issue #32477: Fix has been submitted as CL 181258.
Issue #32498: Possible fix is CL 181377 (not submitted yet).
Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/181378
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6.
Reason for revert: Seems to still have issues around GC.
Fixes #32452
Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/180761
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.
This should make defers like this (which are very common) faster.
This optimization applies to 363 out of the 370 static defer sites
in the cmd/go binary.
name old time/op new time/op delta
Defer-4 52.2ns ± 5% 36.2ns ± 3% -30.70% (p=0.000 n=10+10)
Fixes #6980
Update #14939
Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004
Reviewed-on: https://go-review.googlesource.com/c/go/+/171758
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Make sure the side effects inside short-circuited operations (&& and ||)
happen correctly.
Before this CL, we attached the side effects to the node itself using
exprInPlace. That caused other side effects in sibling expressions
to get reordered with respect to the short circuit side effect.
Instead, rewrite a && b like:
r := a
if r {
r = b
}
That code we can keep correctly ordered with respect to other
side-effects extracted from part of a big expression.
exprInPlace seems generally unsafe. But this was the only case where
exprInPlace is called not at the top level of an expression, so I
don't think the other uses can actually trigger an issue (there can't
be a sibling expression). TODO: maybe those cases don't need "in
place", and we can retire that function generally.
This CL needed a small tweak to the SSA generation of OIF so that the
short circuit optimization still triggers. The short circuit optimization
looks for triangle but not diamonds, so don't bother allocating a block
if it will be empty.
Go 1 benchmarks are in the noise.
Fixes #30566
Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610
Reviewed-on: https://go-review.googlesource.com/c/go/+/165617
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Ensure that we correctly type the stack temps for regular closures,
method function closures, and slice literals.
Then we don't need to override the dummy types later.
Furthermore, this allows order to reuse temporaries of these types.
OARRAYLIT doesn't need a temporary as far as I can tell, so I
removed that case from order.
Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
Reviewed-on: https://go-review.googlesource.com/c/140306
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Instead of allocating a new temporary each time one
is needed, keep a list of temporaries which are free
(have already been VARKILLed on every path) and use
one of them.
Should save a lot of stack space. In a function like this:
func main() {
fmt.Printf("%d %d\n", 2, 3)
fmt.Printf("%d %d\n", 4, 5)
fmt.Printf("%d %d\n", 6, 7)
}
The three [2]interface{} arrays used to hold the ... args
all use the same autotmp, instead of 3 different autotmps
as happened previous to this CL.
Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
Reviewed-on: https://go-review.googlesource.com/c/140301
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.
This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).
func f(a, b, c string) {
fmt.Printf("%s %s\n", a, b)
fmt.Printf("%s %s\n", b, c)
}
This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.
The go binary shrinks 0.3%.
Update #24286
Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.
Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
The previous CL introduced stack objects. This CL removes the old
ambiguously live liveness analysis. After this CL we're relying
on stack objects exclusively.
Update a bunch of liveness tests to reflect the new world.
Fixes #22350
Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31
Reviewed-on: https://go-review.googlesource.com/c/134156
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661
GitHub-Last-Rev: ae85bcf82be8fee533e2b9901c6133921382c70a
GitHub-Pull-Request: golang/go#26920
Reviewed-on: https://go-review.googlesource.com/128955
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This modifies issafepoint in liveness analysis to report almost every
operation as a safe point. There are four things we don't mark as
safe-points:
1. Runtime code (other than at calls).
2. go:nosplit functions (other than at calls).
3. Instructions between the load of the write barrier-enabled flag and
the write.
4. Instructions leading up to a uintptr -> unsafe.Pointer conversion.
We'll optimize this in later CLs:
name old time/op new time/op delta
Template 185ms ± 2% 190ms ± 2% +2.95% (p=0.000 n=10+10)
Unicode 96.3ms ± 3% 96.4ms ± 1% ~ (p=0.905 n=10+9)
GoTypes 658ms ± 0% 669ms ± 1% +1.72% (p=0.000 n=10+9)
Compiler 3.14s ± 1% 3.18s ± 1% +1.56% (p=0.000 n=9+10)
SSA 7.41s ± 2% 7.59s ± 1% +2.48% (p=0.000 n=9+10)
Flate 126ms ± 1% 128ms ± 1% +2.08% (p=0.000 n=10+10)
GoParser 153ms ± 1% 157ms ± 2% +2.38% (p=0.000 n=10+10)
Reflect 437ms ± 1% 442ms ± 1% +0.98% (p=0.001 n=10+10)
Tar 178ms ± 1% 179ms ± 1% +0.67% (p=0.035 n=10+9)
XML 223ms ± 1% 229ms ± 1% +2.58% (p=0.000 n=10+10)
[Geo mean] 394ms 401ms +1.75%
No effect on binary size because we're not yet emitting these extra
safe points.
For #24543.
Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859
Reviewed-on: https://go-review.googlesource.com/109348
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Registration now looks like:
var cases [4]runtime.scases
var order [8]uint16
cases[0].kind = caseSend
cases[0].c = c1
cases[0].elem = &v1
if raceenabled || msanenabled {
selectsetpc(&cases[0])
}
cases[1].kind = caseRecv
cases[1].c = c2
cases[1].elem = &v2
if raceenabled || msanenabled {
selectsetpc(&cases[1])
}
...
Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48
Reviewed-on: https://go-review.googlesource.com/37934
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Now the registration phase looks like:
var cases [4]runtime.scases
var order [8]uint16
selectsend(&cases[0], c1, &v1)
selectrecv(&cases[1], c2, &v2, nil)
selectrecv(&cases[2], c3, &v3, &ok)
selectdefault(&cases[3])
chosen := selectgo(&cases[0], &order[0], 4)
Primarily, this is just preparation for having the compiler open-code
selectsend, selectrecv, and selectdefault.
As a minor benefit, order can now be layed out separately on the stack
in the pointer-free segment, so it won't take up space in the
function's stack pointer maps.
Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45
Reviewed-on: https://go-review.googlesource.com/37933
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
The first word of an interface is a pointer, but for the purposes
of GC we don't need to treat it as such.
1. If it is a non-empty interface, the pointer points to an itab
which is always in persistentalloc space.
2. If it is an empty interface, the pointer points to a _type.
a. If it is a compile-time-allocated type, it points into
the read-only data section.
b. If it is a reflect-allocated type, it points into the Go heap.
Reflect is responsible for keeping a reference to
the underlying type so it won't be GCd.
If we ever have a moving GC, we need to change this for 2b (as
well as scan itabs to update their itab._type fields).
Write barriers on the first word of interfaces have already been removed.
Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959
Reviewed-on: https://go-review.googlesource.com/97518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Handle make(map[any]any) and make(map[any]any, hint) where
hint <= BUCKETSIZE special to allow for faster map initialization
and to improve binary size by using runtime calls with fewer arguments.
Given hint is smaller or equal to BUCKETSIZE in which case
overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap:
* If hmap needs to be allocated on the stack then only hmap's hash0
field needs to be initialized and no call to makemap is needed.
* If hmap needs to be allocated on the heap then a new special
makehmap function will allocate hmap and intialize hmap's
hash0 field.
Reduces size of the godoc by ~36kb.
AMD64
name old time/op new time/op delta
NewEmptyMap 16.6ns ± 2% 5.5ns ± 2% -66.72% (p=0.000 n=10+10)
NewSmallMap 64.8ns ± 1% 56.5ns ± 1% -12.75% (p=0.000 n=9+10)
Updates #6853
Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b
Reviewed-on: https://go-review.googlesource.com/61190
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
In range loops over slices and arrays besides a variable to track the
index an extra variable containing the address of the current element
is used. To compute a pointer to the next element the elements size is
added to the address.
On 386 and amd64 an element of size 1, 2, 4 or 8 bytes can by copied
from an array using a MOV instruction with suitable addressing mode
that uses the start address of the array, the index of the element and
element size as scaling factor. Thereby, for arrays and slices with
suitable element size we can avoid keeping and incrementing an extra
variable to compute the next elements address.
Shrinks cmd/go by 4 kilobytes.
AMD64:
name old time/op new time/op delta
BinaryTree17 2.66s ± 7% 2.54s ± 0% -4.53% (p=0.000 n=10+8)
Fannkuch11 3.02s ± 1% 3.02s ± 1% ~ (p=0.579 n=10+10)
FmtFprintfEmpty 45.6ns ± 1% 42.2ns ± 1% -7.46% (p=0.000 n=10+10)
FmtFprintfString 69.8ns ± 1% 70.4ns ± 1% +0.84% (p=0.041 n=10+10)
FmtFprintfInt 80.1ns ± 1% 79.0ns ± 1% -1.35% (p=0.000 n=10+10)
FmtFprintfIntInt 127ns ± 1% 125ns ± 1% -1.00% (p=0.007 n=10+9)
FmtFprintfPrefixedInt 158ns ± 2% 152ns ± 1% -4.11% (p=0.000 n=10+10)
FmtFprintfFloat 218ns ± 1% 214ns ± 1% -1.61% (p=0.000 n=10+10)
FmtManyArgs 508ns ± 1% 504ns ± 1% -0.93% (p=0.001 n=9+10)
GobDecode 6.76ms ± 1% 6.78ms ± 1% ~ (p=0.353 n=10+10)
GobEncode 5.84ms ± 1% 5.77ms ± 1% -1.31% (p=0.000 n=10+9)
Gzip 223ms ± 1% 218ms ± 1% -2.39% (p=0.000 n=10+10)
Gunzip 40.3ms ± 1% 40.4ms ± 3% ~ (p=0.796 n=10+10)
HTTPClientServer 73.5µs ± 0% 73.3µs ± 0% -0.28% (p=0.000 n=10+9)
JSONEncode 12.7ms ± 1% 12.6ms ± 8% ~ (p=0.173 n=8+10)
JSONDecode 57.5ms ± 1% 56.1ms ± 2% -2.40% (p=0.000 n=10+10)
Mandelbrot200 3.80ms ± 1% 3.86ms ± 6% ~ (p=0.579 n=10+10)
GoParse 3.25ms ± 1% 3.23ms ± 1% ~ (p=0.052 n=10+10)
RegexpMatchEasy0_32 74.4ns ± 1% 76.9ns ± 1% +3.39% (p=0.000 n=10+10)
RegexpMatchEasy0_1K 243ns ± 2% 248ns ± 1% +1.86% (p=0.000 n=10+8)
RegexpMatchEasy1_32 71.0ns ± 2% 72.8ns ± 1% +2.55% (p=0.000 n=10+10)
RegexpMatchEasy1_1K 370ns ± 1% 383ns ± 0% +3.39% (p=0.000 n=10+9)
RegexpMatchMedium_32 107ns ± 0% 113ns ± 1% +5.33% (p=0.000 n=6+10)
RegexpMatchMedium_1K 35.0µs ± 1% 36.0µs ± 1% +3.13% (p=0.000 n=10+10)
RegexpMatchHard_32 1.65µs ± 1% 1.69µs ± 1% +2.23% (p=0.000 n=10+9)
RegexpMatchHard_1K 49.8µs ± 1% 50.6µs ± 1% +1.59% (p=0.000 n=10+10)
Revcomp 398ms ± 1% 396ms ± 1% -0.51% (p=0.043 n=10+10)
Template 63.4ms ± 1% 60.8ms ± 0% -4.11% (p=0.000 n=10+9)
TimeParse 318ns ± 1% 322ns ± 1% +1.10% (p=0.005 n=10+10)
TimeFormat 323ns ± 1% 336ns ± 1% +4.15% (p=0.000 n=10+10)
Updates: #15809.
Change-Id: I55915aaf6d26768e12247f8a8edf14e7630726d1
Reviewed-on: https://go-review.googlesource.com/38061
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This is a crude compiler pass to eliminate stores to auto variables
that are only ever written to.
Eliminates an unnecessary store to x from the following code:
func f() int {
var x := 1
return *(&x)
}
Fixes #19765.
Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa
Reviewed-on: https://go-review.googlesource.com/38746
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Add benchmarks for map delete with int32/int64/string key
Benchmark results on darwin/amd64
name old time/op new time/op delta
MapDelete/Int32/1-8 151ns ± 8% 99ns ± 3% -34.39% (p=0.008 n=5+5)
MapDelete/Int32/2-8 128ns ± 2% 111ns ±15% -13.40% (p=0.040 n=5+5)
MapDelete/Int32/4-8 128ns ± 5% 114ns ± 2% -10.82% (p=0.008 n=5+5)
MapDelete/Int64/1-8 144ns ± 0% 104ns ± 3% -27.53% (p=0.016 n=4+5)
MapDelete/Int64/2-8 153ns ± 1% 126ns ± 3% -17.17% (p=0.008 n=5+5)
MapDelete/Int64/4-8 178ns ± 3% 136ns ± 2% -23.60% (p=0.008 n=5+5)
MapDelete/Str/1-8 187ns ± 3% 171ns ± 3% -8.54% (p=0.008 n=5+5)
MapDelete/Str/2-8 221ns ± 3% 206ns ± 4% -7.18% (p=0.016 n=5+4)
MapDelete/Str/4-8 256ns ± 5% 232ns ± 2% -9.36% (p=0.016 n=4+5)
name old time/op new time/op delta
BinaryTree17-8 2.78s ± 7% 2.70s ± 1% ~ (p=0.151 n=5+5)
Fannkuch11-8 3.21s ± 2% 3.19s ± 1% ~ (p=0.310 n=5+5)
FmtFprintfEmpty-8 49.1ns ± 3% 50.2ns ± 2% ~ (p=0.095 n=5+5)
FmtFprintfString-8 78.6ns ± 4% 80.2ns ± 5% ~ (p=0.460 n=5+5)
FmtFprintfInt-8 79.7ns ± 1% 81.0ns ± 3% ~ (p=0.103 n=5+5)
FmtFprintfIntInt-8 117ns ± 2% 119ns ± 0% ~ (p=0.079 n=5+4)
FmtFprintfPrefixedInt-8 153ns ± 1% 146ns ± 3% -4.19% (p=0.024 n=5+5)
FmtFprintfFloat-8 239ns ± 1% 237ns ± 1% ~ (p=0.246 n=5+5)
FmtManyArgs-8 506ns ± 2% 509ns ± 2% ~ (p=0.238 n=5+5)
GobDecode-8 7.06ms ± 4% 6.86ms ± 1% ~ (p=0.222 n=5+5)
GobEncode-8 6.01ms ± 5% 5.87ms ± 2% ~ (p=0.222 n=5+5)
Gzip-8 246ms ± 4% 236ms ± 1% -4.12% (p=0.008 n=5+5)
Gunzip-8 37.7ms ± 4% 37.3ms ± 1% ~ (p=0.841 n=5+5)
HTTPClientServer-8 64.9µs ± 1% 64.4µs ± 0% -0.80% (p=0.032 n=5+4)
JSONEncode-8 16.0ms ± 2% 16.2ms ±11% ~ (p=0.548 n=5+5)
JSONDecode-8 53.2ms ± 2% 53.1ms ± 4% ~ (p=1.000 n=5+5)
Mandelbrot200-8 4.33ms ± 2% 4.32ms ± 2% ~ (p=0.841 n=5+5)
GoParse-8 3.24ms ± 2% 3.27ms ± 4% ~ (p=0.690 n=5+5)
RegexpMatchEasy0_32-8 86.2ns ± 1% 85.2ns ± 3% ~ (p=0.286 n=5+5)
RegexpMatchEasy0_1K-8 198ns ± 2% 199ns ± 1% ~ (p=0.310 n=5+5)
RegexpMatchEasy1_32-8 82.6ns ± 2% 81.8ns ± 1% ~ (p=0.294 n=5+5)
RegexpMatchEasy1_1K-8 359ns ± 2% 354ns ± 1% -1.39% (p=0.048 n=5+5)
RegexpMatchMedium_32-8 123ns ± 2% 123ns ± 1% ~ (p=0.905 n=5+5)
RegexpMatchMedium_1K-8 38.2µs ± 2% 38.6µs ± 8% ~ (p=0.690 n=5+5)
RegexpMatchHard_32-8 1.92µs ± 2% 1.91µs ± 5% ~ (p=0.460 n=5+5)
RegexpMatchHard_1K-8 57.6µs ± 1% 57.0µs ± 2% ~ (p=0.310 n=5+5)
Revcomp-8 483ms ± 7% 441ms ± 1% -8.79% (p=0.016 n=5+4)
Template-8 58.0ms ± 1% 58.2ms ± 7% ~ (p=0.310 n=5+5)
TimeParse-8 324ns ± 6% 312ns ± 2% ~ (p=0.087 n=5+5)
TimeFormat-8 330ns ± 1% 329ns ± 1% ~ (p=0.968 n=5+5)
name old speed new speed delta
GobDecode-8 109MB/s ± 4% 112MB/s ± 1% ~ (p=0.222 n=5+5)
GobEncode-8 128MB/s ± 5% 131MB/s ± 2% ~ (p=0.222 n=5+5)
Gzip-8 78.9MB/s ± 4% 82.3MB/s ± 1% +4.25% (p=0.008 n=5+5)
Gunzip-8 514MB/s ± 4% 521MB/s ± 1% ~ (p=0.841 n=5+5)
JSONEncode-8 121MB/s ± 2% 120MB/s ±10% ~ (p=0.548 n=5+5)
JSONDecode-8 36.5MB/s ± 2% 36.6MB/s ± 4% ~ (p=1.000 n=5+5)
GoParse-8 17.9MB/s ± 2% 17.7MB/s ± 4% ~ (p=0.730 n=5+5)
RegexpMatchEasy0_32-8 371MB/s ± 1% 375MB/s ± 3% ~ (p=0.310 n=5+5)
RegexpMatchEasy0_1K-8 5.15GB/s ± 1% 5.13GB/s ± 1% ~ (p=0.548 n=5+5)
RegexpMatchEasy1_32-8 387MB/s ± 2% 391MB/s ± 1% ~ (p=0.310 n=5+5)
RegexpMatchEasy1_1K-8 2.85GB/s ± 2% 2.89GB/s ± 1% ~ (p=0.056 n=5+5)
RegexpMatchMedium_32-8 8.07MB/s ± 2% 8.06MB/s ± 1% ~ (p=0.730 n=5+5)
RegexpMatchMedium_1K-8 26.8MB/s ± 2% 26.6MB/s ± 7% ~ (p=0.690 n=5+5)
RegexpMatchHard_32-8 16.7MB/s ± 2% 16.7MB/s ± 5% ~ (p=0.421 n=5+5)
RegexpMatchHard_1K-8 17.8MB/s ± 1% 18.0MB/s ± 2% ~ (p=0.310 n=5+5)
Revcomp-8 527MB/s ± 6% 577MB/s ± 1% +9.44% (p=0.016 n=5+4)
Template-8 33.5MB/s ± 1% 33.4MB/s ± 7% ~ (p=0.310 n=5+5)
Updates #19495
Change-Id: Ib9ece1690813d9b4788455db43d30891e2138df5
Reviewed-on: https://go-review.googlesource.com/38172
Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Add benchmarks for map assignment with int32/int64/string key
Benchmark results on darwin/amd64
name old time/op new time/op delta
MapAssignInt32_255-8 24.7ns ± 3% 17.4ns ± 2% -29.75% (p=0.000 n=10+10)
MapAssignInt32_64k-8 45.5ns ± 4% 37.6ns ± 4% -17.18% (p=0.000 n=10+10)
MapAssignInt64_255-8 26.0ns ± 3% 17.9ns ± 4% -31.03% (p=0.000 n=10+10)
MapAssignInt64_64k-8 46.9ns ± 5% 38.7ns ± 2% -17.53% (p=0.000 n=9+10)
MapAssignStr_255-8 47.8ns ± 3% 24.8ns ± 4% -48.01% (p=0.000 n=10+10)
MapAssignStr_64k-8 83.0ns ± 3% 51.9ns ± 3% -37.45% (p=0.000 n=10+9)
name old time/op new time/op delta
BinaryTree17-8 3.11s ±19% 2.78s ± 3% ~ (p=0.095 n=5+5)
Fannkuch11-8 3.26s ± 1% 3.21s ± 2% ~ (p=0.056 n=5+5)
FmtFprintfEmpty-8 50.3ns ± 1% 50.8ns ± 2% ~ (p=0.246 n=5+5)
FmtFprintfString-8 82.7ns ± 4% 80.1ns ± 5% ~ (p=0.238 n=5+5)
FmtFprintfInt-8 82.6ns ± 2% 81.9ns ± 3% ~ (p=0.508 n=5+5)
FmtFprintfIntInt-8 124ns ± 4% 121ns ± 3% ~ (p=0.111 n=5+5)
FmtFprintfPrefixedInt-8 158ns ± 6% 160ns ± 2% ~ (p=0.341 n=5+5)
FmtFprintfFloat-8 249ns ± 2% 245ns ± 2% ~ (p=0.095 n=5+5)
FmtManyArgs-8 513ns ± 2% 519ns ± 3% ~ (p=0.151 n=5+5)
GobDecode-8 7.48ms ±12% 7.11ms ± 2% ~ (p=0.222 n=5+5)
GobEncode-8 6.25ms ± 1% 6.03ms ± 2% -3.56% (p=0.008 n=5+5)
Gzip-8 252ms ± 4% 252ms ± 4% ~ (p=1.000 n=5+5)
Gunzip-8 38.4ms ± 3% 38.6ms ± 2% ~ (p=0.690 n=5+5)
HTTPClientServer-8 76.9µs ±41% 66.4µs ± 6% ~ (p=0.310 n=5+5)
JSONEncode-8 16.5ms ± 3% 16.7ms ± 3% ~ (p=0.421 n=5+5)
JSONDecode-8 54.6ms ± 1% 54.3ms ± 2% ~ (p=0.548 n=5+5)
Mandelbrot200-8 4.45ms ± 3% 4.47ms ± 1% ~ (p=0.841 n=5+5)
GoParse-8 3.43ms ± 1% 3.32ms ± 2% -3.28% (p=0.008 n=5+5)
RegexpMatchEasy0_32-8 88.2ns ± 3% 89.4ns ± 2% ~ (p=0.333 n=5+5)
RegexpMatchEasy0_1K-8 205ns ± 1% 206ns ± 1% ~ (p=0.905 n=5+5)
RegexpMatchEasy1_32-8 85.1ns ± 1% 85.5ns ± 5% ~ (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8 365ns ± 1% 371ns ± 9% ~ (p=1.000 n=5+5)
RegexpMatchMedium_32-8 129ns ± 2% 128ns ± 3% ~ (p=0.730 n=5+5)
RegexpMatchMedium_1K-8 39.8µs ± 0% 39.7µs ± 4% ~ (p=0.730 n=4+5)
RegexpMatchHard_32-8 1.99µs ± 3% 2.05µs ±16% ~ (p=0.794 n=5+5)
RegexpMatchHard_1K-8 59.3µs ± 1% 60.3µs ± 7% ~ (p=1.000 n=5+5)
Revcomp-8 1.36s ±63% 0.52s ± 5% ~ (p=0.095 n=5+5)
Template-8 62.6ms ±14% 60.5ms ± 5% ~ (p=0.690 n=5+5)
TimeParse-8 330ns ± 2% 324ns ± 2% ~ (p=0.087 n=5+5)
TimeFormat-8 350ns ± 3% 340ns ± 1% -2.86% (p=0.008 n=5+5)
name old speed new speed delta
GobDecode-8 103MB/s ±11% 108MB/s ± 2% ~ (p=0.222 n=5+5)
GobEncode-8 123MB/s ± 1% 127MB/s ± 2% +3.71% (p=0.008 n=5+5)
Gzip-8 77.1MB/s ± 4% 76.9MB/s ± 3% ~ (p=1.000 n=5+5)
Gunzip-8 505MB/s ± 3% 503MB/s ± 2% ~ (p=0.690 n=5+5)
JSONEncode-8 118MB/s ± 3% 116MB/s ± 3% ~ (p=0.421 n=5+5)
JSONDecode-8 35.5MB/s ± 1% 35.8MB/s ± 2% ~ (p=0.397 n=5+5)
GoParse-8 16.9MB/s ± 1% 17.4MB/s ± 2% +3.45% (p=0.008 n=5+5)
RegexpMatchEasy0_32-8 363MB/s ± 3% 358MB/s ± 2% ~ (p=0.421 n=5+5)
RegexpMatchEasy0_1K-8 4.98GB/s ± 1% 4.97GB/s ± 1% ~ (p=0.548 n=5+5)
RegexpMatchEasy1_32-8 376MB/s ± 1% 375MB/s ± 5% ~ (p=0.690 n=5+5)
RegexpMatchEasy1_1K-8 2.80GB/s ± 1% 2.76GB/s ± 9% ~ (p=0.841 n=5+5)
RegexpMatchMedium_32-8 7.73MB/s ± 1% 7.76MB/s ± 3% ~ (p=0.730 n=5+5)
RegexpMatchMedium_1K-8 25.8MB/s ± 0% 25.8MB/s ± 4% ~ (p=0.651 n=4+5)
RegexpMatchHard_32-8 16.1MB/s ± 3% 15.7MB/s ±14% ~ (p=0.794 n=5+5)
RegexpMatchHard_1K-8 17.3MB/s ± 1% 17.0MB/s ± 7% ~ (p=0.984 n=5+5)
Revcomp-8 273MB/s ±83% 488MB/s ± 5% ~ (p=0.095 n=5+5)
Template-8 31.1MB/s ±13% 32.1MB/s ± 5% ~ (p=0.690 n=5+5)
Updates #19495
Change-Id: I116e9a2a4594769318b22d736464de8a98499909
Reviewed-on: https://go-review.googlesource.com/38091
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.
test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.
Updates #19331.
Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Prior to this CL, all runtime conversions
from a concrete value to an interface went
through one of two runtime calls: convT2E or convT2I.
However, in practice, basic types are very common.
Specializing convT2x for those basic types allows
for a more efficient implementation for those types.
For basic scalars and strings, allocation and copying
can use the same methods as normal code.
For pointer-free types, allocation can occur without
zeroing, and copying can take place without GC calls.
For slices, copying is cheaper and simpler.
This CL adds twelve runtime routines:
convT2E16, convT2I16
convT2E32, convT2I32
convT2E64, convT2I64
convT2Estring, convT2Istring
convT2Eslice, convT2Islice
convT2Enoptr, convT2Inoptr
While compiling make.bash, 93% of all convT2x calls
are now to one of these specialized convT2x call.
Within specialized convT2x routines, it is cheap to check
for a zero value, in a way that it is not in general.
When we detect a zero value there, we return a pointer
to zeroVal, rather than allocating.
name old time/op new time/op delta
ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56)
ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60)
ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57)
ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60)
ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59)
ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57)
ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60)
ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57)
ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58)
ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55)
ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52)
ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59)
Benchmarks on a real program (the compiler):
name old time/op new time/op delta
Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26)
Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26)
GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30)
Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27)
Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28)
GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30)
Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30)
Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30)
XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29)
name old user-ns/op new user-ns/op delta
Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28)
Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27)
GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30)
Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28)
Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29)
GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30)
Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30)
Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30)
XML 288user-ms ± 4% 288user-ms ± 4% ~ (p=0.633 n=30+30)
name old alloc/op new alloc/op delta
Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29)
Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30)
GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30)
Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29)
Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30)
GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28)
Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30)
Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28)
XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30)
name old allocs/op new allocs/op delta
Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30)
Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30)
GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30)
Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29)
Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30)
GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28)
Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30)
Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29)
XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30)
Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f
Reviewed-on: https://go-review.googlesource.com/36476
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
The gcCompat mode was introduced to match the new parser's node position
setup exactly with the positions used by the original parser. Some of the
gcCompat adjustments were required to satisfy syntax error test cases,
and the rest were required to make toolstash cmp pass.
This change removes the former gcCompat adjustments and instead adjusts
the respective test cases as necessary. In some cases this makes the error
lines consistent with the ones reported by gccgo.
Where it has changed, the position associated with a given syntactic construct
is the position (line/col number) of the left-most token belonging to the
construct.
Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c
Reviewed-on: https://go-review.googlesource.com/36695
Run-TryBot: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
If there is a defer, and that defer recovers, then the caller
can see all of the output parameters. That means that we must
mark all the output parameters live at any point which might panic.
If there is no defer then this is not necessary. This is implemented.
We could also detect whether there is a recover in any of the defers.
If not, we would need to mark only output params that the defer
actually references (and the closure mechanism already does that).
This is not implemented.
Fixes #18860.
Change-Id: If984fe6686eddce9408bf25e725dd17fc16b8578
Reviewed-on: https://go-review.googlesource.com/36030
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The order pass is responsible for ensuring that
values passed to runtime functions, including
convT2E/convT2I, are addressable.
Prior to this CL, this was always accomplished
by creating a temp, which frequently escaped to
the heap, causing allocations, perhaps most
notably in code like:
fmt.Println(1, 2, 3) // allocates three times
None of the runtime routines modify the contents
of the pointers they receive, so in the case of
constants, instead of creating a temp value,
we can create a static value.
(Marking the static value as read-only provides
protection against accidental attempts by the runtime
to modify the constant data.)
This improves code generation for code like:
panic("abc")
c <- 2 // c is a chan int
which can now simply refer to "abc" and 2,
rather than going by way of a temporary.
It also allows us to optimize convT2E/convT2I,
by recognizing static readonly values
and directly constructing the interface.
This CL adds ~0.5% to binary size, despite
decreasing the size of many functions,
because it also adds many static symbols.
This binary size regression could be recovered in
future (but currently unplanned) work.
There is a lot of content-duplication in these
symbols; this statement generates six new symbols,
three containing an int 1 and three containing
a pointer to the string "a":
fmt.Println(1, 1, 1, "a", "a", "a")
These symbols could be made content-addressable.
Furthermore, these symbols are small, so the
alignment and naming overhead is large.
As with the go.strings section, these symbols
could be hidden and have their alignment reduced.
The changes to test/live.go make it impossible
(at least with current optimization techniques)
to place the values being passed to the runtime
in static symbols, preserving autotmp creation.
Fixes #18704
Benchmarks from fmt and go-kit's logging package:
github.com/go-kit/kit/log
name old time/op new time/op delta
JSONLoggerSimple-8 1.91µs ± 2% 2.11µs ±22% ~ (p=1.000 n=9+10)
JSONLoggerContextual-8 2.60µs ± 6% 2.43µs ± 2% -6.29% (p=0.000 n=9+10)
Discard-8 101ns ± 2% 34ns ±14% -66.33% (p=0.000 n=10+9)
OneWith-8 161ns ± 1% 102ns ±16% -36.78% (p=0.000 n=10+10)
TwoWith-8 175ns ± 3% 106ns ± 7% -39.36% (p=0.000 n=10+9)
TenWith-8 293ns ± 3% 227ns ±15% -22.44% (p=0.000 n=9+10)
LogfmtLoggerSimple-8 704ns ± 2% 608ns ± 2% -13.65% (p=0.000 n=10+9)
LogfmtLoggerContextual-8 962ns ± 1% 860ns ±17% -10.57% (p=0.003 n=9+10)
NopLoggerSimple-8 188ns ± 1% 120ns ± 1% -36.39% (p=0.000 n=9+10)
NopLoggerContextual-8 379ns ± 1% 243ns ± 0% -35.77% (p=0.000 n=9+10)
ValueBindingTimestamp-8 577ns ± 1% 499ns ± 1% -13.51% (p=0.000 n=10+10)
ValueBindingCaller-8 898ns ± 2% 844ns ± 2% -6.00% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
JSONLoggerSimple-8 904B ± 0% 872B ± 0% -3.54% (p=0.000 n=10+10)
JSONLoggerContextual-8 1.20kB ± 0% 1.14kB ± 0% -5.33% (p=0.000 n=10+10)
Discard-8 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10)
OneWith-8 96.0B ± 0% 64.0B ± 0% -33.33% (p=0.000 n=10+10)
TwoWith-8 160B ± 0% 128B ± 0% -20.00% (p=0.000 n=10+10)
TenWith-8 672B ± 0% 640B ± 0% -4.76% (p=0.000 n=10+10)
LogfmtLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10)
LogfmtLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10)
NopLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10)
NopLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10)
ValueBindingTimestamp-8 159B ± 0% 127B ± 0% -20.13% (p=0.000 n=10+10)
ValueBindingCaller-8 112B ± 0% 80B ± 0% -28.57% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
JSONLoggerSimple-8 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10)
JSONLoggerContextual-8 25.0 ± 0% 21.0 ± 0% -16.00% (p=0.000 n=10+10)
Discard-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
OneWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
TwoWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
TenWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
LogfmtLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
LogfmtLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10)
NopLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
NopLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10)
ValueBindingTimestamp-8 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10)
ValueBindingCaller-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
fmt
name old time/op new time/op delta
SprintfPadding-8 88.9ns ± 3% 79.1ns ± 1% -11.09% (p=0.000 n=10+7)
SprintfEmpty-8 12.6ns ± 3% 12.8ns ± 3% ~ (p=0.136 n=10+10)
SprintfString-8 38.7ns ± 5% 26.9ns ± 6% -30.65% (p=0.000 n=10+10)
SprintfTruncateString-8 56.7ns ± 2% 47.0ns ± 3% -17.05% (p=0.000 n=10+10)
SprintfQuoteString-8 164ns ± 2% 153ns ± 2% -7.01% (p=0.000 n=10+10)
SprintfInt-8 38.9ns ±15% 26.5ns ± 2% -31.93% (p=0.000 n=10+9)
SprintfIntInt-8 60.3ns ± 9% 38.2ns ± 1% -36.67% (p=0.000 n=10+8)
SprintfPrefixedInt-8 58.6ns ±13% 51.2ns ±11% -12.66% (p=0.001 n=10+10)
SprintfFloat-8 71.4ns ± 3% 64.2ns ± 3% -10.08% (p=0.000 n=8+10)
SprintfComplex-8 175ns ± 3% 159ns ± 2% -9.03% (p=0.000 n=10+10)
SprintfBoolean-8 33.5ns ± 4% 25.7ns ± 5% -23.28% (p=0.000 n=10+10)
SprintfHexString-8 65.3ns ± 3% 51.7ns ± 5% -20.86% (p=0.000 n=10+9)
SprintfHexBytes-8 67.2ns ± 5% 67.9ns ± 4% ~ (p=0.383 n=10+10)
SprintfBytes-8 129ns ± 7% 124ns ± 7% ~ (p=0.074 n=9+10)
SprintfStringer-8 127ns ± 4% 126ns ± 8% ~ (p=0.506 n=9+10)
SprintfStructure-8 357ns ± 3% 359ns ± 3% ~ (p=0.469 n=10+10)
ManyArgs-8 203ns ± 6% 126ns ± 3% -37.94% (p=0.000 n=10+10)
FprintInt-8 119ns ±10% 74ns ± 3% -37.54% (p=0.000 n=10+10)
FprintfBytes-8 122ns ± 4% 120ns ± 3% ~ (p=0.124 n=10+10)
FprintIntNoAlloc-8 78.2ns ± 5% 74.1ns ± 3% -5.28% (p=0.000 n=10+10)
ScanInts-8 349µs ± 1% 349µs ± 0% ~ (p=0.606 n=9+8)
ScanRecursiveInt-8 43.8ms ± 7% 40.1ms ± 2% -8.42% (p=0.000 n=10+10)
ScanRecursiveIntReaderWrapper-8 43.5ms ± 4% 40.4ms ± 2% -7.16% (p=0.000 n=10+9)
name old alloc/op new alloc/op delta
SprintfPadding-8 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfEmpty-8 0.00B 0.00B ~ (all equal)
SprintfString-8 21.0B ± 0% 5.0B ± 0% -76.19% (p=0.000 n=10+10)
SprintfTruncateString-8 32.0B ± 0% 16.0B ± 0% -50.00% (p=0.000 n=10+10)
SprintfQuoteString-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfInt-8 16.0B ± 0% 1.0B ± 0% -93.75% (p=0.000 n=10+10)
SprintfIntInt-8 24.0B ± 0% 3.0B ± 0% -87.50% (p=0.000 n=10+10)
SprintfPrefixedInt-8 72.0B ± 0% 64.0B ± 0% -11.11% (p=0.000 n=10+10)
SprintfFloat-8 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.000 n=10+10)
SprintfComplex-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfBoolean-8 8.00B ± 0% 4.00B ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexString-8 96.0B ± 0% 80.0B ± 0% -16.67% (p=0.000 n=10+10)
SprintfHexBytes-8 112B ± 0% 112B ± 0% ~ (all equal)
SprintfBytes-8 96.0B ± 0% 96.0B ± 0% ~ (all equal)
SprintfStringer-8 32.0B ± 0% 32.0B ± 0% ~ (all equal)
SprintfStructure-8 256B ± 0% 256B ± 0% ~ (all equal)
ManyArgs-8 80.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
FprintInt-8 8.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10)
FprintfBytes-8 32.0B ± 0% 32.0B ± 0% ~ (all equal)
FprintIntNoAlloc-8 0.00B 0.00B ~ (all equal)
ScanInts-8 15.2kB ± 0% 15.2kB ± 0% ~ (p=0.248 n=9+10)
ScanRecursiveInt-8 21.6kB ± 0% 21.6kB ± 0% ~ (all equal)
ScanRecursiveIntReaderWrapper-8 21.7kB ± 0% 21.7kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
SprintfPadding-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfEmpty-8 0.00 0.00 ~ (all equal)
SprintfString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfTruncateString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfQuoteString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfIntInt-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
SprintfPrefixedInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfFloat-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfComplex-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfBoolean-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
SprintfBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
SprintfStringer-8 4.00 ± 0% 4.00 ± 0% ~ (all equal)
SprintfStructure-8 7.00 ± 0% 7.00 ± 0% ~ (all equal)
ManyArgs-8 8.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
FprintInt-8 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
FprintfBytes-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
FprintIntNoAlloc-8 0.00 0.00 ~ (all equal)
ScanInts-8 1.60k ± 0% 1.60k ± 0% ~ (all equal)
ScanRecursiveInt-8 1.71k ± 0% 1.71k ± 0% ~ (all equal)
ScanRecursiveIntReaderWrapper-8 1.71k ± 0% 1.71k ± 0% ~ (all equal)
Change-Id: I7ba72a25fea4140a0ba40a9f443103ed87cc69b5
Reviewed-on: https://go-review.googlesource.com/35554
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Loop breaking with a counter. Benchmarked (see comments),
eyeball checked for sanity on popular loops. This code
ought to handle loops in general, and properly inserts phi
functions in cases where the earlier version might not have.
Includes test, plus modifications to test/run.go to deal with
timeout and killing looping test. Tests broken by the addition
of extra code (branch frequency and live vars) for added
checks turn the check insertion off.
If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule
checks on every backedge of every reducible loop. Alternately,
specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will
enable it for a single compilation, but because the core Go
libraries contain some loops that may run long, this is less
likely to have the desired effect.
This is intended as a tool to help in the study and diagnosis
of GC and other latency problems, now that goal STW GC latency
is on the order of 100 microseconds or less.
Updates #17831.
Updates #10958.
Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639
Reviewed-on: https://go-review.googlesource.com/33910
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This is an extension of
https://go-review.googlesource.com/c/31662/
to mark all the temporaries, not just the ssa-generated ones.
Before-and-after ls -l `go tool -n compile` shows a 3%
reduction in size (or rather, a prior 3% inflation for
failing to filter temps out properly.)
Replaced name-dependent "is it a temp?" tests with calls to
*Node.IsAutoTmp(), which depends on AutoTemp. Also replace
calls to istemp(n) with n.IsAutoTmp(), to reduce duplication
and clean up function name space. Generated temporaries
now come with a "." prefix to avoid (apparently harmless)
clashes with legal Go variable names.
Fixes #17644.
Fixes #17240.
Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43
Reviewed-on: https://go-review.googlesource.com/32255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
To compile:
m[k] = v
instead of:
mapassign(maptype, m, &k, &v), do
do:
*mapassign(maptype, m, &k) = v
mapassign returns a pointer to the value slot in the map. It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.
This makes map accesses faster but potentially larger (codewise).
It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove. We also potentially
avoid stack temporaries to hold v.
The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).
This CL is in preparation for doing operations like m[k] += v with
only a single runtime call. That will roughly double the speed of
such operations.
Update #17133
Update #5147
Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Update gc liveness to remove special conservative treatment
of ambiguously live vars, since there is no longer a need to
protect against GCDEBUG=gcdead.
Change-Id: Id6e2d03218f7d67911e8436d283005a124e6957f
Reviewed-on: https://go-review.googlesource.com/24896
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
|
|
Add an "errorcheckwithauto" action which performs error check
including lines with auto-generated functions (excluded by
default). Comment "// ERRORAUTO" matches these lines.
Add testcase for CL 29570 (as an example).
Updates #16016, #17186.
Change-Id: Iaba3727336cd602f3dda6b9e5f97dafe0848e632
Reviewed-on: https://go-review.googlesource.com/29652
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|