Age | Commit message (Collapse) | Author |
|
Don't add them to files in vendor and cmd/vendor though. These will be
pulled in by updating the respective dependencies.
For #41184
Change-Id: Icc57458c9b3033c347124323f33084c85b224c70
Reviewed-on: https://go-review.googlesource.com/c/go/+/319389
Trust: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The functions SwapPointer and CompareAndSwapPointer can be used to
interact with unsafe.Pointer, however generally it is prefered to work
with Value, due to its safer interface. As such, they have been added
along with glue logic to maintain invariants Value guarantees.
To meet these guarantees, the current implementation duplicates much of
the Store function. Some of this is due to inexperience with concurrency
and desire for correctness, but the lack of generic programming
functionality does not help.
Fixes #39351
Change-Id: I1aa394b1e70944736ac1e19de49fe861e1e46fba
Reviewed-on: https://go-review.googlesource.com/c/go/+/241678
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts CL 303949.
Reason for revert: broke linux-arm-aws TryBots.
Change-Id: Ib44949df70520cdabff857846be0d2221403d2f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/313630
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
This CL provide abilty to randomly select P to steal object from its
shared queue. In order to provide such ability randomOrder structure
was copied from runtime/proc.go.
It should reduce contention in firsts Ps and improve balance of object
stealing across all Ps. Also, the patch provides new benchmark
PoolStarvation which force Ps to steal objects.
Benchmarks:
name old time/op new time/op delta
Pool-8 2.16ns ±14% 2.14ns ±16% ~ (p=0.425 n=10+10)
PoolOverflow-8 489ns ± 0% 489ns ± 0% ~ (p=0.719 n=9+10)
PoolStarvation-8 7.00µs ± 4% 6.59µs ± 2% -5.86% (p=0.000 n=10+10)
PoolSTW-8 15.1µs ± 1% 15.2µs ± 1% +0.99% (p=0.001 n=10+10)
PoolExpensiveNew-8 1.25ms ±10% 1.31ms ± 9% ~ (p=0.143 n=10+10)
[Geo mean] 2.68µs 2.68µs -0.28%
name old p50-ns/STW new p50-ns/STW delta
PoolSTW-8 15.0k ± 1% 15.1k ± 1% +0.92% (p=0.000 n=10+10)
name old p95-ns/STW new p95-ns/STW delta
PoolSTW-8 16.2k ± 3% 16.4k ± 2% ~ (p=0.143 n=10+10)
name old GCs/op new GCs/op delta
PoolExpensiveNew-8 0.29 ± 2% 0.30 ± 1% +2.84% (p=0.000 n=8+10)
name old New/op new New/op delta
PoolExpensiveNew-8 8.07 ±11% 8.49 ±10% ~ (p=0.123 n=10+10)
Change-Id: I3ca1d0bf1f358b1148c58e64740fb2d5bfc0bc02
Reviewed-on: https://go-review.googlesource.com/c/go/+/303949
Reviewed-by: David Chase <drchase@google.com>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
As discussed in: https://github.com/golang/go/issues/45429, about entry
type comments, it is possible for p == nil when m.dirty != nil, so
update the commemt about it.
Fixes #45429
Change-Id: I7ef96ee5b6948df9ac736481d177a59ab66d7d4d
GitHub-Last-Rev: 202c598a0ab98f4634cb56fe2486e8e82f9d991f
GitHub-Pull-Request: golang/go#45443
Reviewed-on: https://go-review.googlesource.com/c/go/+/308292
Reviewed-by: Changkun Ou <euryugasaki@gmail.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Trust: Robert Findley <rfindley@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This aligns the naming with GOARCH using 386 as a build target for
this architecture and makes it more easily found when searching
for documentation related to the build target.
Change-Id: I393bb89dd2f71e568124107b13e1b288fbd0c76a
Reviewed-on: https://go-review.googlesource.com/c/go/+/271988
Trust: Martin Möhrmann <moehrmann@google.com>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
RWMutex provides explicit acquire/release synchronization events to the
race detector to model the mutex. It disables sync events within the
methods to avoid e.g., the atomics from adding false synchronization
events, which could cause false negatives in the race detector.
Change-Id: I5126ce2efaab151811ac264864aab1fa025a4aaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/270865
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
|
|
CL 253748 introduced a special case in cmd/go to allow sync to import
runtime/internal/atomic. Besides introducing unnecessary complexity
into cmd/go, this breaks other packages (like gopls) that understand
how imports work, but don't understand this special case.
Fix this by using the more standard linkname-based approach to pull
the necessary functions from runtime/internal/atomic into sync. Since
these are compiler intrinsics, we also have to tell the compiler that
the linknamed symbols are intrinsics to get this optimization in sync.
Fixes #42196.
Change-Id: I1f91498c255c91583950886a89c3c9adc39a32f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/265124
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Fixes #42160.
Change-Id: I9bf8b6f0bf1eccd3ab32cbd94c812f768746d291
Reviewed-on: https://go-review.googlesource.com/c/go/+/264557
Trust: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
Add an internal atomic intrinsic for load with acquire semantics
(extending LoadAcq to 64b) and add LoadAcquintptr for internal
use within the sync package. For other arches, this remaps to the
appropriate atomic.Load{,64} intrinsic which should not alter code
generation.
Similarly, add StoreRel{uintptr,64} for consistency, and inline.
Finally, add an exception to allow sync to directly use the
runtime/internal/atomic package which avoids more convoluted
workarounds (contributed by Lynn Boger).
In an extreme example, sync.(*Pool).pin consumes 20% of wall time
during fmt tests. This is reduced to 5% on ppc64le/power9.
From the fmt benchmarks on ppc64le:
name old time/op new time/op delta
SprintfPadding 468ns ± 0% 451ns ± 0% -3.63%
SprintfEmpty 73.3ns ± 0% 51.9ns ± 0% -29.20%
SprintfString 135ns ± 0% 122ns ± 0% -9.63%
SprintfTruncateString 232ns ± 0% 214ns ± 0% -7.76%
SprintfTruncateBytes 216ns ± 0% 202ns ± 0% -6.48%
SprintfSlowParsingPath 162ns ± 0% 142ns ± 0% -12.35%
SprintfQuoteString 1.00µs ± 0% 0.99µs ± 0% -1.39%
SprintfInt 117ns ± 0% 104ns ± 0% -11.11%
SprintfIntInt 190ns ± 0% 175ns ± 0% -7.89%
SprintfPrefixedInt 232ns ± 0% 212ns ± 0% -8.62%
SprintfFloat 270ns ± 0% 255ns ± 0% -5.56%
SprintfComplex 1.01µs ± 0% 0.99µs ± 0% -1.68%
SprintfBoolean 127ns ± 0% 111ns ± 0% -12.60%
SprintfHexString 220ns ± 0% 198ns ± 0% -10.00%
SprintfHexBytes 261ns ± 0% 252ns ± 0% -3.45%
SprintfBytes 600ns ± 0% 590ns ± 0% -1.67%
SprintfStringer 684ns ± 0% 658ns ± 0% -3.80%
SprintfStructure 2.57µs ± 0% 2.57µs ± 0% -0.12%
ManyArgs 669ns ± 0% 646ns ± 0% -3.44%
FprintInt 140ns ± 0% 136ns ± 0% -2.86%
FprintfBytes 184ns ± 0% 181ns ± 0% -1.63%
FprintIntNoAlloc 140ns ± 0% 136ns ± 0% -2.86%
ScanInts 929µs ± 0% 921µs ± 0% -0.79%
ScanRecursiveInt 122ms ± 0% 121ms ± 0% -0.11%
ScanRecursiveIntReaderWrapper 122ms ± 0% 122ms ± 0% -0.18%
Change-Id: I4d66780261b57b06ef600229e475462e7313f0d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/253748
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
On 386 and arm, unaligned 64-bit atomics aren't safe, so we check for
this and panic. Currently, we panic by dereferencing nil, which may be
expedient but is pretty user-hostile since it gives no hint of what
the actual problem was.
This CL replaces this with an actual panic. The only subtlety here is
now the atomic assembly implementations are calling back into Go, so
they have to play nicely with stack maps and stack scanning. On 386,
this just requires declaring NO_LOCAL_POINTERS. On arm, this is
somewhat more complicated: first, we have to move the alignment check
into the functions that have Go signatures. Then we have to support
both the tail call from these functions to the underlying
implementation (which requires that they have no frame) and the call
into Go to panic (which requires that they have a frame). We resolve
this by forcing them to have no frame and setting up the frame
manually just before the panic call.
Change-Id: I19f1e860045df64088013db37a18acea47342c69
Reviewed-on: https://go-review.googlesource.com/c/go/+/262778
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
.
Change-Id: I26fa26d67d01bcd583a1efaaf9a38398cbf793f7
GitHub-Last-Rev: ded020d02ca2e429f7c31065e5a27dae6eb7a611
GitHub-Pull-Request: golang/go#41932
Reviewed-on: https://go-review.googlesource.com/c/go/+/261477
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Fixes #40999
Change-Id: Ie32427e5cb5ed512b976b554850f50be156ce9f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/250197
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
Makes sure the copyright notice is not interpreted as the package level
godoc.
Change-Id: I2afce7c9d620f19d51ec1438b1d0db1774b57146
Reviewed-on: https://go-review.googlesource.com/c/go/+/248760
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
|
|
If the timeout triggers before writing to the done channel, the
goroutine will be blocked waiting for a corresponding read that’s
no longer existent, thus a goroutine leak. This change fixes that by
using a buffered channel instead.
Change-Id: I9cf4067a58bc5a729ab31e4426edd78bd359e8e0
GitHub-Last-Rev: a7d811a7be6d875175a894e53d474aa0034e7d2c
GitHub-Pull-Request: golang/go#40236
Reviewed-on: https://go-review.googlesource.com/c/go/+/242902
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
For #38029
Change-Id: I71de2b66c1de617d32c46d4f2c1866f9ff1756ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/244631
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Ensure mutexes are unlocked right before panics, where defers aren’t easily usable.
Change-Id: I67c9870e7a626f590a8de8df6c8341c5483918dc
GitHub-Last-Rev: bb8ffe538b3956892b55884fd64eadfce326f7b0
GitHub-Pull-Request: golang/go#37143
Reviewed-on: https://go-review.googlesource.com/c/go/+/218717
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.
Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.
Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.
See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.
I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).
Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.
Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.
To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.
Fixes #38029
Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
It plays way too loose with unsafe.Pointer rules.
It runs afoul of the checkptr rules, so some race detector builds
were failing.
Fixes #38210
Change-Id: I5e1c78201d06295524fdedb3fe5b49d61446f443
Reviewed-on: https://go-review.googlesource.com/c/go/+/226880
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
The func has been unused since https://golang.org/cl/93637 in 2018.
Change-Id: I1cab6f265aa5058ac080fd7c7cbf0fe85370f073
Reviewed-on: https://go-review.googlesource.com/c/go/+/224077
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This CL changes some unit test functions, making sure that these tests (and goroutines spawned during test) won't block.
Since they are just test functions, I use one CL to fix them all. I hope this won't cause trouble to reviewers and can save time for us.
There are three main categories of incorrect logic fixed by this CL:
1. Use testing.Fatal()/Fatalf() in spawned goroutines, which is forbidden by Go's document.
2. Channels are used in such a way that, when errors or timeout happen, the test will be blocked and never return.
3. Channels are used in such a way that, when errors or timeout happen, the test can return but some spawned goroutines will be leaked, occupying resource until all other tests return and the process is killed.
Change-Id: I3df931ec380794a0cf1404e632c1dd57c65d63e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/219380
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This CL implements a LoadAndDelete method in sync.Map. Benchmark:
name time/op
LoadAndDeleteBalanced/*sync_test.RWMutexMap-12 98.8ns ± 1%
LoadAndDeleteBalanced/*sync.Map-12 10.3ns ±11%
LoadAndDeleteUnique/*sync_test.RWMutexMap-12 99.2ns ± 2%
LoadAndDeleteUnique/*sync.Map-12 6.63ns ±10%
LoadAndDeleteCollision/*sync_test.DeepCopyMap-12 140ns ± 0%
LoadAndDeleteCollision/*sync_test.RWMutexMap-12 75.2ns ± 2%
LoadAndDeleteCollision/*sync.Map-12 5.21ns ± 5%
In addition, Delete is bounded and more efficient if many collisions:
DeleteCollision/*sync_test.DeepCopyMap-12 120ns ± 2% 125ns ± 1% +3.80% (p=0.000 n=10+9)
DeleteCollision/*sync_test.RWMutexMap-12 73.5ns ± 3% 79.5ns ± 1% +8.03% (p=0.000 n=10+9)
DeleteCollision/*sync.Map-12 97.8ns ± 3% 5.9ns ± 4% -94.00% (p=0.000 n=10+10)
Fixes #33762
Change-Id: Ic8469a7861d27ab0edeface0078aad8af9b26c2f
Reviewed-on: https://go-review.googlesource.com/c/go/+/205899
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
If one of the helper goroutine panics, the main goroutine call to Wait
may hang forever waiting for something to call Done. Put that call in
a goroutine like the others.
Fixes #35774
Change-Id: I8d2b58d8f473644a49a95338f70111d4e6ed4e12
Reviewed-on: https://go-review.googlesource.com/c/go/+/210218
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contended, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes #33747
The original work was done in CL 200577 by Carlo Alberto Ferraris. The
change was reverted in CL 205817 because it broke the linux-arm64-packet
and solaris-amd64-oraclerel builders.
Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/206180
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This reverts CL 200577.
Reason for revert: broke linux-arm64-packet and solaris-amd64-oraclerel builders
Fixes #35424
Updates #33747
Change-Id: I2575fd84d37995d458183caae54704f15d8b8426
Reviewed-on: https://go-review.googlesource.com/c/go/+/205817
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
When we have already assigned the semaphore ticket to a specific
waiter, we want to get the waiter running as fast as possible since
no other G waiting on the semaphore can acquire it optimistically.
The net effect is that, when a sync.Mutex is contented, the code in
the critical section guarded by the Mutex gets a priority boost.
Fixes #33747
Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45
Reviewed-on: https://go-review.googlesource.com/c/go/+/200577
Reviewed-by: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This test could be updated to use unsafe.Pointer arithmetic properly
(e.g., see discussion at #34972), but it doesn't seem worthwhile. The
test is just checking that LoadPointer and StorePointer are atomic.
Updates #34972.
Change-Id: I85a8d610c1766cd63136cae686aa8a240a362a18
Reviewed-on: https://go-review.googlesource.com/c/go/+/202597
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
one of my grep patterns when splitting the original CL up into
two parts.
This one might also have interesting stuff to resurrect for any future
x32 ABI support.
Updates #30439
Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
It's not correct to use atomic.CompareAndSwap to implement Once.Do,
and we don't, but why we don't is a question that has come up
twice on golang-dev in the past few months.
Add a comment to help others with the same question.
Change-Id: Ia89ec9715cc5442c6e7f13e57a49c6cfe664d32c
Reviewed-on: https://go-review.googlesource.com/c/go/+/184261
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Ingo Oeser <nightlyone@googlemail.com>
|
|
In TestPoolDequeue it's surprisingly common for the queue to stay
nearly empty the whole time and for a racing PopTail to happen in the
window between the producer doing a PushHead and doing a PopHead. In
short mode, there are only 100 PopTail attempts. On linux/amd64, it's
not uncommon for this to fail 50% of the time. On linux/arm64, it's
not uncommon for this to fail 100% of the time, causing the test to
fail.
This CL fixes this by only checking for a successful PopTail in long
mode. Long mode makes 200,000 PopTail attempts, and has never been
observed to fail.
Fixes #31422.
Change-Id: If464d55eb94fcb0b8d78fbc441d35be9f056a290
Reviewed-on: https://go-review.googlesource.com/c/go/+/183981
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
TestPoolDequeue creates P-1 consumer goroutines and 1 producer
goroutine. Currently, if a consumer goroutine pops the last value from
the dequeue, it sets a flag that stops all consumers, but the producer
also periodically pops from the dequeue and doesn't set this flag.
Hence, if the producer were to pop the last element, the consumers
will continue to run and the test won't terminate. This CL fixes this
by also setting the termination flag in the producer.
I believe it's impossible for this to happen right now because the
producer only pops after pushing an element for which j%10==0 and the
last element is either 999 or 1999999, which means it should never try
to pop after pushing the last element. However, we shouldn't depend on
this reasoning.
Change-Id: Icd2bc8d7cb9a79ebfcec99e367c8a2ba76e027d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/183980
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
TestPoolDequeue in long mode does a little more than 1<<21 pushes.
This was originally because the head and tail indexes were 21 bits and
the intent was to test wrap-around behavior. However, in the final
version they were both 32 bits, so the test no longer tested
wrap-around.
It would take too long to reach 32-bit wrap around in a test, so
instead we initialize the poolDequeue with indexes that are already
nearly at their limit. This keeps the knowledge of the maximum index
in one place, and lets us test wrap-around even in short mode.
Change-Id: Ib9b8d85b6d9b5be235849c2b32e81c809e806579
Reviewed-on: https://go-review.googlesource.com/c/go/+/183979
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
|
|
Shorten some of the longest tests that run during all.bash.
Removes 7r 50u 21s from all.bash.
After this change, all.bash is under 5 minutes again on my laptop.
For #26473.
Change-Id: Ie0460aa935808d65460408feaed210fbaa1d5d79
Reviewed-on: https://go-review.googlesource.com/c/go/+/177559
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Comment update.
Change-Id: If0d054216f9953f42df04647b85c38008b85b026
GitHub-Last-Rev: 133b4670be6dd1c94d16361c3a7a4bbdf8a355ab
GitHub-Pull-Request: golang/go#31539
Reviewed-on: https://go-review.googlesource.com/c/go/+/172700
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently, every Pool is cleared completely at the start of each GC.
This is a problem for heavy users of Pool because it causes an
allocation spike immediately after Pools are clear, which impacts both
throughput and latency.
This CL fixes this by introducing a victim cache mechanism. Instead of
clearing Pools, the victim cache is dropped and the primary cache is
moved to the victim cache. As a result, in steady-state, there are
(roughly) no new allocations, but if Pool usage drops, objects will
still be collected within two GCs (as opposed to one).
This victim cache approach also improves Pool's impact on GC dynamics.
The current approach causes all objects in Pools to be short lived.
However, if an application is in steady state and is just going to
repopulate its Pools, then these objects impact the live heap size *as
if* they were long lived. Since Pooled objects count as short lived
when computing the GC trigger and goal, but act as long lived objects
in the live heap, this causes GC to trigger too frequently. If Pooled
objects are a non-trivial portion of an application's heap, this
increases the CPU overhead of GC. The victim cache lets Pooled objects
affect the GC trigger and goal as long-lived objects.
This has no impact on Get/Put performance, but substantially reduces
the impact to the Pool user when a GC happens. PoolExpensiveNew
demonstrates this in the substantially reduction in the rate at which
the "New" function is called.
name old time/op new time/op delta
Pool-12 2.21ns ±36% 2.00ns ± 0% ~ (p=0.070 n=19+16)
PoolOverflow-12 587ns ± 1% 583ns ± 1% -0.77% (p=0.000 n=18+18)
PoolSTW-12 5.57µs ± 3% 4.52µs ± 4% -18.82% (p=0.000 n=20+19)
PoolExpensiveNew-12 3.69ms ± 7% 1.25ms ± 5% -66.25% (p=0.000 n=20+19)
name old p50-ns/STW new p50-ns/STW delta
PoolSTW-12 5.48k ± 2% 4.53k ± 2% -17.32% (p=0.000 n=20+20)
name old p95-ns/STW new p95-ns/STW delta
PoolSTW-12 6.69k ± 4% 5.13k ± 3% -23.31% (p=0.000 n=19+18)
name old GCs/op new GCs/op delta
PoolExpensiveNew-12 0.39 ± 1% 0.32 ± 2% -17.95% (p=0.000 n=18+20)
name old New/op new New/op delta
PoolExpensiveNew-12 40.0 ± 6% 12.4 ± 6% -68.91% (p=0.000 n=20+19)
(https://perf.golang.org/search?q=upload:20190311.2)
Fixes #22950.
Change-Id: If2e183d948c650417283076aacc20739682cdd70
Reviewed-on: https://go-review.googlesource.com/c/go/+/166961
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Currently, Pool stores each per-P shard's overflow in a slice
protected by a Mutex. In order to store to the overflow or steal from
another shard, a P must lock that shard's Mutex. This allows for
simple synchronization between Put and Get, but has unfortunate
consequences for clearing pools.
Pools are cleared during STW sweep termination, and hence rely on
pinning a goroutine to its P to synchronize between Get/Put and
clearing. This makes the Get/Put fast path extremely fast because it
can rely on quiescence-style coordination, which doesn't even require
atomic writes, much less locking.
The catch is that a goroutine cannot acquire a Mutex while pinned to
its P (as this could deadlock). Hence, it must drop the pin on the
slow path. But this means the slow path is not synchronized with
clearing. As a result,
1) It's difficult to reason about races between clearing and the slow
path. Furthermore, this reasoning often depends on unspecified nuances
of where preemption points can occur.
2) Clearing must zero out the pointer to every object in every Pool to
prevent a concurrent slow path from causing all objects to be
retained. Since this happens during STW, this has an O(# objects in
Pools) effect on STW time.
3) We can't implement a victim cache without making clearing even
slower.
This CL solves these problems by replacing the locked overflow slice
with a lock-free structure. This allows Gets and Puts to be pinned the
whole time they're manipulating the shards slice (Pool.local), which
eliminates the races between Get/Put and clearing. This, in turn,
eliminates the need to zero all object pointers, reducing clearing to
O(# of Pools) during STW.
In addition to significantly reducing STW impact, this also happens to
speed up the Get/Put fast-path and the slow path. It somewhat
increases the cost of PoolExpensiveNew, but we'll fix that in the next
CL.
name old time/op new time/op delta
Pool-12 3.00ns ± 0% 2.21ns ±36% -26.32% (p=0.000 n=18+19)
PoolOverflow-12 600ns ± 1% 587ns ± 1% -2.21% (p=0.000 n=16+18)
PoolSTW-12 71.0µs ± 2% 5.6µs ± 3% -92.15% (p=0.000 n=20+20)
PoolExpensiveNew-12 3.14ms ± 5% 3.69ms ± 7% +17.67% (p=0.000 n=19+20)
name old p50-ns/STW new p50-ns/STW delta
PoolSTW-12 70.7k ± 1% 5.5k ± 2% -92.25% (p=0.000 n=20+20)
name old p95-ns/STW new p95-ns/STW delta
PoolSTW-12 73.1k ± 2% 6.7k ± 4% -90.86% (p=0.000 n=18+19)
name old GCs/op new GCs/op delta
PoolExpensiveNew-12 0.38 ± 1% 0.39 ± 1% +2.07% (p=0.000 n=20+18)
name old New/op new New/op delta
PoolExpensiveNew-12 33.9 ± 6% 40.0 ± 6% +17.97% (p=0.000 n=19+20)
(https://perf.golang.org/search?q=upload:20190311.1)
Fixes #22331.
For #22950.
Change-Id: Ic5cd826e25e218f3f8256dbc4d22835c1fecb391
Reviewed-on: https://go-review.googlesource.com/c/go/+/166960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds two benchmarks that will highlight two problems in Pool that
we're about to address.
The first benchmark measures the impact of large Pools on GC STW time.
Currently, STW time is O(# of items in Pools), and this benchmark
demonstrates 70µs STW times.
The second benchmark measures the impact of fully clearing all Pools
on each GC. Typically this is a problem in heavily-loaded systems
because it causes a spike in allocation. This benchmark stresses this
by simulating an expensive "New" function, so the cost of creating new
objects is reflected in the ns/op of the benchmark.
For #22950, #22331.
Change-Id: I0c8853190d23144026fa11837b6bf42adc461722
Reviewed-on: https://go-review.googlesource.com/c/go/+/166959
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds a dynamically sized, lock-free, single-producer,
multi-consumer queue that will be used in the new Pool stealing
implementation. It's built on top of the fixed-size queue added in the
previous CL.
For #22950, #22331.
Change-Id: Ifc0ca3895bec7e7f9289ba9fb7dd0332bf96ba5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/166958
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This is the first step toward fixing multiple issues with sync.Pool.
This adds a fixed size, lock-free, single-producer, multi-consumer
queue that will be used in the new Pool stealing implementation.
For #22950, #22331.
Change-Id: I50e85e3cb83a2ee71f611ada88e7f55996504bb5
Reviewed-on: https://go-review.googlesource.com/c/go/+/166957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
RWMutex.RLock is already inlineable, so add a test for it as well.
name old time/op new time/op delta
RWMutexUncontended 66.5ns ± 0% 60.3ns ± 1% -9.38% (p=0.000 n=12+20)
RWMutexUncontended-4 16.7ns ± 0% 15.3ns ± 1% -8.49% (p=0.000 n=17+20)
RWMutexUncontended-16 7.86ns ± 0% 7.69ns ± 0% -2.08% (p=0.000 n=18+15)
RWMutexWrite100 25.1ns ± 0% 24.0ns ± 1% -4.28% (p=0.000 n=20+18)
RWMutexWrite100-4 46.7ns ± 5% 44.1ns ± 4% -5.53% (p=0.000 n=20+20)
RWMutexWrite100-16 68.3ns ±11% 65.7ns ± 8% -3.81% (p=0.003 n=20+20)
RWMutexWrite10 26.7ns ± 1% 25.7ns ± 0% -3.75% (p=0.000 n=17+14)
RWMutexWrite10-4 34.9ns ± 2% 33.8ns ± 2% -3.15% (p=0.000 n=20+20)
RWMutexWrite10-16 37.4ns ± 2% 36.1ns ± 2% -3.51% (p=0.000 n=18+20)
RWMutexWorkWrite100 163ns ± 0% 162ns ± 0% -0.89% (p=0.000 n=18+20)
RWMutexWorkWrite100-4 189ns ± 4% 184ns ± 4% -2.89% (p=0.000 n=19+20)
RWMutexWorkWrite100-16 207ns ± 4% 200ns ± 2% -3.07% (p=0.000 n=19+20)
RWMutexWorkWrite10 153ns ± 0% 151ns ± 1% -0.75% (p=0.000 n=20+20)
RWMutexWorkWrite10-4 177ns ± 1% 176ns ± 2% -0.63% (p=0.004 n=17+20)
RWMutexWorkWrite10-16 191ns ± 2% 189ns ± 1% -0.83% (p=0.015 n=20+17)
linux/amd64 bin/go 14688201 (previous commit 14675861, +12340/+0.08%)
The cumulative effect of this and the previous 3 commits is:
name old time/op new time/op delta
MutexUncontended 19.3ns ± 1% 16.4ns ± 1% -15.13% (p=0.000 n=20+20)
MutexUncontended-4 5.24ns ± 0% 4.09ns ± 0% -21.95% (p=0.000 n=20+18)
MutexUncontended-16 2.10ns ± 0% 2.12ns ± 0% +0.95% (p=0.000 n=15+17)
Mutex 19.6ns ± 0% 16.3ns ± 1% -17.12% (p=0.000 n=20+20)
Mutex-4 54.6ns ± 5% 45.6ns ±10% -16.51% (p=0.000 n=20+19)
Mutex-16 133ns ± 5% 130ns ± 3% -1.99% (p=0.002 n=20+20)
MutexSlack 33.4ns ± 2% 16.2ns ± 0% -51.44% (p=0.000 n=19+20)
MutexSlack-4 206ns ± 5% 209ns ± 9% ~ (p=0.154 n=20+20)
MutexSlack-16 89.4ns ± 1% 90.9ns ± 2% +1.70% (p=0.000 n=18+17)
MutexWork 60.5ns ± 0% 55.3ns ± 1% -8.59% (p=0.000 n=12+20)
MutexWork-4 105ns ± 5% 97ns ±11% -7.95% (p=0.000 n=20+20)
MutexWork-16 157ns ± 1% 158ns ± 1% +0.66% (p=0.001 n=18+17)
MutexWorkSlack 70.2ns ± 5% 55.3ns ± 0% -21.30% (p=0.000 n=19+18)
MutexWorkSlack-4 277ns ±13% 260ns ±15% -6.35% (p=0.002 n=20+18)
MutexWorkSlack-16 156ns ± 0% 146ns ± 1% -6.40% (p=0.000 n=16+19)
MutexNoSpin 966ns ± 0% 976ns ± 1% +0.97% (p=0.000 n=15+17)
MutexNoSpin-4 269ns ± 4% 272ns ± 4% +1.15% (p=0.048 n=20+18)
MutexNoSpin-16 122ns ± 0% 119ns ± 1% -2.63% (p=0.000 n=19+15)
MutexSpin 3.13µs ± 0% 3.12µs ± 0% -0.17% (p=0.000 n=18+18)
MutexSpin-4 826ns ± 1% 833ns ± 1% +0.84% (p=0.000 n=19+17)
MutexSpin-16 397ns ± 1% 394ns ± 1% -0.78% (p=0.000 n=19+19)
Once 5.67ns ± 0% 2.07ns ± 2% -63.43% (p=0.000 n=20+20)
Once-4 1.47ns ± 2% 0.54ns ± 3% -63.49% (p=0.000 n=19+20)
Once-16 0.58ns ± 0% 0.17ns ± 5% -70.49% (p=0.000 n=17+17)
RWMutexUncontended 71.4ns ± 0% 60.3ns ± 1% -15.60% (p=0.000 n=16+20)
RWMutexUncontended-4 18.4ns ± 4% 15.3ns ± 1% -17.14% (p=0.000 n=20+20)
RWMutexUncontended-16 8.01ns ± 0% 7.69ns ± 0% -3.91% (p=0.000 n=18+15)
RWMutexWrite100 24.9ns ± 0% 24.0ns ± 1% -3.57% (p=0.000 n=19+18)
RWMutexWrite100-4 46.5ns ± 3% 44.1ns ± 4% -5.09% (p=0.000 n=17+20)
RWMutexWrite100-16 68.9ns ± 3% 65.7ns ± 8% -4.65% (p=0.000 n=18+20)
RWMutexWrite10 27.1ns ± 0% 25.7ns ± 0% -5.25% (p=0.000 n=17+14)
RWMutexWrite10-4 34.8ns ± 1% 33.8ns ± 2% -2.96% (p=0.000 n=20+20)
RWMutexWrite10-16 37.5ns ± 2% 36.1ns ± 2% -3.72% (p=0.000 n=20+20)
RWMutexWorkWrite100 164ns ± 0% 162ns ± 0% -1.49% (p=0.000 n=12+20)
RWMutexWorkWrite100-4 186ns ± 3% 184ns ± 4% ~ (p=0.097 n=20+20)
RWMutexWorkWrite100-16 204ns ± 2% 200ns ± 2% -1.58% (p=0.000 n=18+20)
RWMutexWorkWrite10 153ns ± 0% 151ns ± 1% -1.21% (p=0.000 n=20+20)
RWMutexWorkWrite10-4 179ns ± 1% 176ns ± 2% -1.25% (p=0.000 n=19+20)
RWMutexWorkWrite10-16 191ns ± 1% 189ns ± 1% -0.94% (p=0.000 n=15+17)
Change-Id: I9269bf2ac42a04c610624f707d3268dcb17390f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/152698
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Using Once.Do is now extremely cheap because the fast path is just an inlined
atomic load of a variable that is written only once and a conditional jump.
This is very beneficial for Once.Do because, due to its nature, the fast path
will be used for every call after the first one.
In a attempt to mimize code size increase, reorder the fields so that the
pointer to Once is also the pointer to Once.done, that is the only field used
in the hot path. This allows to use more compact instruction encodings or less
instructions in the hot path (that is inlined at every callsite).
name old time/op new time/op delta
Once 4.54ns ± 0% 2.06ns ± 0% -54.59% (p=0.000 n=19+16)
Once-4 1.18ns ± 0% 0.55ns ± 0% -53.39% (p=0.000 n=15+16)
Once-16 0.53ns ± 0% 0.17ns ± 0% -67.92% (p=0.000 n=18+17)
linux/amd64 bin/go 14675861 (previous commit 14663387, +12474/+0.09%)
Change-Id: Ie2708103ab473787875d66746d2f20f1d90a6916
Reviewed-on: https://go-review.googlesource.com/c/go/+/152697
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
name old time/op new time/op delta
MutexUncontended 18.9ns ± 0% 16.2ns ± 0% -14.29% (p=0.000 n=19+19)
MutexUncontended-4 4.75ns ± 1% 4.08ns ± 0% -14.20% (p=0.000 n=20+19)
MutexUncontended-16 2.05ns ± 0% 2.11ns ± 0% +2.93% (p=0.000 n=19+16)
Mutex 19.3ns ± 1% 16.2ns ± 0% -15.86% (p=0.000 n=17+19)
Mutex-4 52.4ns ± 4% 48.6ns ± 9% -7.22% (p=0.000 n=20+20)
Mutex-16 139ns ± 2% 140ns ± 3% +1.03% (p=0.011 n=16+20)
MutexSlack 18.9ns ± 1% 16.2ns ± 1% -13.96% (p=0.000 n=20+20)
MutexSlack-4 225ns ± 8% 211ns ±10% -5.94% (p=0.000 n=18+19)
MutexSlack-16 98.4ns ± 1% 90.9ns ± 1% -7.60% (p=0.000 n=17+18)
MutexWork 58.2ns ± 3% 55.4ns ± 0% -4.82% (p=0.000 n=20+17)
MutexWork-4 103ns ± 7% 95ns ±18% -8.03% (p=0.000 n=20+20)
MutexWork-16 163ns ± 2% 155ns ± 2% -4.47% (p=0.000 n=18+18)
MutexWorkSlack 57.7ns ± 1% 55.4ns ± 0% -3.99% (p=0.000 n=20+13)
MutexWorkSlack-4 276ns ±13% 260ns ±10% -5.64% (p=0.001 n=19+19)
MutexWorkSlack-16 147ns ± 0% 156ns ± 1% +5.87% (p=0.000 n=14+19)
MutexNoSpin 968ns ± 0% 900ns ± 1% -6.98% (p=0.000 n=20+18)
MutexNoSpin-4 270ns ± 2% 255ns ± 2% -5.74% (p=0.000 n=19+20)
MutexNoSpin-16 120ns ± 4% 112ns ± 0% -6.99% (p=0.000 n=19+14)
MutexSpin 3.13µs ± 1% 3.19µs ± 6% ~ (p=0.401 n=20+20)
MutexSpin-4 832ns ± 2% 831ns ± 1% -0.17% (p=0.023 n=16+18)
MutexSpin-16 395ns ± 0% 399ns ± 0% +0.94% (p=0.000 n=17+19)
RWMutexUncontended 69.5ns ± 0% 68.4ns ± 0% -1.59% (p=0.000 n=20+20)
RWMutexUncontended-4 17.5ns ± 0% 16.7ns ± 0% -4.30% (p=0.000 n=18+17)
RWMutexUncontended-16 7.92ns ± 0% 7.87ns ± 0% -0.61% (p=0.000 n=18+17)
RWMutexWrite100 24.9ns ± 1% 25.0ns ± 1% +0.32% (p=0.000 n=20+20)
RWMutexWrite100-4 46.2ns ± 4% 46.2ns ± 5% ~ (p=0.840 n=19+20)
RWMutexWrite100-16 69.9ns ± 5% 69.9ns ± 3% ~ (p=0.545 n=20+19)
RWMutexWrite10 27.0ns ± 2% 26.8ns ± 2% -0.98% (p=0.001 n=20+20)
RWMutexWrite10-4 34.7ns ± 2% 35.0ns ± 4% ~ (p=0.191 n=18+20)
RWMutexWrite10-16 37.2ns ± 4% 37.3ns ± 2% ~ (p=0.438 n=20+19)
RWMutexWorkWrite100 164ns ± 0% 163ns ± 0% -0.24% (p=0.025 n=20+20)
RWMutexWorkWrite100-4 193ns ± 3% 191ns ± 2% -1.06% (p=0.027 n=20+20)
RWMutexWorkWrite100-16 210ns ± 3% 207ns ± 3% -1.22% (p=0.038 n=20+20)
RWMutexWorkWrite10 153ns ± 0% 153ns ± 0% ~ (all equal)
RWMutexWorkWrite10-4 178ns ± 2% 179ns ± 2% ~ (p=0.186 n=20+20)
RWMutexWorkWrite10-16 192ns ± 2% 192ns ± 2% ~ (p=0.731 n=19+20)
linux/amd64 bin/go 14663387 (previous commit 14630572, +32815/+0.22%)
Change-Id: I98171006dce14069b1a62da07c3d165455a7906b
Reviewed-on: https://go-review.googlesource.com/c/go/+/148959
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Make use of the newly-enabled limited midstack inlining.
Similar changes will be done in followup CLs.
name old time/op new time/op delta
MutexUncontended 19.3ns ± 1% 18.9ns ± 0% -1.92% (p=0.000 n=20+19)
MutexUncontended-4 5.24ns ± 0% 4.75ns ± 1% -9.25% (p=0.000 n=20+20)
MutexUncontended-16 2.10ns ± 0% 2.05ns ± 0% -2.38% (p=0.000 n=15+19)
Mutex 19.6ns ± 0% 19.3ns ± 1% -1.92% (p=0.000 n=20+17)
Mutex-4 54.6ns ± 5% 52.4ns ± 4% -4.09% (p=0.000 n=20+20)
Mutex-16 133ns ± 5% 139ns ± 2% +4.23% (p=0.000 n=20+16)
MutexSlack 33.4ns ± 2% 18.9ns ± 1% -43.56% (p=0.000 n=19+20)
MutexSlack-4 206ns ± 5% 225ns ± 8% +9.12% (p=0.000 n=20+18)
MutexSlack-16 89.4ns ± 1% 98.4ns ± 1% +10.10% (p=0.000 n=18+17)
MutexWork 60.5ns ± 0% 58.2ns ± 3% -3.75% (p=0.000 n=12+20)
MutexWork-4 105ns ± 5% 103ns ± 7% -1.68% (p=0.007 n=20+20)
MutexWork-16 157ns ± 1% 163ns ± 2% +3.90% (p=0.000 n=18+18)
MutexWorkSlack 70.2ns ± 5% 57.7ns ± 1% -17.81% (p=0.000 n=19+20)
MutexWorkSlack-4 277ns ±13% 276ns ±13% ~ (p=0.682 n=20+19)
MutexWorkSlack-16 156ns ± 0% 147ns ± 0% -5.62% (p=0.000 n=16+14)
MutexNoSpin 966ns ± 0% 968ns ± 0% +0.11% (p=0.029 n=15+20)
MutexNoSpin-4 269ns ± 4% 270ns ± 2% ~ (p=0.807 n=20+19)
MutexNoSpin-16 122ns ± 0% 120ns ± 4% -1.63% (p=0.000 n=19+19)
MutexSpin 3.13µs ± 0% 3.13µs ± 1% +0.16% (p=0.004 n=18+20)
MutexSpin-4 826ns ± 1% 832ns ± 2% +0.74% (p=0.000 n=19+16)
MutexSpin-16 397ns ± 1% 395ns ± 0% -0.50% (p=0.000 n=19+17)
RWMutexUncontended 71.4ns ± 0% 69.5ns ± 0% -2.72% (p=0.000 n=16+20)
RWMutexUncontended-4 18.4ns ± 4% 17.5ns ± 0% -4.92% (p=0.000 n=20+18)
RWMutexUncontended-16 8.01ns ± 0% 7.92ns ± 0% -1.15% (p=0.000 n=18+18)
RWMutexWrite100 24.9ns ± 0% 24.9ns ± 1% ~ (p=0.099 n=19+20)
RWMutexWrite100-4 46.5ns ± 3% 46.2ns ± 4% ~ (p=0.253 n=17+19)
RWMutexWrite100-16 68.9ns ± 3% 69.9ns ± 5% +1.46% (p=0.012 n=18+20)
RWMutexWrite10 27.1ns ± 0% 27.0ns ± 2% ~ (p=0.128 n=17+20)
RWMutexWrite10-4 34.8ns ± 1% 34.7ns ± 2% ~ (p=0.180 n=20+18)
RWMutexWrite10-16 37.5ns ± 2% 37.2ns ± 4% -0.89% (p=0.023 n=20+20)
RWMutexWorkWrite100 164ns ± 0% 164ns ± 0% ~ (p=0.106 n=12+20)
RWMutexWorkWrite100-4 186ns ± 3% 193ns ± 3% +3.46% (p=0.000 n=20+20)
RWMutexWorkWrite100-16 204ns ± 2% 210ns ± 3% +2.96% (p=0.000 n=18+20)
RWMutexWorkWrite10 153ns ± 0% 153ns ± 0% -0.20% (p=0.017 n=20+19)
RWMutexWorkWrite10-4 179ns ± 1% 178ns ± 2% ~ (p=0.215 n=19+20)
RWMutexWorkWrite10-16 191ns ± 1% 192ns ± 2% ~ (p=0.166 n=15+19)
linux/amd64 bin/go 14630572 (previous commit 14605947, +24625/+0.17%)
Change-Id: I3f9d1765801fe0b8deb1bc2728b8bba8a7508e23
Reviewed-on: https://go-review.googlesource.com/c/go/+/148958
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
runtime/internal/atomic/atomic_mipsx.go enforces 64-bit alignment.
Change-Id: Ifdc36e1c0322827711425054d10f1c52425a13fa
Reviewed-on: https://go-review.googlesource.com/c/161697
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Go documentation style for boolean funcs is to say:
// Foo reports whether ...
func Foo() bool
(rather than "returns true if")
This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.
(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)
Created with:
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)
Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: Ie1f35c7598bd2549a048d64e1b1279bf4acaa103
GitHub-Last-Rev: c8cc7dfef987cbd04f48daabf23efa64c0c67322
GitHub-Pull-Request: golang/go#28051
Reviewed-on: https://go-review.googlesource.com/c/140302
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
|
|
The only change to the go build -gcflags=-m=2 output was to remove
these two lines:
sync/map.go:178:26: &e.p escapes to heap
sync/map.go:178:26: from &e.p (passed to call[argument escapes]) at sync/map.go:178:25
Benchstat report for sync.Map benchmarks:
name old time/op new time/op delta
LoadMostlyHits/*sync_test.DeepCopyMap-12 10.6ns ±11% 10.2ns ± 3% ~ (p=0.299 n=10+8)
LoadMostlyHits/*sync_test.RWMutexMap-12 54.6ns ± 3% 54.6ns ± 2% ~ (p=0.782 n=10+10)
LoadMostlyHits/*sync.Map-12 10.1ns ± 1% 10.1ns ± 1% ~ (p=1.127 n=10+8)
LoadMostlyMisses/*sync_test.DeepCopyMap-12 8.65ns ± 1% 8.77ns ± 5% +1.39% (p=0.017 n=9+10)
LoadMostlyMisses/*sync_test.RWMutexMap-12 53.6ns ± 2% 53.8ns ± 2% ~ (p=0.408 n=10+9)
LoadMostlyMisses/*sync.Map-12 7.37ns ± 1% 7.46ns ± 1% +1.19% (p=0.000 n=9+10)
LoadOrStoreBalanced/*sync_test.RWMutexMap-12 895ns ± 4% 906ns ± 3% ~ (p=0.203 n=9+10)
LoadOrStoreBalanced/*sync.Map-12 872ns ±10% 804ns ±12% -7.75% (p=0.014 n=10+10)
LoadOrStoreUnique/*sync_test.RWMutexMap-12 1.29µs ± 2% 1.28µs ± 1% ~ (p=0.586 n=10+9)
LoadOrStoreUnique/*sync.Map-12 1.30µs ± 7% 1.40µs ± 2% +6.95% (p=0.000 n=9+10)
LoadOrStoreCollision/*sync_test.DeepCopyMap-12 6.98ns ± 1% 6.91ns ± 1% -1.10% (p=0.000 n=10+10)
LoadOrStoreCollision/*sync_test.RWMutexMap-12 371ns ± 1% 372ns ± 2% ~ (p=0.679 n=9+9)
LoadOrStoreCollision/*sync.Map-12 5.49ns ± 1% 5.49ns ± 1% ~ (p=0.732 n=9+10)
Range/*sync_test.DeepCopyMap-12 2.49µs ± 1% 2.50µs ± 0% ~ (p=0.148 n=10+10)
Range/*sync_test.RWMutexMap-12 54.7µs ± 1% 54.6µs ± 3% ~ (p=0.549 n=9+10)
Range/*sync.Map-12 2.74µs ± 1% 2.76µs ± 1% +0.68% (p=0.011 n=10+8)
AdversarialAlloc/*sync_test.DeepCopyMap-12 2.52µs ± 5% 2.54µs ± 7% ~ (p=0.225 n=10+10)
AdversarialAlloc/*sync_test.RWMutexMap-12 108ns ± 1% 107ns ± 1% ~ (p=0.101 n=10+9)
AdversarialAlloc/*sync.Map-12 712ns ± 2% 714ns ± 3% ~ (p=0.984 n=8+10)
AdversarialDelete/*sync_test.DeepCopyMap-12 581ns ± 3% 578ns ± 3% ~ (p=0.781 n=9+9)
AdversarialDelete/*sync_test.RWMutexMap-12 126ns ± 2% 126ns ± 1% ~ (p=0.883 n=10+10)
AdversarialDelete/*sync.Map-12 155ns ± 8% 158ns ± 2% ~ (p=0.158 n=10+9)
Change-Id: I1ed8e3109baca03087d0fad3df769fc7e38f6dbb
Reviewed-on: https://go-review.googlesource.com/137441
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fixes #26165
Change-Id: I1f3bd193af9b6f8461c736330952b6e50d3e00d9
Reviewed-on: https://go-review.googlesource.com/121876
Reviewed-by: Alan Donovan <adonovan@google.com>
Run-TryBot: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Remove unnecessary race.Release annotation from Unlock.
For RWMutex we want to establish the following happens-before (HB) edges:
1. Between Unlock and the subsequent Lock.
2. Between Unlock and the subsequent RLock.
3. Between batch of RUnlock's and the subsequent Lock.
1 is provided by Release(&rw.readerSem) in Unlock and Acquire(&rw.readerSem) in Lock.
2 is provided by Release(&rw.readerSem) in Unlock and Acquire(&rw.readerSem) in RLock.
3 is provided by ReleaseMerge(&rw.writerSem) in RUnlock in Acquire(&rw.writerSem) in Lock,
since we want to establish HB between a batch of RUnlock's this uses ReleaseMerge instead of Release.
Release(&rw.writerSem) in Unlock is simply not needed.
FWIW this is also how C++ tsan handles mutexes, not a proof but at least something.
Making 2 implementations consistent also simplifies any kind of reasoning against both of them.
Since this only affects performance, a reasonable test is not possible.
Everything should just continue to work but slightly faster.
Credit for discovering this goes to Jamie Liu.
Change-Id: Ice37d29ecb7a5faed3f7781c38dd32c7469b2735
Reviewed-on: https://go-review.googlesource.com/120495
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|