Age | Commit message (Collapse) | Author |
|
This change moves certain important but internal-only GC statistics from
memstats into gcController. These statistics are mainly used in pacing
the GC, so it makes sense to keep them in the pacer's state.
This CL was mostly generated via
rf '
ex . {
memstats.gc_trigger -> gcController.trigger
memstats.triggerRatio -> gcController.triggerRatio
memstats.heap_marked -> gcController.heapMarked
memstats.heap_live -> gcController.heapLive
memstats.heap_scan -> gcController.heapScan
}
'
except for a few special cases, like updating names in comments and when
these fields are used within gcControllerState methods (at which point
they're accessed through the reciever).
For #44167.
Change-Id: I6bd1602585aeeb80818ded24c07d8e6fec992b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/306598
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
The runtime currently has two different notions of sweep completion:
1. All spans are either swept or have begun sweeping.
2. The sweeper has *finished* sweeping all spans.
Having both is confusing (it doesn't help that the documentation is
often unclear or wrong). Condition 2 is stronger and the theoretical
slight optimization that condition 1 could impact is never actually
useful. Hence, this CL consolidates both conditions down to condition 2.
Updates #45315.
Change-Id: I55c84d767d74eb31a004a5619eaba2e351162332
Reviewed-on: https://go-review.googlesource.com/c/go/+/307916
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
The runtime currently has two different notions of sweep completion:
1. All spans are either swept or have begun sweeping.
2. The sweeper has *finished* sweeping all spans.
Most things depend on condition 1. Notably, GC correctness depends on
condition 1, but since all sweep operations a non-preemptible, the STW
at the beginning of GC forces condition 1 to become condition 2.
runtime.GC(), however, depends on condition 2, since the intent is to
complete a complete GC cycle, and also update the heap profile (which
can only be done after sweeping is complete).
However, the way we compute condition 2 is racy right now and may in
fact only indicate condition 1. Specifically, sweepone blocks
condition 2 until all sweepone calls are done, but there are many
other ways to enter the sweeper that don't block this. Hence, sweepone
may see that there are no more spans in the sweep list and see that
it's the last sweepone and declare sweeping done, while there's some
other sweeper still working on a span.
Fix this by making sure every entry to the sweeper participates in the
protocol that blocks condition 2. To make sure we get this right, this
CL introduces a type to track sweep blocking and (lightly) enforces
span sweep ownership via the type system. This has the nice
side-effect of abstracting the pattern of acquiring sweep ownership
that's currently repeated in many different places.
Fixes #45315.
Change-Id: I7fab30170c5ae14c8b2f10998628735b8be6d901
Reviewed-on: https://go-review.googlesource.com/c/go/+/307915
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
This CL adds a set of helper functions for testing GC interactions.
These are intended for use in the regabi signature fuzzer, but are
generally useful for GC tests, so we make them generally available to
runtime tests.
These provide:
1. An easy way to force stack movement, for testing stack copying.
2. A simple and robust way to check the reachability of a set of
pointers.
3. A way to check what general category of memory a pointer points to,
mostly so tests can make sure they're testing what they mean to.
For #40724, but generally useful.
Change-Id: I15d33ccb3f5a792c0472a19c2cc9a8b4a9356a66
Reviewed-on: https://go-review.googlesource.com/c/go/+/305330
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
The specials processing loop in mspan.sweep is about to get more
complicated and I'm too allergic to list manipulation to open code
more of it there.
Change-Id: I767a0889739da85fb2878fc06a5c55b73bf2ba7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/305551
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Adds code to the compiler's "order" phase to rewrite go and defer
statements to always be argument-less. E.g.
defer f(x,y) => x1, y1 := x, y
defer func() { f(x1, y1) }
This transformation is not beneficial on its own, but it helps
simplify runtime defer handling for the new register ABI (when
invoking deferred functions on the panic path, the runtime doesn't
need to manage the complexity of determining which args to pass in
register vs memory).
This feature is currently enabled by default if GOEXPERIMENT=regabi or
GOEXPERIMENT=regabidefer is in effect.
Included in this CL are some workarounds in the runtime to insure that
"go" statement targets in the runtime are argument-less already (since
wrapping them can potentially introduce heap-allocated closures, which
are currently not allowed). The expectation is that these workarounds
will be temporary, and can go away once we either A) change the rules
about heap-allocated closures, or B) implement some other scheme for
handling go statements.
Change-Id: I01060d79a6b140c6f0838d6e6813f807ccdca319
Reviewed-on: https://go-review.googlesource.com/c/go/+/298669
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
This change modifies the consistent stats implementation to keep the
per-P sequence counter on each P instead of each mcache. A valid mcache
is not available everywhere that we want to call e.g. allocSpan, as per
issue #42339. By decoupling these two, we can add a mechanism to allow
contexts without a P to update stats consistently.
In this CL, we achieve that with a mutex. In practice, it will be very
rare for an M to update these stats without a P. Furthermore, the stats
reader also only needs to hold the mutex across the update to "gen"
since once that changes, writers are free to continue updating the new
stats generation. Contention could thus only arise between writers
without a P, and as mentioned earlier, those should be rare.
A nice side-effect of this change is that the consistent stats acquire
and release API becomes simpler.
Fixes #42339.
Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34
Reviewed-on: https://go-review.googlesource.com/c/go/+/267158
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Stopping the world is an implicit lock for many operations, so we should
assert the world is stopped in functions that require it.
This is enabled along with the rest of lock ranking, though it is a bit
orthogonal and likely cheap enough to enable all the time should we
choose.
Requiring a lock _or_ world stop is common, so that can be expressed as
well.
Updates #40677
Change-Id: If0a58544f4251d367f73c4120c9d39974c6cd091
Reviewed-on: https://go-review.googlesource.com/c/go/+/248577
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
|
|
This change moves the mcache-local malloc stats into the
consistentHeapStats structure so the malloc stats can be managed
consistently with the memory stats. The one exception here is
tinyAllocs for which moving that into the global stats would incur
several atomic writes on the fast path. Microbenchmarks for just one CPU
core have shown a 50% loss in throughput. Since tiny allocation counnt
isn't exposed anyway and is always blindly added to both allocs and
frees, let that stay inconsistent and flush the tiny allocation count
every so often.
Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/247039
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change renames a bunch of malloc statistics stored in the mcache
that are all named with the "local_" prefix. It also renames largeAlloc
to allocLarge to prevent a naming conflict, and next_sample because it
would be the last mcache field with the old C naming style.
Change-Id: I29695cb83b397a435ede7e9ad5c3c9be72767ea3
Reviewed-on: https://go-review.googlesource.com/c/go/+/246969
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change deletes the old mcentral implementation from the code base
and the newMCentralImpl feature flag along with it.
Updates #37487.
Change-Id: Ibca8f722665f0865051f649ffe699cbdbfdcfcf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/221184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change removes the old markrootSpans implementation and deletes the
feature flag.
Updates #37487.
Change-Id: Idb5a2559abcc3be5a7da6f2ccce1a86e1d7634e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/221183
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
A zombie slot is a slot that is marked, but isn't allocated. This can
indicate a bug in the GC, or a bad use of unsafe.Pointer. Currently,
the sweeper has best-effort detection for zombie slots: if there are
more marked slots than allocated slots, then there must have been a
zombie slot. However, this is imprecise since it only compares totals
and it reports almost no information that may be helpful to debug the
issue.
Add a precise check that compares the mark and allocation bitmaps and
reports detailed information if it detects a zombie slot.
No appreciable effect on performance as measured by the sweet
benchmarks:
name old time/op new time/op delta
BiogoIgor 15.8s ± 2% 15.8s ± 2% ~ (p=0.421 n=24+25)
BiogoKrishna 15.6s ± 2% 15.8s ± 5% ~ (p=0.082 n=22+23)
BleveIndexBatch100 4.90s ± 3% 4.88s ± 2% ~ (p=0.627 n=25+24)
CompileTemplate 204ms ± 1% 205ms ± 0% +0.22% (p=0.010 n=24+23)
CompileUnicode 77.8ms ± 2% 78.0ms ± 1% ~ (p=0.236 n=25+24)
CompileGoTypes 729ms ± 0% 731ms ± 0% +0.26% (p=0.000 n=24+24)
CompileCompiler 3.52s ± 0% 3.52s ± 1% ~ (p=0.152 n=25+25)
CompileSSA 8.06s ± 1% 8.05s ± 0% ~ (p=0.192 n=25+24)
CompileFlate 132ms ± 1% 132ms ± 1% ~ (p=0.373 n=24+24)
CompileGoParser 163ms ± 1% 164ms ± 1% +0.32% (p=0.003 n=24+25)
CompileReflect 453ms ± 1% 455ms ± 1% +0.39% (p=0.000 n=22+22)
CompileTar 181ms ± 1% 181ms ± 1% +0.20% (p=0.029 n=24+21)
CompileXML 244ms ± 1% 244ms ± 1% ~ (p=0.065 n=24+24)
CompileStdCmd 15.8s ± 2% 15.7s ± 2% ~ (p=0.059 n=23+24)
FoglemanFauxGLRenderRotateBoat 13.4s ±11% 12.8s ± 0% ~ (p=0.377 n=25+24)
FoglemanPathTraceRenderGopherIter1 18.6s ± 0% 18.6s ± 0% ~ (p=0.696 n=23+24)
GopherLuaKNucleotide 28.7s ± 4% 28.6s ± 5% ~ (p=0.700 n=25+25)
MarkdownRenderXHTML 250ms ± 1% 248ms ± 1% -1.01% (p=0.000 n=24+24)
[Geo mean] 1.60s 1.60s -0.11%
(https://perf.golang.org/search?q=upload:20200517.6)
For #38702.
Change-Id: I8af1fefd5fbf7b9cb665b98f9c4b73d1d08eea81
Reviewed-on: https://go-review.googlesource.com/c/go/+/234100
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This change removes the concept of s.scavAddr in favor of explicitly
reserving and unreserving address ranges. s.scavAddr has several
problems with raciness that can cause the scavenger to miss updates, or
move it back unnecessarily, forcing future scavenge calls to iterate
over searched address space unnecessarily.
This change achieves this by replacing scavAddr with a second addrRanges
which is cloned from s.inUse at the end of each sweep phase. Ranges from
this second addrRanges are then reserved by scavengers (with the
reservation size proportional to the heap size) who are then able to
safely iterate over those ranges without worry of another scavenger
coming in.
Fixes #35788.
Change-Id: Ief01ae170384174875118742f6c26b2a41cbb66d
Reviewed-on: https://go-review.googlesource.com/c/go/+/208378
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This change modifies the semantics of waking the scavenger: rather than
wake on any update to pacing, wake when we know we will have work to do,
that is, when the sweeper is done. The current scavenger runs over the
address space just once per GC cycle, and we want to maximize the chance
that the scavenger observes the most attractive scavengable memory in
that pass (i.e. free memory with the highest address), so the timing is
important. By having the scavenger awaken and reset its search space
when the sweeper is done, we increase the chance that the scavenger will
observe the most attractive scavengable memory, because no more memory
will be freed that GC cycle (so the highest scavengable address should
now be available).
Furthermore, in applications that go idle, this means the background
scavenger will be awoken even if another GC doesn't happen, which isn't
true today.
However, we're unable to wake the scavenger directly from within the
sweeper; waking the scavenger involves modifying timers and readying
goroutines, the latter of which may trigger an allocation today (and the
sweeper may run during allocation!). Instead, we do the following:
1. Set a flag which is checked by sysmon. sysmon will clear the flag and
wake the scavenger.
2. Wake the scavenger unconditionally at sweep termination.
The idea behind this policy is that it gets us close enough to the state
above without having to deal with the complexity of waking the scavenger
in deep parts of the runtime. If the application goes idle and sweeping
finishes (so we don't reach sweep termination), then sysmon will wake
the scavenger. sysmon has a worst-case 20 ms delay in responding to this
signal, which is probably fine if the application is completely idle
anyway, but if the application is actively allocating, then the
proportional sweeper should help ensure that sweeping ends very close to
sweep termination, so sweep termination is a perfectly reasonable time
to wake up the scavenger.
Updates #35788.
Change-Id: I84289b37816a7d595d803c72a71b7f5c59d47e6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/207998
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently mcentral is implemented as a couple of linked lists of spans
protected by a lock. Unfortunately this design leads to significant lock
contention.
The span ownership model is also confusing and complicated. In-use spans
jump between being owned by multiple sources, generally some combination
of a gcSweepBuf, a concurrent sweeper, an mcentral or an mcache.
So first to address contention, this change replaces those linked lists
with gcSweepBufs which have an atomic fast path. Then, we change up the
ownership model: a span may be simultaneously owned only by an mcentral
and the page reclaimer. Otherwise, an mcentral (which now consists of
sweep bufs), a sweeper, or an mcache are the sole owners of a span at
any given time. This dramatically simplifies reasoning about span
ownership in the runtime.
As a result of this new ownership model, sweeping is now driven by
walking over the mcentrals rather than having its own global list of
spans. Because we no longer have a global list and we traditionally
haven't used the mcentrals for large object spans, we no longer have
anywhere to put large objects. So, this change also makes it so that we
keep large object spans in the appropriate mcentral lists.
In terms of the static lock ranking, we add the spanSet spine locks in
pretty much the same place as the mcentral locks, since they have the
potential to be manipulated both on the allocation and sweep paths, like
the mcentral locks.
This new implementation is turned on by default via a feature flag
called go115NewMCentralImpl.
Benchmark results for 1 KiB allocation throughput (5 runs each):
name \ MiB/s go113 go114 gotip gotip+this-patch
AllocKiB-1 1.71k ± 1% 1.68k ± 1% 1.59k ± 2% 1.71k ± 1%
AllocKiB-2 2.46k ± 1% 2.51k ± 1% 2.54k ± 1% 2.93k ± 1%
AllocKiB-4 4.27k ± 1% 4.41k ± 2% 4.33k ± 1% 5.01k ± 2%
AllocKiB-8 4.38k ± 3% 5.24k ± 1% 5.46k ± 1% 8.23k ± 1%
AllocKiB-12 4.38k ± 3% 4.49k ± 1% 5.10k ± 1% 10.04k ± 0%
AllocKiB-16 4.31k ± 1% 4.14k ± 3% 4.22k ± 0% 10.42k ± 0%
AllocKiB-20 4.26k ± 1% 3.98k ± 1% 4.09k ± 1% 10.46k ± 3%
AllocKiB-24 4.20k ± 1% 3.97k ± 1% 4.06k ± 1% 10.74k ± 1%
AllocKiB-28 4.15k ± 0% 4.00k ± 0% 4.20k ± 0% 10.76k ± 1%
Fixes #37487.
Change-Id: I92d47355acacf9af2c41bf080c08a8c1638ba210
Reviewed-on: https://go-review.googlesource.com/c/go/+/221182
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently markrootSpans, the scanning routine which scans span specials
(particularly finalizers) as roots, uses sweepSpans to shard work and
find spans to mark.
However, as part of a future CL to change span ownership and how
mcentral works, we want to avoid having markrootSpans use the sweep bufs
to find specials, so in this change we introduce a new mechanism.
Much like for the page reclaimer, we set up a per-page bitmap where the
first page for a span is marked if the span contains any specials, and
unmarked if it has no specials. This bitmap is updated by addspecial,
removespecial, and during sweeping.
markrootSpans then shards this bitmap into mark work and markers iterate
over the bitmap looking for spans with specials to mark. Unlike the page
reclaimer, we don't need to use the pageInUse bits because having a
special implies that a span is in-use.
While in terms of computational complexity this design is technically
worse, because it needs to iterate over the mapped heap, in practice
this iteration is very fast (we can skip over large swathes of the heap
very quickly) and we only look at spans that have any specials at all,
rather than having to touch each span.
This new implementation of markrootSpans is behind a feature flag called
go115NewMarkrootSpans.
Updates #37487.
Change-Id: I8ea07b6c11059f6d412fe419e0ab512d989377b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/221178
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.
Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.
Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.
See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.
I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.
I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).
Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.
For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.
Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.
To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.
Fixes #38029
Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Having an mcache field in both m and p is confusing, so remove it from m.
Always use mcache field from p. Use new variable mcache0 during bootstrap.
Change-Id: If2cba9f8bb131d911d512b61fd883a86cf62cc98
Reviewed-on: https://go-review.googlesource.com/c/go/+/205239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This change removes useless additional heap_objects accounting for large
objects. heap_objects is computed from scratch at ReadMemStats time
(which stops the world) by using nlargealloc and nlargefree, so mutating
heap_objects turns out to be pointless.
As a result, the "large" parameter on "mheap_.freeSpan" is no longer
necessary and so this change cleans that up too.
Change-Id: I7d6b486d9b57c018e3db46221d81b55fe4c1b021
Reviewed-on: https://go-review.googlesource.com/c/go/+/196637
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
When everything is working correctly, any pointer the garbage
collector encounters can only point into a fully initialized heap
span, since the span must have been initialized before that pointer
could escape the heap allocator and become visible to the GC.
However, in various cases, we try to be defensive against bad
pointers. In findObject, this is just a sanity check: we never expect
to find a bad pointer, but programming errors can lead to them. In
spanOfHeap, we don't necessarily trust the pointer and we're trying to
check if it really does point to the heap, though it should always
point to something. Conservative scanning takes this to a new level,
since it can only guess that a word may be a pointer and verify this.
In all of these cases, we have a problem that the span lookup and
check can race with span initialization, since the span becomes
visible to lookups before it's fully initialized.
Furthermore, we're about to start initializing the span without the
heap lock held, which is going to introduce races where accesses were
previously protected by the heap lock.
To address this, this CL makes accesses to mspan.state atomic, and
ensures that the span is fully initialized before setting the state to
mSpanInUse. All loads are now atomic, and in any case where we don't
trust the pointer, it first atomically loads the span state and checks
that it's mSpanInUse, after which it will have synchronized with span
initialization and can safely check the other span fields.
For #10958, #24543, but a good fix in general.
Change-Id: I518b7c63555b02064b98aa5f802c92b758fef853
Reviewed-on: https://go-review.googlesource.com/c/go/+/203286
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
These were added in https://go-review.googlesource.com/1224; according
to austin@google.com these annotations are not valuable - resolving by
removing the TODOs.
Change-Id: Icf3f21bc385cac9673ba29f0154680e970cf91f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/176899
Reviewed-by: Austin Clements <austin@google.com>
|
|
In runtime.gopanic, the _panic object p is stack allocated and
referenced from gp._panic. With stack objects, p on stack is dead
at the point preprintpanics runs. gp._panic points to p, but
stack scan doesn't look at gp. Heap scan of gp does look at
gp._panic, but it stops and ignores the pointer as it points to
the stack. So whatever p points to may be collected and clobbered.
We need to scan gp._panic explicitly during stack scan.
To test it reliably, we introduce a GODEBUG mode "clobberfree",
which clobbers the memory content when the GC frees an object.
Fixes #30150.
Change-Id: I11128298f03a89f817faa221421a9d332b41dced
Reviewed-on: https://go-review.googlesource.com/c/161778
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
When we attempt to allocate an N page span (either for a large
allocation or when an mcentral runs dry), we first try to sweep spans
to release N pages. Currently, this can be extremely expensive:
sweeping a span to emptiness is the hardest thing to ask for and the
sweeper generally doesn't know where to even look for potentially
fruitful results. Since this is on the critical path of many
allocations, this is unfortunate.
This CL changes how we reclaim empty spans. Instead of trying lots of
spans and hoping for the best, it uses the newly introduced span marks
to efficiently find empty spans. The span marks (and in-use bits) are
in a dense bitmap, so these spans can be found with an efficient
sequential memory scan. This approach can scan for unmarked spans at
about 300 GB/ms and can free unmarked spans at about 32 MB/ms. We
could probably significantly improve the rate at which is can free
unmarked spans, but that's a separate issue.
Like the current reclaimer, this is still linear in the number of
spans that are swept, but the constant factor is now so vanishingly
small that it doesn't matter.
The benchmark in #18155 demonstrates both significant page reclaiming
delays, and object reclaiming delays. With "-retain-count=20000000
-preallocate=true -loop-count=3", the benchmark demonstrates several
page reclaiming delays on the order of 40ms. After this change, the
page reclaims are insignificant. The longest sweeps are still ~150ms,
but are object reclaiming delays. We'll address those in the next
several CLs.
Updates #18155.
Fixes #21378 by completely replacing the logic that had that bug.
Change-Id: Iad80eec11d7fc262d02c8f0761ac6998425c4064
Reviewed-on: https://go-review.googlesource.com/c/138959
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This change cleans up references to MSpan, MCache, and MCentral in the
docs via a bunch of sed invocations to better reflect the Go names for
the equivalent structures (i.e. mspan, mcache, mcentral) and their
methods (i.e. MSpan_Sweep -> mspan.sweep).
Change-Id: Ie911ac975a24bd25200a273086dd835ab78b1711
Reviewed-on: https://go-review.googlesource.com/c/147557
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
gosweepdone is another anachronism from the time when the sweeper was
implemented in C. Rename it to "isSweepDone" for the modern era.
Change-Id: I8472aa6f52478459c3f2edc8a4b2761e73c4c2dd
Reviewed-on: https://go-review.googlesource.com/c/138658
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
gosweepone just switches to the system stack and calls sweepone.
sweepone doesn't need to run on the system stack, so this is pretty
pointless.
Historically, this was necessary because the sweeper was written in C
and hence needed to run on the system stack. gosweepone was the
function that Go code (specifically, bgsweep) used to call into the C
sweeper implementation. This probably became unnecessary in 2014 with
CL golang.org/cl/167540043, which ported the sweeper to Go.
This CL changes all callers of gosweepone to call sweepone and
eliminates gosweepone.
Change-Id: I26b8ef0c7d060b4c0c5dedbb25ecfc936acc7269
Reviewed-on: https://go-review.googlesource.com/c/138657
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Lazy mcache flushing (golang.org/cl/134783) introduced a second value
for sweepgen that indicates a span has been swept. I missed adding
this case to a sanity check in sweepone, so it can now panic if it
finds a non-in-use spans that's been swept *and* put in an mcache.
Fix this by adding the second sweepgen case to this check.
Fixes #27997.
Change-Id: I568d9f2cc8923396ca897a37d154cd2c859c7bef
Reviewed-on: https://go-review.googlesource.com/c/140697
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
freeSpan currently takes a mysterious "acct int32" argument. This is
really just a boolean and actually just needs to match the "large"
argument to alloc in order to balance out accounting.
To make this clearer, replace acct with a "large bool" argument that
must match the call to mheap.alloc.
Change-Id: Ibc81faefdf9f0583114e1953fcfb362e9c3c76de
Reviewed-on: https://go-review.googlesource.com/c/138655
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Currently, all mcaches are flushed during STW mark termination as a
root marking job. This is currently necessary because all spans must
be out of these caches before sweeping begins to avoid races with
allocation and to ensure the spans are in the state expected by
sweeping. We do it as a root marking job because mcache flushing is
somewhat expensive and O(GOMAXPROCS) and this parallelizes the work
across the Ps. However, it's also the last remaining root marking job
performed during mark termination.
This CL moves mcache flushing out of mark termination and performs it
lazily. We keep track of the last sweepgen at which each mcache was
flushed and as each P is woken from STW, it observes that its mcache
is out-of-date and flushes it.
The introduces a complication for spans cached in stale mcaches. These
may now be observed by background or proportional sweeping or when
attempting to add a finalizer, but aren't in a stable state. For
example, they are likely to be on the wrong mcentral list. To fix
this, this CL extends the sweepgen protocol to also capture whether a
span is cached and, if so, whether or not its cache is stale. This
protocol blocks asynchronous sweeping from touching cached spans and
makes it the responsibility of mcache flushing to sweep the flushed
spans.
This eliminates the last mark termination root marking job, which
means we can now eliminate that entire infrastructure.
Updates #26903. This implements lazy mcache flushing.
Change-Id: Iadda7aabe540b2026cffc5195da7be37d5b4125e
Reviewed-on: https://go-review.googlesource.com/c/134783
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Ending a loop with a break is confusing. Rewrite the loop so the
default behavior is to loop and then do the "post-loop" work outside
of the loop.
Change-Id: Ie49b4132541dfb5124c31a8163f2c883aa4abc75
Reviewed-on: https://go-review.googlesource.com/138155
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
We already aliased mSpanInUse to _MSpanInUse. The dual constants are
getting annoying, so fix all of these to use the mSpan* naming
convention.
This was done automatically with:
sed -i -re 's/_?MSpan(Dead|InUse|Manual|Free)/mSpan\1/g' *.go
plus deleting the existing definition of mSpanInUse.
Change-Id: I09979d9d491d06c10689cea625dc57faa9cc6767
Reviewed-on: https://go-review.googlesource.com/137875
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Change-Id: Icbaedc49c810c63c51d56ae394d2f70e4d64b3e0
Reviewed-on: https://go-review.googlesource.com/136495
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Every time I poke at #14921, the g.waitreason string
pointer writes show up.
They're not particularly important performance-wise,
but it'd be nice to clear the noise away.
And it does open up a few extra bytes in the g struct
for some future use.
This is a re-roll of CL 99078, which was rolled
back because of failures on s390x.
Those failures were apparently due to an old version of gdb.
Change-Id: Icc2c12f449b2934063fd61e272e06237625ed589
Reviewed-on: https://go-review.googlesource.com/111256
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
|
|
This reverts commit 4eea887fd477368653f6fcf8ad766030167936e5.
Reason for revert: broke s390x build
Change-Id: Id6c2b6a7319273c4d21f613d4cdd38b00d49f847
Reviewed-on: https://go-review.googlesource.com/100375
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Every time I poke at #14921, the g.waitreason string
pointer writes show up.
They're not particularly important performance-wise,
but it'd be nice to clear the noise away.
And it does open up a few extra bytes in the g struct
for some future use.
Change-Id: I7ffbd52fbc2a286931a2218038fda52ed6473cc9
Reviewed-on: https://go-review.googlesource.com/99078
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently, we mix objects with pointers and objects without pointers
("noscan" objects) together in memory. As a result, for every object
we grey, we have to check that object's heap bits to find out if it's
noscan, which adds to the per-object cost of GC. This also hurts the
TLB footprint of the garbage collector because it decreases the
density of scannable objects at the page level.
This commit improves the situation by using separate spans for noscan
objects. This will allow a much simpler noscan check (in a follow up
CL), eliminate the need to clear the bitmap of noscan objects (in a
follow up CL), and improves TLB footprint by increasing the density of
scannable objects.
This is also a step toward eliminating dead bits, since the current
noscan check depends on checking the dead bit of the first word.
This has no effect on the heap size of the garbage benchmark.
We'll measure the performance change of this after the follow-up
optimizations.
This is a cherry-pick from dev.garbage commit d491e550c3. The only
non-trivial merge conflict was in updatememstats in mstats.go, where
we now have to separate the per-spanclass stats from the per-sizeclass
stats.
Change-Id: I13bdc4869538ece5649a8d2a41c6605371618e40
Reviewed-on: https://go-review.googlesource.com/41251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This changes gcSetTriggerRatio so it can be called even during
concurrent mark or sweep. In this case, it will adjust the pacing of
the current phase, accounting for progress that has already been made.
To make this work for concurrent sweep, this introduces a "basis" for
the pagesSwept count, much like the basis we just introduced for
heap_live. This lets gcSetTriggerRatio shift the basis to the current
heap_live and pagesSwept and compute a slope from there to completion.
This avoids creating a discontinuity where, if the ratio has
increased, there has to be a flurry of sweep activity to catch up.
Instead, this creates a continuous, piece-wise linear function as
adjustments are made.
For #19076.
Change-Id: Ibcd76aeeb81ff4814b00be7cbd3530b73bbdbba9
Reviewed-on: https://go-review.googlesource.com/39833
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently, proportional sweep maintains its own count of how many
bytes have been allocated since the beginning of the sweep cycle so it
can compute how many pages need to be swept for a given allocation.
However, this requires a somewhat complex reimbursement scheme since
proportional sweep must be done before a span is allocated, but we
don't know how many bytes to charge until we've allocated a span. This
means that the allocated byte count used by proportional sweep can go
up and down, which has led to underflow bugs in the past (#18043) and
is going to interfere with adjusting sweep pacing on-the-fly (for #19076).
This approach also means we're maintaining a statistic that is very
closely related to heap_live, but has a different 0 value. This is
particularly confusing because the sweep ratio is computed based on
heap_live, so you have to understand that these two statistics are
very closely related.
Replace all of this and compute the sweep debt directly from the
current value of heap_live. To make this work, we simply save the
value of heap_live when the sweep ratio is computed to use as a
"basis" for later computing the sweep debt.
This eliminates the need for reimbursement as well as the code for
maintaining the sweeper's version of the live heap size.
For #19076.
Coincidentally fixes #18043, since this eliminates sweep reimbursement
entirely.
Change-Id: I1f931ddd6e90c901a3972c7506874c899251dc2a
Reviewed-on: https://go-review.googlesource.com/39832
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This extends the GCSweepDone event with counts of swept and reclaimed
bytes. These are useful for understanding the duration and
effectiveness of sweep events.
Change-Id: I3c97a4f0f3aad3adbd188adb264859775f54e2df
Reviewed-on: https://go-review.googlesource.com/40811
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
Currently, each individual span sweep emits a span to the trace. But
sweeps are generally done in loops until some condition is satisfied,
so this tracing is lower-level than anyone really wants any hides the
fact that no other work is being accomplished between adjacent sweep
events. This is also high overhead: enabling tracing significantly
impacts sweep latency.
Replace this with instead tracing around the sweep loops used for
allocation. This is slightly tricky because sweep loops don't
generally know if any sweeping will happen in them. Hence, we make the
tracing lazy by recording in the P that we would like to start tracing
the sweep *if* one happens, and then only closing the sweep event if
we started it.
This does mean we don't get tracing on every sweep path, which are
legion. However, we get much more informative tracing on the paths
that block allocation, which are the paths that matter.
Change-Id: I73e14fbb250acb0c9d92e3648bddaa5e7d7e271c
Reviewed-on: https://go-review.googlesource.com/40810
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This extends the sweeper to free workbufs back to the heap between GC
cycles, allowing this memory to be reused for GC'd allocations or
eventually returned to the OS.
This helps for applications that have high peak heap usage relative to
their regular heap usage (for example, a high-memory initialization
phase). Workbuf memory is roughly proportional to heap size and since
we currently never free workbufs, it's proportional to *peak* heap
size. By freeing workbufs, we can release and reuse this memory for
other purposes when the heap shrinks.
This is somewhat complicated because this costs ~1–2 µs per workbuf
span, so for large heaps it's too expensive to just do synchronously
after mark termination between starting the world and dropping the
worldsema. Hence, we do it asynchronously in the sweeper. This adds a
list of "free" workbuf spans that can be returned to the heap. GC
moves all workbuf spans to this list after mark termination and the
background sweeper drains this list back to the heap. If the sweeper
doesn't finish, that's fine, since getempty can directly reuse any
remaining spans to allocate more workbufs.
Performance impact is negligible. On the x/benchmarks, this reduces
GC-bytes-from-system by 6–11%.
Fixes #19325.
Change-Id: Icb92da2196f0c39ee984faf92d52f29fd9ded7a8
Reviewed-on: https://go-review.googlesource.com/38582
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Commit 44ed88a5a7 moved printing of the "sweep done" gcpacertrace
message so that it is printed when the final sweeper finishes.
However, by this point some other thread has often already observed
that there are no more spans to sweep and zeroed sweepPagesPerByte.
Avoid printing a 0 sweep ratio in the trace when this race happens by
getting the value of the sweep ratio upon entry to sweepone and
printing that.
Change-Id: Iac0c48ae899e12f193267cdfb012c921f8b71c85
Reviewed-on: https://go-review.googlesource.com/39492
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
sweepone returns ^uintptr(0) when there are no more spans to *start*
sweeping, but there may be spans being swept concurrently at the time
and there's currently no efficient way to tell when the sweeper is
done sweeping all the spans.
We'll need this for concurrent runtime.GC(), so add a count of the
number of active sweepone calls to make it possible to block until
sweeping is truly done.
This is also useful for more accurately printing the gcpacertrace,
since that should be printed after all of the sweeping stats are in
(currently we can print it slightly too early).
For #18216.
Change-Id: I06e6240c9e7b40aca6fd7b788bb6962107c10a0f
Reviewed-on: https://go-review.googlesource.com/37716
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently sweep counts the number of allocated objects, computes the
number of free objects from that, then re-computes the number of
allocated objects from that. Simplify and clean this up by skipping
these intermediate steps.
Change-Id: I3ed98e371eb54bbcab7c8530466c4ab5fde35f0a
Reviewed-on: https://go-review.googlesource.com/34935
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Updates #18043.
Change-Id: I24e687fdd5521c48b672987f15f0d5de9f308884
Reviewed-on: https://go-review.googlesource.com/34612
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Now that sweeping and span marking use the sweep list, there's no need
for the work.spans snapshot of the allspans list. This change
eliminates the few remaining uses of it, which are either dead code or
can use allspans directly, and removes work.spans and its support
functions.
Change-Id: Id5388b42b1e68e8baee853d8eafb8bb4ff95bb43
Reviewed-on: https://go-review.googlesource.com/30537
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently sweeping walks the list of all spans, which means the work
in sweeping is proportional to the maximum number of spans ever used.
If the heap was once large but is now small, this causes an
amortization failure: on a small heap, GCs happen frequently, but a
full sweep still has to happen in each GC cycle, which means we spent
a lot of time in sweeping.
Fix this by creating a separate list consisting of just the in-use
spans to be swept, so sweeping is proportional to the number of in-use
spans (which is proportional to the live heap). Specifically, we
create two lists: a list of unswept in-use spans and a list of swept
in-use spans. At the start of the sweep cycle, the swept list becomes
the unswept list and the new swept list is empty. Allocating a new
in-use span adds it to the swept list. Sweeping moves spans from the
unswept list to the swept list.
This fixes the amortization problem because a shrinking heap moves
spans off the unswept list without adding them to the swept list,
reducing the time required by the next sweep cycle.
Updates #9265. This fix eliminates almost all of the time spent in
sweepone; however, markrootSpans has essentially the same bug, so now
the test program from this issue spends all of its time in
markrootSpans.
No significant effect on other benchmarks.
Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Race runtime also needs local malloc caches and currently uses
a mix of per-OS-thread and per-goroutine caches. This leads to
increased memory consumption. But more importantly cache of
synchronization objects is per-goroutine and we don't always
have goroutine context when feeing memory in GC. As the result
synchronization object descriptors leak (more precisely, they
can be reused if another synchronization object is recreated
at the same address, but it does not always help). For example,
the added BenchmarkSyncLeak has effectively runaway memory
consumption (based on a real long running server).
This change updates race runtime with support for per-P contexts.
BenchmarkSyncLeak now stabilizes at ~1GB memory consumption.
Long term, this will allow us to remove race runtime dependency
on glibc (as malloc is the main cornerstone).
I've also implemented a different scheme to pass P context to
race runtime: scheduler notified race runtime about association
between G and P by calling procwire(g, p)/procunwire(g, p).
But it turned out to be very messy as we have lots of places
where the association changes (e.g. syscalls). So I dropped it
in favor of the current scheme: race runtime asks scheduler
about the current P.
Fixes #14533
Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69
Reviewed-on: https://go-review.googlesource.com/19970
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
sweep used to skip mcental.freeSpan (and its locking) if it didn't
find any new free objects. We lost that optimization when the
freed-object counting changed in dad83f7 to count total free objects
instead of newly freed objects.
The previous commit brings back counting of newly freed objects, so we
can easily revive this optimization by checking that count (like we
used to) instead of the total free objects count.
Change-Id: I43658707a1c61674d0366124d5976b00d98741a9
Reviewed-on: https://go-review.googlesource.com/22596
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
|