Age | Commit message (Collapse) | Author |
|
If something "huge" is allocated, and the zeroing is trivial (no pointers
involved) then zero it by chunks in a loop so that preemption can occur,
not all in a single non-preemptible call.
Benchmarking suggests that 256K is the best chunk size.
Updates #42642.
Change-Id: I94015e467eaa098c59870e479d6d83bc88efbfb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/270943
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Currently tiny allocations are not represented in either MemStats or
runtime/metrics, but they're represented in MemStats (indirectly) via
Mallocs. Add them to runtime/metrics by first merging
memstats.tinyallocs into consistentHeapStats (just for simplicity; it's
monotonic so metrics would still be self-consistent if we just read it
atomically) and then adding /gc/heap/tiny/allocs:objects to the list of
supported metrics.
Change-Id: Ie478006ab942a3e877b4a79065ffa43569722f3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/312909
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change moves certain important but internal-only GC statistics from
memstats into gcController. These statistics are mainly used in pacing
the GC, so it makes sense to keep them in the pacer's state.
This CL was mostly generated via
rf '
ex . {
memstats.gc_trigger -> gcController.trigger
memstats.triggerRatio -> gcController.triggerRatio
memstats.heap_marked -> gcController.heapMarked
memstats.heap_live -> gcController.heapLive
memstats.heap_scan -> gcController.heapScan
}
'
except for a few special cases, like updating names in comments and when
these fields are used within gcControllerState methods (at which point
they're accessed through the reciever).
For #44167.
Change-Id: I6bd1602585aeeb80818ded24c07d8e6fec992b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/306598
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change modifies the consistent stats implementation to keep the
per-P sequence counter on each P instead of each mcache. A valid mcache
is not available everywhere that we want to call e.g. allocSpan, as per
issue #42339. By decoupling these two, we can add a mechanism to allow
contexts without a P to update stats consistently.
In this CL, we achieve that with a mutex. In practice, it will be very
rare for an M to update these stats without a P. Furthermore, the stats
reader also only needs to hold the mutex across the update to "gen"
since once that changes, writers are free to continue updating the new
stats generation. Contention could thus only arise between writers
without a P, and as mentioned earlier, those should be rare.
A nice side-effect of this change is that the consistent stats acquire
and release API becomes simpler.
Fixes #42339.
Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34
Reviewed-on: https://go-review.googlesource.com/c/go/+/267158
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change moves the responsibility of throwing if an mcache is not
available to the caller, because the inlining cost of throw is set very
high in the compiler. Even if it was reduced down to the cost of a usual
function call, it would still be too expensive, so just move it out.
This choice also makes sense in the context of #42339 since we're going
to have to handle the case where we don't have an mcache to update stats
in a few contexts anyhow.
Also, add getMCache to the list of functions that should be inlined to
prevent future regressions.
getMCache is called on the allocation fast path and because its not
inlined actually causes a significant regression (~10%) in some
microbenchmarks.
Fixes #42305.
Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
This change moves the mcache-local malloc stats into the
consistentHeapStats structure so the malloc stats can be managed
consistently with the memory stats. The one exception here is
tinyAllocs for which moving that into the global stats would incur
several atomic writes on the fast path. Microbenchmarks for just one CPU
core have shown a 50% loss in throughput. Since tiny allocation counnt
isn't exposed anyway and is always blindly added to both allocs and
frees, let that stay inconsistent and flush the tiny allocation count
every so often.
Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca
Reviewed-on: https://go-review.googlesource.com/c/go/+/247039
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change adds a global set of heap statistics which are similar
to existing memory statistics. The purpose of these new statistics
is to be able to read them and get a consistent result without stopping
the world. The goal is to eventually replace as many of the existing
memstats statistics with the sharded ones as possible.
The consistent memory statistics use a tailor-made synchronization
mechanism to allow writers (allocators) to proceed with minimal
synchronization by using a sequence counter and a global generation
counter to determine which set of statistics to update. Readers
increment the global generation counter to effectively grab a snapshot
of the statistics, and then iterate over all Ps using the sequence
counter to ensure that they may safely read the snapshotted statistics.
To keep statistics fresh, the reader also has a responsibility to merge
sets of statistics.
These consistent statistics are computed, but otherwise unused for now.
Upcoming changes will integrate them with the rest of the codebase and
will begin to phase out existing statistics.
Change-Id: I637a11f2439e2049d7dccb8650c5d82500733ca5
Reviewed-on: https://go-review.googlesource.com/c/go/+/247037
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change adds a function getMCache which returns the current P's
mcache if it's available, and otherwise tries to get mcache0 if we're
bootstrapping. This function will come in handy as we need to replicate
this behavior in multiple places in future changes.
Change-Id: I536073d6f6dc6c6390269e613ead9f8bcb6e7f98
Reviewed-on: https://go-review.googlesource.com/c/go/+/246976
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change renames a bunch of malloc statistics stored in the mcache
that are all named with the "local_" prefix. It also renames largeAlloc
to allocLarge to prevent a naming conflict, and next_sample because it
would be the last mcache field with the old C naming style.
Change-Id: I29695cb83b397a435ede7e9ad5c3c9be72767ea3
Reviewed-on: https://go-review.googlesource.com/c/go/+/246969
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Now that local_scan is the last mcache-based statistic that is flushed
by purgecachedstats, and heap_scan and gcController.revise may be
interacted with concurrently, we don't need to flush heap_scan at
arbitrary locations where the heap is locked, and we don't need
purgecachedstats and cachestats anymore. Instead, we can flush
local_scan at the same time we update heap_live in refill, so the two
updates may share the same revise call.
Clean up unused functions, remove code that would cause the heap to get
locked in the allocSpan when it didn't need to (other than to flush
local_scan), and flush local_scan explicitly in a few important places.
Notably we need to flush local_scan whenever we flush the other stats,
but it doesn't need to be donated anywhere, so have releaseAll do the
flushing. Also, we need to flush local_scan before we set heap_scan at
the end of a GC, which was previously handled by cachestats. Just do so
explicitly -- it's not much code and it becomes a lot more clear why we
need to do so.
Change-Id: I35ac081784df7744d515479896a41d530653692d
Reviewed-on: https://go-review.googlesource.com/c/go/+/246968
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change makes local_tinyallocs work like the rest of the malloc
stats and doesn't flush local_tinyallocs, instead making that the
source-of-truth.
Change-Id: I3e6cb5f1b3d086e432ce7d456895511a48e3617a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246967
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change removes mcentral.nmalloc and adds mcache.local_nsmallalloc
which fulfills the same role but may be accessed non-atomically. It also
moves responsibility for updating heap_live and local_nsmallalloc into
mcache functions.
As a result of this change, mcache is now the sole source-of-truth for
malloc stats. It is also solely responsible for updating heap_live and
performing the various operations required as a result of updating
heap_live. The overall improvement here is in code organization:
previously malloc stats were fairly scattered, and now they have one
single home, and nearly all the required manipulations exist in a single
file.
Change-Id: I7e93fa297c1debf17e3f2a0d68aeed28a9c6af00
Reviewed-on: https://go-review.googlesource.com/c/go/+/246966
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change makes nlargealloc and largealloc into mcache fields just
like nlargefree and largefree. These local fields become the new
source-of-truth. This change also moves the accounting for these fields
out of allocSpan (which is an inappropriate place for it -- this
accounting generally happens much closer to the point of allocation) and
into largeAlloc. This move is partially possible now that we can call
gcController.revise at that point.
Furthermore, this change moves largeAlloc into mcache.go and makes it a
method of mcache. While there's a little bit of a mismatch here because
largeAlloc barely interacts with the mcache, it helps solidify the
mcache as the first allocation layer and provides a clear place to
aggregate and manage statistics.
Change-Id: I37b5e648710733bb4c04430b71e96700e438587a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246965
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change makes it so that various local malloc stats (excluding
heap_scan and local_tinyallocs) are no longer written first to mheap
fields but are instead accessed directly from each mcache.
This change is part of a move toward having stats be distributed, and
cleaning up some old code related to the stats.
Note that because there's no central source-of-truth, when an mcache
dies, it must donate its stats to another mcache. It's always safe to
donate to the mcache for the 0th P, so do that.
Change-Id: I2556093dbc27357cb9621c9b97671f3c00aa1173
Reviewed-on: https://go-review.googlesource.com/c/go/+/246964
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
This change deletes the old mcentral implementation from the code base
and the newMCentralImpl feature flag along with it.
Updates #37487.
Change-Id: Ibca8f722665f0865051f649ffe699cbdbfdcfcf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/221184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
Currently mcentral is implemented as a couple of linked lists of spans
protected by a lock. Unfortunately this design leads to significant lock
contention.
The span ownership model is also confusing and complicated. In-use spans
jump between being owned by multiple sources, generally some combination
of a gcSweepBuf, a concurrent sweeper, an mcentral or an mcache.
So first to address contention, this change replaces those linked lists
with gcSweepBufs which have an atomic fast path. Then, we change up the
ownership model: a span may be simultaneously owned only by an mcentral
and the page reclaimer. Otherwise, an mcentral (which now consists of
sweep bufs), a sweeper, or an mcache are the sole owners of a span at
any given time. This dramatically simplifies reasoning about span
ownership in the runtime.
As a result of this new ownership model, sweeping is now driven by
walking over the mcentrals rather than having its own global list of
spans. Because we no longer have a global list and we traditionally
haven't used the mcentrals for large object spans, we no longer have
anywhere to put large objects. So, this change also makes it so that we
keep large object spans in the appropriate mcentral lists.
In terms of the static lock ranking, we add the spanSet spine locks in
pretty much the same place as the mcentral locks, since they have the
potential to be manipulated both on the allocation and sweep paths, like
the mcentral locks.
This new implementation is turned on by default via a feature flag
called go115NewMCentralImpl.
Benchmark results for 1 KiB allocation throughput (5 runs each):
name \ MiB/s go113 go114 gotip gotip+this-patch
AllocKiB-1 1.71k ± 1% 1.68k ± 1% 1.59k ± 2% 1.71k ± 1%
AllocKiB-2 2.46k ± 1% 2.51k ± 1% 2.54k ± 1% 2.93k ± 1%
AllocKiB-4 4.27k ± 1% 4.41k ± 2% 4.33k ± 1% 5.01k ± 2%
AllocKiB-8 4.38k ± 3% 5.24k ± 1% 5.46k ± 1% 8.23k ± 1%
AllocKiB-12 4.38k ± 3% 4.49k ± 1% 5.10k ± 1% 10.04k ± 0%
AllocKiB-16 4.31k ± 1% 4.14k ± 3% 4.22k ± 0% 10.42k ± 0%
AllocKiB-20 4.26k ± 1% 3.98k ± 1% 4.09k ± 1% 10.46k ± 3%
AllocKiB-24 4.20k ± 1% 3.97k ± 1% 4.06k ± 1% 10.74k ± 1%
AllocKiB-28 4.15k ± 0% 4.00k ± 0% 4.20k ± 0% 10.76k ± 1%
Fixes #37487.
Change-Id: I92d47355acacf9af2c41bf080c08a8c1638ba210
Reviewed-on: https://go-review.googlesource.com/c/go/+/221182
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Overflow of the comparison caused very large (>=1<<32) allocations to
sometimes not get sampled at all. Use uintptr so the comparison will
never overflow.
Fixes #33342
Tested on the example in 33342. I don't want to check a test in that
needs that much memory, however.
Change-Id: I51fe77a9117affed8094da93c0bc5f445ac2d3d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/188017
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently there's an invariant in the runtime wherein the heap lock
can only be acquired on the system stack, otherwise a self-deadlock
could occur if the stack grows while the lock is held.
This invariant is upheld and documented in a number of situations (e.g.
allocManual, freeManual) but there are other places where the invariant
is either not maintained at all which risks self-deadlock (e.g.
setGCPercent, gcResetMarkState, allocmcache) or is maintained but
undocumented (e.g. gcSweep, readGCStats_m).
This change adds go:systemstack to any function that acquires the heap
lock or adds a systemstack(func() { ... }) around the critical section,
where appropriate. It also documents the invariant on (*mheap).lock
directly and updates repetitive documentation to refer to that comment.
Fixes #32105.
Change-Id: I702b1290709c118b837389c78efde25c51a2cafb
Reviewed-on: https://go-review.googlesource.com/c/go/+/177857
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
|
|
This change cleans up references to MSpan, MCache, and MCentral in the
docs via a bunch of sed invocations to better reflect the Go names for
the equivalent structures (i.e. mspan, mcache, mcentral) and their
methods (i.e. MSpan_Sweep -> mspan.sweep).
Change-Id: Ie911ac975a24bd25200a273086dd835ab78b1711
Reviewed-on: https://go-review.googlesource.com/c/147557
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Currently, all mcaches are flushed during STW mark termination as a
root marking job. This is currently necessary because all spans must
be out of these caches before sweeping begins to avoid races with
allocation and to ensure the spans are in the state expected by
sweeping. We do it as a root marking job because mcache flushing is
somewhat expensive and O(GOMAXPROCS) and this parallelizes the work
across the Ps. However, it's also the last remaining root marking job
performed during mark termination.
This CL moves mcache flushing out of mark termination and performs it
lazily. We keep track of the last sweepgen at which each mcache was
flushed and as each P is woken from STW, it observes that its mcache
is out-of-date and flushes it.
The introduces a complication for spans cached in stale mcaches. These
may now be observed by background or proportional sweeping or when
attempting to add a finalizer, but aren't in a stable state. For
example, they are likely to be on the wrong mcentral list. To fix
this, this CL extends the sweepgen protocol to also capture whether a
span is cached and, if so, whether or not its cache is stale. This
protocol blocks asynchronous sweeping from touching cached spans and
makes it the responsibility of mcache flushing to sweep the flushed
spans.
This eliminates the last mark termination root marking job, which
means we can now eliminate that entire infrastructure.
Updates #26903. This implements lazy mcache flushing.
Change-Id: Iadda7aabe540b2026cffc5195da7be37d5b4125e
Reviewed-on: https://go-review.googlesource.com/c/134783
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
mcache.refill acquires g.m.locks, which is pointless because the
caller itself absolutely must have done so already to prevent
ownership of mcache from shifting.
Also, mcache.refill's documentation is generally a bit out-of-date, so
this cleans this up.
Change-Id: Idc8de666fcaf3c3d96006bd23a8f307539587d6c
Reviewed-on: https://go-review.googlesource.com/138195
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
These functions all serve essentially the same purpose. mlookup is
used in only one place and findObject in only three. Use
heapBitsForObject instead, which is the most optimized implementation.
(This may seem slightly silly because none of these uses care about
the heap bits, but we're about to split up the functionality of
heapBitsForObject anyway. At that point, findObject will rise from the
ashes.)
Change-Id: I906468c972be095dd23cf2404a7d4434e802f250
Reviewed-on: https://go-review.googlesource.com/85877
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
These have never had a use - not even going back to when they were added
in C.
Change-Id: I143b6902b3bacb1fa83c56c9070a8adb9f61a844
Reviewed-on: https://go-review.googlesource.com/69119
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Currently, we mix objects with pointers and objects without pointers
("noscan" objects) together in memory. As a result, for every object
we grey, we have to check that object's heap bits to find out if it's
noscan, which adds to the per-object cost of GC. This also hurts the
TLB footprint of the garbage collector because it decreases the
density of scannable objects at the page level.
This commit improves the situation by using separate spans for noscan
objects. This will allow a much simpler noscan check (in a follow up
CL), eliminate the need to clear the bitmap of noscan objects (in a
follow up CL), and improves TLB footprint by increasing the density of
scannable objects.
This is also a step toward eliminating dead bits, since the current
noscan check depends on checking the dead bit of the first word.
This has no effect on the heap size of the garbage benchmark.
We'll measure the performance change of this after the follow-up
optimizations.
This is a cherry-pick from dev.garbage commit d491e550c3. The only
non-trivial merge conflict was in updatememstats in mstats.go, where
we now have to separate the per-spanclass stats from the per-sizeclass
stats.
Change-Id: I13bdc4869538ece5649a8d2a41c6605371618e40
Reviewed-on: https://go-review.googlesource.com/41251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently fixalloc does not zero memory it reuses. This is dangerous
with the hybrid barrier if the type may contain heap pointers, since
it may cause us to observe a dead heap pointer on reuse. It's also
error-prone since it's the only allocator that doesn't zero on
allocation (mallocgc of course zeroes, but so do persistentalloc and
sysAlloc). It's also largely pointless: for mcache, the caller
immediately memclrs the allocation; and the two specials types are
tiny so there's no real cost to zeroing them.
Change fixalloc to zero allocations by default.
The only type we don't zero by default is mspan. This actually
requires that the spsn's sweepgen survive across freeing and
reallocating a span. If we were to zero it, the following race would
be possible:
1. The current sweepgen is 2. Span s is on the unswept list.
2. Direct sweeping sweeps span s, finds it's all free, and releases s
to the fixalloc.
3. Thread 1 allocates s from fixalloc. Suppose this zeros s, including
s.sweepgen.
4. Thread 1 calls s.init, which sets s.state to _MSpanDead.
5. On thread 2, background sweeping comes across span s in allspans
and cas's s.sweepgen from 0 (sg-2) to 1 (sg-1). Now it thinks it
owns it for sweeping. 6. Thread 1 continues initializing s.
Everything breaks.
I would like to fix this because it's obviously confusing, but it's a
subtle enough problem that I'm leaving it alone for now. The solution
may be to skip sweepgen 0, but then we have to think about wrap-around
much more carefully.
Updates #17503.
Change-Id: Ie08691feed3abbb06a31381b94beb0a2e36a0613
Reviewed-on: https://go-review.googlesource.com/31368
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This covers basically all sysAlloc'd, persistentalloc'd, and
fixalloc'd types.
Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b
Reviewed-on: https://go-review.googlesource.com/30941
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This is a renaming of the field ref to the
more appropriate allocCount. The field
holds the number of objects in the span
that are currently allocated. Some throws
strings were adjusted to more accurately
convey the meaning of allocCount.
Change-Id: I10daf44e3e9cc24a10912638c7de3c1984ef8efe
Reviewed-on: https://go-review.googlesource.com/19518
Reviewed-by: Austin Clements <austin@google.com>
|
|
Instead of building a freelist from the mark bits generated
by the GC this CL allocates directly from the mark bits.
The approach moves the mark bits from the pointer/no pointer
heap structures into their own per span data structures. The
mark/allocation vectors consist of a single mark bit per
object. Two vectors are maintained, one for allocation and
one for the GC's mark phase. During the GC cycle's sweep
phase the interpretation of the vectors is swapped. The
mark vector becomes the allocation vector and the old
allocation vector is cleared and becomes the mark vector that
the next GC cycle will use.
Marked entries in the allocation vector indicate that the
object is not free. Each allocation vector maintains a boundary
between areas of the span already allocated from and areas
not yet allocated from. As objects are allocated this boundary
is moved until it reaches the end of the span. At this point
further allocations will be done from another span.
Since we no longer sweep a span inspecting each freed object
the responsibility for maintaining pointer/scalar bits in
the heapBitMap containing is now the responsibility of the
the routines doing the actual allocation.
This CL is functionally complete and ready for performance
tuning.
Change-Id: I336e0fc21eef1066e0b68c7067cc71b9f3d50e04
Reviewed-on: https://go-review.googlesource.com/19470
Reviewed-by: Austin Clements <austin@google.com>
|
|
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Currently, we update memstats.heap_live from mcache.local_cachealloc
whenever we lock the heap (e.g., to obtain a fresh span or to release
an unused span). However, under the right circumstances,
local_cachealloc can accumulate allocations up to the size of
the *entire heap* without flushing them to heap_live. Specifically,
since span allocations from an mcentral don't lock the heap, if a
large number of pages are held in an mcentral and the application
continues to use and free objects of that size class (e.g., the
BinaryTree17 benchmark), local_cachealloc won't be flushed until the
mcentral runs out of spans.
This is a problem because, unlike many of the memory statistics that
are purely informative, heap_live is used to determine when the
garbage collector should start and how hard it should work.
This commit eliminates local_cachealloc, instead atomically updating
heap_live directly. To control contention, we do this only when
obtaining a span from an mcentral. Furthermore, we make heap_live
conservative: allocating a span assumes that all free slots in that
span will be used and accounts for these when the span is
allocated, *before* the objects themselves are. This is important
because 1) this triggers the GC earlier than necessary rather than
potentially too late and 2) this leads to a conservative GC rate
rather than a GC rate that is potentially too low.
Alternatively, we could have flushed local_cachealloc when it passed
some threshold, but this would require determining a threshold and
would cause heap_live to underestimate the true value rather than
overestimate.
Fixes #12199.
name old time/op new time/op delta
BinaryTree17-12 2.88s ± 4% 2.88s ± 1% ~ (p=0.470 n=19+19)
Fannkuch11-12 2.48s ± 1% 2.48s ± 1% ~ (p=0.243 n=16+19)
FmtFprintfEmpty-12 50.9ns ± 2% 50.7ns ± 1% ~ (p=0.238 n=15+14)
FmtFprintfString-12 175ns ± 1% 171ns ± 1% -2.48% (p=0.000 n=18+18)
FmtFprintfInt-12 159ns ± 1% 158ns ± 1% -0.78% (p=0.000 n=19+18)
FmtFprintfIntInt-12 270ns ± 1% 265ns ± 2% -1.67% (p=0.000 n=18+18)
FmtFprintfPrefixedInt-12 235ns ± 1% 234ns ± 0% ~ (p=0.362 n=18+19)
FmtFprintfFloat-12 309ns ± 1% 308ns ± 1% -0.41% (p=0.001 n=18+19)
FmtManyArgs-12 1.10µs ± 1% 1.08µs ± 0% -1.96% (p=0.000 n=19+18)
GobDecode-12 7.81ms ± 1% 7.80ms ± 1% ~ (p=0.425 n=18+19)
GobEncode-12 6.53ms ± 1% 6.53ms ± 1% ~ (p=0.817 n=19+19)
Gzip-12 312ms ± 1% 312ms ± 2% ~ (p=0.967 n=19+20)
Gunzip-12 42.0ms ± 1% 41.9ms ± 1% ~ (p=0.172 n=19+19)
HTTPClientServer-12 63.7µs ± 1% 63.8µs ± 1% ~ (p=0.639 n=19+19)
JSONEncode-12 16.4ms ± 1% 16.4ms ± 1% ~ (p=0.954 n=19+19)
JSONDecode-12 58.5ms ± 1% 57.8ms ± 1% -1.27% (p=0.000 n=18+19)
Mandelbrot200-12 3.86ms ± 1% 3.88ms ± 0% +0.44% (p=0.000 n=18+18)
GoParse-12 3.67ms ± 2% 3.66ms ± 1% -0.52% (p=0.001 n=18+19)
RegexpMatchEasy0_32-12 100ns ± 1% 100ns ± 0% ~ (p=0.257 n=19+18)
RegexpMatchEasy0_1K-12 347ns ± 1% 347ns ± 1% ~ (p=0.527 n=18+18)
RegexpMatchEasy1_32-12 83.7ns ± 2% 83.1ns ± 2% ~ (p=0.096 n=18+19)
RegexpMatchEasy1_1K-12 509ns ± 1% 505ns ± 1% -0.75% (p=0.000 n=18+19)
RegexpMatchMedium_32-12 130ns ± 2% 129ns ± 1% ~ (p=0.962 n=20+20)
RegexpMatchMedium_1K-12 39.5µs ± 2% 39.4µs ± 1% ~ (p=0.376 n=20+19)
RegexpMatchHard_32-12 2.04µs ± 0% 2.04µs ± 1% ~ (p=0.195 n=18+17)
RegexpMatchHard_1K-12 61.4µs ± 1% 61.4µs ± 1% ~ (p=0.885 n=19+19)
Revcomp-12 540ms ± 2% 542ms ± 4% ~ (p=0.552 n=19+17)
Template-12 69.6ms ± 1% 71.2ms ± 1% +2.39% (p=0.000 n=20+20)
TimeParse-12 357ns ± 1% 357ns ± 1% ~ (p=0.883 n=18+20)
TimeFormat-12 379ns ± 1% 362ns ± 1% -4.53% (p=0.000 n=18+19)
[Geo mean] 62.0µs 61.8µs -0.44%
name old time/op new time/op delta
XBenchGarbage-12 5.89ms ± 2% 5.81ms ± 2% -1.41% (p=0.000 n=19+18)
Change-Id: I96b31cca6ae77c30693a891cff3fe663fa2447a0
Reviewed-on: https://go-review.googlesource.com/17748
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
mcache.tiny is in non-GC'd memory, but points to heap memory. As a
result, there may or may not be write barriers when writing to
mcache.tiny. Make it clearer that funny things are going on by making
mcache.tiny a uintptr instead of an unsafe.Pointer.
Change-Id: I732a5b7ea17162f196a9155154bbaff8d4d00eac
Reviewed-on: https://go-review.googlesource.com/16963
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The tiny alloc cache is maintained in a pointer from non-GC'd memory
(mcache) to heap memory and hence must be handled carefully.
Currently we clear the tiny alloc cache during sweep termination and,
if it is assigned to a non-nil value during concurrent marking, we
depend on a write barrier to keep the new value alive. However, while
the compiler currently always generates this write barrier, we're
treading on thin ice because write barriers may not happen for writes
to non-heap memory (e.g., typedmemmove). Without this lucky write
barrier, the GC may free a current tiny block while it's still
reachable by the tiny allocator, leading to later memory corruption.
Change this code so that, rather than depending on the write barrier,
we simply clear the tiny cache during mark termination when we're
clearing all of the other mcaches. If the current tiny block is
reachable from regular pointers, it will be retained; if it isn't
reachable from regular pointers, it may be freed, but that's okay
because there won't be any pointers in non-GC'd memory to it.
Change-Id: I8230980d8612c35c2997b9705641a1f9f865f879
Reviewed-on: https://go-review.googlesource.com/16962
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Applies to types fixAlloc, mCache, mCentral, mHeap, mSpan, and
mSpanList.
Two special cases:
1. mHeap_Scavenge() previously didn't take an *mheap parameter, so it
was specially handled in this CL.
2. mHeap_Free() would have collided with mheap's "free" field, so it's
been renamed to (*mheap).freeSpan to parallel its underlying
(*mheap).freeSpanLocked method.
Change-Id: I325938554cca432c166fe9d9d689af2bbd68de4b
Reviewed-on: https://go-review.googlesource.com/16221
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The current heap sampling introduces some bias that interferes
with unsampling, producing unexpected heap profiles.
The solution is to use a Poisson process to generate the
sampling points, using the formulas described at
https://en.wikipedia.org/wiki/Poisson_process
This fixes #12620
Change-Id: If2400809ed3c41de504dd6cff06be14e476ff96c
Reviewed-on: https://go-review.googlesource.com/14590
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This tracks the number of scannable bytes in the allocated heap. That
is, bytes that the garbage collector must scan before reaching the
last pointer field in each object.
This will be used to compute a more robust estimate of the GC scan
work.
Change-Id: I1eecd45ef9cdd65b69d2afb5db5da885c80086bb
Reviewed-on: https://go-review.googlesource.com/9695
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
This field used to decrease with sweeps (and potentially go
negative). Now it is always zero or positive, so change it to a
uintptr so it meshes better with other memory stats.
Change-Id: I6a50a956ddc6077eeaf92011c51743cb69540a3c
Reviewed-on: https://go-review.googlesource.com/8899
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Currently there are two main consumers of memstats.heap_alloc:
updatememstats (aka ReadMemStats) and shouldtriggergc.
updatememstats recomputes heap_alloc from the ground up, so we don't
need to keep heap_alloc up to date for it. shouldtriggergc wants to
know how many bytes were marked by the previous GC plus how many bytes
have been allocated since then, but this *isn't* what heap_alloc
tracks. heap_alloc also includes objects that are not marked and
haven't yet been swept.
Introduce a new memstat called heap_live that actually tracks what
shouldtriggergc wants to know and stop keeping heap_alloc up to date.
Unlike heap_alloc, heap_live follows a simple sawtooth that drops
during each mark termination and increases monotonically between GCs.
heap_alloc, on the other hand, has much more complicated behavior: it
may drop during sweep termination, slowly decreases from background
sweeping between GCs, is roughly unaffected by allocation as long as
there are unswept spans (because we sweep and allocate at the same
rate), and may go up after background sweeping is done depending on
the GC trigger.
heap_live simplifies computing next_gc and using it to figure out when
to trigger garbage collection. Currently, we guess next_gc at the end
of a cycle and update it as we sweep and get a better idea of how much
heap was marked. Now, since we're directly tracking how much heap is
marked, we can directly compute next_gc.
This also corrects bugs that could cause us to trigger GC early.
Currently, in any case where sweep termination actually finds spans to
sweep, heap_alloc is an overestimation of live heap, so we'll trigger
GC too early. heap_live, on the other hand, is unaffected by sweeping.
Change-Id: I1f96807b6ed60d4156e8173a8e68745ffc742388
Reviewed-on: https://go-review.googlesource.com/8389
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The unbounded list-based sudog cache can grow infinitely.
This can happen if a goroutine is routinely blocked on one P
and then unblocked and scheduled on another P.
The scenario was reported on golang-nuts list.
We've been here several times. Any unbounded local caches
are bad and grow to infinite size. This change introduces
central sudog cache; local caches become fixed-size
with the only purpose of amortizing accesses to the
central cache.
The change required to move sudog cache from mcache to P,
because mcache is not scanned by GC.
Change-Id: I3bb7b14710354c026dcba28b3d3c8936a8db4e90
Reviewed-on: https://go-review.googlesource.com/3742
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
|
|
Move code from malloc1.go, malloc2.go, mem.go, mgc0.go into
appropriate locations.
Factor mgc.go into mgc.go, mgcmark.go, mgcsweep.go, mstats.go.
A lot of this code was in certain files because the right place was in
a C file but it was written in Go, or vice versa. This is one step toward
making things actually well-organized again.
Change-Id: I6741deb88a7cfb1c17ffe0bcca3989e10207968f
Reviewed-on: https://go-review.googlesource.com/5300
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
Rename "gothrow" to "throw" now that the C version of "throw"
is no longer needed.
This change is purely mechanical except in panic.go where the
old version of "throw" has been deleted.
sed -i "" 's/[[:<:]]gothrow[[:>:]]/throw/g' runtime/*.go
Change-Id: Icf0752299c35958b92870a97111c67bcd9159dc3
Reviewed-on: https://go-review.googlesource.com/2150
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
|
|
barriers for GC internal structures such as free lists.
LGTM=rsc
R=rsc
CC=golang-codereviews, rsc
https://golang.org/cl/179000043
|
|
The garbage collector is now written in Go.
There is plenty to clean up (just like on dev.cc).
all.bash passes on darwin/amd64, darwin/386, linux/amd64, linux/386.
TBR=rlh
R=austin, rlh, bradfitz
CC=golang-codereviews
https://golang.org/cl/173250043
|
|
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalarg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).
Scalararg and ptrarg are also untyped and therefore error-prone.
Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.
For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.
Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).
The name onM and the phrase 'm stack' are misnomers,
because on most system an M has two system stacks:
the main thread stack and the signal handling stack.
Correct the misnomer by naming the replacement function systemstack.
Fix a few references to "M stack" in code.
The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.
This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)
LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
|
|
The conversion was done with an automated tool and then
modified only as necessary to make it compile and run.
[This CL is part of the removal of C code from package runtime.
See golang.org/s/dev.cc for an overview.]
LGTM=r
R=r
CC=austin, dvyukov, golang-codereviews, iant, khr
https://golang.org/cl/167540043
|