author    Michael Anthony Knyszek <mknyszek@google.com>  2023-09-29 19:16:38 +0000
committer Gopher Robot <gobot@golang.org>                2023-10-12 23:27:00 +0000
commit    ef6993f3270c95e1158b11d9fc1a047528b34da6
tree      a2e36af713cc2687db08744f7460f759b043c844
parent    bae01521f3ab27979b454f2ecc77ff9403965957
[release-branch.go1.21] runtime: don't eagerly collapse hugepages
This has caused performance issues in production environments.

MADV_COLLAPSE can go into direct reclaim, but we call it with the heap
lock held. This means that the process could end up stalled fairly
quickly if just one allocating goroutine ends up in the madvise call,
at least until the madvise(MADV_COLLAPSE) call returns. A similar issue
occurred with madvise(MADV_HUGEPAGE), because that could go into direct
reclaim on any page fault for MADV_HUGEPAGE-marked memory.

My understanding was that the calls to madvise(MADV_COLLAPSE) were
fairly rare, and its "best-effort" nature prevented it from going into
direct reclaim often, but this was wrong. It tends to be fairly
heavyweight even when it doesn't end up in direct reclaim, and it's
almost certainly not worth it.

Disable it until further notice and let the kernel fully dictate
hugepage policy. The updated scavenger policy is still more hugepage
friendly by delaying scavenging until hugepages are no longer densely
packed, so we don't lose all that much.

The Sweet benchmarks show a minimal difference. A couple of less
realistic benchmarks seem to slow down a bit; they might just be
getting unlucky with what the kernel decides to back with a huge page.
Some benchmarks, on the other hand, improve. Overall, it's a wash.

name                  old time/op            new time/op            delta
BiogoIgor             13.1s ± 1%             13.2s ± 2%               ~     (p=0.182 n=9+10)
BiogoKrishna          12.0s ± 1%             12.1s ± 1%             +1.23%  (p=0.002 n=9+10)
BleveIndexBatch100    4.51s ± 4%             4.56s ± 3%               ~     (p=0.393 n=10+10)
EtcdPut              20.2ms ± 4%            19.8ms ± 2%               ~     (p=0.079 n=10+9)
EtcdSTM               109ms ± 3%             111ms ± 3%             +1.63%  (p=0.035 n=10+10)
GoBuildKubelet        31.2s ± 1%             31.3s ± 1%               ~     (p=0.780 n=9+10)
GoBuildKubeletLink    7.77s ± 0%             7.81s ± 2%               ~     (p=0.237 n=8+10)
GoBuildIstioctl       31.8s ± 1%             31.7s ± 0%               ~     (p=0.136 n=9+9)
GoBuildIstioctlLink   7.88s ± 1%             7.89s ± 1%               ~     (p=0.720 n=9+10)
GoBuildFrontend       11.7s ± 1%             11.8s ± 1%               ~     (p=0.278 n=10+9)
GoBuildFrontendLink   1.15s ± 4%             1.15s ± 5%               ~     (p=0.387 n=9+9)
GopherLuaKNucleotide  19.7s ± 1%             20.6s ± 0%             +4.48%  (p=0.000 n=10+10)
MarkdownRenderXHTML   194ms ± 3%             196ms ± 3%               ~     (p=0.356 n=9+10)
Tile38QueryLoad       633µs ± 2%             629µs ± 2%               ~     (p=0.075 n=10+10)

name                  old average-RSS-bytes  new average-RSS-bytes  delta
BiogoIgor            69.2MB ± 3%            68.4MB ± 1%               ~     (p=0.190 n=10+10)
BiogoKrishna         4.40GB ± 0%            4.40GB ± 0%               ~     (p=0.605 n=9+9)
BleveIndexBatch100    195MB ± 3%             195MB ± 2%               ~     (p=0.853 n=10+10)
EtcdPut               107MB ± 4%             108MB ± 3%               ~     (p=0.190 n=10+10)
EtcdSTM              91.6MB ± 5%            92.6MB ± 4%               ~     (p=0.481 n=10+10)
GoBuildKubelet       2.26GB ± 1%            2.28GB ± 1%             +1.22%  (p=0.000 n=10+10)
GoBuildIstioctl      1.53GB ± 0%            1.53GB ± 0%             +0.21%  (p=0.017 n=9+10)
GoBuildFrontend       556MB ± 1%             554MB ± 2%               ~     (p=0.497 n=9+10)
GopherLuaKNucleotide 39.0MB ± 3%            39.0MB ± 1%               ~     (p=1.000 n=10+8)
MarkdownRenderXHTML  21.2MB ± 2%            21.4MB ± 3%               ~     (p=0.190 n=10+10)
Tile38QueryLoad      5.99GB ± 2%            6.02GB ± 0%               ~     (p=0.243 n=10+9)

name                  old peak-RSS-bytes     new peak-RSS-bytes     delta
BiogoIgor            90.2MB ± 4%            89.2MB ± 2%               ~     (p=0.143 n=10+10)
BiogoKrishna         4.49GB ± 0%            4.49GB ± 0%               ~     (p=0.190 n=10+10)
BleveIndexBatch100    283MB ± 8%             274MB ± 6%               ~     (p=0.075 n=10+10)
EtcdPut               147MB ± 4%             149MB ± 2%             +1.55%  (p=0.034 n=10+8)
EtcdSTM               117MB ± 5%             117MB ± 4%               ~     (p=0.905 n=9+10)
GopherLuaKNucleotide 44.9MB ± 1%            44.6MB ± 1%               ~     (p=0.083 n=8+8)
MarkdownRenderXHTML  22.0MB ± 8%            22.1MB ± 9%               ~     (p=0.436 n=10+10)
Tile38QueryLoad      6.24GB ± 2%            6.29GB ± 2%               ~     (p=0.218 n=10+10)

name                  old peak-VM-bytes      new peak-VM-bytes      delta
BiogoIgor            1.33GB ± 0%            1.33GB ± 0%               ~     (p=0.504 n=10+9)
BiogoKrishna         5.77GB ± 0%            5.77GB ± 0%               ~     (p=1.000 n=10+9)
BleveIndexBatch100   3.53GB ± 0%            3.53GB ± 0%               ~     (p=0.642 n=10+10)
EtcdPut              12.1GB ± 0%            12.1GB ± 0%               ~     (p=0.564 n=10+10)
EtcdSTM              12.1GB ± 0%            12.1GB ± 0%               ~     (p=0.633 n=10+10)
GopherLuaKNucleotide 1.26GB ± 0%            1.26GB ± 0%               ~     (p=0.297 n=9+10)
MarkdownRenderXHTML  1.26GB ± 0%            1.26GB ± 0%               ~     (p=0.069 n=10+10)
Tile38QueryLoad      7.47GB ± 2%            7.53GB ± 2%               ~     (p=0.280 n=10+10)

name                  old p50-latency-ns     new p50-latency-ns     delta
EtcdPut               19.8M ± 5%             19.3M ± 3%             -2.74%  (p=0.043 n=10+9)
EtcdSTM               81.4M ± 4%             83.4M ± 4%             +2.46%  (p=0.029 n=10+10)
Tile38QueryLoad        241k ± 1%              240k ± 1%               ~     (p=0.393 n=10+10)

name                  old p90-latency-ns     new p90-latency-ns     delta
EtcdPut               30.4M ± 5%             30.6M ± 5%               ~     (p=0.971 n=10+10)
EtcdSTM                222M ± 3%              226M ± 4%               ~     (p=0.063 n=10+10)
Tile38QueryLoad        687k ± 2%              691k ± 1%               ~     (p=0.173 n=10+8)

name                  old p99-latency-ns     new p99-latency-ns     delta
EtcdPut               42.3M ±10%             41.4M ± 7%               ~     (p=0.353 n=10+10)
EtcdSTM                486M ± 7%              487M ± 4%               ~     (p=0.579 n=10+10)
Tile38QueryLoad       6.43M ± 2%             6.37M ± 3%               ~     (p=0.280 n=10+10)

name                  old ops/s              new ops/s              delta
EtcdPut               48.6k ± 3%             49.5k ± 2%               ~     (p=0.065 n=10+9)
EtcdSTM               9.09k ± 2%             8.95k ± 3%             -1.56%  (p=0.045 n=10+10)
Tile38QueryLoad       28.4k ± 1%             28.6k ± 1%             +0.87%  (p=0.016 n=9+10)

Fixes #63335.
For #63334.
Related to #61718 and #59960.

Change-Id: If84c5a8685825d43c912a71418f2597e44e867e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/531816
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
(cherry picked from commit 595deec3dda8e81d514389efdbb4ee2bc38dcabe)
Reviewed-on: https://go-review.googlesource.com/c/go/+/532255
Auto-Submit: Dmitri Shuralyov <dmitshur@google.com>
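For context on the primitive involved, here is a minimal standalone sketch
(not part of this CL, and not the runtime's implementation) of roughly what
sysHugePageCollapse reduces to on Linux: madvise(2) with MADV_COLLAPSE on a
mapped region. The constant value is written out because the syscall package
predates MADV_COLLAPSE (Linux 6.1+); everything else uses standard library
calls only. The comments spell out why issuing this call under a lock, as
the runtime did with the heap lock, is hazardous.

    package main

    import (
    	"fmt"
    	"syscall"
    	"unsafe"
    )

    // madvCollapse is MADV_COLLAPSE (Linux 6.1+). The syscall package
    // has no named constant for it, so the raw value is used here.
    const madvCollapse = 25

    // collapse asks the kernel to synchronously back b with huge pages.
    // The request is best-effort, but it can still enter direct reclaim
    // and block for a long time. Holding a lock across this call can
    // therefore stall every other thread waiting on that lock.
    func collapse(b []byte) error {
    	_, _, errno := syscall.Syscall(syscall.SYS_MADVISE,
    		uintptr(unsafe.Pointer(&b[0])), uintptr(len(b)), madvCollapse)
    	if errno != 0 {
    		return errno
    	}
    	return nil
    }

    func main() {
    	const size = 2 << 20 // one 2 MiB huge page
    	b, err := syscall.Mmap(-1, 0, size,
    		syscall.PROT_READ|syscall.PROT_WRITE,
    		syscall.MAP_PRIVATE|syscall.MAP_ANONYMOUS)
    	if err != nil {
    		panic(err)
    	}
    	b[0] = 1 // MADV_COLLAPSE requires part of the range to be paged in
    	fmt.Println("collapse:", collapse(b))
    }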
 src/runtime/mgcscavenge.go | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/src/runtime/mgcscavenge.go b/src/runtime/mgcscavenge.go
index 4c6d6be4f0..659ca8df2e 100644
--- a/src/runtime/mgcscavenge.go
+++ b/src/runtime/mgcscavenge.go
@@ -1145,21 +1145,11 @@ func (s *scavengeIndex) alloc(ci chunkIdx, npages uint) {
 		// Mark that we're considering this chunk as backed by huge pages.
 		sc.setHugePage()
 
-		// Collapse dense chunks into huge pages and mark that
-		// we did that, but only if we're not allocating to
-		// use the entire chunk. If we're allocating an entire chunk,
-		// this is likely part of a much bigger allocation. For
-		// instance, if the caller is allocating a 1 GiB slice of bytes, we
-		// don't want to go and manually collapse all those pages; we want
-		// them to be demand-paged. If the caller is actually going to use
-		// all that memory, it'll naturally get backed by huge pages later.
-		//
-		// This also avoids having sysHugePageCollapse fail. On Linux,
-		// the call requires that some part of the huge page being collapsed
-		// is already paged in.
-		if !s.test && npages < pallocChunkPages {
-			sysHugePageCollapse(unsafe.Pointer(chunkBase(ci)), pallocChunkBytes)
-		}
+		// TODO(mknyszek): Consider eagerly backing memory with huge pages
+		// here. In the past we've attempted to use sysHugePageCollapse
+		// (which uses MADV_COLLAPSE on Linux, and is unsupported elsewhere)
+		// for this purpose, but that caused performance issues in production
+		// environments.
 	}
 	s.chunks[ci].store(sc)
 }
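The scavenger policy the commit message leans on ("delaying scavenging until
hugepages are no longer densely packed") amounts to a density gate on each
chunk. The toy sketch below is illustrative only: shouldScavenge and the 96%
threshold are invented for this example and are not the runtime's actual
heuristic; pagesPerChunk mirrors the runtime's pallocChunkPages (512), but
the real policy lives in src/runtime/mgcscavenge.go.

    package main

    import "fmt"

    const (
    	pagesPerChunk = 512 // mirrors the runtime's pallocChunkPages
    	// Assumed threshold for illustration: treat a chunk as "dense"
    	// (worth leaving intact for huge pages) at >= 96% occupancy.
    	densePages = pagesPerChunk * 96 / 100
    )

    // shouldScavenge is a hypothetical gate: only return a chunk's free
    // pages to the OS once the chunk is no longer densely packed, so we
    // avoid splitting a huge page the kernel may have installed there.
    func shouldScavenge(allocatedPages int) bool {
    	return allocatedPages < densePages
    }

    func main() {
    	for _, used := range []int{512, 500, 480, 64} {
    		fmt.Printf("chunk %3d/%d pages used: scavenge=%v\n",
    			used, pagesPerChunk, shouldScavenge(used))
    	}
    }

The point of the gate is that a nearly full chunk is exactly the memory the
kernel is most likely to back with a huge page, so holding off on scavenging
it preserves that backing without the runtime ever calling MADV_COLLAPSE.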