go - The Go programming language

Age	Commit message (Collapse)	Author
2022-02-18	[release-branch.go1.16] runtime: simplify histogram buckets considerably	Michael Anthony Knyszek
	There was an off-by-one error in the time histogram buckets calculation that caused the linear sub-buckets distances to be off by 2x. The fix was trivial, but in writing tests I realized there was a much simpler way to express the calculation for the histogram buckets, and took the opportunity to do that here. The new bucket calculation also fixes the bug. For #50732. Fixes #50733. Change-Id: Idae89986de1c415ee4e148f778e0e101ca003ade Reviewed-on: https://go-review.googlesource.com/c/go/+/380094 Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com> Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> (cherry picked from commit 2e9dcb508647dc473a37ecfa244d2bc4a1843ab4) Reviewed-on: https://go-review.googlesource.com/c/go/+/384620
2021-12-22	[release-branch.go1.16] runtime: set iOS addr space to 40 bits with ↵	Michael Anthony Knyszek
	incremental pagealloc In iOS <14, the address space is strictly limited to 8 GiB, or 33 bits. As a result, the page allocator also assumes all heap memory lives in this region. This is especially necessary because the page allocator has a PROT_NONE mapping proportional to the size of the usable address space, so this keeps that mapping very small. However starting with iOS 14, this restriction is relaxed, and mmap may start returning addresses outside of the <14 range. Today this means that in iOS 14 and later, users experience an error in the page allocator when a heap arena is mapped outside of the old range. This change increases the ios/arm64 heapAddrBits to 40 while simultaneously making ios/arm64 use the 64-bit pagealloc implementation (with reservations and incremental mapping) to accommodate both iOS versions <14 and 14+. Once iOS <14 is deprecated, we can remove these exceptions and treat ios/arm64 like any other arm64 platform. This change also makes the BaseChunkIdx expression a little bit easier to read, while we're here. For #46860. Fixes #48115. Change-Id: I13865f799777739109585f14f1cc49d6d57e096b Reviewed-on: https://go-review.googlesource.com/c/go/+/344401 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com> (cherry picked from commit af368da0b137116faba81ca249a8d964297e6e45) Reviewed-on: https://go-review.googlesource.com/c/go/+/369736 Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
2020-12-23	runtime: shift timeHistogram buckets and allow negative durations	Michael Anthony Knyszek
	Today, timeHistogram, when copied, has the wrong set of counts for the bucket that should represent (-inf, 0), when in fact it contains [0, 1). In essence, the buckets are all shifted over by one from where they're supposed to be. But this also means that the existence of the overflow bucket is wrong: the top bucket is supposed to extend to infinity, and what we're really missing is an underflow bucket to represent the range (-inf, 0). We could just always zero this bucket and continue ignoring negative durations, but that likely isn't prudent. timeHistogram is intended to be used with differences in nanotime, but depending on how a platform is implemented (or due to a bug in that platform) it's possible to get a negative duration without having done anything wrong. We should just be resilient to that and be able to detect it. So this change removes the overflow bucket and replaces it with an underflow bucket, and timeHistogram no longer panics when faced with a negative duration. Fixes #43328. Fixes #43329. Change-Id: If336425d7d080fd37bf071e18746800e22d38108 Reviewed-on: https://go-review.googlesource.com/c/go/+/279468 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-30	runtime: add heap lock assertions	Michael Pratt
	Some functions that required holding the heap lock _or_ world stop have been simplified to simply requiring the heap lock. This is conceptually simpler and taking the heap lock during world stop is guaranteed to not contend. This was only done on functions already called on the systemstack to avoid too many extra systemstack calls in GC. Updates #40677 Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc Reviewed-on: https://go-review.googlesource.com/c/go/+/250262 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Trust: Michael Pratt <mpratt@google.com>
2020-10-26	runtime: add timeHistogram type	Michael Anthony Knyszek
	This change adds a concurrent HDR time histogram to the runtime with tests. It also adds a function to generate boundaries for use by the metrics package. For #37112. Change-Id: Ifbef8ddce8e3a965a0dcd58ccd4915c282ae2098 Reviewed-on: https://go-review.googlesource.com/c/go/+/247046 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26	runtime,runtime/metrics: add memory metrics	Michael Anthony Knyszek
	This change adds support for a variety of runtime memory metrics and contains the base implementation of Read for the runtime/metrics package, which lives in the runtime. It also adds testing infrastructure for the metrics package, and a bunch of format and documentation tests. For #37112. Change-Id: I16a2c4781eeeb2de0abcb045c15105f1210e2d8a Reviewed-on: https://go-review.googlesource.com/c/go/+/247041 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Trust: Michael Knyszek <mknyszek@google.com>
2020-10-26	runtime: move malloc stats into consistentHeapStats	Michael Anthony Knyszek
	This change moves the mcache-local malloc stats into the consistentHeapStats structure so the malloc stats can be managed consistently with the memory stats. The one exception here is tinyAllocs for which moving that into the global stats would incur several atomic writes on the fast path. Microbenchmarks for just one CPU core have shown a 50% loss in throughput. Since tiny allocation counnt isn't exposed anyway and is always blindly added to both allocs and frees, let that stay inconsistent and flush the tiny allocation count every so often. Change-Id: I2a4b75f209c0e659b9c0db081a3287bf227c10ca Reviewed-on: https://go-review.googlesource.com/c/go/+/247039 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26	runtime: delineate which memstats are system stats with a type	Michael Anthony Knyszek
	This change modifies the type of several mstats fields to be a new type: sysMemStat. This type has the same structure as the fields used to have. The purpose of this change is to make it very clear which stats may be used in various functions for accounting (usually the platform-specific sys* functions, but there are others). Currently there's an implicit understanding that the *uint64 value passed to these functions is some kind of statistic whose value is atomically managed. This understanding isn't inherently problematic, but we're about to change how some stats (which currently use mSysStatInc and mSysStatDec) work, so we want to make it very clear what the various requirements are around "sysStat". This change also removes mSysStatInc and mSysStatDec in favor of a method on sysMemStat. Note that those two functions were originally written the way they were because atomic 64-bit adds required a valid G on ARM, but this hasn't been the case for a very long time (since golang.org/cl/14204, but even before then it wasn't clear if mutexes required a valid G anymore). Today we implement 64-bit adds on ARM with a spinlock table. Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7 Reviewed-on: https://go-review.googlesource.com/c/go/+/246971 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26	runtime: rename mcache fields to match Go style	Michael Anthony Knyszek
	This change renames a bunch of malloc statistics stored in the mcache that are all named with the "local_" prefix. It also renames largeAlloc to allocLarge to prevent a naming conflict, and next_sample because it would be the last mcache field with the old C naming style. Change-Id: I29695cb83b397a435ede7e9ad5c3c9be72767ea3 Reviewed-on: https://go-review.googlesource.com/c/go/+/246969 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26	runtime: don't flush local_tinyallocs	Michael Anthony Knyszek
	This change makes local_tinyallocs work like the rest of the malloc stats and doesn't flush local_tinyallocs, instead making that the source-of-truth. Change-Id: I3e6cb5f1b3d086e432ce7d456895511a48e3617a Reviewed-on: https://go-review.googlesource.com/c/go/+/246967 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26	runtime: make distributed/local malloc stats the source-of-truth	Michael Anthony Knyszek
	This change makes it so that various local malloc stats (excluding heap_scan and local_tinyallocs) are no longer written first to mheap fields but are instead accessed directly from each mcache. This change is part of a move toward having stats be distributed, and cleaning up some old code related to the stats. Note that because there's no central source-of-truth, when an mcache dies, it must donate its stats to another mcache. It's always safe to donate to the mcache for the 0th P, so do that. Change-Id: I2556093dbc27357cb9621c9b97671f3c00aa1173 Reviewed-on: https://go-review.googlesource.com/c/go/+/246964 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-23	runtime: add tests for addrRanges.add	Michael Anthony Knyszek
	Change-Id: I249deb482df74068b0538e9d773b9a87bc5a6df3 Reviewed-on: https://go-review.googlesource.com/c/go/+/242681 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-23	runtime: add tests for addrRanges.findSucc	Michael Anthony Knyszek
	This change adds a test suite for addrRanges.findSucc so we can change the implementation more safely. For #40191. Change-Id: I14a834b6d54836cbc676eb0edb292ba6176705cc Reviewed-on: https://go-review.googlesource.com/c/go/+/242678 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-22	runtime: define the AddrRange used for testing in terms of addrRange	Michael Anthony Knyszek
	Currently the AddrRange used for testing is defined separately from addrRange in the runtime, making it difficult to test it as well as addrRanges. Redefine AddrRange in terms of addrRange instead. For #40191. Change-Id: I3aa5b8df3e4c9a3c494b46ab802dd574b2488141 Reviewed-on: https://go-review.googlesource.com/c/go/+/242677 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Austin Clements <austin@google.com>
2020-09-16	runtime: actually fix locking in BenchmarkMSpanCountAlloc	Michael Anthony Knyszek
	I just submitted CL 255297 which mostly fixed this problem, but totally forgot to actually acquire/release the heap lock. Oops. Updates #41391. Change-Id: I45b42f20a9fc765c4de52476db3654d4bfe9feb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/255298 Trust: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org>
2020-09-16	runtime: fix leak and locking in BenchmarkMSpanCountAlloc	Michael Anthony Knyszek
	CL 249917 made the mspan in MSpanCountAlloc no longer stack-allocated (for good reason), but then allocated an mspan on each call and did not free it, resulting in a leak. That allocation was also not protected by the heap lock, which could lead to data corruption of mheap fields and the spanalloc. To fix this, export some functions to allocate/free dummy mspans from spanalloc (with proper locking) and allocate just one up-front for the benchmark, freeing it at the end. Then, update MSpanCountAlloc to accept a dummy mspan. Note that we need to allocate the dummy mspan up-front otherwise we measure things like heap locking and fixalloc performance instead of what we actually want to measure: how fast we can do a popcount on the mark bits. Fixes #41391. Change-Id: If6629a6ec1ece639c7fb78532045837a8c872c04 Reviewed-on: https://go-review.googlesource.com/c/go/+/255297 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Trust: Michael Knyszek <mknyszek@google.com>
2020-09-09	runtime: fix ReadMemStatsSlow's and CheckScavengedBits' chunk iteration	Michael Anthony Knyszek
	Both ReadMemStatsSlow and CheckScavengedBits iterate over the page allocator's chunks but don't actually check if they exist. During the development process the chunks index became sparse, so now this was a possibility. If the runtime tests' heap is sparse we might end up segfaulting in either one of these functions, though this will generally be very rare. The pattern here to return nil for a nonexistent chunk is also useful elsewhere, so this change introduces tryChunkOf which won't throw, but might return nil. It also updates the documentation of chunkOf. Fixes #41296. Change-Id: Id5ae0ca3234480de1724fdf2e3677eeedcf76fa0 Reviewed-on: https://go-review.googlesource.com/c/go/+/253777 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-08-25	cmd/compile: don't allow go:notinheap on the heap or stack	Keith Randall
	Right now we just prevent such types from being on the heap. This CL makes it so they cannot appear on the stack either. The distinction between heap and stack is pretty vague at the language level (e.g. it is affected by -N), and we don't need the flexibility anyway. Once go:notinheap types cannot be in either place, we don't need to consider pointers to such types to be pointers, at least according to the garbage collector and stack copying. (This is the big win of this CL, in my opinion.) The distinction between HasPointers and HasHeapPointer no longer exists. There is only HasPointers. This CL is cleanup before possible use of go:notinheap to fix #40954. Update #13386 Change-Id: Ibd895aadf001c0385078a6d4809c3f374991231a Reviewed-on: https://go-review.googlesource.com/c/go/+/249917 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2020-08-13	runtime: revert signal stack mlocking	Austin Clements
	Go 1.14 included a (rather awful) workaround for a Linux kernel bug that corrupted vector registers on x86 CPUs during signal delivery (https://bugzilla.kernel.org/show_bug.cgi?id=205663). This bug was introduced in Linux 5.2 and fixed in 5.3.15, 5.4.2 and all 5.5 and later kernels. The fix was also back-ported by major distros. This workaround was necessary, but had unfortunate downsides, including causing Go programs to exceed the mlock ulimit in many configurations (#37436). We're reasonably confident that by the Go 1.16 release, the number of systems running affected kernels will be vanishingly small. Hence, this CL removes this workaround. This effectively reverts CLs 209597 (version parser), 209899 (mlock top of signal stack), 210299 (better failure message), 223121 (soft mlock failure handling), and 244059 (special-case patched Ubuntu kernels). The one thing we keep is the osArchInit function. It's empty everywhere now, but is a reasonable hook to have. Updates #35326, #35777 (the original register corruption bugs). Updates #40184 (request to revert in 1.15). Fixes #35979. Change-Id: Ie213270837095576f1f3ef46bf3de187dc486c50 Reviewed-on: https://go-review.googlesource.com/c/go/+/246200 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
2020-05-14	runtime: make maxOffAddr reflect the actual address space upper bound	Michael Anthony Knyszek
	Currently maxOffAddr is defined in terms of the whole 64-bit address space, assuming that it's all supported, by using ^uintptr(0) as the maximal address in the offset space. In reality, the maximal address in the offset space is (1<<heapAddrBits)-1 because we don't have more than that actually available to us on a given platform. On most platforms this is fine, because arenaBaseOffset is just connecting two segments of address space, but on AIX we use it as an actual offset for the starting address of the available address space, which is limited. This means using ^uintptr(0) as the maximal address in the offset address space causes wrap-around, especially when we just want to represent a range approximately like [addr, infinity), which today we do by using maxOffAddr. To fix this, we define maxOffAddr more appropriately, in terms of (1<<heapAddrBits)-1. This change also redefines arenaBaseOffset to not be the negation of the virtual address corresponding to address zero in the virtual address space, but instead directly as the virtual address corresponding to zero. This matches the existing documentation more closely and makes the logic around arenaBaseOffset decidedly simpler, especially when trying to reason about its use on AIX. Fixes #38966. Change-Id: I1336e5036a39de846f64cc2d253e8536dee57611 Reviewed-on: https://go-review.googlesource.com/c/go/+/233497 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>
2020-05-08	runtime: make addrRange[s] operate on offset addresses	Michael Anthony Knyszek
	Currently addrRange and addrRanges operate on real addresses. That is, the addresses they manipulate don't include arenaBaseOffset. When added to an address, arenaBaseOffset makes the address space appear contiguous on platforms where the address space is segmented. While this is generally OK because even those platforms which have a segmented address space usually don't give addresses in a different segment, today it causes a mismatch between the scavenger and the rest of the page allocator. The scavenger scavenges from the highest addresses first, but only via real address, whereas the page allocator allocates memory in offset address order. So this change makes addrRange and addrRanges, i.e. what the scavenger operates on, use offset addresses. However, lots of the page allocator relies on an addrRange containing real addresses. To make this transition less error-prone, this change introduces a new type, offAddr, whose purpose is to make offset addresses a distinct type, so any attempt to trivially mix real and offset addresses will trigger a compilation error. This change doesn't attempt to use offAddr in all of the runtime; a follow-up change will look for and catch remaining uses of an offset address which doesn't use the type. Updates #35788. Change-Id: I991d891ac8ace8339ca180daafdf6b261a4d43d1 Reviewed-on: https://go-review.googlesource.com/c/go/+/230717 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2020-05-08	runtime: remove scavAddr in favor of address ranges	Michael Anthony Knyszek
	This change removes the concept of s.scavAddr in favor of explicitly reserving and unreserving address ranges. s.scavAddr has several problems with raciness that can cause the scavenger to miss updates, or move it back unnecessarily, forcing future scavenge calls to iterate over searched address space unnecessarily. This change achieves this by replacing scavAddr with a second addrRanges which is cloned from s.inUse at the end of each sweep phase. Ranges from this second addrRanges are then reserved by scavengers (with the reservation size proportional to the heap size) who are then able to safely iterate over those ranges without worry of another scavenger coming in. Fixes #35788. Change-Id: Ief01ae170384174875118742f6c26b2a41cbb66d Reviewed-on: https://go-review.googlesource.com/c/go/+/208378 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Austin Clements <austin@google.com>
2020-05-07	runtime/runtime2: pack the sudog struct	Aaron Patterson
	This commit moves the isSelect bool below the ticket uint32. The boolean was consuming 8 bytes of the struct. The uint32 was also consuming 8 bytes, so we can pack isSelect below the uint32 and save 8 bytes. This reduces the sudog struct from 96 bytes to 88 bytes. Change-Id: If555cdaf2f5eaa125e2590fc4d113dbc99750738 GitHub-Last-Rev: d63b4e086b17da74e185046dfecb12d58e4f19ac GitHub-Pull-Request: golang/go#36552 Reviewed-on: https://go-review.googlesource.com/c/go/+/214677 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-07	runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)	Dan Scales
	I took some of the infrastructure from Austin's lock logging CR https://go-review.googlesource.com/c/go/+/192704 (with deadlock detection from the logs), and developed a setup to give static lock ranking for runtime locks. Static lock ranking establishes a documented total ordering among locks, and then reports an error if the total order is violated. This can happen if a deadlock happens (by acquiring a sequence of locks in different orders), or if just one side of a possible deadlock happens. Lock ordering deadlocks cannot happen as long as the lock ordering is followed. Along the way, I found a deadlock involving the new timer code, which Ian fixed via https://go-review.googlesource.com/c/go/+/207348, as well as two other potential deadlocks. See the constants at the top of runtime/lockrank.go to show the static lock ranking that I ended up with, along with some comments. This is great documentation of the current intended lock ordering when acquiring multiple locks in the runtime. I also added an array lockPartialOrder[] which shows and enforces the current partial ordering among locks (which is embedded within the total ordering). This is more specific about the dependencies among locks. I don't try to check the ranking within a lock class with multiple locks that can be acquired at the same time (i.e. check the ranking when multiple hchan locks are acquired). Currently, I am doing a lockInit() call to set the lock rank of most locks. Any lock that is not otherwise initialized is assumed to be a leaf lock (a very high rank lock), so that eliminates the need to do anything for a bunch of locks (including all architecture-dependent locks). For two locks, root.lock and notifyList.lock (only in the runtime/sema.go file), it is not as easy to do lock initialization, so instead, I am passing the lock rank with the lock calls. For Windows compilation, I needed to increase the StackGuard size from 896 to 928 because of the new lock-rank checking functions. Checking of the static lock ranking is enabled by setting GOEXPERIMENT=staticlockranking before doing a run. To make sure that the static lock ranking code has no overhead in memory or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so that it defines a build tag (with the same name) whenever any experiment has been baked into the toolchain (by checking Expstring()). This allows me to avoid increasing the size of the 'mutex' type when static lock ranking is not enabled. Fixes #38029 Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a Reviewed-on: https://go-review.googlesource.com/c/go/+/207619 Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-03-23	runtime: add countAlloc benchmark	Michael Anthony Knyszek
	This change adds a small microbenchmark for (*mspan).countAlloc, which we're about to replace. Admittedly this isn't a critical piece of code, but the benchmark was useful in understanding the performance change. Change-Id: Iea93c00f571ee95534a42f2ef2ab026b382242b3 Reviewed-on: https://go-review.googlesource.com/c/go/+/224438 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-03-10	runtime: make typehash match compiler generated hashes exactly	Keith Randall
	If typehash (used by reflect) does not match the built-in map's hash, then problems occur. If a map is built using reflect, and then assigned to a variable of map type, the hash function can change. That causes very bad things. This issue is rare. MapOf consults a cache of all types that occur in the binary before making a new one. To make a true new map type (with a hash function derived from typehash) that map type must not occur in the binary anywhere. But to cause the bug, we need a variable of that type in order to assign to it. The only way to make that work is to use a named map type for the variable, so it is distinct from the unnamed version that MapOf looks for. Fixes #37716 Change-Id: I3537bfceca8cbfa1af84202f432f3c06953fe0ed Reviewed-on: https://go-review.googlesource.com/c/go/+/222357 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-28	runtime: ensure that searchAddr always refers to inUse memory	Michael Anthony Knyszek
	This change formalizes an assumption made by the page allocator, which is that (pageAlloc).searchAddr should never refer to memory that is not represented by (pageAlloc).inUse. The portion of address space covered by (pageAlloc).inUse reflects the parts of the summary arrays which are guaranteed to mapped, and so looking at any summary which is not reflected there may cause a segfault. In fact, this can happen today. This change thus also removes a micro-optimization which is the only case which may cause (pageAlloc).searchAddr to point outside of any region covered by (*pageAlloc).inUse, and adds a test verifying that the current segfault can no longer occur. Change-Id: I98b534f0ffba8656d3bd6d782f6fc22549ddf1c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/216697 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-06	runtime: test memmove writes pointers atomically	Cherry Zhang
	In the previous CL we ensures that memmove writes pointers atomically, so the concurrent GC won't observe a partially updated pointer. This CL adds a test. Change-Id: Icd1124bf3a15ef25bac20c7fb8933f1a642d897c Reviewed-on: https://go-review.googlesource.com/c/go/+/212627 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-12-11	runtime: use inUse ranges to map in summary memory only as needed	Michael Anthony Knyszek
	Prior to this change, if the heap was very discontiguous (such as in TestArenaCollision) it's possible we could map a large amount of memory as R/W and commit it. We would use only the start and end to track what should be mapped, and we would extend that mapping as needed to accomodate a potentially fragmented address space. After this change, we only map exactly the part of the summary arrays that we need by using the inUse ranges from the previous change. This reduces the GCSys footprint of TestArenaCollision from 300 MiB to 18 MiB. Because summaries are no longer mapped contiguously, this means the scavenger can no longer iterate directly. This change also updates the scavenger to borrow ranges out of inUse and iterate over only the parts of the heap which are actually currently in use. This is both an optimization and necessary for correctness. Fixes #35514. Change-Id: I96bf0c73ed0d2d89a00202ece7b9d089a53bac90 Reviewed-on: https://go-review.googlesource.com/c/go/+/207758 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-12-11	runtime: track ranges of address space which are owned by the heap	Michael Anthony Knyszek
	This change adds a new inUse field to the allocator which tracks ranges of addresses that are owned by the heap. It is updated on each heap growth. These ranges are tracked in an array which is kept sorted. In practice this array shouldn't exceed its initial allocation except in rare cases and thus should be small (ideally exactly 1 element in size). In a hypothetical worst-case scenario wherein we have a 1 TiB heap and 4 MiB arenas (note that the address ranges will never be at a smaller granularity than an arena, since arenas are always allocated contiguously), inUse would use at most 4 MiB of memory if the heap mappings were completely discontiguous (highly unlikely) with an additional 2 MiB leaked from previous allocations. Furthermore, the copies that are done to keep the inUse array sorted will copy at most 4 MiB of memory in such a scenario, which, assuming a conservative copying rate of 5 GiB/s, amounts to about 800µs. However, note that in practice: 1) Most 64-bit platforms have 64 MiB arenas. 2) The copies should incur little-to-no page faults, meaning a copy rate closer to 25-50 GiB/s is expected. 3) Go heaps are almost always mostly contiguous. Updates #35514. Change-Id: I3ad07f1c2b5b9340acf59ecc3b9ae09e884814fe Reviewed-on: https://go-review.googlesource.com/c/go/+/207757 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Austin Clements <austin@google.com>
2019-12-05	runtime: add a simple version number parser	Austin Clements
	This will be used to parse the Linux kernel versions, but this code is generic and can be tested on its own. For #35777. Change-Id: If1df48d07250e5855dde45bc9d57c66f777b9fb4 Reviewed-on: https://go-review.googlesource.com/c/go/+/209597 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-12-03	runtime: convert page allocator bitmap to sparse array	Michael Anthony Knyszek
	Currently the page allocator bitmap is implemented as a single giant memory mapping which is reserved at init time and committed as needed. This causes problems on systems that don't handle large uncommitted mappings well, or institute low virtual address space defaults as a memory limiting mechanism. This change modifies the implementation of the page allocator bitmap away from a directly-mapped set of bytes to a sparse array in same vein as mheap.arenas. This will hurt performance a little but the biggest gains are from the lockless allocation possible with the page allocator, so the impact of this extra layer of indirection should be minimal. In fact, this is exactly what we see: https://perf.golang.org/search?q=upload:20191125.5 This reduces the amount of mapped (PROT_NONE) memory needed on systems with 48-bit address spaces to ~600 MiB down from almost 9 GiB. The bulk of this remaining memory is used by the summaries. Go processes with 32-bit address spaces now always commit to 128 KiB of memory for the bitmap. Previously it would only commit the pages in the bitmap which represented the range of addresses (lowest address to highest address, even if there are unused regions in that range) used by the heap. Updates #35568. Updates #35451. Change-Id: I0ff10380156568642b80c366001eefd0a4e6c762 Reviewed-on: https://go-review.googlesource.com/c/go/+/207497 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-16	runtime: add arenaBaseOffset on aix/ppc64	Clément Chigot
	On AIX, addresses returned by mmap are between 0x0a00000000000000 and 0x0afffffffffffff. The previous solution to handle these large addresses was to increase the arena size up to 60 bits addresses, cf CL 138736. However, with the new page allocator, the 60bit heap addresses are causing huge memory allocations, especially by (s *pageAlloc).init. mmap and munmap syscalls dealing with these allocations are reducing performances of every Go programs. In order to avoid these allocations, arenaBaseOffset is set to 0x0a00000000000000 and heap addresses are on 48bit, as others operating systems. Updates: #35451 Change-Id: Ice916b8578f76703428ec12a82024147a7592bc0 Reviewed-on: https://go-review.googlesource.com/c/go/+/206841 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-11-10	runtime: make the test addresses for pageAlloc smaller on 32-bit	Michael Anthony Knyszek
	This change makes the test addresses start at 1 GiB instead of 2 GiB to support mips and mipsle, which only have 31-bit address spaces. It also changes some tests to use smaller offsets for the chunk index to avoid jumping too far ahead in the address space to support 31-bit address spaces. The tests don't require such large jumps for what they're testing anyway. Updates #35112. Fixes #35440. Change-Id: Ic68ff2b0a1f10ef37ac00d4bb5b910ddcdc76f2e Reviewed-on: https://go-review.googlesource.com/c/go/+/205938 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-09	sync: yield to the waiter when unlocking a starving mutex	Rhys Hiltner
	When we have already assigned the semaphore ticket to a specific waiter, we want to get the waiter running as fast as possible since no other G waiting on the semaphore can acquire it optimistically. The net effect is that, when a sync.Mutex is contended, the code in the critical section guarded by the Mutex gets a priority boost. Fixes #33747 The original work was done in CL 200577 by Carlo Alberto Ferraris. The change was reverted in CL 205817 because it broke the linux-arm64-packet and solaris-amd64-oraclerel builders. Change-Id: I76d79b1d63fd206ed1c57fe6900cb7ae9e4d46cb Reviewed-on: https://go-review.googlesource.com/c/go/+/206180 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-08	runtime: copy some functions from math/bits to runtime/internal/sys	David Chase
	CL 201765 activated calls from the runtime to functions in math/bits. When coverage and race detection were simultaneously enabled, this caused a crash when the covered+race-checked code in math/bits was called from the runtime before there was even a P. PS Win for gdlv in helping sort this out. TODO - next CL intrinsifies the new functions in runtime/internal/sys TODO/Would-be-nice - Ctz64 and TrailingZeros64 are the same function; 386.s is intrinsified; clean all that up. Fixes #35461. Updates #35112. Change-Id: I750a54dba493130ad3e68a06530ede7687d41e1d Reviewed-on: https://go-review.googlesource.com/c/go/+/206199 Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-08	runtime: add per-p page allocation cache	Michael Anthony Knyszek
	This change adds a per-p free page cache which the page allocator may allocate out of without a lock. The change also introduces a completely lockless page allocator fast path. Although the cache contains at most 64 pages (and usually less), the vast majority (85%+) of page allocations are exactly 1 page in size. Updates #35112. Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78 Reviewed-on: https://go-review.googlesource.com/c/go/+/195701 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com>
2019-11-08	runtime: add page cache and tests	Michael Anthony Knyszek
	This change adds a page cache structure which owns a chunk of free pages at a given base address. It also adds code to allocate to this cache from the page allocator. Finally, it adds tests for both. Notably this change does not yet integrate the code into the runtime, just into runtime tests. Updates #35112. Change-Id: Ibe121498d5c3be40390fab58a3816295601670df Reviewed-on: https://go-review.googlesource.com/c/go/+/196643 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-08	runtime: remove old page allocator	Michael Anthony Knyszek
	This change removes the old page allocator from the runtime. Updates #35112. Change-Id: Ib20e1c030f869b6318cd6f4288a9befdbae1b771 Reviewed-on: https://go-review.googlesource.com/c/go/+/195700 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	runtime: integrate new page allocator into runtime	Michael Anthony Knyszek
	This change integrates all the bits and pieces of the new page allocator into the runtime, behind a global constant. Updates #35112. Change-Id: I6696bde7bab098a498ab37ed2a2caad2a05d30ec Reviewed-on: https://go-review.googlesource.com/c/go/+/201764 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	runtime: add option to scavenge with lock held throughout	Michael Anthony Knyszek
	This change adds a "locked" parameter to scavenge() and scavengeone() which allows these methods to be run with the heap lock acquired, and synchronously with respect to others which acquire the heap lock. This mode is necessary for both heap-growth scavenging (multiple asynchronous scavengers here could be problematic) and debug.FreeOSMemory. Updates #35112. Change-Id: I24eea8e40f971760999c980981893676b4c9b666 Reviewed-on: https://go-review.googlesource.com/c/go/+/195699 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2019-11-07	runtime: count scavenged bits for new allocation for new page allocator	Michael Anthony Knyszek
	This change makes it so that the new page allocator returns the number of pages that are scavenged in a new allocation so that mheap can update memstats appropriately. The accounting could be embedded into pageAlloc, but that would make the new allocator more difficult to test. Updates #35112. Change-Id: I0f94f563d7af2458e6d534f589d2e7dd6af26d12 Reviewed-on: https://go-review.googlesource.com/c/go/+/195698 Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	runtime: add scavenging code for new page allocator	Michael Anthony Knyszek
	This change adds a scavenger for the new page allocator along with tests. The scavenger walks over the heap backwards once per GC, looking for memory to scavenge. It walks across the heap without any lock held, searching optimistically. If it finds what appears to be a scavenging candidate it acquires the heap lock and attempts to verify it. Upon verification it then scavenges. Notably, unlike the old scavenger, it doesn't show any preference for huge pages and instead follows a more strict last-page-first policy. Updates #35112. Change-Id: I0621ef73c999a471843eab2d1307ae5679dd18d6 Reviewed-on: https://go-review.googlesource.com/c/go/+/195697 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	runtime: add new page allocator core	Michael Anthony Knyszek
	This change adds a new bitmap-based allocator to the runtime with tests. It does not yet integrate the page allocator into the runtime and thus this change is almost purely additive. Updates #35112. Change-Id: Ic3d024c28abee8be8797d3918116a80f901cc2bf Reviewed-on: https://go-review.googlesource.com/c/go/+/190622 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	runtime: add packed bitmap summaries	Michael Anthony Knyszek
	This change adds the concept of summaries and of summarizing a set of pallocBits, a core concept in the new page allocator. These summaries are really just three integers packed into a uint64. This change also adds tests and a benchmark for generating these summaries. Updates #35112. Change-Id: I69686316086c820c792b7a54235859c2105e5fee Reviewed-on: https://go-review.googlesource.com/c/go/+/190621 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-07	runtime: add pallocbits and tests	Michael Anthony Knyszek
	This change adds a per-chunk bitmap for page allocation called pallocBits with algorithms for allocating and freeing pages out of the bitmap. This change also adds tests for pallocBits, but does not yet integrate it into the runtime. Updates #35112. Change-Id: I479006ed9f1609c80eedfff0580d5426b064b0ff Reviewed-on: https://go-review.googlesource.com/c/go/+/190620 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-11-07	Revert "sync: yield to the waiter when unlocking a starving mutex"	Bryan C. Mills
	This reverts CL 200577. Reason for revert: broke linux-arm64-packet and solaris-amd64-oraclerel builders Fixes #35424 Updates #33747 Change-Id: I2575fd84d37995d458183caae54704f15d8b8426 Reviewed-on: https://go-review.googlesource.com/c/go/+/205817 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-11-07	sync: yield to the waiter when unlocking a starving mutex	Carlo Alberto Ferraris
	When we have already assigned the semaphore ticket to a specific waiter, we want to get the waiter running as fast as possible since no other G waiting on the semaphore can acquire it optimistically. The net effect is that, when a sync.Mutex is contented, the code in the critical section guarded by the Mutex gets a priority boost. Fixes #33747 Change-Id: I9967f0f763c25504010651bdd7f944ee0189cd45 Reviewed-on: https://go-review.googlesource.com/c/go/+/200577 Reviewed-by: Rhys Hiltner <rhys@justin.tv> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-02	runtime: add a test for asynchronous safe points	Austin Clements
	This adds a test of preempting a loop containing no synchronous safe points for STW and stack scanning. We couldn't add this test earlier because it requires scheduler, STW, and stack scanning preemption to all be working. For #10958, #24543. Change-Id: I73292db78ca3d14aab11bdafd26d03986920ef0a Reviewed-on: https://go-review.googlesource.com/c/go/+/201777 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-10-31	runtime: atomically set span state and use as publication barrier	Austin Clements
	When everything is working correctly, any pointer the garbage collector encounters can only point into a fully initialized heap span, since the span must have been initialized before that pointer could escape the heap allocator and become visible to the GC. However, in various cases, we try to be defensive against bad pointers. In findObject, this is just a sanity check: we never expect to find a bad pointer, but programming errors can lead to them. In spanOfHeap, we don't necessarily trust the pointer and we're trying to check if it really does point to the heap, though it should always point to something. Conservative scanning takes this to a new level, since it can only guess that a word may be a pointer and verify this. In all of these cases, we have a problem that the span lookup and check can race with span initialization, since the span becomes visible to lookups before it's fully initialized. Furthermore, we're about to start initializing the span without the heap lock held, which is going to introduce races where accesses were previously protected by the heap lock. To address this, this CL makes accesses to mspan.state atomic, and ensures that the span is fully initialized before setting the state to mSpanInUse. All loads are now atomic, and in any case where we don't trust the pointer, it first atomically loads the span state and checks that it's mSpanInUse, after which it will have synchronized with span initialization and can safely check the other span fields. For #10958, #24543, but a good fix in general. Change-Id: I518b7c63555b02064b98aa5f802c92b758fef853 Reviewed-on: https://go-review.googlesource.com/c/go/+/203286 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>