This CL ports a few performance-critical runtime assembly functions to
use register arguments directly. While using the faster ABI is nice,
the real win here is that we avoid ABI wrappers: since these are
"builtin" functions in the compiler, it can generate calls to them
without knowing that their native implementation is ABI0. Hence, it
generates ABIInternal calls that go through ABI wrappers. By porting
them to use ABIInternal natively, we avoid the overhead of the ABI
wrapper.
This significantly improves performance on several benchmarks,
comparing regabiwrappers before and after this change:
name old time/op new time/op delta
BiogoIgor 15.7s ± 2% 15.7s ± 2% ~ (p=0.617 n=25+25)
BiogoKrishna 18.5s ± 5% 17.7s ± 2% -4.61% (p=0.000 n=25+25)
BleveIndexBatch100 5.91s ± 3% 5.82s ± 3% -1.60% (p=0.000 n=25+25)
BleveQuery 6.76s ± 0% 6.60s ± 1% -2.31% (p=0.000 n=22+25)
CompileTemplate 248ms ± 5% 245ms ± 1% ~ (p=0.643 n=25+20)
CompileUnicode 94.4ms ± 3% 93.9ms ± 2% ~ (p=0.152 n=24+23)
CompileGoTypes 1.60s ± 2% 1.59s ± 2% ~ (p=0.059 n=24+24)
CompileCompiler 104ms ± 3% 103ms ± 1% ~ (p=0.056 n=25+22)
CompileSSA 10.9s ± 1% 10.9s ± 1% ~ (p=0.052 n=25+25)
CompileFlate 156ms ± 8% 152ms ± 1% -2.49% (p=0.008 n=25+21)
CompileGoParser 248ms ± 1% 249ms ± 2% ~ (p=0.058 n=21+20)
CompileReflect 595ms ± 3% 601ms ± 4% ~ (p=0.182 n=25+25)
CompileTar 211ms ± 2% 211ms ± 1% ~ (p=0.663 n=23+23)
CompileXML 282ms ± 2% 284ms ± 5% ~ (p=0.456 n=21+23)
CompileStdCmd 13.6s ± 2% 13.5s ± 2% ~ (p=0.112 n=25+24)
FoglemanFauxGLRenderRotateBoat 8.69s ± 2% 8.67s ± 0% ~ (p=0.094 n=22+25)
FoglemanPathTraceRenderGopherIter1 20.2s ± 2% 20.7s ± 3% +2.53% (p=0.000 n=24+24)
GopherLuaKNucleotide 31.4s ± 1% 31.0s ± 1% -1.28% (p=0.000 n=25+24)
MarkdownRenderXHTML 246ms ± 1% 244ms ± 1% -0.79% (p=0.000 n=20+21)
Tile38WithinCircle100kmRequest 843µs ± 4% 818µs ± 4% -2.93% (p=0.000 n=25+25)
Tile38IntersectsCircle100kmRequest 1.06ms ± 5% 1.05ms ± 3% -1.19% (p=0.021 n=24+25)
Tile38KNearestLimit100Request 1.01ms ± 1% 1.01ms ± 2% ~ (p=0.335 n=22+25)
[Geo mean] 596ms 592ms -0.71%
(https://perf.golang.org/search?q=upload:20210411.5)
It also significantly reduces the performance penalty of enabling
regabiwrappers, though it doesn't yet fully close the gap on all
benchmarks:
name old time/op new time/op delta
BiogoIgor 15.7s ± 1% 15.7s ± 2% ~ (p=0.366 n=24+25)
BiogoKrishna 17.7s ± 2% 17.7s ± 2% ~ (p=0.315 n=23+25)
BleveIndexBatch100 5.86s ± 4% 5.82s ± 3% ~ (p=0.137 n=24+25)
BleveQuery 6.55s ± 0% 6.60s ± 1% +0.83% (p=0.000 n=24+25)
CompileTemplate 244ms ± 1% 245ms ± 1% ~ (p=0.208 n=21+20)
CompileUnicode 94.0ms ± 4% 93.9ms ± 2% ~ (p=0.666 n=24+23)
CompileGoTypes 1.60s ± 2% 1.59s ± 2% ~ (p=0.154 n=25+24)
CompileCompiler 103ms ± 1% 103ms ± 1% ~ (p=0.905 n=24+22)
CompileSSA 10.9s ± 2% 10.9s ± 1% ~ (p=0.803 n=25+25)
CompileFlate 153ms ± 1% 152ms ± 1% ~ (p=0.182 n=23+21)
CompileGoParser 250ms ± 2% 249ms ± 2% ~ (p=0.843 n=24+20)
CompileReflect 595ms ± 4% 601ms ± 4% ~ (p=0.141 n=25+25)
CompileTar 212ms ± 3% 211ms ± 1% ~ (p=0.499 n=23+23)
CompileXML 282ms ± 1% 284ms ± 5% ~ (p=0.129 n=20+23)
CompileStdCmd 13.5s ± 2% 13.5s ± 2% ~ (p=0.480 n=24+24)
FoglemanFauxGLRenderRotateBoat 8.66s ± 1% 8.67s ± 0% ~ (p=0.325 n=25+25)
FoglemanPathTraceRenderGopherIter1 20.6s ± 3% 20.7s ± 3% ~ (p=0.137 n=25+24)
GopherLuaKNucleotide 30.5s ± 2% 31.0s ± 1% +1.68% (p=0.000 n=23+24)
MarkdownRenderXHTML 243ms ± 1% 244ms ± 1% +0.51% (p=0.000 n=23+21)
Tile38WithinCircle100kmRequest 801µs ± 2% 818µs ± 4% +2.11% (p=0.000 n=25+25)
Tile38IntersectsCircle100kmRequest 1.01ms ± 2% 1.05ms ± 3% +4.34% (p=0.000 n=24+25)
Tile38KNearestLimit100Request 1.00ms ± 1% 1.01ms ± 2% +0.81% (p=0.008 n=21+25)
[Geo mean] 589ms 592ms +0.50%
(https://perf.golang.org/search?q=upload:20210411.6)
Change-Id: I8f77f010b0abc658064df569a27a9c7a7b1c7bf9
Reviewed-on: https://go-review.googlesource.com/c/go/+/308931
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
With the register ABI, it's important to inject sigpanic0 instead of
sigpanic so we can set up function entry registers. This was already
happening on most OSes. This CL gets the remaining ones.
Change-Id: I6bc4d912b6497e03ed54d0a9c1eae8fd099d2cea
Reviewed-on: https://go-review.googlesource.com/c/go/+/308930
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
Currently, we have boolean and integral constants for GOEXPERIMENTs in
various places. Consolidate these into automatically generated
constants in the internal/goexperiment package.
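For a concrete picture, each experiment gets a small generated file of roughly this shape (a sketch following the internal/goexperiment layout; the header comment and build tags here are illustrative):
    // Code generated by mkconsts.go. DO NOT EDIT.

    //go:build goexperiment.regabiargs
    // +build goexperiment.regabiargs

    package goexperiment

    // Both a boolean and an integer form are generated, so the flag can be
    // used in conditions as well as in constant arithmetic.
    const RegabiArgs = true
    const RegabiArgsInt = 1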
Change-Id: I42a49aba2a3b4c722fedea23a613162cd8a67bee
Reviewed-on: https://go-review.googlesource.com/c/go/+/307818
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This change causes finalizers, reflect calls, and Windows syscall
callbacks to assume the register ABI when GOEXPERIMENT=regabiargs is
set. That is, when all Go functions are using the new ABI by default,
these features should assume the new ABI too.
For #40724.
Change-Id: Ie4ee66b8085b692e1ff675f01134c9a4703ae1b9
Reviewed-on: https://go-review.googlesource.com/c/go/+/306571
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
This change modifies runfinq to properly pass arguments to finalizers in
registers via reflectcall.
For #40724.
Change-Id: I414c0eff466ef315a0eb10507994e598dd29ccb2
Reviewed-on: https://go-review.googlesource.com/c/go/+/300112
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
reflectcall tail calls the runtime.call{16,32,...} functions, so they
have the same signature as reflectcall. It is important for them
to have the correct arg map, because those functions, as well as
the function being reflectcall'd, can move the stack. When that
happens, their pointer arguments, in particular regArgs, need to be
adjusted; otherwise they will still point into the old stack, causing
memory corruption.
This only caused failures on the regabi builder because it is the
only place where internal/abi.RegArgs is not a zero-sized type.
May fix #44821.
Change-Id: Iab400ea6b60c52360d0b43a793f6bfe50ca9989b
Reviewed-on: https://go-review.googlesource.com/c/go/+/300154
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
Change-Id: Ie811526534df8622d89c5b1b81dbe19ece1c962b
Reviewed-on: https://go-review.googlesource.com/c/go/+/292110
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This change adds support for the new register ABI on amd64 to
reflect.(Value).Call. If internal/abi's register counts are non-zero,
reflect will try to set up arguments in registers on the Call path.
Note that because the register ABI becomes ABI0 with zero registers
available, this should keep working as it did before.
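From the caller's side nothing changes; an ordinary reflect call like the sketch below now exercises this path, and reflect decides internally whether the arguments travel in registers (usage illustration only, not code from this CL):
    package main

    import (
        "fmt"
        "reflect"
    )

    func add(a, b int) int { return a + b }

    func main() {
        // reflect.Value.Call is the entry point this change teaches about the
        // register ABI; callers never see whether the arguments end up in
        // registers or on the stack.
        fn := reflect.ValueOf(add)
        out := fn.Call([]reflect.Value{reflect.ValueOf(2), reflect.ValueOf(3)})
        fmt.Println(out[0].Int()) // 5
    }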
This change does not add any tests for the register ABI case because
there's no way to do so at the moment.
For #40724.
Change-Id: I8aa089a5aa5a31b72e56b3d9388dd3f82203985b
Reviewed-on: https://go-review.googlesource.com/c/go/+/272568
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
|
|
If we are panicking in an ABI0 context or in external code, the
special registers are not initialized. Initialize them in the
injected code before calling sigpanic.
TODO: Windows, Plan 9.
Change-Id: I0919b80e7cc55463f3dd94f1f63cba305717270a
Reviewed-on: https://go-review.googlesource.com/c/go/+/289710
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jeremy Faller <jeremy@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
The runtime.gosave function is not used anywhere. Delete.
Note: there is also a gosave<> function, which is actually used
and not deleted.
Change-Id: I64149a7afdd217de26d1e6396233f2becfad7153
Reviewed-on: https://go-review.googlesource.com/c/go/+/289719
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
|
|
memclrNoHeapPointers is the underlying implementation of
typedmemclr and memclrHasPointers, so it still needs to write
pointer-aligned words atomically. Document this requirement.
Updates #41428.
Change-Id: Ice00dee5de7a96a50e51ff019fcef069e8a8406a
Reviewed-on: https://go-review.googlesource.com/c/go/+/287692
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
TempDir) from io/ioutil
io/ioutil was a poorly defined collection of helpers.
Proposal #40025 moved out the generic I/O helpers to io.
This CL for proposal #42026 moves the OS-specific helpers to os,
making the entire io/ioutil package deprecated.
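For illustration, the helpers in their new home (plain usage of the os package as it exists after this move; the file and directory names here are made up):
    package main

    import (
        "fmt"
        "log"
        "os"
    )

    func main() {
        // Formerly ioutil.TempDir, ioutil.WriteFile, and ioutil.ReadFile.
        dir, err := os.MkdirTemp("", "example")
        if err != nil {
            log.Fatal(err)
        }
        defer os.RemoveAll(dir)

        path := dir + "/greeting.txt"
        if err := os.WriteFile(path, []byte("hello"), 0o644); err != nil {
            log.Fatal(err)
        }
        data, err := os.ReadFile(path)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(string(data))
    }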
For #42026.
Change-Id: I018bcb2115ef2ff1bc7ca36a9247eda429af21ad
Reviewed-on: https://go-review.googlesource.com/c/go/+/266364
Trust: Russ Cox <rsc@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
|
|
This redesigns the way calls work from C to exported Go functions. It
removes several steps from the call path, makes cmd/cgo no longer
sensitive to the Go calling convention, and eliminates the use of
reflectcall from cgo.
In order to avoid generating a large amount of FFI glue between the C
and Go ABIs, the cgo tool has long depended on generating a C function
that marshals the arguments into a struct, and then the actual ABI
switch happens in functions with fixed signatures that simply take a
pointer to this struct. In a way, this CL simply pushes this idea
further.
Currently, the cgo tool generates this argument struct in the exact
layout of the Go stack frame and depends on reflectcall to unpack it
into the appropriate Go call (even though it's actually
reflectcall'ing a function generated by cgo).
In this CL, we decouple this struct from the Go stack layout. Instead,
cgo generates a Go function that takes the struct, unpacks it, and
calls the exported function. Since this generated function has a
generic signature (like the rest of the call path), we don't need
reflectcall and can instead depend on the Go compiler itself to
implement the call to the exported Go function.
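Roughly, for an exported function like func GoF(x int32) int32, the generated glue now has the shape sketched below (hand-written illustration; the real generated code uses cgo-specific names, types, and //go: directives):
    package main

    import "fmt"

    // GoF stands in for a Go function exported to C with //export GoF.
    func GoF(x int32) int32 { return x + 1 }

    // _cgoexp_GoF sketches the generated Go function: it receives a pointer to
    // the argument struct that the C side filled in, unpacks it, and makes an
    // ordinary Go call, so reflectcall is no longer needed on this path.
    func _cgoexp_GoF(a *struct {
        p0 int32 // argument
        r0 int32 // result
    }) {
        a.r0 = GoF(a.p0)
    }

    func main() {
        var frame struct {
            p0 int32
            r0 int32
        }
        frame.p0 = 41
        _cgoexp_GoF(&frame)   // in real code, cgocallbackg1 makes this call
        fmt.Println(frame.r0) // 42
    }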
One complication is that syscall.NewCallback on Windows, which
converts a Go function into a C function pointer, depends on
cgocallback's current dynamic calling approach since the signatures of
the callbacks aren't known statically. For this specific case, we
continue to depend on reflectcall. Really, the current approach makes
some overly simplistic assumptions about translating the C ABI to the
Go ABI. Now we're at least in a much better position to do a proper
ABI translation.
For comparison, the current cgo call path looks like:
GoF (generated C function) ->
crosscall2 (in cgo/asm_*.s) ->
_cgoexp_GoF (generated Go function) ->
cgocallback (in asm_*.s) ->
cgocallback_gofunc (in asm_*.s) ->
cgocallbackg (in cgocall.go) ->
cgocallbackg1 (in cgocall.go) ->
reflectcall (in asm_*.s) ->
_cgoexpwrap_GoF (generated Go function) ->
p.GoF
Now the call path looks like:
GoF (generated C function) ->
crosscall2 (in cgo/asm_*.s) ->
cgocallback (in asm_*.s) ->
cgocallbackg (in cgocall.go) ->
cgocallbackg1 (in cgocall.go) ->
_cgoexp_GoF (generated Go function) ->
p.GoF
Notably:
1. We combine _cgoexp_GoF and _cgoexpwrap_GoF and move the combined
operation to the end of the sequence. This combined function also
handles reflectcall's previous role.
2. We combine cgocallback and cgocallback_gofunc, since the only
purpose of having both was to convert a raw PC into a Go function
value. We instead construct the Go function value in cgocallbackg1.
3. cgocallbackg1 no longer reaches backwards through the stack to get
the arguments to cgocallback_gofunc. Instead, we just pass the
arguments down.
4. Currently, we need an explicit msanwrite to mark the results struct
as written because reflectcall doesn't do this. Now, the results are
written by regular Go assignments, so the Go compiler generates the
necessary MSAN annotations. This also means we no longer need to track
the size of the arguments frame.
Updates #40724, since now we don't need to teach cgo about the
register ABI or change how it uses reflectcall.
Change-Id: I7840489a2597962aeb670e0c1798a16a7359c94f
Reviewed-on: https://go-review.googlesource.com/c/go/+/258938
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Like we did for sync, let the runtime give net random numbers,
to avoid forcing an import of math/rand for DNS.
Change-Id: Iab3e64121d687d288a3961a8ccbcebe589047253
Reviewed-on: https://go-review.googlesource.com/c/go/+/241258
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Currently, runtime.call16 is defined and used only on 32-bit
architectures, while 64-bit architectures all start at call32 and go
up from there. This led to unnecessary complexity because call16's
prototype needed to be in a different file, separate from all of the
other call* prototypes, which in turn led to it getting out of sync
with the other call* prototypes. This CL adds call16 on 64-bit
architectures, bringing them all into sync, and moves the call16
prototype to live with the others.
Prior to CL 31655 (in 2016), call16 couldn't be implemented on 64-bit
architectures because it needed at least four words of argument space
to invoke "callwritebarrier" after copying back the results. CL 31655
changed the way call* invoked the write barrier in preparation for the
hybrid barrier; since the hybrid barrier had to be invoked prior to
copying back results, it needed a different solution that didn't reuse
call*'s stack space. At this point, call16 was no longer a problem on
64-bit, but we never added it. Until now.
Change-Id: Id10ade0e4f75c6ea76afa6229ddaee2b994c27dd
Reviewed-on: https://go-review.googlesource.com/c/go/+/259339
Trust: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Change-Id: I9ae3f462a6a6b3809de14b0d08f369524b636d57
Reviewed-on: https://go-review.googlesource.com/c/go/+/237097
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
materializeGCProg allocates a temporary buffer for unrolling a GC
program. Unfortunately, when computing the size of the buffer, it
rounds *down* the number of bytes needed to store the bitmap before
rounding up the number of pages needed to store those bytes. The fact
that it rounds up to pages usually mitigates the rounding down, but
the type from #37470 exists right on the boundary where this doesn't
work:
    type Sequencer struct {
        htable [1 << 17]uint32
        buf    []byte
    }
On 64-bit, this GC bitmap is exactly 8 KiB of zeros, followed by three
one bits. Hence, this needs 8193 bytes of storage, but the current
math in materializeGCProg rounds *down* the three one bits to 8192
bytes. Since this is exactly pageSize, the next step of rounding up to
the page size doesn't mitigate this error, and materializeGCProg
allocates a buffer that is one byte too small. runGCProg then writes
one byte past the end of this buffer, causing either a segfault (if
you're lucky!) or memory corruption.
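The arithmetic, as a standalone sketch (pageSize and the rounding helper mirror the runtime's, but this is an illustration, not the runtime's code):
    package main

    import "fmt"

    const pageSize = 8192 // the runtime's page size on common platforms

    func divRoundUp(n, a uintptr) uintptr { return (n + a - 1) / a }

    func main() {
        // Sequencer's bitmap on 64-bit: 1<<17 uint32s cover 65536 words of
        // scalar data (65536 zero bits), and the trailing slice header needs
        // 3 more bits, so 65539 bits = 8193 bytes are required.
        const bitmapBits = 1<<17*4/8 + 3

        // Buggy sizing: rounding bits *down* to bytes lands exactly on
        // pageSize, so rounding up to whole pages no longer hides the error.
        buggy := divRoundUp(bitmapBits/8, pageSize) * pageSize

        // Correct sizing: round bits up to bytes before sizing pages.
        fixed := divRoundUp(divRoundUp(bitmapBits, 8), pageSize) * pageSize

        fmt.Println(buggy, fixed) // 8192 16384: the buggy buffer is a byte short
    }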
Fixes #37470.
Change-Id: Iad24c463c501cd9b1dc1924bc2ad007991a094a0
Reviewed-on: https://go-review.googlesource.com/c/go/+/221197
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Unlike C's memmove, Go's memmove must be careful to do indivisible
writes of pointer values because it may be racing with the garbage
collector reading the heap.
We've had various bugs related to this over the years (#36101, #13160,
#12552). Indeed, memmove is a great target for optimization and it's
easy to forget the special requirements of Go's memmove.
This CL documents these (currently unwritten!) requirements. We also
add a test that should hopefully keep everyone honest going forward,
though it's hard to be sure we're hitting all of memmove's cases.
Change-Id: I2f59f8d8d6fb42d2f10006b55d605b5efd8ddc24
Reviewed-on: https://go-review.googlesource.com/c/go/+/213418
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This change renames the "round" function to the more appropriately named
"alignUp", which rounds an integer up to the next multiple of a power of
two.
This change also adds the alignDown function, which is almost like
alignUp but rounds down to the previous multiple of a power of two.
With these two functions, we also replace manual rounding code with them
where we can.
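For a power-of-two alignment these reduce to simple bit masks; a sketch matching the usual formulation:
    package main

    import "fmt"

    // alignUp rounds n up to the next multiple of a; a must be a power of two.
    func alignUp(n, a uintptr) uintptr { return (n + a - 1) &^ (a - 1) }

    // alignDown rounds n down to the previous multiple of a; a must be a power of two.
    func alignDown(n, a uintptr) uintptr { return n &^ (a - 1) }

    func main() {
        fmt.Println(alignUp(33, 8), alignDown(33, 8)) // 40 32
    }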
Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec
Reviewed-on: https://go-review.googlesource.com/c/go/+/190618
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts CL 180761
Reason for revert: Reinstate the stack-allocated defer CL.
There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues.
Issue #32477: Fix has been submitted as CL 181258.
Issue #32498: Possible fix is CL 181377 (not submitted yet).
Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3
Reviewed-on: https://go-review.googlesource.com/c/go/+/181378
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6.
Reason for revert: Seems to still have issues around GC.
Fixes #32452
Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/180761
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.
This should make defers like this (which are very common) faster.
This optimization applies to 363 out of the 370 static defer sites
in the cmd/go binary.
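For illustration, the kind of defer this covers versus one it does not (a sketch; the compiler's exact eligibility analysis is more involved):
    package main

    import (
        "os"
        "sync"
    )

    var mu sync.Mutex

    // Eligible: this defer runs at most once per call of f, so its record can
    // be allocated in f's stack frame instead of on the heap.
    func f() {
        mu.Lock()
        defer mu.Unlock()
        // ... critical section ...
    }

    // Not eligible: this defer sits in a loop and may run many times per call,
    // so its records cannot live in a fixed stack slot.
    func g(files []*os.File) {
        for _, file := range files {
            defer file.Close()
        }
    }

    func main() {
        f()
        g(nil)
    }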
name old time/op new time/op delta
Defer-4 52.2ns ± 5% 36.2ns ± 3% -30.70% (p=0.000 n=10+10)
Fixes #6980
Update #14939
Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004
Reviewed-on: https://go-review.googlesource.com/c/go/+/171758
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Working toward making the tree vet-safe instead of having
so many exceptions in cmd/vet/all/whitelist.
This CL makes "GOOS=linux GOARCH=386 go vet -unsafeptr=false runtime" happy,
while keeping "GO_BUILDER_NAME=misc-vetall go tool dist test" happy too.
For #31916.
Change-Id: I3e5586a7ff6e359357350d0602c2259493280ded
Reviewed-on: https://go-review.googlesource.com/c/go/+/176099
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
The vetall builder runs vet straight out of golang.org/x/tools,
so submitting CL 176097 in that repo will break the builder
by making all these whitelist entries stale.
Submitting this CL will fix it, by removing them.
The addition of the gcWriteBarrier declaration in runtime/stubs.go
is necessary because the diagnostic is no longer emitted on arm,
so it must be removed from all.txt. Adding it to runtime is better
than adding it to every-other-goarch.txt.
For #31916.
Change-Id: I432f6049cd3ee5a467add5066c440be8616d9d54
Reviewed-on: https://go-review.googlesource.com/c/go/+/176177
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Follow-up for CL 147037 and after Brad noticed the "returns whether"
pattern during the review of CL 150621.
Go documentation style for boolean funcs is to say:
// Foo reports whether ...
func Foo() bool
(rather than "returns whether")
Created with:
$ perl -i -npe 's/returns whether/reports whether/' $(git grep -l "returns whether" | grep -v vendor)
Change-Id: I15fe9ff99180ad97750cd05a10eceafdb12dc0b4
Reviewed-on: https://go-review.googlesource.com/c/150918
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fixes #28955
Change-Id: I738ad0c76f7bf8fc504a48cf55d3becd5ed7a9d6
Reviewed-on: https://go-review.googlesource.com/c/151337
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Currently, package runtime contains the definition of reflect.call,
even though it's just a jump to runtime.reflectcall. This "push"
symbol is confusing, since it's not clear where the definition of
reflect.call comes from when you're in the reflect package.
Replace this with a "pull" symbol: the runtime now defines only
runtime.reflectcall and package reflect uses a go:linkname to access
this symbol directly. This makes it clear where reflect.call is coming
from without any spooky action at a distance and eliminates all of the
definitions of reflect.call in the runtime.
Change-Id: I3ec73cd394efe9df8d3061a57c73aece2e7048dd
Reviewed-on: https://go-review.googlesource.com/c/148657
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Also adds some missing asmdecl comments for funcs with Go prototypes.
Change-Id: Iabc68e8c0ad936e06ed719e0f030bfc5f6f6e168
Reviewed-on: https://go-review.googlesource.com/127760
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Each URL was manually verified to ensure it did not serve up incorrect
content.
Change-Id: I4dc846227af95a73ee9a3074d0c379ff0fa955df
Reviewed-on: https://go-review.googlesource.com/115798
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
|
|
getcallerpc/getcallersp no longer take an argument.
Change-Id: I80b30020e798990c59c8ffd0a4e078af6a75aea0
Reviewed-on: https://go-review.googlesource.com/109696
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
getcallersp is intrinsified, and so the dummy arg is no longer
needed. Remove it, as well as a few dummy args that are solely
to feed getcallersp.
Change-Id: Ibb6c948ff9c56537042b380ac3be3a91b247aaa6
Reviewed-on: https://go-review.googlesource.com/109596
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
On all non-x86 arches, runtime.abort simply reads from nil.
Unfortunately, if this happens on a user stack, the signal handler
will dutifully turn this into a panicmem, which lets user defers run
and which user code can even recover from.
To fix this, add an explicit check to the signal handler that turns
faults in abort into hard crashes directly in the signal handler. This
has the added benefit of giving a register dump at the abort point.
Change-Id: If26a7f13790745ee3867db7f53b72d8281176d70
Reviewed-on: https://go-review.googlesource.com/93661
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Currently, throw may grow the stack, which means whenever we call it
from a context where it's not safe to grow the stack, we first have to
switch to the system stack. This is pretty easy to get wrong.
Fix this by making throw switch to the system stack so it doesn't grow
the stack and is hence safe to call without a system stack switch at
the call site.
The only thing this complicates is badsystemstack itself, which would
now go into an infinite loop before printing anything (previously it
would also go into an infinite loop, but would at least print the
error first). Fix this by making badsystemstack do a direct write and
then crash hard.
Change-Id: Ic5b4a610df265e47962dcfa341cabac03c31c049
Reviewed-on: https://go-review.googlesource.com/93659
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Currently, threads created by the runtime exist until the whole
program exits. For #14592 and #20395, we want to be able to exit and
clean up threads created by the runtime. This commit implements that
mechanism.
The main difficulty is how to clean up the g0 stack. In cgo mode and
on Solaris and Windows where the OS manages thread stacks, we simply
arrange to return from mstart and let the system clean up the thread.
If the runtime allocated the g0 stack, then we use a new exitThread
syscall wrapper that arranges to clear a flag in the M once the stack
can safely be reaped and call the thread termination syscall.
exitThread is based on the existing exit1 wrapper, which was always
meant to terminate the calling thread. However, exit1 has never been
used since it was introduced 9 years ago, so it was broken on several
platforms. exitThread also has the additional complication of having
to flag that the stack is unused, which requires some tricks on
platforms that use the stack for syscalls.
This still leaves the problem of how to reap the unused g0 stacks. For
this, we move the M from allm to a new freem list as part of the M
exiting. Later, allocm scans the freem list, finds Ms that are marked
as done with their stack, removes these from the list and frees their
g0 stacks. This also allows these Ms to be garbage collected.
This CL does not yet use any of this functionality. Follow-up CLs
will. Likewise, there are no new tests in this CL because we'll need
follow-up functionality to test it.
Change-Id: Ic851ee74227b6d39c6fc1219fc71b45d3004bc63
Reviewed-on: https://go-review.googlesource.com/46037
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Add a compiler intrinsic for getcallersp, so we are able to get
rid of the argument (not done in this CL).
Change-Id: Ic38fda1c694f918328659ab44654198fb116668d
Reviewed-on: https://go-review.googlesource.com/69350
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Now that getcallerpc is a compiler intrinsic on x86 and non-x86
platforms don't need the argument, we can drop it.
Sadly, this doesn't let us remove any dummy arguments since all of
those cases also use getcallersp, which still takes the argument
pointer, but this is at least an improvement.
Change-Id: I9c34a41cf2c18cba57f59938390bf9491efb22d2
Reviewed-on: https://go-review.googlesource.com/65474
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
First step towards removing the mandatory argument for
getcallerpc, which solves certain problems for the runtime.
This might also slightly improve performance.
Intrinsic enabled on 386, amd64, amd64p32,
runtime asm implementation removed on those architectures.
Now-superfluous argument remains in getcallerpc signature
(for a future CL; non-386/amd64 asm funcs ignore it).
Added getcallerpc to the "not a real function" test
in dcl.go; that story is a little odd with respect to
unexported functions, but that is not addressed in this CL.
Fixes #17327.
Change-Id: I5df1ad91f27ee9ac1f0dd88fa48f1329d6306c3e
Reviewed-on: https://go-review.googlesource.com/31851
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
The current generator is a simple LFSR, which showed strong
correlation in the higher bits, as manifested by fastrandn().
Replace it with xorshift64+, which is slightly more complex and
has a larger state, but has a period of 2^64-1 and does much
better on statistical tests. The version used here is capable of
passing Diehard and even SmallCrush.
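A self-contained sketch of the generator described here, two 32-bit xorshift sequences added together; the shift triplet follows the runtime's choice at the time, but treat the exact constants as illustrative:
    package main

    import "fmt"

    // fastrand advances the xorshift64+ state (which must not be all zero)
    // and returns the next 32-bit value.
    func fastrand(state *[2]uint32) uint32 {
        s1, s0 := state[0], state[1]
        s1 ^= s1 << 17
        s1 = s1 ^ s0 ^ s1>>7 ^ s0>>16
        state[0], state[1] = s0, s1
        return s0 + s1
    }

    func main() {
        state := [2]uint32{0x12345678, 0x9abcdef0}
        for i := 0; i < 4; i++ {
            fmt.Printf("%#08x\n", fastrand(&state))
        }
    }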
Speed is slightly worse, but the difference is probably insignificant:
name old time/op new time/op delta
Fastrand-4 0.77ns ±12% 0.91ns ±21% +17.31% (p=0.048 n=5+5)
FastrandHashiter-4 13.6ns ±21% 15.2ns ±17% ~ (p=0.160 n=6+5)
Fastrandn/2-4 2.30ns ± 5% 2.45ns ±15% ~ (p=0.222 n=5+5)
Fastrandn/3-4 2.36ns ± 7% 2.45ns ± 6% ~ (p=0.222 n=5+5)
Fastrandn/4-4 2.33ns ± 8% 2.61ns ±30% ~ (p=0.126 n=6+5)
Fastrandn/5-4 2.33ns ± 5% 2.48ns ± 9% ~ (p=0.052 n=6+5)
Fixes #21806
Change-Id: I013bb37b463fdfc229a7f324df8fe2da8d286f33
Reviewed-on: https://go-review.googlesource.com/62530
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
eqstring is only called for strings with equal lengths.
Instead of pushing a pointer and length for each argument string
on the stack we can omit pushing one of the lengths on the stack.
Changing eqstring's signature to eqstring(*uint8, *uint8, int) bool
to implement the above optimization would make it very similar to the
existing memequal(*any, *any, uintptr) bool function.
Since string lengths are positive, we can avoid code redundancy and
use memequal instead of eqstring with an optimized signature.
The go command binary size is reduced by 4128 bytes on amd64.
name old time/op new time/op delta
CompareStringEqual 6.03ns ± 1% 5.71ns ± 1% -5.23% (p=0.000 n=19+18)
CompareStringIdentical 2.88ns ± 1% 3.22ns ± 7% +11.86% (p=0.000 n=20+20)
CompareStringSameLength 4.31ns ± 1% 4.01ns ± 1% -7.17% (p=0.000 n=19+19)
CompareStringDifferentLength 0.29ns ± 2% 0.29ns ± 2% ~ (p=1.000 n=20+20)
CompareStringBigUnaligned 64.3µs ± 2% 64.1µs ± 3% ~ (p=0.164 n=20+19)
CompareStringBig 61.9µs ± 1% 61.6µs ± 2% -0.46% (p=0.033 n=20+19)
Change-Id: Ice15f3b937c981f0d3bc8479a9ea0d10658ac8df
Reviewed-on: https://go-review.googlesource.com/53650
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Change-Id: I1b42fca2107b06e6fc95728f7bf3d08d005c4cb4
Reviewed-on: https://go-review.googlesource.com/55810
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I92cf39a05e738a03d956779d7a1ab1ef8074b2ab
Reviewed-on: https://go-review.googlesource.com/54655
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Intrinsic enabled on all architectures,
runtime asm implementation removed on all architectures.
Fixes #21258
Change-Id: I2cb86d460b497c2f287a5b3df5c37fdb231c23a7
Reviewed-on: https://go-review.googlesource.com/53411
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Although mincore is declared in stubs.go, it isn't used on any OS
except Linux. Move it to os_linux.go and clean up the unused code.
Change-Id: I6cfb0fed85c0317a4d091a2722ac55fa79fc7c9a
Reviewed-on: https://go-review.googlesource.com/54910
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The only non-test user of the assembler prefetch functions is the
heapBits.prefetch function, which is itself unused.
The runtime prefetch functions have no functionality on most platforms
and are not inlineable since they are written in assembler. The function
call overhead eliminates the performance gains that could be achieved with
prefetching, and would degrade performance on platforms where the functions
are no-ops.
If prefetch functions are needed again later, they can be improved
by avoiding the function call overhead and implementing them as intrinsics.
Change-Id: I52c553cf3607ffe09f0441c6e7a0a818cb21117d
Reviewed-on: https://go-review.googlesource.com/44370
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Currently, we mix objects with pointers and objects without pointers
("noscan" objects) together in memory. As a result, for every object
we grey, we have to check that object's heap bits to find out if it's
noscan, which adds to the per-object cost of GC. This also hurts the
TLB footprint of the garbage collector because it decreases the
density of scannable objects at the page level.
This commit improves the situation by using separate spans for noscan
objects. This will allow a much simpler noscan check (in a follow-up
CL), eliminate the need to clear the bitmap of noscan objects (in a
follow-up CL), and improve the TLB footprint by increasing the density
of scannable objects.
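A sketch of how span classes can encode this split (simplified, but following the shape the runtime uses: the size class shifted left one bit, with the low bit meaning noscan):
    package main

    import "fmt"

    type spanClass uint8

    // makeSpanClass packs a size class and a noscan flag into one byte, so
    // "does this span hold pointer-free objects?" becomes a single bit test.
    func makeSpanClass(sizeclass uint8, noscan bool) spanClass {
        sc := spanClass(sizeclass) << 1
        if noscan {
            sc |= 1
        }
        return sc
    }

    func (sc spanClass) sizeclass() int8 { return int8(sc >> 1) }
    func (sc spanClass) noscan() bool    { return sc&1 != 0 }

    func main() {
        sc := makeSpanClass(5, true)
        fmt.Println(sc.sizeclass(), sc.noscan()) // 5 true
    }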
This is also a step toward eliminating dead bits, since the current
noscan check depends on checking the dead bit of the first word.
This has no effect on the heap size of the garbage benchmark.
We'll measure the performance change of this after the follow-up
optimizations.
This is a cherry-pick from dev.garbage commit d491e550c3. The only
non-trivial merge conflict was in updatememstats in mstats.go, where
we now have to separate the per-spanclass stats from the per-sizeclass
stats.
Change-Id: I13bdc4869538ece5649a8d2a41c6605371618e40
Reviewed-on: https://go-review.googlesource.com/41251
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
|
|
This commit reworks multiway select statements to use normal control
flow primitives instead of the previous setjmp/longjmp-like behavior.
This simplifies liveness analysis and should prevent issues around
"returns twice" function calls within SSA passes.
test/live.go is updated because liveness analysis's CFG is more
representative of actual control flow. The case bodies are the only
real successors of the selectgo call, but previously the selectsend,
selectrecv, etc. calls were included in the successors list too.
Updates #19331.
Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83
Reviewed-on: https://go-review.googlesource.com/37661
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Now that we don't rescan stacks, stack barriers are unnecessary. This
removes all of the code and structures supporting them as well as
tests that were specifically for stack barriers.
Updates #17503.
Change-Id: Ia29221730e0f2bbe7beab4fa757f31a032d9690c
Reviewed-on: https://go-review.googlesource.com/36620
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
This occurs a fair amount in the runtime for non-power-of-two n.
Use an alternative, faster formulation.
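The formulation in question maps a full-width random value into [0, n) with a widening multiply and a shift instead of a modulo, avoiding the hardware divide; a sketch using math/rand as a stand-in for the runtime's fastrand:
    package main

    import (
        "fmt"
        "math/rand"
    )

    // fastrandn returns a value in [0, n) using Lemire's multiply-shift
    // reduction: the top 32 bits of a 32x32->64 bit product give the
    // reduced value with no division.
    func fastrandn(n uint32) uint32 {
        return uint32(uint64(rand.Uint32()) * uint64(n) >> 32)
    }

    func main() {
        for i := 0; i < 5; i++ {
            fmt.Println(fastrandn(3)) // always in 0..2
        }
    }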
name old time/op new time/op delta
Fastrandn/2-8 4.45ns ± 2% 2.09ns ± 3% -53.12% (p=0.000 n=14+14)
Fastrandn/3-8 4.78ns ±11% 2.06ns ± 2% -56.94% (p=0.000 n=15+15)
Fastrandn/4-8 4.76ns ± 9% 1.99ns ± 3% -58.28% (p=0.000 n=15+13)
Fastrandn/5-8 4.96ns ±13% 2.03ns ± 6% -59.14% (p=0.000 n=15+15)
name old time/op new time/op delta
SelectUncontended-8 33.7ns ± 2% 33.9ns ± 2% +0.70% (p=0.000 n=49+50)
SelectSyncContended-8 1.68µs ± 4% 1.65µs ± 4% -1.54% (p=0.000 n=50+45)
SelectAsyncContended-8 282ns ± 1% 277ns ± 1% -1.50% (p=0.000 n=48+43)
SelectNonblock-8 5.31ns ± 1% 5.32ns ± 1% ~ (p=0.275 n=45+44)
SelectProdCons-8 585ns ± 3% 577ns ± 2% -1.35% (p=0.000 n=50+50)
GoroutineSelect-8 1.59ms ± 2% 1.59ms ± 1% ~ (p=0.084 n=49+48)
Updates #16213
Change-Id: Ib555a4d7da2042a25c3976f76a436b536487d5b7
Reviewed-on: https://go-review.googlesource.com/36932
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Extend the period of fastrand from (1<<31)-1 to (1<<32)-1 by
choosing a different polynomial and reacting to the high bit before the shift.
The polynomial is taken from 32.dat.gz at
https://users.ece.cmu.edu/~koopman/lfsr/index.html . It is referred to
as F7711115 because that list of polynomials is for an LFSR that shifts
to the right (and fastrand shifts to the left). (The old polynomial is
referred to in 31.dat.gz as 7BB88888.)
There were a couple of places that converted fastrand to int, which
led to negative values on 32-bit platforms. They are fixed.
Change-Id: Ibee518a3f9103e0aea220ada494b3aec77babb72
Reviewed-on: https://go-review.googlesource.com/36875
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
So it can be inlined.
Using bit tricks, it can be implemented without a branch
(improved trick version by Minux Ma).
A simple benchmark shows it is faster on i386 and x86_64, though
I don't know whether it will be faster on other architectures.
benchmark old ns/op new ns/op delta
BenchmarkFastrand-3 2.79 1.48 -46.95%
BenchmarkFastrandHashiter-3 25.9 24.9 -3.86%
Change-Id: Ie2eb6d0f598c0bb5fac7f6ad0f8b5e3eddaa361b
Reviewed-on: https://go-review.googlesource.com/34782
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|