Age | Commit message (Collapse) | Author |
|
expandFinalInlineFrame
This is a follow-up to golang.org/cl/301369, which made the same change
in Frames.Next. The same logic applies here: a profile stack may have
been truncated at an invalid PC provided by cgoTraceback.
expandFinalInlineFrame will then try to lookup the inline tree and
crash.
The same fix applies as well: upon encountering a bad PC, simply leave
it as-is and move on.
For #44971
For #45480
Fixes #45481
Change-Id: I2823c67a1f3425466b05384cc6d30f5fc8ee6ddc
Reviewed-on: https://go-review.googlesource.com/c/go/+/309109
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Trust: Michael Pratt <mpratt@google.com>
(cherry picked from commit aad13cbb749d1e6c085ff0556d306de1a2d5d063)
Reviewed-on: https://go-review.googlesource.com/c/go/+/309550
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
When using cgo, some of the frames can be provided by cgoTraceback, a
cgo-provided function to generate C tracebacks. Unlike Go tracebacks,
cgoTraceback has no particular guarantees that it produces valid
tracebacks.
If one of the (invalid) frames happens to put the PC in the alignment
region at the end of a function (filled with int 3's on amd64), then
Frames.Next will find a valid funcInfo for the PC, but pcdatavalue will
panic because PCDATA doesn't cover this PC.
Tolerate this case by doing a non-strict PCDATA lookup. We'll still show
a bogus frame, but at least avoid throwing.
For #44971
Fixes #45302
Change-Id: I9eed728470d6f264179a7615bd19845c941db78c
Reviewed-on: https://go-review.googlesource.com/c/go/+/301369
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
(cherry picked from commit e4a4161f1f3157550846e1b6bd4fe83aae15778e)
Reviewed-on: https://go-review.googlesource.com/c/go/+/305890
|
|
Change-Id: I04037e13b131e79ebc5af84896bfeda49ddc0eaa
GitHub-Last-Rev: b0d0de930862e4f163e158876cba70d81ed2d52e
GitHub-Pull-Request: golang/go#39500
Reviewed-on: https://go-review.googlesource.com/c/go/+/237220
Reviewed-by: Keith Randall <khr@golang.org>
|
|
On some architectures, for async preemption the injected call
needs to clobber a register (usually REGTMP) in order to return
to the preempted function. As a consequence, the PC ranges where
REGTMP is live are not preemptible.
The uses of REGTMP are usually generated by the assembler, where
it needs to load or materialize a large constant or offset that
doesn't fit into the instruction. In those cases, REGTMP is not
live at the start of the instruction sequence. Instead of giving
up preemption in those cases, we could preempt it and restart the
sequence when resuming the execution. Basically, this is like
reissuing an interrupted instruction, except that here the
"instruction" is a Prog that consists of multiple machine
instructions. For this to work, we need to generate PC data to
mark the start of the Prog.
Currently this is only done for ARM64.
TODO: the split-stack function prologue is currently not async
preemptible. We could use this mechanism, preempt it and restart
at the function entry.
Change-Id: I37cb282f8e606e7ab6f67b3edfdc6063097b4bd1
Reviewed-on: https://go-review.googlesource.com/c/go/+/208126
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
Currently, we emit stack maps and register maps at almost every
instruction. This was originally intended to support non-cooperative
preemption, but was only ever used for debug call injection. Now debug
call injection also uses conservative frame scanning. As a result,
stack maps are only needed at call sites and register maps aren't
needed at all except that we happen to also encode unsafe-point
information in the register map PCDATA stream.
This CL reduces stack maps to only appear at calls, and replace full
register maps with just safe/unsafe-point information.
This is all protected by the go115ReduceLiveness feature flag, which
is defined in both runtime and cmd/compile.
This CL significantly reduces binary sizes and also speeds up compiles
and links:
name old exe-bytes new exe-bytes delta
BinGoSize 15.0MB ± 0% 14.1MB ± 0% -5.72%
name old pcln-bytes new pcln-bytes delta
BinGoSize 3.14MB ± 0% 2.48MB ± 0% -21.08%
name old time/op new time/op delta
Template 178ms ± 7% 172ms ±14% -3.59% (p=0.005 n=19+19)
Unicode 71.0ms ±12% 69.8ms ±10% ~ (p=0.126 n=18+18)
GoTypes 655ms ± 8% 615ms ± 8% -6.11% (p=0.000 n=19+19)
Compiler 3.27s ± 6% 3.15s ± 7% -3.69% (p=0.001 n=20+20)
SSA 7.10s ± 5% 6.85s ± 8% -3.53% (p=0.001 n=19+20)
Flate 124ms ±15% 116ms ±22% -6.57% (p=0.024 n=18+19)
GoParser 156ms ±26% 147ms ±34% ~ (p=0.070 n=19+19)
Reflect 406ms ± 9% 387ms ±21% -4.69% (p=0.028 n=19+20)
Tar 163ms ±15% 162ms ±27% ~ (p=0.370 n=19+19)
XML 223ms ±13% 218ms ±14% ~ (p=0.157 n=20+20)
LinkCompiler 503ms ±21% 484ms ±23% ~ (p=0.072 n=20+20)
ExternalLinkCompiler 1.27s ± 7% 1.22s ± 8% -3.85% (p=0.005 n=20+19)
LinkWithoutDebugCompiler 294ms ±17% 273ms ±11% -7.16% (p=0.001 n=19+18)
(https://perf.golang.org/search?q=upload:20200428.8)
The binary size improvement is even slightly better when you include
the CLs leading up to this. Relative to the parent of "cmd/compile:
mark PanicBounds/Extend as calls":
name old exe-bytes new exe-bytes delta
BinGoSize 15.0MB ± 0% 14.1MB ± 0% -6.18%
name old pcln-bytes new pcln-bytes delta
BinGoSize 3.22MB ± 0% 2.48MB ± 0% -22.92%
(https://perf.golang.org/search?q=upload:20200428.9)
For #36365.
Change-Id: I69448e714f2a44430067ca97f6b78e08c0abed27
Reviewed-on: https://go-review.googlesource.com/c/go/+/230544
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Fixes #37967
Change-Id: I6fc22bdd65f0263d5672731b73d09249201ab0aa
Reviewed-on: https://go-review.googlesource.com/c/go/+/224458
Reviewed-by: Michael Pratt <mpratt@google.com>
|
|
When generating stacks, the runtime automatically expands inline
functions to inline all inline frames in the stack. However, due to the
stack size limit, the final frame may be truncated in the middle of
several inline frames at the same location.
As-is, we assume that the final frame is a normal function, and emit and
cache a Location for it. If we later receive a complete stack frame, we
will first use the cached Location for the inlined function and then
generate a new Location for the "caller" frame, in violation of the
pprof requirement to merge inlined functions into the same Location.
As a result, we:
1. Nondeterministically may generate a profile with the different stacks
combined or split, depending on which is encountered first. This is
particularly problematic when performing a diff of profiles.
2. When split stacks are generated, we lose the inlining information.
We avoid both of these problems by performing a second expansion of the
last stack frame to recover additional inline frames that may have been
lost. This expansion is a bit simpler than the one done by the runtime
because we don't have to handle skipping, and we know that the last
emitted frame is not an elided wrapper, since it by definition is
already included in the stack.
Fixes #37446
Change-Id: If3ca2af25b21d252cf457cc867dd932f107d4c61
Reviewed-on: https://go-review.googlesource.com/c/go/+/221577
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
|
|
On PPC64 when external linking, for large binaries we split the
text section to multiple sections, so the external linking may
insert trampolines between sections. These trampolines are within
the address range covered by the func table, but not known by Go.
This causes runtime.findfunc to return a wrong function if the
given PC is from such trampolines.
In this CL, we generate a marker between text sections where
there could potentially be a hole in the func table. At run time,
we skip the hole if we see such a marker.
Fixes #37216.
Change-Id: I95ab3875a84b357dbaa65a4ed339a19282257ce0
Reviewed-on: https://go-review.googlesource.com/c/go/+/219717
Reviewed-by: David Chase <drchase@google.com>
|
|
This adds support for pausing a running G by sending a signal to its
M.
The main complication is that we want to target a G, but can only send
a signal to an M. Hence, the protocol we use is to simply mark the G
for preemption (which we already do) and send the M a "wake up and
look around" signal. The signal checks if it's running a G with a
preemption request and stops it if so in the same way that stack check
preemptions stop Gs. Since the preemption may fail (the G could be
moved or the signal could arrive at an unsafe point), we keep a count
of the number of received preemption signals. This lets stopG detect
if its request failed and should be retried without an explicit
channel back to suspendG.
For #10958, #24543.
Change-Id: I3e1538d5ea5200aeb434374abb5d5fdc56107e53
Reviewed-on: https://go-review.googlesource.com/c/go/+/201760
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This adds support for scanning the stack when a goroutine is stopped
at an async safe point. This is not yet lit up because asyncPreempt is
not yet injected, but prepares us for that.
This works by conservatively scanning the registers dumped in the
frame of asyncPreempt and its parent frame, which was stopped at an
asynchronous safe point.
Conservative scanning works by only marking words that are pointers to
valid, allocated heap objects. One complication is pointers to stack
objects. In this case, we can't determine if the stack object is still
"allocated" or if it was freed by an earlier GC. Hence, we need to
propagate the conservative-ness of scanning stack objects: if all
pointers found to a stack object were found via conservative scanning,
then the stack object itself needs to be scanned conservatively, since
its pointers may point to dead objects.
For #10958, #24543.
Change-Id: I7ff84b058c37cde3de8a982da07002eaba126fd6
Reviewed-on: https://go-review.googlesource.com/c/go/+/201761
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
With normal (stack-allocated) defers only: 35.4 ns/op
With open-coded defers: 5.6 ns/op
Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
Without open-coded defers: 62.0 ns/op
With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
Without open-coded defers: 443 ns/op
With open-coded defers: 347 ns/op
Updates #14939 (defer performance)
Updates #34481 (design doc)
Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/202340
Reviewed-by: Austin Clements <austin@google.com>
|
|
code and extra funcdata"
This reverts CL 190098.
Reason for revert: broke several builders.
Change-Id: I69161352f9ded02537d8815f259c4d391edd9220
Reviewed-on: https://go-review.googlesource.com/c/go/+/201519
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Dan Scales <danscales@google.com>
|
|
extra funcdata
Generate inline code at defer time to save the args of defer calls to unique
(autotmp) stack slots, and generate inline code at exit time to check which defer
calls were made and make the associated function/method/interface calls. We
remember that a particular defer statement was reached by storing in the deferBits
variable (always stored on the stack). At exit time, we check the bits of the
deferBits variable to determine which defer function calls to make (in reverse
order). These low-cost defers are only used for functions where no defers
appear in loops. In addition, we don't do these low-cost defers if there are too
many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra
FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and
for each defer, the stack slots where the closure and associated args have been
stored. The funcdata also includes the location of the deferBits variable.
Therefore, for panics, we can use this funcdata to determine exactly which defers
are active, and call the appropriate functions/methods/closures with the correct
arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra
code segment to functions with open-coded defers that simply calls deferreturn()
and returns. This segment is not reachable by the normal function, but is returned
to by the runtime during recovery. We set the liveness information of this
deferreturn() to be the same as the liveness at the first function call during the
last defer exit code (so all return values and all stack slots needed by the defer
calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small
amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer
being used at each defer statement (heap-allocated, stack-allocated, or
open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
With normal (stack-allocated) defers only: 35.4 ns/op
With open-coded defers: 5.6 ns/op
Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use
open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic
processing now requires a scan of the stack for open-coded defer frames. This scan
is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
Without open-coded defers: 62.0 ns/op
With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
Without open-coded defers: 443 ns/op
With open-coded defers: 347 ns/op
Updates #14939 (defer performance)
Updates #34481 (design doc)
Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28
Reviewed-on: https://go-review.googlesource.com/c/go/+/190098
Reviewed-by: Austin Clements <austin@google.com>
|
|
An extra goroutine is necessary to handle asynchronous events on wasm.
However, we do not want this goroutine to exist all the time.
This change makes it short-lived, so it ends after the asynchronous
event was handled.
Fixes #34768
Change-Id: I24626ff0af9d803a01ebe33fbb584d04d2059a44
Reviewed-on: https://go-review.googlesource.com/c/go/+/200497
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
On wasm there is a special goroutine that handles asynchronous events.
Blocking this goroutine often causes a deadlock. However, the stack
trace of this goroutine was omitted when printing the deadlock error.
This change adds an exception so the goroutine is not considered as
an internal system goroutine and the stack trace gets printed, which
helps with debugging the deadlock.
Updates #32764
Change-Id: Icc8f5ba3ca5a485d557b7bdd76bf2f1ffb92eb3e
Reviewed-on: https://go-review.googlesource.com/c/go/+/199537
Run-TryBot: Richard Musiol <neelance@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Upgrade the thread sanitizer to handle mid-stack inlining correctly.
We can now return multiple stack frames for each pc that the thread sanitizer
gives us to symbolize.
To fix #33309, we still need to modify the tsan library with its portion
of this fix, rebuild the .syso files on all supported archs, and check
them into runtime/race.
Update #33309
Change-Id: I340013631ffc8428043ab7efe3a41b6bf5638eaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/195781
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
|
|
That way we will never have to look up the file/line for the frame
that's next to be returned when the user stops calling Next.
For the benchmark from #32093:
name old time/op new time/op delta
Helper-4 948ns ± 1% 836ns ± 3% -11.89% (p=0.000 n=9+9)
(#32093 was fixed with a more specific, and better, fix, but this
fix is much more general.)
Change-Id: I89e796f80c9706706d8d8b30eb14be3a8a442846
Reviewed-on: https://go-review.googlesource.com/c/go/+/178077
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The pclntab encoding supports writing only some PCDATA and FUNCDATA values.
However, the encoding is dense: The max index in use determines the space used.
We should thus choose a numbering in which frequently used indices are smaller.
This change re-orders the PCDATA and FUNCDATA indices using that principle,
using a quick and dirty instrumentation to measure index frequency.
It shrinks binaries by about 0.5%.
Updates #6853
file before after Δ %
go 14745044 14671316 -73728 -0.500%
addr2line 4305128 4280552 -24576 -0.571%
api 6095800 6058936 -36864 -0.605%
asm 4930928 4906352 -24576 -0.498%
buildid 2881520 2861040 -20480 -0.711%
cgo 4896584 4867912 -28672 -0.586%
compile 25868408 25770104 -98304 -0.380%
cover 5319656 5286888 -32768 -0.616%
dist 3654528 3634048 -20480 -0.560%
doc 4719672 4691000 -28672 -0.607%
fix 3418312 3393736 -24576 -0.719%
link 6137952 6109280 -28672 -0.467%
nm 4250536 4225960 -24576 -0.578%
objdump 4665192 4636520 -28672 -0.615%
pack 2297488 2285200 -12288 -0.535%
pprof 14735332 14657508 -77824 -0.528%
test2json 2834952 2818568 -16384 -0.578%
trace 11679964 11618524 -61440 -0.526%
vet 8452696 8403544 -49152 -0.581%
Change-Id: I30665dce57ec7a52e7d3c6718560b3aa5b83dd0b
Reviewed-on: https://go-review.googlesource.com/c/go/+/171760
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
As .init_array section aren't available on AIX, the Go runtime
initialization is made with gcc constructor attribute.
However, as cgo tool is building a binary in order to get imported
C symbols, Go symbols imported for this initilization must be ignored.
-Wl,-berok is mandatory otherwize ld will fail to create this binary,
_rt0_aix_ppc64_lib and runtime_rt0_go aren't defined in runtime/cgo.
These two symbols must also be ignored when creating _cgo_import.go.
Change-Id: Icf2e0282f5b50de5fa82007439a428e6147efef1
Reviewed-on: https://go-review.googlesource.com/c/go/+/169118
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I904b8655f21743189814bccf24073b6fbb9fc56d
GitHub-Last-Rev: b032c14394c949f9ad7b18d019a3979d38d4e1fb
GitHub-Pull-Request: golang/go#29997
Reviewed-on: https://go-review.googlesource.com/c/160421
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Reuse the strict mechanism from FileLine for FuncForPC, so we don't
crash when asking the pcln table about bad pcs.
Fixes #29735
Change-Id: Iaffb32498b8586ecf4eae03823e8aecef841aa68
Reviewed-on: https://go-review.googlesource.com/c/157799
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Returning the innermost frame instead of the outermost
makes code that walks the results of runtime.Caller{,s}
still work correctly in the presence of mid-stack inlining.
Fixes #29582
Change-Id: I2392e3dd5636eb8c6f58620a61cef2194fe660a7
Reviewed-on: https://go-review.googlesource.com/c/156364
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
In 1.11 we stored "return addresses" in the result of runtime.Callers.
I changed that behavior in CL 152537 to store an address in the call
instruction itself. This CL reverts that part of 152537.
The change in 152537 was made because we now store pcs of inline marks
in the result of runtime.Callers as well. This CL will now store the
address of the inline mark + 1 in the results of runtime.Callers, so
that the subsequent -1 done in CallersFrames will pick out the correct
inline mark instruction.
This CL means that the results of runtime.Callers can be passed to
runtime.FuncForPC as they were before. There are a bunch of packages
in the wild that take the results of runtime.Callers, subtract 1, and
then call FuncForPC. This CL keeps that pattern working as it did in
1.11.
The changes to runtime/pprof in this CL are exactly a revert of the
changes to that package in 152537 (except the locForPC comment).
Update #29582
Change-Id: I04d232000fb482f0f0ff6277f8d7b9c72e97eb48
Reviewed-on: https://go-review.googlesource.com/c/156657
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Work involved in getting a stack trace is divided between
runtime.Callers and runtime.CallersFrames.
Before this CL, runtime.Callers returns a pc per runtime frame.
runtime.CallersFrames is responsible for expanding a runtime frame
into potentially multiple user frames.
After this CL, runtime.Callers returns a pc per user frame.
runtime.CallersFrames just maps those to user frame info.
Entries in the result of runtime.Callers are now pcs
of the calls (or of the inline marks), not of the instruction
just after the call.
Fixes #29007
Fixes #28640
Update #26320
Change-Id: I1c9567596ff73dc73271311005097a9188c3406f
Reviewed-on: https://go-review.googlesource.com/c/152537
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This change introduces two optimizations together,
one for recursive and one for non-recursive stacks.
For recursive stacks, we introduce the new entry
at the beginning of the cache, so it can be found first.
This adds an extra read and write.
While we're here, switch from fastrandn, which does a multiply,
to fastrand % n, which does a shift.
For non-recursive stacks, split the cache from [16]pcvalueCacheEnt
into [2][8]pcvalueCacheEnt, and add a very cheap associative lookup.
name old time/op new time/op delta
StackCopyPtr-8 118ms ± 1% 106ms ± 2% -9.56% (p=0.000 n=17+18)
StackCopy-8 95.8ms ± 1% 87.0ms ± 3% -9.11% (p=0.000 n=19+20)
StackCopyNoCache-8 135ms ± 2% 139ms ± 1% +3.06% (p=0.000 n=19+18)
During make.bash, the association function used has this return distribution:
percent count return value
53.23% 678797 1
46.74% 596094 0
It is definitely not perfect, but it is pretty good,
and that's all we need.
Change-Id: I2cabb1d26b99c5111bc28f427016a2a5e6c620fd
Reviewed-on: https://go-review.googlesource.com/c/110564
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
When a function triggers a signal (like a segfault which translates to
a nil pointer exception) during execution, a sigpanic handler is just
below it on the stack. The function itself did not stop at a
safepoint, so we have to figure out what safepoint we should use to
scan its stack frame.
Previously we used the site of the most recent defer to get the live
variables at the signal site. That answer is not quite correct, as
explained in #27518. Instead, use the site of a deferreturn call.
It has all the right variables marked as live (no args, all the return
values, except those that escape to the heap, in which case the
corresponding PAUTOHEAP variables will be live instead).
This CL requires stack objects, so that all the local variables
and args referenced by the deferred closures keep the right variables alive.
Fixes #27518
Change-Id: Id45d8a8666759986c203181090b962e2981e48ca
Reviewed-on: https://go-review.googlesource.com/c/134637
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Rework how the compiler+runtime handles stack-allocated variables
whose address is taken.
Direct references to such variables work as before. References through
pointers, however, use a new mechanism. The new mechanism is more
precise than the old "ambiguously live" mechanism. It computes liveness
at runtime based on the actual references among objects on the stack.
Each function records all of its address-taken objects in a FUNCDATA.
These are called "stack objects". The runtime then uses that
information while scanning a stack to find all of the stack objects on
a stack. It then does a mark phase on the stack objects, using all the
pointers found on the stack (and ancillary structures, like defer
records) as the root set. Only stack objects which are found to be
live during this mark phase will be scanned and thus retain any heap
objects they point to.
A subsequent CL will remove all the "ambiguously live" logic from
the compiler, so that the stack object tracing will be required.
For this CL, the stack tracing is all redundant with the current
ambiguously live logic.
Update #22350
Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f
Reviewed-on: https://go-review.googlesource.com/c/134155
Reviewed-by: Austin Clements <austin@google.com>
|
|
This adds a mechanism for debuggers to safely inject calls to Go
functions on amd64. Debuggers must participate in a protocol with the
runtime, and need to know how to lay out a call frame, but the runtime
support takes care of the details of handling live pointers in
registers, stack growth, and detecting the trickier conditions when it
is unsafe to inject a user function call.
Fixes #21678.
Updates derekparker/delve#119.
Change-Id: I56d8ca67700f1f77e19d89e7fc92ab337b228834
Reviewed-on: https://go-review.googlesource.com/109699
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This adds FUNCDATA and PCDATA that records the register maps much like
the existing live arguments maps and live locals maps. The register
map is indexed independently from the argument and locals maps since
changes in register liveness tend not to correlate with changes to
argument and local liveness.
This is the final CL toward adding safe-points everywhere. The
following CLs will optimize liveness analysis to bring down the cost.
The effect of this CL is:
name old time/op new time/op delta
Template 195ms ± 2% 197ms ± 1% ~ (p=0.136 n=9+9)
Unicode 98.4ms ± 2% 99.7ms ± 1% +1.39% (p=0.004 n=10+10)
GoTypes 685ms ± 1% 700ms ± 1% +2.06% (p=0.000 n=9+9)
Compiler 3.28s ± 1% 3.34s ± 0% +1.71% (p=0.000 n=9+8)
SSA 7.79s ± 1% 7.91s ± 1% +1.55% (p=0.000 n=10+9)
Flate 133ms ± 2% 133ms ± 2% ~ (p=0.190 n=10+10)
GoParser 161ms ± 2% 164ms ± 3% +1.83% (p=0.015 n=10+10)
Reflect 450ms ± 1% 457ms ± 1% +1.62% (p=0.000 n=10+10)
Tar 183ms ± 2% 185ms ± 1% +0.91% (p=0.008 n=9+10)
XML 234ms ± 1% 238ms ± 1% +1.60% (p=0.000 n=9+9)
[Geo mean] 411ms 417ms +1.40%
name old exe-bytes new exe-bytes delta
HelloSize 1.47M ± 0% 1.51M ± 0% +2.79% (p=0.000 n=10+10)
Compared to just before "cmd/internal/obj: consolidate emitting entry
stack map", the cumulative effect of adding stack maps everywhere and
register maps is:
name old time/op new time/op delta
Template 185ms ± 2% 197ms ± 1% +6.42% (p=0.000 n=10+9)
Unicode 96.3ms ± 3% 99.7ms ± 1% +3.60% (p=0.000 n=10+10)
GoTypes 658ms ± 0% 700ms ± 1% +6.37% (p=0.000 n=10+9)
Compiler 3.14s ± 1% 3.34s ± 0% +6.53% (p=0.000 n=9+8)
SSA 7.41s ± 2% 7.91s ± 1% +6.71% (p=0.000 n=9+9)
Flate 126ms ± 1% 133ms ± 2% +6.15% (p=0.000 n=10+10)
GoParser 153ms ± 1% 164ms ± 3% +6.89% (p=0.000 n=10+10)
Reflect 437ms ± 1% 457ms ± 1% +4.59% (p=0.000 n=10+10)
Tar 178ms ± 1% 185ms ± 1% +4.18% (p=0.000 n=10+10)
XML 223ms ± 1% 238ms ± 1% +6.39% (p=0.000 n=10+9)
[Geo mean] 394ms 417ms +5.78%
name old alloc/op new alloc/op delta
Template 34.5MB ± 0% 38.0MB ± 0% +10.19% (p=0.000 n=10+10)
Unicode 29.3MB ± 0% 30.3MB ± 0% +3.56% (p=0.000 n=8+9)
GoTypes 113MB ± 0% 125MB ± 0% +10.89% (p=0.000 n=10+10)
Compiler 510MB ± 0% 575MB ± 0% +12.79% (p=0.000 n=10+10)
SSA 1.46GB ± 0% 1.64GB ± 0% +12.40% (p=0.000 n=10+10)
Flate 23.9MB ± 0% 25.9MB ± 0% +8.56% (p=0.000 n=10+10)
GoParser 28.0MB ± 0% 30.8MB ± 0% +10.08% (p=0.000 n=10+10)
Reflect 77.6MB ± 0% 84.3MB ± 0% +8.63% (p=0.000 n=10+10)
Tar 34.1MB ± 0% 37.0MB ± 0% +8.44% (p=0.000 n=10+10)
XML 42.7MB ± 0% 47.2MB ± 0% +10.75% (p=0.000 n=10+10)
[Geo mean] 76.0MB 83.3MB +9.60%
name old allocs/op new allocs/op delta
Template 321k ± 0% 337k ± 0% +4.98% (p=0.000 n=10+10)
Unicode 337k ± 0% 340k ± 0% +1.04% (p=0.000 n=10+9)
GoTypes 1.13M ± 0% 1.18M ± 0% +4.85% (p=0.000 n=10+10)
Compiler 4.67M ± 0% 4.96M ± 0% +6.25% (p=0.000 n=10+10)
SSA 11.7M ± 0% 12.3M ± 0% +5.69% (p=0.000 n=10+10)
Flate 216k ± 0% 226k ± 0% +4.52% (p=0.000 n=10+9)
GoParser 271k ± 0% 283k ± 0% +4.52% (p=0.000 n=10+10)
Reflect 927k ± 0% 972k ± 0% +4.78% (p=0.000 n=10+10)
Tar 318k ± 0% 333k ± 0% +4.56% (p=0.000 n=10+10)
XML 376k ± 0% 395k ± 0% +5.04% (p=0.000 n=10+10)
[Geo mean] 730k 764k +4.61%
name old exe-bytes new exe-bytes delta
HelloSize 1.46M ± 0% 1.51M ± 0% +3.66% (p=0.000 n=10+10)
For #24543.
Change-Id: I91e003dc64151916b384274884bf02a2d6862547
Reviewed-on: https://go-review.googlesource.com/109353
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Currently isSystemGoroutine has a hard-coded list of known entry
points into system goroutines. This list is annoying to maintain. For
example, it's missing the ensureSigM goroutine.
Replace it with a check that simply looks for any goroutine with
runtime function as its entry point, with a few exceptions. This also
matches the definition recently added to the trace viewer (CL 81315).
Change-Id: Iaed723d4a6e8c2ffb7c0c48fbac1688b00b30f01
Reviewed-on: https://go-review.googlesource.com/81655
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Also do very minor code cleanup.
name old time/op new time/op delta
StackCopyPtr-8 84.8ms ± 6% 82.9ms ± 5% -2.19% (p=0.000 n=95+94)
StackCopy-8 68.4ms ± 5% 65.3ms ± 4% -4.54% (p=0.000 n=99+99)
StackCopyNoCache-8 107ms ± 2% 105ms ± 2% -2.13% (p=0.000 n=91+95)
Change-Id: I2d85ede48bffada9584d437a08a82212c0da6d00
Reviewed-on: https://go-review.googlesource.com/109001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
When there are plugins, there may not be a unique copy of runtime
functions like goexit, mcall, etc. So identifying them by entry
address is problematic. Instead, keep track of each special function
using a field in the symbol table. That way, multiple copies of
the same runtime function will be treated identically.
Fixes #24351
Fixes #23133
Change-Id: Iea3232df8a6af68509769d9ca618f530cc0f84fd
Reviewed-on: https://go-review.googlesource.com/100739
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
Change-Id: I07a1eb02ffc621c5696b49491181300bf411f822
Reviewed-on: https://go-review.googlesource.com/96475
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Remove a branch and a stack spill.
name old time/op new time/op delta
StackCopy-8 79.2ms ± 1% 79.1ms ± 2% ~ (p=0.063 n=96+95)
StackCopyNoCache-8 121ms ± 1% 120ms ± 2% -0.46% (p=0.000 n=97+88)
Change-Id: Ifcbbb05d773178fad84cb11a9a6768ace69fcf24
Reviewed-on: https://go-review.googlesource.com/94029
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts commit 08f19bbde1b01227fdc2fa2d326e4029bb74dd96.
Reason for revert:
The changed transformation takes effect on a larger set
of code snippets than expected.
For example, this:
func foo() {
// Comment
bar()
}
becomes:
func foo() {
// Comment
bar()
}
This is an unintended consequence.
Change-Id: Ifca88d6267dab8a8170791f7205124712bf8ace8
Reviewed-on: https://go-review.googlesource.com/81335
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Joe Tsai <joetsai@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
CL 45412 started hiding autogenerated wrapper functions from call
stacks so that call stack semantics better matched language semantics.
This is based on the theory that the wrapper function will call the
"real" function and all the programmer knows about is the real
function.
However, this theory breaks down in two cases:
1. If the wrapper is at the top of the stack, then it didn't call
anything. This can happen, for example, if the "stack" was actually
synthesized by the user.
2. If the wrapper panics, for example by calling panicwrap or by
dereferencing a nil pointer, then it didn't call the wrapped
function and the user needs to see what panicked, even if we can't
attribute it nicely.
This commit modifies the traceback logic to include the wrapper
function in both of these cases.
Fixes #22231.
Change-Id: I6e4339a652f73038bd8331884320f0b8edd86eb1
Reviewed-on: https://go-review.googlesource.com/76770
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
To improve readability when exported fields are removed,
forbid the printer from emitting an empty line before the first comment
in a const, var, or type block.
Also, when printing the "Has filtered or unexported fields." message,
add an empty line before it to separate the message from the struct
or interfact contents.
Before the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
After the change:
<<<
type NamedArg struct {
// Name is the name of the parameter placeholder.
//
// If empty, the ordinal position in the argument list will be
// used.
//
// Name must omit any symbol prefix.
Name string
// Value is the value of the parameter.
// It may be assigned the same value types as the query
// arguments.
Value interface{}
// contains filtered or unexported fields
}
>>>
Fixes #18264
Change-Id: I9fe17ca39cf92fcdfea55064bd2eaa784ce48c88
Reviewed-on: https://go-review.googlesource.com/71990
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
|
|
Currently we look to see if the main.main symbol address is in the
module data text range. This requires access to the main.main
symbol, which usually the runtime has, but does not when building
a plugin.
To avoid a dynamic relocation to main.main (which I haven't worked
out how to have the linker generate on darwin), stop using the
symbol. Instead record a boolean in the moduledata if the module
has the main function.
Fixes #22175
Change-Id: If313a118f17ab499d0a760bbc2519771ed654530
Reviewed-on: https://go-review.googlesource.com/69370
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
No need to type this global as an unsafe.Pointer, we know
what type the referent is.
Change-Id: I7b1374065b53ccf1373754a21d54adbedf1fd587
Reviewed-on: https://go-review.googlesource.com/67990
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The compiler generates wrapper methods to forward interface method
calls (which are always pointer-based) to value methods. These
wrappers appear in the call stack even though they are an
implementation detail. This leaves ugly "<autogenerated>" functions in
stack traces and can throw off skip counts for stack traces.
Fix this by considering these runtime frames in printed stack traces
so they will only be printed if runtime frames are being printed, and
by eliding them from the call stack expansion used by CallersFrames
and Caller.
This removes the test for issue 4388 since that was checking that
"<autogenerated>" appeared in the stack trace instead of something
even weirder. We replace it with various runtime package tests.
Fixes #16723.
Change-Id: Ice3f118c66f254bb71478a664d62ab3fc7125819
Reviewed-on: https://go-review.googlesource.com/45412
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
1. remove broken verification
The runtime check assumes that no-pcln symbol entry have zero value,
but the linker emit no entries if the symbol is no-pcln.
As a result, if there are no-pcln symbols at the very end of pcln
table, it will panic.
2. correct condition of export
Handle special chracters in pluginpath correcty.
Export "go.itab.*", so different plugins can share the same itab.
Fixes #18190
Change-Id: Ia4f9c51d83ce8488a9470520f1ee9432802cfc1d
Reviewed-on: https://go-review.googlesource.com/61091
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Along the way, track bad modules. Make sure they don't end up on
the active modules list, and aren't accidentally reprocessed as
new plugins.
Fixes #19004
Change-Id: I8a5e7bb11f572f7b657a97d521a7f84822a35c07
Reviewed-on: https://go-review.googlesource.com/61171
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The activeModules function is called by the cgo pointer checking code,
which is called by the write barrier (when GODEBUG=cgocheck=2), and as
such must be nosplit/nowritebarrier.
Fixes #21306
Change-Id: I57f2124f14de7f3872b2de9532abab15df95d45a
Reviewed-on: https://go-review.googlesource.com/53352
Reviewed-by: Austin Clements <austin@google.com>
|
|
kicking off contributing again with a classic
Change-Id: Ifb0aed8f1dc854f85751ce0495967a3c4315128d
Reviewed-on: https://go-review.googlesource.com/49016
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
The current description refers to the outermost "frame" which can be
misleading. A user reading it can think it means a stack frame.
Change-Id: Ie2c7cb4b4db8f41572df206478ce3b46a0245a5d
Reviewed-on: https://go-review.googlesource.com/47850
Reviewed-by: Austin Clements <austin@google.com>
|
|
Change-Id: I1c02aa4f7131ae984fda66b32e8a993c0a40b8f4
Reviewed-on: https://go-review.googlesource.com/47690
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
The Func type has allowed calling the Func.Name method on a nil pointer
since Go1.2, where it returned an empty string. A regression caused by
CL/37331 caused this behavior to change. This breaks code that lazily
does runtime.FuncForPC(myPtr).Name() without first checking that myPtr
is actually non-nil.
Fixes #20872
Change-Id: Iae9a2ebabca5e9d1f5a2cdaf2f30e9c6198fec4f
Reviewed-on: https://go-review.googlesource.com/47354
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
I was surprised to see readvarint show up in a cpu profile.
Use a few simple optimizations to speed up stack copying:
* Avoid making a copy of the cache.entries array or any of its elements.
* Use a shift instead of a signed division in stackmapdata.
* Change readvarint to return the number of bytes consumed
rather than an updated slice.
* Make some minor optimizations to readvarint to help the compiler.
* Avoid called readvarint when the value fits in a single byte.
The first and last optimizations are the most significant,
although they all contribute a little.
Add a benchmark for stack copying that includes lots of different
functions in a recursive loop, to bust the cache.
This might speed up other runtime operations as well;
I only benchmarked stack copying.
name old time/op new time/op delta
StackCopy-8 96.4ms ± 2% 82.7ms ± 1% -14.24% (p=0.000 n=20+19)
StackCopyNoCache-8 167ms ± 1% 131ms ± 1% -21.58% (p=0.000 n=20+20)
Change-Id: I13d5c455c65073c73b656acad86cf8e8e3c9807b
Reviewed-on: https://go-review.googlesource.com/43150
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
|
|
In particular, this says that Frames.Function uniquely identifies a
function within a program. We depend on this in various places that
use runtime.Frames in std, but it wasn't actually written down.
Change-Id: Ie7ede348c17673e11ae513a094862b60c506abc5
Reviewed-on: https://go-review.googlesource.com/41610
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
The Frames API forces the PC slice to escape to the heap because it
stores it in the Frames object. However, we'd like to use this API for
call stack expansion internally in the runtime in places where it
would be very good to avoid heap allocation.
This commit makes this possible by pulling the bulk of the Frames
implementation into an internal frameExpander API. The key difference
between these APIs is that the frameExpander does not hold the PC
slice; instead, the caller is responsible for threading the PC slice
through the frameExpander API calls. This makes it possible to keep
the PC slice on the stack. The Frames API then becomes a thin shim
around the frameExpander that keeps the PC slice in the Frames object.
Change-Id: If6b2d0b9132a2a905a0cf5deced9feddce76fc0e
Reviewed-on: https://go-review.googlesource.com/40610
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Lazar <lazard@golang.org>
|