Age | Commit message (Collapse) | Author |
|
This patch fixes a bug in the SSA back end's DWARF generation code
that determines variable locations / lifetimes.
The code in question was written to handle sequences of initial
pseudo-ops (zero width instructions such as OpPhi, OpArg, etc) in a
basic block, detecting these ops at the start of a block and then
treating the values specially when emitting ranges for the variables
in those values. The logic in this code wasn't quite correct, meaning
that a flag variable wasn't being set properly to record the presence
of a block of zero-width value-bearing ops, leading to incorrect or
missing DWARF locations for register params.
Also in this patch is a tweak to some sanity-checking code intended to
catch scheduling problems with OpArg/OpPhi etc. The checks need to
allow for the possibility of an Arg op scheduled after a spill of an
incoming register param inserted by the register allocator. Example:
b1:
v13 = ArgIntReg <int> {p1+16} [2] : CX
v14 = ArgIntReg <int> {p2+16} [5] : R8
v38 = ArgIntReg <int> {p3+16} [8] : R11
v35 = ArgIntReg <int> {p1+0} [0] : AX
v15 = StoreReg <int> v35 : .autotmp_4[int]
v40 = Arg <int> {p4} [16] : p4+16[int]
v1 = InitMem <mem>
v3 = SB <uintptr> : SB
v18 = CMPQ <flags> v14 v13
NE v18 → b3 b2 (unlikely) (18)
Here the register allocator has decided to spill v35, meaning that the
OpArg v40 is no longer going to be positioned prior to all other
non-zero-width ops; this is a valid scenario and needs to be handled
properly by the debug code.
Fixes #46425.
Change-Id: I239b3ad56a9c1b8ebf68af42e1f57308293ed7e6
Reviewed-on: https://go-review.googlesource.com/c/go/+/332269
Trust: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
During DWARF debug location generation, as a preamble to the main data
flow analysis, examine the function entry block to look for in-params
arriving in registers that are partially or completely dead, and
insert new OpArg{Int,Float}Reg values for the dead or partially-dead
pieces. In addition, add entries to the f.NamedValues table for
incoming live register-resident params that don't already have
entries. This helps create better/saner DWARF location expressions for
params. Example:
func foo(s string, used int, notused int) int {
return len(s) + used
}
When optimization is complete for this function, the parameter
"notused" is completely dead, meaning that there is no entry for it in
the f.NamedValues table (which then means we don't emit a DWARF
variable location expression for it in the function enty block). In
addition, since only the length field of "s" is used, there is no
DWARF location expression for the other component of "s", leading to
degraded DWARF.
There are still problems/issues with DWARF location generation, but
this does improve things with respect to being able to print the
values of incoming parameters when stopped in the debugger at the
entry point of a function (when optimization is enabled).
Updates #40724.
Change-Id: I5bb5253648942f9fd33b081fe1a5a36208e75785
Reviewed-on: https://go-review.googlesource.com/c/go/+/322631
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
steals idea from CL 312093
further investigation revealed additional duplicate
slots (equivalent, but not equal), so delete those too.
Rearranged Func.Names to be addresses of slots,
create canonical addresses so that split slots
(which use those addresses to refer to their parent,
and split slots can be further split)
will preserve "equivalent slots are equal".
Removes duplicates, improves metrics for "args at entry".
Change-Id: I5bbdcb50bd33655abcab3d27ad8cdce25499faaf
Reviewed-on: https://go-review.googlesource.com/c/go/+/312292
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
|
|
Add some rudimentary debug trace output for -N location list
generation if "-d=ssa/locationlists" is set.
Updates #45948.
Change-Id: If1a95730538a6e7def7ebe1ece1a71da8e5f0975
Reviewed-on: https://go-review.googlesource.com/c/go/+/317089
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
The code that created DWARF debug var locations for input parameters
in the non-optimized case for regabi was not doing the right thing for
degenerate functions with infinite loops. Detect these cases and don't
try to emit the normal location data.
Fixes #45948.
Change-Id: I2717fc4bac2e03d5d850a6ec8a09ed05fed0c896
Reviewed-on: https://go-review.googlesource.com/c/go/+/316752
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
|
|
When constructing multi-piece DWARF location expressions for
struct-typed parameters using the register ABI, make sure that the
location expressions generated properly reflect padding between
elements (this is required by debuggers). Example:
type small struct { x uint16 ; y uint8 ; z int32 }
func ABC(p1 int, p2 small, f1 float32) {
...
In the DWARF location expression for "p2" on entry to the routine, we
need pieces for each field, but for debuggers (such as GDB) to work
properly, we also need to describe the padding between elements. Thus
instead of
<rbx> DW_OP_piece 2 <rcx> DW_OP_piece 1 <rdi> DW_OP_piece 4
we need to emit
<rbx> DW_OP_piece 2 <rcx> DW_OP_piece 1 DW_OP_piece 1 <rdi> DW_OP_piece 4
This patch adds a new helper routine in abiutils to compute the
correct padding amounts for a struct type, a unit test for the helper,
and updates the debug generation code to call the helper and insert
apadding "piece" ops in the right spots.
Updates #40724.
Updates #45720.
Change-Id: Ie208bee25776b9eb70642041869e65e4fa65a005
Reviewed-on: https://go-review.googlesource.com/c/go/+/315071
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Revise the code that generates DWARF location expressions for input
parameters to get it to work properly with the new register ABI when
optimization is turned off.
The previously implementation assumed stack locations for all
input+output parameters when -N (disable optimization) was in effect.
In the new implementation, a register-resident input parameter is
given a 2-element location list, the first list element pointing to
the ABI register(s) containing the param, and the second element
pointing to the stack home once it has been spilled.
NB, this change fixes a bunch of the Delve pkg/proc unit tests (maybe
about half of the outstanding failures). Still a good number that need
to be investigated, however.
Updates #40724.
Updates #45720.
Change-Id: I743bbb9af187bcdebeb8e690fdd6db58094ca415
Reviewed-on: https://go-review.googlesource.com/c/go/+/314431
Trust: Than McIntosh <thanm@google.com>
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
The SSA code for debug variable location analysis (for DWARF) has two
special 'sentinel' values that it uses to handshake with the
debugInfo.GetPC callback when capturing the PC values of debug
variable ranges after prog generatoin: "BlockStart" and "BlockEnd".
"BlockStart" has the expected semantics: it means "the PC value of the
first instruction of block B", but "BlockEnd" does not mean "PC value
of the last instruction of block B", but rather it is implemented as
"the PC value of the last instruction of the function". This causes
confusion when reading the code, and seems to to result in implementation
flaws in the past, leading to incorrect ranges in some cases.
To help with this, add a new sentinel "FuncEnd" (which has the "last
inst in the function" semantics) and change the implementation of
"BlockEnd" to actually mean what its name implies (last inst in
block).
Updates #45720.
Change-Id: Ic3497fb60413e898d2bfe27805c3db56483d12a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/314930
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
Morestack works for non-pointer register parameters
Within a function body, pointer-typed parameters are correctly
tracked.
Results still not hooked up.
For #40724.
Change-Id: Icaee0b51d0da54af983662d945d939b756088746
Reviewed-on: https://go-review.googlesource.com/c/go/+/294410
Trust: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This commit adds exactly two "n := n.(*ir.Name)" statements, that are
each immediately preceded by a "case ir.ONAME:" clause in an n.Op()
switch. The rest of the changes are simply replacing "ir.Node" to
"*ir.Name" and removing now unnecessary "n.(*ir.Name)" type
assertions, exposing the latent typing details.
Passes buildall w/ toolstash -cmp.
Updates #42982.
Change-Id: I8ea3bbb7ddf0c7192245cafa49a19c0e7a556a39
Reviewed-on: https://go-review.googlesource.com/c/go/+/275791
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
|
|
Now that the only remaining ir.Node implementation that is stored
(directly) into ssa.Aux, we can rewrite all of the conversions between
ir.Node and ssa.Aux to use *ir.Name instead.
rf doesn't have a way to rewrite the type switch case clauses, so we
just use sed instead. There's only a handful, and they're the only
times that "case ir.Node" appears anyway.
The next CL will move the tag method declarations so that ir.Node no
longer implements ssa.Aux.
Passes buildall w/ toolstash -cmp.
Updates #42982.
[git-generate]
cd src/cmd/compile/internal
sed -i -e 's/case ir.Node/case *ir.Name/' gc/plive.go */ssa.go
cd ssa
rf '
ex . ../gc {
import "cmd/compile/internal/ir"
var v *Value
v.Aux.(ir.Node) -> v.Aux.(*ir.Name)
var n ir.Node
var asAux func(Aux)
strict n # only match ir.Node-typed expressions; not *ir.Name
implicit asAux # match implicit assignments to ssa.Aux
asAux(n) -> n.(*ir.Name)
}
'
Change-Id: I3206ef5f12a7cfa37c5fecc67a1ca02ea4d52b32
Reviewed-on: https://go-review.googlesource.com/c/go/+/275789
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
This was already documented as always being an ONAME, so it just
needed a few type assertion changes.
Passes buildall w/ toolstash -cmp.
Updates #42982.
Change-Id: I61f4b6ebd57c43b41977f4b37b81fe94fb11a723
Reviewed-on: https://go-review.googlesource.com/c/go/+/275757
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
|
|
It's currently hard to automate refactorings around the Value.Aux
field, because we don't have any static typing information for it.
Adding a tag interface will make subsequent CLs easier and safer.
Passes buildall w/ toolstash -cmp.
Updates #42982.
Change-Id: I41ae8e411a66bda3195a0957b60c2fe8a8002893
Reviewed-on: https://go-review.googlesource.com/c/go/+/275756
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
|
|
The plan is to introduce a Node interface that replaces the old *Node pointer-to-struct.
The previous CL defined an interface INode modeling a *Node.
This CL:
- Changes all references outside internal/ir to use INode,
along with many references inside internal/ir as well.
- Renames Node to node.
- Renames INode to Node
So now ir.Node is an interface implemented by *ir.node, which is otherwise inaccessible,
and the code outside package ir is now (clearly) using only the interface.
The usual rule is never to redefine an existing name with a new meaning,
so that old code that hasn't been updated gets a "unknown name" error
instead of more mysterious errors or silent misbehavior. That rule would
caution against replacing Node-the-struct with Node-the-interface,
as in this CL, because code that says *Node would now be using a pointer
to an interface. But this CL is being landed at the same time as another that
moves Node from gc to ir. So the net effect is to replace *gc.Node with ir.Node,
which does follow the rule: any lingering references to gc.Node will be told
it's gone, not silently start using pointers to interfaces. So the rule is followed
by the CL sequence, just not this specific CL.
Overall, the loss of inlining caused by using interfaces cuts the compiler speed
by about 6%, a not insignificant amount. However, as we convert the representation
to concrete structs that are not the giant Node over the next weeks, that speed
should come back as more of the compiler starts operating directly on concrete types
and the memory taken up by the graph of Nodes drops due to the more precise
structs. Honestly, I was expecting worse.
% benchstat bench.old bench.new
name old time/op new time/op delta
Template 168ms ± 4% 182ms ± 2% +8.34% (p=0.000 n=9+9)
Unicode 72.2ms ±10% 82.5ms ± 6% +14.38% (p=0.000 n=9+9)
GoTypes 563ms ± 8% 598ms ± 2% +6.14% (p=0.006 n=9+9)
Compiler 2.89s ± 4% 3.04s ± 2% +5.37% (p=0.000 n=10+9)
SSA 6.45s ± 4% 7.25s ± 5% +12.41% (p=0.000 n=9+10)
Flate 105ms ± 2% 115ms ± 1% +9.66% (p=0.000 n=10+8)
GoParser 144ms ±10% 152ms ± 2% +5.79% (p=0.011 n=9+8)
Reflect 345ms ± 9% 370ms ± 4% +7.28% (p=0.001 n=10+9)
Tar 149ms ± 9% 161ms ± 5% +8.05% (p=0.001 n=10+9)
XML 190ms ± 3% 209ms ± 2% +9.54% (p=0.000 n=9+8)
LinkCompiler 327ms ± 2% 325ms ± 2% ~ (p=0.382 n=8+8)
ExternalLinkCompiler 1.77s ± 4% 1.73s ± 6% ~ (p=0.113 n=9+10)
LinkWithoutDebugCompiler 214ms ± 4% 211ms ± 2% ~ (p=0.360 n=10+8)
StdCmd 14.8s ± 3% 15.9s ± 1% +6.98% (p=0.000 n=10+9)
[Geo mean] 480ms 510ms +6.31%
name old user-time/op new user-time/op delta
Template 223ms ± 3% 237ms ± 3% +6.16% (p=0.000 n=9+10)
Unicode 103ms ± 6% 113ms ± 3% +9.53% (p=0.000 n=9+9)
GoTypes 758ms ± 8% 800ms ± 2% +5.55% (p=0.003 n=10+9)
Compiler 3.95s ± 2% 4.12s ± 2% +4.34% (p=0.000 n=10+9)
SSA 9.43s ± 1% 9.74s ± 4% +3.25% (p=0.000 n=8+10)
Flate 132ms ± 2% 141ms ± 2% +6.89% (p=0.000 n=9+9)
GoParser 177ms ± 9% 183ms ± 4% ~ (p=0.050 n=9+9)
Reflect 467ms ±10% 495ms ± 7% +6.17% (p=0.029 n=10+10)
Tar 183ms ± 9% 197ms ± 5% +7.92% (p=0.001 n=10+10)
XML 249ms ± 5% 268ms ± 4% +7.82% (p=0.000 n=10+9)
LinkCompiler 544ms ± 5% 544ms ± 6% ~ (p=0.863 n=9+9)
ExternalLinkCompiler 1.79s ± 4% 1.75s ± 6% ~ (p=0.075 n=10+10)
LinkWithoutDebugCompiler 248ms ± 6% 246ms ± 2% ~ (p=0.965 n=10+8)
[Geo mean] 483ms 504ms +4.41%
[git-generate]
cd src/cmd/compile/internal/ir
: # We need to do the conversion in multiple steps, so we introduce
: # a temporary type alias that will start out meaning the pointer-to-struct
: # and then change to mean the interface.
rf '
mv Node OldNode
add node.go \
type Node = *OldNode
'
: # It should work to do this ex in ir, but it misses test files, due to a bug in rf.
: # Run the command in gc to handle gc's tests, and then again in ssa for ssa's tests.
cd ../gc
rf '
ex . ../arm ../riscv64 ../arm64 ../mips64 ../ppc64 ../mips ../wasm {
import "cmd/compile/internal/ir"
*ir.OldNode -> ir.Node
}
'
cd ../ssa
rf '
ex {
import "cmd/compile/internal/ir"
*ir.OldNode -> ir.Node
}
'
: # Back in ir, finish conversion clumsily with sed,
: # because type checking and circular aliases do not mix.
cd ../ir
sed -i '' '
/type Node = \*OldNode/d
s/\*OldNode/Node/g
s/^func (n Node)/func (n *OldNode)/
s/OldNode/node/g
s/type INode interface/type Node interface/
s/var _ INode = (Node)(nil)/var _ Node = (*node)(nil)/
' *.go
gofmt -w *.go
sed -i '' '
s/{Func{}, 136, 248}/{Func{}, 152, 280}/
s/{Name{}, 32, 56}/{Name{}, 44, 80}/
s/{Param{}, 24, 48}/{Param{}, 44, 88}/
s/{node{}, 76, 128}/{node{}, 88, 152}/
' sizeof_test.go
cd ../ssa
sed -i '' '
s/{LocalSlot{}, 28, 40}/{LocalSlot{}, 32, 48}/
' sizeof_test.go
cd ../gc
sed -i '' 's/\*ir.Node/ir.Node/' mkbuiltin.go
cd ../../../..
go install std cmd
cd cmd/compile
go test -u || go test -u
Change-Id: I196bbe3b648e4701662e4a2bada40bf155e2a553
Reviewed-on: https://go-review.googlesource.com/c/go/+/272935
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
The cycle hacks existed because gc needed to import ssa
which need to know about gc.Node. But now that's ir.Node,
and there's no cycle anymore.
Don't know how much it matters but LocalSlot is now
one word shorter than before, because it holds a pointer
instead of an interface for the *Node. That won't last long.
Now that they're not necessary for interface satisfaction,
IsSynthetic and IsAutoTmp can move to top-level ir functions.
Change-Id: Ie511e93466cfa2b17d9a91afc4bd8d53fdb80453
Reviewed-on: https://go-review.googlesource.com/c/go/+/272931
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Makes sure the copyright notice is not interpreted as the package level
godoc.
Change-Id: I2afce7c9d620f19d51ec1438b1d0db1774b57146
Reviewed-on: https://go-review.googlesource.com/c/go/+/248760
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
|
|
This was originally
Revert "cmd/link: fix up debug_range for dsymutil (revert CL 72371)"
which has the effect of no longer using Base Address Selection
Entries in DWARF. However, the build-time costs of that are
about 2%, so instead the hacky fixup that generated technically
incorrect DWARF was removed from the linker, and the choice
is instead made in the compiler, dependent on platform, but
also under control of a flag so that we can report this bug
against LLDB/dsymutil/dwarfdump (really, the LLVM dwarf
libraries).
This however does not solve #31188; debugging still fails,
but dwarfdump no longer complains. There are at least two
LLDB bugs involved, and this change will at allow us
to report them without them being rejected because our
now-obsolete workaround for the first bug creates
not-quite-DWARF.
Updates #31188.
Change-Id: I5300c51ad202147bab7333329ebe961623d2b47d
Reviewed-on: https://go-review.googlesource.com/c/go/+/170638
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
This change avoids creating zero length location lists by
repairing an overly aggressive change in CL146718
and by explicitly checking for and filtering out any
zero-length lists that are detected (building
compiler+runtime creates a single one).
Updates #28486.
Change-Id: I01c571fee2376474c7f3038e801bd58fd9e0b820
Reviewed-on: https://go-review.googlesource.com/c/150097
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Before this change, location list construction would extend
from the previous (in linear order) block, even if was not a
flow predecessor. This can cause a debugger to tell lies.
Fix accounts for this in block merging code by (crudely)
"changing" all variables live from a previous block if it
is not also a predecessor.
Fixes #28486.
Change-Id: I11336b0b969f0cd09f40f4e5f2bdfdeb02f377a4
Reviewed-on: https://go-review.googlesource.com/c/146718
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
For the entry block, make the "first instruction" be truly
the first instruction. This allows printing of incoming
parameters with Delve.
Also be sure Phis are marked as being at the start of their
block. This is observed to move location list pointers,
and where moved, they become correct.
Leading zero-width instructions include LoweredGetClosurePtr.
Because this instruction is actually architecture-specific,
and it is now tested for in 3 different places, also created
Op.isLoweredGetClosurePtr() to reduce future surprises.
Change-Id: Ic043b7265835cf1790382a74334b5714ae4060af
Reviewed-on: https://go-review.googlesource.com/c/145179
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Go documentation style for boolean funcs is to say:
// Foo reports whether ...
func Foo() bool
(rather than "returns true if")
This CL also replaces 4 uses of "iff" with the same "reports whether"
wording, which doesn't lose any meaning, and will prevent people from
sending typo fixes when they don't realize it's "if and only if". In
the past I think we've had the typo CLs updated to just say "reports
whether". So do them all at once.
(Inspired by the addition of another "returns true if" in CL 146938
in fd_plan9.go)
Created with:
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true iff" | grep -v vendor)
$ perl -i -npe 's/returns true if/reports whether/' $(git grep -l "returns true if" | grep -v vendor)
Change-Id: Ided502237f5ab0d25cb625dbab12529c361a8b9f
Reviewed-on: https://go-review.googlesource.com/c/147037
Reviewed-by: Ian Lance Taylor <iant@golang.org>
|
|
This restores the printing of vXX and bYY in the left-hand
edge of the last column of ssa.html, where the generated
progs appear.
Change-Id: I81ab9b2fa5ae28e6e5de1b77665cfbed8d14e000
Reviewed-on: https://go-review.googlesource.com/c/141277
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Yury Smolsky <yury@smolsky.by>
|
|
Change-Id: Id87d9f55d1714fc553f5b1a9cba0f2fe348dad3e
Reviewed-on: https://go-review.googlesource.com/126396
Run-TryBot: Yury Smolsky <yury@smolsky.by>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Fix duplicated index in LHS and RHS of the < operator.
Found using https://go-critic.github.io/overview#dupSubExpr-ref
Change-Id: I9a5a40bbd436b32e8117579a01bc50afe3608c97
Reviewed-on: https://go-review.googlesource.com/122776
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Apparently a LoadReg can take a Phi as its argument. The Phi has names
in the NamedValue table, so just read the Load's names from the Phi.
The example given, XORKeyStream in chacha20, is pretty complicated so I
didn't try to actually debug it and verify that the results are right.
But the debug logging looks reasonable, with the right names in the right
registers at the right times.
Fixes #25404
Change-Id: I2c3183dcfb033948556d6805bd66c22c0b45625c
Reviewed-on: https://go-review.googlesource.com/114008
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Remove the unexpected function, which is a lot less relevant now that
the generation basically can't detect invalid states, and make sure no
logging appears without -d locationlists=2.
Updates #25404
Change-Id: If3522df5a7397f2e7b43cb808936e319132132b6
Reviewed-on: https://go-review.googlesource.com/114007
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Use the math/bits functions to calculate the number of leading/
trailing zeros, bit length and the population count.
The math/bits package is built as part of the bootstrap process
so we do not need to provide an alternative implementation for
Go versions prior to 1.9.
Passes toolstash-check -all.
Change-Id: I393b4cc1c8accd0ca7cb3599d3926fa6319b574f
Reviewed-on: https://go-review.googlesource.com/113336
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Before:
unexpected at 2721:load with unexpected source op v3278unexpected at 2775:load with
unexpected source op v3281unexpected at 2249:load with unexpected source op
v3289unexpected at 2875:load with unexpected source op v3278unexpected at 2232:load
with unexpected source op v286unexpected at 2231:load with unexpected source op
v3291unexpected at 2784:load with unexpected source op v3289unexpected at 2785:load
with unexpected source op v3291
After:
debug info generation: v2721: load with unexpected source op: Phi (v3278)
debug info generation: v2775: load with unexpected source op: Phi (v3281)
debug info generation: v2249: load with unexpected source op: Phi (v3289)
debug info generation: v2875: load with unexpected source op: Phi (v3278)
debug info generation: v2232: load with unexpected source op: Phi (v286)
debug info generation: v2231: load with unexpected source op: Phi (v3291)
debug info generation: v2784: load with unexpected source op: Phi (v3289)
debug info generation: v2785: load with unexpected source op: Phi (v3291)
Updates #25404.
Change-Id: Ib97722848d27ca18bdcd482a610626bc3c6def7d
Reviewed-on: https://go-review.googlesource.com/113275
Run-TryBot: Michael Munday <mike.munday@ibm.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
User variables that cannot be SSA'd, either because their addresses are
taken or because they are too large for the decomposition heuristic, do
not explicitly appear as operands of SSA values. Instead they are written
to directly via the stack pointer.
This hid them from the location list generation, which is only
interested in the named value table. Fortunately, the lifetime of
stack-only variables is delineated by VarDef/VarKill ops, and it's easy
enough to turn those into location list bounds.
One wrinkle: stack frame information is not explicitly available in the
SSA phases, because it's owned by the frontend in AllocFrame. It would
be easier if the set of live LocalSlots were returned by that, but this
is the minimal change to fix missing variables. Or VarDef/VarKills
could appear in NamedValues, which would make this change even easier.
Change-Id: Ice6654dad6f9babb0286e95c7ec28594561dc91f
Reviewed-on: https://go-review.googlesource.com/100458
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This is the "3rd bug" that caused compilations to sometimes
produce different results when dwarf location lists were
enabled.
A loop had not been properly rewritten in an earlier
optimization CL, and it accessed uninitialized data,
which was deterministically perhaps wrong when single
threaded, but variably wrong when multithreaded.
Change-Id: Ib3da538762fdf7d5e4407106f2434f3b14a1d7ea
Reviewed-on: https://go-review.googlesource.com/99935
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
|
|
Before DWARF location lists can be turned on, 3 bugs need
fixing.
This CL addresses two -- lack of register definitions for
various architectures, and bugs on 32-bit platforms.
The third bug comes later.
Passes
GO_GCFLAGS=-dwarflocationlists ./run.bash -no-rebuild
(-no-rebuild because the map dependence causes trouble)
Change-Id: I4223b48ade84763e4b048e4aeb81149f082c7bc7
Reviewed-on: https://go-review.googlesource.com/99255
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
When generating location lists, batch up changes for all zero-width
instructions, not just phis. This prevents the creation of location list
entries that don't actually cover any instructions.
This isn't perfect because of the caveats in the prior CL (Copy is
zero-width sometimes) but in practice this seems to fix all of the empty
lists in std.
Change-Id: Ice4a9ade36b6b24ca111d1494c414eec96e5af25
Reviewed-on: https://go-review.googlesource.com/97958
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Some SSA values don't translate into any instructions. If a function
began with two of them, and both modified the storage of the same
variable, we'd end up with a location list entry that started and ended
at 0. That looks like an end-of-list entry, which would then confuse
downstream tools, particularly the fixup in the linker.
"Fix" this by changing the end of such entries to 1. Should be harmless,
since AFAIK we don't generate any 1-byte instructions. Later CLs will
reduce the frequency of these entries anyway.
Change-Id: I9b7e5e69f914244cc826fb9f4a6acfe2dc695f81
Reviewed-on: https://go-review.googlesource.com/97955
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
CL generated mechanically with github.com/mdempsky/unconvert.
Also updated cmd/compile/internal/ssa/gen/*.rules manually.
Change-Id: If721ef73cf0771ae83ce7e2d11623fc8d9155768
Reviewed-on: https://go-review.googlesource.com/97075
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Reuse even more memory, and keep track of it in a long-lived debugState
object rather than piecemeal in the Cache.
Change-Id: Ib6936b4e8594dc6dda1f59ece753c00fd1c136ba
Reviewed-on: https://go-review.googlesource.com/92404
Reviewed-by: David Chase <drchase@google.com>
|
|
Change the closures to methods on debugState, mostly just for aesthetic
reasons.
Change-Id: I5242807f7300efafc7efb4eb3bd305ac3ec8e826
Reviewed-on: https://go-review.googlesource.com/92403
Reviewed-by: David Chase <drchase@google.com>
|
|
changedVars was functionally a set, but couldn't be iterated over
efficiently. In functions with many variables, the wasted iteration was
costly. Use a sparseSet instead.
(*gc.Node).String() is very expensive: it calls Sprintf, which does
reflection, etc, etc. Instead, just look at .Sym.Name, which is all we
care about.
Change-Id: Ib61cd7b5c796e1813b8859135e85da5bfe2ac686
Reviewed-on: https://go-review.googlesource.com/92402
Reviewed-by: David Chase <drchase@google.com>
|
|
Replace the OnStack boolean in VarLoc with a flag bit in StackOffset.
This doesn't get much memory savings since it's still 64-bit aligned,
but does seem to help a bit anyway.
Change liveSlot to fit into 16 bytes. Because nested structs still get
padding, this required inlining it. Fortunately there's not much logic
to copy.
Change-Id: Ie19a409daa41aa310275c4517a021eecf8886441
Reviewed-on: https://go-review.googlesource.com/92401
Reviewed-by: David Chase <drchase@google.com>
|
|
For functions with many local variables, keeping track of every
LocalSlot for every variable is very expensive. Only track the slots
that are actually used by a given variable.
Change-Id: Iaafbce030a782b8b8c4a0eb7cf025e59af899ea4
Reviewed-on: https://go-review.googlesource.com/92400
Reviewed-by: David Chase <drchase@google.com>
|
|
Keeping the start state of each block around costs more than just
recomputing them as necessary, especially because many blocks only have
one predecessor and don't need any merging at all. Stop storing the
start state, and reuse predecessors' end states as much as conveniently
possible.
Change-Id: I549bad9e1a35af76a974e46fe69f74cd4dce873b
Reviewed-on: https://go-review.googlesource.com/92399
Reviewed-by: David Chase <drchase@google.com>
|
|
Because getStackOffset is a function pointer, the compiler assumes that
its arguments escape. Pass a value instead to avoid heap allocations.
Change-Id: Ib94e5941847f134cd00e873040a4d7fcf15ced26
Reviewed-on: https://go-review.googlesource.com/92397
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Using bits.TrailingZeroes instead of iterating over each bit is a small
but easy win for the common case of only one or two registers being set.
I copied in the implementation for use with pre-1.9 bootstraps.
Change-Id: Ieaa768554d7d5239a5617fbf34f1ee0b32ce1de5
Reviewed-on: https://go-review.googlesource.com/92395
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Put everything that showed up in the allocation profile into the cache,
and reuse it across functions.
After this CL, the overhead of enabling location lists is getting
pretty close to the desired 5%:
compilecmp -all -beforeflags -dwarflocationlists=0 -afterflags -dwarflocationlists=1 -n 30 4ebad42292b6a4090faf37753dd768d2965e38c4 4ebad42292b6a4090faf37753dd768d2965e38c4
compilecmp -dwarflocationlists=0 4ebad42292b6a4090faf37753dd768d2965e38c4 -dwarflocationlists=1 4ebad42292b6a4090faf37753dd768d2965e38c4
benchstat -geomean /tmp/869550129 /tmp/143495132
completed 30 of 30, estimated time remaining 0s (eta 3:24PM)
name old time/op new time/op delta
Template 199ms ± 4% 209ms ± 6% +5.17% (p=0.000 n=29+30)
Unicode 99.2ms ± 8% 100.5ms ± 6% ~ (p=0.112 n=30+30)
GoTypes 642ms ± 3% 684ms ± 3% +6.54% (p=0.000 n=29+30)
SSA 8.00s ± 1% 8.71s ± 1% +8.78% (p=0.000 n=29+29)
Flate 129ms ± 7% 134ms ± 5% +3.77% (p=0.000 n=30+30)
GoParser 157ms ± 4% 164ms ± 5% +4.35% (p=0.000 n=29+30)
Reflect 428ms ± 3% 450ms ± 4% +5.09% (p=0.000 n=30+30)
Tar 195ms ± 5% 204ms ± 8% +4.78% (p=0.000 n=30+30)
XML 228ms ± 4% 241ms ± 4% +5.62% (p=0.000 n=30+29)
StdCmd 15.4s ± 1% 16.7s ± 1% +8.29% (p=0.000 n=29+29)
[Geo mean] 476ms 502ms +5.35%
name old user-time/op new user-time/op delta
Template 294ms ±18% 304ms ±15% ~ (p=0.242 n=29+29)
Unicode 182ms ±27% 172ms ±28% ~ (p=0.104 n=30+30)
GoTypes 957ms ±15% 1016ms ±12% +6.16% (p=0.000 n=30+30)
SSA 13.3s ± 5% 14.3s ± 3% +7.32% (p=0.000 n=30+28)
Flate 188ms ±17% 193ms ±17% ~ (p=0.288 n=28+29)
GoParser 232ms ±16% 238ms ±13% ~ (p=0.065 n=30+29)
Reflect 585ms ±13% 620ms ±10% +5.88% (p=0.000 n=30+30)
Tar 298ms ±21% 332ms ±23% +11.32% (p=0.000 n=30+30)
XML 329ms ±17% 343ms ±12% +4.18% (p=0.032 n=30+30)
[Geo mean] 492ms 513ms +4.13%
name old alloc/op new alloc/op delta
Template 38.3MB ± 0% 40.3MB ± 0% +5.29% (p=0.000 n=30+30)
Unicode 29.3MB ± 0% 29.6MB ± 0% +1.28% (p=0.000 n=30+29)
GoTypes 110MB ± 0% 118MB ± 0% +6.97% (p=0.000 n=29+30)
SSA 1.48GB ± 0% 1.61GB ± 0% +9.06% (p=0.000 n=30+30)
Flate 24.8MB ± 0% 26.0MB ± 0% +4.99% (p=0.000 n=29+30)
GoParser 30.9MB ± 0% 32.2MB ± 0% +4.20% (p=0.000 n=30+30)
Reflect 76.8MB ± 0% 80.6MB ± 0% +4.97% (p=0.000 n=30+30)
Tar 39.6MB ± 0% 41.7MB ± 0% +5.22% (p=0.000 n=29+30)
XML 42.0MB ± 0% 45.4MB ± 0% +8.22% (p=0.000 n=29+30)
[Geo mean] 63.9MB 67.5MB +5.56%
name old allocs/op new allocs/op delta
Template 383k ± 0% 405k ± 0% +5.69% (p=0.000 n=30+30)
Unicode 343k ± 0% 346k ± 0% +0.98% (p=0.000 n=30+27)
GoTypes 1.15M ± 0% 1.22M ± 0% +6.17% (p=0.000 n=29+29)
SSA 12.2M ± 0% 13.2M ± 0% +8.15% (p=0.000 n=30+30)
Flate 234k ± 0% 249k ± 0% +6.44% (p=0.000 n=30+30)
GoParser 315k ± 0% 332k ± 0% +5.31% (p=0.000 n=30+28)
Reflect 972k ± 0% 1010k ± 0% +3.89% (p=0.000 n=30+30)
Tar 394k ± 0% 415k ± 0% +5.35% (p=0.000 n=28+30)
XML 404k ± 0% 429k ± 0% +6.31% (p=0.000 n=29+29)
[Geo mean] 651k 686k +5.35%
Change-Id: Ia005a8d6b33ce9f8091322f004376a3d6e5c1a94
Reviewed-on: https://go-review.googlesource.com/89357
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Completely redesign and reimplement location list generation to be more
efficient, and hopefully not too hard to understand.
RegKills are gone. Instead of using the regalloc's liveness
calculations, redo them using the Ops' clobber information. Besides
saving a lot of Values, this avoids adding RegKills to blocks that would
be empty otherwise, which was messing up optimizations. This does mean
that it's much harder to tell whether the generation process is buggy
(there's nothing to cross-check it with), and there may be disagreements
with GC liveness. But the performance gain is significant, and it's nice
not to be messing with earlier compiler phases.
The intermediate representations are gone. Instead of producing
ssa.BlockDebugs, then dwarf.LocationLists, and then finally real
location lists, go directly from the SSA to a (mostly) real location
list. Because the SSA analysis happens before assembly, it stores
encoded block/value IDs where PCs would normally go. It would be easier
to do the SSA analysis after assembly, but I didn't want to retain the
SSA just for that.
Generation proceeds in two phases: first, it traverses the function in
CFG order, storing the state of the block at the beginning and end. End
states are used to produce the start states of the successor blocks. In
the second phase, it traverses in program text order and produces the
location lists. The processing in the second phase is redundant, but
much cheaper than storing the intermediate representation. It might be
possible to combine the two phases somewhat to take advantage of cases
where the CFG matches the block layout, but I haven't tried.
Location lists are finalized by adding a base address selection entry,
translating each encoded block/value ID to a real PC, and adding the
terminating zero entry. This probably won't work on OSX, where dsymutil
will choke on the base address selection. I tried emitting CU-relative
relocations for each address, and it was *very* bad for performance --
it uses more memory storing all the relocations than it does for the
actual location list bytes. I think I'm going to end up synthesizing the
relocations in the linker only on OSX, but TBD.
TestNexting needs updating: with more optimizations working, the
debugger doesn't stop on the continue (line 88) any more, and the test's
duplicate suppression kicks in. Also, dx and dy live a little longer
now, but they have the correct values.
Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307
Reviewed-on: https://go-review.googlesource.com/89356
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
A statement like
foo = bar + qux
might compile to
AX := AX + BX
resulting in a regkill for AX before this instruction.
The buggy behavior is to kill AX "at" this instruction,
before it has executed. (Code generation of no-instruction
values like RegKills applies their effects at the
next actual instruction emitted).
However, bar is still associated with AX until after the
instruction executes, so the effect of the regkill must
occur at the boundary between this instruction and the
next. Similarly, the new value bound to AX is not visible
until this instruction executes (and in the case of values
that require multiple instructions in code generation, until
all of them have executed).
The ranges are adjusted so that a value's start occurs
at the next following instruction after its evaluation,
and the end occurs after (execution of) the first
instruction following the end of the lifetime as a value.
(Notice the asymmetry; the entire value must be finished
before it is visible, but execution of a single instruction
invalidates. However, the value *is* visible before that
next instruction executes).
The test was adjusted to make it insensitive to the result
numbering for variables printed by gdb, since that is not
relevant to the test and makes the differences introduced
by small changes larger than necessary/useful.
The test was also improved to present variable probes
more intuitively, and also to allow explicit indication
of "this variable was optimized out"
Change-Id: I39453eead8399e6bb05ebd957289b112d1100c0e
Reviewed-on: https://go-review.googlesource.com/74090
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
Instead of the hand-written control flow analysis in debug info
generation, use a reverse postorder traversal, which is basically the
same thing. It should be slightly faster.
More importantly, the previous version simply gave up in the case of
non-reducible functions, and produced output that caused a later stage
to crash. It turns out that there's a non-reducible function in
compress/flate, so that wasn't a theoretical issue.
With this change, all blocks will be visited, even for non-reducible
functions.
Change-Id: Id47536764ee93203c6b4105a1a3013fe3265aa12
Reviewed-on: https://go-review.googlesource.com/73110
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
When variables need to be spilled to the stack, they usually get their
own stack slot. Local variables have a slot allocated if they need one,
and arguments start out on the stack. Before this CL, the debug
information made the assumption that this was always the case, and so
didn't bother storing an actual stack offset during SSA analysis.
There's at least one case where this isn't true: variables that alias
arguments. Since the argument is the source of the variable, the
variable will begin its life on the stack in the argument's stack slot,
not its own. Therefore the debug info needs to track the actual stack
slot for each location entry.
No detectable performance change, despite the O(N) loop in getHomeSlot.
Change-Id: I2701adb7eddee17d4524336cb7aa6786e8f32b46
Reviewed-on: https://go-review.googlesource.com/67231
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
|
|
The information that's used to generate DWARF location lists is very
ssa.Value centric; it uses Values as start and end coordinates to define
ranges. That mostly works fine, but control flow instructions don't come
from Values, so the ranges couldn't cover them.
Control flow instructions are generated when the SSA representation is
converted to assembly, so that's the best place to extend the ranges
to cover them. (Before that, there's nothing to refer to, and afterward
the boundaries between blocks have been lost.) That requires block
information in the debugInfo type, which then flows down to make
everything else awkward. On the plus side, there's a little less copying
slices around than there used to be, so it should be a little faster.
Previously, the ranges for empty blocks were not very meaningful. That
was fine, because they had no Values to cover, so no debug information
was generated for them. But they do have control flow instructions
(that's why they exist) and so now it's important that the information
be correct. Introduce two sentinel values, BlockStart and BlockEnd, that
denote the boundary of a block, even if the block is empty. BlockEnd
replaces the previous SurvivedBlock flag.
There's one more problem: the last instruction in the function will be a
control flow instruction, so any live ranges need to be extended past
it. But there's no instruction after it to use as the end of the range.
Instead, leave the EndProg field of those ranges as nil and fix it up to
point to past the end of the assembled text at the very last moment.
Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b
Reviewed-on: https://go-review.googlesource.com/63010
Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
Just to get rid of lots of .Name() stutter in printf calls.
Change-Id: I86cf00b3f7b2172387a1c6a7f189c1897fab6300
Reviewed-on: https://go-review.googlesource.com/56630
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Debuggers use DWARF information to find local variables on the
stack and in registers. Prior to this CL, the DWARF information for
functions claimed that all variables were on the stack at all times.
That's incorrect when optimizations are enabled, and results in
debuggers showing data that is out of date or complete gibberish.
After this CL, the compiler is capable of representing variable
locations more accurately, and attempts to do so. Due to limitations of
the SSA backend, it's not possible to be completely correct.
There are a number of problems in the current design. One of the easier
to understand is that variable names currently must be attached to an
SSA value, but not all assignments in the source code actually result
in machine code. For example:
type myint int
var a int
b := myint(int)
and
b := (*uint64)(unsafe.Pointer(a))
don't generate machine code because the underlying representation is the
same, so the correct value of b will not be set when the user would
expect.
Generating the more precise debug information is behind a flag,
dwarflocationlists. Because of the issues described above, setting the
flag may not make the debugging experience much better, and may actually
make it worse in cases where the variable actually is on the stack and
the more complicated analysis doesn't realize it.
A number of changes are included:
- Add a new pseudo-instruction, RegKill, which indicates that the value
in the register has been clobbered.
- Adjust regalloc to emit RegKills in the right places. Significantly,
this means that phis are mixed with StoreReg and RegKills after
regalloc.
- Track variable decomposition in ssa.LocalSlots.
- After the SSA backend is done, analyze the result and build location
lists for each LocalSlot.
- After assembly is done, update the location lists with the assembled
PC offsets, recompose variables, and build DWARF location lists. Emit the
list as a new linker symbol, one per function.
- In the linker, aggregate the location lists into a .debug_loc section.
TODO:
- currently disabled for non-X86/AMD64 because there are no data tables.
go build -toolexec 'toolstash -cmp' -a std succeeds.
With -dwarflocationlists false:
before: f02812195637909ff675782c0b46836a8ff01976
after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec
benchstat -geomean /tmp/220352263 /tmp/621364410
completed 15 of 15, estimated time remaining 0s (eta 3:52PM)
name old time/op new time/op delta
Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14)
Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15)
GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14)
Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15)
GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15)
Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13)
Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15)
XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15)
[Geo mean] 206ms 377ms +82.86%
name old user-time/op new user-time/op delta
Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15)
Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14)
GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15)
Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15)
GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15)
Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13)
Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15)
XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15)
[Geo mean] 317ms 583ms +83.72%
name old alloc/op new alloc/op delta
Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15)
Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15)
GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14)
Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15)
GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15)
Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15)
Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15)
XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15)
[Geo mean] 42.1MB 75.0MB +78.05%
name old allocs/op new allocs/op delta
Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15)
Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14)
GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14)
Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15)
GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15)
Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15)
Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15)
XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15)
[Geo mean] 439k 755k +72.01%
name old text-bytes new text-bytes delta
HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15)
name old data-bytes new data-bytes delta
HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal)
name old bss-bytes new bss-bytes delta
HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal)
name old exe-bytes new exe-bytes delta
HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal)
Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8
Reviewed-on: https://go-review.googlesource.com/41770
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|