Age | Commit message (Collapse) | Author |
|
Panic if the slice is too short.
Updates #395
Change-Id: I90f4bff2da5d8f3148ba06d2482084f32b25c29a
Reviewed-on: https://go-review.googlesource.com/c/go/+/301650
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
This doesn't change any behavior, but should help the compiler realise
that these funcs really do nothing at all.
Change-Id: Ib26c02ef264691acac983538ec300f91d6ff98db
Reviewed-on: https://go-review.googlesource.com/c/go/+/280314
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Right now we just prevent such types from being on the heap. This CL
makes it so they cannot appear on the stack either. The distinction
between heap and stack is pretty vague at the language level (e.g. it
is affected by -N), and we don't need the flexibility anyway.
Once go:notinheap types cannot be in either place, we don't need to
consider pointers to such types to be pointers, at least according to
the garbage collector and stack copying. (This is the big win of this
CL, in my opinion.)
The distinction between HasPointers and HasHeapPointer no longer
exists. There is only HasPointers.
This CL is cleanup before possible use of go:notinheap to fix #40954.
Update #13386
Change-Id: Ibd895aadf001c0385078a6d4809c3f374991231a
Reviewed-on: https://go-review.googlesource.com/c/go/+/249917
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
|
|
Passes toolstash-check -all.
Change-Id: Ia73233c2269017a5802df821ea2ca138c16a94ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/229519
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
|
|
In cases in which we had a named value whose args were all _,
like this rule from ARM.rules:
(MOVBUreg x:(MOVBUload _ _)) -> (MOVWreg x)
We previously inserted
_ = x.Args[1]
even though it is unnecessary.
This change eliminates this pointless bounds check.
And in other cases, we now check bounds just as far as strictly necessary.
No significant movement on any compiler metrics.
Just nicer (and less) code.
Passes toolstash-check -all.
Change-Id: I075dfe9f926cc561cdc705e9ddaab563164bed3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221781
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This:
* Simplifies and shortens the generated code for rewrite rules.
* Shrinks cmd/compile by 86k (0.4%) and makes it easier to compile.
* Removes the stmt boundary code wrangling from Value.reset,
in favor of doing it in the one place where it actually does some work,
namely the writebarrier pass. (This was ascertained by inspecting the
code for cases in which notStmtBoundary values were generated.)
Passes toolstash-check -all.
Change-Id: I25671d4c4bbd772f235195d11da090878ea2cc07
Reviewed-on: https://go-review.googlesource.com/c/go/+/221421
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This shrinks the compiler without impacting performance.
(The performance-sensitive part of rewrite rules is the non-match case.)
Passes toolstash-check -all.
Executable size:
file before after Δ %
compile 20356168 20163960 -192208 -0.944%
total 115599376 115407168 -192208 -0.166%
Text size:
file before after Δ %
cmd/compile/internal/ssa.s 3928309 3778774 -149535 -3.807%
total 18862943 18713408 -149535 -0.793%
Memory allocated compiling package SSA:
SSA 12.7M ± 0% 12.5M ± 0% -1.74% (p=0.008 n=5+5)
Compiler speed impact:
name old time/op new time/op delta
Template 211ms ± 1% 211ms ± 2% ~ (p=0.832 n=49+49)
Unicode 82.8ms ± 2% 83.2ms ± 2% +0.44% (p=0.022 n=46+49)
GoTypes 726ms ± 1% 728ms ± 2% ~ (p=0.076 n=46+48)
Compiler 3.39s ± 2% 3.40s ± 2% ~ (p=0.633 n=48+49)
SSA 7.71s ± 1% 7.65s ± 1% -0.78% (p=0.000 n=45+44)
Flate 134ms ± 1% 134ms ± 1% ~ (p=0.195 n=50+49)
GoParser 167ms ± 1% 167ms ± 1% ~ (p=0.390 n=47+47)
Reflect 453ms ± 3% 452ms ± 2% ~ (p=0.492 n=48+49)
Tar 184ms ± 3% 184ms ± 2% ~ (p=0.862 n=50+48)
XML 248ms ± 2% 248ms ± 2% ~ (p=0.096 n=49+47)
[Geo mean] 415ms 415ms -0.03%
name old user-time/op new user-time/op delta
Template 273ms ± 1% 273ms ± 2% ~ (p=0.711 n=48+48)
Unicode 117ms ± 6% 117ms ± 5% ~ (p=0.633 n=50+50)
GoTypes 972ms ± 2% 974ms ± 1% +0.29% (p=0.016 n=47+49)
Compiler 4.46s ± 6% 4.51s ± 6% ~ (p=0.093 n=50+50)
SSA 10.4s ± 1% 10.3s ± 2% -0.94% (p=0.000 n=45+50)
Flate 166ms ± 2% 167ms ± 2% ~ (p=0.148 n=49+48)
GoParser 202ms ± 1% 202ms ± 2% -0.28% (p=0.014 n=47+49)
Reflect 594ms ± 2% 594ms ± 2% ~ (p=0.717 n=48+49)
Tar 224ms ± 2% 224ms ± 2% ~ (p=0.805 n=50+49)
XML 311ms ± 1% 310ms ± 1% ~ (p=0.177 n=49+48)
[Geo mean] 537ms 537ms +0.01%
Change-Id: I562b9f349b34ddcff01771769e6dbbc80604da7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221237
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
We added chunking of rewrite rules to speed up compiling package SSA.
This series of changes has significantly shrunk the number of
rewrite rules, and they are no longer being added nearly as fast.
Now that we are sharing v.Args across multiple rewrite rules,
there is additional benefit to having more rules in a single function.
Removing chunking now has an incidental impact on compiling package SSA,
marginally speeds up other compilation, shrinks the cmd/compile binary,
and simplifies the code.
name old time/op new time/op delta
Template 211ms ± 2% 210ms ± 2% -0.50% (p=0.000 n=91+97)
Unicode 81.9ms ± 3% 81.8ms ± 3% ~ (p=0.179 n=96+91)
GoTypes 731ms ± 2% 731ms ± 1% ~ (p=0.442 n=94+96)
Compiler 3.43s ± 2% 3.41s ± 2% -0.36% (p=0.001 n=98+94)
SSA 8.30s ± 2% 8.32s ± 2% +0.19% (p=0.034 n=94+95)
Flate 135ms ± 2% 134ms ± 1% -0.30% (p=0.006 n=98+94)
GoParser 167ms ± 1% 167ms ± 1% -0.22% (p=0.001 n=92+94)
Reflect 453ms ± 2% 453ms ± 3% ~ (p=0.306 n=98+97)
Tar 184ms ± 2% 183ms ± 2% -0.31% (p=0.012 n=94+94)
XML 249ms ± 2% 248ms ± 1% -0.26% (p=0.002 n=96+92)
[Geo mean] 419ms 418ms -0.21%
name old user-time/op new user-time/op delta
Template 273ms ± 2% 272ms ± 2% -0.46% (p=0.000 n=93+96)
Unicode 116ms ± 4% 117ms ± 4% ~ (p=0.433 n=98+98)
GoTypes 977ms ± 2% 977ms ± 1% ~ (p=0.971 n=92+99)
Compiler 4.56s ± 6% 4.53s ± 6% ~ (p=0.081 n=100+100)
SSA 11.1s ± 2% 11.1s ± 2% ~ (p=0.064 n=99+96)
Flate 167ms ± 2% 167ms ± 1% -0.24% (p=0.004 n=95+96)
GoParser 203ms ± 1% 203ms ± 2% -0.14% (p=0.049 n=96+97)
Reflect 595ms ± 2% 595ms ± 2% ~ (p=0.544 n=95+92)
Tar 225ms ± 2% 224ms ± 2% ~ (p=0.562 n=99+99)
XML 312ms ± 2% 311ms ± 1% ~ (p=0.050 n=97+93)
[Geo mean] 543ms 542ms -0.13%
Change-Id: I8d34ab59f154b28f20c6f9e416b976bfce339baa
Reviewed-on: https://go-review.googlesource.com/c/go/+/216220
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
CL 213703 converted generated rewrite rules for commutative ops
to use loops instead of duplicated code.
However, it loaded args using expressions like
v.Args[i] and v.Args[i^1], which the compiler could
not eliminate bounds for (including with all outstanding
prove CLs).
Also, given a series of separate rewrite rules for the same op,
we generated bounds checks for every rewrite rule, even though
we were repeatedly loading the same set of args.
This change reduces both sets of bounds checks.
Instead of loading v.Args[i] and v.Args[i^1] for commutative loops,
we now preload v.Args[0] and v.Args[1] into local variables,
and then swap them (as needed) in the commutative loop post statement.
And we now load all top level v.Args into local variables
at the beginning of every rewrite rule function.
The second optimization is the more significant,
but the first helps a little, and they play together
nicely from the perspective of generating the code.
This does increase register pressure, but the reduced bounds
checks more than compensate.
Note that the vast majority of rewrite rules evaluated
are not applied, so the prologue is the most important
part of the rewrite rules.
There is one subtle aspect to the new generated code.
Because the top level v.Args are shared across rewrite rules,
and rule evaluation can swap v_0 and v_1, v_0 and v_1
can end up being swapped from one rule to the next.
That is OK, because any time a rule does not get applied,
they will have been swapped exactly twice.
Passes toolstash-check -all.
name old time/op new time/op delta
Template 213ms ± 2% 211ms ± 2% -0.85% (p=0.000 n=92+96)
Unicode 83.5ms ± 2% 83.2ms ± 2% -0.41% (p=0.004 n=95+90)
GoTypes 737ms ± 2% 733ms ± 2% -0.51% (p=0.000 n=91+94)
Compiler 3.45s ± 2% 3.43s ± 2% -0.44% (p=0.000 n=99+100)
SSA 8.54s ± 1% 8.32s ± 2% -2.56% (p=0.000 n=96+99)
Flate 136ms ± 2% 135ms ± 1% -0.47% (p=0.000 n=96+96)
GoParser 169ms ± 1% 168ms ± 1% -0.33% (p=0.000 n=96+93)
Reflect 456ms ± 3% 455ms ± 3% ~ (p=0.261 n=95+94)
Tar 186ms ± 2% 185ms ± 2% -0.48% (p=0.000 n=94+95)
XML 251ms ± 1% 250ms ± 1% -0.51% (p=0.000 n=91+94)
[Geo mean] 424ms 421ms -0.68%
name old user-time/op new user-time/op delta
Template 275ms ± 1% 274ms ± 2% -0.55% (p=0.000 n=95+98)
Unicode 118ms ± 4% 118ms ± 4% ~ (p=0.642 n=98+90)
GoTypes 983ms ± 1% 980ms ± 1% -0.30% (p=0.000 n=93+93)
Compiler 4.56s ± 6% 4.52s ± 6% -0.72% (p=0.003 n=100+100)
SSA 11.4s ± 1% 11.1s ± 1% -2.50% (p=0.000 n=96+97)
Flate 168ms ± 1% 167ms ± 1% -0.49% (p=0.000 n=92+92)
GoParser 204ms ± 1% 204ms ± 2% -0.27% (p=0.003 n=99+96)
Reflect 599ms ± 2% 598ms ± 2% ~ (p=0.116 n=95+92)
Tar 227ms ± 2% 225ms ± 2% -0.57% (p=0.000 n=95+98)
XML 313ms ± 2% 312ms ± 1% -0.37% (p=0.000 n=89+95)
[Geo mean] 547ms 544ms -0.61%
file before after Δ %
compile 21113112 21109016 -4096 -0.019%
total 131704940 131700844 -4096 -0.003%
Change-Id: Id6c39e0367e597c0c75b8a4b1eb14cc3cbd11956
Reviewed-on: https://go-review.googlesource.com/c/go/+/216218
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Prior to this change, we generated additional rules at rulegen time
for all possible combinations of args to commutative ops.
This is simple and works well, but leads to lots of generated rules.
This in turn has increased the size of the compiler,
made it hard to compile package ssa on small machines,
and provided a disincentive to mark some ops as commutative.
This change reworks how we handle commutative ops.
Instead of generating a rule per argument permutation,
we generate a series of nested loops, one for each commutative op.
Each loop tries both possible argument orderings.
I also considered attempting to canonicalize the inputs to the
rewrite rules. However, because either or both arguments might be
nothing more than an identifier, and because there can be arbitrary
conditions to evaluate during matching, I did not see how to proceed.
The duplicate rule detection now sorts arguments to commutative ops,
so that it can detect commutative-only duplicates.
There may be further optimizations to the new generated code.
In particular, we may not be removing as many bounds checks as before;
I have not investigated deeply. If more work here is needed,
we could do it with more hints or with improvements to the prove pass.
This change has almost no impact on the generated code.
It does not pass toolstash-check, however. In a handful of functions,
for reasons I do not understand, there are minor position changes.
For the entire series ending at this change,
there is negligible compiler performance impact.
The compiler binary shrinks by about 15%,
and package ssa shrinks by about 25%.
Package ssa also compiles ~25% faster with ~25% less memory.
Change-Id: Ia2ee9ceae7be08a17342319d4e31b0bb238a2ee4
Reviewed-on: https://go-review.googlesource.com/c/go/+/213703
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
First, renove unnecessary "// cond:" lines from the generated files.
This shaves off about ~7k lines.
Second, join "if cond { break }" statements via "||", which allows us to
deduplicate a large number of them. This shaves off another ~25k lines.
This change is not for readability or simplicity; but rather, to avoid
unnecessary verbosity that makes the generated files larger. All in all,
git reports that the generated files overall weigh ~200KiB less, or
about 2.7% less.
While at it, add a -trace flag to rulegen.
Updates #33644.
Change-Id: I3fac0290a6066070cc62400bf970a4ae0929470a
Reviewed-on: https://go-review.googlesource.com/c/go/+/196498
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
First, add cpu and memory profiling flags, as these are useful to see
where rulegen is spending its time. It now takes many seconds to run on
a recent laptop, so we have to keep an eye on what it's doing.
Second, stop writing '_ = var' lines to keep imports and variables used
at all times. Now that rulegen removes all such unused names, they're
unnecessary.
To perform the removal, lean on go/types to first detect what names are
unused. We can configure it to give us all the type-checking errors in a
file, so we can collect all "declared but not used" errors in a single
pass.
We then use astutil.Apply to remove the relevant nodes based on the line
information from each unused error. This allows us to apply the changes
without having to do extra parser+printer roundtrips to plaintext, which
are far too expensive.
We need to do multiple such passes, as removing an unused variable
declaration might then make another declaration unused. Two passes are
enough to clean every file at the moment, so add a limit of three passes
for now to avoid eating cpu uncontrollably by accident.
The resulting performance of the changes above is a ~30% loss across the
table, since go/types is fairly expensive. The numbers were obtained
with 'benchcmd Rulegen go run *.go', which involves compiling rulegen
itself, but that seems reflective of how the program is used.
name old time/op new time/op delta
Rulegen 5.61s ± 0% 7.36s ± 0% +31.17% (p=0.016 n=5+4)
name old user-time/op new user-time/op delta
Rulegen 7.20s ± 1% 9.92s ± 1% +37.76% (p=0.016 n=5+4)
name old sys-time/op new sys-time/op delta
Rulegen 135ms ±19% 169ms ±17% +25.66% (p=0.032 n=5+5)
name old peak-RSS-bytes new peak-RSS-bytes delta
Rulegen 71.0MB ± 2% 85.6MB ± 2% +20.56% (p=0.008 n=5+5)
We can live with a bit more resource usage, but the time/op getting
close to 10s isn't good. To win that back, introduce concurrency in
main.go. This further increases resource usage a bit, but the real time
on this quad-core laptop is greatly reduced. The final benchstat is as
follows:
name old time/op new time/op delta
Rulegen 5.61s ± 0% 3.97s ± 1% -29.26% (p=0.008 n=5+5)
name old user-time/op new user-time/op delta
Rulegen 7.20s ± 1% 13.91s ± 1% +93.09% (p=0.008 n=5+5)
name old sys-time/op new sys-time/op delta
Rulegen 135ms ±19% 269ms ± 9% +99.17% (p=0.008 n=5+5)
name old peak-RSS-bytes new peak-RSS-bytes delta
Rulegen 71.0MB ± 2% 226.3MB ± 1% +218.72% (p=0.008 n=5+5)
It might be possible to reduce the cpu or memory usage in the future,
such as configuring go/types to do less work, or taking shortcuts to
avoid having to run it many times. For now, ~2x cpu and ~4x memory usage
seems like a fair trade for a faster and better rulegen.
Finally, we can remove the old code that tried to remove some unused
variables in a hacky and unmaintainable way.
Change-Id: Iff9e83e3f253babf5a1bd48cc993033b8550cee6
Reviewed-on: https://go-review.googlesource.com/c/go/+/189798
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
A lot of the naked for loops begin like:
for {
v := b.Control
if v.Op != OpConstBool {
break
}
...
return true
}
Instead, write them out in a more compact and readable way:
for v.Op == OpConstBool {
...
return true
}
This requires the addition of two bytes.Buffer writers, as this helps us
make a decision based on future pieces of generated code. This probably
makes rulegen slightly slower, but that's not noticeable; the code
generation still takes ~3.5s on my laptop, excluding build time.
The "v := b.Control" declaration can be moved to the top of each
function. Even though the rules can modify b.Control when firing, they
also make the function return, so v can't be used again.
While at it, remove three unnecessary lines from the top of each
rewriteBlock func.
In total, this results in ~4k lines removed from the generated code, and
a slight improvement in readability.
Change-Id: I317e4c6a4842c64df506f4513375475fad2aeec5
Reviewed-on: https://go-review.googlesource.com/c/go/+/167399
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reduces the number of extra bounds check hints we need to
insert. For example, rather than producing:
_ = v.Args[2]
x := v.Args[0]
y := v.Args[1]
z := v.Args[2]
We now produce:
z := v.Args[2]
x := v.Args[0]
y := v.Args[1]
This gets rid of about 7000 lines of code from the rewrite rules.
Change-Id: I1291cf0f82e8d035a6d65bce7dee6cedee04cbcd
Reviewed-on: https://go-review.googlesource.com/c/go/+/167397
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
The sheer length of the generated rules files makes my
editor and git client unhappy.
This change is a small step towards shortening them.
We recognize a few magic variables during rulegen: b, config, fe, typ.
Of these, only b appears prone to false positives.
By tightening the heuristic and fixing one case in MIPS.rules,
we can make the heuristic enough that it has no failures.
That allows us to remove the hedge assignments to _,
removing 3000 pointless lines of code.
Change-Id: I080cde5db28c8277cb3fd9ddcd829306c9a27785
Reviewed-on: https://go-review.googlesource.com/c/go/+/166979
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
During development and debugging, I often want to
write noteRule(fmt.Sprintf(...)), and end up
manually adding the import to the generated code.
Let's just make it always available instead.
Change-Id: I1e2d47c98ba056e1b5da42e35fb6ad26f1d9cc3d
Reviewed-on: https://go-review.googlesource.com/c/145207
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
|
|
This was an artifact from when we had a separate ssa.Type interface to
break circular dependency between packages ssa and gc. It's no longer
needed now that package ssa directly uses package types.
Change-Id: I6a93e5d79082815f7f0eb89507381969cc6cb403
Reviewed-on: https://go-review.googlesource.com/109137
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
The first word of an interface is a pointer, but for the purposes
of GC we don't need to treat it as such.
1. If it is a non-empty interface, the pointer points to an itab
which is always in persistentalloc space.
2. If it is an empty interface, the pointer points to a _type.
a. If it is a compile-time-allocated type, it points into
the read-only data section.
b. If it is a reflect-allocated type, it points into the Go heap.
Reflect is responsible for keeping a reference to
the underlying type so it won't be GCd.
If we ever have a moving GC, we need to change this for 2b (as
well as scan itabs to update their itab._type fields).
Write barriers on the first word of interfaces have already been removed.
Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959
Reviewed-on: https://go-review.googlesource.com/97518
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
In addition to look nicer to the eye, this allows to reformat
and indent rules without causing spurious changes to the generated
file, making it easier to spot functional changes.
After this CL, all CLs that will aggregate rules through
the new "|" functionality should cause no changes to the
generated files.
Change-Id: Icec283585ba8d7b91c79d76513c1d83dca4b30aa
Reviewed-on: https://go-review.googlesource.com/95216
Run-TryBot: Giovanni Bajo <rasky@develer.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
|
|
Noticed while looking at #20356.
Cuts 160k (1%) off of the cmd/compile binary.
Change-Id: If2397bc6971d6be9be6975048adecb0b5efa6d66
Reviewed-on: https://go-review.googlesource.com/43501
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
When package ssa was created, Type was in package gc.
To avoid circular dependencies, we used an interface (ssa.Type)
to represent type information in SSA.
In the Go 1.9 cycle, gri extricated the Type type from package gc.
As a result, we can now use it in package ssa.
Now, instead of package types depending on package ssa,
it is the other way.
This is a more sensible dependency tree,
and helps compiler performance a bit.
Though this is a big CL, most of the changes are
mechanical and uninteresting.
Interesting bits:
* Add new singleton globals to package types for the special
SSA types Memory, Void, Invalid, Flags, and Int128.
* Add two new Types, TSSA for the special types,
and TTUPLE, for SSA tuple types.
ssa.MakeTuple is now types.NewTuple.
* Move type comparison result constants CMPlt, CMPeq, and CMPgt
to package types.
* We had picked the name "types" in our rules for the handy
list of types provided by ssa.Config. That conflicted with
the types package name, so change it to "typ".
* Update the type comparison routine to handle tuples and special
types inline.
* Teach gc/fmt.go how to print special types.
* We can now eliminate ElemTypes in favor of just Elem,
and probably also some other duplicated Type methods
designed to return ssa.Type instead of *types.Type.
* The ssa tests were using their own dummy types,
and they were not particularly careful about types in general.
Of necessity, this CL switches them to use *types.Type;
it does not make them more type-accurate.
Unfortunately, using types.Type means initializing a bit
of the types universe.
This is prime for refactoring and improvement.
This shrinks ssa.Value; it now fits in a smaller size class
on 64 bit systems. This doesn't have a giant impact,
though, since most Values are preallocated in a chunk.
name old alloc/op new alloc/op delta
Template 37.9MB ± 0% 37.7MB ± 0% -0.57% (p=0.000 n=10+8)
Unicode 28.9MB ± 0% 28.7MB ± 0% -0.52% (p=0.000 n=10+10)
GoTypes 110MB ± 0% 109MB ± 0% -0.88% (p=0.000 n=10+10)
Flate 24.7MB ± 0% 24.6MB ± 0% -0.66% (p=0.000 n=10+10)
GoParser 31.1MB ± 0% 30.9MB ± 0% -0.61% (p=0.000 n=10+9)
Reflect 73.9MB ± 0% 73.4MB ± 0% -0.62% (p=0.000 n=10+8)
Tar 25.8MB ± 0% 25.6MB ± 0% -0.77% (p=0.000 n=9+10)
XML 41.2MB ± 0% 40.9MB ± 0% -0.80% (p=0.000 n=10+10)
[Geo mean] 40.5MB 40.3MB -0.68%
name old allocs/op new allocs/op delta
Template 385k ± 0% 386k ± 0% ~ (p=0.356 n=10+9)
Unicode 343k ± 1% 344k ± 0% ~ (p=0.481 n=10+10)
GoTypes 1.16M ± 0% 1.16M ± 0% -0.16% (p=0.004 n=10+10)
Flate 238k ± 1% 238k ± 1% ~ (p=0.853 n=10+10)
GoParser 320k ± 0% 320k ± 0% ~ (p=0.720 n=10+9)
Reflect 957k ± 0% 957k ± 0% ~ (p=0.460 n=10+8)
Tar 252k ± 0% 252k ± 0% ~ (p=0.133 n=9+10)
XML 400k ± 0% 400k ± 0% ~ (p=0.796 n=10+10)
[Geo mean] 428k 428k -0.01%
Removing all the interface calls helps non-trivially with CPU, though.
name old time/op new time/op delta
Template 178ms ± 4% 173ms ± 3% -2.90% (p=0.000 n=94+96)
Unicode 85.0ms ± 4% 83.9ms ± 4% -1.23% (p=0.000 n=96+96)
GoTypes 543ms ± 3% 528ms ± 3% -2.73% (p=0.000 n=98+96)
Flate 116ms ± 3% 113ms ± 4% -2.34% (p=0.000 n=96+99)
GoParser 144ms ± 3% 140ms ± 4% -2.80% (p=0.000 n=99+97)
Reflect 344ms ± 3% 334ms ± 4% -3.02% (p=0.000 n=100+99)
Tar 106ms ± 5% 103ms ± 4% -3.30% (p=0.000 n=98+94)
XML 198ms ± 5% 192ms ± 4% -2.88% (p=0.000 n=92+95)
[Geo mean] 178ms 173ms -2.65%
name old user-time/op new user-time/op delta
Template 229ms ± 5% 224ms ± 5% -2.36% (p=0.000 n=95+99)
Unicode 107ms ± 6% 106ms ± 5% -1.13% (p=0.001 n=93+95)
GoTypes 696ms ± 4% 679ms ± 4% -2.45% (p=0.000 n=97+99)
Flate 137ms ± 4% 134ms ± 5% -2.66% (p=0.000 n=99+96)
GoParser 176ms ± 5% 172ms ± 8% -2.27% (p=0.000 n=98+100)
Reflect 430ms ± 6% 411ms ± 5% -4.46% (p=0.000 n=100+92)
Tar 128ms ±13% 123ms ±13% -4.21% (p=0.000 n=100+100)
XML 239ms ± 6% 233ms ± 6% -2.50% (p=0.000 n=95+97)
[Geo mean] 220ms 213ms -2.76%
Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
Reviewed-on: https://go-review.googlesource.com/42145
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This reverts commit 94d540a4b6bf68ec472bf4469037955e3133fcf7.
Reason for revert: prefer something along the lines of CL 42018.
Change-Id: I876fe32e98f37d8d725fe55e0fd0ea429c0198e0
Reviewed-on: https://go-review.googlesource.com/42022
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Type.Size and Type.Alignment are for the front end:
They calculate size and alignment if needed.
Type.MustSize and Type.MustAlignment are for the back end:
They call Fatal if size and alignment are not already calculated.
Most uses are of MustSize and MustAlignment,
but that's because the back end is newer,
and this API was added to support it.
This CL was mostly generated with sed and selective reversion.
The only mildly interesting bit is the change of the ssa.Type interface
and the supporting ssa dummy types.
Follow-up to review feedback on CL 41970.
Passes toolstash-check.
Change-Id: I0d9b9505e57453dae8fb6a236a07a7a02abd459e
Reviewed-on: https://go-review.googlesource.com/42016
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
This makes the cmd/compile/internal/ssa package
compile much faster, and has no impact
on the speed of the compiler.
The chunk size was selected empirically,
in that at chunk size 10, the object
file was smaller than at chunk size 5 or 20.
name old time/op new time/op delta
SSA 7.33s ± 5% 5.64s ± 1% -23.10% (p=0.000 n=10+10)
name old user-time/op new user-time/op delta
SSA 9.70s ± 1% 8.04s ± 2% -17.17% (p=0.000 n=9+10)
name old obj-bytes new obj-bytes delta
SSA 9.82M ± 0% 8.28M ± 0% -15.67% (p=0.000 n=10+10)
Change-Id: Iab472905da3f0e82f3db2c93d06e2759abc9dd44
Reviewed-on: https://go-review.googlesource.com/41296
Reviewed-by: Keith Randall <khr@golang.org>
|
|
Now only cmd/asm and cmd/compile depend on cmd/internal/obj. Changing
the assembler backends no longer requires reinstalling cmd/link or
cmd/addr2line.
There's also now one canonical definition of the object file format in
cmd/internal/objabi/doc.go, with a warning to update all three
implementations.
objabi is still something of a grab bag of unrelated code (e.g., flag
and environment variable handling probably belong in a separate "tool"
package), but this is still progress.
Fixes #15165.
Fixes #20026.
Change-Id: Ic4b92fac7d0d35438e0d20c9579aad4085c5534c
Reviewed-on: https://go-review.googlesource.com/40972
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
|
|
Note that this is a redo of an undo of the original buggy CL 38666.
We have lots of rewrite rules that vary only in the fact that
we have 2 versions for the 2 different orderings of various
commuting ops. For example:
(ADDL x (MOVLconst [c])) -> (ADDLconst [c] x)
(ADDL (MOVLconst [c]) x) -> (ADDLconst [c] x)
It can get unwieldly quickly, especially when there is more than
one commuting op in a rule.
Our existing "fix" for this problem is to have rules that
canonicalize the operations first. For example:
(Eq64 x (Const64 <t> [c])) && x.Op != OpConst64 -> (Eq64 (Const64 <t> [c]) x)
Subsequent rules can then assume if there is a constant arg to Eq64,
it will be the first one. This fix kinda works, but it is fragile and
only works when we remember to include the required extra rules.
The fundamental problem is that the rule matcher doesn't
know anything about commuting ops. This CL fixes that fact.
We already have information about which ops commute. (The register
allocator takes advantage of commutivity.) The rule generator now
automatically generates multiple rules for a single source rule when
there are commutative ops in the rule. We can now drop all of our
almost-duplicate source-level rules and the canonicalization rules.
I have some CLs in progress that will be a lot less verbose when
the rule generator handles commutivity for me.
I had to reorganize the load-combining rules a bit. The 8-way OR rules
generated 128 different reorderings, which was causing the generator
to put too much code in the rewrite*.go files (the big ones were going
from 25K lines to 132K lines). Instead I reorganized the rules to
combine pairs of loads at a time. The generated rule files are now
actually a bit (5%) smaller.
Make.bash times are ~unchanged.
Compiler benchmarks are not observably different. Probably because
we don't spend much compiler time in rule matching anyway.
I've also done a pass over all of our ops adding commutative markings
for ops which hadn't had them previously.
Fixes #18292
Change-Id: Ic1c0e43fbf579539f459971625f69690c9ab8805
Reviewed-on: https://go-review.googlesource.com/38801
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Updates #13560
Change-Id: I9bc08ca5cf0627e653d55f748ebb83be8b69ea3b
Reviewed-on: https://go-review.googlesource.com/39296
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
|
|
By checking GOARM in ssa/gen/ARM.rules, each intermediate operator
can be implemented via different instruction serials.
It is up to the user to choose between compitability and efficiency.
The Bswap32(x) is optimized to REV(x) when GOARM >= 6.
The CTZ(x) is optimized to CLZ(RBIT x) when GOARM == 7.
Change-Id: Ie9ee645fa39333fa79ad84ed4d1cefac30422814
Reviewed-on: https://go-review.googlesource.com/35610
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
|
|
This reverts commit 041ecb697f0e867a2bb0bf219cc2fd5f77057c2e.
Reason for revert: Not working on S390x and some 386 archs.
I have a guess why the S390x is failing. No clue on the 386 yet.
Revert until I can figure it out.
Change-Id: I64f1ce78fa6d1037ebe7ee2a8a8107cb4c1db70c
Reviewed-on: https://go-review.googlesource.com/38790
Reviewed-by: Keith Randall <khr@golang.org>
|
|
We have lots of rewrite rules that vary only in the fact that
we have 2 versions for the 2 different orderings of various
commuting ops. For example:
(ADDL x (MOVLconst [c])) -> (ADDLconst [c] x)
(ADDL (MOVLconst [c]) x) -> (ADDLconst [c] x)
It can get unwieldly quickly, especially when there is more than
one commuting op in a rule.
Our existing "fix" for this problem is to have rules that
canonicalize the operations first. For example:
(Eq64 x (Const64 <t> [c])) && x.Op != OpConst64 -> (Eq64 (Const64 <t> [c]) x)
Subsequent rules can then assume if there is a constant arg to Eq64,
it will be the first one. This fix kinda works, but it is fragile and
only works when we remember to include the required extra rules.
The fundamental problem is that the rule matcher doesn't
know anything about commuting ops. This CL fixes that fact.
We already have information about which ops commute. (The register
allocator takes advantage of commutivity.) The rule generator now
automatically generates multiple rules for a single source rule when
there are commutative ops in the rule. We can now drop all of our
almost-duplicate source-level rules and the canonicalization rules.
I have some CLs in progress that will be a lot less verbose when
the rule generator handles commutivity for me.
I had to reorganize the load-combining rules a bit. The 8-way OR rules
generated 128 different reorderings, which was causing the generator
to put too much code in the rewrite*.go files (the big ones were going
from 25K lines to 132K lines). Instead I reorganized the rules to
combine pairs of loads at a time. The generated rule files are now
actually a bit (5%) smaller.
[Note to reviewers: check these carefully. Most of the other rule
changes are trivial.]
Make.bash times are ~unchanged.
Compiler benchmarks are not observably different. Probably because
we don't spend much compiler time in rule matching anyway.
I've also done a pass over all of our ops adding commutative markings
for ops which hadn't had them previously.
Fixes #18292
Change-Id: I999b1307272e91965b66754576019dedcbe7527a
Reviewed-on: https://go-review.googlesource.com/38666
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
This reduces the number of calls back into the
gc Type routines, which will help performance
in a concurrent backend.
It also reduces the number of callsites
that must be considered in making the transition.
Passes toolstash-check -all. No compiler performance changes.
Updates #15756
Change-Id: Ic7a8f1daac7e01a21658ae61ac118b2a70804117
Reviewed-on: https://go-review.googlesource.com/38340
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Prior to this CL, the ssa.Frontend field was responsible
for providing types to the backend during compilation.
However, the types needed by the backend are few and static.
It makes more sense to use a struct for them
and to hang that struct off the ssa.Config,
which is the correct home for readonly data.
Now that Types is a struct, we can clean up the names a bit as well.
This has the added benefit of allowing early construction
of all types needed by the backend.
This will be useful for concurrent backend compilation.
Passes toolstash-check -all. No compiler performance change.
Updates #15756
Change-Id: I021658c8cf2836d6a22bbc20cc828ac38c7da08a
Reviewed-on: https://go-review.googlesource.com/38336
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Suggested by mdempsky in CL 38232.
This allows us to use the Frontend field
to associate frontend state and information
with a function.
See the following CL in the series for examples.
This is a giant CL, but it is almost entirely routine refactoring.
The ssa test API is starting to feel a bit unwieldy.
I will clean it up separately, once the dust has settled.
Passes toolstash -cmp.
Updates #15756
Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf
Reviewed-on: https://go-review.googlesource.com/38327
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Prior to this CL, config was an explicit argument
to the SSA rewrite rules, and rules that needed
a Frontend got at it via config.
An upcoming CL moves Frontend from Config to Func,
so rules can no longer reach Frontend via Config.
Passing a Frontend as an argument to the rewrite rules
causes a 2-3% regression in compile times.
This CL takes a different approach:
It treats the variable names "config" and "fe"
as special and calculates them as needed.
The "as needed part" is also important to performance:
If they are calculated eagerly, the nilchecks themselves
cause a regression.
This introduces a little bit of magic into the rewrite
generator. However, from the perspective of the rules,
the config variable was already more or less magic.
And it makes the upcoming changes much clearer.
Passes toolstash -cmp.
Change-Id: I173f2bcc124cba43d53138bfa3775e21316a9107
Reviewed-on: https://go-review.googlesource.com/38326
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
|
|
Remove size AuxInt in Store, and alignment in Move/Zero. We still
pass size AuxInt to Move/Zero, as it is used for partial Move/Zero
lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288).
SizeAndAlign is gone.
Passes "toolstash -cmp" on std.
Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d
Reviewed-on: https://go-review.googlesource.com/38150
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
For SSA Store/Move/Zero ops, attach the type of the value being
stored to the op as the Aux field. This type will be used for
write barrier insertion (in a followup CL). Since SSA passes
do not accurately propagate types of values (because of type
casting), we can't simply use type of the store's arguments
for write barrier insertion.
Passes "toolstash -cmp" on std.
Updates #17583.
Change-Id: I051d5e5c482931640d1d7d879b2a6bb91f2e0056
Reviewed-on: https://go-review.googlesource.com/36838
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
|
|
This is a mostly mechanical rename followed by manual fixes where necessary.
Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
|
|
Recognize runtime.newobject and don't Zero or NilCheck it.
Fixes #15914 (?)
Updates #15390.
TBD: add test
Change-Id: Ia3bfa5c2ddbe2c27c92d9f68534a713b5ce95934
Reviewed-on: https://go-review.googlesource.com/27930
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Compound AUTO types weren't named previously. That was because live
variable analysis (plive.go) doesn't handle spilling to compound types.
It can't handle them because there is no valid place to put VARDEFs when
regalloc is spilling compound types.
compound types = multiword builtin types: complex, string, slice, and
interface.
Instead, we split named AUTOs into individual one-word variables. For
example, a string s gets split into a byte ptr s.ptr and an integer
s.len. Those two variables can be spilled to / restored from
independently. As a result, live variable analysis can handle them
because they are one-word objects.
This CL will change how AUTOs are described in DWARF information.
Consider the code:
func f(s string, i int) int {
x := s[i:i+5]
g()
return lookup(x)
}
The old compiler would spill x to two consecutive slots on the stack,
both named x (at offsets 0 and 8). The new compiler spills the pointer
of x to a slot named x.ptr. It doesn't spill x.len at all, as it is a
constant (5) and can be rematerialized for the call to lookup.
So compound objects may not be spilled in their entirety, and even if
they are they won't necessarily be contiguous. Such is the price of
optimization.
Re-enable live variable analysis tests. One test remains disabled, it
fails because of #14904.
Change-Id: I8ef2b5ab91e43a0d2136bfc231c05d100ec0b801
Reviewed-on: https://go-review.googlesource.com/21233
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|
|
Allow names to be used for subexpressions of match rules.
For example:
(OpA x:(OpB y)) -> ..use x here to refer to the OpB value..
This gets rid of the .Args[0].Args[0]... way of naming we
used to use.
While we're here, give all subexpression matches names instead
of recomputing them with .Args[i] sequences each time they
are referenced. Makes the generated rule code a bit smaller.
Change-Id: Ie42139f6f208933b75bd2ae8bd34e95419bc0e4e
Reviewed-on: https://go-review.googlesource.com/20997
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
|
|
* Move lowering into a separate pass.
* SliceLen/SliceCap is now available to various intermediate passes
which use useful for bounds checking.
* Add a second opt pass to handle the new opportunities
Decreases the code size of binaries in pkg/tool/linux_amd64
by ~45K.
Updates #14564 #14606
Change-Id: I5b2bd6202181c50623a3585fbf15c0d6db6d4685
Reviewed-on: https://go-review.googlesource.com/20172
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
|