aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/compile/internal/ssa/check.go
AgeCommit message (Collapse)Author
2021-03-03cmd/compile: make modified Aux type for OpArgXXXX pass ssa/checkDavid Chase
For #40724. Change-Id: I7d1e76139d187cd15a6e0df9d19542b7200589f6 Reviewed-on: https://go-review.googlesource.com/c/go/+/297911 Trust: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-12-21[dev.regabi] all: merge master into dev.regabiMatthew Dempsky
The list of conflicted files for this merge is: src/cmd/compile/internal/gc/inl.go src/cmd/compile/internal/gc/order.go src/cmd/compile/internal/gc/ssa.go test/fixedbugs/issue20415.go test/fixedbugs/issue22822.go test/fixedbugs/issue28079b.go inl.go was updated for changes on dev.regabi: namely that OSELRECV has been removed, and that OSELRECV2 now only uses List, rather than both Left and List. order.go was updated IsAutoTmp is now a standalone function, rather than a method on Node. ssa.go was similarly updated for new APIs involving package ir. The tests are all merging upstream additions for gccgo error messages with changes to cmd/compile's error messages on the dev.regabi branch. Change-Id: Icaaf186d69da791b5994dbb6688ec989caabec42
2020-12-14cmd/compile: fix incorrect shift count type with s390x rulesRuixin Bao
The type of the shift count must be an unsigned integer. Some s390x rules for shift have their auxint type being int8. This results in a compilation failure on s390x with an invalid operation when running make.bash using older versions of go (e.g: go1.10.4). This CL adds an auxint type of uint8 and changes the ops for shift and rotate to use auxint with type uint8. The related rules are also modified to address this change. Fixes #43090 Change-Id: I594274b6e3d9b23092fc9e9f4b354870164f2f19 Reviewed-on: https://go-review.googlesource.com/c/go/+/277078 Reviewed-by: Keith Randall <khr@golang.org> Trust: Dmitri Shuralyov <dmitshur@golang.org>
2020-12-08[dev.regabi] cmd/compile: add ssa.Aux tag interface for Value.AuxMatthew Dempsky
It's currently hard to automate refactorings around the Value.Aux field, because we don't have any static typing information for it. Adding a tag interface will make subsequent CLs easier and safer. Passes buildall w/ toolstash -cmp. Updates #42982. Change-Id: I41ae8e411a66bda3195a0957b60c2fe8a8002893 Reviewed-on: https://go-review.googlesource.com/c/go/+/275756 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Trust: Matthew Dempsky <mdempsky@google.com>
2020-09-18cmd/compile: add type check for ssa genericOpssurechen
Change-Id: I2233a6a157ec8feffaefd6a8ee65b1c38778c1cd Reviewed-on: https://go-review.googlesource.com/c/go/+/255238 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Giovanni Bajo <rasky@develer.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Trust: Giovanni Bajo <rasky@develer.com>
2020-09-16cmd/compile: introduce special ssa Aux type for callsDavid Chase
This is prerequisite to moving call expansion later into SSA, and probably a good idea anyway. Passes tests. This is the first minimal CL that does a 1-for-1 substitution of *ssa.AuxCall for *obj.LSym. Next step (next CL) is to make this change for all calls so that additional information can be stored in AuxCall. Change-Id: Ia3a7715648fd9fb1a176850767a726e6f5b959eb Reviewed-on: https://go-review.googlesource.com/c/go/+/237680 Trust: David Chase <drchase@google.com> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-03cmd/compile: store the comparison pseudo-ops of arm64 conditional ↵fanzha02
instructions in AuxInt The current implementation stores the comparison pseudo-ops of arm64 conditional instructions (CSEL/CSEL0) in Aux, this patch modifies it and stores it in AuxInt, which can avoid the allocation. Change-Id: I0b69e51f63acd84c6878c6a59ccf6417501a8cfc Reviewed-on: https://go-review.googlesource.com/c/go/+/252517 Run-TryBot: fannie zhang <Fannie.Zhang@arm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-06-18cmd/compile: redo flag constant ops for armKeith Randall
Encode the flag results in an auxint field instead of having one opcode per flag state. This helps us handle the new *noov branches in a unified manner. This is only for arm, arm64 is in a subsequent CL. We could extend to other architectures as well, athough it would only be cleanup, no behavioral change. Update #39505 Change-Id: Ia46cea596faad540d1496c5915ab1274571543f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/238077 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-17cmd/compile: make some s390x rules use strongly typed aux valuesMichael Munday
This first pass makes the rules using the condition code mask (CCMask) and rotate parameters (RotateParams) aux values strongly typed. This required adding strongly typed aux handling to the block rulegen. More CLs like this to follow, but this is probably the most complex. Passes toolstash-check -all. Change-Id: Ie513b07d527f0c1b398d7748331442dcb5f7b17d Reviewed-on: https://go-review.googlesource.com/c/go/+/228518 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-09cmd/compile: start implementing strongly typed aux and auxint fieldsKeith Randall
Right now the Aux and AuxInt fields of ssa.Values are typed as interface{} and int64, respectively. Each rule that uses these values must cast them to the type they actually are (*obj.LSym, or int32, or ValAndOff, etc.), use them, and then cast them back to interface{} or int64. We know for each opcode what the types of the Aux and AuxInt fields should be. So let's modify the rule generator to declare the types to be what we know they should be, autoconverting to and from the generic types for us. That way we can make the rules more type safe. It's difficult to make a single CL for this, so I've coopted the "=>" token to indicate a rule that is strongly typed. "->" rules are processed as before. That will let us migrate a few rules at a time in separate CLs. Hopefully we can reach a state where all rules are strongly typed and we can drop the distinction. This CL changes just a few rules to get a feel for what this transition would look like. I've decided not to put explicit types in the rules. I think it makes the rules somewhat clearer, but definitely more verbose. In particular, the passthrough rules that don't modify the fields in question are verbose for no real reason. Change-Id: I63a1b789ac5702e7caf7934cd49f784235d1d73d Reviewed-on: https://go-review.googlesource.com/c/go/+/190197 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2020-03-04cmd/compile: don't allow NaNs in floating-point constant opsKeith Randall
Trying this CL again, with a fixed test that allows platforms to disagree on the exact behavior of converting NaNs. We store 32-bit floating point constants in a 64-bit field, by converting that 32-bit float to 64-bit float to store it, and convert it back to use it. That works for *almost* all floating-point constants. The exception is signaling NaNs. The round trip described above means we can't represent a 32-bit signaling NaN, because conversions strip the signaling bit. To fix this issue, just forbid NaNs as floating-point constants in SSA form. This shouldn't affect any real-world code, as people seldom constant-propagate NaNs (except in test code). Additionally, NaNs are somewhat underspecified (which of the many NaNs do you get when dividing 0/0?), so when cross-compiling there's a danger of using the compiler machine's NaN regime for some math, and the target machine's NaN regime for other math. Better to use the target machine's NaN regime always. Update #36400 Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb Reviewed-on: https://go-review.googlesource.com/c/go/+/221790 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-02-28cmd/compile: add dedicated ARM64BitField aux typeJosh Bleecher Snyder
The goal here is improved AuxInt printing in ssa.html. Instead of displaying an inscrutable encoded integer, it displays something like v25 (28) = UBFX <int> [lsb=4,width=8] v52 which is much nicer for debugging. Change-Id: I40713ff7f4a857c4557486cdf73c2dff137511ca Reviewed-on: https://go-review.googlesource.com/c/go/+/221420 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-02-25Revert "cmd/compile: don't allow NaNs in floating-point constant ops"Bryan C. Mills
This reverts CL 213477. Reason for revert: tests are failing on linux-mips*-rtrk builders. Change-Id: I8168f7450890233f1bd7e53930b73693c26d4dc0 Reviewed-on: https://go-review.googlesource.com/c/go/+/220897 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-02-25cmd/compile: don't allow NaNs in floating-point constant opsKeith Randall
We store 32-bit floating point constants in a 64-bit field, by converting that 32-bit float to 64-bit float to store it, and convert it back to use it. That works for *almost* all floating-point constants. The exception is signaling NaNs. The round trip described above means we can't represent a 32-bit signaling NaN, because conversions strip the signaling bit. To fix this issue, just forbid NaNs as floating-point constants in SSA form. This shouldn't affect any real-world code, as people seldom constant-propagate NaNs (except in test code). Additionally, NaNs are somewhat underspecified (which of the many NaNs do you get when dividing 0/0?), so when cross-compiling there's a danger of using the compiler machine's NaN regime for some math, and the target machine's NaN regime for other math. Better to use the target machine's NaN regime always. This has been a bug since 1.10, and there's an easy workaround (declare a global varaible containing the signaling NaN pattern, and use that as the argument to math.Float32frombits) so we'll fix it in 1.15. Fixes #36400 Update #36399 Change-Id: Icf155e743281560eda2eed953d19a829552ccfda Reviewed-on: https://go-review.googlesource.com/c/go/+/213477 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-11-05cmd/compile: fix liveness for open-coded defer args for infinite loopsDan Scales
Once defined, a stack slot holding an open-coded defer arg should always be marked live, since it may be used at any time if there is a panic. These stack slots are typically kept live naturally by the open-defer code inlined at each return/exit point. However, we need to do extra work to make sure that they are kept live if a function has an infinite loop or a panic exit. For this fix, only in the case of a function that is using open-coded defers, we compute the set of blocks (most often empty) that cannot reach a return or a BlockExit (panic) because of an infinite loop. Then, for each block b which cannot reach a return or BlockExit or is a BlockExit block, we mark each defer arg slot as live, as long as the definition of the defer arg slot dominates block b. For this change, had to export (*Func).sdom (-> Sdom) and SparseTree.isAncestorEq (-> IsAncestorEq) Updates #35277 Change-Id: I7b53c9bd38ba384a3794386dd0eb94e4cbde4eb1 Reviewed-on: https://go-review.googlesource.com/c/go/+/204802 Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-10-02cmd/compile: allow multiple SSA block control valuesMichael Munday
Control values are used to choose which successor of a block is jumped to. Typically a control value takes the form of a 'flags' value that represents the result of a comparison. Some architectures however use a variable in a register as a control value. Up until now we have managed with a single control value per block. However some architectures (e.g. s390x and riscv64) have combined compare-and-branch instructions that take two variables in registers as parameters. To generate these instructions we need to support 2 control values per block. This CL allows up to 2 control values to be used in a block in order to support the addition of compare-and-branch instructions. I have implemented s390x compare-and-branch instructions in a different CL. Passes toolstash-check -all. Results of compilebench: name old time/op new time/op delta Template 208ms ± 1% 209ms ± 1% ~ (p=0.289 n=20+20) Unicode 83.7ms ± 1% 83.3ms ± 3% -0.49% (p=0.017 n=18+18) GoTypes 748ms ± 1% 748ms ± 0% ~ (p=0.460 n=20+18) Compiler 3.47s ± 1% 3.48s ± 1% ~ (p=0.070 n=19+18) SSA 11.5s ± 1% 11.7s ± 1% +1.64% (p=0.000 n=19+18) Flate 130ms ± 1% 130ms ± 1% ~ (p=0.588 n=19+20) GoParser 160ms ± 1% 161ms ± 1% ~ (p=0.211 n=20+20) Reflect 465ms ± 1% 467ms ± 1% +0.42% (p=0.007 n=20+20) Tar 184ms ± 1% 185ms ± 2% ~ (p=0.087 n=18+20) XML 253ms ± 1% 253ms ± 1% ~ (p=0.377 n=20+18) LinkCompiler 769ms ± 2% 774ms ± 2% ~ (p=0.070 n=19+19) ExternalLinkCompiler 3.59s ±11% 3.68s ± 6% ~ (p=0.072 n=20+20) LinkWithoutDebugCompiler 446ms ± 5% 454ms ± 3% +1.79% (p=0.002 n=19+20) StdCmd 26.0s ± 2% 26.0s ± 2% ~ (p=0.799 n=20+20) name old user-time/op new user-time/op delta Template 238ms ± 5% 240ms ± 5% ~ (p=0.142 n=20+20) Unicode 105ms ±11% 106ms ±10% ~ (p=0.512 n=20+20) GoTypes 876ms ± 2% 873ms ± 4% ~ (p=0.647 n=20+19) Compiler 4.17s ± 2% 4.19s ± 1% ~ (p=0.093 n=20+18) SSA 13.9s ± 1% 14.1s ± 1% +1.45% (p=0.000 n=18+18) Flate 145ms ±13% 146ms ± 5% ~ (p=0.851 n=20+18) GoParser 185ms ± 5% 188ms ± 7% ~ (p=0.174 n=20+20) Reflect 534ms ± 3% 538ms ± 2% ~ (p=0.105 n=20+18) Tar 215ms ± 4% 211ms ± 9% ~ (p=0.079 n=19+20) XML 295ms ± 6% 295ms ± 5% ~ (p=0.968 n=20+20) LinkCompiler 832ms ± 4% 837ms ± 7% ~ (p=0.707 n=17+20) ExternalLinkCompiler 1.58s ± 8% 1.60s ± 4% ~ (p=0.296 n=20+19) LinkWithoutDebugCompiler 478ms ±12% 489ms ±10% ~ (p=0.429 n=20+20) name old object-bytes new object-bytes delta Template 559kB ± 0% 559kB ± 0% ~ (all equal) Unicode 216kB ± 0% 216kB ± 0% ~ (all equal) GoTypes 2.03MB ± 0% 2.03MB ± 0% ~ (all equal) Compiler 8.07MB ± 0% 8.07MB ± 0% -0.06% (p=0.000 n=20+20) SSA 27.1MB ± 0% 27.3MB ± 0% +0.89% (p=0.000 n=20+20) Flate 343kB ± 0% 343kB ± 0% ~ (all equal) GoParser 441kB ± 0% 441kB ± 0% ~ (all equal) Reflect 1.36MB ± 0% 1.36MB ± 0% ~ (all equal) Tar 487kB ± 0% 487kB ± 0% ~ (all equal) XML 632kB ± 0% 632kB ± 0% ~ (all equal) name old export-bytes new export-bytes delta Template 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) Unicode 7.92kB ± 0% 7.92kB ± 0% ~ (all equal) GoTypes 35.0kB ± 0% 35.0kB ± 0% ~ (all equal) Compiler 109kB ± 0% 110kB ± 0% +0.72% (p=0.000 n=20+20) SSA 137kB ± 0% 138kB ± 0% +0.58% (p=0.000 n=20+20) Flate 4.89kB ± 0% 4.89kB ± 0% ~ (all equal) GoParser 8.49kB ± 0% 8.49kB ± 0% ~ (all equal) Reflect 11.4kB ± 0% 11.4kB ± 0% ~ (all equal) Tar 10.5kB ± 0% 10.5kB ± 0% ~ (all equal) XML 16.7kB ± 0% 16.7kB ± 0% ~ (all equal) name old text-bytes new text-bytes delta HelloSize 761kB ± 0% 761kB ± 0% ~ (all equal) CmdGoSize 10.8MB ± 0% 10.8MB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 10.7kB ± 0% 10.7kB ± 0% ~ (all equal) CmdGoSize 312kB ± 0% 312kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 122kB ± 0% 122kB ± 0% ~ (all equal) CmdGoSize 146kB ± 0% 146kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.13MB ± 0% 1.13MB ± 0% ~ (all equal) CmdGoSize 15.1MB ± 0% 15.1MB ± 0% ~ (all equal) Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f Reviewed-on: https://go-review.googlesource.com/c/go/+/196557 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Keith Randall <khr@golang.org>
2019-09-26cmd/compile: use numeric condition code masks on s390xMichael Munday
Prior to this CL conditional branches on s390x always used an extended mnemonic such as BNE, BLT and so on to represent branch instructions with different condition code masks. This CL adds support for numeric condition code masks to the s390x SSA backend so that we can encode the condition under which a Block's successor is chosen as a field in that Block rather than in its type. This change will be useful as we come to add support for combined compare-and-branch instructions. Rather than trying to add extended mnemonics for every possible combination of mask and compare-and- branch instruction we can instead use a single mnemonic for each instruction. Change-Id: Idb7458f187b50906877d683695c291dff5279553 Reviewed-on: https://go-review.googlesource.com/c/go/+/197178 Reviewed-by: Keith Randall <khr@golang.org>
2019-08-28cmd/compile: remove auxSymInt32Keith Randall
We never used it, might as well get rid of it. Change-Id: I5c23c93e90173bff9ac1fc1b8ae1e2025215d6eb Reviewed-on: https://go-review.googlesource.com/c/go/+/191938 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-09-05cmd/compile: fix store-to-load forwarding of 32-bit sNaNsMichael Munday
Signalling NaNs were being converted to quiet NaNs during constant propagation through integer <-> float store-to-load forwarding. This occurs because we store float32 constants as float64 values and CPU hardware 'quietens' NaNs during conversion between the two. Eventually we want to move to using float32 values to store float32 constants, however this will be a big change since both the compiler and the assembler expect float64 values. So for now this is a small change that will fix the immediate issue. Fixes #27193. Change-Id: Iac54bd8c13abe26f9396712bc71f9b396f842724 Reviewed-on: https://go-review.googlesource.com/132956 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-07-12cmd/compile: add LocalAddr that takes SP,mem operandsDavid Chase
Lack of a well-defined order between VarDef and related address operations sometimes causes problems with store order and write barrier transformations; glitches in the order are made irreparable (by later optimizations) if the two parts of the glitch straddle a split in the original block caused by insertion of a write barrier diamond. Fix this by creating a LocalAddr for addresses of locals (what VarDef matters for) that takes a memory input to help make the order explicit. Addr is modified to only be legal for SB operand, so there is no overlap between Addr and LocalAddr uses (there may be some downstream cleanup from this). Changes to generic.rules and rewrite.go ensure that codegen tests continue to pass; CSE of LocalAddr is impaired, not quite sure of the cost. Fixes #26105. Change-Id: Id4192b4440aa4e9d7ba54a465c456df9b530b515 Reviewed-on: https://go-review.googlesource.com/122483 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2018-04-24cmd/compile/internal/ssa: add Op{SP,SB} type checks to check.goisharipo
gc/ssa.go initilizes SP and SB values with TUINTPTR type. Assign same type in SSA tests and modify check.go to catch mismatching types for those ops. This makes SSA tests more consistent. Change-Id: I798440d57d00fb949d1a0cd796759c9b82a934bd Reviewed-on: https://go-review.googlesource.com/106658 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-02-20cmd/compile/internal/ssa: emit csel on arm64philhofer
Introduce a new SSA pass to generate CondSelect intstrutions, and add CondSelect lowering rules for arm64. In order to make the CSEL instruction easier to optimize, and to simplify the introduction of CSNEG, CSINC, and CSINV in the future, modify the CSEL instruction to accept a condition code in the aux field. Notably, this change makes the go1 Gzip benchmark more than 10% faster. Benchmarks on a Cavium ThunderX: name old time/op new time/op delta BinaryTree17-96 15.9s ± 6% 16.0s ± 4% ~ (p=0.968 n=10+9) Fannkuch11-96 7.17s ± 0% 7.00s ± 0% -2.43% (p=0.000 n=8+9) FmtFprintfEmpty-96 208ns ± 1% 207ns ± 0% ~ (p=0.152 n=10+8) FmtFprintfString-96 379ns ± 0% 375ns ± 0% -0.95% (p=0.000 n=10+9) FmtFprintfInt-96 385ns ± 0% 383ns ± 0% -0.52% (p=0.000 n=9+10) FmtFprintfIntInt-96 591ns ± 0% 586ns ± 0% -0.85% (p=0.006 n=7+9) FmtFprintfPrefixedInt-96 656ns ± 0% 667ns ± 0% +1.71% (p=0.000 n=10+10) FmtFprintfFloat-96 967ns ± 0% 984ns ± 0% +1.78% (p=0.000 n=10+10) FmtManyArgs-96 2.35µs ± 0% 2.25µs ± 0% -4.63% (p=0.000 n=9+8) GobDecode-96 31.0ms ± 0% 30.8ms ± 0% -0.36% (p=0.006 n=9+9) GobEncode-96 24.4ms ± 0% 24.5ms ± 0% +0.30% (p=0.000 n=9+9) Gzip-96 1.60s ± 0% 1.43s ± 0% -10.58% (p=0.000 n=9+10) Gunzip-96 167ms ± 0% 169ms ± 0% +0.83% (p=0.000 n=8+9) HTTPClientServer-96 311µs ± 1% 308µs ± 0% -0.75% (p=0.000 n=10+10) JSONEncode-96 65.0ms ± 0% 64.8ms ± 0% -0.25% (p=0.000 n=9+8) JSONDecode-96 262ms ± 1% 261ms ± 1% ~ (p=0.579 n=10+10) Mandelbrot200-96 18.0ms ± 0% 18.1ms ± 0% +0.17% (p=0.000 n=8+10) GoParse-96 14.0ms ± 0% 14.1ms ± 1% +0.42% (p=0.003 n=9+10) RegexpMatchEasy0_32-96 644ns ± 2% 645ns ± 2% ~ (p=0.836 n=10+10) RegexpMatchEasy0_1K-96 3.70µs ± 0% 3.49µs ± 0% -5.58% (p=0.000 n=10+10) RegexpMatchEasy1_32-96 662ns ± 2% 657ns ± 2% ~ (p=0.137 n=10+10) RegexpMatchEasy1_1K-96 4.47µs ± 0% 4.31µs ± 0% -3.48% (p=0.000 n=10+10) RegexpMatchMedium_32-96 844ns ± 2% 849ns ± 1% ~ (p=0.208 n=10+10) RegexpMatchMedium_1K-96 179µs ± 0% 182µs ± 0% +1.20% (p=0.000 n=10+10) RegexpMatchHard_32-96 10.0µs ± 0% 10.1µs ± 0% +0.48% (p=0.000 n=10+9) RegexpMatchHard_1K-96 297µs ± 0% 297µs ± 0% -0.14% (p=0.000 n=10+10) Revcomp-96 3.08s ± 0% 3.13s ± 0% +1.56% (p=0.000 n=9+9) Template-96 276ms ± 2% 275ms ± 1% ~ (p=0.393 n=10+10) TimeParse-96 1.37µs ± 0% 1.36µs ± 0% -0.53% (p=0.000 n=10+7) TimeFormat-96 1.40µs ± 0% 1.42µs ± 0% +0.97% (p=0.000 n=10+10) [Geo mean] 264µs 262µs -0.77% Change-Id: Ie54eee4b3092af53e6da3baa6d1755098f57f3a2 Reviewed-on: https://go-review.googlesource.com/55670 Run-TryBot: Philip Hofer <phofer@umich.edu> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2018-02-20cmd/compile: reset branch prediction when deleting a branchKeith Randall
When we go from a branch block to a plain block, reset the branch prediction bit. Downstream passes asssume that if the branch prediction is set, then the block has 2 successors. Fixes #23504 Change-Id: I2898ec002228b2e34fe80ce420c6939201c0a5aa Reviewed-on: https://go-review.googlesource.com/88955 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-02-14cmd/compile: reimplement location list generationHeschi Kreinick
Completely redesign and reimplement location list generation to be more efficient, and hopefully not too hard to understand. RegKills are gone. Instead of using the regalloc's liveness calculations, redo them using the Ops' clobber information. Besides saving a lot of Values, this avoids adding RegKills to blocks that would be empty otherwise, which was messing up optimizations. This does mean that it's much harder to tell whether the generation process is buggy (there's nothing to cross-check it with), and there may be disagreements with GC liveness. But the performance gain is significant, and it's nice not to be messing with earlier compiler phases. The intermediate representations are gone. Instead of producing ssa.BlockDebugs, then dwarf.LocationLists, and then finally real location lists, go directly from the SSA to a (mostly) real location list. Because the SSA analysis happens before assembly, it stores encoded block/value IDs where PCs would normally go. It would be easier to do the SSA analysis after assembly, but I didn't want to retain the SSA just for that. Generation proceeds in two phases: first, it traverses the function in CFG order, storing the state of the block at the beginning and end. End states are used to produce the start states of the successor blocks. In the second phase, it traverses in program text order and produces the location lists. The processing in the second phase is redundant, but much cheaper than storing the intermediate representation. It might be possible to combine the two phases somewhat to take advantage of cases where the CFG matches the block layout, but I haven't tried. Location lists are finalized by adding a base address selection entry, translating each encoded block/value ID to a real PC, and adding the terminating zero entry. This probably won't work on OSX, where dsymutil will choke on the base address selection. I tried emitting CU-relative relocations for each address, and it was *very* bad for performance -- it uses more memory storing all the relocations than it does for the actual location list bytes. I think I'm going to end up synthesizing the relocations in the linker only on OSX, but TBD. TestNexting needs updating: with more optimizations working, the debugger doesn't stop on the continue (line 88) any more, and the test's duplicate suppression kicks in. Also, dx and dy live a little longer now, but they have the correct values. Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307 Reviewed-on: https://go-review.googlesource.com/89356 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-11-30cmd/compile: use soft-float routines for soft-float targetsVladimir Stefanovic
Updates #18162 (mostly fixes) Change-Id: I35bcb8a688bdaa432adb0ddbb73a2f7adda47b9e Reviewed-on: https://go-review.googlesource.com/37958 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-11-21cmd/compile: ignore RegKill ops for non-phi after phi checkThan McIntosh
Relax the 'phi after non-phi' SSA sanity check to allow RegKill ops interspersed with phi ops in a block. This fixes a sanity check failure when -dwarflocationlists is enabled. Updates #22694. Change-Id: Iaae604ab6f1a8b150664dd120003727a6fb2f698 Reviewed-on: https://go-review.googlesource.com/77610 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-10-05cmd/compile: make loop finder more aware of irreducible loopsDavid Chase
The loop finder doesn't return good information if it encounters an irreducible loop. Make a start on improving this, and set a function-level flag to indicate when there is such a loop (and the returned information might be flaky). Use that flag to prevent the loop rotater from getting confused; the existing code seems to depend on artifacts of the previous loop-finding algorithm. (There is one irreducible loop in the go library, in "inflate.go"). Change-Id: If6e26feab38d9b009d2252d556e1470c803bde40 Reviewed-on: https://go-review.googlesource.com/42150 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-09-08cmd/compile: propagate constants through math.Float{32,64}{,from}bitsMichael Munday
This CL adds generic SSA rules to propagate constants through raw bits conversions between floats and integers. This allows constants to propagate through some math functions. For example, math.Copysign(0, -1) is now constant folded to a load of -0.0. Requires a fix to the ARM assembler which loaded -0.0 as +0.0. Change-Id: I52649a4691077c7414f19d17bb599a6743c23ac2 Reviewed-on: https://go-review.googlesource.com/62250 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-06-07cmd/compile: check that phis are always first after schedulingKeith Randall
Update #20178 Change-Id: I603f77268ed38afdd84228c775efe006f08f14a7 Reviewed-on: https://go-review.googlesource.com/45018 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-05-15cmd/compile: better check for single live memoryKeith Randall
Enhance the one-live-memory-at-a-time check to run during many more phases of the SSA backend. Also make it work in an interblock fashion. Change types.IsMemory to return true for tuples containing a memory type. Fix trim pass to build the merged phi correctly. Doesn't affect code but allows the check to pass after trim runs. Switch the AddTuple* ops to take the memory-containing tuple argument second. Update #20335 Change-Id: I5b03ef3606b75a9e4f765276bb8b183cdc172b43 Reviewed-on: https://go-review.googlesource.com/43495 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-05-11cmd/compile: fix store chain in schedule passKeith Randall
Tuple ops are weird. They are essentially a pair of ops, one which consumes a mem and one which generates a mem (the Select1). The schedule pass didn't handle these quite right. Fix the scheduler to include both parts of the paired op in the store chain. That makes sure that loads are correctly ordered with respect to the first of the pair. Add a check for the ssacheck builder, that there is only one live store at a time. I thought we already had such a check, but apparently not... Fixes #20335 Change-Id: I59eb3446a329100af38d22820b1ca2190ca46a78 Reviewed-on: https://go-review.googlesource.com/43294 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2017-04-19cmd/compile: enhance postorder computation and repair loop finderDavid Chase
Replace derecursed postorder computation with one that mimics DFS traversal. Corrected outerinner function in loopfinder Leave enhanced checks in place. Change-Id: I657ba5e89c88941028d6d4c72e9f9056e30f1ce8 Reviewed-on: https://go-review.googlesource.com/40872 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2017-03-16cmd/compile: use type information in Aux for Store sizeCherry Zhang
Remove size AuxInt in Store, and alignment in Move/Zero. We still pass size AuxInt to Move/Zero, as it is used for partial Move/Zero lowering (e.g. cmd/compile/internal/ssa/gen/386.rules:288). SizeAndAlign is gone. Passes "toolstash -cmp" on std. Change-Id: I1ca34652b65dd30de886940e789fcf41d521475d Reviewed-on: https://go-review.googlesource.com/38150 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-10-25cmd/compile: add a writebarrier phase in SSACherry Zhang
When the compiler insert write barriers, the frontend makes conservative decisions at an early stage. This may have false positives which result in write barriers for stack writes. A new phase, writebarrier, is added to the SSA backend, to delay the decision and eliminate false positives. The frontend still makes conservative decisions. When building SSA, instead of emitting runtime calls directly, it emits WB ops (StoreWB, MoveWB, etc.), which will be expanded to branches and runtime calls in writebarrier phase. Writes to static locations on stack are detected and write barriers are removed. All write barriers of stack writes found by the script from issue #17330 are eliminated (except two false positives). Fixes #17330. Change-Id: I9bd66333da9d0ceb64dcaa3c6f33502798d1a0f8 Reviewed-on: https://go-review.googlesource.com/31131 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2016-09-19cmd/compile: cache CFG-dependent computationsKeith Randall
We compute a lot of stuff based off the CFG: postorder traversal, dominators, dominator tree, loop nest. Multiple phases use this information and we end up recomputing some of it. Add a cache for this information so if the CFG hasn't changed, we can reuse the previous computation. Change-Id: I9b5b58af06830bd120afbee9cfab395a0a2f74b2 Reviewed-on: https://go-review.googlesource.com/29356 Reviewed-by: David Chase <drchase@google.com>
2016-09-15cmd/compile: redo nil checksKeith Randall
Get rid of BlockCheck. Josh goaded me into it, and I went down a rabbithole making it happen. NilCheck now panics if the pointer is nil and returns void, as before. BlockCheck is gone, and NilCheck is no longer a Control value for any block. It just exists (and deadcode knows not to throw it away). I rewrote the nilcheckelim pass to handle this case. In particular, there can now be multiple NilCheck ops per block. I moved all of the arch-dependent nil check elimination done as part of ssaGenValue into its own proper pass, so we don't have to duplicate that code for every architecture. Making the arch-dependent nil check its own pass means I needed to add a bunch of flags to the opcode table so I could write the code without arch-dependent ops everywhere. Change-Id: I419f891ac9b0de313033ff09115c374163416a9f Reviewed-on: https://go-review.googlesource.com/29120 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-09-12cmd/compile: get rid of BlockCallKeith Randall
No need for it, we can treat calls as (mostly) normal values that take a memory and return a memory. Lowers the number of basic blocks needed to represent a function. "go test -c net/http" uses 27% fewer basic blocks. Probably doesn't affect generated code much, but should help various passes whose running time and/or space depends on the number of basic blocks. Fixes #15631 Change-Id: I0bf21e123f835e2cfa382753955a4f8bce03dfa6 Reviewed-on: https://go-review.googlesource.com/28950 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-08-31cmd/compile: print SizeAndAlign AuxInt values correctlyKeith Randall
Makes the AuxInt arg to Move/Zero print in a readable format. Change-Id: I12295959b00ff7c1638d35836cc6d64d112c11ca Reviewed-on: https://go-review.googlesource.com/28271 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-16cmd/compile: use sparse algorithm for phis in large programDavid Chase
This adds a sparse method for locating nearest ancestors in a dominator tree, and checks blocks with more than one predecessor for differences and inserts phi functions where there are. Uses reversed post order to cut number of passes, running it from first def to last use ("last use" for paramout and mem is end-of-program; last use for a phi input from a backedge is the source of the back edge) Includes a cutover from old algorithm to new to avoid paying large constant factor for small programs. This keeps normal builds running at about the same time, while not running over-long on large machine-generated inputs. Add "phase" flags for ssa/build -- ssa/build/stats prints number of blocks, values (before and after linking references and inserting phis, so expansion can be measured), and their product; the product governs the cutover, where a good value seems to be somewhere between 1 and 5 million. Among the files compiled by make.bash, this is the shape of the tail of the distribution for #blocks, #vars, and their product: #blocks #vars product max 6171 28180 173,898,780 99.9% 1641 6548 10,401,878 99% 463 1909 873,721 95% 152 639 95,235 90% 84 359 30,021 The old algorithm is indeed usually fastest, for 99%ile values of usually. The fix to LookupVarOutgoing ( https://go-review.googlesource.com/#/c/22790/ ) deals with some of the same problems addressed by this CL, but on at least one bug ( #15537 ) this change is still a significant help. With this CL: /tmp/gopath$ rm -rf pkg bin /tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \ github.com/gogo/protobuf/test/theproto3/combos/... ... real 4m35.200s user 13m16.644s sys 0m36.712s and pprof reports 3.4GB allocated in one of the larger profiles With tip: /tmp/gopath$ rm -rf pkg bin /tmp/gopath$ time go get -v -gcflags -memprofile=y.mprof \ github.com/gogo/protobuf/test/theproto3/combos/... ... real 10m36.569s user 25m52.286s sys 4m3.696s and pprof reports 8.3GB allocated in the same larger profile With this CL, most of the compilation time on the benchmarked input is spent in register/stack allocation (cumulative 53%) and in the sparse lookup algorithm itself (cumulative 20%). Fixes #15537. Change-Id: Ia0299dda6a291534d8b08e5f9883216ded677a00 Reviewed-on: https://go-review.googlesource.com/22342 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-05cmd/compile: enable constant-time CFG editingKeith Randall
Provide indexes along with block pointers for Preds and Succs arrays. This allows us to splice edges in and out of those arrays in constant time. Fixes worst-case O(n^2) behavior in deadcode and fuse. benchmark old ns/op new ns/op delta BenchmarkFuse1-8 2065 2057 -0.39% BenchmarkFuse10-8 9408 9073 -3.56% BenchmarkFuse100-8 105238 76277 -27.52% BenchmarkFuse1000-8 3982562 1026750 -74.22% BenchmarkFuse10000-8 301220329 12824005 -95.74% BenchmarkDeadCode1-8 1588 1566 -1.39% BenchmarkDeadCode10-8 4333 4250 -1.92% BenchmarkDeadCode100-8 32031 32574 +1.70% BenchmarkDeadCode1000-8 590407 468275 -20.69% BenchmarkDeadCode10000-8 17822890 5000818 -71.94% BenchmarkDeadCode100000-8 1388706640 78021127 -94.38% BenchmarkDeadCode200000-8 5372518479 168598762 -96.86% Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17 Reviewed-on: https://go-review.googlesource.com/22589 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-04cmd/compile: check that SSA memory args are in the right placeJosh Bleecher Snyder
Fixes #15510 Change-Id: I2e0568778ef90cf29712753b8c42109ef84a0256 Reviewed-on: https://go-review.googlesource.com/22784 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-04-29cmd/compile: make vet happy with ssa codeKeith Randall
Fixes #15488 Change-Id: I054eb1e1c859de315e3cdbdef5428682bce693fd Reviewed-on: https://go-review.googlesource.com/22609 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-04-28cmd/compile: remove BlockDead stateKeith Randall
It is unused, remove the clutter. Change-Id: I51a44326b125ef79241459c463441f76a289cc08 Reviewed-on: https://go-review.googlesource.com/22586 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-26cmd/compile: more sanity checks on rewrite rulesKeith Randall
Make sure ops have the right number of args, set aux and auxint only if allowed, etc. Normalize error reporting format. Change-Id: Ie545fcc5990c8c7d62d40d9a0a55885f941eb645 Reviewed-on: https://go-review.googlesource.com/22320 Reviewed-by: David Chase <drchase@google.com>
2016-04-21cmd/compile: fix dominator check in check()Keith Randall
Ancestor comparison was the wrong way around, effectively disabling the def-must-dominate-use check. Update #15084 Change-Id: Ic56d674c5000569d2cc855bbb000a60eae517c7c Reviewed-on: https://go-review.googlesource.com/22330 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-04-14cmd/compile: fix use of original spill name after sinkingDavid Chase
This is a fix for the ssacheck builder http://build.golang.org/log/baa00f70c34e41186051cfe90568de3d91f115d7 after CL 21307 for sinking spills down loop exits https://go-review.googlesource.com/#/c/21037/ The fix is to reuse (move) the original spill, thus preserving the definition of the variable and its use count. Original and copy both use the same stack slot, but ssacheck needs to see a definition for the variable itself. Fixes #15279. Change-Id: I286285490193dc211b312d64dbc5a54867730bd6 Reviewed-on: https://go-review.googlesource.com/21995 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-30cmd/compile: define high bits of AuxIntKeith Randall
Previously if we were only using the low bits of AuxInt, the high bits were ignored and could be junk. This CL changes that behavior to define the high bits to be the sign-extended version of the low bits for all cases. There are 2 main benefits: - Deterministic representation. This helps with CSE. (Const8 [0x1]) and (Const8 [0x101]) used to be the same "value" but CSE couldn't see them as such. - Testability. We can check that all ops leave AuxInt in a state consistent with the new rule. In the old scheme, it was hard to check whether a rule correctly used only the low-order bits. Side benefits: - ==0 and !=0 tests are easier. Drawbacks: - This differs from the runtime representation in registers, where it is important that we allow upper bits to be undefined (so we're not sign/zero-extending all the time). - Ops that treat AuxInt as unsigned (shifts, mostly) need to be a bit more careful. Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819 Reviewed-on: https://go-review.googlesource.com/21256 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2016-03-17cmd/compile: keep value use counts in SSAKeith Randall
Keep track of how many uses each Value has. Each appearance in Value.Args and in Block.Control counts once. The number of uses of a value is generically useful to constrain rewrite rules. For instance, we might want to prevent merging index operations into loads if the same index expression is used lots of times. But I have one use in particular for which the use count is required. We must make sure we don't combine ops with loads if the load has more than one use. Otherwise, we may split a single load into multiple loads and that breaks perceived behavior in the presence of races. In particular, the load of m.state in sync/mutex.go:Lock can't be done twice. (I have a separate CL which triggers the mutex failure. This CL has a test which demonstrates a similar failure.) Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e Reviewed-on: https://go-review.googlesource.com/20790 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Todd Neal <todd@tneal.org>
2016-03-13cmd/compile: const folding for float32/64Todd Neal
Split the auxFloat type into 32/64 bit versions and perform checking for exactly representable float32 values. Perform const folding on float32/64. Comment out some const negation rules that the frontend already performs. Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2 Reviewed-on: https://go-review.googlesource.com/20568 Run-TryBot: Todd Neal <todd@tneal.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-03-10cmd/compile: fix defer/deferreturnKeith Randall
Make sure we do any just-before-return cleanup on all paths out of a function, including when recovering. Each exit path should include deferreturn (if there are any defers) and then the exit code (e.g. copying heap-escaping return values back to the stack). Introduce a Defer SSA block type which has two outgoing edges - one the fallthrough edge (the defer was queued successfully) and one which immediately returns (the defer had a successful recover() call and normal execution should resume at the return point). Fixes #14725 Change-Id: Iad035c9fd25ef8b7a74dafbd7461cf04833d981f Reviewed-on: https://go-review.googlesource.com/20486 Reviewed-by: David Chase <drchase@google.com>