aboutsummaryrefslogtreecommitdiff
path: root/src/cmd/compile/internal/ssa/shortcircuit.go
AgeCommit message (Collapse)Author
2021-03-24cmd/compile: disable shortcircuit optimization for intertwined phi valuesKeith Randall
We need to be careful that when doing value graph surgery, we not re-substitute a value that has already been substituted. That can lead to confusing a previous iteration's value with the current iteration's value. The simple fix in this CL just aborts the optimization if it detects intertwined phis (a phi which is the argument to another phi). It might be possible to keep the optimization with a more complicated CL, but: 1) This CL is clearly safe to backport. 2) There were no instances of this abort triggering in all.bash, prior to the test introduced in this CL. Fixes #45175 Change-Id: I2411dca03948653c053291f6829a76bec0c32330 Reviewed-on: https://go-review.googlesource.com/c/go/+/304251 Trust: Keith Randall <khr@golang.org> Trust: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-02-22cmd/compile: reject some rare looping CFGs in shortcircuitJosh Bleecher Snyder
One CFGs that shortcircuit looks for is: p q \ / b / \ t u The test case creates a CFG like that in which p == t. That caused the compiler to generate a (short-lived) invalid phi value. Fix this with a relatively big hammer: Disallow single-length loops entirely. This is probably overkill, but it such loops are very rare. This doesn't change the generated code for anything in std. It generates worse code for the test case: It no longer compiles the entire function away. Fixes #44465 Change-Id: Ib8cdcd6cc9d7f48b4dab253652038ace24eae152 Reviewed-on: https://go-review.googlesource.com/c/go/+/295130 Trust: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>
2020-10-22cmd/compile: remove go115shortcircuitPhisCherry Zhang
Change-Id: Ib2697ebfcc14a01ab1f793cddcbf69180ffc49a2 Reviewed-on: https://go-review.googlesource.com/c/go/+/264341 Trust: Cherry Zhang <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-20cmd/compile: use fuse to implement shortcircuit loopJosh Bleecher Snyder
The rewrite loop in shortcircuit is identical to the one in fuse. That's not surprising; shortcircuit is fuse-like. Take advantage of that by merging the two loops. Passes toolstash-check. Change-Id: I642cb39a23d2ac8964ed577678f062fce721439c Reviewed-on: https://go-review.googlesource.com/c/go/+/229003 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-11cmd/compile: guard against invalid phis in shortcircuitJosh Bleecher Snyder
In the review of CL 222923, Keith expressed concern that we could end up with invalid phis. We have some code to handle this, but on further reflection, I think it might not handle some cases in which phis get moved. I can't create a failing case, but guard against it nevertheless. Change-Id: Ib3a07ac1d36a674c72dcb9cc9261ccfcb716b5a3 Reviewed-on: https://go-review.googlesource.com/c/go/+/227697 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-04-08cmd/compile: handle some additional phis in shortcircuitJosh Bleecher Snyder
Prior to this change, the shortcircuit pass could only handle blocks containing only a single phi control value, possibly wrapped in some OpNot and OpCopy values. This change partially lifts this limitation. It handles some cases in which the block contains other phi values. This appears to happen most commonly in cases in which the conditionals being checked involve the memory state, in which case there is a phi memory value in the block. The general idea here is to use the information we have about the CFG to (1) move the other phi values into other blocks and/or (2) rewrite uses of the other phi values in other blocks. For example, consider this CFG: p q \ / b / \ t u And consider a phi value v in block b. We'll write v = Phi(p: x, q: y) to say that v has value x corresponding to inbound block p, and value y for block q. We will rewrite this CFG to: p q | / | b |/ \ t u What should we do with v? Any uses of v in u can be replaced with y. Why? If we are in block u, we came from b, and before that from q. If prior to b we came from p, then we would have gone to t, not u. Since we came from q, we know that v took the value y. Uses of v in t are a bit more complicated. It is going to end up being a phi value: Phi(p: ?, b: ?). Suppose, after the rewrite, we came from block p. Then, before the rewrite, we would have gone to b, where v would have the value x. So we have Phi(p: x, b: ?). Suppose, after the rewrite, we came from block b. Then we must have come from block q. If we come from block q, v has value y. So we have Phi(p: x, b: y). Uses of v in t can thus be replaced with a new phi value, with the same values as v, but with altered predecessors. Similar reasoning can be employed to rewrite or replace other uses of v elsewhere in the CFG, so that v itself can be eliminated, and the CFG rewrite can proceed. This change sets up the infrastructure for such optimizations and adds a few cheap ones. All optimizations in this change depend only on the shape of the CFG; future changes may also depend on where v's uses are. That analysis is more powerful but more expensive, and should be done incrementally. The use of closures here is perhaps a bit unusual, but during development it proved critical to having readable code. We must decide early on whether we can safely do the CFG modifications, and then later fix up the phis if so. Safely storing state and decisions across these two phases is hard to do readably. Closures solve the problem neatly. I manually instrumented the code paths in shortcircuitPhiPlan. During make.bash there are nearly 6000 invocations. The least-visited code path gets run 85 times, so all the code in this CL is reasonably well-exercised. Here is a concrete example of code improved by this change: func f(e interface{}) int { if x, ok := e.(int); ok { return x } return 0 } Omitting PCDATA, FUNCDATA, and the like, it used to compile to: "".f STEXT nosplit size=50 args=0x18 locals=0x0 0x0000 00000 (x.go:4) LEAQ type.int(SB), AX 0x0007 00007 (x.go:4) MOVQ "".e+8(SP), CX 0x000c 00012 (x.go:4) CMPQ AX, CX 0x000f 00015 (x.go:4) JNE 43 0x0011 00017 (x.go:4) MOVQ "".e+16(SP), AX 0x0016 00022 (x.go:4) MOVQ (AX), AX 0x0019 00025 (x.go:4) JNE 33 0x001b 00027 (x.go:5) MOVQ AX, "".~r1+24(SP) 0x0020 00032 (x.go:5) RET 0x0021 00033 (x.go:7) MOVQ $0, "".~r1+24(SP) 0x002a 00042 (x.go:7) RET 0x002b 00043 (x.go:7) MOVL $0, AX 0x0030 00048 (x.go:4) JMP 25 Afterwards, it compiles to: "".f STEXT nosplit size=41 args=0x18 locals=0x0 0x0000 00000 (x.go:4) LEAQ type.int(SB), AX 0x0007 00007 (x.go:4) MOVQ "".e+8(SP), CX 0x000c 00012 (x.go:4) CMPQ AX, CX 0x000f 00015 (x.go:4) JNE 31 0x0011 00017 (x.go:4) MOVQ "".e+16(SP), AX 0x0016 00022 (x.go:4) MOVQ (AX), AX 0x0019 00025 (x.go:5) MOVQ AX, "".~r1+24(SP) 0x001e 00030 (x.go:5) RET 0x001f 00031 (x.go:7) MOVQ $0, "".~r1+24(SP) 0x0028 00040 (x.go:7) RET Note that there is now only a single JNE and a single RET $0 path. Updates #37608 Has a minor good effect on compilation speed and memory use. Provides widespread improvements to generated code. The rare, minor regressions I have investigated are due to register allocation fluctuations. file before after Δ % addr2line 4376080 4371984 -4096 -0.094% api 5945400 5933112 -12288 -0.207% asm 5034312 5030216 -4096 -0.081% buildid 2844952 2840856 -4096 -0.144% cgo 4812872 4804680 -8192 -0.170% compile 19622064 19610368 -11696 -0.060% cover 5236648 5232552 -4096 -0.078% dist 3658312 3654216 -4096 -0.112% doc 4653512 4649416 -4096 -0.088% fix 3370072 3365976 -4096 -0.122% link 6671864 6667768 -4096 -0.061% pprof 14781652 14761172 -20480 -0.139% trace 11639684 11627396 -12288 -0.106% vet 8252280 8231800 -20480 -0.248% total 115052984 114934792 -118192 -0.103% file before after Δ % internal/cpu.s 3298 3296 -2 -0.061% internal/bytealg.s 1730 1737 +7 +0.405% cmd/vendor/golang.org/x/mod/semver.s 7332 7283 -49 -0.668% image/color.s 8248 8156 -92 -1.115% math.s 35966 35956 -10 -0.028% math/cmplx.s 6596 6575 -21 -0.318% runtime.s 480566 480053 -513 -0.107% sync.s 16408 16385 -23 -0.140% math/rand.s 10447 10406 -41 -0.392% internal/reflectlite.s 28408 28366 -42 -0.148% errors.s 2736 2701 -35 -1.279% sort.s 17031 17036 +5 +0.029% io.s 16993 16964 -29 -0.171% container/heap.s 2006 1997 -9 -0.449% text/tabwriter.s 9570 9552 -18 -0.188% bytes.s 31823 31594 -229 -0.720% strconv.s 52760 52717 -43 -0.082% vendor/golang.org/x/text/transform.s 16713 16706 -7 -0.042% strings.s 42590 42563 -27 -0.063% bufio.s 22883 22785 -98 -0.428% encoding/base32.s 9586 9531 -55 -0.574% syscall.s 82237 82243 +6 +0.007% image.s 37465 37452 -13 -0.035% regexp/syntax.s 82827 82769 -58 -0.070% image/draw.s 18698 18584 -114 -0.610% image/jpeg.s 36560 36549 -11 -0.030% time.s 82557 82526 -31 -0.038% context.s 10863 10820 -43 -0.396% regexp.s 64114 64049 -65 -0.101% os.s 51751 51524 -227 -0.439% reflect.s 168240 168049 -191 -0.114% cmd/go/internal/lockedfile/internal/filelock.s 2317 2290 -27 -1.165% path/filepath.s 17831 17766 -65 -0.365% io/ioutil.s 6994 6990 -4 -0.057% encoding/binary.s 30791 30726 -65 -0.211% cmd/vendor/golang.org/x/sys/unix.s 78055 78033 -22 -0.028% encoding/pem.s 9280 9247 -33 -0.356% crypto/cipher.s 20376 20374 -2 -0.010% os/exec.s 29229 29140 -89 -0.304% internal/goroot.s 4588 4579 -9 -0.196% cmd/internal/browser.s 2246 2240 -6 -0.267% cmd/vendor/golang.org/x/crypto/ssh/terminal.s 27183 27149 -34 -0.125% fmt.s 76625 76484 -141 -0.184% encoding/hex.s 6154 6152 -2 -0.032% compress/lzw.s 7063 7059 -4 -0.057% database/sql/driver.s 18875 18862 -13 -0.069% debug/plan9obj.s 8268 8266 -2 -0.024% net/url.s 29724 29719 -5 -0.017% encoding/csv.s 12872 12856 -16 -0.124% debug/gosym.s 25303 25268 -35 -0.138% compress/flate.s 50952 51019 +67 +0.131% compress/zlib.s 7277 7266 -11 -0.151% archive/zip.s 42155 42111 -44 -0.104% debug/dwarf.s 107632 107541 -91 -0.085% database/sql.s 98373 98028 -345 -0.351% os/user.s 14722 14708 -14 -0.095% encoding/json.s 105836 105711 -125 -0.118% debug/macho.s 32598 32560 -38 -0.117% encoding/gob.s 136478 135755 -723 -0.530% debug/pe.s 31160 30869 -291 -0.934% debug/elf.s 63495 63302 -193 -0.304% vendor/golang.org/x/text/unicode/bidi.s 27220 27217 -3 -0.011% vendor/golang.org/x/text/secure/bidirule.s 3363 3352 -11 -0.327% go/token.s 12036 12035 -1 -0.008% flag.s 22277 22256 -21 -0.094% mime.s 39696 39509 -187 -0.471% go/scanner.s 19033 19020 -13 -0.068% archive/tar.s 70936 70581 -355 -0.500% internal/xcoff.s 22823 22820 -3 -0.013% text/scanner.s 11631 11629 -2 -0.017% encoding/xml.s 110534 110408 -126 -0.114% math/big.s 183636 183545 -91 -0.050% image/gif.s 27376 27343 -33 -0.121% crypto/dsa.s 6029 5969 -60 -0.995% image/png.s 42947 42939 -8 -0.019% crypto/rand.s 6866 6854 -12 -0.175% vendor/golang.org/x/text/unicode/norm.s 66394 66354 -40 -0.060% runtime/trace.s 2603 2521 -82 -3.150% crypto/ed25519.s 6321 6300 -21 -0.332% text/template/parse.s 93910 93844 -66 -0.070% crypto/rsa.s 31460 31369 -91 -0.289% encoding/asn1.s 57021 57023 +2 +0.004% crypto/elliptic.s 51382 51363 -19 -0.037% crypto/x509/pkix.s 10386 10342 -44 -0.424% vendor/golang.org/x/net/idna.s 24482 24466 -16 -0.065% vendor/golang.org/x/crypto/cryptobyte.s 33479 33280 -199 -0.594% crypto/ecdsa.s 11936 11883 -53 -0.444% go/constant.s 43670 42663 -1007 -2.306% go/ast.s 80383 80191 -192 -0.239% testing.s 68069 68057 -12 -0.018% runtime/pprof.s 59613 59603 -10 -0.017% testing/iotest.s 4895 4891 -4 -0.082% internal/trace.s 78136 78089 -47 -0.060% cmd/internal/goobj2.s 13158 13154 -4 -0.030% cmd/internal/src.s 17661 17657 -4 -0.023% go/parser.s 79046 78880 -166 -0.210% cmd/internal/objabi.s 16367 16343 -24 -0.147% text/template.s 94899 94486 -413 -0.435% go/printer.s 77267 76992 -275 -0.356% cmd/internal/goobj.s 25988 25947 -41 -0.158% runtime/pprof/internal/profile.s 102066 101933 -133 -0.130% go/format.s 5419 5371 -48 -0.886% cmd/vendor/golang.org/x/arch/ppc64/ppc64asm.s 37181 37149 -32 -0.086% go/doc.s 74533 74132 -401 -0.538% html/template.s 88743 88389 -354 -0.399% cmd/asm/internal/lex.s 24881 24872 -9 -0.036% cmd/internal/buildid.s 18263 18256 -7 -0.038% cmd/vendor/golang.org/x/arch/x86/x86asm.s 80036 79980 -56 -0.070% go/build.s 68905 68737 -168 -0.244% cmd/cover.s 46070 45950 -120 -0.260% cmd/internal/obj.s 117001 116991 -10 -0.009% cmd/doc.s 62700 62419 -281 -0.448% cmd/internal/obj/arm.s 66745 66687 -58 -0.087% cmd/compile/internal/syntax.s 145406 145062 -344 -0.237% cmd/internal/obj/wasm.s 44049 44027 -22 -0.050% net.s 291835 291020 -815 -0.279% cmd/dist.s 209020 208807 -213 -0.102% cmd/cgo.s 241564 241102 -462 -0.191% vendor/golang.org/x/net/http/httpproxy.s 9407 9399 -8 -0.085% log/syslog.s 7921 7909 -12 -0.151% go/types.s 319325 317513 -1812 -0.567% vendor/golang.org/x/net/http/httpguts.s 3834 3825 -9 -0.235% mime/multipart.s 21414 21343 -71 -0.332% cmd/internal/obj/ppc64.s 119949 119938 -11 -0.009% cmd/compile/internal/logopt.s 10158 10118 -40 -0.394% vendor/golang.org/x/net/nettest.s 28012 27991 -21 -0.075% go/internal/srcimporter.s 6405 6380 -25 -0.390% go/internal/gcimporter.s 34525 34493 -32 -0.093% net/mail.s 23937 23720 -217 -0.907% go/internal/gccgoimporter.s 56095 56038 -57 -0.102% cmd/compile/internal/types.s 47247 47207 -40 -0.085% cmd/api.s 39582 39558 -24 -0.061% cmd/go/internal/base.s 12572 12551 -21 -0.167% cmd/vendor/golang.org/x/xerrors.s 17846 17814 -32 -0.179% cmd/vendor/golang.org/x/mod/sumdb/note.s 18142 18070 -72 -0.397% cmd/go/internal/search.s 19994 19876 -118 -0.590% cmd/go/internal/imports.s 16457 16428 -29 -0.176% cmd/vendor/golang.org/x/mod/module.s 17838 17759 -79 -0.443% cmd/go/internal/cache.s 30551 30514 -37 -0.121% cmd/vendor/golang.org/x/mod/sumdb/tlog.s 36356 36321 -35 -0.096% cmd/internal/test2json.s 9452 9408 -44 -0.466% cmd/go/internal/mvs.s 25136 25092 -44 -0.175% cmd/go/internal/txtar.s 3488 3461 -27 -0.774% cmd/vendor/golang.org/x/mod/zip.s 18811 18800 -11 -0.058% cmd/go/internal/version.s 11213 11171 -42 -0.375% cmd/link/internal/benchmark.s 4941 4949 +8 +0.162% cmd/internal/obj/s390x.s 126865 126849 -16 -0.013% cmd/gofmt.s 30684 30596 -88 -0.287% cmd/fix.s 87450 86906 -544 -0.622% cmd/internal/obj/x86.s 88578 88556 -22 -0.025% cmd/vendor/golang.org/x/mod/modfile.s 72450 72363 -87 -0.120% cmd/oldlink/internal/loader.s 16743 16741 -2 -0.012% cmd/pack.s 14863 14861 -2 -0.013% cmd/go/internal/load.s 106742 106568 -174 -0.163% cmd/oldlink/internal/objfile.s 21787 21780 -7 -0.032% cmd/oldlink/internal/loadmacho.s 29309 29317 +8 +0.027% cmd/oldlink/internal/loadelf.s 35013 35021 +8 +0.023% cmd/asm/internal/asm.s 68550 68538 -12 -0.018% cmd/link/internal/loader.s 94765 94564 -201 -0.212% cmd/link/internal/loadelf.s 35663 35667 +4 +0.011% cmd/link/internal/loadmacho.s 29501 29509 +8 +0.027% cmd/vendor/golang.org/x/tools/go/analysis.s 4983 4976 -7 -0.140% cmd/vendor/golang.org/x/tools/go/analysis/internal/analysisflags.s 16771 16709 -62 -0.370% cmd/vendor/golang.org/x/tools/go/types/objectpath.s 18481 18456 -25 -0.135% cmd/vendor/golang.org/x/tools/go/analysis/passes/internal/analysisutil.s 2100 2085 -15 -0.714% cmd/vendor/github.com/google/pprof/profile.s 150141 149620 -521 -0.347% cmd/vendor/github.com/google/pprof/internal/measurement.s 10420 10404 -16 -0.154% cmd/vendor/golang.org/x/tools/go/analysis/passes/asmdecl.s 36814 36755 -59 -0.160% cmd/vendor/golang.org/x/tools/go/analysis/passes/bools.s 6688 6673 -15 -0.224% cmd/vendor/golang.org/x/tools/go/analysis/passes/cgocall.s 9856 9784 -72 -0.731% cmd/vendor/golang.org/x/tools/go/analysis/passes/composite.s 3011 2979 -32 -1.063% cmd/vendor/golang.org/x/tools/go/analysis/passes/copylock.s 9737 9682 -55 -0.565% cmd/vendor/golang.org/x/tools/go/cfg.s 30738 30725 -13 -0.042% cmd/vendor/github.com/ianlancetaylor/demangle.s 175195 174513 -682 -0.389% cmd/vendor/golang.org/x/tools/go/analysis/passes/httpresponse.s 3625 3520 -105 -2.897% cmd/vendor/golang.org/x/tools/go/analysis/passes/loopclosure.s 2987 2971 -16 -0.536% cmd/vendor/golang.org/x/tools/go/analysis/passes/shift.s 4372 4340 -32 -0.732% cmd/vendor/golang.org/x/tools/go/analysis/passes/stdmethods.s 8634 8611 -23 -0.266% cmd/vendor/golang.org/x/tools/go/analysis/passes/tests.s 6189 6164 -25 -0.404% cmd/vendor/golang.org/x/tools/go/analysis/passes/structtag.s 8089 8073 -16 -0.198% cmd/vendor/golang.org/x/tools/go/analysis/passes/unsafeptr.s 2208 2177 -31 -1.404% cmd/vendor/golang.org/x/tools/go/analysis/passes/unreachable.s 8050 8047 -3 -0.037% cmd/vendor/golang.org/x/tools/go/analysis/passes/unusedresult.s 3665 3629 -36 -0.982% cmd/vendor/golang.org/x/tools/go/ast/astutil.s 65773 65680 -93 -0.141% cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.s 13328 13286 -42 -0.315% cmd/vendor/golang.org/x/tools/go/types/typeutil.s 12263 12162 -101 -0.824% cmd/vendor/golang.org/x/tools/go/analysis/passes/errorsas.s 1459 1421 -38 -2.605% cmd/vendor/golang.org/x/tools/go/analysis/passes/ctrlflow.s 5208 5191 -17 -0.326% cmd/vendor/golang.org/x/tools/go/analysis/passes/unmarshal.s 1801 1782 -19 -1.055% cmd/vendor/golang.org/x/tools/go/analysis/passes/lostcancel.s 9569 9528 -41 -0.428% cmd/go/internal/work.s 304928 304756 -172 -0.056% crypto/x509.s 147340 147139 -201 -0.136% cmd/vendor/golang.org/x/tools/go/analysis/passes/printf.s 34287 34019 -268 -0.782% crypto/tls.s 311603 310644 -959 -0.308% cmd/oldlink/internal/ld.s 533115 532651 -464 -0.087% cmd/oldlink/internal/wasm.s 16484 16458 -26 -0.158% cmd/oldlink/internal/x86.s 18832 18830 -2 -0.011% cmd/link/internal/ld.s 548200 547626 -574 -0.105% cmd/link/internal/wasm.s 16760 16734 -26 -0.155% cmd/link/internal/arm64.s 20850 20840 -10 -0.048% cmd/link/internal/x86.s 17437 17435 -2 -0.011% net/http.s 556647 555519 -1128 -0.203% net/http/cookiejar.s 15849 15833 -16 -0.101% expvar.s 9521 9508 -13 -0.137% net/http/httptest.s 16471 16452 -19 -0.115% cmd/vendor/github.com/google/pprof/internal/plugin.s 4266 4264 -2 -0.047% net/http/cgi.s 23448 23428 -20 -0.085% cmd/go/internal/web.s 16472 16428 -44 -0.267% net/http/httputil.s 39672 39670 -2 -0.005% net/rpc.s 33989 33965 -24 -0.071% net/http/fcgi.s 19167 19162 -5 -0.026% cmd/vendor/github.com/google/pprof/internal/symbolz.s 5861 5857 -4 -0.068% cmd/vendor/github.com/google/pprof/internal/binutils.s 35842 35823 -19 -0.053% cmd/vendor/github.com/google/pprof/internal/symbolizer.s 11449 11404 -45 -0.393% cmd/go/internal/get.s 62726 62582 -144 -0.230% cmd/vendor/github.com/google/pprof/internal/report.s 80032 80022 -10 -0.012% cmd/go/internal/modfetch/codehost.s 89005 88871 -134 -0.151% cmd/trace.s 116607 116496 -111 -0.095% cmd/vendor/github.com/google/pprof/internal/driver.s 143234 143207 -27 -0.019% cmd/vendor/github.com/google/pprof/driver.s 9000 8998 -2 -0.022% cmd/go/internal/modfetch.s 126300 125726 -574 -0.454% cmd/pprof.s 12317 12312 -5 -0.041% cmd/go/internal/modconv.s 17878 17861 -17 -0.095% cmd/go/internal/modload.s 150261 149763 -498 -0.331% cmd/go/internal/clean.s 11122 11091 -31 -0.279% cmd/go/internal/help.s 6523 6521 -2 -0.031% cmd/go/internal/generate.s 11627 11614 -13 -0.112% cmd/go/internal/envcmd.s 22034 21986 -48 -0.218% cmd/go/internal/modget.s 38478 38398 -80 -0.208% cmd/go/internal/modcmd.s 46430 46229 -201 -0.433% cmd/go/internal/test.s 64399 64374 -25 -0.039% cmd/compile/internal/ssa.s 3615264 3608276 -6988 -0.193% cmd/compile/internal/gc.s 1538865 1537625 -1240 -0.081% cmd/compile/internal/amd64.s 33593 33574 -19 -0.057% cmd/compile/internal/x86.s 30871 30852 -19 -0.062% total 19343565 19311284 -32281 -0.167% Change-Id: Ib030eb79458827a5a5b6d0d2f98765f8325a4d7e Reviewed-on: https://go-review.googlesource.com/c/go/+/222923 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-03-13cmd/compile: more minor cleanup in shortcircuitBlockJosh Bleecher Snyder
Continue to simplify, rename for clarity, improve docs, and reduce variable scope. This is in preparation for this function becoming more complicated. Passes toolstash-check. Updates #37608 Change-Id: I630a4e07c92297c46d18aea69ec29852d6371ff0 Reviewed-on: https://go-review.googlesource.com/c/go/+/222919 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-03-13cmd/compile: remove loop in shortcircuitJosh Bleecher Snyder
shortcircuitBlock contained a loop to handle blocks like b: <- p q v = Phi true false If v -> t u in a single execution. This change makes shortcircuitBlock do it in two instead, one for each constant phi arg. Motivation: Upcoming changes will expand the range of blocks that the shortcircuit pass can handle. Those changes need to understand what the CFG will look like after the rewrite in shortcircuitBlock. Making shortcircuitBlock do only a single CFG modification at a time significantly simplifies that code. In theory, this is less efficient, but not measurably so. There is minor, unimportant churn in the generated code. Updates #37608 Change-Id: Ia6dce7011e3e19b546ed1e176bd407575a0ab837 Reviewed-on: https://go-review.googlesource.com/c/go/+/222918 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2020-03-13cmd/compile: rename a local variable in shortcircuitBlockJosh Bleecher Snyder
v is pretty generic. Subsequent changes will make this function more complicated, so rename it now, independently, for easier review. v is the control value for the block (or its underlying phi); call it ctl. Passes toolstash-check. Updates #37608 Change-Id: I3fbae3344f1c95aff0a69c1e4f61ef637a54774e Reviewed-on: https://go-review.googlesource.com/c/go/+/222917 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-10-02cmd/compile: allow multiple SSA block control valuesMichael Munday
Control values are used to choose which successor of a block is jumped to. Typically a control value takes the form of a 'flags' value that represents the result of a comparison. Some architectures however use a variable in a register as a control value. Up until now we have managed with a single control value per block. However some architectures (e.g. s390x and riscv64) have combined compare-and-branch instructions that take two variables in registers as parameters. To generate these instructions we need to support 2 control values per block. This CL allows up to 2 control values to be used in a block in order to support the addition of compare-and-branch instructions. I have implemented s390x compare-and-branch instructions in a different CL. Passes toolstash-check -all. Results of compilebench: name old time/op new time/op delta Template 208ms ± 1% 209ms ± 1% ~ (p=0.289 n=20+20) Unicode 83.7ms ± 1% 83.3ms ± 3% -0.49% (p=0.017 n=18+18) GoTypes 748ms ± 1% 748ms ± 0% ~ (p=0.460 n=20+18) Compiler 3.47s ± 1% 3.48s ± 1% ~ (p=0.070 n=19+18) SSA 11.5s ± 1% 11.7s ± 1% +1.64% (p=0.000 n=19+18) Flate 130ms ± 1% 130ms ± 1% ~ (p=0.588 n=19+20) GoParser 160ms ± 1% 161ms ± 1% ~ (p=0.211 n=20+20) Reflect 465ms ± 1% 467ms ± 1% +0.42% (p=0.007 n=20+20) Tar 184ms ± 1% 185ms ± 2% ~ (p=0.087 n=18+20) XML 253ms ± 1% 253ms ± 1% ~ (p=0.377 n=20+18) LinkCompiler 769ms ± 2% 774ms ± 2% ~ (p=0.070 n=19+19) ExternalLinkCompiler 3.59s ±11% 3.68s ± 6% ~ (p=0.072 n=20+20) LinkWithoutDebugCompiler 446ms ± 5% 454ms ± 3% +1.79% (p=0.002 n=19+20) StdCmd 26.0s ± 2% 26.0s ± 2% ~ (p=0.799 n=20+20) name old user-time/op new user-time/op delta Template 238ms ± 5% 240ms ± 5% ~ (p=0.142 n=20+20) Unicode 105ms ±11% 106ms ±10% ~ (p=0.512 n=20+20) GoTypes 876ms ± 2% 873ms ± 4% ~ (p=0.647 n=20+19) Compiler 4.17s ± 2% 4.19s ± 1% ~ (p=0.093 n=20+18) SSA 13.9s ± 1% 14.1s ± 1% +1.45% (p=0.000 n=18+18) Flate 145ms ±13% 146ms ± 5% ~ (p=0.851 n=20+18) GoParser 185ms ± 5% 188ms ± 7% ~ (p=0.174 n=20+20) Reflect 534ms ± 3% 538ms ± 2% ~ (p=0.105 n=20+18) Tar 215ms ± 4% 211ms ± 9% ~ (p=0.079 n=19+20) XML 295ms ± 6% 295ms ± 5% ~ (p=0.968 n=20+20) LinkCompiler 832ms ± 4% 837ms ± 7% ~ (p=0.707 n=17+20) ExternalLinkCompiler 1.58s ± 8% 1.60s ± 4% ~ (p=0.296 n=20+19) LinkWithoutDebugCompiler 478ms ±12% 489ms ±10% ~ (p=0.429 n=20+20) name old object-bytes new object-bytes delta Template 559kB ± 0% 559kB ± 0% ~ (all equal) Unicode 216kB ± 0% 216kB ± 0% ~ (all equal) GoTypes 2.03MB ± 0% 2.03MB ± 0% ~ (all equal) Compiler 8.07MB ± 0% 8.07MB ± 0% -0.06% (p=0.000 n=20+20) SSA 27.1MB ± 0% 27.3MB ± 0% +0.89% (p=0.000 n=20+20) Flate 343kB ± 0% 343kB ± 0% ~ (all equal) GoParser 441kB ± 0% 441kB ± 0% ~ (all equal) Reflect 1.36MB ± 0% 1.36MB ± 0% ~ (all equal) Tar 487kB ± 0% 487kB ± 0% ~ (all equal) XML 632kB ± 0% 632kB ± 0% ~ (all equal) name old export-bytes new export-bytes delta Template 18.5kB ± 0% 18.5kB ± 0% ~ (all equal) Unicode 7.92kB ± 0% 7.92kB ± 0% ~ (all equal) GoTypes 35.0kB ± 0% 35.0kB ± 0% ~ (all equal) Compiler 109kB ± 0% 110kB ± 0% +0.72% (p=0.000 n=20+20) SSA 137kB ± 0% 138kB ± 0% +0.58% (p=0.000 n=20+20) Flate 4.89kB ± 0% 4.89kB ± 0% ~ (all equal) GoParser 8.49kB ± 0% 8.49kB ± 0% ~ (all equal) Reflect 11.4kB ± 0% 11.4kB ± 0% ~ (all equal) Tar 10.5kB ± 0% 10.5kB ± 0% ~ (all equal) XML 16.7kB ± 0% 16.7kB ± 0% ~ (all equal) name old text-bytes new text-bytes delta HelloSize 761kB ± 0% 761kB ± 0% ~ (all equal) CmdGoSize 10.8MB ± 0% 10.8MB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 10.7kB ± 0% 10.7kB ± 0% ~ (all equal) CmdGoSize 312kB ± 0% 312kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 122kB ± 0% 122kB ± 0% ~ (all equal) CmdGoSize 146kB ± 0% 146kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.13MB ± 0% 1.13MB ± 0% ~ (all equal) CmdGoSize 15.1MB ± 0% 15.1MB ± 0% ~ (all equal) Change-Id: I3cc2f9829a109543d9a68be4a21775d2d3e9801f Reviewed-on: https://go-review.googlesource.com/c/go/+/196557 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Keith Randall <khr@golang.org>
2019-08-29cmd/compile: handle infinite loops in shortcircuit passJosh Bleecher Snyder
The newly upgraded shortcircuit pass attempted to remove infinite loops. Stop doing that. Fixes #33903 Change-Id: I0fc9c1b5f2427e54ce650806602ef5e3ad65aca5 Reviewed-on: https://go-review.googlesource.com/c/go/+/192144 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-08-27cmd/compile: improve shortcircuit passJosh Bleecher Snyder
While working on #30645, I noticed that many instances in which the walkinrange optimization could apply were not even being considered. This was because of extraneous blocks in the CFG, of the type that shortcircuit normally removes. The change improves the shortcircuit pass to handle most of those cases. (There are a few that can only be reasonably detected later in compilation, after other optimizations have been run, but not enough to be worth chasing.) Notable changes: * Instead of calculating live-across-blocks values, use v.Uses == 1. This is cheaper and more straightforward. v.Uses did not exist when this pass was initially written. * Incorporate a fusePlain and loop until stable. This is necessary to find many of the instances. * Allow Copy and Not wrappers around Phi values. This significantly increases effectiveness. * Allow removal of all preds, creating a dead block. The previous pass stopped unnecessarily at one pred. * Use phielimValue during cleanup instead of manually setting the op to OpCopy. The result is marginally faster compilation and smaller code. name old time/op new time/op delta Template 213ms ± 2% 212ms ± 2% -0.63% (p=0.002 n=49+48) Unicode 90.0ms ± 2% 89.8ms ± 2% ~ (p=0.122 n=48+48) GoTypes 710ms ± 3% 711ms ± 2% ~ (p=0.433 n=45+49) Compiler 3.23s ± 2% 3.22s ± 2% ~ (p=0.124 n=47+49) SSA 10.0s ± 1% 10.0s ± 1% -0.43% (p=0.000 n=48+50) Flate 135ms ± 3% 135ms ± 2% ~ (p=0.311 n=49+49) GoParser 158ms ± 2% 158ms ± 2% ~ (p=0.757 n=48+48) Reflect 447ms ± 2% 447ms ± 2% ~ (p=0.815 n=49+48) Tar 189ms ± 2% 189ms ± 3% ~ (p=0.530 n=47+49) XML 251ms ± 3% 250ms ± 1% -0.75% (p=0.002 n=49+48) [Geo mean] 427ms 426ms -0.25% name old user-time/op new user-time/op delta Template 265ms ± 2% 265ms ± 2% ~ (p=0.969 n=48+50) Unicode 119ms ± 6% 119ms ± 6% ~ (p=0.738 n=50+50) GoTypes 923ms ± 2% 925ms ± 2% ~ (p=0.057 n=43+47) Compiler 4.37s ± 2% 4.37s ± 2% ~ (p=0.691 n=50+46) SSA 13.4s ± 1% 13.4s ± 1% ~ (p=0.282 n=42+49) Flate 162ms ± 2% 162ms ± 2% ~ (p=0.774 n=48+50) GoParser 186ms ± 2% 186ms ± 3% ~ (p=0.213 n=47+47) Reflect 572ms ± 2% 573ms ± 3% ~ (p=0.303 n=50+49) Tar 240ms ± 3% 240ms ± 2% ~ (p=0.939 n=46+44) XML 302ms ± 2% 302ms ± 2% ~ (p=0.399 n=47+47) [Geo mean] 540ms 541ms +0.07% name old alloc/op new alloc/op delta Template 36.8MB ± 0% 36.7MB ± 0% -0.42% (p=0.008 n=5+5) Unicode 28.1MB ± 0% 28.1MB ± 0% ~ (p=0.151 n=5+5) GoTypes 124MB ± 0% 124MB ± 0% -0.26% (p=0.008 n=5+5) Compiler 571MB ± 0% 566MB ± 0% -0.84% (p=0.008 n=5+5) SSA 1.86GB ± 0% 1.85GB ± 0% -0.58% (p=0.008 n=5+5) Flate 22.8MB ± 0% 22.8MB ± 0% -0.17% (p=0.008 n=5+5) GoParser 27.3MB ± 0% 27.3MB ± 0% -0.20% (p=0.008 n=5+5) Reflect 79.5MB ± 0% 79.3MB ± 0% -0.20% (p=0.008 n=5+5) Tar 34.7MB ± 0% 34.6MB ± 0% -0.42% (p=0.008 n=5+5) XML 45.4MB ± 0% 45.3MB ± 0% -0.29% (p=0.008 n=5+5) [Geo mean] 80.0MB 79.7MB -0.34% name old allocs/op new allocs/op delta Template 378k ± 0% 377k ± 0% -0.22% (p=0.008 n=5+5) Unicode 339k ± 0% 339k ± 0% ~ (p=0.643 n=5+5) GoTypes 1.36M ± 0% 1.36M ± 0% -0.10% (p=0.008 n=5+5) Compiler 5.51M ± 0% 5.50M ± 0% -0.13% (p=0.008 n=5+5) SSA 17.5M ± 0% 17.5M ± 0% -0.14% (p=0.008 n=5+5) Flate 234k ± 0% 234k ± 0% -0.04% (p=0.008 n=5+5) GoParser 299k ± 0% 299k ± 0% -0.05% (p=0.008 n=5+5) Reflect 978k ± 0% 979k ± 0% +0.02% (p=0.016 n=5+5) Tar 351k ± 0% 351k ± 0% -0.04% (p=0.008 n=5+5) XML 435k ± 0% 435k ± 0% -0.11% (p=0.008 n=5+5) [Geo mean] 840k 840k -0.08% file before after Δ % go 14794788 14770212 -24576 -0.166% addr2line 4203688 4199592 -4096 -0.097% api 5954056 5941768 -12288 -0.206% asm 4862704 4846320 -16384 -0.337% cgo 4778920 4770728 -8192 -0.171% compile 24001568 23923792 -77776 -0.324% cover 5198440 5190248 -8192 -0.158% dist 3595248 3587056 -8192 -0.228% doc 4618504 4610312 -8192 -0.177% fix 3337416 3333320 -4096 -0.123% link 6120408 6116312 -4096 -0.067% nm 4149064 4140872 -8192 -0.197% objdump 4555608 4547416 -8192 -0.180% pprof 14616324 14595844 -20480 -0.140% test2json 2766328 2762232 -4096 -0.148% trace 11638844 11622460 -16384 -0.141% vet 8274936 8258552 -16384 -0.198% total 132520780 132270972 -249808 -0.189% Change-Id: Ifcd235a2a6e5f13ed5c93e62523e2ef61321fccf Reviewed-on: https://go-review.googlesource.com/c/go/+/178197 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2018-04-09cmd: remove a few more unused parametersDaniel Martí
ssa's pos parameter on the Const* funcs is unused, so remove it. ld's alloc parameter on elfnote is always true, so remove the arguments and simplify the code. Finally, arm's addpltreloc never has its return parameter used, so remove it. Change-Id: I63387ecf6ab7b5f7c20df36be823322bb98427b8 Reviewed-on: https://go-review.googlesource.com/104456 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-04-24cmd/compile: lazily create true and false Values in shortcircuitJosh Bleecher Snyder
It is mildly wasteful to always create values that must sometimes then be dead code eliminated. Given that it is very easy to avoid, do so. Noticed when examining a package with thousands of generated wrappers, each of which uses only a handful of Values to compile. Change-Id: If02eb4aa786dfa20f7aa43e8d729dad8b3db2786 Reviewed-on: https://go-review.googlesource.com/41502 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-19cmd/compile: separate ssa.Frontend and ssa.TypeSourceJosh Bleecher Snyder
Prior to this CL, the ssa.Frontend field was responsible for providing types to the backend during compilation. However, the types needed by the backend are few and static. It makes more sense to use a struct for them and to hang that struct off the ssa.Config, which is the correct home for readonly data. Now that Types is a struct, we can clean up the names a bit as well. This has the added benefit of allowing early construction of all types needed by the backend. This will be useful for concurrent backend compilation. Passes toolstash-check -all. No compiler performance change. Updates #15756 Change-Id: I021658c8cf2836d6a22bbc20cc828ac38c7da08a Reviewed-on: https://go-review.googlesource.com/38336 Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-03-17cmd/compile: move Frontend field from ssa.Config to ssa.FuncJosh Bleecher Snyder
Suggested by mdempsky in CL 38232. This allows us to use the Frontend field to associate frontend state and information with a function. See the following CL in the series for examples. This is a giant CL, but it is almost entirely routine refactoring. The ssa test API is starting to feel a bit unwieldy. I will clean it up separately, once the dust has settled. Passes toolstash -cmp. Updates #15756 Change-Id: I71c573bd96ff7251935fce1391b06b1f133c3caf Reviewed-on: https://go-review.googlesource.com/38327 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-12-08[dev.inline] cmd/compile/internal/ssa: rename various fields from Line to PosRobert Griesemer
This is a mostly mechanical rename followed by manual fixes where necessary. Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842 Reviewed-on: https://go-review.googlesource.com/34137 Reviewed-by: David Lazar <lazard@golang.org>
2016-05-05cmd/compile: enable constant-time CFG editingKeith Randall
Provide indexes along with block pointers for Preds and Succs arrays. This allows us to splice edges in and out of those arrays in constant time. Fixes worst-case O(n^2) behavior in deadcode and fuse. benchmark old ns/op new ns/op delta BenchmarkFuse1-8 2065 2057 -0.39% BenchmarkFuse10-8 9408 9073 -3.56% BenchmarkFuse100-8 105238 76277 -27.52% BenchmarkFuse1000-8 3982562 1026750 -74.22% BenchmarkFuse10000-8 301220329 12824005 -95.74% BenchmarkDeadCode1-8 1588 1566 -1.39% BenchmarkDeadCode10-8 4333 4250 -1.92% BenchmarkDeadCode100-8 32031 32574 +1.70% BenchmarkDeadCode1000-8 590407 468275 -20.69% BenchmarkDeadCode10000-8 17822890 5000818 -71.94% BenchmarkDeadCode100000-8 1388706640 78021127 -94.38% BenchmarkDeadCode200000-8 5372518479 168598762 -96.86% Change-Id: Iccabdbb9343fd1c921ba07bbf673330a1c36ee17 Reviewed-on: https://go-review.googlesource.com/22589 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-17cmd/compile: keep value use counts in SSAKeith Randall
Keep track of how many uses each Value has. Each appearance in Value.Args and in Block.Control counts once. The number of uses of a value is generically useful to constrain rewrite rules. For instance, we might want to prevent merging index operations into loads if the same index expression is used lots of times. But I have one use in particular for which the use count is required. We must make sure we don't combine ops with loads if the load has more than one use. Otherwise, we may split a single load into multiple loads and that breaks perceived behavior in the presence of races. In particular, the load of m.state in sync/mutex.go:Lock can't be done twice. (I have a separate CL which triggers the mutex failure. This CL has a test which demonstrates a similar failure.) Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e Reviewed-on: https://go-review.googlesource.com/20790 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Todd Neal <todd@tneal.org>
2016-01-29[dev.ssa] cmd/compile: optimization for && and || expressionsKeith Randall
Compiling && and || expressions often leads to control flow of the following form: p: If a goto b else c b: <- p ... x = phi(a, ...) If x goto t else u Note that if we take the edge p->b, then we are guaranteed to take the edge b->t also. So in this situation, we might as well go directly from p to t. Change-Id: I6974f1e6367119a2ddf2014f9741fdb490edcc12 Reviewed-on: https://go-review.googlesource.com/18910 Reviewed-by: David Chase <drchase@google.com>