path: root/test/live.go
Age | Commit message | Author
2020-01-29 | cmd/compile,cmd/link: fix and re-enable open-coded defers on riscv64 | Joel Sing
The R_CALLRISCV relocation marker is on the JALR instruction, however the actual relocation is currently two instructions previous for the AUIPC+ADDI sequence. Adjust the platform dependent offset accordingly and re-enable open-coded defers. Fixes #36786. Change-Id: I71597c193c447930fbe94ce44b7355e89ae877bb Reviewed-on: https://go-review.googlesource.com/c/go/+/216797 Run-TryBot: Joel Sing <joel@sing.id.au> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-29 | test: disable the live test on riscv64 | Joel Sing
This test expects that open-coded defers are enabled, which is not currently the case on riscv64. Updates issue #27532 and #36786. Change-Id: I94bb558c5b0734b4cfe5ae12873be81026009bcf Reviewed-on: https://go-review.googlesource.com/c/go/+/216777 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-10-24 | cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata | Dan Scales
Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
    With normal (stack-allocated) defers only: 35.4 ns/op
    With open-coded defers: 5.6 ns/op
    Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
    Without open-coded defers: 62.0 ns/op
    With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
    Without open-coded defers: 443 ns/op
    With open-coded defers: 347 ns/op
Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>
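The deferBits scheme described in this message can be approximated by hand in ordinary Go. The sketch below is only an illustration under stated assumptions: the names withDefers, handExpanded, and record are hypothetical, the bit layout is a guess at the general idea, and the panic/recover path handled by the funcdata is not modeled at all.

```go
package main

import "fmt"

// record appends s to the log; it stands in for any deferred call.
func record(log *[]string, s string) { *log = append(*log, s) }

// withDefers uses ordinary defer statements. Neither defer is inside a loop,
// which is the precondition the commit message gives for open-coding.
func withDefers(log *[]string, second bool) {
	defer record(log, "first")
	if second {
		defer record(log, "second")
	}
	*log = append(*log, "body")
}

// handExpanded is a hand-written approximation of the open-coded form:
// each defer site saves its argument in its own slot and sets one bit,
// and the exit path tests the bits in reverse order.
func handExpanded(log *[]string, second bool) {
	var deferBits uint8 // one bit per defer statement that was reached

	arg0 := "first" // slot for defer site 0
	deferBits |= 1 << 0

	var arg1 string // slot for defer site 1
	if second {
		arg1 = "second"
		deferBits |= 1 << 1
	}

	*log = append(*log, "body")

	// Exit code: run the deferred calls that were reached, last one first.
	if deferBits&(1<<1) != 0 {
		deferBits &^= 1 << 1
		record(log, arg1)
	}
	if deferBits&(1<<0) != 0 {
		deferBits &^= 1 << 0
		record(log, arg0)
	}
}

func main() {
	var a, b []string
	withDefers(&a, true)
	handExpanded(&b, true)
	fmt.Println(a, b)
}
```

Both functions record the same sequence on the normal return path, which is the property the inline exit code has to preserve.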
2019-10-16 | Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" | Bryan C. Mills
This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>
2019-10-16 | cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata | Dan Scales
Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase).
When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer.
In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live).
I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn().
The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded).
Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
    With normal (stack-allocated) defers only: 35.4 ns/op
    With open-coded defers: 5.6 ns/op
    Cost of function call alone (remove defer keyword): 4.4 ns/op
Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09%
The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%.
The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers:
Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
    Without open-coded defers: 62.0 ns/op
    With open-coded defers: 255 ns/op
A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:
CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
    Without open-coded defers: 443 ns/op
    With open-coded defers: 347 ns/op
Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>
2019-09-03 | cmd/compile: extend ssa.go to handle 1-element array and 1-field struct | Cuong Manh Le
Assigning to a 1-element array or 1-field struct variable is considered to clobber the whole variable. By emitting OpVarDef in this case, liveness analysis can now know the variable is redefined. Also, isfat is not necessary anymore and will be removed in a follow-up CL. Fixes #33916 Change-Id: Iece0d90b05273f333d59d6ee5b12ee7dc71908c2 Reviewed-on: https://go-review.googlesource.com/c/go/+/192979 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2019-08-28 | Revert "cmd/compile: make isfat handle 1-element array, 1-field struct" | Matthew Dempsky
This reverts commit 53227762153afb39c979810bd59ec139e3c8127d. Reason for revert: broke js-wasm builder. Change-Id: If22762317c4a9e00f5060eb84377a4a52d601fca Reviewed-on: https://go-review.googlesource.com/c/go/+/192157 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-08-28 | cmd/compile: make isfat handle 1-element array, 1-field struct | LE Manh Cuong
This will improve liveness analysis slightly, using the same logic that isdirectiface currently does. In:
    type T struct {
        m map[int]int
    }
    v := T{}
    v.m = make(map[int]int)
T was previously considered "fat"; now it is not, so assigning to v.m is considered to clobber the entire v. This is a follow-up to CL 179057. Change-Id: Id6b4807b8e8521ef5d8bcb14fedb6dceb9dbf18c Reviewed-on: https://go-review.googlesource.com/c/go/+/179578 Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
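For readers who want to compile the snippet from this message, here is a minimal runnable wrapper. The function names build and buildArray are hypothetical, and the 1-element array case is included only because the subject line mentions it; the program shows the source shapes involved, not how the compiler classifies them.

```go
package main

import "fmt"

// T has exactly one field, so a store to v.m overwrites every word of v.
type T struct {
	m map[int]int
}

// build mirrors the snippet in the commit message.
func build() T {
	v := T{}
	v.m = make(map[int]int) // writes the only field of v
	v.m[1] = 2
	return v
}

// buildArray is the 1-element array analogue: a[0] is all of a.
func buildArray() [1]*int {
	var a [1]*int
	x := 7
	a[0] = &x
	return a
}

func main() {
	fmt.Println(build().m[1], *buildArray()[0])
}
```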
2019-06-10 | Revert "Revert "cmd/compile,runtime: allocate defer records on the stack"" | Keith Randall
This reverts CL 180761 Reason for revert: Reinstate the stack-allocated defer CL. There was nothing wrong with the CL proper, but stack allocation of defers exposed two other issues. Issue #32477: Fix has been submitted as CL 181258. Issue #32498: Possible fix is CL 181377 (not submitted yet). Change-Id: I32b3365d5026600069291b068bbba6cb15295eb3 Reviewed-on: https://go-review.googlesource.com/c/go/+/181378 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-06-05 | Revert "cmd/compile,runtime: allocate defer records on the stack" | Keith Randall
This reverts commit fff4f599fe1c21e411a99de5c9b3777d06ce0ce6. Reason for revert: Seems to still have issues around GC. Fixes #32452 Change-Id: Ibe7af629f9ad6a3d5312acd7b066123f484da7f0 Reviewed-on: https://go-review.googlesource.com/c/go/+/180761 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-06-04 | cmd/compile,runtime: allocate defer records on the stack | Keith Randall
When a defer is executed at most once in a function body, we can allocate the defer record for it on the stack instead of on the heap. This should make defers like this (which are very common) faster. This optimization applies to 363 out of the 370 static defer sites in the cmd/go binary.
    name     old time/op  new time/op  delta
    Defer-4  52.2ns ± 5%  36.2ns ± 3%  -30.70%  (p=0.000 n=10+10)
Fixes #6980 Update #14939 Change-Id: I697109dd7aeef9e97a9eeba2ef65ff53d3ee1004 Reviewed-on: https://go-review.googlesource.com/c/go/+/171758 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2019-03-06 | cmd/compile: fix ordering for short-circuiting ops | Keith Randall
Make sure the side effects inside short-circuited operations (&& and ||) happen correctly. Before this CL, we attached the side effects to the node itself using exprInPlace. That caused other side effects in sibling expressions to get reordered with respect to the short circuit side effect. Instead, rewrite a && b like:
    r := a
    if r {
        r = b
    }
That code we can keep correctly ordered with respect to other side-effects extracted from part of a big expression. exprInPlace seems generally unsafe. But this was the only case where exprInPlace is called not at the top level of an expression, so I don't think the other uses can actually trigger an issue (there can't be a sibling expression). TODO: maybe those cases don't need "in place", and we can retire that function generally. This CL needed a small tweak to the SSA generation of OIF so that the short circuit optimization still triggers. The short circuit optimization looks for triangles but not diamonds, so don't bother allocating a block if it will be empty. Go 1 benchmarks are in the noise. Fixes #30566 Change-Id: I19c04296bea63cbd6ad05f87a63b005029123610 Reviewed-on: https://go-review.googlesource.com/c/go/+/165617 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
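The rewrite of a && b given in this message can be checked at the source level by logging side effects. In the sketch below, trace, direct, and rewritten are hypothetical helper names; the point is only that the if-based form evaluates the operands in the same order, and skips b in the same cases, as the operator form.

```go
package main

import "fmt"

// trace records that an operand was evaluated and returns its value.
func trace(log *[]string, name string, v bool) bool {
	*log = append(*log, name)
	return v
}

// direct uses the && operator as written in source code.
func direct(log *[]string, a, b bool) bool {
	return trace(log, "a", a) && trace(log, "b", b)
}

// rewritten follows the shape from the commit message:
// r := a; if r { r = b }.
func rewritten(log *[]string, a, b bool) bool {
	r := trace(log, "a", a)
	if r {
		r = trace(log, "b", b)
	}
	return r
}

func main() {
	for _, a := range []bool{false, true} {
		var l1, l2 []string
		fmt.Println(direct(&l1, a, true), l1, rewritten(&l2, a, true), l2)
	}
}
```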
2018-10-15 | cmd/compile: provide types for all order-allocated temporaries | Keith Randall
Ensure that we correctly type the stack temps for regular closures, method function closures, and slice literals. Then we don't need to override the dummy types later. Furthermore, this allows order to reuse temporaries of these types. OARRAYLIT doesn't need a temporary as far as I can tell, so I removed that case from order. Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9 Reviewed-on: https://go-review.googlesource.com/c/140306 Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-10-14 | cmd/compile: reuse temporaries in order pass | Keith Randall
Instead of allocating a new temporary each time one is needed, keep a list of temporaries which are free (have already been VARKILLed on every path) and use one of them. Should save a lot of stack space. In a function like this:
    func main() {
        fmt.Printf("%d %d\n", 2, 3)
        fmt.Printf("%d %d\n", 4, 5)
        fmt.Printf("%d %d\n", 6, 7)
    }
The three [2]interface{} arrays used to hold the ... args all use the same autotmp, instead of 3 different autotmps as happened previous to this CL. Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3 Reviewed-on: https://go-review.googlesource.com/c/140301 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
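A source-level analogy may help picture what sharing one autotmp means. The sketch below writes the [2]interface{} backing array out by hand, once with three separate arrays and once with a single reused one. The function names are hypothetical, and this only imitates the effect described in the message; it is not the code the order pass actually generates.

```go
package main

import "fmt"

// separateTemps uses a distinct backing array for each variadic call,
// analogous to three different autotmps.
func separateTemps() []string {
	t1 := [2]interface{}{2, 3}
	t2 := [2]interface{}{4, 5}
	t3 := [2]interface{}{6, 7}
	return []string{
		fmt.Sprintf("%d %d", t1[:]...),
		fmt.Sprintf("%d %d", t2[:]...),
		fmt.Sprintf("%d %d", t3[:]...),
	}
}

// sharedTemp reuses one backing array, which is safe because each call has
// finished with the array before the next call overwrites it.
func sharedTemp() []string {
	var tmp [2]interface{}
	var out []string
	tmp[0], tmp[1] = 2, 3
	out = append(out, fmt.Sprintf("%d %d", tmp[:]...))
	tmp[0], tmp[1] = 4, 5
	out = append(out, fmt.Sprintf("%d %d", tmp[:]...))
	tmp[0], tmp[1] = 6, 7
	out = append(out, fmt.Sprintf("%d %d", tmp[:]...))
	return out
}

func main() {
	fmt.Println(separateTemps(), sharedTemp())
}
```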
2018-10-14 | runtime,cmd/compile: pass strings and slices to convT2{E,I} by value | Keith Randall
When we pass these types by reference, we usually have to allocate temporaries on the stack, initialize them, then pass their address to the conversion functions. It's simpler to pass these types directly by value. This particularly applies to conversions needed for fmt.Printf (to interface{} for constructing a [...]interface{}).
    func f(a, b, c string) {
        fmt.Printf("%s %s\n", a, b)
        fmt.Printf("%s %s\n", b, c)
    }
This function's stack frame shrinks from 200 to 136 bytes, and its code shrinks from 535 to 453 bytes. The go binary shrinks 0.3%. Update #24286
Aside: for this function f, we don't really need to allocate temporaries for the convT2E function. We could use the address of a, b, and c directly. That might get similar (or maybe better?) improvements. I investigated a bit, but it seemed complicated to do it safely. This change was much easier. Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d Reviewed-on: https://go-review.googlesource.com/c/135377 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2018-10-03 | cmd/compile,runtime: remove ambiguously live logic | Keith Randall
The previous CL introduced stack objects. This CL removes the old ambiguously live liveness analysis. After this CL we're relying on stack objects exclusively. Update a bunch of liveness tests to reflect the new world. Fixes #22350 Change-Id: I739b26e015882231011ce6bc1a7f426049e59f31 Reviewed-on: https://go-review.googlesource.com/c/134156 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2018-08-23 | all: fix typos detected by github.com/client9/misspell | Kazuhiro Sera
Change-Id: Iadb3c5de8ae9ea45855013997ed70f7929a88661 GitHub-Last-Rev: ae85bcf82be8fee533e2b9901c6133921382c70a GitHub-Pull-Request: golang/go#26920 Reviewed-on: https://go-review.googlesource.com/128955 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-05-22 | cmd/compile: enable stack maps everywhere except unsafe points | Austin Clements
This modifies issafepoint in liveness analysis to report almost every operation as a safe point. There are four things we don't mark as safe-points:
    1. Runtime code (other than at calls).
    2. go:nosplit functions (other than at calls).
    3. Instructions between the load of the write barrier-enabled flag and the write.
    4. Instructions leading up to a uintptr -> unsafe.Pointer conversion.
We'll optimize this in later CLs:
    name        old time/op  new time/op  delta
    Template    185ms ± 2%   190ms ± 2%   +2.95%  (p=0.000 n=10+10)
    Unicode     96.3ms ± 3%  96.4ms ± 1%  ~       (p=0.905 n=10+9)
    GoTypes     658ms ± 0%   669ms ± 1%   +1.72%  (p=0.000 n=10+9)
    Compiler    3.14s ± 1%   3.18s ± 1%   +1.56%  (p=0.000 n=9+10)
    SSA         7.41s ± 2%   7.59s ± 1%   +2.48%  (p=0.000 n=9+10)
    Flate       126ms ± 1%   128ms ± 1%   +2.08%  (p=0.000 n=10+10)
    GoParser    153ms ± 1%   157ms ± 2%   +2.38%  (p=0.000 n=10+10)
    Reflect     437ms ± 1%   442ms ± 1%   +0.98%  (p=0.001 n=10+10)
    Tar         178ms ± 1%   179ms ± 1%   +0.67%  (p=0.035 n=10+9)
    XML         223ms ± 1%   229ms ± 1%   +2.58%  (p=0.000 n=10+10)
    [Geo mean]  394ms        401ms        +1.75%
No effect on binary size because we're not yet emitting these extra safe points. For #24543. Change-Id: I16a1eebb9183cad7cef9d53c0fd21a973cad6859 Reviewed-on: https://go-review.googlesource.com/109348 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2018-05-01 | cmd/compile: open code select{send,recv,default} | Matthew Dempsky
Registration now looks like:
    var cases [4]runtime.scases
    var order [8]uint16
    cases[0].kind = caseSend
    cases[0].c = c1
    cases[0].elem = &v1
    if raceenabled || msanenabled {
        selectsetpc(&cases[0])
    }
    cases[1].kind = caseRecv
    cases[1].c = c2
    cases[1].elem = &v2
    if raceenabled || msanenabled {
        selectsetpc(&cases[1])
    }
    ...
Change-Id: Ib9bcf426a4797fe4bfd8152ca9e6e08e39a70b48 Reviewed-on: https://go-review.googlesource.com/37934 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
2018-05-01 | runtime: eliminate runtime.hselect | Matthew Dempsky
Now the registration phase looks like:
    var cases [4]runtime.scases
    var order [8]uint16
    selectsend(&cases[0], c1, &v1)
    selectrecv(&cases[1], c2, &v2, nil)
    selectrecv(&cases[2], c3, &v3, &ok)
    selectdefault(&cases[3])
    chosen := selectgo(&cases[0], &order[0], 4)
Primarily, this is just preparation for having the compiler open-code selectsend, selectrecv, and selectdefault. As a minor benefit, order can now be laid out separately on the stack in the pointer-free segment, so it won't take up space in the function's stack pointer maps. Change-Id: I5552ba594201efd31fcb40084da20b42ea569a45 Reviewed-on: https://go-review.googlesource.com/37933 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
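The four registered cases in this message (a send, a receive, a receive with ok, and a default) correspond to a source-level select of roughly the following shape. This is a sketch only: the function name pick and its return strings are hypothetical, and the mapping from source to the runtime calls above is inferred from the message rather than verified.

```go
package main

import "fmt"

// pick has one case of each kind named in the commit message.
func pick(c1 chan<- int, c2, c3 <-chan int, v1 int) string {
	select {
	case c1 <- v1: // send case
		return "send"
	case v2 := <-c2: // receive case
		return fmt.Sprintf("recv %d", v2)
	case v3, ok := <-c3: // receive case that also reports ok
		return fmt.Sprintf("recv-ok %d %t", v3, ok)
	default: // taken when no other case is ready
		return "default"
	}
}

func main() {
	c1 := make(chan int) // unbuffered with no receiver: send is never ready
	c2 := make(chan int)
	c3 := make(chan int)
	fmt.Println(pick(c1, c2, c3, 1))

	ready := make(chan int, 1)
	ready <- 5
	fmt.Println(pick(c1, ready, c3, 1))
}
```

With nothing ready the default case is chosen, and with exactly one case ready the choice is deterministic, which keeps the example easy to check.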
2018-02-27 | cmd/compile: mark the first word of an interface as a uintptr | Keith Randall
The first word of an interface is a pointer, but for the purposes of GC we don't need to treat it as such.
    1. If it is a non-empty interface, the pointer points to an itab which is always in persistentalloc space.
    2. If it is an empty interface, the pointer points to a _type.
        a. If it is a compile-time-allocated type, it points into the read-only data section.
        b. If it is a reflect-allocated type, it points into the Go heap. Reflect is responsible for keeping a reference to the underlying type so it won't be GCd.
If we ever have a moving GC, we need to change this for 2b (as well as scan itabs to update their itab._type fields). Write barriers on the first word of interfaces have already been removed. Change-Id: I643e91d7ac4de980ac2717436eff94097c65d959 Reviewed-on: https://go-review.googlesource.com/97518 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-11-02 | cmd/compile: specialize map creation for small hint sizes | Martin Möhrmann
Handle make(map[any]any) and make(map[any]any, hint) where hint <= BUCKETSIZE specially to allow for faster map initialization and to improve binary size by using runtime calls with fewer arguments. When hint is less than or equal to BUCKETSIZE, overLoadFactor(hint, 0) is false and no buckets would be allocated by makemap, so:
    * If hmap needs to be allocated on the stack then only hmap's hash0 field needs to be initialized and no call to makemap is needed.
    * If hmap needs to be allocated on the heap then a new special makehmap function will allocate hmap and initialize hmap's hash0 field.
Reduces the size of godoc by ~36kb.
AMD64
    name         old time/op  new time/op  delta
    NewEmptyMap  16.6ns ± 2%  5.5ns ± 2%   -66.72%  (p=0.000 n=10+10)
    NewSmallMap  64.8ns ± 1%  56.5ns ± 1%  -12.75%  (p=0.000 n=9+10)
Updates #6853 Change-Id: I624e90da6775afaa061178e95db8aca674f44e9b Reviewed-on: https://go-review.googlesource.com/61190 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
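The two source forms this message targets look like the following. The sketch assumes that a hint of 4 falls at or below BUCKETSIZE, whose value is not given in the message; the function name smallMaps is hypothetical. The program demonstrates only that both forms behave identically from the caller's point of view, since the optimization is invisible at the language level.

```go
package main

import "fmt"

// smallMaps creates one map without a hint and one with a small hint.
// Both are ordinary maps; only the runtime setup path is expected to differ.
func smallMaps() (map[string]int, map[string]int) {
	noHint := make(map[string]int)
	smallHint := make(map[string]int, 4) // assumed to be <= BUCKETSIZE
	noHint["a"] = 1
	smallHint["b"] = 2
	return noHint, smallHint
}

func main() {
	a, b := smallMaps()
	fmt.Println(a["a"], b["b"], len(a), len(b))
}
```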
2017-10-13 | cmd/compile: simplify slice/array range loops for some element sizes | Martin Möhrmann
In range loops over slices and arrays, besides a variable to track the index, an extra variable containing the address of the current element is used. To compute a pointer to the next element, the element's size is added to the address. On 386 and amd64 an element of size 1, 2, 4 or 8 bytes can be copied from an array using a MOV instruction with a suitable addressing mode that uses the start address of the array, the index of the element, and the element size as scaling factor. Thereby, for arrays and slices with suitable element size we can avoid keeping and incrementing an extra variable to compute the next element's address. Shrinks cmd/go by 4 kilobytes.
AMD64:
    name                   old time/op  new time/op  delta
    BinaryTree17           2.66s ± 7%   2.54s ± 0%   -4.53%  (p=0.000 n=10+8)
    Fannkuch11             3.02s ± 1%   3.02s ± 1%   ~       (p=0.579 n=10+10)
    FmtFprintfEmpty        45.6ns ± 1%  42.2ns ± 1%  -7.46%  (p=0.000 n=10+10)
    FmtFprintfString       69.8ns ± 1%  70.4ns ± 1%  +0.84%  (p=0.041 n=10+10)
    FmtFprintfInt          80.1ns ± 1%  79.0ns ± 1%  -1.35%  (p=0.000 n=10+10)
    FmtFprintfIntInt       127ns ± 1%   125ns ± 1%   -1.00%  (p=0.007 n=10+9)
    FmtFprintfPrefixedInt  158ns ± 2%   152ns ± 1%   -4.11%  (p=0.000 n=10+10)
    FmtFprintfFloat        218ns ± 1%   214ns ± 1%   -1.61%  (p=0.000 n=10+10)
    FmtManyArgs            508ns ± 1%   504ns ± 1%   -0.93%  (p=0.001 n=9+10)
    GobDecode              6.76ms ± 1%  6.78ms ± 1%  ~       (p=0.353 n=10+10)
    GobEncode              5.84ms ± 1%  5.77ms ± 1%  -1.31%  (p=0.000 n=10+9)
    Gzip                   223ms ± 1%   218ms ± 1%   -2.39%  (p=0.000 n=10+10)
    Gunzip                 40.3ms ± 1%  40.4ms ± 3%  ~       (p=0.796 n=10+10)
    HTTPClientServer       73.5µs ± 0%  73.3µs ± 0%  -0.28%  (p=0.000 n=10+9)
    JSONEncode             12.7ms ± 1%  12.6ms ± 8%  ~       (p=0.173 n=8+10)
    JSONDecode             57.5ms ± 1%  56.1ms ± 2%  -2.40%  (p=0.000 n=10+10)
    Mandelbrot200          3.80ms ± 1%  3.86ms ± 6%  ~       (p=0.579 n=10+10)
    GoParse                3.25ms ± 1%  3.23ms ± 1%  ~       (p=0.052 n=10+10)
    RegexpMatchEasy0_32    74.4ns ± 1%  76.9ns ± 1%  +3.39%  (p=0.000 n=10+10)
    RegexpMatchEasy0_1K    243ns ± 2%   248ns ± 1%   +1.86%  (p=0.000 n=10+8)
    RegexpMatchEasy1_32    71.0ns ± 2%  72.8ns ± 1%  +2.55%  (p=0.000 n=10+10)
    RegexpMatchEasy1_1K    370ns ± 1%   383ns ± 0%   +3.39%  (p=0.000 n=10+9)
    RegexpMatchMedium_32   107ns ± 0%   113ns ± 1%   +5.33%  (p=0.000 n=6+10)
    RegexpMatchMedium_1K   35.0µs ± 1%  36.0µs ± 1%  +3.13%  (p=0.000 n=10+10)
    RegexpMatchHard_32     1.65µs ± 1%  1.69µs ± 1%  +2.23%  (p=0.000 n=10+9)
    RegexpMatchHard_1K     49.8µs ± 1%  50.6µs ± 1%  +1.59%  (p=0.000 n=10+10)
    Revcomp                398ms ± 1%   396ms ± 1%   -0.51%  (p=0.043 n=10+10)
    Template               63.4ms ± 1%  60.8ms ± 0%  -4.11%  (p=0.000 n=10+9)
    TimeParse              318ns ± 1%   322ns ± 1%   +1.10%  (p=0.005 n=10+10)
    TimeFormat             323ns ± 1%   336ns ± 1%   +4.15%  (p=0.000 n=10+10)
Updates: #15809. Change-Id: I55915aaf6d26768e12247f8a8edf14e7630726d1 Reviewed-on: https://go-review.googlesource.com/38061 Run-TryBot: Martin Möhrmann <moehrmann@google.com> Reviewed-by: Keith Randall <khr@golang.org>
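Loops of the kind this message describes are plain range loops over slices or arrays whose elements are 1, 2, 4, or 8 bytes wide. The sketch below shows two such loops with hypothetical names (sumBytes, sumInt64). It makes no claim about the instructions a particular compiler emits; it only illustrates which source shapes fall under the element sizes listed in the message.

```go
package main

import "fmt"

// sumBytes ranges over 1-byte elements.
func sumBytes(xs []byte) (total int) {
	for _, x := range xs {
		total += int(x)
	}
	return total
}

// sumInt64 ranges over 8-byte elements.
func sumInt64(xs []int64) (total int64) {
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(sumBytes([]byte{1, 2, 3}), sumInt64([]int64{10, 20, 30}))
}
```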
2017-08-24 | cmd/compile: eliminate stores to unread auto variables | Michael Munday
This is a crude compiler pass to eliminate stores to auto variables that are only ever written to. Eliminates an unnecessary store to x from the following code:
    func f() int {
        x := 1
        return *(&x)
    }
Fixes #19765. Change-Id: If2c63a8ae67b8c590b6e0cc98a9610939a3eeffa Reviewed-on: https://go-review.googlesource.com/38746 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2017-03-21 | runtime: add mapdelete_fast* | Hugues Bruant
Add benchmarks for map delete with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapDelete/Int32/1-8 151ns ± 8% 99ns ± 3% -34.39% (p=0.008 n=5+5) MapDelete/Int32/2-8 128ns ± 2% 111ns ±15% -13.40% (p=0.040 n=5+5) MapDelete/Int32/4-8 128ns ± 5% 114ns ± 2% -10.82% (p=0.008 n=5+5) MapDelete/Int64/1-8 144ns ± 0% 104ns ± 3% -27.53% (p=0.016 n=4+5) MapDelete/Int64/2-8 153ns ± 1% 126ns ± 3% -17.17% (p=0.008 n=5+5) MapDelete/Int64/4-8 178ns ± 3% 136ns ± 2% -23.60% (p=0.008 n=5+5) MapDelete/Str/1-8 187ns ± 3% 171ns ± 3% -8.54% (p=0.008 n=5+5) MapDelete/Str/2-8 221ns ± 3% 206ns ± 4% -7.18% (p=0.016 n=5+4) MapDelete/Str/4-8 256ns ± 5% 232ns ± 2% -9.36% (p=0.016 n=4+5) name old time/op new time/op delta BinaryTree17-8 2.78s ± 7% 2.70s ± 1% ~ (p=0.151 n=5+5) Fannkuch11-8 3.21s ± 2% 3.19s ± 1% ~ (p=0.310 n=5+5) FmtFprintfEmpty-8 49.1ns ± 3% 50.2ns ± 2% ~ (p=0.095 n=5+5) FmtFprintfString-8 78.6ns ± 4% 80.2ns ± 5% ~ (p=0.460 n=5+5) FmtFprintfInt-8 79.7ns ± 1% 81.0ns ± 3% ~ (p=0.103 n=5+5) FmtFprintfIntInt-8 117ns ± 2% 119ns ± 0% ~ (p=0.079 n=5+4) FmtFprintfPrefixedInt-8 153ns ± 1% 146ns ± 3% -4.19% (p=0.024 n=5+5) FmtFprintfFloat-8 239ns ± 1% 237ns ± 1% ~ (p=0.246 n=5+5) FmtManyArgs-8 506ns ± 2% 509ns ± 2% ~ (p=0.238 n=5+5) GobDecode-8 7.06ms ± 4% 6.86ms ± 1% ~ (p=0.222 n=5+5) GobEncode-8 6.01ms ± 5% 5.87ms ± 2% ~ (p=0.222 n=5+5) Gzip-8 246ms ± 4% 236ms ± 1% -4.12% (p=0.008 n=5+5) Gunzip-8 37.7ms ± 4% 37.3ms ± 1% ~ (p=0.841 n=5+5) HTTPClientServer-8 64.9µs ± 1% 64.4µs ± 0% -0.80% (p=0.032 n=5+4) JSONEncode-8 16.0ms ± 2% 16.2ms ±11% ~ (p=0.548 n=5+5) JSONDecode-8 53.2ms ± 2% 53.1ms ± 4% ~ (p=1.000 n=5+5) Mandelbrot200-8 4.33ms ± 2% 4.32ms ± 2% ~ (p=0.841 n=5+5) GoParse-8 3.24ms ± 2% 3.27ms ± 4% ~ (p=0.690 n=5+5) RegexpMatchEasy0_32-8 86.2ns ± 1% 85.2ns ± 3% ~ (p=0.286 n=5+5) RegexpMatchEasy0_1K-8 198ns ± 2% 199ns ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_32-8 82.6ns ± 2% 81.8ns ± 1% ~ (p=0.294 n=5+5) RegexpMatchEasy1_1K-8 359ns ± 2% 354ns 
± 1% -1.39% (p=0.048 n=5+5) RegexpMatchMedium_32-8 123ns ± 2% 123ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchMedium_1K-8 38.2µs ± 2% 38.6µs ± 8% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 1.92µs ± 2% 1.91µs ± 5% ~ (p=0.460 n=5+5) RegexpMatchHard_1K-8 57.6µs ± 1% 57.0µs ± 2% ~ (p=0.310 n=5+5) Revcomp-8 483ms ± 7% 441ms ± 1% -8.79% (p=0.016 n=5+4) Template-8 58.0ms ± 1% 58.2ms ± 7% ~ (p=0.310 n=5+5) TimeParse-8 324ns ± 6% 312ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 330ns ± 1% 329ns ± 1% ~ (p=0.968 n=5+5) name old speed new speed delta GobDecode-8 109MB/s ± 4% 112MB/s ± 1% ~ (p=0.222 n=5+5) GobEncode-8 128MB/s ± 5% 131MB/s ± 2% ~ (p=0.222 n=5+5) Gzip-8 78.9MB/s ± 4% 82.3MB/s ± 1% +4.25% (p=0.008 n=5+5) Gunzip-8 514MB/s ± 4% 521MB/s ± 1% ~ (p=0.841 n=5+5) JSONEncode-8 121MB/s ± 2% 120MB/s ±10% ~ (p=0.548 n=5+5) JSONDecode-8 36.5MB/s ± 2% 36.6MB/s ± 4% ~ (p=1.000 n=5+5) GoParse-8 17.9MB/s ± 2% 17.7MB/s ± 4% ~ (p=0.730 n=5+5) RegexpMatchEasy0_32-8 371MB/s ± 1% 375MB/s ± 3% ~ (p=0.310 n=5+5) RegexpMatchEasy0_1K-8 5.15GB/s ± 1% 5.13GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 387MB/s ± 2% 391MB/s ± 1% ~ (p=0.310 n=5+5) RegexpMatchEasy1_1K-8 2.85GB/s ± 2% 2.89GB/s ± 1% ~ (p=0.056 n=5+5) RegexpMatchMedium_32-8 8.07MB/s ± 2% 8.06MB/s ± 1% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 26.8MB/s ± 2% 26.6MB/s ± 7% ~ (p=0.690 n=5+5) RegexpMatchHard_32-8 16.7MB/s ± 2% 16.7MB/s ± 5% ~ (p=0.421 n=5+5) RegexpMatchHard_1K-8 17.8MB/s ± 1% 18.0MB/s ± 2% ~ (p=0.310 n=5+5) Revcomp-8 527MB/s ± 6% 577MB/s ± 1% +9.44% (p=0.016 n=5+4) Template-8 33.5MB/s ± 1% 33.4MB/s ± 7% ~ (p=0.310 n=5+5) Updates #19495 Change-Id: Ib9ece1690813d9b4788455db43d30891e2138df5 Reviewed-on: https://go-review.googlesource.com/38172 Reviewed-by: Hugues Bruant <hugues.bruant@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-03-13 | runtime: add mapassign_fast* | Hugues Bruant
Add benchmarks for map assignment with int32/int64/string key Benchmark results on darwin/amd64 name old time/op new time/op delta MapAssignInt32_255-8 24.7ns ± 3% 17.4ns ± 2% -29.75% (p=0.000 n=10+10) MapAssignInt32_64k-8 45.5ns ± 4% 37.6ns ± 4% -17.18% (p=0.000 n=10+10) MapAssignInt64_255-8 26.0ns ± 3% 17.9ns ± 4% -31.03% (p=0.000 n=10+10) MapAssignInt64_64k-8 46.9ns ± 5% 38.7ns ± 2% -17.53% (p=0.000 n=9+10) MapAssignStr_255-8 47.8ns ± 3% 24.8ns ± 4% -48.01% (p=0.000 n=10+10) MapAssignStr_64k-8 83.0ns ± 3% 51.9ns ± 3% -37.45% (p=0.000 n=10+9) name old time/op new time/op delta BinaryTree17-8 3.11s ±19% 2.78s ± 3% ~ (p=0.095 n=5+5) Fannkuch11-8 3.26s ± 1% 3.21s ± 2% ~ (p=0.056 n=5+5) FmtFprintfEmpty-8 50.3ns ± 1% 50.8ns ± 2% ~ (p=0.246 n=5+5) FmtFprintfString-8 82.7ns ± 4% 80.1ns ± 5% ~ (p=0.238 n=5+5) FmtFprintfInt-8 82.6ns ± 2% 81.9ns ± 3% ~ (p=0.508 n=5+5) FmtFprintfIntInt-8 124ns ± 4% 121ns ± 3% ~ (p=0.111 n=5+5) FmtFprintfPrefixedInt-8 158ns ± 6% 160ns ± 2% ~ (p=0.341 n=5+5) FmtFprintfFloat-8 249ns ± 2% 245ns ± 2% ~ (p=0.095 n=5+5) FmtManyArgs-8 513ns ± 2% 519ns ± 3% ~ (p=0.151 n=5+5) GobDecode-8 7.48ms ±12% 7.11ms ± 2% ~ (p=0.222 n=5+5) GobEncode-8 6.25ms ± 1% 6.03ms ± 2% -3.56% (p=0.008 n=5+5) Gzip-8 252ms ± 4% 252ms ± 4% ~ (p=1.000 n=5+5) Gunzip-8 38.4ms ± 3% 38.6ms ± 2% ~ (p=0.690 n=5+5) HTTPClientServer-8 76.9µs ±41% 66.4µs ± 6% ~ (p=0.310 n=5+5) JSONEncode-8 16.5ms ± 3% 16.7ms ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 54.6ms ± 1% 54.3ms ± 2% ~ (p=0.548 n=5+5) Mandelbrot200-8 4.45ms ± 3% 4.47ms ± 1% ~ (p=0.841 n=5+5) GoParse-8 3.43ms ± 1% 3.32ms ± 2% -3.28% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 88.2ns ± 3% 89.4ns ± 2% ~ (p=0.333 n=5+5) RegexpMatchEasy0_1K-8 205ns ± 1% 206ns ± 1% ~ (p=0.905 n=5+5) RegexpMatchEasy1_32-8 85.1ns ± 1% 85.5ns ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 365ns ± 1% 371ns ± 9% ~ (p=1.000 n=5+5) RegexpMatchMedium_32-8 129ns ± 2% 128ns ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 39.8µs ± 0% 39.7µs ± 4% ~ (p=0.730 n=4+5) 
RegexpMatchHard_32-8 1.99µs ± 3% 2.05µs ±16% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 59.3µs ± 1% 60.3µs ± 7% ~ (p=1.000 n=5+5) Revcomp-8 1.36s ±63% 0.52s ± 5% ~ (p=0.095 n=5+5) Template-8 62.6ms ±14% 60.5ms ± 5% ~ (p=0.690 n=5+5) TimeParse-8 330ns ± 2% 324ns ± 2% ~ (p=0.087 n=5+5) TimeFormat-8 350ns ± 3% 340ns ± 1% -2.86% (p=0.008 n=5+5) name old speed new speed delta GobDecode-8 103MB/s ±11% 108MB/s ± 2% ~ (p=0.222 n=5+5) GobEncode-8 123MB/s ± 1% 127MB/s ± 2% +3.71% (p=0.008 n=5+5) Gzip-8 77.1MB/s ± 4% 76.9MB/s ± 3% ~ (p=1.000 n=5+5) Gunzip-8 505MB/s ± 3% 503MB/s ± 2% ~ (p=0.690 n=5+5) JSONEncode-8 118MB/s ± 3% 116MB/s ± 3% ~ (p=0.421 n=5+5) JSONDecode-8 35.5MB/s ± 1% 35.8MB/s ± 2% ~ (p=0.397 n=5+5) GoParse-8 16.9MB/s ± 1% 17.4MB/s ± 2% +3.45% (p=0.008 n=5+5) RegexpMatchEasy0_32-8 363MB/s ± 3% 358MB/s ± 2% ~ (p=0.421 n=5+5) RegexpMatchEasy0_1K-8 4.98GB/s ± 1% 4.97GB/s ± 1% ~ (p=0.548 n=5+5) RegexpMatchEasy1_32-8 376MB/s ± 1% 375MB/s ± 5% ~ (p=0.690 n=5+5) RegexpMatchEasy1_1K-8 2.80GB/s ± 1% 2.76GB/s ± 9% ~ (p=0.841 n=5+5) RegexpMatchMedium_32-8 7.73MB/s ± 1% 7.76MB/s ± 3% ~ (p=0.730 n=5+5) RegexpMatchMedium_1K-8 25.8MB/s ± 0% 25.8MB/s ± 4% ~ (p=0.651 n=4+5) RegexpMatchHard_32-8 16.1MB/s ± 3% 15.7MB/s ±14% ~ (p=0.794 n=5+5) RegexpMatchHard_1K-8 17.3MB/s ± 1% 17.0MB/s ± 7% ~ (p=0.984 n=5+5) Revcomp-8 273MB/s ±83% 488MB/s ± 5% ~ (p=0.095 n=5+5) Template-8 31.1MB/s ±13% 32.1MB/s ± 5% ~ (p=0.690 n=5+5) Updates #19495 Change-Id: I116e9a2a4594769318b22d736464de8a98499909 Reviewed-on: https://go-review.googlesource.com/38091 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
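The benchmarks in this entry and in the mapdelete_fast* entry above it exercise map assignment and deletion with int32, int64, and string keys. A minimal sketch of those operations follows; the function name fillAndDrain is hypothetical, and which runtime routine services each statement is a compiler decision that this program does not observe.

```go
package main

import "fmt"

// fillAndDrain assigns and then deletes four entries in maps keyed by the
// three key types named in the benchmark results, returning the map sizes
// after each phase.
func fillAndDrain() (afterFill, afterDrain [3]int) {
	m32 := make(map[int32]int)
	m64 := make(map[int64]int)
	mstr := make(map[string]int)

	for i := 0; i < 4; i++ {
		m32[int32(i)] = i
		m64[int64(i)] = i
		mstr[fmt.Sprint(i)] = i
	}
	afterFill = [3]int{len(m32), len(m64), len(mstr)}

	for i := 0; i < 4; i++ {
		delete(m32, int32(i))
		delete(m64, int64(i))
		delete(mstr, fmt.Sprint(i))
	}
	afterDrain = [3]int{len(m32), len(m64), len(mstr)}
	return afterFill, afterDrain
}

func main() {
	fmt.Println(fillAndDrain())
}
```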
2017-03-07cmd/compile, runtime: simplify multiway select implementationMatthew Dempsky
This commit reworks multiway select statements to use normal control flow primitives instead of the previous setjmp/longjmp-like behavior. This simplifies liveness analysis and should prevent issues around "returns twice" function calls within SSA passes. test/live.go is updated because liveness analysis's CFG is more representative of actual control flow. The case bodies are the only real successors of the selectgo call, but previously the selectsend, selectrecv, etc. calls were included in the successors list too. Updates #19331. Change-Id: I7f879b103a4b85e62fc36a270d812f54c0aa3e83 Reviewed-on: https://go-review.googlesource.com/37661 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
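For illustration only, and not taken from the CL: a minimal Go sketch of the kind of multiway select statement whose lowering the commit above reworks. The helper name pick and both channels are hypothetical. Per the message, the case bodies are the only real successors of the selectgo call, which is the control flow a reader sees in the source.

```go
package main

import "fmt"

// pick is a hypothetical helper. It performs a multiway select over a
// receive case, a send case, and a default case, and reports which
// case body ran.
func pick(in <-chan int, out chan<- int) string {
	select {
	case v := <-in:
		return fmt.Sprintf("recv %d", v)
	case out <- 1:
		return "send"
	default:
		return "default"
	}
}

func main() {
	in := make(chan int, 1)
	out := make(chan int) // unbuffered with no receiver, so the send cannot proceed

	fmt.Println(pick(in, out)) // no case is ready
	in <- 7
	fmt.Println(pick(in, out)) // only the receive case is ready
}
```

Exactly one case body runs per call, and nothing in the source "returns twice".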
2017-02-28cmd/compile, runtime: specialize convT2x, don't alloc for zero valsJosh Bleecher Snyder
Prior to this CL, all runtime conversions from a concrete value to an interface went through one of two runtime calls: convT2E or convT2I. However, in practice, basic types are very common. Specializing convT2x for those basic types allows for a more efficient implementation for those types. For basic scalars and strings, allocation and copying can use the same methods as normal code. For pointer-free types, allocation can occur without zeroing, and copying can take place without GC calls. For slices, copying is cheaper and simpler. This CL adds twelve runtime routines: convT2E16, convT2I16, convT2E32, convT2I32, convT2E64, convT2I64, convT2Estring, convT2Istring, convT2Eslice, convT2Islice, convT2Enoptr, convT2Inoptr. While compiling make.bash, 93% of all convT2x calls are now to one of these specialized convT2x calls. Within specialized convT2x routines, it is cheap to check for a zero value, in a way that it is not in general. When we detect a zero value there, we return a pointer to zeroVal, rather than allocating.

name old time/op new time/op delta
ConvT2Ezero/zero/16-8 17.9ns ± 2% 3.0ns ± 3% -83.20% (p=0.000 n=56+56)
ConvT2Ezero/zero/32-8 17.8ns ± 2% 3.0ns ± 3% -83.15% (p=0.000 n=59+60)
ConvT2Ezero/zero/64-8 20.1ns ± 1% 3.0ns ± 2% -84.98% (p=0.000 n=57+57)
ConvT2Ezero/zero/str-8 32.6ns ± 1% 3.0ns ± 4% -90.70% (p=0.000 n=59+60)
ConvT2Ezero/zero/slice-8 36.7ns ± 2% 3.0ns ± 2% -91.78% (p=0.000 n=59+59)
ConvT2Ezero/zero/big-8 91.9ns ± 2% 85.9ns ± 2% -6.52% (p=0.000 n=57+57)
ConvT2Ezero/nonzero/16-8 17.7ns ± 2% 12.7ns ± 3% -28.38% (p=0.000 n=55+60)
ConvT2Ezero/nonzero/32-8 17.8ns ± 1% 12.7ns ± 1% -28.44% (p=0.000 n=54+57)
ConvT2Ezero/nonzero/64-8 20.0ns ± 1% 15.0ns ± 1% -24.90% (p=0.000 n=56+58)
ConvT2Ezero/nonzero/str-8 32.6ns ± 1% 25.7ns ± 1% -21.17% (p=0.000 n=58+55)
ConvT2Ezero/nonzero/slice-8 36.8ns ± 2% 30.4ns ± 1% -17.32% (p=0.000 n=60+52)
ConvT2Ezero/nonzero/big-8 92.1ns ± 2% 85.9ns ± 2% -6.70% (p=0.000 n=57+59)

Benchmarks on a real program (the compiler):

name old time/op new time/op delta
Template 227ms ± 5% 221ms ± 2% -2.48% (p=0.000 n=30+26)
Unicode 102ms ± 5% 100ms ± 3% -1.30% (p=0.009 n=30+26)
GoTypes 656ms ± 5% 659ms ± 4% ~ (p=0.208 n=30+30)
Compiler 2.82s ± 2% 2.82s ± 1% ~ (p=0.614 n=29+27)
Flate 128ms ± 2% 128ms ± 5% ~ (p=0.783 n=27+28)
GoParser 158ms ± 3% 158ms ± 3% ~ (p=0.261 n=28+30)
Reflect 408ms ± 7% 401ms ± 3% ~ (p=0.075 n=30+30)
Tar 123ms ± 6% 121ms ± 8% ~ (p=0.287 n=29+30)
XML 220ms ± 2% 220ms ± 4% ~ (p=0.805 n=29+29)

name old user-ns/op new user-ns/op delta
Template 281user-ms ± 4% 279user-ms ± 3% -0.87% (p=0.044 n=28+28)
Unicode 142user-ms ± 4% 141user-ms ± 3% -1.04% (p=0.015 n=30+27)
GoTypes 884user-ms ± 3% 886user-ms ± 2% ~ (p=0.532 n=30+30)
Compiler 3.94user-s ± 3% 3.92user-s ± 1% ~ (p=0.185 n=30+28)
Flate 165user-ms ± 2% 165user-ms ± 4% ~ (p=0.780 n=27+29)
GoParser 209user-ms ± 2% 208user-ms ± 3% ~ (p=0.453 n=28+30)
Reflect 533user-ms ± 6% 526user-ms ± 3% ~ (p=0.057 n=30+30)
Tar 156user-ms ± 6% 154user-ms ± 6% ~ (p=0.133 n=29+30)
XML 288user-ms ± 4% 288user-ms ± 4% ~ (p=0.633 n=30+30)

name old alloc/op new alloc/op delta
Template 41.0MB ± 0% 40.9MB ± 0% -0.11% (p=0.000 n=29+29)
Unicode 32.6MB ± 0% 32.6MB ± 0% ~ (p=0.572 n=29+30)
GoTypes 122MB ± 0% 122MB ± 0% -0.10% (p=0.000 n=30+30)
Compiler 482MB ± 0% 481MB ± 0% -0.07% (p=0.000 n=30+29)
Flate 26.6MB ± 0% 26.6MB ± 0% ~ (p=0.096 n=30+30)
GoParser 32.7MB ± 0% 32.6MB ± 0% -0.06% (p=0.011 n=28+28)
Reflect 84.2MB ± 0% 84.1MB ± 0% -0.17% (p=0.000 n=29+30)
Tar 27.7MB ± 0% 27.7MB ± 0% -0.05% (p=0.032 n=27+28)
XML 44.7MB ± 0% 44.7MB ± 0% ~ (p=0.131 n=28+30)

name old allocs/op new allocs/op delta
Template 373k ± 1% 370k ± 1% -0.76% (p=0.000 n=30+30)
Unicode 325k ± 1% 325k ± 1% ~ (p=0.383 n=29+30)
GoTypes 1.16M ± 0% 1.15M ± 0% -0.75% (p=0.000 n=29+30)
Compiler 4.15M ± 0% 4.13M ± 0% -0.59% (p=0.000 n=30+29)
Flate 238k ± 1% 237k ± 1% -0.62% (p=0.000 n=30+30)
GoParser 304k ± 1% 302k ± 1% -0.64% (p=0.000 n=30+28)
Reflect 1.00M ± 0% 0.99M ± 0% -1.10% (p=0.000 n=29+30)
Tar 245k ± 1% 244k ± 1% -0.59% (p=0.000 n=27+29)
XML 391k ± 1% 389k ± 1% -0.59% (p=0.000 n=29+30)

Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f Reviewed-on: https://go-review.googlesource.com/36476 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
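A rough way to observe the zero-value behavior described in the convT2x commit above is to count allocations for a zero and a nonzero conversion. This is only a sketch: toIface and sink are hypothetical names, and the counts it prints depend on the Go version and compiler optimizations, so no particular output is claimed.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable so the converted value escapes and
// the conversion cannot be optimized away entirely.
var sink interface{}

// toIface converts a concrete uint32 to an empty interface. Which
// runtime routine (if any) the compiler calls here is version-dependent.
func toIface(v uint32) interface{} { return v }

func main() {
	for _, v := range []uint32{0, 1 << 20} {
		v := v // per-iteration copy captured by the closure below
		allocs := testing.AllocsPerRun(1000, func() { sink = toIface(v) })
		fmt.Printf("value %d: %.0f allocs per conversion\n", v, allocs)
	}
}
```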
2017-02-10cmd/compile/internal/syntax: removed gcCompat code needed to pass orig. testsRobert Griesemer
The gcCompat mode was introduced to match the new parser's node position setup exactly with the positions used by the original parser. Some of the gcCompat adjustments were required to satisfy syntax error test cases, and the rest were required to make toolstash cmp pass. This change removes the former gcCompat adjustments and instead adjusts the respective test cases as necessary. In some cases this makes the error lines consistent with the ones reported by gccgo. Where it has changed, the position associated with a given syntactic construct is the position (line/col number) of the left-most token belonging to the construct. Change-Id: I5b60c00c5999a895c4d6d6e9b383c6405ccf725c Reviewed-on: https://go-review.googlesource.com/36695 Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2017-02-03cmd/compile: make sure output params are live if there is a deferKeith Randall
If there is a defer, and that defer recovers, then the caller can see all of the output parameters. That means that we must mark all the output parameters live at any point which might panic. If there is no defer then this is not necessary. This is implemented. We could also detect whether there is a recover in any of the defers. If not, we would need to mark only output params that the defer actually references (and the closure mechanism already does that). This is not implemented. Fixes #18860. Change-Id: If984fe6686eddce9408bf25e725dd17fc16b8578 Reviewed-on: https://go-review.googlesource.com/36030 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Russ Cox <rsc@golang.org>
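The situation described in the commit above can be seen from ordinary Go code. The following sketch (divide is a hypothetical function, not part of the CL) stores to a named result, panics, and recovers in a deferred closure; the caller then observes the output parameters as they stood at the panic, which is why they must be kept live there.

```go
package main

import "fmt"

// divide has named results. The deferred closure recovers from the
// division-by-zero panic, so the function returns normally and the
// caller sees whatever quo and err hold at that moment.
func divide(a, b int) (quo int, err error) {
	quo = -1 // stored before the operation that may panic
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	quo = a / b // panics when b == 0; quo keeps the earlier value
	return quo, nil
}

func main() {
	fmt.Println(divide(6, 3))
	q, err := divide(1, 0)
	fmt.Println(q, err != nil)
}
```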
2017-02-02cmd/compile: convert constants to interfaces without allocatingJosh Bleecher Snyder
The order pass is responsible for ensuring that values passed to runtime functions, including convT2E/convT2I, are addressable. Prior to this CL, this was always accomplished by creating a temp, which frequently escaped to the heap, causing allocations, perhaps most notably in code like: fmt.Println(1, 2, 3) // allocates three times None of the runtime routines modify the contents of the pointers they receive, so in the case of constants, instead of creating a temp value, we can create a static value. (Marking the static value as read-only provides protection against accidental attempts by the runtime to modify the constant data.) This improves code generation for code like: panic("abc") c <- 2 // c is a chan int which can now simply refer to "abc" and 2, rather than going by way of a temporary. It also allows us to optimize convT2E/convT2I, by recognizing static readonly values and directly constructing the interface. This CL adds ~0.5% to binary size, despite decreasing the size of many functions, because it also adds many static symbols. This binary size regression could be recovered in future (but currently unplanned) work. There is a lot of content-duplication in these symbols; this statement generates six new symbols, three containing an int 1 and three containing a pointer to the string "a": fmt.Println(1, 1, 1, "a", "a", "a") These symbols could be made content-addressable. Furthermore, these symbols are small, so the alignment and naming overhead is large. As with the go.strings section, these symbols could be hidden and have their alignment reduced. The changes to test/live.go make it impossible (at least with current optimization techniques) to place the values being passed to the runtime in static symbols, preserving autotmp creation. 
Fixes #18704

Benchmarks from fmt and go-kit's logging package:

github.com/go-kit/kit/log

name old time/op new time/op delta
JSONLoggerSimple-8 1.91µs ± 2% 2.11µs ±22% ~ (p=1.000 n=9+10)
JSONLoggerContextual-8 2.60µs ± 6% 2.43µs ± 2% -6.29% (p=0.000 n=9+10)
Discard-8 101ns ± 2% 34ns ±14% -66.33% (p=0.000 n=10+9)
OneWith-8 161ns ± 1% 102ns ±16% -36.78% (p=0.000 n=10+10)
TwoWith-8 175ns ± 3% 106ns ± 7% -39.36% (p=0.000 n=10+9)
TenWith-8 293ns ± 3% 227ns ±15% -22.44% (p=0.000 n=9+10)
LogfmtLoggerSimple-8 704ns ± 2% 608ns ± 2% -13.65% (p=0.000 n=10+9)
LogfmtLoggerContextual-8 962ns ± 1% 860ns ±17% -10.57% (p=0.003 n=9+10)
NopLoggerSimple-8 188ns ± 1% 120ns ± 1% -36.39% (p=0.000 n=9+10)
NopLoggerContextual-8 379ns ± 1% 243ns ± 0% -35.77% (p=0.000 n=9+10)
ValueBindingTimestamp-8 577ns ± 1% 499ns ± 1% -13.51% (p=0.000 n=10+10)
ValueBindingCaller-8 898ns ± 2% 844ns ± 2% -6.00% (p=0.000 n=10+10)

name old alloc/op new alloc/op delta
JSONLoggerSimple-8 904B ± 0% 872B ± 0% -3.54% (p=0.000 n=10+10)
JSONLoggerContextual-8 1.20kB ± 0% 1.14kB ± 0% -5.33% (p=0.000 n=10+10)
Discard-8 64.0B ± 0% 32.0B ± 0% -50.00% (p=0.000 n=10+10)
OneWith-8 96.0B ± 0% 64.0B ± 0% -33.33% (p=0.000 n=10+10)
TwoWith-8 160B ± 0% 128B ± 0% -20.00% (p=0.000 n=10+10)
TenWith-8 672B ± 0% 640B ± 0% -4.76% (p=0.000 n=10+10)
LogfmtLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10)
LogfmtLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10)
NopLoggerSimple-8 128B ± 0% 96B ± 0% -25.00% (p=0.000 n=10+10)
NopLoggerContextual-8 304B ± 0% 240B ± 0% -21.05% (p=0.000 n=10+10)
ValueBindingTimestamp-8 159B ± 0% 127B ± 0% -20.13% (p=0.000 n=10+10)
ValueBindingCaller-8 112B ± 0% 80B ± 0% -28.57% (p=0.000 n=10+10)

name old allocs/op new allocs/op delta
JSONLoggerSimple-8 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10)
JSONLoggerContextual-8 25.0 ± 0% 21.0 ± 0% -16.00% (p=0.000 n=10+10)
Discard-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
OneWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
TwoWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
TenWith-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
LogfmtLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
LogfmtLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10)
NopLoggerSimple-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
NopLoggerContextual-8 7.00 ± 0% 3.00 ± 0% -57.14% (p=0.000 n=10+10)
ValueBindingTimestamp-8 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10)
ValueBindingCaller-8 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)

fmt

name old time/op new time/op delta
SprintfPadding-8 88.9ns ± 3% 79.1ns ± 1% -11.09% (p=0.000 n=10+7)
SprintfEmpty-8 12.6ns ± 3% 12.8ns ± 3% ~ (p=0.136 n=10+10)
SprintfString-8 38.7ns ± 5% 26.9ns ± 6% -30.65% (p=0.000 n=10+10)
SprintfTruncateString-8 56.7ns ± 2% 47.0ns ± 3% -17.05% (p=0.000 n=10+10)
SprintfQuoteString-8 164ns ± 2% 153ns ± 2% -7.01% (p=0.000 n=10+10)
SprintfInt-8 38.9ns ±15% 26.5ns ± 2% -31.93% (p=0.000 n=10+9)
SprintfIntInt-8 60.3ns ± 9% 38.2ns ± 1% -36.67% (p=0.000 n=10+8)
SprintfPrefixedInt-8 58.6ns ±13% 51.2ns ±11% -12.66% (p=0.001 n=10+10)
SprintfFloat-8 71.4ns ± 3% 64.2ns ± 3% -10.08% (p=0.000 n=8+10)
SprintfComplex-8 175ns ± 3% 159ns ± 2% -9.03% (p=0.000 n=10+10)
SprintfBoolean-8 33.5ns ± 4% 25.7ns ± 5% -23.28% (p=0.000 n=10+10)
SprintfHexString-8 65.3ns ± 3% 51.7ns ± 5% -20.86% (p=0.000 n=10+9)
SprintfHexBytes-8 67.2ns ± 5% 67.9ns ± 4% ~ (p=0.383 n=10+10)
SprintfBytes-8 129ns ± 7% 124ns ± 7% ~ (p=0.074 n=9+10)
SprintfStringer-8 127ns ± 4% 126ns ± 8% ~ (p=0.506 n=9+10)
SprintfStructure-8 357ns ± 3% 359ns ± 3% ~ (p=0.469 n=10+10)
ManyArgs-8 203ns ± 6% 126ns ± 3% -37.94% (p=0.000 n=10+10)
FprintInt-8 119ns ±10% 74ns ± 3% -37.54% (p=0.000 n=10+10)
FprintfBytes-8 122ns ± 4% 120ns ± 3% ~ (p=0.124 n=10+10)
FprintIntNoAlloc-8 78.2ns ± 5% 74.1ns ± 3% -5.28% (p=0.000 n=10+10)
ScanInts-8 349µs ± 1% 349µs ± 0% ~ (p=0.606 n=9+8)
ScanRecursiveInt-8 43.8ms ± 7% 40.1ms ± 2% -8.42% (p=0.000 n=10+10)
ScanRecursiveIntReaderWrapper-8 43.5ms ± 4% 40.4ms ± 2% -7.16% (p=0.000 n=10+9)

name old alloc/op new alloc/op delta
SprintfPadding-8 24.0B ± 0% 16.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfEmpty-8 0.00B 0.00B ~ (all equal)
SprintfString-8 21.0B ± 0% 5.0B ± 0% -76.19% (p=0.000 n=10+10)
SprintfTruncateString-8 32.0B ± 0% 16.0B ± 0% -50.00% (p=0.000 n=10+10)
SprintfQuoteString-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfInt-8 16.0B ± 0% 1.0B ± 0% -93.75% (p=0.000 n=10+10)
SprintfIntInt-8 24.0B ± 0% 3.0B ± 0% -87.50% (p=0.000 n=10+10)
SprintfPrefixedInt-8 72.0B ± 0% 64.0B ± 0% -11.11% (p=0.000 n=10+10)
SprintfFloat-8 16.0B ± 0% 8.0B ± 0% -50.00% (p=0.000 n=10+10)
SprintfComplex-8 48.0B ± 0% 32.0B ± 0% -33.33% (p=0.000 n=10+10)
SprintfBoolean-8 8.00B ± 0% 4.00B ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexString-8 96.0B ± 0% 80.0B ± 0% -16.67% (p=0.000 n=10+10)
SprintfHexBytes-8 112B ± 0% 112B ± 0% ~ (all equal)
SprintfBytes-8 96.0B ± 0% 96.0B ± 0% ~ (all equal)
SprintfStringer-8 32.0B ± 0% 32.0B ± 0% ~ (all equal)
SprintfStructure-8 256B ± 0% 256B ± 0% ~ (all equal)
ManyArgs-8 80.0B ± 0% 0.0B -100.00% (p=0.000 n=10+10)
FprintInt-8 8.00B ± 0% 0.00B -100.00% (p=0.000 n=10+10)
FprintfBytes-8 32.0B ± 0% 32.0B ± 0% ~ (all equal)
FprintIntNoAlloc-8 0.00B 0.00B ~ (all equal)
ScanInts-8 15.2kB ± 0% 15.2kB ± 0% ~ (p=0.248 n=9+10)
ScanRecursiveInt-8 21.6kB ± 0% 21.6kB ± 0% ~ (all equal)
ScanRecursiveIntReaderWrapper-8 21.7kB ± 0% 21.7kB ± 0% ~ (all equal)

name old allocs/op new allocs/op delta
SprintfPadding-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfEmpty-8 0.00 0.00 ~ (all equal)
SprintfString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfTruncateString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfQuoteString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfIntInt-8 3.00 ± 0% 1.00 ± 0% -66.67% (p=0.000 n=10+10)
SprintfPrefixedInt-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfFloat-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfComplex-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfBoolean-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexString-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
SprintfHexBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
SprintfBytes-8 2.00 ± 0% 2.00 ± 0% ~ (all equal)
SprintfStringer-8 4.00 ± 0% 4.00 ± 0% ~ (all equal)
SprintfStructure-8 7.00 ± 0% 7.00 ± 0% ~ (all equal)
ManyArgs-8 8.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
FprintInt-8 1.00 ± 0% 0.00 -100.00% (p=0.000 n=10+10)
FprintfBytes-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
FprintIntNoAlloc-8 0.00 0.00 ~ (all equal)
ScanInts-8 1.60k ± 0% 1.60k ± 0% ~ (all equal)
ScanRecursiveInt-8 1.71k ± 0% 1.71k ± 0% ~ (all equal)
ScanRecursiveIntReaderWrapper-8 1.71k ± 0% 1.71k ± 0% ~ (all equal)

Change-Id: I7ba72a25fea4140a0ba40a9f443103ed87cc69b5 Reviewed-on: https://go-review.googlesource.com/35554 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
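The constants-to-interfaces commit above uses fmt.Println(1, 2, 3) as its motivating case. A hedged sketch for checking this on a local toolchain: printConstants is a hypothetical wrapper that writes to io.Discard instead of standard output, and the allocation count it reports varies by Go version, so none is asserted here.

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

// printConstants passes three integer constants through interface
// conversions, like the fmt.Println(1, 2, 3) call in the message.
// It returns the number of bytes written.
func printConstants() int {
	n, _ := fmt.Fprintln(io.Discard, 1, 2, 3)
	return n
}

func main() {
	allocs := testing.AllocsPerRun(1000, func() { printConstants() })
	fmt.Printf("allocations per call: %.0f\n", allocs)
}
```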
2017-01-09cmd/compile: insert scheduling checks on loop backedgesDavid Chase
Loop breaking with a counter. Benchmarked (see comments), eyeball checked for sanity on popular loops. This code ought to handle loops in general, and properly inserts phi functions in cases where the earlier version might not have. Includes test, plus modifications to test/run.go to deal with timeout and killing looping test. Tests broken by the addition of extra code (branch frequency and live vars) for added checks turn the check insertion off. If GOEXPERIMENT=preemptibleloops, the compiler inserts reschedule checks on every backedge of every reducible loop. Alternately, specifying GO_GCFLAGS=-d=ssa/insert_resched_checks/on will enable it for a single compilation, but because the core Go libraries contain some loops that may run long, this is less likely to have the desired effect. This is intended as a tool to help in the study and diagnosis of GC and other latency problems, now that goal STW GC latency is on the order of 100 microseconds or less. Updates #17831. Updates #10958. Change-Id: I6206c163a5b0248e3f21eb4fc65f73a179e1f639 Reviewed-on: https://go-review.googlesource.com/33910 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
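As a source-level analogue of "loop breaking with a counter" from the commit above (not the compiler's actual generated code), the sketch below counts down on each loop backedge and yields when the counter reaches zero. The function sumWithChecks, its interval parameter, and the use of runtime.Gosched as the reschedule point are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// sumWithChecks is a hand-written analogue of a loop with reschedule
// checks. It sums 0..n-1 and yields to the scheduler once every
// interval iterations, returning the sum and the number of yields.
func sumWithChecks(n, interval int) (sum, yields int) {
	countdown := interval
	for i := 0; i < n; i++ {
		sum += i
		countdown--
		if countdown == 0 { // check placed on the loop backedge
			runtime.Gosched()
			yields++
			countdown = interval
		}
	}
	return sum, yields
}

func main() {
	sum, yields := sumWithChecks(10, 3)
	fmt.Println(sum, yields) // prints "45 3"
}
```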
2016-10-31cmd/compile: mark temps with new AutoTemp flag, and use it.David Chase
This is an extension of https://go-review.googlesource.com/c/31662/ to mark all the temporaries, not just the ssa-generated ones. Before-and-after ls -l `go tool -n compile` shows a 3% reduction in size (or rather, a prior 3% inflation for failing to filter temps out properly.) Replaced name-dependent "is it a temp?" tests with calls to *Node.IsAutoTmp(), which depends on AutoTemp. Also replace calls to istemp(n) with n.IsAutoTmp(), to reduce duplication and clean up function name space. Generated temporaries now come with a "." prefix to avoid (apparently harmless) clashes with legal Go variable names. Fixes #17644. Fixes #17240. Change-Id: If1417f29c79a7275d7303ddf859b51472890fd43 Reviewed-on: https://go-review.googlesource.com/32255 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-10-12cmd/compile,runtime: redo how map assignments workKeith Randall
To compile: m[k] = v instead of: mapassign(maptype, m, &k, &v), do: *mapassign(maptype, m, &k) = v mapassign returns a pointer to the value slot in the map. It is just like mapaccess except that it will allocate a new slot if k is not already present in the map. This makes map accesses faster but potentially larger (codewise). It is faster because the write into the map is done when the compiler knows the concrete type, so it can be done with a few store instructions instead of calling typedmemmove. We also potentially avoid stack temporaries to hold v. The code can be larger when the map has pointers in its value type, since there is a write barrier call in addition to the mapassign call. That makes the code at the callsite a bit bigger (go binary is 0.3% bigger). This CL is in preparation for doing operations like m[k] += v with only a single runtime call. That will roughly double the speed of such operations. Update #17133 Update #5147 Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5 Reviewed-on: https://go-review.googlesource.com/30815 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
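The shape of the new calling convention in the map-assignment commit above can be mimicked in user code. In this sketch, toyMap and its assign method are hypothetical stand-ins for the runtime map and mapassign: assign returns a pointer to the value slot, allocating one if the key is absent, and the caller performs the store itself.

```go
package main

import "fmt"

// toyMap is a hypothetical stand-in for a runtime map. Each key owns
// a heap-allocated value slot so that slot pointers stay valid.
type toyMap struct {
	slots map[string]*int
}

// assign returns a pointer to the value slot for k, creating the slot
// if k is not already present.
func (m *toyMap) assign(k string) *int {
	p, ok := m.slots[k]
	if !ok {
		p = new(int)
		m.slots[k] = p
	}
	return p
}

func main() {
	m := &toyMap{slots: map[string]*int{}}
	*m.assign("a") = 1   // analogue of m[k] = v
	*m.assign("a") += 41 // analogue of m[k] += v with a single lookup
	fmt.Println(*m.assign("a")) // prints 42
}
```

Because the store happens at the call site, the caller can use the concrete value type directly, which is the benefit the message attributes to the change.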
2016-10-03cmd/compile: relax liveness restrictions on ambiguously liveThan McIntosh
Update gc liveness to remove special conservative treatment of ambiguously live vars, since there is no longer a need to protect against GCDEBUG=gcdead. Change-Id: Id6e2d03218f7d67911e8436d283005a124e6957f Reviewed-on: https://go-review.googlesource.com/24896 Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
2016-09-22test: errorcheck auto-generated functionsCherry Zhang
Add an "errorcheckwithauto" action which performs error check including lines with auto-generated functions (excluded by default). Comment "// ERRORAUTO" matches these lines. Add testcase for CL 29570 (as an example). Updates #16016, #17186. Change-Id: Iaba3727336cd602f3dda6b9e5f97dafe0848e632 Reviewed-on: https://go-review.googlesource.com/29652 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-09-19cmd/compile: args no longer live until end-of-functionKeith Randall
We're dropping this behavior in favor of runtime.KeepAlive. Implement runtime.KeepAlive as an intrinsic. Update #15843 Change-Id: Ib60225bd30d6770ece1c3c7d1339a06aa25b1cbc Reviewed-on: https://go-review.googlesource.com/28310 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
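With arguments no longer kept live until the end of the function, code that depends on an object staying reachable has to say so explicitly. A minimal sketch of the runtime.KeepAlive pattern follows; the handle type, open, and readFD are hypothetical, and the arithmetic stands in for a real system call on the descriptor.

```go
package main

import (
	"fmt"
	"runtime"
)

// handle is a hypothetical wrapper around a file descriptor whose
// finalizer would release the descriptor.
type handle struct {
	fd     int
	closed bool
}

// open creates a handle and attaches a finalizer to it.
func open(fd int) *handle {
	h := &handle{fd: fd}
	runtime.SetFinalizer(h, func(h *handle) { h.closed = true })
	return h
}

// readFD copies the descriptor out of h and then works only with the
// copy. Without KeepAlive, h could be collected and finalized while
// the descriptor is still in use.
func readFD(h *handle) int {
	fd := h.fd
	result := fd * 2     // stand-in for a system call that uses fd
	runtime.KeepAlive(h) // h stays reachable at least until this point
	return result
}

func main() {
	fmt.Println(readFD(open(21))) // prints 42
}
```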
2016-09-19cmd/compile: inline convT2{I,E} when result doesn't escapeKeith Randall
No point in calling a function when we can build the interface using a known type (or itab) and the address of a local. Get rid of third arg (preallocated stack space) to convT2{I,E}. Makes go binary smaller by 0.2%

benchmark old ns/op new ns/op delta
BenchmarkEfaceInteger-8 16.7 10.1 -39.52%

Update #17118 Update #15375 Change-Id: I9724a1f802bfa1e3957bf1856b55558278e198a2 Reviewed-on: https://go-review.googlesource.com/29373 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-16cmd/compile: turn live variable test off for ppcKeith Randall
ppc64 has an extraneous variable live in some situations. We need a better tighten pass to get rid of this extra variable. I'm working on it, but fix the test in the meantime. Fixes build for ppc64. Change-Id: I1efb9ccb234a64f2a1c228abd2b3195f67fbeb41 Reviewed-on: https://go-review.googlesource.com/29353 Reviewed-by: David Chase <drchase@google.com>
2016-09-15test,cmd/compile: remove _ssa file suffixKeith Randall
Everything is SSA now. Update #16357 Change-Id: I436dbe367b863ee81a3695a7d653ba4bfc5b0f6c Reviewed-on: https://go-review.googlesource.com/29232 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-15test: make SSA tests unconditionalKeith Randall
Delete legacy backend tests, make SSA tests unconditional. Next CL will remove _ssa from the file names. Update #16357 Change-Id: I2a7f5dcbc69455f63b5e6e6b2725df26ab86c8dd Reviewed-on: https://go-review.googlesource.com/29231 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-09-13cmd/compile: add SSA backend for s390x and enable by defaultMichael Munday
The new SSA backend modifies the ABI slightly: R0 is now a usable general purpose register. Fixes #16677. Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf Reviewed-on: https://go-review.googlesource.com/28978 Run-TryBot: Michael Munday <munday@ca.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-09-07cmd/compile: ignore contentEscapes for marking nodes as escapingKeith Randall
Redo of CL 28575 with fixed test. We're in a pre-KeepAlive world for a bit yet; the old tests were in a client which was in a post-KeepAlive world. Change-Id: I114fd630339d761ab3306d1d99718d3cb973678d Reviewed-on: https://go-review.googlesource.com/28582 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-07Revert of cmd/compile: ignore contentEscapes for marking nodes as escapingBrad Fitzpatrick
Reason for revert: broke the build due to cherrypick; relies on an unsubmitted parent CL. Original issue's description: > cmd/compile: ignore contentEscapes for marking nodes as escaping > > We can still stack allocate and VarKill nodes which don't > escape but their content does. > > Fixes #16996 > > Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf > Reviewed-on: https://go-review.googlesource.com/28575 > Run-TryBot: Keith Randall <khr@golang.org> > TryBot-Result: Gobot Gobot <gobot@golang.org> > Reviewed-by: David Chase <drchase@google.com> > Change-Id: Ie1a325209de14d70af6acb2d78269b7a0450da7a Reviewed-on: https://go-review.googlesource.com/28578 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-09-07cmd/compile: ignore contentEscapes for marking nodes as escapingKeith Randall
We can still stack allocate and VarKill nodes which don't escape but their content does. Fixes #16996 Change-Id: If8aa0fcf2c327b4cb880a3d5af8d213289e6f6bf Reviewed-on: https://go-review.googlesource.com/28575 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-08-26cmd/compile: add MIPS64 optimizations, SSA on by defaultCherry Zhang
Add the following optimizations:
- fold constants
- fold address into load/store
- simplify extensions and conditional branches
- remove nil checks
Turn on SSA on MIPS64 by default, and toggle the tests. Fixes #16359. Change-Id: I7f1e38c2509e22e42cd024e712990ebbe47176bd Reviewed-on: https://go-review.googlesource.com/27870 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-18cmd/compile: ppc64le working, not optimized enoughDavid Chase
This time with the cherry-pick from the proper patch of the old CL. Stack size increased. Corrected NaN-comparison glitches. Marked g register as clobbered by calls. Fixed shared libraries. live_ssa.go still disabled because of differences. Presumably turning on more optimization will fix both the stack size and the live_ssa.go glitches. Enhanced debugging output for shared libs test. Rebased onto master. Updates #16010. Change-Id: I40864faf1ef32c118fb141b7ef8e854498e6b2c4 Reviewed-on: https://go-review.googlesource.com/27159 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-08-15[dev.ssa] cmd/compile, etc.: more ARM64 optimizations, and enable SSA by defaultCherry Zhang
Add more ARM64 optimizations:
- use the hardware zero register when possible.
- use shifted ops. The assembler supports shifted ops, but they are not documented and it does not know how to print them. This CL adds them.
- enable fast division. This was disabled because it makes the old backend generate slower code, but with SSA it generates faster code.
Turn on SSA by default, also adjust tests. Change-Id: I7794479954c83bb65008dcb457bc1e21d7496da6 Reviewed-on: https://go-review.googlesource.com/26950 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2016-08-10[dev.ssa] cmd/compile: implement GO386=387Keith Randall
Last part of the 386 SSA port. Modify the x86 backend to simulate SSE registers and instructions with 387 registers and instructions. The simulation isn't terribly performant, but it works, and the old implementation wasn't very performant either. Leaving to people who care about 387 to optimize if they want. Turn on SSA backend for 386 by default. Fixes #16358 Change-Id: I678fb59132620b2c47e993c1c10c4c21135f70c0 Reviewed-on: https://go-review.googlesource.com/25271 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2016-08-09[dev.ssa] cmd/compile: port SSA backend to amd64p32Keith Randall
It's not a new backend, just a PtrSize==4 modification of the existing AMD64 backend. Change-Id: Icc63521a5cf4ebb379f7430ef3f070894c09afda Reviewed-on: https://go-review.googlesource.com/25586 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>