field | value | date |
---|---|---|
author | Cherry Zhang <cherryyz@google.com> | 2020-02-11 18:54:30 -0500 |
committer | Cherry Zhang <cherryyz@google.com> | 2020-02-13 19:41:53 +0000 |
commit | 123f7dd3e1eb90825ece57b8dde39438ca34f150 | |
tree | 87c73b63b74568419b164d64b4f5509ffaad9aae /src/runtime/preempt_amd64.s | |
parent | a0c9fb6bd359f111f19176ff176244b91a3e7eaa | |
runtime: zero upper bit of Y registers in asyncPreempt on darwin/amd64
Apparently, the signal handling code path in the darwin kernel leaves
the upper bits of the Y registers in a dirty state, which causes many
SSE operations (128-bit and narrower) to become much slower. Clear
the upper bits to get to a clean state.

We do it at the entry of asyncPreempt, which runs immediately
after exiting the kernel's signal handling code, if we actually
injected a call. It does not cover other exits where we don't
inject a call, e.g. failed preemption, profiling signals, or
other async signals. But it does cover an important use case of
async signals, preempting a tight numerical loop, which we
introduced in this cycle.

Running the benchmark in issue #37174:

    name    old time/op  new time/op  delta
    Fast-8  90.0ns ± 1%  46.8ns ± 3%  -47.97%  (p=0.000 n=10+10)
    Slow-8   188ns ± 5%    49ns ± 1%  -73.82%  (p=0.000 n=10+9)

There is no more slowdown due to preemption signals.
For #37174.
Change-Id: I8b83d083fade1cabbda09b4bc25ccbadafaf7605
Reviewed-on: https://go-review.googlesource.com/c/go/+/219131
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Diffstat (limited to 'src/runtime/preempt_amd64.s')
-rw-r--r-- | src/runtime/preempt_amd64.s | 3 |
1 files changed, 3 insertions, 0 deletions
    diff --git a/src/runtime/preempt_amd64.s b/src/runtime/preempt_amd64.s
    index d50c2f3a51..0f2fd7d8dd 100644
    --- a/src/runtime/preempt_amd64.s
    +++ b/src/runtime/preempt_amd64.s
    @@ -4,6 +4,9 @@
     #include "textflag.h"
     
     TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0
    +	#ifdef GOOS_darwin
    +	VZEROUPPER
    +	#endif
     	PUSHQ BP
     	MOVQ SP, BP
     	// Save flags before clobbering them