runtime: skip work recheck for non-spinning Ms

When an M transitions from spinning to non-spinning state, it must recheck most sources of work to avoid missing work submitted between its initial check and decrementing sched.nmspinning (see "delicate dance" comment). Ever since the scheduler rewrite in Go 1.1 (golang.org/cl/7314062), we have performed this recheck on all Ms before stopping, regardless of whether or not they were spinning. Unfortunately, there is a problem with this approach: non-spinning Ms are not eligible to steal work (note the skip over the stealWork block), but can detect work during the recheck. If there is work available, this non-spinning M will jump to top, skip stealing, land in recheck again, and repeat. i.e., it will spin uselessly. The spin is bounded. This can only occur if there is another spinning M, which will either take the work, allowing this M to stop, or take some other work, allowing this M to upgrade to spinning. But the spinning is ultimately just a fancy spin-wait. golang.org/issue/43997 discusses several ways to address this. This CL takes the simplest approach: skipping the recheck on non-spinning Ms and allowing them to go to stop. Results for scheduler-relevant runtime and time benchmarks can be found at https://perf.golang.org/search?q=upload:20210420.5. The new BenchmarkCreateGoroutinesSingle is a characteristic example workload that hits this issue hard. A single M readies lots of work without itself parking. Other Ms must spin to steal work, which is very short-lived, forcing those Ms to spin again. Some of the Ms will be non-spinning and hit the above bug. With this fixed, that benchmark drops in CPU usage by a massive 68%, and wall time 24%. BenchmarkNetpollBreak shows similar drops because it is unintentionally almost the same benchmark (create short-living Gs in a loop). Typical well-behaved programs show little change. We also measure scheduling latency (time from goready to execute). Note that many of these benchmarks are very noisy because they don't involve much scheduling. Those that do, like CreateGoroutinesSingle, are expected to increase as we are replacing unintentional spin waiting with a real park. Fixes #43997 Change-Id: Ie1d1e1800f393cee1792455412caaa5865d13562 Reviewed-on: https://go-review.googlesource.com/c/go/+/310850 Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
author: Michael Pratt <mpratt@google.com> 2021-04-07 12:01:44 -0400
committer: Michael Pratt <mpratt@google.com> 2021-04-22 20:45:37 +0000
commit: ecfce58965da6017e02f5fc5c03eda52fc41c8d6 (patch)
tree: 6a69a25d0204587919d739ad26816fcb9db5aa1b /src/runtime/proc.go
parent: b6ff3c69d5fdf933f5265f95ae4bb12eaecc792f (diff)
download: go-ecfce58965da6017e02f5fc5c03eda52fc41c8d6.tar.gz
go-ecfce58965da6017e02f5fc5c03eda52fc41c8d6.zip
1 files changed, 30 insertions, 28 deletions
diff --git a/src/runtime/proc.go b/src/runtime/proc.go
index 5c7328aacc..37c051634c 100644
--- a/src/runtime/proc.go
+++ b/src/runtime/proc.go
@@ -2841,44 +2841,46 @@ top:
 		if int32(atomic.Xadd(&sched.nmspinning, -1)) < 0 {
 			throw("findrunnable: negative nmspinning")
 		}
-	}
 
-	// Check all runqueues once again.
-	_p_ = checkRunqsNoP(allpSnapshot, idlepMaskSnapshot)
-	if _p_ != nil {
-		acquirep(_p_)
-		if wasSpinning {
+		// Note the for correctness, only the last M transitioning from
+		// spinning to non-spinning must perform these rechecks to
+		// ensure no missed work. We are performing it on every M that
+		// transitions as a conservative change to monitor effects on
+		// latency. See golang.org/issue/43997.
+
+		// Check all runqueues once again.
+		_p_ = checkRunqsNoP(allpSnapshot, idlepMaskSnapshot)
+		if _p_ != nil {
+			acquirep(_p_)
 			_g_.m.spinning = true
 			atomic.Xadd(&sched.nmspinning, 1)
+			goto top
 		}
-		goto top
-	}
 
-	// Check for idle-priority GC work again.
-	_p_, gp = checkIdleGCNoP()
-	if _p_ != nil {
-		acquirep(_p_)
-		if wasSpinning {
+		// Check for idle-priority GC work again.
+		_p_, gp = checkIdleGCNoP()
+		if _p_ != nil {
+			acquirep(_p_)
 			_g_.m.spinning = true
 			atomic.Xadd(&sched.nmspinning, 1)
-		}
 
-		// Run the idle worker.
-		_p_.gcMarkWorkerMode = gcMarkWorkerIdleMode
-		casgstatus(gp, _Gwaiting, _Grunnable)
-		if trace.enabled {
-			traceGoUnpark(gp, 0)
+			// Run the idle worker.
+			_p_.gcMarkWorkerMode = gcMarkWorkerIdleMode
+			casgstatus(gp, _Gwaiting, _Grunnable)
+			if trace.enabled {
+				traceGoUnpark(gp, 0)
+			}
+			return gp, false
 		}
-		return gp, false
-	}
 
-	// Finally, check for timer creation or expiry concurrently with
-	// transitioning from spinning to non-spinning.
-	//
-	// Note that we cannot use checkTimers here because it calls
-	// adjusttimers which may need to allocate memory, and that isn't
-	// allowed when we don't have an active P.
-	pollUntil = checkTimersNoP(allpSnapshot, timerpMaskSnapshot, pollUntil)
+		// Finally, check for timer creation or expiry concurrently with
+		// transitioning from spinning to non-spinning.
+		//
+		// Note that we cannot use checkTimers here because it calls
+		// adjusttimers which may need to allocate memory, and that isn't
+		// allowed when we don't have an active P.
+		pollUntil = checkTimersNoP(allpSnapshot, timerpMaskSnapshot, pollUntil)
+	}
 
 	// Poll network until next timer.
 	if netpollinited() && (atomic.Load(&netpollWaiters) > 0 || pollUntil != 0) && atomic.Xchg64(&sched.lastpoll, 0) != 0 {
author	Michael Pratt <mpratt@google.com>	2021-04-07 12:01:44 -0400
committer	Michael Pratt <mpratt@google.com>	2021-04-22 20:45:37 +0000
commit	ecfce58965da6017e02f5fc5c03eda52fc41c8d6 (patch)
tree	6a69a25d0204587919d739ad26816fcb9db5aa1b /src/runtime/proc.go
parent	b6ff3c69d5fdf933f5265f95ae4bb12eaecc792f (diff)
download	go-ecfce58965da6017e02f5fc5c03eda52fc41c8d6.tar.gz go-ecfce58965da6017e02f5fc5c03eda52fc41c8d6.zip