From 5dd978a283ca445f8b5f255773b3904497365b61 Mon Sep 17 00:00:00 2001 From: Austin Clements Date: Mon, 12 Dec 2016 13:56:11 -0500 Subject: runtime: expand HACKING.md This adds high-level descriptions of the scheduler structures, the user and system stacks, error handling, and synchronization. Change-Id: I1eed97c6dd4a6e3d351279e967b11c6e64898356 Reviewed-on: https://go-review.googlesource.com/34290 Reviewed-by: Rick Hudson --- src/runtime/HACKING.md | 139 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 136 insertions(+), 3 deletions(-) diff --git a/src/runtime/HACKING.md b/src/runtime/HACKING.md index 88fb708c7e..ea7c5c128d 100644 --- a/src/runtime/HACKING.md +++ b/src/runtime/HACKING.md @@ -1,6 +1,139 @@ -This is a very incomplete and probably out-of-date guide to -programming in the Go runtime and how it differs from writing normal -Go. +This is a living document and at times it will be out of date. It is +intended to articulate how programming in the Go runtime differs from +writing normal Go. It focuses on pervasive concepts rather than +details of particular interfaces. + +Scheduler structures +==================== + +The scheduler manages three types of resources that pervade the +runtime: Gs, Ms, and Ps. It's important to understand these even if +you're not working on the scheduler. + +Gs, Ms, Ps +---------- + +A "G" is simply a goroutine. It's represented by type `g`. When a +goroutine exits, its `g` object is returned to a pool of free `g`s and +can later be reused for some other goroutine. + +An "M" is an OS thread that can be executing user Go code, runtime +code, a system call, or be idle. It's represented by type `m`. There +can be any number of Ms at a time since any number of threads may be +blocked in system calls. + +Finally, a "P" represents the resources required to execute user Go +code, such as scheduler and memory allocator state. It's represented +by type `p`. There are exactly `GOMAXPROCS` Ps. A P can be thought of +like a CPU in the OS scheduler and the contents of the `p` type like +per-CPU state. This is a good place to put state that needs to be +sharded for efficiency, but doesn't need to be per-thread or +per-goroutine. + +The scheduler's job is to match up a G (the code to execute), an M +(where to execute it), and a P (the rights and resources to execute +it). When an M stops executing user Go code, for example by entering a +system call, it returns its P to the idle P pool. In order to resume +executing user Go code, for example on return from a system call, it +must acquire a P from the idle pool. + +All `g`, `m`, and `p` objects are heap allocated, but are never freed, +so their memory remains type stable. As a result, the runtime can +avoid write barriers in the depths of the scheduler. + +User stacks and system stacks +----------------------------- + +Every non-dead G has a *user stack* associated with it, which is what +user Go code executes on. User stacks start small (e.g., 2K) and grow +or shrink dynamically. + +Every M has a *system stack* associated with it (also known as the M's +"g0" stack because it's implemented as a stub G) and, on Unix +platforms, a *signal stack* (also known as the M's "gsignal" stack). +System and signal stacks cannot grow, but are large enough to execute +runtime and cgo code (8K in a pure Go binary; system-allocated in a +cgo binary). + +Runtime code often temporarily switches to the system stack using +`systemstack`, `mcall`, or `asmcgocall` to perform tasks that must not +be preempted, that must not grow the user stack, or that switch user +goroutines. Code running on the system stack is implicitly +non-preemptible and the garbage collector does not scan system stacks. +While running on the system stack, the current user stack is not used +for execution. + +`getg()` and `getg().m.curg` +---------------------------- + +To get the current user `g`, use `getg().m.curg`. + +`getg()` alone returns the current `g`, but when executing on the +system or signal stacks, this will return the current M's "g0" or +"gsignal", respectively. This is usually not what you want. + +To determine if you're running on the user stack or the system stack, +use `getg() == getg().m.curg`. + +Error handling and reporting +============================ + +Errors that can reasonably be recovered from in user code should use +`panic` like usual. However, there are some situations where `panic` +will cause an immediate fatal error, such as when called on the system +stack or when called during `mallocgc`. + +Most errors in the runtime are not recoverable. For these, use +`throw`, which dumps the traceback and immediately terminates the +process. In general, `throw` should be passed a string constant to +avoid allocating in perilous situations. By convention, additional +details are printed before `throw` using `print` or `println` and the +messages are prefixed with "runtime:". + +For runtime error debugging, it's useful to run with +`GOTRACEBACK=system` or `GOTRACEBACK=crash`. + +Synchronization +=============== + +The runtime has multiple synchronization mechanisms. They differ in +semantics and, in particular, in whether they interact with the +goroutine scheduler or the OS scheduler. + +The simplest is `mutex`, which is manipulated using `lock` and +`unlock`. This should be used to protect shared structures for short +periods. Blocking on a `mutex` directly blocks the M, without +interacting with the Go scheduler. This means it is safe to use from +the lowest levels of the runtime, but also prevents any associated G +and P from being rescheduled. + +For one-shot notifications, use `note`, which provides `notesleep` and +`notewakeup`. Unlike traditional UNIX `sleep`/`wakeup`, `note`s are +race-free, so `notesleep` returns immediately if the `notewakeup` has +already happened. A `note` can be reset after use with `noteclear`, +which must not race with a sleep or wakeup. Like `mutex`, blocking on +a `note` blocks the M. However, there are different ways to sleep on a +`note`:`notesleep` also prevents rescheduling of any associated G and +P, while `notetsleepg` acts like a blocking system call that allows +the P to be reused to run another G. This is still less efficient than +blocking the G directly since it consumes an M. + +To interact directly with the goroutine scheduler, use `gopark` and +`goready`. `gopark` parks the current goroutine—putting it in the +"waiting" state and removing it from the scheduler's run queue—and +schedules another goroutine on the current M/P. `goready` puts a +parked goroutine back in the "runnable" state and adds it to the run +queue. + +In summary, + + + + + + + +
Blocks
InterfaceGMP
mutexYYY
noteYYY/N
parkYNN
Unmanaged memory ================ -- cgit v1.2.3-54-g00ecf