This is a living document and at times it will be out of date. It is intended to articulate how programming in the Go runtime differs from writing normal Go. It focuses on pervasive concepts rather than details of particular interfaces.
The scheduler manages three types of resources that pervade the runtime: Gs, Ms, and Ps. It's important to understand these even if you're not working on the scheduler.
A "G" is simply a goroutine. It's represented by type g. When a goroutine exits, its g object is returned to a pool of free gs and can later be reused for some other goroutine.
An "M" is an OS thread that can be executing user Go code, runtime code, a system call, or be idle. It's represented by type m. There can be any number of Ms at a time since any number of threads may be blocked in system calls.
Finally, a "P" represents the resources required to execute user Go code, such as scheduler and memory allocator state. It's represented by type p. There are exactly GOMAXPROCS Ps. A P can be thought of like a CPU in the OS scheduler and the contents of the p type like per-CPU state. This is a good place to put state that needs to be sharded for efficiency, but doesn't need to be per-thread or per-goroutine.
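As a rough user-level illustration of sharding state the way per-P state is sharded, the sketch below sizes a counter by GOMAXPROCS. The names shard, shardedCounter, and the caller-supplied id are hypothetical: user code cannot ask which P it is running on, so the id merely stands in for the P index the runtime would use. The 64-byte cache line is also an assumption.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// shard holds one slice of the counter. The padding keeps shards on
// separate cache lines (64 bytes is an assumption about the hardware).
type shard struct {
	n atomic.Int64
	_ [56]byte
}

// shardedCounter is a hypothetical user-level analogue of per-P state.
type shardedCounter struct {
	shards []shard
}

// newShardedCounter creates one shard per P by querying GOMAXPROCS.
func newShardedCounter() *shardedCounter {
	return &shardedCounter{shards: make([]shard, runtime.GOMAXPROCS(0))}
}

// add increments the shard selected by id (which must be non-negative).
// The runtime would index by the current P; user code cannot, so the
// caller supplies an id.
func (c *shardedCounter) add(id int, delta int64) {
	c.shards[id%len(c.shards)].n.Add(delta)
}

// total sums all shards.
func (c *shardedCounter) total() int64 {
	var sum int64
	for i := range c.shards {
		sum += c.shards[i].n.Load()
	}
	return sum
}

func main() {
	c := newShardedCounter()
	for id := 0; id < 8; id++ {
		c.add(id, 1)
	}
	fmt.Println(c.total())
}
```

Updates touch only one shard, while reads pay the cost of visiting every shard, which is the usual trade-off for sharded state.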
The scheduler's job is to match up a G (the code to execute), an M (where to execute it), and a P (the rights and resources to execute it). When an M stops executing user Go code, for example by entering a system call, it returns its P to the idle P pool. In order to resume executing user Go code, for example on return from a system call, it must acquire a P from the idle pool.
All g, m, and p objects are heap allocated, but are never freed, so their memory remains type stable. As a result, the runtime can avoid write barriers in the depths of the scheduler.
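The idea of returning objects to a free pool instead of freeing them can be sketched with a toy free list. This is a simplified, single-threaded model; gobj and freeList are hypothetical names and do not reflect the runtime's actual data structures.

```go
package main

import "fmt"

// gobj is a hypothetical stand-in for the runtime's g type.
type gobj struct {
	id   int
	next *gobj // link used only while the object sits on the free list
}

// freeList is a toy pool: objects are returned and reused, never released,
// so memory that once held a gobj always holds a gobj.
type freeList struct {
	head  *gobj
	alloc int // number of objects ever allocated
}

// get reuses a free object if one exists and allocates otherwise.
func (f *freeList) get(id int) *gobj {
	gp := f.head
	if gp == nil {
		gp = new(gobj)
		f.alloc++
	} else {
		f.head = gp.next
		gp.next = nil
	}
	gp.id = id
	return gp
}

// put returns an object to the pool for later reuse.
func (f *freeList) put(gp *gobj) {
	gp.next = f.head
	f.head = gp
}

func main() {
	var f freeList
	a := f.get(1)
	f.put(a)
	b := f.get(2)
	fmt.Println(a == b, f.alloc) // the same object is reused
}
```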
To get the current user g, use getg().m.curg.
getg() alone returns the current g, but when executing on the system or signal stacks, this will return the current M's "g0" or "gsignal", respectively. This is usually not what you want.
To determine if you're running on the user stack or the system stack, use getg() == getg().m.curg.
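Since getg is only available inside the runtime, the relationship between the three per-M Gs can only be modeled from outside. The following toy model uses hypothetical, heavily simplified g and m types, and onUserStack takes a value standing in for the result of getg().

```go
package main

import "fmt"

// g and m are simplified, hypothetical mirrors of the runtime types.
type g struct {
	name string
	m    *m
}

type m struct {
	g0      *g // stub G for the system stack
	gsignal *g // stub G for the signal stack
	curg    *g // current user goroutine
}

// onUserStack applies the check from the text to gp, which stands in
// for the value getg() would return.
func onUserStack(gp *g) bool {
	return gp == gp.m.curg
}

func main() {
	mp := &m{}
	mp.g0 = &g{name: "g0", m: mp}
	mp.gsignal = &g{name: "gsignal", m: mp}
	mp.curg = &g{name: "user", m: mp}

	for _, gp := range []*g{mp.curg, mp.g0, mp.gsignal} {
		fmt.Println(gp.name, onUserStack(gp))
	}
}
```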
Every non-dead G has a user stack associated with it, which is what user Go code executes on. User stacks start small (e.g., 2K) and grow or shrink dynamically.
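Dynamic growth of user stacks can be observed from ordinary Go code. In this small sketch, depth is a hypothetical helper that recurses far deeper than a small initial stack could hold, and it completes only because the stack grows on demand.

```go
package main

import "fmt"

// depth recurses n times (n must be non-negative). Each frame carries a
// small buffer, so the goroutine's stack must grow well past its
// initial size.
func depth(n int) int {
	var pad [128]byte // enlarges each frame
	pad[n%len(pad)] = 1
	if n == 0 {
		return 0
	}
	return depth(n-1) + int(pad[n%len(pad)])
}

func main() {
	done := make(chan int)
	// Run on a fresh goroutine, which starts with a small stack.
	go func() { done <- depth(100000) }()
	fmt.Println(<-done)
}
```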
Every M has a system stack associated with it (also known as the M's "g0" stack because it's implemented as a stub G) and, on Unix platforms, a signal stack (also known as the M's "gsignal" stack). System and signal stacks cannot grow, but are large enough to execute runtime and cgo code (8K in a pure Go binary; system-allocated in a cgo binary).
Runtime code often temporarily switches to the system stack using systemstack, mcall, or asmcgocall to perform tasks that must not be preempted, that must not grow the user stack, or that switch user goroutines. Code running on the system stack is implicitly non-preemptible and the garbage collector does not scan system stacks. While running on the system stack, the current user stack is not used for execution.
Most functions start with a prologue that inspects the stack pointer and the current G's stack bound and calls morestack if the stack needs to grow.
Functions can be marked //go:nosplit (or NOSPLIT in assembly) to indicate that they should not get this prologue. This has several uses:
- Functions that must run on the user stack, but must not call into stack growth, for example because this would cause a deadlock, or because they have untyped words on the stack.
- Functions that must not be preempted on entry.
- Functions that may run without a valid G. For example, functions that run in early runtime start-up, or that may be entered from C code such as cgo callbacks or the signal handler.
Splittable functions ensure there's some amount of space on the stack for nosplit functions to run in and the linker checks that any static chain of nosplit function calls cannot exceed this bound.
Any function with a //go:nosplit annotation should explain why it is nosplit in its documentation comment.
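A minimal sketch of the commenting convention follows. The function add is hypothetical, and the stated reason is purely illustrative; real runtime code would give one of the reasons listed above.

```go
package main

import "fmt"

// add returns a + b.
//
// It is marked nosplit only to illustrate the annotation: as a tiny leaf
// function it needs almost no stack, so omitting the stack-check
// prologue is safe. Runtime code would state its real reason here, such
// as running without a valid G.
//
//go:nosplit
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(2, 3))
}
```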
Errors that can reasonably be recovered from in user code should use panic like usual. However, there are some situations where panic will cause an immediate fatal error, such as when called on the system stack or when called during mallocgc.
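From the user's side, a recoverable runtime panic looks like the sketch below. safeDiv is a hypothetical helper that turns a runtime panic (here, integer division by zero) into an ordinary error via recover and the runtime.Error interface.

```go
package main

import (
	"fmt"
	"runtime"
)

// safeDiv converts a runtime panic, such as integer division by zero,
// into an ordinary error. Panics that are not runtime errors are re-raised.
func safeDiv(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(runtime.Error)
			if !ok {
				panic(r)
			}
			err = re
		}
	}()
	return a / b, nil
}

func main() {
	fmt.Println(safeDiv(6, 3))
	fmt.Println(safeDiv(1, 0))
}
```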
Most errors in the runtime are not recoverable. For these, use throw, which dumps the traceback and immediately terminates the process. In general, throw should be passed a string constant to avoid allocating in perilous situations. By convention, additional details are printed before throw using print or println and the messages are prefixed with "runtime:".
For unrecoverable errors where user code is expected to be at fault for the failure (such as racing map writes), use fatal.
For runtime error debugging, it may be useful to run with GOTRACEBACK=system or GOTRACEBACK=crash. The output of panic and fatal is as described by GOTRACEBACK. The output of throw always includes runtime frames, metadata and all goroutines regardless of GOTRACEBACK (i.e., equivalent to GOTRACEBACK=system). Whether throw crashes or not is still controlled by GOTRACEBACK.
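When setting the environment variable is inconvenient, the traceback level can also be raised from within a program using runtime/debug.SetTraceback, which accepts the same values as GOTRACEBACK. In the sketch below, enableSystemTraceback is a hypothetical wrapper. Note that SetTraceback ignores a level lower than the one set in the environment.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// enableSystemTraceback raises the traceback detail for the rest of the
// process, as if it had been started with GOTRACEBACK=system.
func enableSystemTraceback() {
	debug.SetTraceback("system")
}

func main() {
	enableSystemTraceback()
	fmt.Println("traceback level raised")
}
```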
The runtime has multiple synchronization mechanisms. They differ in semantics and, in particular, in whether they interact with the goroutine scheduler or the OS scheduler.
The simplest is mutex, which is manipulated using lock and unlock. This should be used to protect shared structures for short periods. Blocking on a mutex directly blocks the M, without interacting with the Go scheduler. This means it is safe to use from the lowest levels of the runtime, but also prevents any associated G and P from being rescheduled. rwmutex is similar.
For one-shot notifications, use note, which provides notesleep and notewakeup. Unlike traditional UNIX sleep/wakeup, notes are race-free, so notesleep returns immediately if the notewakeup has already happened. A note can be reset after use with noteclear, which must not race with a sleep or wakeup. Like mutex, blocking on a note blocks the M. However, there are different ways to sleep on a note: notesleep also prevents rescheduling of any associated G and P, while notetsleepg acts like a blocking system call that allows the P to be reused to run another G. This is still less efficient than blocking the G directly since it consumes an M.
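The race-free, one-shot semantics can be mimicked in user code with a channel. This toy note type is hypothetical and differs from the runtime's in an important way: sleeping here blocks only the goroutine, not the M.

```go
package main

import "fmt"

// note is a toy one-shot notification built on a channel.
type note struct {
	ch chan struct{}
}

func newNote() *note { return &note{ch: make(chan struct{})} }

// wakeup signals the note. It must be called at most once between clears.
func (n *note) wakeup() { close(n.ch) }

// sleep blocks until wakeup has been called, and returns at once if the
// wakeup already happened.
func (n *note) sleep() { <-n.ch }

// clear resets the note for reuse. It must not race with sleep or wakeup.
func (n *note) clear() { n.ch = make(chan struct{}) }

func main() {
	n := newNote()
	n.wakeup() // the wakeup happens first
	n.sleep()  // returns immediately: the wakeup is not lost

	n.clear()
	go n.wakeup()
	n.sleep()
	fmt.Println("done")
}
```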
To interact directly with the goroutine scheduler, use gopark and goready. gopark parks the current goroutine (putting it in the "waiting" state and removing it from the scheduler's run queue) and schedules another goroutine on the current M/P. goready puts a parked goroutine back in the "runnable" state and adds it to the run queue.
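The state transitions described above can be pictured with a toy model. Everything here (the g and sched types, the status values, and the park and ready methods) is hypothetical and far simpler than the real scheduler. It tracks a single run queue and a single running goroutine.

```go
package main

import "fmt"

type status int

const (
	runnable status = iota
	running
	waiting
)

// g and sched are hypothetical, simplified models.
type g struct {
	id     int
	status status
}

type sched struct {
	runq []*g // runnable goroutines, in FIFO order
	curg *g   // goroutine currently running
}

// park moves the current goroutine to the waiting state and switches to
// the next runnable goroutine, if any. It returns the parked goroutine.
func (s *sched) park() *g {
	parked := s.curg
	parked.status = waiting
	s.curg = nil
	if len(s.runq) > 0 {
		s.curg = s.runq[0]
		s.runq = s.runq[1:]
		s.curg.status = running
	}
	return parked
}

// ready marks a parked goroutine runnable and appends it to the run queue.
func (s *sched) ready(gp *g) {
	gp.status = runnable
	s.runq = append(s.runq, gp)
}

func main() {
	g1 := &g{id: 1, status: running}
	g2 := &g{id: 2, status: runnable}
	s := &sched{curg: g1, runq: []*g{g2}}

	parked := s.park()
	fmt.Println(parked.id, parked.status == waiting, s.curg.id)
	s.ready(parked)
	fmt.Println(len(s.runq), s.runq[0].id)
}
```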
In summary,
Interface | Blocks G | Blocks M | Blocks P |
---|---|---|---|
(rw)mutex | Y | Y | Y |
note | Y | Y | Y/N |
park | Y | N | N |
The runtime uses its own atomics package at internal/runtime/atomic. This corresponds to sync/atomic, but functions have different names for historical reasons and there are a few additional functions needed by the runtime.
In general, we think hard about the uses of atomics in the runtime and try to avoid unnecessary atomic operations. If access to a variable is sometimes protected by another synchronization mechanism, the already-protected accesses generally don't need to be atomic. There are several reasons for this:
- Using non-atomic or atomic access where appropriate makes the code more self-documenting. Atomic access to a variable implies there's somewhere else that may concurrently access the variable.
- Non-atomic access allows for automatic race detection. The runtime doesn't currently have a race detector, but it may in the future. Atomic access defeats the race detector, while non-atomic access allows the race detector to check your assumptions.
- Non-atomic access may improve performance.
Of course, any non-atomic access to a shared variable should be documented to explain how that access is protected.
Some common patterns that mix atomic and non-atomic access are:
- Read-mostly variables where updates are protected by a lock. Within the locked region, reads do not need to be atomic, but the write does. Outside the locked region, reads need to be atomic.
- Reads that only happen during STW, where no writes can happen during STW, do not need to be atomic.
That said, the advice from the Go memory model stands: "Don't be [too] clever." The performance of the runtime matters, but its robustness matters more.
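The read-mostly pattern (lock-protected updates with lock-free reads) might look roughly like this in user code. The sketch uses sync/atomic as a stand-in for the runtime's internal atomics package, and the config type with its raise and get methods is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// config is a hypothetical read-mostly value.
type config struct {
	// limit is written atomically with mu held and read atomically
	// without mu. Reads made while holding mu may be plain.
	limit int64
	mu    sync.Mutex
}

// raise increases limit to at least to. mu serializes writers, so the
// read inside the critical section is non-atomic. The write is atomic
// because lock-free readers may run concurrently.
func (c *config) raise(to int64) {
	c.mu.Lock()
	if c.limit < to { // protected by mu: plain read
		atomic.StoreInt64(&c.limit, to)
	}
	c.mu.Unlock()
}

// get reads limit without taking mu, so the read must be atomic.
func (c *config) get() int64 {
	return atomic.LoadInt64(&c.limit)
}

func main() {
	var c config
	var wg sync.WaitGroup
	for i := int64(1); i <= 4; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			c.raise(v * 10)
			_ = c.get()
		}(i)
	}
	wg.Wait()
	fmt.Println(c.get())
}
```

The comment on limit documents how each kind of access is protected, which is the documentation the text asks for on any non-atomic access to a shared variable.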
In general, the runtime tries to use regular heap allocation. However, in some cases the runtime must allocate objects outside of the garbage collected heap, in unmanaged memory. This is necessary if the objects are part of the memory manager itself or if they must be allocated in situations where the caller may not have a P.
There are three mechanisms for allocating unmanaged memory:
- sysAlloc obtains memory directly from the OS. This comes in whole multiples of the system page size, but it can be freed with sysFree.
- persistentalloc combines multiple smaller allocations into a single sysAlloc to avoid fragmentation. However, there is no way to free persistentalloced objects (hence the name).
- fixalloc is a SLAB-style allocator that allocates objects of a fixed size. fixalloced objects can be freed, but this memory can only be reused by the same fixalloc pool, so it can only be reused for objects of the same type.
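To make the persistentalloc idea concrete, here is a toy bump allocator that carves small allocations out of larger chunks and offers no way to free them. It is only a sketch: bumpAlloc and chunkSize are hypothetical, ordinary heap slices stand in for memory obtained from sysAlloc, and alignment is relative to the start of the chunk.

```go
package main

import "fmt"

const chunkSize = 1 << 16 // size of each backing chunk (arbitrary here)

// bumpAlloc is a toy model of a persistentalloc-style allocator.
type bumpAlloc struct {
	chunk  []byte // current chunk, standing in for memory from sysAlloc
	off    int    // next free offset in chunk
	chunks int    // number of chunks obtained so far
}

// alloc returns size bytes at an offset aligned to align, which must be
// a power of two. There is deliberately no corresponding free.
func (b *bumpAlloc) alloc(size, align int) []byte {
	if size > chunkSize {
		panic("bumpAlloc: size exceeds chunk size")
	}
	off := (b.off + align - 1) &^ (align - 1)
	if b.chunk == nil || off+size > len(b.chunk) {
		// The current chunk is exhausted: start a new one.
		b.chunk = make([]byte, chunkSize)
		b.chunks++
		off = 0
	}
	b.off = off + size
	return b.chunk[off:b.off:b.off]
}

func main() {
	var b bumpAlloc
	x := b.alloc(10, 8)
	y := b.alloc(4, 8)
	fmt.Println(len(x), len(y), b.off, b.chunks)
}
```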
In general, types that are allocated using any of these should be marked as not in heap by embedding internal/runtime/sys.NotInHeap.
Objects that are allocated in unmanaged memory must not contain heap pointers unless the following rules are also obeyed:
- Any pointers from unmanaged memory to the heap must be garbage collection roots. More specifically, any pointer must either be accessible through a global variable or be added as an explicit garbage collection root in runtime.markroot.
- If the memory is reused, the heap pointers must be zero-initialized before they become visible as GC roots. Otherwise, the GC may observe stale heap pointers. See "Zero-initialization versus zeroing".
There are two types of zeroing in the runtime, depending on whether the memory is already initialized to a type-safe state.
If memory is not in a type-safe state, meaning it potentially contains "garbage" because it was just allocated and it is being initialized for first use, then it must be zero-initialized using memclrNoHeapPointers or non-pointer writes. This does not perform write barriers.
If memory is already in a type-safe state and is simply being set to the zero value, this must be done using regular writes, typedmemclr, or memclrHasPointers. This performs write barriers.
In addition to the "//go:" directives documented in "go doc compile", the compiler supports additional directives only in the runtime.
go:systemstack indicates that a function must run on the system stack. This is checked dynamically by a special function prologue.
go:nowritebarrier directs the compiler to emit an error if the following function contains any write barriers. (It does not suppress the generation of write barriers; it is simply an assertion.)
Usually you want go:nowritebarrierrec. go:nowritebarrier is primarily useful in situations where it's "nice" not to have write barriers, but not required for correctness.
go:nowritebarrierrec directs the compiler to emit an error if the following function or any function it calls recursively, up to a go:yeswritebarrierrec, contains a write barrier.
Logically, the compiler floods the call graph starting from each go:nowritebarrierrec function and produces an error if it encounters a function containing a write barrier. This flood stops at go:yeswritebarrierrec functions.
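The flood can be pictured as a graph traversal. The sketch below is a toy model, not the compiler's implementation: the fn type, its fields, checkNoWBRec, and the function names in the example graph (dropP, takeP, storePtr) are all hypothetical.

```go
package main

import "fmt"

// fn is a hypothetical, simplified record of one function.
type fn struct {
	name     string
	hasWB    bool // body contains a write barrier
	yesWBRec bool // marked go:yeswritebarrierrec
	callees  []*fn
}

// checkNoWBRec floods the call graph from root, which plays the role of
// a go:nowritebarrierrec function. It returns the name of the first
// reachable function containing a write barrier, or "" if there is none.
func checkNoWBRec(root *fn) string {
	seen := map[*fn]bool{}
	queue := []*fn{root}
	for len(queue) > 0 {
		f := queue[0]
		queue = queue[1:]
		if seen[f] {
			continue
		}
		seen[f] = true
		if f.yesWBRec {
			continue // the flood stops here
		}
		if f.hasWB {
			return f.name
		}
		queue = append(queue, f.callees...)
	}
	return ""
}

func main() {
	storePtr := &fn{name: "storePtr", hasWB: true}
	takeP := &fn{name: "takeP", yesWBRec: true, callees: []*fn{storePtr}}
	dropP := &fn{name: "dropP", callees: []*fn{takeP}}

	fmt.Printf("%q\n", checkNoWBRec(dropP)) // barrier is hidden behind takeP

	dropP.callees = append(dropP.callees, storePtr)
	fmt.Printf("%q\n", checkNoWBRec(dropP)) // barrier is now directly reachable
}
```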
go:nowritebarrierrec is used in the implementation of the write barrier to prevent infinite loops.
Both directives are used in the scheduler. The write barrier requires an active P (getg().m.p != nil) and scheduler code often runs without an active P. In this case, go:nowritebarrierrec is used on functions that release the P or may run without a P and go:yeswritebarrierrec is used when code re-acquires an active P. Since these are function-level annotations, code that releases or acquires a P may need to be split across two functions.
The //go:uintptrkeepalive directive must be followed by a function declaration.
It specifies that the function's uintptr arguments may be pointer values that have been converted to uintptr and must be kept alive for the duration of the call, even though from the types alone it would appear that the object is no longer needed during the call.
This directive is similar to //go:uintptrescapes, but it does not force arguments to escape. Since stack growth does not understand these arguments, this directive must be used with //go:nosplit (in the marked function and all transitive calls) to prevent stack growth.
The conversion from pointer to uintptr must appear in the argument list of any call to this function. This directive is used for some low-level system call implementations.