diff options
author | Robert Griesemer <gri@golang.org> | 2023-06-15 16:09:52 -0700 |
---|---|---|
committer | Gopher Robot <gobot@golang.org> | 2023-07-20 23:09:32 +0000 |
commit | a04a665a92714c71a039575d27155cd495891799 (patch) | |
tree | 5f9a5c3a9844c1a8792543f69dff422524317f5d /doc/go_spec.html | |
parent | 82ee946d7acd93cad27f748efb9c3131ae69668e (diff) | |
download | go-a04a665a92714c71a039575d27155cd495891799.tar.gz go-a04a665a92714c71a039575d27155cd495891799.zip |
spec: update section on type inference for Go 1.21
The new section describes type inference as the problem
of solving a set of type equations for bound type parameters.
The next CL will update the section on unification to match
the new inference approach.
Change-Id: I2cb49bfb588ccc82d645343034096a82b7d602e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/503920
TryBot-Bypass: Robert Griesemer <gri@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
Auto-Submit: Robert Griesemer <gri@google.com>
Diffstat (limited to 'doc/go_spec.html')
-rw-r--r-- | doc/go_spec.html | 484 |
1 files changed, 189 insertions, 295 deletions
diff --git a/doc/go_spec.html b/doc/go_spec.html index 9370cf632c..7099f36020 100644 --- a/doc/go_spec.html +++ b/doc/go_spec.html @@ -1,6 +1,6 @@ <!--{ "Title": "The Go Programming Language Specification", - "Subtitle": "Version of July 19, 2023", + "Subtitle": "Version of July 20, 2023", "Path": "/ref/spec" }--> @@ -2511,7 +2511,7 @@ type ( <p> A type definition creates a new, distinct type with the same -<a href="#Types">underlying type</a> and operations as the given type +<a href="#Underlying_types">underlying type</a> and operations as the given type and binds an identifier, the <i>type name</i>, to it. </p> @@ -4343,7 +4343,7 @@ type parameter list type arguments after substitution When using a generic function, type arguments may be provided explicitly, or they may be partially or completely <a href="#Type_inference">inferred</a> from the context in which the function is used. -Provided that they can be inferred, type arguments may be omitted entirely if the function is: +Provided that they can be inferred, type argument lists may be omitted entirely if the function is: </p> <ul> @@ -4351,7 +4351,7 @@ Provided that they can be inferred, type arguments may be omitted entirely if th <a href="#Calls">called</a> with ordinary arguments, </li> <li> - <a href="#Assignment_statements">assigned</a> to a variable with an explicitly declared type, + <a href="#Assignment_statements">assigned</a> to a variable with a known type </li> <li> <a href="#Calls">passed as an argument</a> to another function, or @@ -4371,7 +4371,7 @@ must be inferrable from the context in which the function is used. // sum returns the sum (concatenation, for strings) of its arguments. func sum[T ~int | ~float64 | ~string](x... T) T { … } -x := sum // illegal: sum must have a type argument (x is a variable without a declared type) +x := sum // illegal: the type of x is unknown intSum := sum[int] // intSum has type func(x... int) int a := intSum(2, 3) // a has value 5 of type int b := sum[float64](2.0, 3) // b has value 5.0 of type float64 @@ -4406,72 +4406,224 @@ For a generic type, all type arguments must always be provided explicitly. <h3 id="Type_inference">Type inference</h3> <p> -<em>NOTE: This section is not yet up-to-date for Go 1.21.</em> +A use of a generic function may omit some or all type arguments if they can be +<i>inferred</i> from the context within which the function is used, including +the constraints of the function's type parameters. +Type inference succeeds if it can infer the missing type arguments +and <a href="#Instantiations">instantiation</a> succeeds with the +inferred type arguments. +Otherwise, type inference fails and the program is invalid. </p> <p> -Missing function type arguments may be <i>inferred</i> by a series of steps, described below. -Each step attempts to use known information to infer additional type arguments. -Type inference stops as soon as all type arguments are known. -After type inference is complete, it is still necessary to substitute all type arguments -for type parameters and verify that each type argument -<a href="#Implementing_an_interface">implements</a> the relevant constraint; -it is possible for an inferred type argument to fail to implement a constraint, in which -case instantiation fails. +Type inference uses the type relationships between pairs of types for inference: +For instance, a function argument must be <a href="#Assignability">assignable</a> +to its respective function parameter; this establishes a relationship between the +type of the argument and the type of the parameter. +If either of these two types contains type parameters, type inference looks for the +type arguments to substitute the type parameters with such that the assignability +relationship is satisfied. +Similarly, type inference uses the fact that a type argument must +<a href="#Satisfying_a_type_constraint">satisfy</a> the constraint of its respective +type parameter. </p> <p> -Type inference is based on +Each such pair of matched types corresponds to a <i>type equation</i> containing +one or multiple type parameters, from one or possibly multiple generic functions. +Inferring the missing type arguments means solving the resulting set of type +equations for the respective type parameters. +</p> + +<p> +For example, given +</p> + +<pre> +// dedup returns a copy of the argument slice with any duplicate entries removed. +func dedup[S ~[]E, E comparable](S) S { … } + +type Slice []int +var s Slice +s = dedup(s) // same as s = dedup[Slice, int](s) +</pre> + +<p> +the variable <code>s</code> of type <code>Slice</code> must be assignable to +the function parameter type <code>S</code> for the program to be valid. +To reduce complexity, type inference ignores the directionality of assignments, +so the type relationship between <code>Slice</code> and <code>S</code> can be +expressed via the (symmetric) type equation <code>Slice ≡<sub>A</sub> S</code> +(or <code>S ≡<sub>A</sub> Slice</code> for that matter), +where the <code><sub>A</sub></code> in <code>≡<sub>A</sub></code> +indicates that the LHS and RHS types must match per assignability rules +(see the section on <a href="#Type_unification">type unifcation</a> for +details). +Similarly, the type parameter <code>S</code> must satisfy its constraint +<code>~[]E</code>. This can be expressed as <code>S ≡<sub>C</sub> ~[]E</code> +where <code>X ≡<sub>C</sub> Y</code> stands for +"<code>X</code> satisfies constraint <code>Y</code>". +These observations lead to a set of two equations +</p> + +<pre> + Slice ≡<sub>A</sub> S (1) + S ≡<sub>C</sub> ~[]E (2) +</pre> + +<p> +which now can be solved for the type parameters <code>S</code> and <code>E</code>. +From (1) a compiler can infer that the type argument for <code>S</code> is <code>Slice</code>. +Similarly, because the underlying type of <code>Slice</code> is <code>[]int</code> +and <code>[]int</code> must match <code>[]E</code> of the constraint, +a compiler can infer that <code>E</code> must be <code>int</code>. +Thus, for these two equations, type inference infers +</p> + +<pre> + S ➞ Slice + E ➞ int +</pre> + +<p> +Given a set of type equations, the type parameters to solve for are +the type parameters of the functions that need to be instantiated +and for which no explicit type arguments is provided. +These type parameters are called <i>bound</i> type parameters. +For instance, in the <code>dedup</code> example above, the type parameters +<code>P</code> and <code>E</code> are bound to <code>dedup</code>. +An argument to a generic function call may be a generic function itself. +The type parameters of that function are included in the set of bound +type parameters. +The types of function arguments may contain type parameters from other +functions (such as a generic function enclosing a function call). +Those type parameters may also appear in type equations but they are +not bound in that context. +Type equations are always solved for the bound type parameters only. +</p> + +<p> +Type inference supports calls of generic functions and assignments +of generic functions to (explicitly function-typed) variables. +This includes passing generic functions as arguments to other +(possibly also generic) functions, and returning generic functions +as results. +Type inference operates on a set of equations specific to each of +these cases. +The equations are as follows (type argument lists are omitted for clarity): </p> <ul> <li> - a <a href="#Type_parameter_declarations">type parameter list</a> + <p> + For a function call <code>f(a<sub>0</sub>, a<sub>1</sub>, …)</code> where + <code>f</code> or a function argument <code>a<sub>i</sub></code> is + a generic function: + <br> + Each pair <code>(a<sub>i</sub>, p<sub>i</sub>)</code> of corresponding + function arguments and parameters where <code>a<sub>i</sub></code> is not an + <a href="#Constants">untyped constant</a> yields an equation + <code>typeof(p<sub>i</sub>) ≡<sub>A</sub> typeof(a<sub>i</sub>)</code>. + <br> + If <code>a<sub>i</sub></code> is an untyped constant <code>c<sub>j</sub></code>, + and <code>p<sub>i</sub></code> is a bound type parameter <code>P<sub>k</sub></code>, + the pair <code>(c<sub>j</sub>, P<sub>k</sub>)</code> is collected separately from + the type equations. + </p> </li> <li> - a substitution map <i>M</i> initialized with the known type arguments, if any + <p> + For an assignment <code>v = f</code> of a generic function <code>f</code> to a + (non-generic) variable <code>v</code> of function type: + <br> + <code>typeof(v) ≡<sub>A</sub> typeof(f)</code>. + </p> </li> <li> - a (possibly empty) list of ordinary function arguments (in case of a function call only) + <p> + For a return statement <code>return …, f, … </code> where <code>f</code> is a + generic function returned as a result to a (non-generic) result variable + of function type: + <br> + <code>typeof(r) ≡<sub>A</sub> typeof(f)</code>. + </p> </li> </ul> <p> -and then proceeds with the following steps: +Additionally, each type parameter <code>P<sub>k</sub></code> and corresponding type constraint +<code>C<sub>k</sub></code> yields the type equation +<code>P<sub>k</sub> ≡<sub>C</sub> C<sub>k</sub></code>. +</p> + +<p> +Type inference gives precedence to type information obtained from typed operands +before considering untyped constants. +Therefore, inference proceeds in two phases: </p> <ol> <li> - apply <a href="#Function_argument_type_inference"><i>function argument type inference</i></a> - to all <i>typed</i> ordinary function arguments -</li> -<li> - apply <a href="#Constraint_type_inference"><i>constraint type inference</i></a> -</li> -<li> - apply function argument type inference to all <i>untyped</i> ordinary function arguments - using the default type for each of the untyped function arguments + <p> + The type equations are solved for the bound + type parameters using <a href="#Type_unification">type unification</a>. + If unification fails, type inference fails. + </p> </li> <li> - apply constraint type inference + <p> + For each bound type parameter <code>P<sub>k</sub></code> for which no type argument + has been inferred yet and for which one or more pairs + <code>(c<sub>j</sub>, P<sub>k</sub>)</code> with that same type parameter + were collected, determine the <a href="#Constant_expressions">constant kind</a> + of the constants <code>c<sub>j</sub></code> in all those pairs the same way as for + <a href="#Constant_expressions">constant expressions</a>. + The type argument for <code>P<sub>k</sub></code> is the + <a href="#Constants">default type</a> for the determined constant kind. + If a constant kind cannot be determined due to conflicting constant kinds, + type inference fails. + </p> </li> </ol> <p> -If there are no ordinary or untyped function arguments, the respective steps are skipped. -Constraint type inference is skipped if the previous step didn't infer any new type arguments, -but it is run at least once if there are missing type arguments. +If not all type arguments have been found after these two phases, type inference fails. </p> <p> -The substitution map <i>M</i> is carried through all steps, and each step may add entries to <i>M</i>. -The process stops as soon as <i>M</i> has a type argument for each type parameter or if an inference step fails. -If an inference step fails, or if <i>M</i> is still missing type arguments after the last step, type inference fails. +If the two phases are successful, type inference determined a type argument for each +bound type parameter: +</p> + +<pre> + P<sub>k</sub> ➞ A<sub>k</sub> +</pre> + +<p> +A type argument <code>A<sub>k</sub></code> may be a composite type, +containing other bound type parameters <code>P<sub>k</sub></code> as element types +(or even be just another bound type parameter). +In a process of repeated simplification, the bound type parameters in each type +argument are substituted with the respective type arguments for those type +parameters until each type argument is free of bound type parameters. +</p> + +<p> +If type arguments contain cyclic references to themselves +through bound type parameters, simplification and thus type +inference fails. +Otherwise, type inference succeeds. </p> <h4 id="Type_unification">Type unification</h4> <p> +<em> +Note: This section is not up-to-date for Go 1.21. +</em> +</p> + +<p> Type inference is based on <i>type unification</i>. A single unification step applies to a <a href="#Type_inference">substitution map</a> and two types, either or both of which may be or contain type parameters. The substitution map tracks @@ -4546,264 +4698,6 @@ and the type literal <code>[]E</code>, unification compares <code>[]float64</cod the substitution map. </p> -<h4 id="Function_argument_type_inference">Function argument type inference</h4> - -<!-- In this section and the section on constraint type inference we start with examples -rather than have the examples follow the rules as is customary elsewhere in spec. -Hopefully this helps building an intuition and makes the rules easier to follow. --> - -<p> -Function argument type inference infers type arguments from function arguments: -if a function parameter is declared with a type <code>T</code> that uses -type parameters, -<a href="#Type_unification">unifying</a> the type of the corresponding -function argument with <code>T</code> may infer type arguments for the type -parameters used by <code>T</code>. -</p> - -<p> -For instance, given the generic function -</p> - -<pre> -func scale[Number ~int64|~float64|~complex128](v []Number, s Number) []Number -</pre> - -<p> -and the call -</p> - -<pre> -var vector []float64 -scaledVector := scale(vector, 42) -</pre> - -<p> -the type argument for <code>Number</code> can be inferred from the function argument -<code>vector</code> by unifying the type of <code>vector</code> with the corresponding -parameter type: <code>[]float64</code> and <code>[]Number</code> -match in structure and <code>float64</code> matches with <code>Number</code>. -This adds the entry <code>Number</code> → <code>float64</code> to the -<a href="#Type_unification">substitution map</a>. -Untyped arguments, such as the second function argument <code>42</code> here, are ignored -in the first round of function argument type inference and only considered if there are -unresolved type parameters left. -</p> - -<p> -Inference happens in two separate phases; each phase operates on a specific list of -(parameter, argument) pairs: -</p> - -<ol> -<li> - The list <i>Lt</i> contains all (parameter, argument) pairs where the parameter - type uses type parameters and where the function argument is <i>typed</i>. -</li> -<li> - The list <i>Lu</i> contains all remaining pairs where the parameter type is a single - type parameter. In this list, the respective function arguments are untyped. -</li> -</ol> - -<p> -Any other (parameter, argument) pair is ignored. -</p> - -<p> -By construction, the arguments of the pairs in <i>Lu</i> are <i>untyped</i> constants -(or the untyped boolean result of a comparison). And because <a href="#Constants">default types</a> -of untyped values are always predeclared non-composite types, they can never match against -a composite type, so it is sufficient to only consider parameter types that are single type -parameters. -</p> - -<p> -Each list is processed in a separate phase: -</p> - -<ol> -<li> - In the first phase, the parameter and argument types of each pair in <i>Lt</i> - are unified. If unification succeeds for a pair, it may yield new entries that - are added to the substitution map <i>M</i>. If unification fails, type inference - fails. -</li> -<li> - The second phase considers the entries of list <i>Lu</i>. Type parameters for - which the type argument has already been determined are ignored in this phase. - For each remaining pair, the parameter type (which is a single type parameter) and - the <a href="#Constants">default type</a> of the corresponding untyped argument is - unified. If unification fails, type inference fails. -</li> -</ol> - -<p> -While unification is successful, processing of each list continues until all list elements -are considered, even if all type arguments are inferred before the last list element has -been processed. -</p> - -<p> -Example: -</p> - -<pre> -func min[T ~int|~float64](x, y T) T - -var x int -min(x, 2.0) // T is int, inferred from typed argument x; 2.0 is assignable to int -min(1.0, 2.0) // T is float64, inferred from default type for 1.0 and matches default type for 2.0 -min(1.0, 2) // illegal: default type float64 (for 1.0) doesn't match default type int (for 2) -</pre> - -<p> -In the example <code>min(1.0, 2)</code>, processing the function argument <code>1.0</code> -yields the substitution map entry <code>T</code> → <code>float64</code>. Because -processing continues until all untyped arguments are considered, an error is reported. This -ensures that type inference does not depend on the order of the untyped arguments. -</p> - -<h4 id="Constraint_type_inference">Constraint type inference</h4> - -<p> -Constraint type inference infers type arguments by considering type constraints. -If a type parameter <code>P</code> has a constraint with a -<a href="#Core_types">core type</a> <code>C</code>, -<a href="#Type_unification">unifying</a> <code>P</code> with <code>C</code> -may infer additional type arguments, either the type argument for <code>P</code>, -or if that is already known, possibly the type arguments for type parameters -used in <code>C</code>. -</p> - -<p> -For instance, consider the type parameter list with type parameters <code>List</code> and -<code>Elem</code>: -</p> - -<pre> -[List ~[]Elem, Elem any] -</pre> - -<p> -Constraint type inference can deduce the type of <code>Elem</code> from the type argument -for <code>List</code> because <code>Elem</code> is a type parameter in the core type -<code>[]Elem</code> of <code>List</code>. -If the type argument is <code>Bytes</code>: -</p> - -<pre> -type Bytes []byte -</pre> - -<p> -unifying the underlying type of <code>Bytes</code> with the core type means -unifying <code>[]byte</code> with <code>[]Elem</code>. That unification succeeds and yields -the <a href="#Type_unification">substitution map</a> entry -<code>Elem</code> → <code>byte</code>. -Thus, in this example, constraint type inference can infer the second type argument from the -first one. -</p> - -<p> -Using the core type of a constraint may lose some information: In the (unlikely) case that -the constraint's type set contains a single <a href="#Type_definitions">defined type</a> -<code>N</code>, the corresponding core type is <code>N</code>'s underlying type rather than -<code>N</code> itself. In this case, constraint type inference may succeed but instantiation -will fail because the inferred type is not in the type set of the constraint. -Thus, constraint type inference uses the <i>adjusted core type</i> of -a constraint: if the type set contains a single type, use that type; otherwise use the -constraint's core type. -</p> - -<p> -Generally, constraint type inference proceeds in two phases: Starting with a given -substitution map <i>M</i> -</p> - -<ol> -<li> -For all type parameters with an adjusted core type, unify the type parameter with that -type. If any unification fails, constraint type inference fails. -</li> - -<li> -At this point, some entries in <i>M</i> may map type parameters to other -type parameters or to types containing type parameters. For each entry -<code>P</code> → <code>A</code> in <i>M</i> where <code>A</code> is or -contains type parameters <code>Q</code> for which there exist entries -<code>Q</code> → <code>B</code> in <i>M</i>, substitute those -<code>Q</code> with the respective <code>B</code> in <code>A</code>. -Stop when no further substitution is possible. -</li> -</ol> - -<p> -The result of constraint type inference is the final substitution map <i>M</i> from type -parameters <code>P</code> to type arguments <code>A</code> where no type parameter <code>P</code> -appears in any of the <code>A</code>. -</p> - -<p> -For instance, given the type parameter list -</p> - -<pre> -[A any, B []C, C *A] -</pre> - -<p> -and the single provided type argument <code>int</code> for type parameter <code>A</code>, -the initial substitution map <i>M</i> contains the entry <code>A</code> → <code>int</code>. -</p> - -<p> -In the first phase, the type parameters <code>B</code> and <code>C</code> are unified -with the core type of their respective constraints. This adds the entries -<code>B</code> → <code>[]C</code> and <code>C</code> → <code>*A</code> -to <i>M</i>. - -<p> -At this point there are two entries in <i>M</i> where the right-hand side -is or contains type parameters for which there exists other entries in <i>M</i>: -<code>[]C</code> and <code>*A</code>. -In the second phase, these type parameters are replaced with their respective -types. It doesn't matter in which order this happens. Starting with the state -of <i>M</i> after the first phase: -</p> - -<p> -<code>A</code> → <code>int</code>, -<code>B</code> → <code>[]C</code>, -<code>C</code> → <code>*A</code> -</p> - -<p> -Replace <code>A</code> on the right-hand side of → with <code>int</code>: -</p> - -<p> -<code>A</code> → <code>int</code>, -<code>B</code> → <code>[]C</code>, -<code>C</code> → <code>*int</code> -</p> - -<p> -Replace <code>C</code> on the right-hand side of → with <code>*int</code>: -</p> - -<p> -<code>A</code> → <code>int</code>, -<code>B</code> → <code>[]*int</code>, -<code>C</code> → <code>*int</code> -</p> - -<p> -At this point no further substitution is possible and the map is full. -Therefore, <code>M</code> represents the final map of type parameters -to type arguments for the given type parameter list. -</p> - <h3 id="Operators">Operators</h3> <p> @@ -5479,7 +5373,7 @@ in any of these cases: ignoring struct tags (see below), <code>x</code>'s type and <code>T</code> are not <a href="#Type_parameter_declarations">type parameters</a> but have - <a href="#Type_identity">identical</a> <a href="#Types">underlying types</a>. + <a href="#Type_identity">identical</a> <a href="#Underlying_types">underlying types</a>. </li> <li> ignoring struct tags (see below), @@ -8291,7 +8185,7 @@ of if the general conversion rules take care of this. <p> A <code>Pointer</code> is a <a href="#Pointer_types">pointer type</a> but a <code>Pointer</code> value may not be <a href="#Address_operators">dereferenced</a>. -Any pointer or value of <a href="#Types">underlying type</a> <code>uintptr</code> can be +Any pointer or value of <a href="#Underlying_types">underlying type</a> <code>uintptr</code> can be <a href="#Conversions">converted</a> to a type of underlying type <code>Pointer</code> and vice versa. The effect of converting between <code>Pointer</code> and <code>uintptr</code> is implementation-defined. </p> |