We propose clarifications for the semantics of constraint satisfaction in the generics proposal. We also propose changing the syntax of type lists to remove the type
keyword and to explicitly specify when type arguments should match on underlying types.
The changes this would make to the current generics proposal document can be seen in https://golang.org/cl/306689.
Background
The current generics proposal proposes a new syntax for type lists within interfaces. A type list within an interface is the keyword type
followed by a list of types separated by commas. Type lists are only permitted in interface types that are used as type constraints. For example:
// SignedInteger is a type constraint that permits any
// signed integer type.
type SignedInteger interface {
type int, int8, int16, int32, int64
}
A type argument matches a constraint with a type list if
- The type argument implements the interface ignoring the type list, and
- either the type argument or its underlying type is identical to one of the types in the type list.
This rule was adopted in part to support permitting type lists in ordinary interface types, not only in constraints. However, discussion has made clear that the rule is too subtle. This suggests that it is too subtle not just for use in ordinary interface types, but also for use in constraints.
The behavior when embedding interfaces with type lists is also subtle.
We can do better.
Type sets
We start by defining a type set for all types. We will define what it means for a type to implement an interface in terms of type sets, resulting in a behavior that is equivalent to the current definition based on method sets.
Every type has an associated type set. The type set of an ordinary non-interface type T
is simply the set {T}
which contains just T
itself. The type set of an interface type (in this section we only discuss ordinary interface types, without type lists) is the set of all types that declare all the methods of the interface.
Note that the type set of an interface type is an infinite set. For any given type T
and interface type IT
it's easy to tell whether T
is in the type set of IT
(by checking whether all methods of IT
are declared by T
), but there is no reasonable way to enumerate all the types in the type set of IT
. The type IT
is a member of its own type set because an interface inherently declares all of its own methods. The type set of the empty interface interface{}
is the set of all possible types.
With this idea of type sets, we can restate what it means for a type T
to implement an interface type IT
: T
implements IT
if T
is a member of the type set of IT
. Since the type set of IT
is the set of all types that declare all the methods of the interface, T
is a member of the type set of IT
if and only if the method set of T
is a (possibly improper) superset of the method set of IT
, which is the standard definition of implementing an interface.
Now let's consider embedded interfaces. For a case like type O1 interface{ E }
, the type set of O1
is the same as the type set of E
. The case type O2 interface{ E1; E2 }
is more interesting: the type set of O2
is the intersection of the type sets of E1
and E2
. To see this, observe that the type set of E1
is the set of all types that implement all the methods of E1
, and similarly for E2
. What we want for the type set of O2
is the set of all types that implement all the methods of O2
. The methods of O2
are all of the methods of E1
combined with all of the methods of E2
. The set of types that implement all the methods of both E1
and E2
is the intersection of the type sets of E1
and E2
.
Note that listing a method in an interface type definition in the usual way is, from a type set perspective, indistinguishable from embedding an interface that declares just that method. Although a method by itself is not a type, for our purposes we can say that the type set for a method listed explicitly in an interface type definition is exactly the type set of an interface type with only that method: the set of all types that implement that method. The advantage of doing this is that we can now say that the type set of an interface type is exactly the intersection of the type sets of each element listed in the interface.
We've now described type sets, and we've explained the meaning of implementing an interface in terms of type sets. None of this changes the language in any way, but it serves as background and motivation for the next steps.
Proposal
We propose to replace type lists as defined by the generics proposal with three new, simpler, ideas.
An interface type that is used as a constraint, or that is embedded in a constraint, is permitted to embed some additional constructs that we will call interface elements. An interface element can be:
- Any type, not just an interface type.
- A new syntactic construct called an approximation element.
- A new syntactic construct called a union element.
With these new elements we will be able to state simply that a type argument A
satisfies a constraint C
exactly when A
implements the interface type C
, or, in terms of type sets, exactly when A
is a member of the type set of C
.
First, we propose that an interface type used as a constraint is permitted to embed a non-interface type. For example: type Integer interface{ int }
. As discussed in the previous section, the type set of an interface type is the intersection of the type sets of the elements of the interface. The type set of int
is simply {int}
. This means that the type set of Integer
is also {int}
.
This constraint can be satisfied by any type that is a member of the set {int}
. There is exactly one such type: int
.
Of course, that is useless by itself. For constraint satisfaction, we want to be able to say not just int
, but "any type whose underlying type is int
." To implement this, we propose a new syntactic construct, which may be embedded in an interface type used as a constraint. This is an approximation element, written as ~T
. The type set of an approximation ~T
is the set of all types whose underlying type is T
. An approximation ~T
is only valid if the underlying type of T
is itself T
; this is discussed in more detail below.
For example: type AnyInt interface{ ~int }
. The type set of ~int
, and therefore the type set of AnyInt
, is the set of all types whose underlying type is int
. For example, if MyInt
is defined as type MyInt int
, then MyInt
used as a type argument will satisfy the constraint AnyInt
.
The final step is another new syntactic construct that may be embedded in an interface type used as a constraint: a union element. A union element is written as a sequence of types or approximation elements separated by vertical bars (|
). For example: int | float32
or ~int8 | ~int16 | ~int32 | ~int64
. The type set of a union element is the union of the type sets of each element in the sequence. The types and elements listed in a union must all be different: no two types may be identical, and no two approximation elements ~T1
and ~T2
may have T1
identical to T2
. For example:
type PredeclaredSignedInteger interface {
int | int8 | int16 | int32 | int64
}
The type set of this union element is the set {int, int8, int16, int32, int64}
. Since the union is the only element of PredeclaredSignedInteger
, that is also the type set of PredeclaredSignedInteger
. This constraint can be satisfied by any of those five types.
Here is an example using approximation elements:
type SignedInteger interface {
~int | ~int8 | ~int16 | ~int32 | ~int64
}
The type set of this constraint is the set of all types whose underlying type is one of int
, int8
, int16
, int32
, or int64
.
Any of those types will satisfy this constraint. This is the equivalent of the notation used in the generics proposal
interface {
type int, int8, int16, int32, int64
}
The use of explicit approximation elements clarifies when we are matching on underlying types, the use of |
instead of ,
emphasizes that this is a union of elements, and the type
keyword can be omitted by permitting constraints to embed non-interface elements.
The purpose of introducing type lists in the generics proposal was to specify the operations available to type parameters in parameterized functions. This is easy to define based on the idea of type sets. Given a type parameter P
with a constraint C
, a parameterized function is permitted to use an operation with a value of type P
if the operation is permitted for every member of the type set of C
.
That is the complete proposal: a conceptual change to use type sets, and three new syntax changes. We will now mention some details and ramifications.
Approximation elements
The new ~T
syntax will be the first use of ~
as a token in Go.
Since ~T
means the set of all types whose underlying type is T
, it will be an error to use ~T
with a type T
whose underlying type is not itself. Types whose underlying types are themselves are:
- Type literals, such as
[]byte
or struct{ f int }
.
- Predeclared types, such as
int
or string
.
We do not permit ~P
where P
is a type parameter.
The type set of ~T
is an infinite set of types.
The ~
will bind more tightly than |
.
~T1 | T2
means (~T1) | (T2)
, not ~(T1 | T2)
(note that ~(T1 | T2)
is not syntactically valid)..
The new syntax is
InterfaceType = "interface" "{" { ( MethodSpec | InterfaceTypeName | ConstraintElem ) ";" } "}" .
ConstraintElem = ConstraintTerm { "|" ConstraintTerm } .
ConstraintTerm = [ "~" ] Type .
Embedding constraints
A constraint can embed another constraint. Union elements can include constraints.
// Signed is a constraint whose type set is any signed integer type.
type Signed interface {
~int | ~int8 | ~int16 | ~int32 | ~int64
}
// Unsigned is a constraint whose type set is any unsigned integer type.
type Unsigned interface {
~uint | ~uint8 | ~uint16 | ~uint32 | ~uint64 | ~uintptr
}
// Float is a constraint whose type set is any floating point type.
type Float interface {
~float32 | ~float64
}
// Ordered is a constraint whose type set is any ordered type.
// That is, any type that supports the < operator.
type Ordered interface {
Signed | Unsigned | Float | ~string
}
Interface types in union constraint elements
The type set of a union element is the union of the type sets of all elements in the union. For most types T
the type set of T
is simply T
itself. For interface types (and approximation elements), however, this is not the case.
The type set of an interface type that does not embed a non-interface element is the set of all types that implement the interface, including the interface type itself. Using such an interface type in a union element will add that type set to the union. For example:
type Stringish interface {
string | fmt.Stringer
}
The type set of Stringish
will be the type string
and all types that implement fmt.Stringer
. Any of those types (including fmt.Stringer
itself) will be permitted as a type argument for this constraint. No operations will be permitted for a value of a type parameter that uses Stringish
as a constraint (other than operations supported by all types). This is because fmt.Stringer
is in the type set of Stringish
, and fmt.Stringer
, an interface type, does not support any type-specific operations. The operations permitted by Stringish
are those operations supported by all the types in the type set, including fmt.Stringer
, so in this case there are no operations other than those supported by all types. A parameterized function that uses this constraint will have to use type assertions or reflection in order to use the values. Still, this may be useful in some cases for stronger static type checking. The main point is that it follows directly from the definition of type sets and constraint satisfaction.
Combining embedded non-interfaces with methods
A constraint can embed a constraint element and also list methods.
type StringableSignedInteger interface {
~int | ~int8 | ~int16 | ~int32 | ~int64
String() string
}
The rules for type sets define what this means. The type set of the union element is the set of all types whose underlying type is one of the predeclared signed integer types. The type set of String() string
is the set of all types that declare that method. The type set of StringableSignedInteger
is the intersection of those two type sets. The result is the set of all types whose underlying type is one of the predeclared signed integer types and that declare the method String() string
. A function that uses a parameterized type P
that uses StringableSignedInteger
as a constraint may use the operations permitted for any integer type (+
, *
, and so forth) on a value of type P
. It may also call the String
method on a value of type P
to get back a string
.
Empty type sets
It is possible to write a constraint with an empty type set. There is no type argument that will satisfy such a constraint. ~~The compiler should give an error whenever it detects such an unsatisfiable constraint. However, in general a compiler may not be able to detect all such cases.~~ It is not feasible to detect all such cases, though they can't be used with any type argument. It may be appropriate to have vet give an error for cases that it can detect.
// Unsatisfiable is an unsatisfiable constraint with an empty type set.
// No predeclared types have any methods.
// If this used ~int | ~float32 the type set would not be empty.
type Unsatisfiable interface {
int | float32
String() string
}
Method sets of constraint elements
Much as the type set of an interface type is the intersection of the type sets of the elements of the interface, the method set of an interface type can be defined as the union of the method sets of the elements of the interface. In most cases, an embedded element will have no methods, and as such will not contribute any methods to the interface type. That said, for completeness, we'll note that the method set of ~T
is the method set of T
. The method set of a union element is the intersection of the method sets of the elements of the union. These rules are implied by the definition of type sets, but they are not needed for understanding the behavior of constraints.
Possible future step: permitting constraints as ordinary interface types
We have proposed that constraints can embed some additional elements. With this proposal, any interface type that embeds anything other than an interface type can only be used as a constraint or as an embedded element in another constraint. A natural next step would be to permit using interface types that embed any type, or that embed these new elements, as an ordinary type, not just as a constraint.
We are not proposing that today. But the rules for type sets and methods set above describe how they would behave.
Any type that is an element of the type set could be assigned to such an interface type. A value of such an interface type would permit calling any member of the corresponding method set.
This would permit a version of what other languages call sum types or union types. It would be a Go interface type to which only specific types could be assigned. Such an interface type could still take the value nil
, of course, so it would not be quite the same as a typical sum type.
In any case, this is something to consider in a future proposal, not this one.