Php check class implements interface

Introduction

An “intersection type” requires a value to satisfy multiple type constraints instead of a single one.

Intersection types are currently not supported natively by the language. Instead, one must either use phpdoc annotations, and/or abuse typed properties [1] as can be seen in the following example:

class Test {
    private ?Traversable $traversable = null;
    private ?Countable $countable = null;
    /** @var Traversable&Countable */
    private $both = null;
 
    public function __construct($countableIterator) {
        $this->traversable =& $this->both;
        $this->countable =& $this->both;
        $this->both = $countableIterator;
    }
}

Supporting intersection types in the language allows us to move more type information from phpdoc into function signatures, with the usual advantages this brings:

  • Types are actually enforced, so mistakes can be caught early.

  • Because they are enforced, type information is less likely to become outdated or miss edge-cases.

  • Types are checked during inheritance, enforcing the Liskov Substitution Principle.

  • Types are available through Reflection.

  • The syntax is a lot less boilerplate-y than phpdoc.

Motivation

It is possible to emulate intersection types by creating a new interface that extends multiple others. One such case is the built-in SeekableIterator, which extends the Iterator interface by adding a seek() method. However, an iterator can also be countable, and if a function needs to type against that requirement, the only way at present is to create a new interface:

interface CountableIterator extends Iterator, Countable {}

This works, but what if we want an iterator that is countable and seekable? We need to create another interface:

interface SeekableCountableIterator extends CountableIterator, SeekableIterator {}

As such, each new requirement necessitates the creation of various new interfaces taking into account all possible combinations.

Moreover, a class must implement the specific combined interface and cannot rely on just implementing the base interfaces. The introduction of such an interface therefore needs to be propagated to all relevant classes, which is error-prone. The following example fails with a TypeError because Test implements A and B but not AB:

interface A {}
interface B {}
interface AB extends A, B {}
 
class Test implements A, B {}
 
function foo(AB $v) {
    var_dump($v);
}
 
foo(new Test());

Intersection types solve these issues.
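With the proposed syntax, a function can type directly against the base interfaces. The following is a minimal sketch, assuming PHP 8.1 or later; the names Readable, Writable, Stream, and describe() are hypothetical and chosen only for illustration:

```php
<?php
interface Readable {}
interface Writable {}

// Implements both base interfaces; no combined interface is declared.
class Stream implements Readable, Writable {}

// The argument must satisfy both interfaces at once.
function describe(Readable&Writable $v): string {
    return get_class($v);
}

echo describe(new Stream()), PHP_EOL; // prints "Stream"
```

A value that implements only one of the two interfaces would be rejected with a TypeError.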

Proposal

Add support for pure intersection types. They are specified using the syntax T1&T2&... and can be used in all positions where types are currently accepted:

class A {
    private Traversable&Countable $countableIterator;
 
    public function setIterator(Traversable&Countable $countableIterator): void {
        $this->countableIterator = $countableIterator;
    }
 
    public function getIterator(): Traversable&Countable {
        return $this->countableIterator;
    }
}

Because only pure intersection types are supported, it is not possible to mix intersection and union types, such as A&B|C. This is left as future scope.

Supported types

Only class types (interfaces and class names) are supported by intersection types.

The rationale is that for nearly all standard types, using them in an intersection type results in a type which can never be satisfied (e.g. int&string).

Usage of mixed in an intersection type is redundant, as mixed&T corresponds to T. It is therefore disallowed.

Similarly, using iterable in an intersection results in a redundant type, so it is disallowed as well. This can be seen by expanding the type expression: iterable&T = (array|Traversable)&T = (array&T) | (Traversable&T) = Traversable&T, since array&T can never be satisfied by a class type T.

Although an intersection with callable can make sense (e.g. string&callable), we think it is unwise and points to a bug.

Similarly, parent, self, and static are technically feasible as part of an intersection, but they either impose restrictions on a child class that the base class itself violates, or the base class already satisfies the type requirements, in which case they are redundant. These three types are therefore also forbidden, because they likely point to a design issue.

Duplicate and redundant types

To catch some simple bugs in intersection type declarations, redundant types that can be detected without performing class loading will result in a compile-time error. This includes:

  • Each name-resolved type may only occur once. Types like A&B&A result in an error.

This does not guarantee that the type is “minimal”, because doing so would require loading all used class types.

For example, if A and B are runtime class aliases, then A&B remains a legal intersection type, even though it could be reduced to either A or B. Similarly, if class B extends A {}, then A&B is also a legal intersection type, even though it could be reduced to just B.

function foo(): A&A {} // Disallowed
 
use A as B;
function foo(): A&B {} // Disallowed ("use" is part of name resolution)
 
class_alias('X', 'Y');
function foo(): X&Y {} // Allowed (redundancy is only known at runtime)

Type grammar

Due to a parser ambiguity with the declaration of by-ref parameters under the current LR(1) parser, the grammar and lexer are modified to create different tokens for the & character depending on whether it is followed by a (variadic) variable or not.

The grammar thus looks as follows:

type_expr:
        type    
    |   '?' type    
    |   union_type
    |   intersection_type
;


intersection_type:
        type T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG type
    |   intersection_type T_AMPERSAND_NOT_FOLLOWED_BY_VAR_OR_VARARG type
;

Variance

Intersection types follow standard PHP variance rules that are already used for inheritance and type checking:

  • Return types are covariant (child must be subtype).

  • Parameter types are contravariant (child must be supertype).

  • Property types are invariant (child must be subtype and supertype).

The only change is in how intersection types interact with subtyping, with two additional rules:

  • A is a subtype of B_1&...&B_n if for all B_i, A is a subtype of B_i

  • A_1&...&A_n is a subtype of B if there exists an A_i such that A_i is a subtype of B
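The two rules can be sketched in userland for class and interface names. This is only an illustration of the rules as stated, not the engine's implementation; the function names and the types P, Q, Both, and OnlyP are hypothetical, and is_a() with its third argument set to true is used as the "is a subtype of" test on class names:

```php
<?php
interface P {}
interface Q {}
class Both implements P, Q {}
class OnlyP implements P {}

// Rule 1: A is a subtype of B_1&...&B_n if A is a subtype of every B_i.
function isSubtypeOfIntersection(string $a, array $bs): bool {
    foreach ($bs as $b) {
        if (!is_a($a, $b, true)) {
            return false;
        }
    }
    return true;
}

// Rule 2: A_1&...&A_n is a subtype of B if at least one A_i is a subtype of B.
function intersectionIsSubtypeOf(array $as, string $b): bool {
    foreach ($as as $a) {
        if (is_a($a, $b, true)) {
            return true;
        }
    }
    return false;
}

var_dump(isSubtypeOfIntersection(Both::class, [P::class, Q::class]));  // bool(true)
var_dump(isSubtypeOfIntersection(OnlyP::class, [P::class, Q::class])); // bool(false)
var_dump(intersectionIsSubtypeOf([P::class, Q::class], P::class));     // bool(true)
```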

In the following, some examples of what is allowed and what isn't are given.

Property types

Property types are invariant, which means that types must stay the same during inheritance. However, the “same” type may be expressed in different ways.

Intersection types expand the possibilities in this area: For example A&B and B&A represent the same type. The following example shows a more complex case:

class A {}
class B extends A {}
 
class Test {
    public A&B $prop;
}
class Test2 extends Test {
    public B $prop;
}

In this example, the intersection A&B actually represents the same type as just B, and this inheritance is legal, despite the type not being syntactically the same.

Formally, we arrive at this result as follows: First, the parent type A&B is a subtype of B. Second, B is a subtype of A&B, because B is a subtype of A and B is a subtype of B.

Adding and removing intersection types

It is legal to add intersection types in return position and remove intersection types in parameter position:

class A {}
interface X {}
 
class Test {
    public function param1(A $param) {}
    public function param2(A&X $param) {}
 
    public function return1(): A&X {}
    public function return2(): A {}
}
 
class Test2 extends Test {
    public function param1(A&X $param) {}   // FORBIDDEN: Adding extra param type constraint
    public function param2(A $param) {}     // Allowed: Removing param type constraint
 
    public function return1(): A {}         // FORBIDDEN: Removing return type constraint
    public function return2(): A&X {}       // Allowed: Adding extra return type constraint
}

Variance of individual intersection members

Similarly, it is possible to restrict an intersection member in return position, or widen an intersection member in parameter position:

class A {}
class B extends A {}
interface X {}
 
class Test {
    public function param1(B&X $param) {}
    public function param2(A&X $param) {}
 
    public function return1(): A&X {}
    public function return2(): B&X {}
}
 
class Test2 extends Test {
    public function param1(A&X $param) {} // Allowed: Widening intersection member B -> A
    public function param2(B&X $param) {} // FORBIDDEN: Restricting intersection member A -> B
 
    public function return1(): B&X {}     // Allowed: Restricting intersection member A -> B
    public function return2(): A&X {}     // FORBIDDEN: Widening intersection member B -> A
}

Of course, the same can also be done with multiple intersection members at a time, and be combined with the addition/removal of types mentioned previously.

Variance of intersection type to concrete class type

As the primary use of intersection types is to ensure multiple interfaces are implemented, a concrete class or interface which implements all the interfaces present in the intersection is considered a subtype and thus can be used where covariance is allowed.

interface X {}
interface Y {}
 
class TestOne implements X, Y {}
 
interface A
{
    public function foo(): X&Y;
}
 
 
interface B extends A
{
    public function foo(): TestOne;
}

Moreover, it is possible to use a union type of concrete classes/interfaces when each member of the union implements all of the interfaces in the intersection.

class TestTwo implements X, Y {}
 
interface C extends A
{
    public function foo(): TestOne|TestTwo;
}

The reason why this is possible is that a union of concrete classes/interfaces is less general than the set of possible classes which satisfy the intersection type.

Coercive typing mode

As standard types are not allowed in pure intersection types, no consideration for the coercive typing mode needs to be made.

Property types and references

References to typed properties with intersection types follow the semantics outlined in the typed properties RFC:

If typed properties are part of the reference set, then the value is checked against each property type. If a type check fails, a TypeError is generated and the value of the reference remains unchanged.

interface X {}
interface Y {}
interface Z {}
 
class A implements X, Y, Z {}
class B implements X, Y {}
 
class Test {
    public X&Y $y;
    public X&Z $z;
}
$test = new Test;
$r = new A;
$test->y =& $r;
$test->z =& $r;
 
// Reference set: { $r, $test->y, $test->z }
// Types: { A, X&Y, X&Z }
 
$r = new B;
// TypeError: Cannot assign B to reference held by property Test::$z of type X&Z
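The statement that the value of the reference remains unchanged can be checked with a small runnable restatement of the example above. This is a sketch assuming PHP 8.1 or later; the variables $original and $threw are added here only for illustration:

```php
<?php
interface X {}
interface Y {}
interface Z {}

class A implements X, Y, Z {}
class B implements X, Y {}

class Test {
    public X&Y $y;
    public X&Z $z;
}

$test = new Test;
$r = new A;
$test->y =& $r;
$test->z =& $r;

$original = $r; // keep a handle to the A instance for comparison
$threw = false;
try {
    $r = new B; // B does not implement Z, so Test::$z rejects it
} catch (TypeError $e) {
    $threw = true;
}

var_dump($threw);           // bool(true)
var_dump($r === $original); // bool(true): the reference still holds the A instance
```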

Reflection

To support intersection types, a new class ReflectionIntersectionType is added:

class ReflectionIntersectionType extends ReflectionType {
    /** @return ReflectionType[] */
    public function getTypes();
 
    /* Inherited from ReflectionType */
    /** @return bool */
    public function allowsNull();
 
    /* Inherited from ReflectionType */
    /** @return string */
    public function __toString();
}

The getTypes() method returns an array of ReflectionTypes that are part of the intersection. The types may be returned in an arbitrary order that does not match the original type declaration. The types may also be subject to equivalence transformations.

For example, the type X&Y may return types in the order ["Y", "X"] instead. The only requirement on the Reflection API is that the ultimately represented type is equivalent.

The __toString() method returns a string representation of the type that constitutes a valid code representation of the type in a non-namespaced context. It is not necessarily the same as what was used in the original code.

Examples

// This is one possible output, getTypes() and __toString() could
// also provide the types in the reverse order instead.
function test(): A&B {}
$rt = (new ReflectionFunction('test'))->getReturnType();
var_dump(get_class($rt));    // "ReflectionIntersectionType"
var_dump($rt->allowsNull()); // false
var_dump($rt->getTypes());   // [ReflectionType("A"), ReflectionType("B")]
var_dump((string) $rt);      // "A&B"
 
function test2(): A&B&C {}
$rt = (new ReflectionFunction('test2'))->getReturnType();
var_dump(get_class($rt));    // "ReflectionIntersectionType"
var_dump($rt->allowsNull()); // false
var_dump($rt->getTypes());   // [ReflectionType("A"), ReflectionType("B"),
                             //  ReflectionType("C")]
var_dump((string) $rt); // "A&B&C"

Backward Incompatible Changes

This RFC does not contain any backwards incompatible changes.

However, existing ReflectionType based code might need to be adjusted in order to support processing of code that uses intersection types.
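As an illustration of such an adjustment, the following sketch collects the class names from a reflected type and treats ReflectionIntersectionType alongside the existing ReflectionNamedType and ReflectionUnionType. It assumes PHP 8.1 or later; typeNames() and countAll() are hypothetical helpers, and the result is sorted because getTypes() does not guarantee an order:

```php
<?php
// Returns the sorted list of type names contained in a reflected type.
function typeNames(?ReflectionType $type): array {
    if ($type === null) {
        return [];
    }
    if ($type instanceof ReflectionNamedType) {
        return [$type->getName()];
    }
    if ($type instanceof ReflectionUnionType
        || $type instanceof ReflectionIntersectionType) {
        $names = [];
        foreach ($type->getTypes() as $inner) {
            $names = array_merge($names, typeNames($inner));
        }
        sort($names);
        return $names;
    }
    throw new LogicException('Unhandled reflection type: ' . get_class($type));
}

function countAll(Traversable&Countable $items): int {
    return count($items);
}

$type = (new ReflectionFunction('countAll'))->getParameters()[0]->getType();
var_dump($type instanceof ReflectionIntersectionType); // bool(true)
echo implode(', ', typeNames($type)), PHP_EOL;         // prints "Countable, Traversable"
```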

Proposed PHP Version

Next minor version, i.e. PHP 8.1.

Future Scope

The features discussed in the following are not part of this proposal.

Composite types (i.e. mixing union and intersection types)

While early prototyping [2] shows that supporting A&B|C without any grouping looks feasible, there are still many other considerations (e.g. Reflection). Chief among them are the variance rules and checks, whose complexity would increase dramatically and become error-prone.

There is also the opinion that composite types should not rely on precedence of unions but be explicitly grouped together.

As such, we consider a stepped approach that first allows only pure intersection types to be the best way forward.

Type Aliases

As types become increasingly complex, it may be worthwhile to allow reusing type declarations. There are two general ways in which this could work. One is a local alias, such as:

use Traversable&Countable as CountableIterator;
 
function foo(CountableIterator $x) {}

In this case CountableIterator is a symbol that is only visible locally and will be resolved to the original Traversable&Countable type during compilation.

The second possibility is an exported typedef:

namespace Foo;
type CountableIterator = Traversable&Countable;
 
// Usable as \Foo\CountableIterator from elsewhere

It should be noted that inclusion of this proposal will add extra considerations for type aliases as it would be possible to write composite types as if grouping was supported. However, the groundwork for supporting this is present in this proposal.

Proposed Voting Choices

As per the voting RFC a yes/no vote with a 2/3 majority is needed for this proposal to be accepted.

Implementation

Implemented in PHP 8.1:

  • docs: TBD

Acknowledgements

To Ilija Tovilo for resolving the parser conflict with by-ref parameters.

To Nikita Popov for reviewing and refactoring the variance code.

Introduction

The purpose of inheritance is code reuse, for when you have a class that shares common functionality, and you want others to be able to extend it and make use of this functionality in their own class.

However, when a class in your code base shares some implementation detail between two or more other classes, your only protection against others using it is to add an `@internal` annotation, which offers no runtime guarantee that no one extends it.

Internally, PHP has the `Throwable` interface, which defines common functionality between `Error` and `Exception` and is implemented by both. End users, however, are not allowed to implement `Throwable` directly.

Currently PHP special-cases `Throwable`. This RFC proposes making this kind of functionality available to end users as well, so that `Throwable` is no longer a special case.

Proposal

Support for sealed classes is added through a new modifier `sealed` and a new `permits` clause that comes after `extends` and `implements`.

sealed class Shape permits Circle, Square, Rectangle {}
 
final class Circle extends Shape {} // ok
final class Square extends Shape {} // ok
final class Rectangle extends Shape {} // ok
 
final class Triangle extends Shape {} // Fatal error: Class Triangle cannot extend sealed class Shape.
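For comparison, the closest userland approximation available without the proposed syntax is a runtime check in the parent constructor. The sketch below is not part of the proposal and is weaker than it: the check runs only when an object is instantiated, not when the class is declared, and a child that overrides the constructor without calling the parent bypasses it. The PERMITTED constant is a hypothetical name:

```php
<?php
abstract class Shape {
    // Hand-maintained list standing in for the proposed `permits` clause.
    private const PERMITTED = [Circle::class, Square::class];

    public function __construct() {
        if (!in_array(static::class, self::PERMITTED, true)) {
            throw new LogicException(sprintf(
                'Class %s cannot extend sealed class %s.',
                static::class,
                self::class
            ));
        }
    }
}

final class Circle extends Shape {}
final class Square extends Shape {}
final class Triangle extends Shape {} // declaring this is not an error here

$circle = new Circle(); // ok

try {
    new Triangle();
} catch (LogicException $e) {
    echo $e->getMessage(), PHP_EOL; // prints "Class Triangle cannot extend sealed class Shape."
}
```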

An interface that is sealed can be implemented directly only by the classes named in the `permits` clause.

namespace Psl\Result {
  sealed interface ResultInterface permits Success, Failure { ... }
 
  final class Success implements ResultInterface { ... }
  final class Failure implements ResultInterface { ... }
 
  function wrap(callable $callback): ResultInterface { ... }
 
  function unwrap(ResultInterface $result): mixed
  {    
    return match($result::class) {
      Success::class => $result->value(),
      Failure::class => throw $result->error(),
    }; // no need for default, it's not possible.
  }
 
}
 
namespace App {
  use Psl\Result;
 
  // Fatal error: Class App\Maybe cannot implement sealed interface Psl\Result\ResultInterface.
  final class Maybe implements Result\ResultInterface {}
}

Similarly, a trait that is sealed can only be used by the classes named in the `permits` clause.

The following example is taken from the Symfony Cache component:

namespace Symfony\Component\Cache\Traits {
  use Symfony\Component\Cache\Adapter\FilesystemAdapter;
  use Symfony\Component\Cache\Adapter\FilesystemTagAwareAdapter;
  use Symfony\Component\Cache\Adapter\PhpFilesAdapter;
 
  sealed trait FilesystemCommonTrait permits FilesystemTrait, PhpFilesAdapter { ... }
  sealed trait FilesystemTrait permits FilesystemAdapter, FilesystemTagAwareAdapter {
   use FilesystemCommonTrait; // ok
   ...
  }
}
 
namespace Symfony\Component\Cache\Adapter {
   use Symfony\Component\Cache\Traits\FilesystemTrait;
 
   final class FilesystemAdapter {
     use FilesystemTrait; // ok
     ...
   }
 
   final class FilesystemTagAwareAdapter {
     use FilesystemTrait; // ok
     ...
   }
}
 
namespace App\Cache {
    use Symfony\Component\Cache\Traits\FilesystemTrait;
 
    // Error: Class App\Cache\MyFilesystemAdapter may not use sealed trait (Symfony\Component\Cache\Traits\FilesystemTrait)
    final class MyFilesystemAdapter {
      use FilesystemTrait;
    }
 
    // Error: Trait App\Cache\MyFilesystemTrait may not use sealed trait (Symfony\Component\Cache\Traits\FilesystemTrait)
    trait MyFilesystemTrait {
      use FilesystemTrait;
    }
}

Syntax

Some people might be against introducing new keywords into the language, since `sealed` and `permits` would no longer be valid class names. Therefore, a second vote will take place to decide which syntax should be used.

The available options are the following:

1. using `sealed`+`permits`:

sealed class Foo permits Bar, Baz {}
 
sealed interface Qux permits Quux, Quuz {}
 
sealed trait Corge permits Grault, Garply {}

2. using `permits` only:

class Foo permits Bar, Baz {}
 
interface Qux permits Quux, Quuz {}
 
trait Corge permits Grault, Garply {}

3. using pre-reserved `for` keyword:

class Foo for Bar, Baz {}
 
interface Qux for Quux, Quuz {}
 
trait Corge for Grault, Garply {}

Backward Incompatible Changes

`sealed` and `permits` become reserved keywords in PHP 8.1.

Proposed PHP Version(s)

PHP 8.1

RFC Impact

To Opcache

TBD

To Reflection

The following additions will be made to expose the new flag via reflection:

  • New constant ReflectionClass::IS_SEALED to expose the bit flag used for sealed classes

  • The return value of ReflectionClass::getModifiers() will have this bit set if the class being reflected is sealed

  • Reflection::getModifierNames() will include the string “sealed” if this bit is set

  • A new ReflectionClass::isSealed() method will allow directly checking if a class is sealed

  • A new ReflectionClass::getPermittedClasses() method will return the list of class names allowed in the `permits` clause.

Proposed Voting Choices

As this is a language change, a 2/3 majority is required.

Introduction

PHP is currently having problems with RNG reproducibility.

PHP's RNG has been unified into an implementation using the Mersenne twister, with the rand() and srand() functions becoming aliases for mt_rand() and mt_srand() respectively in PHP 7.1.

However, these functions still store their state in PHP's global state, so their results are not easily reproducible. Consider the following example.

echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1747253290
 
function foo(int $seed, callable $bar): int {
    mt_srand($seed);
    $result = mt_rand();
    $bar();
    $result += mt_rand();
    return $result;
}

As shown above, the reproducibility of random numbers can easily be lost if additional processing is added later.

In addition, fibers were introduced in PHP 8.1, which makes it more difficult to keep track of the execution order. This problem has, however, existed since the introduction of generators.

There is also the problem of functions that implicitly use the state stored in PHP's global state. The shuffle(), str_shuffle(), and array_rand() functions implicitly advance the state of the random number generator. This means that the following code is not reproducible, but it is difficult for the user to notice this.

mt_srand(1234);
echo mt_rand() . PHP_EOL; // Result: 411284887
 
mt_srand(1234);
str_shuffle('foobar');
echo mt_rand() . PHP_EOL; // Result: 1314500282

Proposal

Implement a Random extension and bundle it with PHP.

The pseudo-implementation for the whole extension is as follows:


 
namespace Random
{
    interface NumberGenerator
    {
        public function generate(): int;
    }
}
 
namespace Random\NumberGenerator
{
    class XorShift128Plus implements \Random\NumberGenerator
    {
        public function __construct(?int $seed = null) {}
        public function generate(): int {}
        public function __serialize(): array {}
        public function __unserialize(array $data): void {}
    }
 
    class MT19937 implements \Random\NumberGenerator
    {
        public function __construct(?int $seed = null) {}
        public function generate(): int {}
        public function __serialize(): array {}
        public function __unserialize(array $data): void {}
    }
 
    class Secure implements \Random\NumberGenerator
    {
        public function __construct() {}
        public function generate(): int {}
    }
}
 
namespace
{
    final class Random
    {
        private Random\NumberGenerator $rng;
 
        public function __construct(?Random\NumberGenerator $rng = null) {}
        public function getNumberGenerator(): Random\NumberGenerator {}
        public function getInt(int $min, int $max): int {}
        public function getBytes(int $length): string {}
        public function shuffleArray(array $array): array {}
        public function shuffleString(string $string): string {}
        public function __serialize(): array {}
        public function __unserialize(array $data): void {}
    }
}

Each RNG is implemented as a class in the Random\NumberGenerator namespace. They all implement the Random\NumberGenerator interface.

The bundled RNGs are as follows:

  • Random\NumberGenerator\XorShift128Plus: 64-bit, reproducible, PRNG.

  • Random\NumberGenerator\MT19937: 32-bit, reproducible, PRNG, compatible with mt_srand() / mt_rand().

  • Random\NumberGenerator\Secure: 64-bit, non-reproducible, CSPRNG, uses php_random_bytes() internally.

The Random class uses XorShift128+ by default. It can generate 64-bit values, is used by major browsers, and is fast and reliable. However, when XorShift128+ is used in a 32-bit environment, the upper 32 bits are always truncated. This means that compatibility cannot be maintained between platforms, but this is not a problem in practice, since most platforms running PHP today are 64-bit and MT19937 can be used explicitly if compatibility is required.

Note that (new Random\NumberGenerator\MT19937($seed))->generate() requires an additional bit shift to get a result equivalent to mt_rand(). mt_rand() implicitly did the bit-shifting internally, but there was no obvious reason for this.

Secure is practically equivalent to random_int() and random_bytes(). It is useful when secure array or string shuffling is required.

The Random class also supports RNGs defined in userland: pass an instance of a class that implements the Random\NumberGenerator interface as the first constructor argument. This is useful for unit testing or when you want to use a fixed sequence of numbers.

class UserDefinedRNG implements Random\NumberGenerator
{
    protected int $current = 0;
 
    public function generate(): int
    {
        return ++$this->current;
    }
}
 
function foobar(Random\NumberGenerator $numberGenerator): void {
    for ($i = 0; $i < 9; $i++) {
        echo $numberGenerator->generate();
    }
}
 
foobar(new UserDefinedRNG()); // Results: 123456789

As with the existing MT-based functions, the Random class provides alternative APIs for common operations.

The Random class can be serialized using the standard PHP serialization mechanism. However, if the $rng member is not serializable, an Exception is thrown.

// serialize
$foo = new Random(new Random\NumberGenerator\XorShift128Plus());
for ($i = 0; $i < 10; $i++) { $foo->getInt(PHP_INT_MIN, PHP_INT_MAX); }
var_dump(unserialize(serialize($foo))->getInt(PHP_INT_MIN, PHP_INT_MAX) === $foo->getInt(PHP_INT_MIN, PHP_INT_MAX)); // true
 
// can't serialize
$foo = new Random(new Random\NumberGenerator\Secure());
for ($i = 0; $i < 10; $i++) { $foo->getInt(PHP_INT_MIN, PHP_INT_MAX); }
var_dump(unserialize(serialize($foo))->getInt(PHP_INT_MIN, PHP_INT_MAX) === $foo->getInt(PHP_INT_MIN, PHP_INT_MAX)); // throws Exception: Serialization of CLASS is not allowed.

It is not possible to clone the Random class; attempting to do so always throws an Error (Error: Trying to clone an uncloneable object of class Random). This is because PHP's standard clone operation is shallow: the clone would share the same $rng object as the original, which would be unintended behavior for most users. Instead, you can use the getNumberGenerator() method to retrieve the internal RNG instance, which can be cloned.

$foo = new Random();
 
// can't direct clone
// $bar = clone $foo;
 
// safe
$bar = new Random(clone $foo->getNumberGenerator());
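The reason a shallow clone would be surprising can be illustrated with userland stand-ins. The sketch below does not use the proposed extension; the NumberGenerator interface, Counter, and Wrapper are hypothetical classes that only mimic the shape of the proposed API:

```php
<?php
// Hypothetical stand-in for the proposed Random\NumberGenerator interface.
interface NumberGenerator {
    public function generate(): int;
}

// Deterministic generator: returns 1, 2, 3, ...
final class Counter implements NumberGenerator {
    private int $current = 0;

    public function generate(): int {
        return ++$this->current;
    }
}

// Hypothetical stand-in for the proposed Random class, but left cloneable.
final class Wrapper {
    public function __construct(private NumberGenerator $rng) {}

    public function next(): int {
        return $this->rng->generate();
    }

    public function getNumberGenerator(): NumberGenerator {
        return $this->rng;
    }
}

// Shallow clone: both wrappers share one Counter, so their sequences interleave.
$a = new Wrapper(new Counter());
$shallow = clone $a;
$a->next();                       // 1
$sharedResult = $shallow->next(); // 2, because the state is shared

// Cloning the generator itself yields an independent copy of the state.
$b = new Wrapper(new Counter());
$independent = new Wrapper(clone $b->getNumberGenerator());
$b->next();                                // 1
$independentResult = $independent->next(); // 1, the copy has its own state

var_dump($sharedResult, $independentResult); // int(2) int(1)
```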

Using this feature, the first example can be rewritten as follows:

echo foo(1234, function (): void {}) . PHP_EOL; // Result: 1480009472
echo foo(1234, function (): void { mt_rand(); }) . PHP_EOL; // Result: 1480009472
 
function foo(int $seed, callable $bar): int {
    $numberGenerator = new Random\NumberGenerator\MT19937($seed);
    $result = ($numberGenerator->generate() >> 1); // requires bit-shift for compatibility.
    $bar();
    $result += ($numberGenerator->generate() >> 1); // requires bit-shift for compatibility.
    return $result;
}

Future Scope

This RFC will be the basis for making PHP RNGs safe in the future.

By first accepting this RFC, PHP gains random number generation with locally scoped state.

The Random class can also be used when new features are implemented that use random numbers. This has the effect of discouraging more implementations from using random numbers that depend on the global scope.

Further in the future, we can consider doing away with functions such as mt_srand(). These functions are simple and convenient, but they may unintentionally create implementations that depend on global scope.

Backward Incompatible Changes

The following class names will no longer be available to userland code:

  • “Random”

  • “Random\NumberGenerator”

  • “Random\NumberGenerator\XorShift128Plus”

  • “Random\NumberGenerator\MT19937”

  • “Random\NumberGenerator\Secure”

Proposed PHP Version(s)

8.1

RFC Impact

To SAPIs

none

To Existing Extensions

none

To Opcache

none

New Constants

none

php.ini Defaults

none

Open Issues

none

Vote

Voting opens 2021-MM-DD and closes 2021-MM-DD at 00:00:00 EDT. A 2/3 majority is required to accept.

Introduction

This RFC introduces Enumerations to PHP. The scope of this RFC is limited to “unit enumerations,” that is, enumerations that are themselves a value, rather than simply a fancy syntax for a primitive constant, and do not include additional associated information. This capability offers greatly expanded support for data modeling, custom type definitions, and monad-style behavior. Enums enable the modeling technique of “make invalid states unrepresentable,” which leads to more robust code with less need for exhaustive testing.

Many languages have support for enumerations of some variety. A survey we conducted of various languages found that they could be categorized into three general groups: Fancy Constants, Fancy Objects, and full Algebraic Data Types (ADTs).

This RFC is part of a larger effort to introduce full Algebraic Data Types. It implements the “Fancy Objects” variant of enumerations in such a way that it may be extended to full ADTs by future RFCs. It draws both conceptually and semantically from Swift, Rust, and Kotlin, although it is not directly modeled on any of them.

The most popular case of enumerations is boolean, which is an enumerated type with legal values true and false. This RFC allows developers to define their own arbitrarily robust enumerations.

Proposal

Enumerations are built on top of classes and objects. That means, except where otherwise noted, “how would Enums behave in situation X” can be answered “the same as any other object instance.” They would, for example, pass an object type check. Enum names are case-insensitive, but subject to the same caveat about autoloading on case-sensitive file systems that already applies to classes generally. Case names are internally implemented as class constants, and thus are case-sensitive.

Basic enumerations

This RFC introduces a new language construct, enum. Enums are similar to classes, and share the same namespaces as classes, interfaces, and traits. They are also autoloadable the same way. An Enum defines a new type, which has a fixed, limited number of possible legal values.

enum Suit {
  case Hearts;
  case Diamonds;
  case Clubs;
  case Spades;
}

This declaration creates a new enumerated type named Suit, which has four and only four legal values: Suit::Hearts, Suit::Diamonds, Suit::Clubs, and Suit::Spades. Variables may be assigned to one of those legal values. A function may be type checked against an enumerated type, in which case only values of that type may be passed.

function pick_a_card(Suit $suit) { ... }
 
$val = Suit::Diamonds;
 
pick_a_card($val);        // OK
pick_a_card(Suit::Clubs); // OK
pick_a_card('Spades');    // TypeError: pick_a_card(): Argument #1 ($suit) must be of type Suit, string given

An Enumeration may have zero or more case definitions, with no maximum. A zero-case enum is syntactically valid, if rather useless.

By default, cases are not intrinsically backed by a scalar value. That is, Suit::Hearts is not equal to 0. Instead, each case is backed by a singleton object of that name. That means that:

$a = Suit::Spades;
$b = Suit::Spades;
 
$a === $b; // true
 
$a instanceof Suit;  // true

It also means that enum values are never < or > each other, since those comparisons are not meaningful on objects. Those comparisons will always return false when working with enum values.
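The identity and ordering behavior described here can be checked with a short sketch, assuming PHP 8.1 or later. The two-case Suit below is a shortened version of the enum defined earlier:

```php
<?php
enum Suit {
    case Hearts;
    case Spades;
}

var_dump(Suit::Hearts === Suit::Hearts); // bool(true): each case is a singleton
var_dump(Suit::Hearts === Suit::Spades); // bool(false)
var_dump(Suit::Hearts < Suit::Spades);   // bool(false): cases are not ordered
var_dump(Suit::Hearts > Suit::Spades);   // bool(false)
```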

This type of case, with no related data, is called a “Pure Case.” An Enum that contains only Pure Cases is called a Pure Enum.

All Pure Cases are implemented as instances of their enum type. The enum type is represented internally as a class.

All Cases have a read-only property, name, that is the case-sensitive name of the case itself. That is largely an implementation artifact, but may also be used for debugging purposes.

print Suit::Spades->name;
// prints "Spades"

Backed Enums

By default, Enumerated Cases have no scalar equivalent. They are simply singleton objects. However, there are ample cases where an Enumerated Case needs to be able to round-trip to a database or similar datastore, so having a built-in scalar (and thus trivially serializable) equivalent defined intrinsically is useful.

To define a scalar equivalent for an Enumeration, the syntax is as follows:

enum Suit: string {
  case Hearts = 'H';
  case Diamonds = 'D';
  case Clubs = 'C';
  case Spades = 'S';
}

A case that has a scalar equivalent is called a Backed Case, as it is “Backed” by a simpler value. An Enum that contains all Backed Cases is called a “Backed Enum.” A Backed Enum may contain only Backed Cases. A Pure Enum may contain only Pure Cases.

A Backed Enum may be backed by types of int or string, and a given enumeration supports only a single type at a time. (That is, no union of int|string.) If an enumeration is marked as having a scalar equivalent, then all cases must have a unique scalar equivalent defined explicitly. There are no auto-generated scalar equivalents (e.g., sequential integers). Value cases must be unique; two backed enum cases may not have the same scalar equivalent. (However, a constant may refer to a case, effectively creating an alias.)

Equivalent values must be literals or literal expressions. Constants and constant expressions are not supported. That is, 1+1 is allowed, but 1 + SOME_CONST is not. This is primarily due to implementation complexity. (See Future Scope below.)

Value Cases have an additional read-only property, value, which is the value specified in the definition.

print Suit::Clubs->value;
// Prints "C"

In order to enforce the value property as read-only, a variable cannot be assigned as a reference to it. That is, the following throws an error:

$suit = Suit::Clubs;
$ref = &$suit->value;
// Error: Cannot acquire reference to property Suit::$value

Backed enums implement an internal BackedEnum interface, which exposes two additional methods:

  • from(int|string): self will take a scalar and return the corresponding Enum Case. If one is not found, it will throw a ValueError. This is mainly useful in cases where the input scalar is trusted and a missing enum value should be considered an application-stopping error.

  • tryFrom(int|string): ?self will take a scalar and return the corresponding Enum Case. If one is not found, it will return null. This is mainly useful in cases where the input scalar is untrusted and the caller wants to implement their own error handling or default-value logic.

The “tryX” idiom is common in C# and Rust (albeit in somewhat different ways) to indicate that the result may be null/optional. It would be new to PHP, but not incompatible with any current conventions.

The from() and tryFrom() methods follow standard weak/strong typing rules. In weak typing mode, passing an integer or string is acceptable and the system will coerce the value accordingly. Passing a float will also work and be coerced. In strict typing mode, passing an integer to from() on a string-backed enum (or vice versa) will result in a TypeError, as will a float in all circumstances. All other parameter types will throw a TypeError in both modes.

$record = get_stuff_from_database($id);
print $record['suit'];
 
$suit = Suit::from($record['suit']);
// Invalid data throws a ValueError: "X" is not a valid scalar value for enum "Suit"
print $suit->value;
 
$suit = Suit::tryFrom('A') ?? Suit::Spades;
// Invalid data returns null, so Suit::Spades is used instead.
print $suit->value;

Manually defining a from() or tryFrom() method on a Backed Enum will result in a fatal error.
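
The difference between the two lookup methods can be sketched as follows. The parse_suit() helper is hypothetical and only illustrates the "untrusted input with a default" pattern described above; the exact text of the ValueError message is printed rather than assumed.

```php
<?php
enum Suit: string {
  case Hearts = 'H';
  case Diamonds = 'D';
  case Clubs = 'C';
  case Spades = 'S';
}

// Hypothetical helper: parse untrusted input, falling back to a default case.
function parse_suit(string $raw, Suit $default = Suit::Spades): Suit {
  return Suit::tryFrom($raw) ?? $default;
}

// from() treats an unknown scalar as an error and throws a ValueError.
try {
  Suit::from('X');
  $message = 'no error';
} catch (ValueError $e) {
  $message = $e->getMessage();
}

print parse_suit('H')->name . "\n"; // prints "Hearts"
print parse_suit('?')->name . "\n"; // prints "Spades"
print $message . "\n";
```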

Enumerated Methods

Enums (both Pure Enums and Backed Enums) may contain methods, and may implement interfaces. If an Enum implements an interface, then any type check for that interface will also accept all cases of that Enum.

interface Colorful {
  public function color(): string;
}
 
enum Suit implements Colorful {
  case Hearts;
  case Diamonds;
  case Clubs;
  case Spades;
 
  // Fulfills the interface contract.
  public function color(): string {
    return match($this) {
      Suit::Hearts, Suit::Diamonds => 'Red',
      Suit::Clubs, Suit::Spades => 'Black',
    };
  }
 
  // Not part of an interface; that's fine.
  public function shape(): string {
    return "Rectangle";
  }
}
 
function paint(Colorful $c) { ... }
 
paint(Suit::Clubs);  // Works
 
print Suit::Diamonds->shape(); // prints "Rectangle"

In this example, all four instances of Suit have two methods, color() and shape(). As far as calling code and type checks are concerned, they behave exactly the same as any other object instance.

Inside a method, the $this variable is defined and refers to the Case instance.

Methods may be arbitrarily complex, but in practice will usually return a static value or match on $this to provide different results for different cases.

Note that in this case it would be a better data modeling practice to also define a SuitColor Enum Type with values Red and Black and return that instead. However, that would complicate this example.
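
For completeness, the alternative modeling mentioned above might look like the following sketch. The SuitColor enum is named in the text but not defined by this RFC, so its exact shape here is an assumption.

```php
<?php
// Assumed shape of the SuitColor enum mentioned in the text.
enum SuitColor {
  case Red;
  case Black;
}

enum Suit {
  case Hearts;
  case Diamonds;
  case Clubs;
  case Spades;

  // Returns an enum case instead of a free-form string.
  public function color(): SuitColor {
    return match($this) {
      Suit::Hearts, Suit::Diamonds => SuitColor::Red,
      Suit::Clubs, Suit::Spades => SuitColor::Black,
    };
  }
}

print Suit::Clubs->color()->name; // prints "Black"
```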

The above hierarchy is logically similar to the following class structure (although this is not the actual code that runs):

interface Colorful {
  public function color(): string;
}
 
final class Suit implements UnitEnum, Colorful {
 
  public const Hearts = new self('Hearts');
  public const Diamonds = new self('Diamonds');
  public const Clubs = new self('Clubs');
  public const Spades = new self('Spades');
 
  private function __construct(public string $name) {}
 
  public function color(): string {
    return match($this) {
      Suit::Hearts, Suit::Diamonds => 'Red',
      Suit::Clubs, Suit::Spades => 'Black',
    };
  }
 
  public function shape(): string {
    return "Rectangle";
  }
 
  public static function cases(): array {
    // See below.
  }
}

The case instance objects may be assigned to constants because they are created internally in the engine rather than in user-space. Additionally, the differentiating flag for each case is not actually a constructor parameter.

Methods may be public, private, or protected, although in practice private and protected are equivalent as inheritance is not allowed.

Enumeration static methods

Enumerations may also have static methods. The use for static methods on the enumeration itself is primarily for alternative constructors. E.g.:

enum Size {
  case Small;
  case Medium;
  case Large;
 
  public static function fromLength(int $cm) {
    return match(true) {
      $cm < 50 => static::Small,
      $cm < 100 => static::Medium,
      default => static::Large,
    };
  }
}

Static methods may be public, private, or protected, although in practice private and protected are equivalent as inheritance is not allowed.
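
A short usage sketch of such an alternative constructor follows. It repeats the Size enum so that it runs on its own, and uses self with an explicit return type, which is a stylistic assumption rather than something the RFC requires.

```php
<?php
enum Size {
  case Small;
  case Medium;
  case Large;

  // Alternative constructor: maps a length in centimeters to a case.
  public static function fromLength(int $cm): self {
    return match(true) {
      $cm < 50 => self::Small,
      $cm < 100 => self::Medium,
      default => self::Large,
    };
  }
}

$sizes = array_map(fn(int $cm) => Size::fromLength($cm)->name, [10, 50, 250]);
print implode(', ', $sizes); // prints "Small, Medium, Large"
```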

Enumeration constants

Enumerations may include constants, which may be public, private, or protected, although in practice private and protected are equivalent as inheritance is not allowed.

An enum constant may refer to an enum case:

enum Size {
  case Small;
  case Medium;
  case Large;
 
  public const Huge = self::Large;
}
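
As a quick check of what such an alias means in practice, the sketch below (assuming PHP 8.1 or later) shows that the constant is identical to the case it refers to and does not add a new case.

```php
<?php
enum Size {
  case Small;
  case Medium;
  case Large;

  public const Huge = self::Large;
}

var_dump(Size::Huge === Size::Large); // bool(true): the constant is the same singleton
var_dump(Size::Huge->name);           // string(5) "Large"
var_dump(count(Size::cases()));       // int(3): the alias is not listed as a case
```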

Traits

Enumerations may leverage traits, which will behave the same as on classes. The caveat is that traits used in an enum must not contain properties. They may only include methods and static methods. A trait with properties will result in a fatal error.

interface Colorful {
  public function color(): string;
}
 
trait Rectangle {
  public function shape(): string {
    return "Rectangle";
  }
}
 
enum Suit implements Colorful {
  use Rectangle;
 
  case Hearts;
  case Diamonds;
  case Clubs;
  case Spades;
 
  public function color(): string {
    return match($this) {
      Suit::Hearts, Suit::Diamonds => 'Red',
      Suit::Clubs, Suit::Spades => 'Black',
    };
  }
}

Enum values in constant expressions

Because cases are represented as constants on the enum itself, they may be used as static values in most constant expressions: property defaults, static variable defaults, parameter defaults, global and class constant values. They may not be used in other enum case values due to implementation complexity. (That restriction may be lifted in the future, but since they can be used by constants on an enum it is not a significant limitation.)

However, implicit magic method calls such as ArrayAccess on enums are not allowed in static or constant definitions as we cannot absolutely guarantee that the resulting value is deterministic or that the method invocation is free of side effects. Function calls, method calls, and property access continue to be invalid operations in constant expressions.

In code:

// This is an entirely legal Enum definition.
enum Direction implements ArrayAccess {
  case Up;
  case Down;
 
  public function offsetGet($val) { ... }
  public function offsetExists($val) { ... }
  public function offsetSet($offset, $val) { throw new Exception(); }
  public function offsetUnset($val) { throw new Exception(); }
}
 
class Foo {
  // This is allowed.
  const Bar = Direction::Down;
 
  // This is disallowed, as it may not be deterministic.
  const Baz = Direction::Up['short'];
  // Fatal error: Cannot use [] on enums in constant expression
}
 
// This is entirely legal, because it's not a constant expression.
$x = Direction::Up['short'];
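
The permitted positions listed above can be exercised together in one sketch. The Elevator class and the DEFAULT_DIRECTION constant are hypothetical names used only for illustration.

```php
<?php
enum Direction {
  case Up;
  case Down;
}

// Global constant value.
const DEFAULT_DIRECTION = Direction::Up;

class Elevator {
  // Class constant value.
  public const START = Direction::Down;

  // Property default.
  public Direction $heading = Direction::Up;

  // Parameter default.
  public function move(Direction $to = Direction::Down): Direction {
    // Static variable default.
    static $last = Direction::Up;

    $previous = $last;
    $last = $to;
    $this->heading = $to;

    return $previous;
  }
}

$elevator = new Elevator();
$first = $elevator->move();               // returns Direction::Up, heading becomes Down
$second = $elevator->move(Direction::Up); // returns Direction::Down, heading becomes Up
```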

Comparison to objects

Although Enums are implemented using classes under the hood and share much of their semantics, some object-style functionality is forbidden. These either do not make sense in the scope of enums, their value is debatable (but could be re-added in the future), or their semantics are unclear.

Specifically, the following features of objects are not allowed on enumerations:

  • Constructors - Not relevant without data/state.

  • Destructors - Not relevant without data/state.

  • Class/Enum inheritance. - Enums are by design a closed list, which inheritance would violate. (Interfaces are allowed, but not parent classes.)

  • Enum/Case properties - Properties are a form of state, and enum cases are stateless singletons. Metadata about an enum or case can always be exposed via methods.

  • Dynamic properties - Avoid state. Plus, they're a bad idea on classes anyway.

  • Magic methods except for those specifically listed below - Most of the excluded ones involve state.

  • Cloning of enum cases. Enum cases must be single instances in order to behave predictably.

If you need any of that functionality, classes as they already exist are the superior option.

The following object functionality is available, and behaves just as it does on any other object:

  • Public, private, and protected methods.

  • Public, private, and protected static methods.

  • Public, private, and protected constants.

  • __call, __callStatic, and __invoke magic methods

  • __CLASS__ and __FUNCTION__ constants behave as normal

The ::class magic constant on an Enum type evaluates to the type name including any namespace, exactly the same as an object. The ::class magic constant on a Case instance also evaluates to the Enum type, as it is an instance of that type.

Additionally, enum cases may not be instantiated directly with new, nor with newInstanceWithoutConstructor in reflection. Both will result in an error.

$clovers = new Suit();
// Error: Cannot instantiate enum Suit
$mace = (new ReflectionClass(Suit::class))->newInstanceWithoutConstructor();
// Error: Cannot instantiate enum Suit
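
Since these failures are thrown as errors, they can be observed at runtime. The following sketch (assuming PHP 8.1 or later) records which operations are rejected; it catches the generic Throwable for the reflection call because the concrete class of that error is not asserted here.

```php
<?php
enum Suit {
  case Hearts;
  case Spades;
}

$rejected = [];

// Cloning a case is forbidden.
try {
  $case = Suit::Hearts;
  $copy = clone $case;
} catch (Error $e) {
  $rejected[] = 'clone';
}

// Direct instantiation is forbidden.
try {
  $fresh = new Suit();
} catch (Error $e) {
  $rejected[] = 'new';
}

// So is instantiation through reflection.
try {
  $ghost = (new ReflectionClass(Suit::class))->newInstanceWithoutConstructor();
} catch (Throwable $e) {
  $rejected[] = 'reflection';
}

print implode(', ', $rejected); // prints "clone, new, reflection"
```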

Value listing

Both Pure Enums and Backed Enums implement an internal interface named UnitEnum. UnitEnum includes a static method cases(). cases() returns a packed array of all defined Cases in the order of declaration.

Suit::cases();
// Produces: [Suit::Hearts, Suit::Diamonds, Suit::Clubs, Suit::Spades]

Manually defining a cases() method on an Enum will result in a fatal error.
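
Because cases() returns an ordinary array, it composes with the usual array functions. A minimal sketch, using the backed Suit enum from earlier:

```php
<?php
enum Suit: string {
  case Hearts = 'H';
  case Diamonds = 'D';
  case Clubs = 'C';
  case Spades = 'S';
}

// Case names, in declaration order.
$names = array_map(fn(Suit $suit) => $suit->name, Suit::cases());

// Map of scalar value => case name.
$byValue = [];
foreach (Suit::cases() as $case) {
  $byValue[$case->value] = $case->name;
}

print implode(', ', $names); // prints "Hearts, Diamonds, Clubs, Spades"
```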

Non-iterable Enums are not yet supported, but are expected to be part of the future ADT/Tagged Union RFC. (Those will not have a finite set of possible values.)

Note that UnitEnum does not extend Iterator, as the enum case instances themselves are not iterable; it's the Enum type that is iterable. An Enum could implement Iterator or IteratorAggregate if it so chose, however.

Serialization

Enumerations are serialized differently from objects. Specifically, they have a new serialization code, “E”, that specifies the name of the enum case. The deserialization routine is then able to use that to set a variable to the existing singleton value. That ensures that:

Suit::Hearts === unserialize(serialize(Suit::Hearts));
 
print serialize(Suit::Hearts);
// E:11:"Suit:Hearts";

On deserialization, if an enum and case cannot be found to match a serialized value a warning will be issued and false returned. (That is standard existing behavior for unserialize().)

If a Pure Enum is serialized to JSON, an error will be thrown. If a Backed Enum is serialized to JSON, it will be represented by its value scalar only, in the appropriate type. The behavior of both may be overridden by implementing JsonSerializable.
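
The JSON and native serialization behavior described above can be sketched as follows, assuming PHP 8.1 or later. JSON_THROW_ON_ERROR is used so that the Pure Enum failure surfaces as a JsonException.

```php
<?php
enum Foo {
  case Bar;
}

enum Baz: int {
  case Beep = 5;
}

// A Backed Enum encodes as its scalar value, keeping the backing type.
$json = json_encode(['level' => Baz::Beep]);
print $json . "\n"; // prints {"level":5}

// A Pure Enum has no scalar form, so encoding fails.
try {
  json_encode(Foo::Bar, JSON_THROW_ON_ERROR);
  $pureEncodingFailed = false;
} catch (JsonException $e) {
  $pureEncodingFailed = true;
}
var_dump($pureEncodingFailed); // bool(true)

// Native serialization round-trips to the same singleton.
var_dump(unserialize(serialize(Foo::Bar)) === Foo::Bar); // bool(true)
```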

For print_r(), the output of an enum case has been modified to not confuse it with objects, although it is still similar to objects.

enum Foo {
  case Bar;
}
 
enum Baz: int {
  case Beep = 5;
}
 
print_r(Foo::Bar);
print_r(Baz::Beep);
Foo Enum (
  [name] => Bar
)
Baz Enum:int (
  [name] => Beep
  [value] => 5
)

Attributes

Enums and cases may have attributes attached to them, like any other language construct. The TARGET_CLASS target filter will include Enums themselves. The TARGET_CLASS_CONST target filter will include Enum Cases.

No engine-defined attributes are included. User-defined attributes can do whatever.

Match expressions

match expressions offer a natural and convenient way to branch logic depending on the enum value. Since every instance of an Enum is a singleton, it will always pass an identity check. Therefore:

$val = Suit::Diamonds;
 
$str = match ($val) {
  Suit::Spades => "The swords of a soldier",
  Suit::Clubs => "Weapons of war",
  Suit::Diamonds => "Money for this art",
  default => "The shape of my heart",
};

This usage requires no modification of match. It is a natural implication of the current functionality.
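
One consequence follows from how match already behaves rather than from anything this RFC adds: a match without a default arm throws an UnhandledMatchError when it meets a case it does not list. The color_of() helper below is hypothetical and deliberately omits Spades.

```php
<?php
enum Suit {
  case Hearts;
  case Diamonds;
  case Clubs;
  case Spades;
}

// Hypothetical helper: Spades is left out on purpose and there is no default arm.
function color_of(Suit $suit): string {
  return match ($suit) {
    Suit::Hearts, Suit::Diamonds => 'Red',
    Suit::Clubs => 'Black',
  };
}

print color_of(Suit::Clubs) . "\n"; // prints "Black"

try {
  color_of(Suit::Spades);
  $caught = false;
} catch (UnhandledMatchError $e) {
  $caught = true;
}
var_dump($caught); // bool(true)
```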

SplObjectStorage and WeakMaps

As objects, Enum cases cannot be used as keys in an array. However, they can be used as keys in a SplObjectStorage or WeakMap. Because they are singletons they never get garbage collected, and thus will never be removed from a WeakMap, making these two storage mechanisms effectively equivalent.

This usage requires no modification to SplObjectStorage or WeakMap. It is a natural implication of the current functionality.
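
A minimal sketch of both storage mechanisms keyed by enum cases, assuming PHP 8.1 or later; the stored values are arbitrary illustration data.

```php
<?php
enum Suit {
  case Hearts;
  case Spades;
}

// WeakMap keyed by enum cases.
$colors = new WeakMap();
$colors[Suit::Hearts] = 'Red';
$colors[Suit::Spades] = 'Black';

// SplObjectStorage keyed by enum cases, with attached data.
$counts = new SplObjectStorage();
$counts[Suit::Hearts] = 2;

print $colors[Suit::Spades] . "\n"; // prints "Black"
print $counts[Suit::Hearts] . "\n"; // prints "2"
print count($colors) . "\n";        // prints "2"
```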

Reflection

Enums are reflectable using a ReflectionEnum class, which extends ReflectionClass. Their cases are reflectable using ReflectionEnumUnitCase and ReflectionEnumBackedCase, which extend ReflectionClassConstant. They are defined as follows:

class ReflectionEnum extends ReflectionClass {
 
  // Returns true if there is a Case defined with that name.  
  // For instance, ''$r->hasCase('Hearts')'' returns true.
  public function hasCase(string $name): bool {}
 
  // Returns an array of ReflectionEnumUnitCase|ReflectionEnumBackedCase objects.
  public function getCases(): array {}
 
  // Returns a single reflection object for the corresponding case.
  // If not found, throws ReflectionException.
  public function getCase(string $name): ReflectionEnumUnitCase|ReflectionEnumBackedCase {}
 
  // True if this enum has a backing type, false otherwise.
  public function isBacked(): bool {}
 
  // Returns the type of the backing values of this enum, if any.
  // On a Pure Enum, returns null.
  public function getBackingType(): ?ReflectionType {}
}
 
class ReflectionEnumUnitCase extends ReflectionClassConstant {
 
  // Pre-existing. This will return the corresponding enum instance for this case.
  public function getValue() {}
 
  // Returns the ReflectionEnum instance for this case's enum class.
  public function getEnum(): ReflectionEnum {}
}
 
class ReflectionEnumBackedCase extends ReflectionEnumUnitCase {
 
  // Returns the scalar equivalent defined for the case.
  public function getBackingValue(): int|string {}
}

Additionally, a new function enum_exists(string $enum, bool $autoload = true): bool returns true if the value passed is the name of an Enum class.
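
The reflection API can be exercised as in the sketch below. It assumes PHP 8.1 or later, and it assumes the backing type is returned as a ReflectionNamedType so that getName() is available.

```php
<?php
enum Suit: string {
  case Hearts = 'H';
  case Spades = 'S';
}

$enum = new ReflectionEnum(Suit::class);

var_dump($enum->isBacked());                  // bool(true)
var_dump($enum->getBackingType()->getName()); // string(6) "string"
var_dump($enum->hasCase('Hearts'));           // bool(true)
var_dump($enum->hasCase('Clubs'));            // bool(false)

$case = $enum->getCase('Spades');
var_dump($case->getValue() === Suit::Spades); // bool(true): the enum instance
var_dump($case->getBackingValue());           // string(1) "S"

var_dump(enum_exists(Suit::class));           // bool(true)
var_dump(enum_exists(stdClass::class));       // bool(false): a class, not an enum
```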

Examples

Below are a few examples of Enums in action.

Basic limited values

enum SortOrder {
  case ASC;
  case DESC;
}
 
function query($fields, $filter, SortOrder $order = SortOrder::ASC) { ... }

The query() function can now proceed safe in the knowledge that $order is guaranteed to be either SortOrder::ASC or SortOrder::DESC. Any other value would have resulted in a TypeError, so no further error checking or testing is needed.

Advanced Exclusive values

enum UserStatus: string {
  case Pending = 'P';
  case Active = 'A';
  case Suspended = 'S';
  case CanceledByUser = 'C';
 
  public function label(): string {
    return match($this) {
      static::Pending => 'Pending',
      static::Active => 'Active',
      static::Suspended => 'Suspended',
      static::CanceledByUser => 'Canceled by user',
    };
  }
}

In this example, a user's status may be one of, and exclusively, UserStatus::Pending, UserStatus::Active, UserStatus::Suspended, or UserStatus::CanceledByUser. A function can type a parameter against UserStatus and then only accept those four values, period.

All four values have a label() method, which returns a human-readable string. That string is independent of the “machine name” scalar equivalent string, which can be used in, for example, a database field or an HTML select box.

foreach (UserStatus::cases() as $case) {
  printf('<option value="%s">%s</option>' . "\n", $case->value, $case->label());
}

New interfaces

As noted above, this RFC defines two additional internal interfaces. These interfaces are available to make it possible for user code to determine if a given object is an enumeration, and if so what type. User code may not implement or extend these interfaces directly.

interface UnitEnum {
  public string $name;
 
  public static function cases(): array;
}
 
interface BackedEnum extends UnitEnum {
  public string $value;
 
  public static function from(int|string $scalar): static;
  public static function tryFrom(int|string $scalar): ?static;
}
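
User code can use these interfaces to tell enum cases apart from other values. The describe() function below is a hypothetical helper; note that BackedEnum must be tested first, since every Backed Enum is also a UnitEnum.

```php
<?php
enum Foo {
  case Bar;
}

enum Baz: int {
  case Beep = 5;
}

// Hypothetical helper: classify an arbitrary value.
function describe(mixed $value): string {
  return match (true) {
    $value instanceof BackedEnum => sprintf('%s::%s backed by %s', $value::class, $value->name, var_export($value->value, true)),
    $value instanceof UnitEnum => sprintf('%s::%s (pure)', $value::class, $value->name),
    default => 'not an enum',
  };
}

print describe(Baz::Beep) . "\n"; // prints "Baz::Beep backed by 5"
print describe(Foo::Bar) . "\n";  // prints "Foo::Bar (pure)"
print describe('Bar') . "\n";     // prints "not an enum"
```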

Backward Incompatible Changes

“enum” becomes a language keyword, with the usual potential for naming conflicts with existing global constants and class/interface/trait names.

Thanks to a clever trick from Nikita (discussed after the RFC was approved), “enum” is not a reserved word on its own. That means it is still a legal name for a class/interface/trait at this time. It will likely be converted into a full keyword at some point in the future, but this RFC does not specify that timeline. As a side effect, comments are not supported between “enum” and the Enum name, which is of little consequence in practice.

The globally scoped internal interfaces UnitEnum and BackedEnum are defined.

The global function enum_exists is defined.

Future Scope

Grouped syntax

It would be possible, in the simple case, to allow multiple cases to be defined together, like so:

enum Suit {
  case Hearts, Diamonds, Clubs, Spades;
}

However, that may cause syntactic issues with the planned addition of tagged unions, which may or may not end up including per-case methods. Until that future extension is settled, we opted to skip this syntactic optimization. Grouped syntaxes have a somewhat controversial history anyway (they're not universally loved, and often unused entirely in many situations), and it's easy enough to add later if needed, so we have omitted that shorthand at this time. Once the dust settles, they may get added in the future.

Enums as array keys

Because they are objects, enum cases may not be used as keys in an associative array. It may be possible to support that in the future, but that is not covered at this time. For now, SplObjectStorage and WeakMaps are good enough.

Enum Sets

An enum set is the logical OR of two other cases. For instance, $red = Suit::Hearts | Suit::Diamonds. Those are not supported at this time.

Adding support for enum sets is a possibility for a future RFC, should an appropriate implementation be determined.

Auto-scalar conversion

Whether or not a Backed Enum can be viewed as “close enough” to its corresponding scalar value is debatable, and of debatable value. For instance, is a string-backed enum Stringable? Should an int type check accept an int-backed enum value? Should a string-backed enum work in a print statement? What about up-converting a scalar to its corresponding enum automatically?

The optimal behavior here, if any, will likely not become apparent until enums see widespread use. We have therefore opted to omit all auto-conversion at this time. If clear and compelling use cases for auto-conversion appear in the future, later PHP versions can re-introduce such auto-conversion in a more targeted, well-informed way.

Magic read-methods

The __get and __isset magic methods are likely safe, as they cannot manipulate state (or at least no more than any other method). They have been omitted at this time largely to avoid BC breaks in future planned extensions of enumerations, such as Tagged Unions/ADTs. (See the Meta RFC linked above.) It is possible that the introduction of associated values will require internal changes that result in additional property names becoming reserved. For that reason, we have for now omitted those potentially conflicting magic methods. In practice, there is no functionality they offer that couldn't be implemented using methods.

If, when the dust settles, it appears that __get would not cause a conflict after all, it may be permitted at a later date.

Constant-reference expression values

Currently, a Backed Enum value may only be a constant literal or an arithmetic expression involving only constant literals. They cannot reference other constant symbols, such as const constants or other Enum cases. That is not out of a lack of desire but simply because it turns out to be quite difficult to do. It's not a blocker for the remainder of the functionality listed here. If we or someone else can figure out how to make it work in the future it would be a good addition, but for now it is infeasible.

Voting

This is a simple yes/no vote to include Enumerations. 2/3 required to pass.

Voting started 2021-02-03 and closes 2021-02-17.

Introduction

The new custom object serialization mechanism RFC introduced new __serialize() and __unserialize() magic methods in PHP 7.4, with the intent of replacing the broken Serializable interface. This RFC finalizes that work by laying out a plan for the eventual removal of Serializable.

Please see the referenced RFC for a detailed discussion of why the Serializable interface is broken and needs to be replaced. Since PHP 7.4 a robust alternative mechanism exists, but some of the motivating issues will only be resolved once support for Serializable is dropped entirely.

Proposal

Serializable

A class is “only Serializable” if it is non-abstract, implements Serializable, and does not implement __serialize() and __unserialize(). Then:

  • In PHP 8.1, declaring an “only Serializable” class will throw a deprecation warning. Other implementations of Serializable will be accepted without a deprecation warning, because libraries supporting PHP < 7.4 will generally need to implement both the old and new mechanisms.

  • In PHP 9.0 the Serializable interface will be removed and unserialize() will reject payloads using the C serialization format. Code needing to support both PHP < 7.4 and PHP >= 9.0 may polyfill the Serializable interface, though it will have no effect on serialization.

If a class implements both Serializable and __serialize()/__unserialize(), the latter take precedence (on versions that support them), and the Serializable interface is only used to decode existing serialization payload using the obsolete C format. To migrate to the new mechanism, it's possible to either replace Serializable entirely (if support for PHP 7.3 and below is not needed) or to implement both (if it is needed).
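
A class that implements both mechanisms might look like the following sketch. The Point class and its JSON-based legacy payload are hypothetical; the sketch is written for PHP 8.x, where the Serializable interface still exists, and a library that really targets PHP 7.3 would have to avoid the newer syntax used here.

```php
<?php
// Hypothetical class implementing both the old and the new mechanism.
final class Point implements Serializable {
  public function __construct(private int $x = 0, private int $y = 0) {}

  // New mechanism (PHP 7.4+): takes precedence where supported.
  public function __serialize(): array {
    return ['x' => $this->x, 'y' => $this->y];
  }

  public function __unserialize(array $data): void {
    $this->x = $data['x'];
    $this->y = $data['y'];
  }

  // Legacy mechanism: only used for old versions and old "C" payloads.
  public function serialize(): ?string {
    return json_encode($this->__serialize());
  }

  public function unserialize(string $data): void {
    $this->__unserialize(json_decode($data, true));
  }
}

$point = new Point(1, 2);
$payload = serialize($point);
$copy = unserialize($payload);

var_dump(str_starts_with($payload, 'O:')); // bool(true): the new object format is used
var_dump($copy == $point);                 // bool(true)
```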

An earlier version of this RFC proposed an additional step: PHP 9.0 would deprecate all uses of Serializable (including those that are not “only Serializable”) and only remove the interface in PHP 10.0. However, this approach was deemed too complicated.

PDO::FETCH_SERIALIZE

PDO has a PDO::FETCH_SERIALIZE flag that can be used in conjunction with PDO::FETCH_CLASS. This fetch mode is based on the Serializable interface, and as such it cannot be supported once it is removed. Apparently, the PDO::FETCH_SERIALIZE mode is not actually usable due to an implementation bug (https://bugs.php.net/bug.php?id=68802) anyway.

In addition to the Serializable changes, this RFC proposes to deprecate PDO::FETCH_SERIALIZE in PHP 8.1 and remove it in PHP 9.0.