Verification: a143cc29221c9be0

Php arguments not being passed

Php arguments not being passed

Introduction

PHP's array data type is rare in that it supports both integer and string keys, and that iteration order is guaranteed. While it is possible to efficiently check that something is an array, that array may be an associative array, have missing array offsets, or contain out of order keys. It can be useful to verify that the assumption that array keys are consecutive integers is correct, both for data that is being passed into a module or for data being returned by a module. In serializers, it may also be useful to have an efficient check to distinguish lists from associative arrays - for example, json_encode does this when deciding to serialize a value as [0, 1, 2] instead of {“0”:0,“2”:1,“1”:1} for arrays with different key orders.

Proposal

Add a new function array_is_list(array $array): bool that will return true if the array keys are 0 .. count($array)-1 in that order. For other arrays, it returns false. For non-arrays, it throws a TypeError.

This RFC doesn't change PHP's type system and doesn't add new type hints.

The functionality is equivalent to the below polyfill:

function array_is_list(array $array): bool {
    $expectedKey = 0;
    foreach ($array as $i => $_) {
        if ($i !== $expectedKey) { return false; }
        $expectedKey++;
    }
    return true;
}
 
$x = [1 => 'a', 0 => 'b'];
var_export(array_is_list($x));  // false because keys are out of order
unset($x[1]);
var_export(array_is_list($x));  // true
 
// Pitfalls of simpler polyfills - NAN !== NAN
$x = ['key' => 2, NAN];
unset($x['key']);
var_export($x === array_values($x));  // false because NAN !== NAN
var_export($x);  // array (0 => NAN)
var_export(array_is_list($x));  // true because keys are consecutive integers starting from 0
 
array_is_list(new stdClass());  // throws a TypeError
array_is_list(null);  // throws a TypeError

Note that there are pitfalls in writing a correct polyfill/substitute. For example, array_values($array) === $array would be false for some arrays containing NAN, and array_keys($array) === range(0, count($array) - 1) is wrong for the empty array.

The native implementation will quickly return true for most lists by checking the C macro HT_IS_PACKED(array) && HT_IS_WITHOUT_HOLES(array). This optimization is already used by json_encode().

Example Use Cases

  1. Having an efficient, correct, and readable way to check that an array is actually a list (that doesn't have the pitfalls mentioned earlier).

  2. Making it more efficient and straightforward to check assumptions about data (In most other languages, there is already a non-associative list/array type that could be enforced during compilation or at runtime with the equivalent of instanceof)

  3. Throwing or warning in a library, framework, or API if the passed in value is not a list with elements in order and without gaps. For example, a potential source of bugs is that array_filter($list) returns a list with gaps in it, and array_values(array_filter($list)) should be used instead.

  4. Serializers or data encoders written in PHP, or other use cases that require or benefit from checking if data conforms to an expected format.

  5. Detecting invoking a function with named arguments in variable arguments in code that does not expect named arguments (e.g. example(argname: $value); for function example(...$args) {}).

Proposed PHP Version

8.1

RFC Impact

To Opcache

Opcache's architecture does not change because the type system is unchanged; optimizations of array_is_list() can easily be added or removed.

In the RFC's implementation, opcache evaluates the call array_is_list(arg) to a constant if the argument is a constant value and doesn't throw (same mechanism currently used for array_keys, etc.).

Long-term, if this sees wide enough adoption to affect performance on widely used apps or frameworks, opcache's contributors will have the option of adding additional checks to make opcache infer that array_is_list() being true implies that the keys of the array are integers.

(Currently, Opcache only optimizes type checks that are converted to type check opcodes such as is_resource() and is_array(). Opcache doesn't do anything similar for opcodes that become regular function calls such as is_numeric(), so the implementation for array_is_list() included with this RFC does not do this.)

Discussion

Possibility of naming conflicts with future vector-like types

Originally, this was called is_list, but renamed due to the potential of naming conflicts with a potential list type.

https://externals.io/message/112560#112565

If we do eventually end up with list/vec types, would the naming here conflict at all? Or would it cause confusion and name collision? (Insert name bikeshedding here.)

There's definitely the potential for naming conflicts if the type is called list but not if it's called vec/vector/varray similar to https://docs.hhvm.com/hack/built-in-types/arrays - I'd strongly prefer the latter if there was a viable implementation and it used sequential memory instead of a linked list.

If the type is named list instead of vec and ends up incompatible with arrays, there'd need to be an is_list_type($val) or $val is list or some other new type check with a less preferable name. If it's compatible with arrays/lists (e.g. only checked during property assignment, passing in arguments, and returning values), then it wouldn't be an issue.

- array_is_list(array $array) is consistent with many other array_* methods, which only accept arrays. - It is very possible that we may end up using the word list anyway despite those objections, because it's already a reserved keyword in PHP for unrelated syntax (list($first, $second) = $values). Recently added types such as object, void, and iterable (and scalar types) were added in previous PHP versions despite not being reserved in the past. - The name vector may conflict with the php-ds PECL depending on how functionality is implemented.

Providing objects with APIs similar to the external PECL https://www.php.net/manual/en/class.ds-vector.php and the SPL may be easier to adopt because it can be polyfilled, but there's the drawback that there aren't the memory savings from copy-on-write and that there's the performance overhead of method calls to offsetGet(), etc.

As mentioned in Changes to PHP's type system, I'd expect the addition of a separate/incompatible vector type to be a massive undertaking, and possibly unpopular if it splits the language. In Hack/HHVM, it was practical for users to adopt because HHVM is bundled with a typechecker that checks that the uses are correct at compile time - because PHP has no bundled type checker, a new type would potentially cause a lot of unintuitive behaviors.

Additionally, a name of is_list may cause confusion with built-in list types such as SplDoublyLinkedList.

Vote

Voting started on 2021-01-06 and ended 2021-01-20

This is a Yes/No vote, requiring a 2/3 majority

References

Rejected Features

Alternate names

is_sequential_array/array_is_sequential was rejected because [2=>'a', 3=>'b'] is also sequential.

is_zero_indexed_array/array_is_zero_indexed was rejected because that term is much less commonly used.

Alternate implementations

The signature is_array_and_list(mixed $value): bool was considered, but rejected because silently returning false for objects would be surprising, and the behavior for future list-like types might be misunderstood (SplDoublyLinkedList, ArrayObject, etc.)

This deliberately only returns true for arrays with sequential keys and a start offset of 0. It returns false for [1=>'first', 2=>'second'].

This deliberately throws a TypeError for non-arrays.

Adding flags to is_array()

https://externals.io/message/112612#112612

I actually like the idea of flags added to is_array() for this.

Something like:

is_array($value, ZERO_INDEXED | ASSOCIATIVE | INTEGER_INDEXED)

I’m not suggesting these names; they’re for illustration only.

I'm strongly opposed to adding any flags to is_array - keeping basic type checks simple would help in learning/reading/remembering the language. The addition of flags has a small impact on performance for calls that aren't unambiguously qualified (especially if using both), and it makes it harder to see issues like is_array(really_long_multiline_call(arg1, arg2, ZERO_INDEXED)) where ZERO_INDEXED is passed to another function instead of is_array.

Changes to PHP's type system

This RFC does not attempt to change php's type system. External static analyzers may still benefit from inferring key types from array_is_list() conditionals seen in code - array_is_list() conditionals would give more accurate information about array keys that can be used to detect issues or avoid false positives. (Phan, Psalm, and PHPStan are all static analyzers that support the unofficial phpdoc type list, which is used for arrays that would satisfy array_is_list()).

Any attempt to change php's type system would need to deal with references and the global scope - e.g. what would happen if an array was passed to list &$val but modified to become a non-list from a different callback or through asort().

Additionally, I'd personally expect that changes to the type system that were backwards incompatible would be possible, but unpopular and difficult to implement. HHVM is a project that was initially compatible with php, but has recently dropped compatibility with PHP. https://docs.hhvm.com/hack/built-in-types/arrays may be of interest to anyone who is interested in ways to migrate to stricter alternatives to php's arrays, but that required an entirely different language mode to use (), which doesn't seem viable for PHP itself (for reasons such as splitting the ecosystem and being incompatible with older php versions).

The thread https://externals.io/message/109760#109812 discussed this, but I'm not aware of anyone working on an implementation of list/vec, and supporting adding list/vec to the type system would be a lot of work for PECL extensions, language design, backwards compatibility concerns, etc. (It would also potentially be an issue with serializing/unserializing for data sent to/from older php versions (e.g. memcache, $_SESSION data, etc.))

Hack introduced the vec type (with value semantics) in 2016 after they'd experimented first with Vector (object semantics). Use of Vector is now discouraged.

Details here: https://github.com/facebook/hhvm/issues/6451

FB/Hack appears to be in the multi-year process of moving all PHP arrays to one of [vec/dict/keyset]. That's likely not an option for PHP itself, but having the option of a vec equivalent (in this proposal “list”) would make sense, I think.

https://externals.io/message/109760#109781

Most users don't realize that PHP's arrays-not-really-arrays have caused millions of dollars in security breaches in the past. :-) They're dangerous and to be avoided whenever possible.

I'm very open to a list/sequence type, but as others have noted there's a whole crapload of details to sort out to make it viable. In particular:

  • Is it an effective subclass of array? IMO, no. It should have absolutely no auto-conversion to/from an array whatsoever of any kind, period. Keep them as separate as possible.

  • Should it even have random-access indexes? Honestly I'd say no; Just support adding, removing, and iteration and generate the indexes on the fly when iterating if necessary.

  • Should they pass like arrays or like objects? Many questions here.

  • Should they be mutable or immutable? I could argue for either one effectively, I think, though I'd honestly favor immutable.

  • Are they iterable? Presumably, but does that have any weird implications for iterables that implicitly assume there are keys? How's that work?

  • Does it make sense to add them without type enforcement via generics? Lists + Generics would be lovely, but as we've seen Generics are Hard(tm) and Not Imminent(tm). But would adding them now make a generic version harder in the future? (I've no idea.)

  • Besides add/remove/iterate, what other baked-in functionality should they have? Eg, can they be mapped/filtered/reduced? It would really suck to revisit lists and not fix that disconnect in the API. (Insert me talking about comprehensions and stuff here.) Ideally this would happen as part of a larger review of how collections work at various levels, which are currently highly clunky.

Those are all solvable problems (and I've likely forgotten several), but they would have to be thought through extensively before an implementation could be viable.

Word of Caution

It is strongly discouraged from using PHP with Symfony and/or Doctrine for any long-running background processes (daemon), that listens for WebSocket (or other) connections, using Ratchet/ReactPHP style features in any production/real-world environments. PHP was not designed to run as a daemon. As such, without proper planning, the process will crash with either a Doctrine Connection exception with the MySQL Server Has Gone Away error or from memory leaks caused by maintaining the Entity Manager, Symfony service definitions and logger overflows.

Using PHP as a daemon would require implementing unintuitive workarounds, such as a Messenger Queue (causes long delays between responses) or Lazy Proxy objects (causes code-level maintainability issues) and additional background processes, like supervisor and/or cron jobs to circumvent the inherent issues and recover from crashes.


Provided you are using the default autowire configuration for your config/services.yaml and ScoreHandler is not in one of the excluded paths, the following options are feasible using dependency injection.

# config/services.yaml
services:
    # default configuration for services in *this* file
    _defaults:
        autowire: true      # Automatically injects dependencies in your services.
        autoconfigure: true # Automatically registers your services as commands, event subscribers, etc.

    # makes classes in src/ available to be used as services
    # this creates a service per class whose id is the fully-qualified class name
    App\:
        resource: '../src/*'
        exclude: '../src/{DependencyInjection,Entity,Migrations,Tests,Kernel.php}'

    # ...

Inject ScoreHandler as a Service

The recommended approach is to inject the ScoreHandler service into the command, which will also automatically inject the EntityManagerInterface into the ScoreHandler.

class YourCommand extends Command
{

    private $handler;

    public function __construct(ScoreHandler $handler)
    {
        $this->handler = $handler;
        parent::__construct();
    }

    //...

    public function execute(InputInterface $input, OutputInterface $outpu)
    {

        //...

        $server = IoServer::factory(
            new HttpServer(
                new WsServer($this->handler)
            ),
            8080
        );
        $server->run();
    }
}

Inject EntityManagerInterface and pass to ScoreHandler

Since you are creating a new instance of ScoreHandler manually, it requires the EntityManagerInterface to be supplied as an argument to the constructor.

This is not recommended, as a service already exists that is already autowired as in the previous example.

class YourCommand extends Command
{

    private $em;

    public function __construct(EntityManagerInterface $em)
    {
        $this->em = $em;
        parent::__construct();
    }

    //...

    public function execute(InputInterface $input, OutputInterface $outpu)
    {

        //...

        $server = IoServer::factory(
            new HttpServer(
                new WsServer(
                    new ScoreHandler($this->em)
                )
            ),
            8080
        );
        $server->run();
    }
}

Introduction

Internal functions (defined by PHP or PHP extensions) currently silently accept null values for non-nullable arguments in coercive typing mode. This is contrary to the behavior of user-defined functions, which only accept null for nullable arguments. This RFC aims to resolve this inconsistency.

For example, this code currently runs without producing any diagnostics:

var_dump(str_contains("foobar", null));
// bool(true)

The null value is silently converted into an empty string and str_contains() thus returns true.

However, the behavior is different for user-defined functions. For example, consider the same function being defined as a polyfill:

// Inside a polyfill:
function str_contains(string $haystack, string $needle): bool {
    return false !== strpos($haystack, $needle);
}
 
var_dump(str_contains("foobar", null));
// TypeError: str_contains(): Argument #2 ($needle) must be of type string, null given

After the changes in PHP 8.0, this is the only remaining fundamental difference in behavior between user-defined and internal functions. This makes it infeasible to accurately polyfill functions, and will complicate long-term projects such as migrating the standard library to be partially implemented in PHP rather than C.

Historically, the reason for this discrepancy is that internal functions have supported a concept of scalar types (bool, int, float, string) long before they were introduced for user-defined functions in PHP 7.0, and the existing implementation silently accepted null values. For the new scalar type declarations introduced in PHP 7.0 an explicit choice was made to not accept null values to non-nullable arguments, but changing the existing behavior of internal functions would have been too disruptive at the time.

This RFC proposes to synchronize the behavior of internal functions, by throwing a deprecation warning in PHP 8.1, which will become a TypeError in the next major version.

var_dump(str_contains("foobar", null));
// Deprecated: Passing null to argument of type string is deprecated

Proposal

If

  • null is passed to an argument of an internal function,

  • the argument is not nullable,

  • the argument accepts at least one scalar type (one of bool, int, float, or string),

  • coercive typing mode is used, i.e. strict_types=1 is not enabled,

then

  • in PHP false, 0, 0.0 or “”.

  • in PHP >= 8.1 a deprecation error is thrown, but the value is still coerced and the call still takes place.

  • in PHP >= 9.0 a TypeError is thrown, consistent with the behavior of user-defined functions.

Backward Incompatible Changes

Passing null to non-nullable scalar internal arguments will be forbidden in the future, which constitutes a backwards compatibility break. The behavior for user-defined functions, strict typing mode and non-scalar arguments is not affected, as all these cases already generate a TypeError for null arguments.

The following shows typical ways to resolve a deprecation warning:

// Throws deprecation warning if $maybe_null is null.
strlen($maybe_null);
 
// Resolution 1: Explicitly check for null, take different control flow path.
if (null === $maybe_null) {
    return; // or similar
}
strlen($maybe_null);
 
// Resolution 2: Explicitly specify a default value for null values.
strlen($maybe_null ?? '');
 
// Resolution 3: Explicitly cast the argument.
strlen((string) $maybe_null);