
Parsing query strings in PHP

Introduction

PHP offers a function, parse_str(), for parsing a query string into an array. Historically the function had a poor reputation because it could register the parsed values directly as variables. Since PHP 8.0 this is no longer possible. The function still has two main downsides: a confusing name, and no return value (it returns its result through an out parameter instead).

$str = "first=value&arr[]=foo+bar&arr[]=baz";
 
parse_str($str, $output);
echo $output['first'];  // value
echo $output['arr'][0]; // foo bar
echo $output['arr'][1]; // baz

Proposal

The proposal is to create a new function that behaves like parse_str() but is named http_parse_query() and returns the array instead of using an out parameter. The new function will take only one argument.

The new function will also be documented in the manual under https://www.php.net/manual/en/ref.url.php instead of https://www.php.net/manual/en/ref.strings.php.

$str = "first=value&arr[]=foo+bar&arr[]=baz";
 
$output = http_parse_query($str);
echo $output['first'];  // value
echo $output['arr'][0]; // foo bar
echo $output['arr'][1]; // baz
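
Until such a function exists in core, the proposed behaviour can be approximated in userland. The following is a minimal sketch, assuming the signature described above (one string argument, array return value); it is a hypothetical stand-in, not the RFC's implementation.

```php
<?php
// Hypothetical userland stand-in for the proposed function. The guard avoids
// a redeclaration error should a PHP version ship the function natively.
if (!function_exists('http_parse_query')) {
    function http_parse_query(string $query): array
    {
        // parse_str() writes its result into the out parameter.
        parse_str($query, $result);

        return $result;
    }
}

$output = http_parse_query('first=value&arr[]=foo+bar&arr[]=baz');
var_dump($output === ['first' => 'value', 'arr' => ['foo bar', 'baz']]); // bool(true)
```

Because the sketch delegates to parse_str(), it inherits all of its behaviour, including name mangling.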

Open Points

Should PHP remove name mangling?

As suggested on the mailing list, name mangling no longer serves any purpose. It was a way to simplify variable access when register_globals was in use. Since PHP 8.0 that functionality is no longer available, so the question is: should we remove name mangling from PHP's parse functions completely (which would require a separate RFC and a careful deprecation path), or remove it only from the new function?

If we should remove name mangling, what should happen to mismatched square brackets?
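
For readers unfamiliar with name mangling, the short sketch below shows what parse_str() currently does with keys that contain dots or spaces. The last call feeds it a key with an unmatched opening bracket so the current handling can be inspected; no expected output is given for that case.

```php
<?php
// Dots and spaces in keys are rewritten to underscores, so three distinct
// keys collapse into one and only the last value survives.
parse_str('foo.bar=1&foo+bar=2&foo_bar=3', $mangled);
var_dump($mangled === ['foo_bar' => '3']); // bool(true)

// A key with a mismatched square bracket: dump the result to see how the
// running PHP version treats it.
parse_str('a[b=1', $mismatched);
var_dump($mismatched);
```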

Should it be implemented in an OOP way instead?

Maybe the replacement should take the form of a full-fledged class that supports two-way conversion between a string and a PHP array/object. What would such an API look like? What functionality is currently missing from PHP? Do we need such an API in core, or can we just expose a function like http_parse_query() and let the rest be implemented in userland?
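
To make the question concrete, here is one possible shape for such a class, built only on existing functions. The class name QueryParams and all of its methods are hypothetical and serve purely as illustration.

```php
<?php
// Hypothetical value object; class and method names are illustrative only.
final class QueryParams
{
    /** @var array */
    private $params;

    private function __construct(array $params)
    {
        $this->params = $params;
    }

    // String to array, using the existing parser (name mangling included).
    public static function fromString(string $query): self
    {
        parse_str($query, $params);

        return new self($params);
    }

    public static function fromArray(array $params): self
    {
        return new self($params);
    }

    public function toArray(): array
    {
        return $this->params;
    }

    // Array to string, using the existing builder (RFC 1738 encoding by default).
    public function toString(): string
    {
        return http_build_query($this->params);
    }
}

$query = QueryParams::fromString('first=value&arr[]=foo+bar&arr[]=baz');
echo $query->toString(); // first=value&arr%5B0%5D=foo+bar&arr%5B1%5D=baz
```

Note that the round trip is not byte-for-byte identical: the builder adds numeric indices and percent-encodes the brackets.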

Future Scope

The parse_str() function will be deprecated in the next minor release and removed in the next major release.

Backward Incompatible Changes

There will be no breaking change to existing functionality. The new function is designed to give users a way to move away from parse_str() so that we can deprecate it and remove it.

Vote

Standard instantiation

The default constructor is private and cannot be used to instantiate a new object.

The $query parameter supports parameter widening. Apart from strings, scalar values and objects implementing the __toString method can be used.

Using an RFC3986 query string

use League\Uri\Components\Query;

$query = Query::createFromRFC3986('foo=bar&bar=baz%20bar', '&');
$query->params('bar'); // returns 'baz bar'

This named constructor instantiates a Query object from a query string encoded using the RFC3986 query component rules.

Using an RFC1738 query string

$query = Query::createFromRFC1738('foo=bar&bar=baz+bar', '&');
$query->params('bar'); // returns 'baz bar'

This named constructor instantiates a Query object from a query string encoded using the application/x-www-form-urlencoded rules.
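
The practical difference between the two rule sets is the treatment of the + character. The sketch below illustrates it with PHP's built-in decoders rather than the Query class itself.

```php
<?php
// RFC 3986 decoding leaves "+" untouched; only percent-encoded octets are decoded.
var_dump(rawurldecode('baz+bar'));   // string(7) "baz+bar"
var_dump(rawurldecode('baz%20bar')); // string(7) "baz bar"

// application/x-www-form-urlencoded decoding additionally turns "+" into a space.
var_dump(urldecode('baz+bar'));      // string(7) "baz bar"
```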

In addition to the string representation methods from the package common API, the following methods are available.

Query separator

The query separator is essential to query manipulation. The Query object provides two simple methods to interact with its separator:

public Query::getSeparator(): string
public Query::withSeparator(string $separator): self

Query::getSeparator returns the current separator attached to the Query object while Query::withSeparator returns a new Query object with an alternate string separator.

Query::withSeparator expects a single argument which is a string separator. If the separator is equal to = an exception will be thrown.

$query    = Query::createFromRFC3986('foo=bar&baz=toto');
$newQuery = $query->withSeparator('|');
$newQuery->__toString(); //return foo=bar|baz=toto
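
The role of the separator can also be seen with plain PHP functions. The sketch below does not use the Query class; it only shows how a custom separator affects building and splitting, and why = would presumably be an ambiguous choice.

```php
<?php
$params = ['foo' => 'bar', 'baz' => 'toto'];

// The third argument of http_build_query() is the pair separator.
$query = http_build_query($params, '', '|');
var_dump($query); // string(16) "foo=bar|baz=toto"

// Splitting on the same separator recovers the individual pairs.
var_dump(explode('|', $query) === ['foo=bar', 'baz=toto']); // bool(true)

// With "=" as separator, pairs and key/value delimiters can no longer be told apart.
$ambiguous = http_build_query($params, '', '=');
var_dump($ambiguous); // string(16) "foo=bar=baz=toto"
```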

Component representations

RFC3986 representation

The Query object can return the query encoded using the RFC3986 query component rules.

$query = Query::createFromRFC1738('foo=bar&bar=baz+bar', '&');
$query->toRFC3986();  //returns 'foo=bar&bar=baz%20bar'
$query->getContent(); //returns 'foo=bar&bar=baz%20bar'

If the query is undefined, this method returns null.

Query::getContent() is an alias of Query::toRFC3986().

RFC1738 representation

The Query object can return the query encoded using the application/x-www-form-urlencoded query component rules.

$query = Query::createFromRFC3986('foo=bar&bar=baz%20bar', '&');
$query->toRFC1738(); // returns 'foo=bar&bar=baz+bar'
$query->jsonSerialize(); //returns 'foo=bar&bar=baz+bar'

If the query is undefined, this method returns null.

Query::jsonSerialize() is an alias of Query::toRFC1738() to improve interoperability with JavaScript.
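
Converting between the two representations amounts to decoding each key and value with one rule set and re-encoding it with the other. The helper below, convertQuery(), is a hypothetical plain-PHP sketch of that idea; it is not how the Query class is implemented and it assumes & as separator.

```php
<?php
// Hypothetical helper: decode every key and value with $decode, then
// re-encode it with $encode. Pairs without "=" are kept without a value.
function convertQuery(string $query, callable $decode, callable $encode): string
{
    $converted = [];
    foreach (explode('&', $query) as $pair) {
        $chunks = explode('=', $pair, 2);
        $key = $encode($decode($chunks[0]));
        $converted[] = isset($chunks[1])
            ? $key . '=' . $encode($decode($chunks[1]))
            : $key;
    }

    return implode('&', $converted);
}

// RFC 1738 input to RFC 3986 output.
echo convertQuery('foo=bar&bar=baz+bar', 'urldecode', 'rawurlencode'), PHP_EOL;
// foo=bar&bar=baz%20bar

// RFC 3986 input to RFC 1738 output.
echo convertQuery('foo=bar&bar=baz%20bar', 'rawurldecode', 'urlencode'), PHP_EOL;
// foo=bar&bar=baz+bar
```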

Modifying the query

Query::merge

Query::merge returns a new Query object with its data merged.

public Query::merge($query): Query

This method expects a single argument, which is a string.

$query    = Query::createFromRFC3986('foo=bar&baz=toto');
$newQuery = $query->merge('foo=jane&r=stone');
$newQuery->__toString(); //return foo=jane&baz=toto&r=stone
// the 'foo' parameter was updated
// the 'r' parameter was added

Values equal to null or the empty string are merged differently.

$query    = Query::createFromRFC3986('foo=bar&baz=toto');
$newQuery = $query->merge('baz=&r');
$newQuery->__toString(); //return foo=bar&baz=&r
// the 'r' parameter was added without any value
// the 'baz' parameter was updated to an empty string and its = sign remains
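
The merge rules shown in the two examples above can be modelled on a list of key/value pairs in which a missing = sign is stored as null. The functions below are a simplified sketch with hypothetical names; they ignore encoding and are not the library's implementation.

```php
<?php
// Split a query into [key, value] pairs; the value is null when there is no "=".
function sketchParsePairs(string $query): array
{
    $pairs = [];
    foreach ($query === '' ? [] : explode('&', $query) as $part) {
        $chunks = explode('=', $part, 2);
        $pairs[] = [$chunks[0], $chunks[1] ?? null];
    }

    return $pairs;
}

// Update existing keys in place and add unknown keys at the end.
function sketchMerge(string $base, string $extra): string
{
    $result = sketchParsePairs($base);
    foreach (sketchParsePairs($extra) as [$key, $value]) {
        $replaced = false;
        foreach ($result as $index => [$existingKey]) {
            if ($existingKey === $key) {
                $result[$index][1] = $value;
                $replaced = true;
            }
        }
        if (!$replaced) {
            $result[] = [$key, $value];
        }
    }

    // A null value is written without "=", an empty string keeps it.
    $parts = [];
    foreach ($result as [$key, $value]) {
        $parts[] = $value === null ? $key : $key . '=' . $value;
    }

    return implode('&', $parts);
}

echo sketchMerge('foo=bar&baz=toto', 'foo=jane&r=stone'), PHP_EOL; // foo=jane&baz=toto&r=stone
echo sketchMerge('foo=bar&baz=toto', 'baz=&r'), PHP_EOL;           // foo=bar&baz=&r
```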

Query::append

Query::append returns a new Query object with the given data appended to it.

public Query::append($query): Query

This method expects a single argument which is a string, a scalar or an object with the __toString method.

$query    = Query::createFromRFC3986('foo=bar&john=doe');
$newQuery = $query->append('foo=baz');
$newQuery->__toString(); //return foo=bar&foo=baz&john=doe
// a new foo parameter is added

Query::sort

Query::sort returns a Query object with its pairs sorted according to their keys. Sorting is done so that parsing the query yields the same result before and after sorting.

$query    = Query::createFromRFC3986('foo=bar&baz=toto&foo=toto');
$newQuery = $query->sort();
$newQuery->__toString(); //return foo=bar&foo=toto&baz=toto
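
Judging from the example above, the pairs are grouped by key in order of first appearance rather than sorted alphabetically. The function below is a sketch of that reading, with a hypothetical name; whether it matches the library in every case is an assumption.

```php
<?php
// Group pairs that share a key, keeping the order in which keys first appear
// and the relative order of the values within each key.
function sketchGroupByKey(string $query): string
{
    $groups = [];
    foreach (explode('&', $query) as $pair) {
        $key = explode('=', $pair, 2)[0];
        $groups[$key][] = $pair;
    }

    return implode('&', array_merge(...array_values($groups)));
}

$before = 'foo=bar&baz=toto&foo=toto';
$after  = sketchGroupByKey($before);
echo $after, PHP_EOL; // foo=bar&foo=toto&baz=toto

// Parsing gives the same variables before and after the reordering.
parse_str($before, $parsedBefore);
parse_str($after, $parsedAfter);
var_dump($parsedBefore === $parsedAfter); // bool(true)
```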

Using the Query as a PHP data transport layer

public static Query::createFromParams($params, string $separator = '&'): self
public Query::params(?string $name = null): mixed
public Query::withoutNumericIndices(): self
public Query::withoutParam(string ...$offsets): self

Using PHP data structure to instantiate a new Query object

Historically, the query string has been used as a data transport layer for PHP variables. The createFromParams named constructor uses PHP's own data structures to generate a query string à la http_build_query.

parse_str('foo=bar&bar=baz+bar', $params);

$query = Query::createFromParams($params, '|');
echo $query->getContent(); // returns 'foo=bar|bar=baz%20bar'

The $params input can be any argument type supported by http_build_query which means that it can be an iterable or an object with public properties.
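
Since the accepted input mirrors http_build_query, the built-in function itself shows what to expect. The sketch below passes the same data once as an array and once as an object with public properties.

```php
<?php
$asArray = ['foo' => 'bar', 'bar' => 'baz bar'];

$asObject = new stdClass();
$asObject->foo = 'bar';
$asObject->bar = 'baz bar';

// Arguments: data, numeric prefix, pair separator, encoding type.
$fromArray  = http_build_query($asArray, '', '|', PHP_QUERY_RFC3986);
$fromObject = http_build_query($asObject, '', '|', PHP_QUERY_RFC3986);

var_dump($fromArray);                 // string(21) "foo=bar|bar=baz%20bar"
var_dump($fromArray === $fromObject); // bool(true)
```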

If you need finer control over parsing, you can use the QueryString class.

Query::params

If you already have an instantiated Query object you can return all the query string deserialized arguments using the Query::params method:

$query_string = 'foo.bar=bar&foo_bar=baz';
parse_str($query_string, $out);
var_export($out);
// $out = ["foo_bar" => 'baz'];

$arr = Query::createFromRFC3986($query_string)->params();
// $arr = ['foo.bar' => 'bar', 'foo_bar' => 'baz'];
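
The difference comes from parse_str rewriting key names. The function below is a plain-PHP sketch, with a hypothetical name, of a parser that leaves keys untouched; it handles flat keys only and ignores the bracket syntax.

```php
<?php
// Parse a query string into a flat array without mangling the key names.
// Keys without "=" get a null value; a repeated key keeps its last value.
function sketchParseFlat(string $query): array
{
    $params = [];
    foreach (explode('&', $query) as $pair) {
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, null);
        $params[rawurldecode($key)] = $value === null ? null : rawurldecode($value);
    }

    return $params;
}

var_export(sketchParseFlat('foo.bar=bar&foo_bar=baz'));
// array (
//   'foo.bar' => 'bar',
//   'foo_bar' => 'baz',
// )
```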

If you are only interested in a given argument, you can access it directly by supplying the argument name as shown below:

$query = Query::createFromRFC3986('foo[]=bar&foo[]=y+olo&z=');
$query->params('foo');   //return ['bar', 'y+olo']
$query->params('gweta'); //return null

The method returns the value of a specific argument. If the argument does not exist it will return null.

Query::withoutParam

If you want to remove PHP variables from the query string, you can use the Query::withoutParam method as shown below:

$query = Query::createFromRFC3986('foo[]=bar&foo[]=y+olo&z=');
$new_query = $query->withoutParam('foo');
$new_query->params('foo'); //return null
echo $new_query->getContent(); //return 'z='

This method takes variadic arguments representing the keys to be removed.
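
The effect can be approximated on the raw string by dropping every pair whose key, up to the first opening bracket, matches one of the given names. The function below is a hypothetical sketch of that idea and not the library's implementation.

```php
<?php
// Remove every pair whose variable name (the key up to the first "[")
// is one of the given names.
function sketchWithoutParam(string $query, string ...$names): string
{
    $kept = [];
    foreach (explode('&', $query) as $pair) {
        $key  = rawurldecode(explode('=', $pair, 2)[0]);
        $name = explode('[', $key, 2)[0];
        if (!in_array($name, $names, true)) {
            $kept[] = $pair;
        }
    }

    return implode('&', $kept);
}

echo sketchWithoutParam('foo[]=bar&foo[]=y+olo&z=', 'foo'), PHP_EOL; // z=
```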

Query::withoutNumericIndices

If your query string is created with http_build_query or the Query::createFromParams named constructor, chances are that numeric indices have been added by the method.

Query::withoutNumericIndices removes any numeric index found in the query string, as shown below:

$query = Query::createFromParams(['foo' => ['bar', 'baz']]);
echo $query->getContent(); //return 'foo[0]=bar&foo[1]=baz'
$new_query = $query->withoutNumericIndices();
echo $new_query->getContent(Query::NO_ENCODING); //return 'foo[]=bar&foo[]=baz'
// note: both objects return the same PHP variables but differ in their pairs
$query->params(); //return ['foo' => ['bar', 'baz']]
$new_query->params(); //return ['foo' => ['bar', 'baz']]
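
The same observation can be reproduced with built-in functions only. In the sketch below, a naive regular expression stands in for the removal step; it assumes the indices are sequential, as they are when produced by http_build_query from a list.

```php
<?php
// http_build_query() adds numeric indices to list values.
$built = urldecode(http_build_query(['foo' => ['bar', 'baz']]));
echo $built, PHP_EOL; // foo[0]=bar&foo[1]=baz

// Naive removal of numeric indices from the decoded string.
$stripped = preg_replace('/\[\d+\]/', '[]', $built);
echo $stripped, PHP_EOL; // foo[]=bar&foo[]=baz

// Both strings parse to the same PHP variables.
parse_str($built, $withIndices);
parse_str($stripped, $withoutIndices);
var_dump($withIndices === $withoutIndices); // bool(true)
```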

Query before

select xx from xx select xx,(select xx) from xx where y=' cc' select xx from xx left join ( select xx) where (select top 1 xxx from xxx) order by xxx desc ";