Wanted: PHP built-in functionality for dynamically accessing array elements at arbitrary nesting depth

In general I’ve been pretty happy with the array implementation in PHP, finding arrays versatile, flexible, and easy to use. I like that you can just mix integer and string keys, associative arrays are ordered, writing to a nonexistent nested array auto-vivifies it, reading a nonexistent key / index just evaluates to NULL and raises only a notice (a warning as of PHP 8) rather than a fatal error, etc. However, I do see room for improvement.

(If you want to experiment with code examples in this post, see the reference code.)

The subject of this article is that for a long time I’ve wished for built-in functionality in PHP for dynamically accessing array elements at arbitrary nesting level. I’m talking about what are called multi-dimensional [possibly associative] arrays in PHP, and arrays / hashes / maps in other languages. In any case, I’m talking about tree structures / hierarchies built by nesting those data types. For example, consider this array:

<?php

$products = array(

  'literature' => array(

    'books' => array()

    ,

    'audiobooks' => array(

      'cd' => array(),

      'mp3' => array()

    )
    // array 'audiobooks'

  )
  // array 'literature'

  ,

  'electronics' => array(

    'tv-video' => array(),

    'computers' => array(

      'desktops' => array(

        'windows' => array(),

        'apple' => array()

      )
      // array 'desktops'

      ,

      'laptops' => array()

    )
    // array 'computers'

  )
  // array 'electronics'

);
// array $products


$product_id_string = 'electronics/computers/desktops';

$product_id_array = explode( "/", $product_id_string );



It’s easy to access any element of such a structure, regardless of nesting level, if you know the path when you’re writing the script:


$products[ 'electronics' ][ 'computers' ][ 'desktops' ]


But that’s not very dynamic. What if you want the script to decide at runtime? It’s frequently useful to get at an element at arbitrary nesting depth — say, ‘desktops’ — dynamically. An example use case is using URLs like this:

http://example.com/products/electronics/computers/desktops

As shown, the standard PHP array access syntax requires you to hard-code, at a minimum, the nesting level of the element you want to access. That doesn’t easily allow for very dynamic coding, and that’s a frustrating limitation on how otherwise flexible multi-dimensional arrays can be in PHP.
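
To make the URL use case concrete, here is a minimal sketch of turning such a URL into a list of keys. The helper name url_to_key_path() and the assumption that 'products' is a fixed leading segment are mine, taken from the example URL rather than from any real routing code.

```php
<?php

// Hypothetical helper: turn a request URL into a list of array keys.
// The 'products' prefix is an assumption based on the example URL above.
function url_to_key_path( $url, $prefix = 'products' ) {

  // parse_url() with PHP_URL_PATH returns only the path component.
  $path = (string) parse_url( $url, PHP_URL_PATH );

  // Split on "/" and drop the empty segments left by leading / trailing slashes.
  $segments = array_values( array_filter( explode( '/', $path ), 'strlen' ) );

  // Strip the prefix segment if it is present.
  if ( $segments && $segments[0] === $prefix ) {
    array_shift( $segments );
  }

  return $segments;
}

$product_id_array = url_to_key_path(
  'http://example.com/products/electronics/computers/desktops'
);
// $product_id_array === array( 'electronics', 'computers', 'desktops' )
```

The keys are now known only at runtime, which is exactly the situation the hard-coded bracket syntax can't express.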

Current Techniques

Currently available means of accessing elements at arbitrary nesting level are inconvenient, and probably perform relatively poorly too. I’ve used techniques like these to deal with this in the past:


$array =& $products;

$keys = $product_id_array;


/* Figure 1 */

$element = &$array;

foreach ( $keys as $key ) {

  $element =& $element[ $key ];

}
// foreach


/* Figure 2 */

foreach ( $keys as $i => $key ) {

  // var_export() emits a safely quoted PHP literal, escaping backslashes as well as quotes.
  $keys[ $i ] = var_export( $key, true );

}
// foreach


$keys = "[ " . join( " ][ ", $keys ) . " ]";


$element = eval( "return \$array{$keys};" );




Even the sparsest, no-frills code in Figure 1 is more cumbersome and verbose than I like for functionality that I consider so basic, and it doesn’t even handle unsetting the target. Wrapping that in a custom function would make it less cumbersome to invoke, but having to define a custom function everywhere / every time you want to do this is a big inconvenience. This comes up repeatedly for me (and others [1]). It’s a big piece missing from PHP’s current array functionality, a really fundamental capability that should be in the core for convenience and performance reasons.
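
One more wrinkle with Figure 1: because it walks the array by reference, looking up a path that doesn’t exist quietly creates it (with a NULL leaf). For plain reads, a by-value walk avoids that. The following is only a sketch; array_peek_path() is a made-up name, not a PHP built-in.

```php
<?php

// Hypothetical helper (not a PHP built-in): read-only lookup by key path.
function array_peek_path( array $array, array $keys, $default = NULL ) {

  $element = $array;

  foreach ( $keys as $key ) {

    // Bail out as soon as a level is missing or is not an array.
    if ( ! is_array( $element ) || ! array_key_exists( $key, $element ) ) {
      return $default;
    }

    $element = $element[ $key ];
  }

  return $element;
}

$products = array(
  'electronics' => array(
    'computers' => array(
      'desktops' => array( 'windows' => array(), 'apple' => array() ),
    ),
  ),
);

$found   = array_peek_path( $products, array( 'electronics', 'computers', 'desktops' ) );
$missing = array_peek_path( $products, array( 'electronics', 'phones' ), 'n/a' );

// $products is untouched: no 'phones' key was created by the failed lookup.
```

The trade-off is that this version can’t hand back a reference, so it only covers the read-only case.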

Here’s an example of a more featureful function [array_get_path()] that allows unsetting the target element and provides a shortcut for setting a value:

<?php

function &array_get_path( &$array, $path, $delim = NULL, $value = NULL, $unset = false ) {

  $num_args = func_num_args();

  $element = &$array;


  if ( ! is_array( $path ) && strlen( $delim = (string) $delim ) ) {

    $path = explode( $delim, $path );

  }
  // if


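  if ( ! is_array( $path ) ) {

    // A string path with no usable delimiter can't be split into keys.
    throw new InvalidArgumentException( '$path must be an array, or a string accompanied by a non-empty $delim.' );

  }
  // if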


  // Test $path alone so that falsy keys such as 0 or '0' don't end the loop early.
  while ( $path ) {

    $key = array_shift( $path );

    if ( ! $path && $num_args >= 5 && $unset ) {

      unset( $element[ $key ] );

      unset( $element );

      $element = NULL;

    }
    // if


    else {

      $element =& $element[ $key ];

    }
    // else

  }
  // while


  if ( $num_args >= 4 && ! $unset ) {

    $element = $value;

  }
  // if


  return $element;

}
// array_get_path


function array_set_path( $value, &$array, $path, $delimiter = NULL ) {

  array_get_path( $array, $path, $delimiter, $value );


  return;

}
// array_set_path


function array_unset_path( &$array, $path, $delimiter = NULL ) {

  array_get_path( $array, $path, $delimiter, NULL, true );


  return;

}
// array_unset_path


function array_has_path( $array, $path, $delimiter = NULL ) {

  $has = false;


  if ( ! is_array( $path ) ) {

    $path = explode( $delimiter, $path );

  }
  // if


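  foreach ( $path as $key ) {

    // Stop at the first missing level; otherwise a later key could match
    // in the wrong (parent) array and produce a false positive.
    if ( ! ( $has = is_array( $array ) && array_key_exists( $key, $array ) ) ) {

      break;

    }
    // if


    $array = $array[ $key ];

  }
  // foreach


  return $has;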

}
// array_has_path;



Solution

Sickest Solution: Syntactic Sugar

What would be sickest is if you could do this:


/*

Where...

$products[ $product_id_array ]

...and alternatives would be equivalent to...

$products[ 'electronics' ][ 'computers' ][ 'desktops' ]

*/

$element = $products[ $product_id_array ];

$element =& $products[ $product_id_array ];

$products[ $product_id_array ] = array( "whatever" );

array_unshift( $products[ $product_id_array ], "something" );

$products[ $product_id_array ][] = "new numeric";

$products[ $product_id_array ][ 'assoc' ] = "new assoc";

unset( $products[ $product_id_array ][ 'assoc' ] );

unset( $products[ $product_id_array ] );


// Ideally with an anonymous array too:

$element = $products[ array( 'electronics', 'computers', 'desktops' ) ];

$element = $products[ explode( "/", $product_id_string ) ];



PHP doesn’t allow an array to be used as an array key currently (and never has, that I know of) — doing so generates an E_WARNING. So, that would reduce the backward compatibility hazard of introducing this feature, but since it’s an E_WARNING and not an E_ERROR it doesn’t completely avoid it. To avoid straying too far from the subject at hand, I’ll leave it at that for now on the issue of E_WARNING and backcompat. Anyway, it would first have to be possible to implement this in order for backward compatibility to be a concern. I don’t know if it’s possible or not.

Assuming this is possible, maybe there’s some new syntax that could be introduced that doesn’t interfere with anything else and avoids any backward compatibility concerns? E.g.:


/* Double square brackets? */

$element = $products[[ $product_id_array ]];

$element = $products[[ explode( "/", $product_id_string ) ]];

// As long as you're going to dream, dream big, right?:

$element = $products[[ $product_id_string, "/" ]];

// Or, for consistency with explode()?

$element = $products[[ "/", $product_id_string ]];


/* Curly braces within square brackets? */

$element = $products[{ $product_id_array }];

$element = $products[{ explode( "/", $product_id_string ) }];

// Go big or go home

$element = $products[{ $product_id_string, "/" }];

// Or, for consistency with explode()?

$element = $products[{ "/", $product_id_string }];



In terms of forward compatibility, are there good use cases for using an array as an associative array key in a different fashion than this? I believe that’s possible in other languages, such as with hashes in Ruby, but it’s because any object can be used as a hash key and arrays / hashes are objects in Ruby.
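
As a point of comparison, something close to the array-as-key syntax can be approximated in userland today, because objects implementing ArrayAccess receive whatever offset is written between the brackets, arrays included. The sketch below is only that: PathArray is a hypothetical class name, it assumes PHP 8.0+ (for the mixed type), and it doesn’t support indirect modification such as $products[ $path ][] = 'x', since offsetGet() returns by value.

```php
<?php

// Hypothetical userland wrapper (not a PHP built-in). Requires PHP 8.0+
// because of the `mixed` type declarations.
class PathArray implements ArrayAccess {

  private $data;

  public function __construct( array $data ) {
    $this->data = $data;
  }

  // Normalize an offset to a list of keys: a path array or a single key.
  private static function keys( mixed $offset ): array {
    return is_array( $offset ) ? array_values( $offset ) : array( $offset );
  }

  // Read-only walk. Returns array( found?, value ).
  private function walk( array $keys ): array {
    $node = $this->data;
    foreach ( $keys as $key ) {
      if ( ! is_array( $node ) || ! array_key_exists( $key, $node ) ) {
        return array( false, NULL );
      }
      $node = $node[ $key ];
    }
    return array( true, $node );
  }

  public function offsetExists( mixed $offset ): bool {
    return $this->walk( self::keys( $offset ) )[0];
  }

  public function offsetGet( mixed $offset ): mixed {
    return $this->walk( self::keys( $offset ) )[1];
  }

  public function offsetSet( mixed $offset, mixed $value ): void {
    if ( $offset === NULL ) {       // $obj[] = $value
      $this->data[] = $value;
      return;
    }
    $node =& $this->data;
    foreach ( self::keys( $offset ) as $key ) {
      $node =& $node[ $key ];       // creates missing levels along the way
    }
    $node = $value;
  }

  public function offsetUnset( mixed $offset ): void {
    $keys = self::keys( $offset );
    if ( ! $keys ) {
      return;
    }
    $last = array_pop( $keys );
    $node =& $this->data;
    foreach ( $keys as $key ) {
      if ( ! is_array( $node ) || ! array_key_exists( $key, $node ) ) {
        return;                     // nothing to unset
      }
      $node =& $node[ $key ];
    }
    if ( is_array( $node ) ) {
      unset( $node[ $last ] );
    }
  }
}

$products = new PathArray( array(
  'electronics' => array(
    'computers' => array( 'desktops' => array( 'windows' => array() ) ),
  ),
) );

$product_id_array = explode( '/', 'electronics/computers/desktops' );

$element = $products[ $product_id_array ];             // read
$products[ $product_id_array ] = array( 'whatever' );  // write
$had = isset( $products[ $product_id_array ] );        // exists?
unset( $products[ $product_id_array ] );               // remove
$has = isset( $products[ $product_id_array ] );
```

This only works on the wrapper object, not on a native array, so it illustrates the ergonomics rather than replacing the proposal.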

Alternative Solution: Class / Function(s)

If the syntactic sugar isn’t practical, then what we need is some built-ins like these:

Multipurpose Function


&mixed array_get_path(

  array &$array,

  array|string $path

  [, string $delimiter

    [, $value = NULL

      [, $unset = false ]

    ]

  ]

)



Dedicated Functions


&mixed array_get_path( array &$array, array|string $path [, string $delimiter ] )

??? array_set_path( $value, array &$array, array|string $path [, string $delimiter ] )

??? array_unset_path( array &$array, array|string $path [, string $delimiter ] )

bool array_has_path( array $array, array|string $path [, string $delimiter ] );



Class


/**
 * This desc is for the following 3 signatures. Instantiate new object
 * to access nested arrays via paths.
 *
 * @param array $array Array to access via paths.
 *
 * @param array|string $path Array of keys, or string to explode.
 *
 * @param string $delimiter Required if $path is a string. Will be passed to explode().
 *
 * @return obj Instance.
 */

obj MapPath( array $array [, mixed $path [, string $delimiter ] ] )

obj MapPath::__construct( array $array [, mixed $path [, string $delimiter ] ] )

obj MapPath::make( array $array [, mixed $path [, string $delimiter ] ] )


/**
 * Set default path for subsequent method calls.
 *
 * @param array|string $path Array of keys, or string to explode.
 *
 * @param string $delimiter Required if $path is a string and a default
 * delimiter is not yet set. Will be passed to explode().
 *
 * @return obj Object method was called on.
 */

obj MapPath::path( mixed $path [, string $delimiter ] )


/*

The following methods use the default path / delimiter set in the
constructor call or most recent call to path() if $path / $delimiter
args are not provided.

$path / $delimiter can be passed to these methods to use for that call
only, taking precedence over any defaults and not setting a new
default.

*/

&mixed MapPath::get( [ mixed $path [, string $delimiter ] ] )

??? MapPath::set( $new_value [, mixed $path [, string $delimiter ] ] )

void MapPath::drop( [ mixed $path [, string $delimiter ] ] )

bool MapPath::has( [ mixed $path [, string $delimiter ] ] )

mixed MapPath::map( [ array $array ] )

mixed MapPath::delim( [ string $delimiter ] )



The first three signatures are alternative means of instantiating a new object. If $path is provided and is a string, then $delimiter must be provided, in which case $path = explode( $delimiter, $path ).

The first signature is a function with the same name as the class (“MapPath”) that calls new MapPath() and returns the instantiated object. The purpose of this is to cut down on the annoying verbosity of PHP object instantiation / lack of a way to immediately call a method on a new object and code elegant one-liners, e.g.:


// Standard instantiation -- can't do a nice one-liner.

$element = new MapPath( $products, $product_id_array );

$element = $element->get();


/*

Standard instantiation -- have to do it verbose, ugly, with
multiple statements on the same line to do a "one-liner".

*/

$element = new MapPath( $products, $product_id_array ); $element = $element->get();


// Factory method -- one liner, but bloated

$element = MapPath::make( $products, $product_id_array )->get();


// Using function saves 6 chars vs. factory method.

$element = MapPath( $products, $product_id_array )->get();
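
The instantiation forms above can be tried out with a small stand-in. This is a rough sketch, not the proposed built-in: the class below implements only a read-only get(), and everything in it is hypothetical. It relies on PHP keeping functions and classes in separate symbol tables, so a function may share a class’s name. For comparison, it also shows that PHP 5.4 and later accept a parenthesized new expression followed directly by a method call.

```php
<?php

// Hypothetical stand-in for the proposed class; only get() is sketched here.
class MapPath {

  private $array;
  private $path;

  public function __construct( array $array, array $path = array() ) {
    $this->array = $array;
    $this->path  = $path;
  }

  // Factory method, as in the proposal.
  public static function make( array $array, array $path = array() ) {
    return new self( $array, $path );
  }

  // Read-only walk along the stored path; NULL if any level is missing.
  public function get() {
    $node = $this->array;
    foreach ( $this->path as $key ) {
      if ( ! is_array( $node ) || ! array_key_exists( $key, $node ) ) {
        return NULL;
      }
      $node = $node[ $key ];
    }
    return $node;
  }
}

// A function with the same name as the class, returning a new instance.
function MapPath( array $array, array $path = array() ) {
  return new MapPath( $array, $path );
}

$products = array(
  'electronics' => array( 'computers' => array( 'desktops' => array() ) ),
);
$keys = array( 'electronics', 'computers', 'desktops' );

$a = MapPath::make( $products, $keys )->get();   // factory method
$b = MapPath( $products, $keys )->get();         // same-named function
$c = ( new MapPath( $products, $keys ) )->get(); // PHP 5.4+ only
```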



Examples

Here are some usage examples for these proposed built-ins:


/***********************************************************************

Get value (read-only)

***********************************************************************/

// Sickest

$element = $products[ $product_id_array ];

$element = $products[ explode( '/', $product_id_string ) ];


// Function

$element = array_get_path( $products, $product_id_array );

$element = array_get_path( $products, $product_id_string, '/' );


// Class, use object for one operation and discard

$element = MapPath( $products, $product_id_array )->get();

$element = MapPath( $products )->get( $product_id_array );

$element = MapPath( $products, $product_id_string, '/' )->get();

$element = MapPath( $products )->get( $product_id_string, '/' );


/*

Outcome of all above examples:

$element === array( 'windows' => array(), 'apple' => array() )

*/



/***********************************************************************

Get reference (read-write)

***********************************************************************/

// Sickest

$element =& $products[ $product_id_array ];


// Function

$element =& array_get_path( $products, $product_id_array );


// Class, use object for one operation and discard

$element =& MapPath( $products, $product_id_array )->get();

$element =& MapPath( $products )->get( $product_id_array );


// One of the above, then...

$element = "whatever";


/*

Outcome:

$products[ 'electronics' ][ 'computers' ][ 'desktops' ] === 'whatever'

*/




/***********************************************************************

Set

***********************************************************************/

/* Passing an array for $path */

array_get_path( $products, $product_id_array, NULL, "whatever" );

array_set_path( "whatever", $products, $product_id_array );

MapPath( $products, $product_id_array )->set( "whatever" );

MapPath( $products )->set( "whatever", $product_id_array );


/* Passing a string for $path, and $delimiter */

array_get_path( $products, $product_id_string, '/', "whatever" );

array_set_path( "whatever", $products, $product_id_string, '/' );

MapPath( $products, $product_id_string, '/' )->set( "whatever" );

MapPath( $products )->set( "whatever", $product_id_string, '/' );


/*

Outcome of all above examples:

$products[ 'electronics' ][ 'computers' ][ 'desktops' ] === "whatever"

*/




/***********************************************************************

Unset

***********************************************************************/

/* Passing an array for $path */

array_get_path( $products, $product_id_array, NULL, NULL, true );

array_unset_path( $products, $product_id_array );

MapPath( $products, $product_id_array )->drop();

MapPath( $products )->drop( $product_id_array );


/* Passing a string for $path, and $delimiter */

array_get_path( $products, $product_id_string, '/', NULL, true );

array_unset_path( $products, $product_id_string, '/' );

MapPath( $products, $product_id_string, '/' )->drop();

MapPath( $products )->drop( $product_id_string, '/' );


/*

When $value / $delimiter are irrelevant, 0 could be passed
to abbreviate the call

*/

array_get_path( $products, $product_id_array, 0, 0, true );


/*

Outcome of all above examples:

array_key_exists( 'desktops', $products[ 'electronics' ][ 'computers' ] ) === false

*/




/***********************************************************************

Has key / path

***********************************************************************/

array_key_exists( end( $product_id_array ), array_get_path(

  $products, array_slice( $product_id_array, 0, -1 )

) );


array_has_path( $products, $product_id_array );

array_has_path( $products, $product_id_string, '/' );

MapPath( $products, $product_id_array )->has();

MapPath( $products )->has( $product_id_array );

MapPath( $products, $product_id_string, '/' )->has();

MapPath( $products )->has( $product_id_string, '/' );



/***********************************************************************

Class, use object for multiple operations

***********************************************************************/

// No default path

$array = new MapPath( $products );

// One-off path

$element = $array->get( $product_id_array );

// Set new default path

$element = $array->path( $product_id_array )->get();

// Uses default path

$array->set( "whatever" );

// One-off path

$array->set( "whatever", $one_off_path );


// Sets new array and retrieves default path from it

$element = $array->map( $products )->get();




/***********************************************************************

Let's start over with a new object for multiple ops and set a
default path in the constructor.

***********************************************************************/

// Default path

$array = new MapPath( $products, $product_id_array );

// Uses default path ($product_id_array)

$element = $array->get();

$element = array_reverse( $element, true );

// Uses default path ($product_id_array)

$array->set( $element );


// Sets new default path

$element = $array->path( $some_path )->get();

$element = strtoupper( $element );

// Uses default path ($some_path)

$array->set( $element );




/***********************************************************************

Get-Set one liner

***********************************************************************/

$array->set( strtolower( $array->path( $some_path )->get() ) );




/***********************************************************************

Manipulate by reference

***********************************************************************/

$element = array_pop( $array->get( $product_id_array ) );



Looking at the examples, if the sickest solution isn’t possible, then I’m leaning toward favoring the class rather than the function(s) if it had to be one or the other.

Class / Function Names

I’m not sure what name would make the most sense for a built-in class / function(s). A bunch of ideas have crossed my mind, such as:

  • MapPath
  • ArrayPath
  • TreePath
  • MapTree
  • ArrayTree
  • RayPath
  • PathRay
  • MapMap
  • DeepMap
  • DeepArray
  • Mapalot
  • MapNDeep
  • NDeepMap
  • MapGet
  • MapFind
  • MapDig
  • MapDrill
  • MapPluck
  • MapGrab
  • TreeGet
  • TreeFind
  • TreeDig
  • TreeDrill
  • TreePluck
  • TreeGrab
  • ArrayGet
  • ArrayFind
  • ArrayDig
  • ArrayDrill
  • ArrayPluck
  • ArrayGrab
  • array_get_path
  • array_locate
  • array_dig
  • array_drill
  • array_drill_down
  • array_pluck
  • array_grab
  • array_get_nested
  • array_get_desc (descendant)
  • array_desc (descend)
  • array_find
  • array_find_path
  • array_pathfind
  • array_at_path

All else being equal, I’m in favor of a shorter name that is sufficiently descriptive, especially since PHP’s non-OO nature on one hand favors bloating a function name with the “array_” prefix, and on the other hand bloats method calls by necessitating use of a separate utility class to interface with the non-OO array type.

Do other languages have such functionality, and if so, what do they call it? I feel like I may have seen “pluck” somewhere before, but I can’t remember.

Conclusion

With this functionality you could dynamically assemble / source a path to an array element nested at arbitrary depth and then easily access it. Among other things, it would allow you to get the path from an external source such as a database, URL, POST data, or other user input. This would make it much easier and more consistent to work with arrays dynamically. It’s a big piece missing from the PHP array toolkit, and the resulting need to custom-code it or include user-defined code is a big inconvenience that it would be nice to see put to rest.

References

  • [1] I did some searching to see if anything has changed since the last time I looked into this, e.g. a built-in feature introduced to PHP. I didn’t find that, but I did find others needing to do the same thing and coming up with custom code for it, e.g. Drupal: drupal_array_get_nested_value(), drupal_array_set_nested_value()
