pihh-rocks pihh-rocks - 9 days ago 6
PHP Question

Iterate through complex multidimensional array (Trie data structure on PHP , code Improvement)

Recently I was faced with a coding challenge where I had to build a simple trie in php, I managed to do it using php and foreach loops but I'm not happy with the code itself (seems not solid as it should be) so I'm trying to implement it using php's iterators.

So, I have a complex array ( a trie ), for example:

array(
'a' => array(),
'b' => array(
'a' => array(
'c' => array(
'o' => array(
'n' => array()
)
)
)
),
'x' => array(
'x' => array(
'x' => array()
)
)
);


And I want to check if 'bacon' it's a word stored on this trie, the process to find it should be by iterating through the array and check if each node it's nested and exists, for instance: I need in the root the element with key 'b', then inside the array array['b'] , I need to check if I there is array['b']['a'] , then ['b']['a']['c'] and so on.

With a foreach loop I was able to do so by passing the new array by reference and check the keys. Now using a iterator it seems I'm hammering the code a bit ( and the fact that when doing foreachs php copies the array, makes me think that this solution might use a lot more memory than using iterators).

So the code until now it's a while loop that has a condition finished that stops on fail (the current array doesn't have the key I'm searching) or success ( the word it's complete ):

// OUTSIDE THE LOOP
$finished = false;
$string = 'bacon';
$string = str_split($string);
$queue = new SplQueue();
// Enqueue all the letters to the queue -> skipping this because it's boring

// FIRST WHILE LOOP
$iterator = new ArrayIterator($array);
$iterator->key(); // No match with queue -> check next key

// SECOND WHILELOOP
$iterator->next();
$iterator->key(); // Matches with the key I want do dequeue (B),
$next = new ArrayIterator($array[$iterator->key()]);
$queue->dequeue();

// THIRD WHILE LOOP
$next->key(); // Match [A] -> create new iterator
$next = new ArrayIterator($next[$next->key()]);
$queue->dequeue();

// 4TH WHILE LOOP
$next->key(); // Match [C] -> create new iterator
$next = new ArrayIterator($next[$next->key()]);
$queue->dequeue();

// 5TH WHILE LOOP
$next->key(); // Match [O] -> create new iterator
$next = new ArrayIterator($next[$next->key()]);
$queue->dequeue();

// 5TH WHILE LOOP
$next->key(); // Match [N]
$next = new ArrayIterator($next[$next->key()]);
$queue->dequeue(); // queue empty, throw success


So, up until now it's this I have, but the fact I'm creating a new ArrayIterator on each loop it's bothering me, so I was hoping to hear if someone has a better solution to this problem.

Thanks in advance.

Answer

This is a good challenge to solve this problem using iterators. Although I think that iterators are great, but they force you to think in terms of iterative approach. While for some problems it's ok, but for tasks like you described it makes more sense to use recursion.

So, I think that you should accept the answer of @cjohansson. As it is readable and understandable.

But just as a prove of concept here's my solution using RecursiveIteratorIterator. We have to extend this class and alter it a bit to suit our needs and also to reduce the number of unnecessary iterations:

class TrieRecursiveIteratorIterator extends RecursiveIteratorIterator
{
    protected $word;

    public function __construct(
        $word,
        Traversable $iterator,
        $mode = RecursiveIteratorIterator::LEAVES_ONLY,
        $flags = 0
    ) {
        $this->word = str_split($word);

        parent::__construct($iterator, $mode, $flags);
    }

    public function next()
    {
        if ($this->currentLetterMatched()) {
            $this->updatePrefix();
            $this->setPrefixed();   
        }  

        parent::next();
    }

    protected $prefix = [];
    protected function updatePrefix()
    {
        $this->prefix[$this->getDepth()] = $this->key();
    }

    protected $prefixed = [];
    protected function setPrefixed()
    {
        $this->prefixed = $this->current();
    }

    public function valid()
    {
        if (
            $this->getDepth() < count($this->prefix)
            || count($this->word) === count($this->prefix)
        ) {
            return false;
        }

        return parent::valid();
    }

    public function callHasChildren()
    {
        if ($this->currentLetterMatched()) {
            return parent::callHasChildren();
        }

        return false;
    }

    protected function currentLetterMatched()
    {
        return isset($this->word[$this->getDepth()])
            && $this->key() === $this->word[$this->getDepth()];
    }

    public function testForMatches()
    {
        foreach ($this as $_) {
        }

        return $this;
    }

    public function getPrefix()
    {
        return implode('', $this->prefix);
    }

    public function getPrefixed()
    {
        return $this->prefixed;
    }

    public function matchFound()
    {
        return ($this->word === $this->prefix);
    }

    public function exactMatchFound()
    {
        return ($this->word === $this->prefix) && empty($this->prefixed);
    }

    public function prefixMatchFound()
    {
        return ($this->word === $this->prefix) && !empty($this->prefixed);
    }
}

Then we can do following:

$iterator = new TrieRecursiveIteratorIterator(
    $word,
    new RecursiveArrayIterator($trie),
    RecursiveIteratorIterator::SELF_FIRST
);

$iterator->testForMatches();

After that, we can ask our $iterator different things, such as:

  1. If match was found: $iterator->matchFound();
  2. If exact match was found: $iterator->exactMatchFound();
  3. If there are words that prefixed with given word: $iterator->prefixMatchFound();
  4. Get prefix itself (when either of matches were found the prefix will be equal to given word): $iterator->getPrefix();
  5. Get endings prefixed with given word: $iterator->getPrefixed().

Here is working demo.

So as you can see this realization is not as straight forward as recursion one. And while I am a big fan of iterators and SPL usage, but this is not a silver bullet and you should chose tools that fits your current needs better.

Also, this is outside the domain, but my class violates Single responsibility principle. This was intentional for the sake of simplicity. In real life there would be the other class wich will use our iterator as dependency.

Comments