CandyCoated - 3 months ago
PHP Question

Recursively mapping file paths in one folder to another folder

Let's say I have a folder (`folder_1`) with the following structure:

/folder_1
    /dir_1
        - file_1_1.txt
        - file_1_2.txt
    /dir_2
        - file_2_1.txt
        /dir_2_1
            - file_2_1_1.txt
    - file_1.txt


Now, let's say I have another folder (`folder_2`) with the following structure:

/folder_2
    /dir_1
        - file_1_1.txt
        - default.txt
    /dir_2
        - file_2_1.txt
        - default.txt
    - default.txt


I need to map every file in folder_1 to a file in folder_2 such that:


  1. /folder_1/dir_1/file_1_1.txt maps to /folder_2/dir_1/file_1_1.txt

  2. /folder_1/dir_1/file_1_2.txt maps to /folder_2/dir_1/default.txt

  3. /folder_1/dir_2/file_2_1.txt maps to /folder_2/dir_2/file_2_1.txt

  4. /folder_1/dir_2/dir_2_1/file_2_1_1.txt maps to /folder_2/dir_2/default.txt

  5. /folder_1/file_1.txt maps to /folder_2/default.txt



I am not the best communicator, so hopefully the pattern above makes sense. The question is really language-agnostic, but an answer in PHP and/or JavaScript would be great.

So far, I was able to accomplish this in PHP using FileIterator, RecursiveDirectoryIterator, and a bunch of custom classes that extract and then map the path to the files one by one.

This makes me wonder if I am missing an easier way to do this simple mapping. Maybe using regex named groups or something?

**Edit:**

Is it possible that for each file (file path) in folder_1, we use a regex pattern to find (reduce) the best match out of a map of all file paths in folder_2?

**Further edit:**

This is for mapping data files in `folder_1` to template files in `folder_2`. If a file in `folder_1` has no exact matching file path (including filename) in `folder_2`, we look for `default.txt` in the corresponding directory. If that `default.txt` is not found, we move up a directory and use the parent directory's `default.txt`. We keep moving up directory levels until we find the first `default.txt`.

Answer

First, use your recursive directory scanner to scan the entire folder_2 directory tree. Build a hash table that contains the file paths, without the /folder_2 prefix. So your hash table would contain:

/dir_1/file_1_1.txt
/dir_1/default.txt
/dir_2/file_2_1.txt
/dir_2/default.txt
/default.txt
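As a rough illustration of this first step, here is a minimal JavaScript sketch (JavaScript because the question welcomes it). It assumes the recursive scan has already produced a list of absolute file paths; the helper name `buildTemplateIndex` and the hard-coded path list are hypothetical, and a `Set` stands in for the hash table.

```javascript
// Hypothetical helper: builds a lookup set of template paths relative to `root`.
function buildTemplateIndex(absolutePaths, root) {
  const index = new Set();
  for (const p of absolutePaths) {
    // Only index paths that really live under the root folder.
    if (p.startsWith(root + "/")) {
      // "/folder_2/dir_1/default.txt" -> "/dir_1/default.txt"
      index.add(p.slice(root.length));
    }
  }
  return index;
}

// File paths as a recursive scan of folder_2 might report them
// (files only, no directories). Hard-coded here for illustration.
const folder2Files = [
  "/folder_2/dir_1/file_1_1.txt",
  "/folder_2/dir_1/default.txt",
  "/folder_2/dir_2/file_2_1.txt",
  "/folder_2/dir_2/default.txt",
  "/folder_2/default.txt",
];

const templateIndex = buildTemplateIndex(folder2Files, "/folder_2");
console.log(templateIndex.has("/dir_1/default.txt")); // true
console.log(templateIndex.has("/dir_1"));             // false
```

Only files go into the set, so a directory name such as /dir_1 never produces a false match.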

Now, start scanning folder_1. When you get a file, strip folder_1 from the front, and look for the resulting string in the hash table. If it's there, then you have a match.

If the file is not there, replace the last segment with "default.txt", and try again. So, when you begin scanning folder_1, you get:

/folder_1/dir_1/file_1_1.txt

You look up /dir_1/file_1_1.txt in the hash table and find it. You have a match.

Next, you get /folder_1/dir_1/file_1_2.txt. You look up /dir_1/file_1_2.txt in the hash table and don't find it. So you replace file_1_2.txt with default.txt, giving you /dir_1/default.txt. You look that up in the hash table, find it, and you have a match.

Now, if /dir_1/default.txt did not exist, then you would again adjust the file name to remove the last directory. That is, you'd remove /dir_1, and you'd look up /default.txt in the hash table.
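To make the lookup order concrete, the following JavaScript sketch lists the keys that would be tried for one file, most specific first. The function name `candidateKeys` is hypothetical, and it assumes the path has already had its /folder_1 prefix stripped and starts with a slash.

```javascript
// Hypothetical helper: lists the hash-table keys to try, most specific first.
// Assumes `relativePath` starts with "/" (the folder_1 prefix already stripped).
function candidateKeys(relativePath) {
  const keys = [relativePath];
  // Directory part of the path, e.g. "/dir_2/dir_2_1" ("" for top-level files).
  let dir = relativePath.slice(0, relativePath.lastIndexOf("/"));
  for (;;) {
    const key = dir + "/default.txt";
    // Avoid listing the same key twice when the file itself is a default.txt.
    if (key !== relativePath) keys.push(key);
    if (dir === "") break; // "/default.txt" was the last possible candidate
    dir = dir.slice(0, dir.lastIndexOf("/")); // move up one directory
  }
  return keys;
}

console.log(candidateKeys("/dir_2/dir_2_1/file_2_1_1.txt"));
// [ '/dir_2/dir_2_1/file_2_1_1.txt',
//   '/dir_2/dir_2_1/default.txt',
//   '/dir_2/default.txt',
//   '/default.txt' ]
```

The first key that exists in the hash table is the match. A top-level file such as /file_1.txt yields just two candidates: itself and /default.txt.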

In pseudo code it looks like this:

for each file in folder_1
    name = strip `/folder_1` from the front of the file's path
    if name in hash table then
        match found
        continue (next file)
    end if
    replace file name (everything after the last '/') with "default.txt"
    loop
        if name in hash table then
            match found
            break (go on to the next file)
        end if
        if name is "/default.txt" then
            // already at the top level, so no match was found
            break (go on to the next file)
        end if
        remove the last slash, and everything between it and the previous slash.
        (so "/dir_1/default.txt" becomes "/default.txt")
    end loop
end for
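Since the question asked for PHP and/or JavaScript, here is one possible JavaScript rendering of the pseudocode, offered as a sketch rather than a definitive implementation. The names `findTemplate` and `mapFiles` are hypothetical, and the two path lists are hard-coded from the question's example instead of coming from a real directory scan. The loop stops explicitly once /default.txt has been tried, so it cannot run forever.

```javascript
// Hypothetical helper: returns the matching key from `index`, or null if none exists.
// Assumes `name` starts with "/" (the source root already stripped).
function findTemplate(name, index) {
  if (index.has(name)) return name; // exact match
  // Directory part of the name; the file name is replaced by "default.txt" below.
  let dir = name.slice(0, name.lastIndexOf("/"));
  while (true) {
    const candidate = dir + "/default.txt";
    if (index.has(candidate)) return candidate;
    if (dir === "") return null; // already at the top level, nothing left to try
    dir = dir.slice(0, dir.lastIndexOf("/")); // move up one directory
  }
}

// Hypothetical helper: maps every source path to a template path (or null).
function mapFiles(sourcePaths, sourceRoot, templatePaths, templateRoot) {
  // The hash table: template paths with the root prefix stripped.
  const index = new Set(templatePaths.map((p) => p.slice(templateRoot.length)));
  const result = new Map();
  for (const path of sourcePaths) {
    const match = findTemplate(path.slice(sourceRoot.length), index);
    result.set(path, match === null ? null : templateRoot + match);
  }
  return result;
}

// Example data taken from the question; a real program would scan the folders.
const folder1 = [
  "/folder_1/dir_1/file_1_1.txt",
  "/folder_1/dir_1/file_1_2.txt",
  "/folder_1/dir_2/file_2_1.txt",
  "/folder_1/dir_2/dir_2_1/file_2_1_1.txt",
  "/folder_1/file_1.txt",
];
const folder2 = [
  "/folder_2/dir_1/file_1_1.txt",
  "/folder_2/dir_1/default.txt",
  "/folder_2/dir_2/file_2_1.txt",
  "/folder_2/dir_2/default.txt",
  "/folder_2/default.txt",
];

const mapping = mapFiles(folder1, "/folder_1", folder2, "/folder_2");
for (const [from, to] of mapping) console.log(from, "->", to);
```

With the example trees, this reproduces the five mappings listed in the question. Each lookup costs at most one hash probe per directory level, so no regex matching over the full list of folder_2 paths is needed.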