R4j R4j - 4 months ago 18
Java Question

Search string in two dimensional string array java

I have a two-dimensional string array that looks like this:
[Image: the two-dimensional array]

The first column contains the characters of many strings; the other columns hold extra data for each character.
I want to search for a string (possibly converted to a character array) in this array and get the indexes (start - end) of every match. For example, when I search with the key "next", the result should be [5 - 8], [13 - 16] (the highlighted parts in the image above).
In short, I need a method that looks like this:

public static List<Interval> search(String searchText, String[][] data, int columnsCount, int rowCount) {
    // Convert search text to String array
    String[] searchArr = getStringArray(searchText);
    // then search in data

}

// where Interval is:
public class Interval {
    public int start;
    public int end;
}


Is there a fast way to search like this? My data is very large.

Thanks in advance!

Answer

I would recommend adapting the String[][] to a CharSequence. You are then free to do everything you can do with a CharSequence, which means you can use java.util.regex.Matcher to search for the string and don't need to implement your own search algorithm.

For example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String[][] array2d = createArray();

        // Expose array2d[0] as one continuous character sequence.
        int charSeqColumn = 0;
        CharSequence charSequence = new Array2DColumnCharSequnce(array2d, charSeqColumn);

        System.out.println(charSequence.toString());

        Pattern pattern = Pattern.compile("ext");
        Matcher matcher = pattern.matcher(charSequence);

        while (matcher.find()) {
            String matchGroup = matcher.group();
            int start = matcher.start();
            int end = matcher.end() - 1; // end() is exclusive, so make it inclusive

            // String.format avoids MessageFormat's digit grouping ("1,234") for large indexes.
            String msg = String.format("%s matched at: [%d] - [%d]", matchGroup, start, end);
            System.out.println(msg);
        }
    }

    private static String[][] createArray() {
        String[][] array2d = new String[2][10];
        array2d[0][0] = "N";
        array2d[0][1] = "e";
        array2d[0][2] = "x";
        array2d[0][3] = "t";
        array2d[0][4] = " ";
        array2d[0][5] = "N";
        array2d[0][6] = "e";
        array2d[0][7] = "x";
        array2d[0][8] = "t";
        array2d[0][9] = " ";

        array2d[1][0] = "H";
        array2d[1][1] = "e";
        array2d[1][2] = "l";
        array2d[1][3] = "l";
        array2d[1][4] = "o";
        array2d[1][5] = "W";
        array2d[1][6] = "o";
        array2d[1][7] = "r";
        array2d[1][8] = "l";
        array2d[1][9] = "d";
        return array2d;
    }
}

will output

Next Next 
ext matched at: [1] - [3]
ext matched at: [6] - [8]
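
The question searches for "next", while the data above contains "Next" and the example pattern is "ext". A minimal sketch, assuming the search text should be treated as plain text (not as a regular expression) and matched without regard to case; the names LiteralSearchSketch and findAll are hypothetical and not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralSearchSketch {

    // Returns {start, end} pairs (both inclusive) for every non-overlapping match.
    static List<int[]> findAll(CharSequence text, String searchText) {
        // Pattern.LITERAL switches off regex metacharacters such as '.' or '(';
        // CASE_INSENSITIVE lets "next" match "Next" (an assumption about what is wanted).
        Pattern pattern = Pattern.compile(searchText,
                Pattern.LITERAL | Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);

        List<int[]> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(new int[] {matcher.start(), matcher.end() - 1});
        }
        return hits;
    }

    public static void main(String[] args) {
        // A plain String is used here; any CharSequence, including the adapter, works too.
        for (int[] hit : findAll("Next Next ", "next")) {
            System.out.println("[" + hit[0] + " - " + hit[1] + "]");
        }
    }
}
```

Because findAll accepts any CharSequence, the adapted array can be passed in directly instead of the String used in main.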

I would implement the CharSequence adapter like this:

class Array2DColumnCharSequnce implements CharSequence {

    private final String[][] array2d;
    private final int column;
    private final int startIndex;
    private final int endIndex; // exclusive

    public Array2DColumnCharSequnce(String[][] array2d, int column) {
        this(array2d, column, 0, array2d[column].length);
    }

    public Array2DColumnCharSequnce(String[][] array2d, int column,
            int startIndex, int endIndex) {
        this.array2d = array2d;
        this.column = column;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    @Override
    public int length() {
        return endIndex - startIndex;
    }

    @Override
    public char charAt(int index) {
        String charString = array2d[column][startIndex + index];
        return charString.charAt(0);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        // start and end are relative to this sequence, so shift them by startIndex.
        return new Array2DColumnCharSequnce(array2d, column,
                startIndex + start, startIndex + end);
    }

    @Override
    public String toString() {
        return new StringBuilder(this).toString();
    }
}

Note: The Array2DColumnCharSequnce is just a quick implementation. It does not handle exceptions yet, nor does it address what happens when a cell holds a string with more than one character.
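
To connect this back to the question, here is a rough sketch of a search method that returns Interval objects. It rests on several assumptions that are not in the original answer: the data is laid out as data[row][column] (as the question's description of a "first column" suggests), a null or empty cell is treated as a space, and only the first character of a longer cell is used. ColumnView and ColumnSearch are hypothetical names, and the signature is simplified compared with the one in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Interval as in the question; start and end are both inclusive.
class Interval {
    public int start;
    public int end;

    Interval(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

// Hypothetical read-only view of one column of a row-major table.
class ColumnView implements CharSequence {
    private final String[][] data;
    private final int column;
    private final int from;
    private final int to; // exclusive

    ColumnView(String[][] data, int column, int from, int to) {
        this.data = data;
        this.column = column;
        this.from = from;
        this.to = to;
    }

    @Override
    public int length() {
        return to - from;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        String cell = data[from + index][column];
        // Assumption: null or empty cells count as a space.
        return (cell == null || cell.isEmpty()) ? ' ' : cell.charAt(0);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new ColumnView(data, column, from + start, from + end);
    }

    @Override
    public String toString() {
        return new StringBuilder(this).toString();
    }
}

class ColumnSearch {
    // Finds every non-overlapping literal occurrence of searchText in the given column.
    static List<Interval> search(String searchText, String[][] data, int column) {
        Matcher matcher = Pattern.compile(searchText, Pattern.LITERAL)
                .matcher(new ColumnView(data, column, 0, data.length));
        List<Interval> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(new Interval(matcher.start(), matcher.end() - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "say next or next";
        String[][] data = new String[text.length()][2];
        for (int i = 0; i < text.length(); i++) {
            data[i][0] = String.valueOf(text.charAt(i)); // character column
            data[i][1] = "extra";                        // placeholder extra data
        }
        for (Interval hit : search("next", data, 0)) {
            System.out.println("[" + hit.start + " - " + hit.end + "]");
        }
    }
}
```

The view never copies the table, so memory use stays flat even for large data; the cost is one array lookup per character that the matcher inspects.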

Why use a CharSequence decorator

Compared with other approaches, adapting the array to a CharSequence means you work with a standard Java interface that many other classes already accept, which makes the solution very flexible.

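Many frequently used standard Java classes take a CharSequence as a parameter, for example String.contentEquals(CharSequence), String.replace(CharSequence, CharSequence) and Pattern.matcher(CharSequence), all of which are used in this answer.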

Use the code above and try this to see how flexible the decorator is:

public static void main(String[] args) {
    String[][] array2d = createArray();

    CharSequence charSequence = new Array2DColumnCharSequnce(array2d, 0);

    // Compare a String with the adapted array, character by character.
    boolean contentEquals = "Next Next ".contentEquals(charSequence);
    System.out.println(contentEquals);

    // Use one adapted array as the search target and another as the replacement.
    CharSequence column1CharSequence = new Array2DColumnCharSequnce(array2d, 1);
    String replaced = "I want to say Next Next ".replace(charSequence, column1CharSequence);
    System.out.println(replaced);
}

will output

true
I want to say HelloWorld

In the end, everyone has to decide what they want and what fits the situation. I prefer implementations that give me more options if I can get them "almost" for free.
