Exception Exception - 10 days ago 5
Java Question

Java, writing my own split string method

I need to be able to write my own split string method so that input like

String[] test1 = mySplit("ab#cd#efg#", "#");
System.out.println(Arrays.toString(test1));


will print
[ab, #, cd, #, efg, #]
to the console.
So far I've got it to split like that, but my approach leaves empty strings in the output when two delimiters are in a row or when a delimiter is at the start of the input.

public static String[] mySplit(String str, String regex)
{
    String[] storeSplit = new String[str.length()];
    char compare1, compare2;
    int counter = 0;

    //Initializes all the string[] values to "" so when the string
    //and char concatenate, 'null' doesn't appear.
    for(int i=0; i<str.length(); i++) {
        storeSplit[i] = "";
    }

    //Puts the str values into the split array and concatenates until
    //a delimiter is found, then it moves to the next array index.
    for(int i=0; i<str.length(); i++) {
        compare1 = str.charAt(i);
        compare2 = regex.charAt(0);

        if(!(compare1 == compare2)) {
            storeSplit[counter] += ""+str.charAt(i);
        } else {
            counter++;
            storeSplit[counter] = ""+str.charAt(i);
            counter++;
        }
    }
    return storeSplit;
}


When I use that method in my Test main, I get the output [ab, #, cd, #, efg, #, , , , ]. I'm lost on how to get rid of the empty entries, and I also need to support multiple delimiters, which my code currently doesn't handle.

Also I know this code is really sloppy at the moment, just trying to lay down the concepts before the optimization.

Answer

The problem is straightforward: you need one offset that walks through the string finding new matches (pos), and another that marks the end of the last match you found (start).

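import java.util.Vector;

// Note: despite its name, 'regex' is treated as a literal, non-empty delimiter.
public static String[] mySplit(String str, String regex)
{
    Vector<String> result = new Vector<String>();
    int start = 0;                    // end of the previous match
    int pos = str.indexOf(regex);     // start of the current match
    while (pos>=start) {
        if (pos>start) {
            // text between the previous match and this one
            result.add(str.substring(start,pos));
        }
        start = pos + regex.length();
        result.add(regex);            // keep the delimiter as its own element
        pos = str.indexOf(regex,start);
    }
    if (start<str.length()) {
        // text after the last delimiter
        result.add(str.substring(start));
    }
    String[] array = result.toArray(new String[0]);
    return array;
}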

This avoids extra looping and copies each character at most once. On older JVMs (before Java 7u6), substring shared the original character buffer, so no characters were copied at all and only small string objects were created; newer JVMs copy each substring's characters once. Either way, no string concatenation is done, which matters because appending one character at a time in a loop creates a new string on every step.
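The question also asks about multiple delimiters, which the method above does not cover. Here is a minimal sketch of one way to extend the same start/pos idea, assuming that "multiple delimiters" means a set of single-character delimiters passed together in one string (for example "#,"). The class name SplitSketch and the method name splitKeepingDelimiters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Hypothetical helper: every character of 'delimiters' is treated as its own
    // one-character delimiter (an assumption, not something the question specifies).
    static String[] splitKeepingDelimiters(String str, String delimiters) {
        List<String> result = new ArrayList<>();
        int start = 0; // beginning of the text segment currently being scanned
        for (int pos = 0; pos < str.length(); pos++) {
            if (delimiters.indexOf(str.charAt(pos)) >= 0) {
                if (pos > start) {
                    // non-empty text before this delimiter
                    result.add(str.substring(start, pos));
                }
                // the delimiter itself becomes its own element
                result.add(str.substring(pos, pos + 1));
                start = pos + 1;
            }
        }
        if (start < str.length()) {
            // trailing text after the last delimiter
            result.add(str.substring(start));
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints [ab, #, cd, #, efg, #]
        System.out.println(Arrays.toString(splitKeepingDelimiters("ab#cd#efg#", "#")));
        // a leading delimiter and two delimiters in a row produce no empty entries
        System.out.println(Arrays.toString(splitKeepingDelimiters("#ab##cd,ef", "#,")));
    }
}
```

Because text is only added when pos > start, consecutive delimiters and a delimiter at the start of the input do not leave empty strings in the result. If the delimiters need to be longer than one character, this sketch would have to be adapted, for instance by checking str.startsWith(delimiter, pos) for each candidate.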