ElenaDBA - 5 months ago
VB.NET Question

Splitting a string on a number of whitespaces using Regex and VB.NET

I have a table in the following format:

col1 col2 col3 col4 col5 col6
test 300 25.5 14 345 11
test2 100 23 11 203
test3 31 44 175


The number of columns is fixed and the columns are aligned, but some cells can be blank. Columns are separated by at least two space characters, so I am using:

    If Regex.Split(line, "[ ]{2,}").Length = 6 Then
        sw.WriteLine("col1: " & Regex.Split(line, "[ ]{2,}")(0))
        sw.WriteLine("col2: " & Regex.Split(line, "[ ]{2,}")(1))
        sw.WriteLine("col3: " & Regex.Split(line, "[ ]{2,}")(2))
        sw.WriteLine("col4: " & Regex.Split(line, "[ ]{2,}")(3))
        sw.WriteLine("col5: " & Regex.Split(line, "[ ]{2,}")(4))
        sw.WriteLine("col6: " & Regex.Split(line, "[ ]{2,}")(5))
    End If


inside a loop that grabs each line and writes it to a stream writer.

This works when no column values are missing. In the example above, the first line splits correctly, but for the second and third lines
Regex.Split(line, "[ ]{2,}").Length
returns 5 and 4 respectively. Is there a workaround? I want
Regex.Split(line, "[ ]{2,}").Length
to return 6, the number of columns, with the array entries for the blank columns left empty.

Answer

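I'm not certain what your expected output is, but capping the number of whitespace characters matched per split should work. A run longer than the cap, such as the gap left by a blank column, is then split more than once, which produces an empty entry. Whether every row ends up with exactly six entries depends on the actual column widths. In C#: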

    string[] s = Regex.Split(line, @"\s{2,6}");
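Since the question is in VB.NET, here is a rough sketch of the same idea in that language. The helper name `SplitColumns`, the padding step for trailing blank columns, and the sample row are assumptions made for illustration (the original spacing of the table is not shown above), so treat this as a starting point rather than a finished solution.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BoundedSplitDemo
    ' Hypothetical helper: splits on runs of 2 to 6 whitespace characters and
    ' pads the result with empty strings up to expectedCount entries.
    Function SplitColumns(line As String, expectedCount As Integer) As String()
        Dim parts As String() = Regex.Split(line, "\s{2,6}")
        If parts.Length < expectedCount Then
            Dim oldLength As Integer = parts.Length
            ReDim Preserve parts(expectedCount - 1)
            For i As Integer = oldLength To expectedCount - 1
                parts(i) = String.Empty
            Next
        End If
        Return parts
    End Function

    Sub Main()
        ' Assumed sample row: col3 is blank, so the gap after "100" is 8 spaces.
        Dim line As String = "test2  100" & New String(" "c, 8) & "11  203  7"

        ' Unbounded pattern from the question: the 8-space gap is one separator.
        Console.WriteLine(Regex.Split(line, "[ ]{2,}").Length) ' prints 5

        ' Bounded pattern: the gap is matched as 6 + 2 spaces, leaving an empty entry.
        Dim cols As String() = SplitColumns(line, 6)
        Console.WriteLine(cols.Length) ' prints 6

        For i As Integer = 0 To cols.Length - 1
            Console.WriteLine("col" & (i + 1) & ": " & cols(i))
        Next
    End Sub
End Module
```

The upper bound of 6 only works when a blank column widens the gap beyond six characters and a normal gap never does, so the bound would need to be tuned to the real file.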

Code:

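    using System;
    using System.Text.RegularExpressions;

    class Program {
        static void Main() {
            string line;
            // Read input line by line until the end of the stream.
            while ((line = Console.ReadLine()) != null) {

                // Console.WriteLine(line); // original output

                // Split on runs of 2 to 6 whitespace characters; a longer run
                // is split more than once, which yields empty entries.
                string[] s = Regex.Split(line, @"\s{2,6}");

                int i = 0;
                foreach (string word in s) {
                    i++;
                    Console.WriteLine("col" + i + ": " + word);
                }
            }
        }
    }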

Example:

http://ideone.com/0lZLk1
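Because the question says the columns are aligned, a different technique from the regex split above is to cut each line at fixed character offsets. The sketch below is in VB.NET; the helper name `SliceColumns`, the start offsets, and the sample row are all assumptions, since the real column positions are not given in the question.

```vbnet
Imports System

Module FixedWidthDemo
    ' Hypothetical helper: cuts a line at the given column start offsets and
    ' returns one trimmed entry per column (empty when the cell is blank).
    Function SliceColumns(line As String, starts As Integer()) As String()
        Dim cols(starts.Length - 1) As String
        For i As Integer = 0 To starts.Length - 1
            Dim startAt As Integer = starts(i)
            ' A column ends where the next one starts; the last one runs to the end.
            Dim endAt As Integer = If(i + 1 < starts.Length, starts(i + 1), line.Length)
            If startAt >= line.Length Then
                ' The line ends before this column begins.
                cols(i) = String.Empty
            Else
                endAt = Math.Min(endAt, line.Length)
                cols(i) = line.Substring(startAt, endAt - startAt).Trim()
            End If
        Next
        Return cols
    End Function

    Sub Main()
        ' Assumed layout: six columns starting at these offsets.
        Dim starts As Integer() = {0, 7, 13, 19, 25, 31}
        ' Assumed sample row with col3 and col6 blank.
        Dim line As String = "test2".PadRight(7) & "100".PadRight(6) &
                             "".PadRight(6) & "11".PadRight(6) & "203"

        Dim cols As String() = SliceColumns(line, starts)
        Console.WriteLine(cols.Length) ' prints 6
        For i As Integer = 0 To cols.Length - 1
            Console.WriteLine("col" & (i + 1) & ": " & cols(i))
        Next
    End Sub
End Module
```

This always returns one entry per offset regardless of how wide the gaps are, at the cost of having to know the offsets up front, for example by reading them from the header line.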