MBasith - 22 days ago
Python Question

Python splitting columns with variable spacing

I am trying to parse a table in which some columns are separated by a single space and others by multiple spaces. I can use re.split to separate the columns that have more than one space between them, but I then have to split again for the columns that are separated by a single space. The code below does this by re-splitting the same chunk to extract columns 4 and 5, which seems inefficient. Is there a better or more efficient way to do this?

My Code:

import re

# Columns are separated by two or more spaces, except Time and Values,
# which are separated by a single space.
string = '''No  Mon  Date  Time Values  colors
1  Nov  11-03-2016  23:17:52 Red  colors
2  Nov  11-03-2016  19:18:00 Yellow  colors
3  Nov  11-03-2016  19:18:18 Blue  colors
4  Oct  10-03-2016  19:22:58 Orange Green  colors
5  Oct  10-07-2016  10:37:36 Red Blue Yellow  colors
6  Oct  10-07-2016  10:37:36 White  colors
7  Sep  09-07-2016  10:37:37 Ping White Yellow Green  colors'''

for i in string.splitlines():
    col1 = re.split(r'\s{2,}', i)[0]
    col2 = re.split(r'\s{2,}', i)[1]
    col3 = re.split(r'\s{2,}', i)[2]
    col4 = re.split(r'\s{2,}', i)[3].split()[0]
    col5 = ' '.join(re.split(r'\s{2,}', i)[3].split()[1:])

    print('{:3} | {:3} | {:10} | {:10} | {:23}|'.format(col1, col2, col3, col4, col5))


Output:

No  | Mon | Date       | Time       | Values                 |
1   | Nov | 11-03-2016 | 23:17:52   | Red                    |
2   | Nov | 11-03-2016 | 19:18:00   | Yellow                 |
3   | Nov | 11-03-2016 | 19:18:18   | Blue                   |
4   | Oct | 10-03-2016 | 19:22:58   | Orange Green           |
5   | Oct | 10-07-2016 | 10:37:36   | Red Blue Yellow        |
6   | Oct | 10-07-2016 | 10:37:36   | White                  |
7   | Sep | 09-07-2016 | 10:37:37   | Ping White Yellow Green|

Answer

You can get the first four values with a single split limited to four splits, then split the remainder (the fifth element, arr[4]) on \s{2,} and keep its first part:

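for i in string.splitlines():
    # Split off the first four columns; arr[4] holds the rest of the line.
    arr = re.split(r'\s+', i, maxsplit=4)
    # Keep only the text before the next run of 2+ spaces (the Values column).
    values = re.split(r'\s{2,}', arr[4])[0]
    print('{:3} | {:3} | {:10} | {:10} | {:23}|'.format(arr[0], arr[1], arr[2], arr[3], values))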

No  | Mon | Date       | Time       | Values                 |
1   | Nov | 11-03-2016 | 23:17:52   | Red                    |
2   | Nov | 11-03-2016 | 19:18:00   | Yellow                 |
3   | Nov | 11-03-2016 | 19:18:18   | Blue                   |
4   | Oct | 10-03-2016 | 19:22:58   | Orange Green           |
5   | Oct | 10-07-2016 | 10:37:36   | Red Blue Yellow        |
6   | Oct | 10-07-2016 | 10:37:36   | White                  |
7   | Sep | 09-07-2016 | 10:37:37   | Ping White Yellow Green|
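
If you would rather not split twice per line, one possible variation is a single compiled pattern that captures all five fields at once. This is only a sketch: the names ROW and parse_row and the sample rows are made up for illustration, and it assumes every line has at least two whitespace characters between the Values field and whatever follows it.

```python
import re

# One pattern for the whole row: four whitespace-delimited tokens, then a
# lazily matched Values field that ends at the first run of 2+ whitespace.
# Anything after that run (the trailing column) is matched but not captured.
ROW = re.compile(r'\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*?)\s{2,}.*')


def parse_row(line):
    """Return (no, mon, date, time, values), or None if the line does not fit."""
    m = ROW.fullmatch(line)
    return m.groups() if m else None


# Hypothetical sample rows; the exact spacing is an assumption about the input.
sample = [
    'No  Mon  Date        Time Values                   colors',
    '4   Oct  10-03-2016  19:22:58 Orange Green         colors',
    '7   Sep  09-07-2016  10:37:37 Ping White Yellow Green  colors',
]

for line in sample:
    row = parse_row(line)
    if row:
        print('{:3} | {:3} | {:10} | {:10} | {:23}|'.format(*row))
```

Lines that do not match the layout come back as None instead of raising an IndexError, which may be easier to handle if the table contains blank or malformed rows.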
