imagineerThat imagineerThat - 24 days ago 13
Python Question

Python CSV module: How can I account for multiple tables within the same file?

I have an Excel file that I converted to CSV. There are several tables each separated by an empty row. After converting the Excel file to CSV, I see each empty row represented by a row of commas, with a comma for every column/field element. Can the CSV module (or some other Python module) account for multiple tables from this information? If not, is my only option to separate the tables into different files manually in Excel before conversion?

I know the CSV module will turn each row into a list. I'd like a table to be its own list and all the rows it has as lists within. Each table has the first row as fields. The fields can be different from table to table, and the number of fields can be different as well.

Answer

Sure, it's easy to read the data in that way. You have to decide what constitutes the separator row (is it sufficient to check for the first column being empty, or do you have to check that all columns are empty?) Assuming just the first row (and being extra verbose for clarity):

 rdr = csv.reader(open(filename))

 tables = []
 this_table = []
 tables.append(this_table)
 for row in rdr:
      if row[0] is None:
         this_table = []
         tables.append(this_table)
      this_table.append(row)

The result is a list called tables. Each entry is a list containing the data for one table. Each entry in a table is a list containing the column values for one row.

Comments