NeoVe - 1 month ago
Python Question

Split .csv while keeping description first row

I have this code:

#!/usr/bin/env python

# Import module
import os

# Define file_splitter function
def file_splitter(fullfilepath, lines=50):
    """Splits a plain text file based on line count."""
    path, filename = os.path.split(fullfilepath)
    basename, ext = os.path.splitext(filename)

    # Open source text file
    with open(fullfilepath, 'r') as f_in:
        try:
            # Open first output file
            f_output = os.path.join(path, '{}_{}{}'.format(basename, 0, ext))
            f_out = open(f_output, 'w')

            # Read input file one line at a time
            for i, line in enumerate(f_in):
                # When current line can be divided by the line
                # count close the output file and open the next one
                if i % lines == 0:
                    f_out.close()
                    f_output = os.path.join(path, '{}_{}{}'.format(basename, i, ext))
                    f_out = open(f_output, 'w')

                # Write current line to output file
                f_out.write(line)

        finally:
            # Close last output file
            f_out.close()

# Call function with source text file and line count
file_splitter('Products_con_almacen_stock.csv', 12000)


This splits the file Products_con_almacen_stock.csv into chunks of 12,000 lines.

Every chunk has the columns and rows, but only the first chunk has the header. I'd like to preserve the first descriptive row in every chunk.

Is this possible?

Thanks in advance!

Answer

You could do something like this:

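first = True
for line in f_in:  # f_in is the open input file from the question
    if first:
        # Remember the first row so it can be written to every chunk
        header = line
        first = False
    ....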

You can then write header at the top of every subsequent output file.
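
Applied to the function from the question, the idea might look roughly like the sketch below. The name split_csv_with_header and the returned list of paths are my own additions for illustration, not part of the original code. It assumes the header is exactly the first physical line of the file and that no quoted field contains a line break; if that does not hold, the csv module would be a safer way to read rows.

```python
import os


def split_csv_with_header(fullfilepath, lines=50):
    """Split a text file into chunks of `lines` data rows,
    repeating the first row at the top of every chunk."""
    path, filename = os.path.split(fullfilepath)
    basename, ext = os.path.splitext(filename)
    outputs = []  # paths of the chunk files written (illustrative addition)

    with open(fullfilepath, 'r') as f_in:
        # Read the header once, before looping over the data rows
        header = f_in.readline()
        if not header:
            return outputs  # empty input file, nothing to split

        f_out = None
        try:
            # i now counts data rows only, since the header is already consumed
            for i, line in enumerate(f_in):
                if i % lines == 0:
                    # Close the previous chunk (if any) and start a new one
                    if f_out is not None:
                        f_out.close()
                    f_output = os.path.join(
                        path, '{}_{}{}'.format(basename, i, ext))
                    f_out = open(f_output, 'w')
                    f_out.write(header)
                    outputs.append(f_output)

                # Write current data row to the current chunk
                f_out.write(line)
        finally:
            # Close the last chunk, if one was opened
            if f_out is not None:
                f_out.close()

    return outputs
```

Calling split_csv_with_header('Products_con_almacen_stock.csv', 12000) would then produce files named like Products_con_almacen_stock_0.csv, Products_con_almacen_stock_12000.csv and so on, each starting with the header row followed by up to 12000 data rows.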
