Thonanth Siddef - 1 year ago
Ruby Question

How to get headers from a CSV.read on a CSV file with only a header row in Ruby

So, I've been messing around with CSV files in Ruby, and I've come across an issue. In my testing, a call of x = CSV.read(file, headers:true) on a file which contains only the header row returns a table that, when converted to an array, gives [[]], and calling x.headers returns []. I can work around this by setting return_headers:true, but I don't actually want the header row included in the returned data; I just want the header names. When I add a fake second row, x.headers returns the headers as expected, and return_headers does not need to be set to true.

Here is some code from before and after adding a row to help visualize the issue.

With headers:true, return_headers:true on a csv file with only the header row:

a = CSV.read("June.csv", headers:true, return_headers:true) # => <CSV::Table mode:col_or_row row_count:1>
a[0] # <CSV::Row "Day":"Day" "Time":"Time">
a.headers # => ["Day", "Time"]

With only headers:true on a csv file with only the header row:

b = CSV.read("June.csv", headers:true) # => <CSV::Table mode:col_or_row row_count:1>
b[0] # => nil
b.headers # => []

With only headers:true on a csv file with the fake second row:

c = CSV.read("June.csv", headers:true) # => <CSV::Table mode:col_or_row row_count:2>
c.headers # => ["Day", "Time"]
c["Day"] # => ["6/1"]

I can't depend on the CSV file always having a second row, because my program is meant to build on it. What am I doing wrong? Is this the intended behavior, or is the problem somewhere in my setup? Do I have to do one read just for the headers, and then another read to get the behavior I would like? I've searched for a good while, but am still having trouble.

Answer

The behavior you are experiencing is expected. A somewhat similar question from two years ago had an answer that pointed out the same issue you're having. That person opened a bug report against Ruby, and the Ruby developers rejected it. According to some people, a file containing only a header row is technically not a well-formed CSV.

However, I agree with you and the person who opened the bug. The headers: true option should populate the table's headers whether or not there is any data on the following lines. The current behavior seems baffling and will only lead to bugs in code.

As a quick fix for your issue I would simply pass return_headers: true and begrudgingly skip over the first entry in the result, which will always be the header row.
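
A minimal sketch of that quick fix, assuming the file has at least a header line. The helper name read_with_headers and the sample data are hypothetical and exist only for illustration. Instead of skipping the first entry by position, it filters on CSV::Row#header_row?, which should amount to the same thing here.

```ruby
require "csv"
require "tempfile"

# Hypothetical helper (not part of the CSV library): returns
# [headers, data_rows] for a CSV file, even when the file contains
# nothing but the header row.
def read_with_headers(path)
  # return_headers: true keeps the header row in the table, so
  # table.headers has a row to take the header names from.
  table = CSV.read(path, headers: true, return_headers: true)
  headers = table.headers
  # Drop the header row again so only real data rows remain.
  data_rows = table.reject(&:header_row?)
  [headers, data_rows]
end

# Demo on a throwaway header-only file (sample data is made up).
Tempfile.create(["june", ".csv"]) do |f|
  f.write("Day,Time\n")
  f.flush
  headers, rows = read_with_headers(f.path)
  p headers      # => ["Day", "Time"]
  p rows.length  # => 0
end
```

The data rows come back as a plain Array of CSV::Row objects rather than a CSV::Table, so lookups such as rows[0]["Day"] still work once the file has data in it.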