I'm trying to copy these files over from S3 to Redshift, but they are all written in the format Row(column1=value, column2=value, ...), which Redshift obviously can't parse. How do I get a DataFrame to write out as normal CSV?
I'm calling it like this:
The spark-csv approach is a good one and should work. Looking at your code, it seems you are calling df.write on the original DataFrame df, which is why your transformations are being ignored. Instead, capture the transformed DataFrame in a new variable and write that:
final_data = ...  # do your logic on df and return a new DataFrame
final_data.write.format('com.databricks.spark.csv').save('results')