Malgi Malgi - 2 months ago 5x
Linux Question

Add a variable to a column in a CSV file

I have a large file (~10GB) and I want to duplicate that file 10 times but each time add a variable to the first column:

for i in (1, 10):
var = (i-1) * 1000
# add var to the first column of the file and save the file as file(i).csv

So far I have tried:

for i in {1..10}
j=$(( $i - t ))
person_id=$(( j * add ))
awk -F"," 'BEGIN{OFS=","} NR>1{$1=$1+$person_id} {print $0}' file.csv > file$i.csv

but no change in column value.


Awk variables are different from shell variables.


awk -F"," 'BEGIN{OFS=","} NR>1{$1=$1+$person_id} {print $0}' file.csv > file$i.csv


awk -F"," -v id="$person_id" 'BEGIN{OFS=","} NR>1{$1=$1+id} {print $0}' file.csv > "file$i.csv"

This uses the -v option to define an awk variable id whose value is the value of the shell variable person_id.

Because , is not a shell-active character, the code can be simplified. Also, changing the location of the definition of OFS can further shorten the code:

awk -F, -v id="$person_id" 'NR>1{$1+=id} 1' OFS=, file.csv > "file$i.csv"

Lastly, we replaced {print $0} with the cryptic shorthand 1. (This works because awk interprets 1 as a logical condition which it evaluates to true and, since no action was supplied, awk will perform the default action which is to print the line.)