Kent Kent - 4 months ago 18
Bash Question

Can someone explain to me why awk's sub() / gsub() works like this?

I know awk can do text/string substitution with sub() and gsub(), like:

kent$ echo "fffff"|awk '{gsub("f", "b")}1'
bbbbb


or

kent$ echo "fffff"|awk '{gsub(/f/, "b")}1'
bbbbb


However, today I made a typo and wrote the line as:

kent$ echo "fffff"|awk '{gsub('f', "b")}1'


awk didn't complain; it produced output as usual, though not the output I expected, and it took me some time to find the error. The output awk gave me was:

bfbfbfbfbfb


Another example:

kent$ echo "fafafafafXX"|awk '{gsub('fa', "B")}1'
BfBaBfBaBfBaBfBaBfBXBXB


The example with sub() is strange too:

kent$ echo "thanks in advance"|awk '{sub('a', "B")}1'
Bthanks in advance


Could someone explain how this strange substitution was done?

kent$ awk --version
GNU Awk 4.0.2


EDIT

Thanks to Joni for the answer. Maybe this example explains it better, so I am adding it here:

kent$ echo "thanks in advance"|awk '{f="k";sub('f', "B")}1'
thanBs in advance

kent$ echo "thanks in advance"|awk '{sub('th ank', "B")}1'
awk: cmd. line:2: {sub(th
awk: cmd. line:2: ^ unexpected newline or end of string

Answer

When you write

echo "fffff"|awk '{gsub('f', "b")}1'

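the shell strips the quotes before awk runs. The single quote before f closes the quoted string, f itself is unquoted, and the next single quote opens a new string; the shell joins the pieces into one argument. So what awk sees is {gsub(f, "b")}1. It interprets f as a variable with an empty value and replaces every empty string in the input with b.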
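To check what awk actually receives, the program text can be printed instead of executed. This is a minimal sketch assuming a POSIX shell; the variable name prog is only for illustration:

```shell
# The shell joins three adjacent pieces into one word before awk ever runs:
#   '{gsub('  (single-quoted)  +  f  (unquoted)  +  ', "b")}1'  (single-quoted)
prog='{gsub('f', "b")}1'
printf '%s\n' "$prog"    # prints: {gsub(f, "b")}1
```

The single quotes never reach awk, which is why awk has nothing to complain about.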

The empty string matches before the first character, between each pair of characters, and after the last one, so awk inserts a b at each of those positions.
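The difference between an unset and an assigned variable can be seen side by side. This is a sketch assuming an awk that behaves like the GNU Awk from the question; out_empty and out_set are illustrative names:

```shell
# f is never assigned, so it is the empty pattern and matches at every position.
out_empty=$(echo "fffff" | awk '{gsub(f, "b")}1')
# Once f holds a value, gsub() uses that value as the pattern instead.
out_set=$(echo "fffff" | awk -v f="f" '{gsub(f, "b")}1')
printf '%s\n' "$out_empty" "$out_set"
# GNU Awk prints bfbfbfbfbfb, then bbbbb
```

This is also why the f="k" example in the question's edit replaces the k in "thanks": by then f is a real variable holding "k".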

You can use // or "" as the pattern for the same effect, without relying on an unset variable:

echo "fffff"|awk '{gsub(//, "b")}1'    # bfbfbfbfbfb
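The same quoting rule explains the error in the 'th ank' example from the question's edit: the unquoted space splits the command line into two arguments, so awk receives a truncated program. This is a sketch assuming a POSIX shell; show_args is a hypothetical helper that prints the argument count and each argument in brackets:

```shell
# Hypothetical helper: print how many arguments arrived, then each one.
show_args() {
    printf '%d\n' "$#"
    printf '[%s]\n' "$@"
}

# The space between th and ank is outside any quotes, so it separates words.
show_args '{sub('th ank', "B")}1'
# prints:
# 2
# [{sub(th]
# [ank, "B")}1]
```

awk takes the first argument, {sub(th, as the whole program and reports the unexpected end of input; the second argument would have been treated as an input file name.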