Dan Dan - 3 months ago 8
Linux Question

Awk ^ (caret) exclusion

I was shown the command below (running on Fedora 24):

Input example:

/sys/devices/system/memory/memory101/state:offline
/sys/devices/system/memory/memory104/state:offline
/sys/devices/system/memory/memory107/state:offline


AWK command executed on the input:

grep offline data/onlineMemory | awk -F '[^0-9]+' '{print $2}'


which gives output like:

101
104
107


But when I print awk's $1, I see nothing. Where has the other part of the lines disappeared to?

Why is $2 set to the digits? I thought "^" in awk would negate the characters.

Thanks

Answer

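This is weird, but normal. Inside brackets, ^ does negate the character class, so [^0-9]+ matches any run of one or more non-digit characters. Since you are setting the field separator to that expression, awk understands it as: everything apart from the digits is a field separator, and separators are never part of a field.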

#field1                                              field3
#<|                                                  |>
#  /sys/devices/system/memory/memory101/state:offline
#  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^
#              FS                   ^^^       FS
#                                   field2

This way, almost everything in your string becomes a field separator:

$ awk -F '[^0-9]+' '{for (i=1;i<=NF;i++) printf "line=%d. field num %d is --> %s\n", NR, i, $i}' file
line=1. field num 1 is --> 
line=1. field num 2 is --> 101
line=1. field num 3 is --> 
line=2. field num 1 is --> 
line=2. field num 2 is --> 104
line=2. field num 3 is --> 
line=3. field num 1 is --> 
line=3. field num 2 is --> 107
line=3. field num 3 is --> 
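
One way to see where the separators fall is to replace each separator run with a visible marker. This is only a minimal sketch: the [FS] marker and the variable names are made up for illustration, and it reuses the same regex as a gsub() pattern rather than as FS.

```shell
# Sample line taken from the question's input.
line='/sys/devices/system/memory/memory101/state:offline'

# Replace every run of non-digits (the same regex used as FS above)
# with the literal marker [FS], so the separators become visible.
marked=$(printf '%s\n' "$line" | awk '{ gsub(/[^0-9]+/, "[FS]"); print }')

echo "$marked"   # prints [FS]101[FS]
```

The record starts and ends with a separator run, so there is nothing left over for $1 or $3.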

Why is this happening? Because of the way awk sets the fields:
The 1st field is everything up to the first FS, the 2nd is everything between the 1st and the 2nd FS, and so on; finally, the last field ($NF) consists of everything from the last FS up to the end of the record:

$ awk -F ';' '{for (i=1;i<=NF;i++) printf "line=%d. field num %d is --> %s\n", NR, i, $i}' <<< ";hello;"
line=1. field num 1 is --> 
line=1. field num 2 is --> hello
line=1. field num 3 is --> 
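
Put differently, a record containing n separators is split into n+1 fields, even when some of them are empty. A small sketch of that rule follows; count_fields is a hypothetical helper written for this illustration, not a standard command.

```shell
# Hypothetical helper: print NF for one record.
#   $1 = field separator (passed to awk -F), $2 = the input record
count_fields() {
    printf '%s\n' "$2" | awk -F "$1" '{ print NF }'
}

count_fields ';' 'hello'      # no separator   -> prints 1
count_fields ';' ';hello'     # one separator  -> prints 2
count_fields ';' ';hello;'    # two separators -> prints 3

# The line from the question has two separator runs, so it also has 3 fields.
count_fields '[^0-9]+' '/sys/devices/system/memory/memory101/state:offline'
```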

So in this case you are making the FS something rather complex, which can be summarized with this example, where the FS is any run of characters other than 3:

$ awk -F '[^3]+' '{for (i=1;i<=NF;i++) printf "line=%d. field num %d is --> %s\n", NR, i, $i}' <<< "abcde3fghi"
line=1. field num 1 is --> 
line=1. field num 2 is --> 3
line=1. field num 3 is --> 

So what awk does in this case is split the string abcde3fghi on the FS "anything but 3". Everything before the first FS becomes the first field (empty, because the string starts with a separator) and everything after the last occurrence of FS becomes the last field (also empty, because the string ends with one). This leaves just a single non-empty field, the 2nd one.
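
If relying on an empty $1 feels fragile, the digits can also be located directly instead of being defined by what surrounds them. The following is a sketch of an alternative, not the command from the question: it uses awk's standard match() and substr() functions, extract_number is a hypothetical name, and it assumes the first run of digits on each line is the memory block number.

```shell
# Hypothetical helper: for each input line containing "offline",
# print the first run of digits found on that line.
extract_number() {
    awk '/offline/ && match($0, /[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
}

# Usage with two lines from the question's input:
printf '%s\n' \
    '/sys/devices/system/memory/memory101/state:offline' \
    '/sys/devices/system/memory/memory104/state:offline' |
    extract_number
```

match() sets RSTART and RLENGTH to the position and length of the match, so substr() returns just the digits and no field splitting is involved.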