Randix Lai Randy - 7 days ago
Bash Question

Passing a shell script variable to awk when the double quotes need to be preserved

I have some logs called ts.log that look like

[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Parameters: []
[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Types: []
[957670][DEBUG:2016-11-30 16:50:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} ResultSet
[957670][DEBUG:2016-11-30 16:51:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Header: [LAST_INSERT_ID()]
[957670][DEBUG:2016-11-30 16:52:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Result: [731747]
[065417][DEBUG:2016-11-30 16:53:17,986:sdk.protocol.process.InitProcessor.process(InitProcessor.java:61)]query String=requestid=10547

I have a script that contains something like

cat ts.log | awk -F '[ ,]' '{if($2 ~ /^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]/ && $2>="16:50:17"){print $0}}'

Instead of hard-coding the time (16:50:17), I want to pass the script's $1 to awk, so that all I need to do is run ./script hh:mm:ss. The script will look like

cat ts.log | awk -v var=$begin -F '[ ,]' '{if($2 ~ /^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]/ && $2>="var"){print $0}}'

But the double quotes need to be there, or it won't work.
I tried 2>"\""var"\""
but that doesn't work either.
So is there a way to keep the double quotes there?
Preferred result: ./script
then extract the log from the time specified as $1.


There are many ways to do what you want.

Option 1: Enclosing the awk program in double quotes

awk -F '[ ,]'  "\$2 ~ /^..:..:../ && \$2 >= \"${begin}\" "  ts.log
  • Inside double-quoted strings, bash performs variable substitution, so $begin or ${begin} is replaced with the value of the shell variable (whatever the user passed in).
  • Side effect: awk field references that start with $ must be escaped with '\', or bash will try to expand them before it runs awk.
  • To get a literal double quote (") inside a double-quoted bash string, escape it with '\'; in bash, " \"16:50\" " yields "16:50". (This doesn't work in single-quoted strings, where bash performs neither variable expansion nor escape processing.)
  • To see which substitutions bash makes when it runs the script, run it with the -x trace option (it's very enlightening):

    $ bash -x yourscript.sh 16:50
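The expansion described above can also be inspected directly. The following is only a sketch: the value of begin and the two input lines are made-up stand-ins for "$1" and ts.log. It prints the program text that awk actually receives and then runs it.

```shell
#!/usr/bin/env bash
# Hypothetical value; in the real script this would come from "$1".
begin="16:50:17"

# Bash expands ${begin} and turns \$ and \" into literal $ and ".
prog="\$2 ~ /^..:..:../ && \$2 >= \"${begin}\""
printf '%s\n' "$prog"

# Run the generated program on two made-up lines shaped like the log.
out=$(printf '%s\n' 'x 16:49:17,1' 'x 16:51:17,2' | awk -F '[ ,]' "$prog")
printf '%s\n' "$out"
```

Because the shell value is pasted into the program text, a value containing a double quote or a backslash would change the awk program itself, which is one reason to prefer Option 2 when the input is not trusted.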

Option 2: Using awk variables

awk -F '[ ,]' -v begin="$begin" '$2 ~ /^..:..:../ && $2 >= begin'  ts.log
  • Here an awk variable named begin is created with the option -v varname=value.
  • It can then be used anywhere in the awk program like any other awk variable (it needs neither double quotes nor $).
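Put together as a whole script, Option 2 might look like the sketch below. It assumes the time arrives as the first argument; here begin is hard-coded and a temporary file with abridged copies of three log lines stands in for ts.log, so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Stand-in for the user's argument; in the real script: begin="$1"
begin="16:50:17"

# Temporary file with abridged log lines, standing in for ts.log.
log=$(mktemp)
cat > "$log" <<'EOF'
[957670][DEBUG:2016-11-30 16:49:17,968:Log4jImpl.debug]{pstm-9805256} Types: []
[957670][DEBUG:2016-11-30 16:50:17,969:Log4jImpl.debug]{rset-9805257} ResultSet
[065417][DEBUG:2016-11-30 16:53:17,986:InitProcessor.process]query String=requestid=10547
EOF

# begin is an awk variable here, so it is written without quotes or $.
out=$(awk -F '[ ,]' -v begin="$begin" '$2 ~ /^..:..:../ && $2 >= begin' "$log")
printf '%s\n' "$out"
rm -f "$log"
```

Writing "begin" in quotes inside the awk program would compare $2 against the literal text begin rather than the variable's value, which is why the quoted "var" in the question never matched the intended time. The comparison itself is a plain string comparison, which orders zero-padded hh:mm:ss values chronologically.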

There are other options, but I think you can work with these two.

In both options I've changed your script a bit:

  • cat isn't needed to feed data to awk, because awk can run your program on one or more data files given as arguments after the program.
  • Your awk program doesn't need print at all (as @fedorqui said), because a basic awk program consists of pattern {action} pairs, where the pattern is the same condition you used in the if statement, and the default action is {print $0}.
  • I've also simplified the time pattern, primarily to make the script clearer; in a log file there's almost no chance of an 8-character string with two colons in it that isn't a time (in a regexp, . matches any single character).
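To check that these simplifications don't change the result, the two forms can be compared side by side. This is a sketch on two made-up lines, not the real ts.log:

```shell
#!/usr/bin/env bash
# Two hypothetical lines with the time in the second field.
log=$(mktemp)
printf '%s\n' 'a 16:49:17,1 x' 'b 16:51:17,2 y' > "$log"

# Original style: cat piped into awk, explicit if and print.
long=$(cat "$log" | awk -F '[ ,]' '{ if ($2 >= "16:50:17") { print $0 } }')

# Simplified style: file passed as an argument, pattern only, default action.
short=$(awk -F '[ ,]' '$2 >= "16:50:17"' "$log")

printf '%s\n' "$long" "$short"
rm -f "$log"
```

Both commands select the same line, so the shorter form can replace the longer one.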