NmdMystery - 5 months ago
Bash Question

ls | grep with variable as regex

I'm writing a bash script to automate a few tasks. One of the things I have to do is search for a pattern among filenames in a directory, then loop through the results.

When I run this script:

data=$(ls /path | grep -o 'pattern')
echo $data


I get the expected result - a list of all the matches that were found among the filenames in /path. However, when I store said pattern in a variable and then use it, like this:

pattern="'pattern'"
echo $pattern
data=$(ls /path | grep -o $pattern)
echo $data


The pattern is correctly echoed, but $data is empty. Why is this?

Answer

In what follows, I'm setting aside the fact that your input source is ls, beyond this opening note: ls should not be used in this manner, and find (which, in its GNU-extended form, provides a -regex test) should be considered instead.
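As a rough sketch of what that could look like, assuming GNU find (both -maxdepth and -regex are extensions) and a hypothetical scratch directory with made-up filenames standing in for /path:

```shell
# Hypothetical scratch directory and filenames, standing in for /path.
dir=$(mktemp -d)
touch "$dir/report_01.txt" "$dir/report_02.txt" "$dir/notes.txt"

# -regex is matched against the whole path, hence the leading ".*/".
# The pattern is illustrative only; it is not taken from the question.
regex='.*/report_[0-9][0-9]*\.txt'

# Syntactic quotes around "$regex" keep the shell from splitting or globbing it.
matches=$(find "$dir" -maxdepth 1 -type f -regex "$regex" | sort)
printf '%s\n' "$matches"

rm -rf "$dir"
```

Note that -regex has to match the entire path rather than a substring of it, unlike grep. If the filenames may contain newlines, looping over this output is still fragile, and something like -exec is the safer route.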


In:

pattern="'pattern'"
grep $pattern

...the double quotes (") are syntactic: the shell consumes them during its parsing phase. The single quotes inside them are literal: the outer, syntactic quotes specify that everything between them is part of the string (except where the rules for parsing double-quoted content differ, as with $ expansions and backslashes). The value stored in pattern is therefore 'pattern', single quotes included.
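A quick way to see the difference is to compare string lengths. This is only an illustrative sketch; the variable names literal and plain are made up here:

```shell
literal="'pattern'"   # double quotes are syntactic; the single quotes are data
plain='pattern'       # here the single quotes are syntactic and get consumed

# ${#var} expands to the number of characters stored in var.
printf '%s has %d characters\n' "$literal" "${#literal}"
printf '%s has %d characters\n' "$plain" "${#plain}"
```

The first value is two characters longer, because the single quotes were stored as part of it.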

Thus, when you run grep $pattern, the following happens:

  • The contents of $pattern are split into words on any characters found in IFS. By default, IFS contains only whitespace (space, tab, and newline); however, if you had IFS=a, the value would be split into a word 'p and a word ttern' (the a itself is a delimiter and is dropped).
  • Each of these words is expanded as a glob. Thus, if your pattern had contained "hello * world", and IFS had its default value, it would have been split into the three words "hello, *, and world" (the literal quotes stay attached to the first and last words) -- and the * would then be replaced with the names of the files in the current directory.
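Both steps can be observed directly. The following is a minimal sketch, assuming a hypothetical helper show_args that prints each argument it receives in angle brackets, and a scratch directory containing two made-up files:

```shell
# Hypothetical helper: print every argument on its own line, in angle brackets.
show_args() { printf '<%s>\n' "$@"; }

pattern="'pattern'"

# Default IFS: the value contains no whitespace, so it stays one word,
# and the literal single quotes are still part of it.
default_split=$(show_args $pattern)

# IFS=a (set only inside the subshell): split at the "a", which is dropped.
custom_split=$(IFS=a; show_args $pattern)

# Quoted expansion: no splitting and no globbing, whatever IFS is.
quoted=$(IFS=a; show_args "$pattern")

# Glob step: run in a scratch directory holding two known files.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"
text='hello * world'
glob_words=$(cd "$tmpdir" && show_args $text)
rm -rf "$tmpdir"

printf '%s\n' "$default_split" "$custom_split" "$quoted" "$glob_words"
```

With the unquoted expansion and IFS=a, show_args receives two arguments, 'p and ttern'. In the scratch directory, the * in the last case is replaced by a.txt and b.txt before show_args ever runs.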

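Obviously, you don't want this. Note also that neither step removes the literal single quotes: grep receives 'pattern' with the quotes included and looks for filenames that contain them, which is why $data comes back empty. Thus, use only syntactic quotes, both when assigning and when expanding, if your goal is to prevent string-splitting and glob expansion: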

pattern="pattern"
grep "$pattern"
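To see the two cases side by side, here is a small sketch. The filenames and the report_[0-9]* pattern are made up for illustration, and a printf-generated list stands in for the output of ls /path:

```shell
# Hypothetical filenames, standing in for the output of ls /path.
names=$(printf '%s\n' report_01.txt notes.txt report_02.txt)

pattern='report_[0-9]*'          # syntactic quotes only
bad_pattern="'report_[0-9]*'"    # literal single quotes stored in the value

# Quoted expansion: grep sees exactly report_[0-9]*
good=$(printf '%s\n' "$names" | grep -o "$pattern")

# Here grep looks for a literal ' before and after the match, and no
# filename contains one. "|| true" only masks grep's non-zero exit status.
bad=$(printf '%s\n' "$names" | grep -o "$bad_pattern" || true)

printf 'good: %s\n' "$good"
printf 'bad: [%s]\n' "$bad"
```

The first grep prints the matched part of each report file, and the second prints nothing, which mirrors the empty $data in the question.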