Will Barnwell - 5 months ago
Bash Question

Correctly allow word splitting of command substitution in bash

I write, maintain, and use a healthy number of bash scripts. I would consider myself a bash hacker and strive to someday be a bash ninja (I need to learn more awk first). One of the most important features (and frustrations) of bash to understand is how quotes, and subsequent parameter expansion, work. This is well documented, and for good reason: many pitfalls, bugs, and newbie traps exist in the mysterious world of quoted parameter expansion and word splitting. For this reason, the advice is to "Double quote everything," but what if I want word splitting to occur?

I have looked through multiple style guides and cannot find an example of safe and proper use of word splitting after command substitution.

What is the correct way to use unquoted command substitution?



Example:



I don't need help getting this command working, but it seems to violate established patterns. If you would like to give feedback on the command itself, please keep it in the comments.

docker stats $(docker ps | awk '{print $NF}' | grep -v NAMES)


The command substitution returns output such as:

container-1 container-3 excitable-newton


This one-liner uses command substitution to produce the names of each of my running docker containers and then feeds them, with word splitting, as separate arguments to the docker stats command, which takes an arbitrary-length list of container names and gives back some info about them.

If I used:

docker stats "$(docker ps | awk '{print $NF}' | grep -v NAMES)"


There would be one string of newline-separated container names passed to docker stats.
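The difference between the two forms can be checked without docker by counting arguments. This is a minimal sketch: list_names and count_args are hypothetical helpers introduced only for illustration, with list_names standing in for the docker ps pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `docker ps | awk '{print $NF}' | grep -v NAMES`:
# prints one container name per line.
list_names() { printf '%s\n' container-1 container-3 excitable-newton; }

# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

unquoted_count=$(count_args $(list_names))    # word splitting: three arguments
quoted_count=$(count_args "$(list_names)")    # one newline-separated string

echo "unquoted: $unquoted_count, quoted: $quoted_count"
# prints "unquoted: 3, quoted: 1"
```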

This seems like a perfect example of when I would want word splitting, but shellcheck disagrees. Is this somehow unsafe? Is there an established pattern for using word splitting after expansion or substitution?

Answer

The safe way to capture output from one command and pass it to another is to temporarily capture the output in an array. This allows splitting on arbitrary delimiters and prevents unintentional splitting or globbing while capturing output as more than one string to be passed on to another command.

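If you want to read a whitespace-separated string into an array, use read -a. Because read stops at the first newline by default and docker ps prints one name per line, add -d '' so that it consumes the whole output (read then returns a nonzero status at end of input, which only matters under set -e):

read -r -d '' -a names < <(docker ps | awk '{print $NF}' | grep -v NAMES)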
printf 'Found name: %s\n' "${names[@]}"

Unlike the unquoted-expansion approach, this doesn't expand globs. Thus, foo[bar] can't be replaced with a filesystem entry named foob, or with an empty string if no such filesystem entry exists and the nullglob shell option is set. (Likewise, * will no longer be replaced with a list of files in the current directory).
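The globbing difference can be reproduced in a scratch directory. The sketch below assumes mktemp -d is available; the directory, the file name foob, and the variable names are chosen only for illustration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory containing a file that the glob foo[bar] matches.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch foob

string='foo[bar] plain'

# Unquoted expansion: split on IFS, then each word is expanded as a glob.
unquoted=( $string )

# read -a: split on IFS only; no glob expansion takes place.
read -r -a safe <<< "$string"

printf 'unquoted: %s\n' "${unquoted[@]}"   # first element has become "foob"
printf 'read -a:  %s\n' "${safe[@]}"       # first element is still "foo[bar]"

# Clean up the scratch directory.
cd / && rm -rf "$scratch"
```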


To go into detail regarding behavior: read -r -a reads up to a delimiter, which is a newline by default, the first character of the option argument following -d if one is given, or a NUL if that option argument is 0 bytes. It then splits the result into fields based on the characters within IFS (a set which, by default, contains the space, the tab, and the newline) and assigns those fields to an array.
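A small sketch of the delimiter behavior described above, using a made-up two-line input; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
input=$'alpha beta\ngamma\tdelta'

# Without -d, read stops at the first newline, so only line one is split.
read -r -a first_line <<< "$input"

# With -d '' the delimiter is NUL, so read consumes everything up to end of
# input and splits it on the default IFS (space, tab, newline). read returns
# nonzero because it reaches EOF before seeing a NUL, hence the `|| true`.
read -r -d '' -a all_words <<< "$input" || true

echo "${#first_line[@]} ${#all_words[@]}"
# prints "2 4"
```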

This behavior does not meaningfully vary based on shell-local configuration, except for IFS, which can be modified with its scope limited to the single command.
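As a sketch of that scoping, an assignment prefixed to read applies to that one invocation only. The colon-separated sample string here is an assumption for illustration.

```shell
#!/usr/bin/env bash
path_like='/usr/local/bin:/usr/bin:/bin'

ifs_before=$IFS
# The IFS=: prefix is in effect for this single read and nothing else.
IFS=: read -r -a dirs <<< "$path_like"
ifs_after=$IFS

printf '%s\n' "${dirs[@]}"
[ "$ifs_before" = "$ifs_after" ] && echo 'IFS unchanged'
```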

mapfile -t and readarray -t are similarly consistent in behavior, and likewise recommended if portability constraints do not prevent their use.
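For newline-separated output such as the container names in the question, a mapfile version might look like the following. This is a sketch that assumes bash 4.0 or later (where mapfile is available); list_names is a hypothetical stand-in for the docker ps pipeline so the example runs without docker.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `docker ps | awk '{print $NF}' | grep -v NAMES`.
list_names() { printf '%s\n' container-1 container-3 excitable-newton; }

# One array element per line; -t strips the trailing newline from each.
mapfile -t names < <(list_names)

# The quoted expansion passes each name as a separate argument, e.g.:
#   docker stats "${names[@]}"
printf 'Found name: %s\n' "${names[@]}"
```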


By contrast, array=( $string ) is much more dependent on the shell's configuration and settings, and will behave badly even when the shell's configuration is left at defaults:

  • When using array=( $string ), unless set -f is in effect, each word created by splitting $string is evaluated as a glob, with further variations in behavior depending on the shopt settings nullglob (which causes a pattern that matches nothing to expand to an empty set, rather than the default of expanding to the glob expression itself), failglob (which causes a pattern that matches nothing to result in a failure), extglob, dotglob and others.
  • When using array=( $string ), the value of IFS used for the split operation cannot be easily and reliably altered in a manner scoped to this single operation. By contrast, one can run IFS=: read to force read to split only on :s without modifying the value of IFS outside the scope of that single command; no equivalent for array=( $string ) exists without storing and re-setting IFS. That is an error-prone operation: some common idioms (such as assignment to oIFS or a similar variable name) operate contrary to intent in common scenarios, such as failing to reproduce an unset or empty IFS at the end of the block to which the temporary modification is intended to apply.
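The save-and-restore pitfall mentioned in the last bullet can be sketched as follows, for the case where IFS starts out unset. The variable names are illustrative.

```shell
#!/usr/bin/env bash
string='a b c'

unset IFS                 # an unset IFS splits on space, tab, and newline
before=( $string )        # three words

# Common but flawed save/restore idiom:
oIFS=$IFS                 # an unset IFS is saved as an empty string
IFS=:
# ... work that needs IFS=: would go here ...
IFS=$oIFS                 # IFS is now set but empty, not unset

after=( $string )         # an empty IFS disables splitting: one word

echo "before: ${#before[@]}, after: ${#after[@]}"
# prints "before: 3, after: 1"
```

Because an empty IFS and an unset IFS mean different things, the "restored" shell no longer splits words at all, which is the kind of silent breakage the bullet warns about.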