Noel Yap - 2 months ago
Scala Question

How to do Bash process substitution in Scala?

How would something like

diff <(echo aoeu) <(echo snth)
be done in Scala?

I've tried using the sys.process interface as follows:

"diff <(echo aoeu) <(echo snth)".!

...however, this doesn't interpret the <(...) syntax as process substitution.

import scala.sys.process._

def diff(one: String, two: String):
  String = Seq(
    "bash", "-c", """
       diff <(printf '%s\n' "$1") \
            <(printf '%s\n' "$2"); retval=$?
       (( retval == 1 )) || exit "$retval"
    """, "_", one, two).!!

This can be tested in practice:

scala> diff("hello", "world")
res1: String =
1c1
< hello
---
> world

To break down the reasoning:

  • Invoking a sequence, rather than a string, allows data (in my examples hello and world; in yours, aoeu and snth) to be passed out-of-band from code. This is critical to avoiding injection attacks when such content is parameterized.
  • Invoking bash as your executable ensures that process substitution syntax is available.
  • Checking for an exit status of 1 (and coercing it to 0) keeps Scala from treating the case where diff reports that the two inputs differ as an error, while ensuring that other failures (exit status 2 or greater) still become exceptions in Scala.
  • Using printf '%s\n' "$1" instead of echo "$1" avoids ambiguities in the POSIX definition of echo (see in particular the APPLICATION USAGE section).
  • Passing an explicit argument of _ fills in the argv[0] slot (aka $0), so that one and two arrive in the script as $1 and $2.
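As a variation on the exit-status point above, the status can also be interpreted on the Scala side instead of being coerced inside the bash script. The following is only a sketch: diffOpt is a hypothetical helper name that does not appear in the answer above, and it assumes bash and diff are on the PATH.

```scala
import scala.sys.process._

// Hypothetical helper (not part of the original answer): returns None when the
// inputs are identical, Some(diff output) when they differ, and throws for any
// other exit status.
def diffOpt(one: String, two: String): Option[String] = {
  val out = new StringBuilder
  // Collect stdout line by line; forward stderr unchanged.
  val logger = ProcessLogger(
    line => out.append(line).append('\n'),
    err => System.err.println(err)
  )
  // `!` with a logger returns the exit status instead of throwing on non-zero.
  val status = Process(Seq(
    "bash", "-c",
    """diff <(printf '%s\n' "$1") <(printf '%s\n' "$2")""",
    "_", one, two
  )).!(logger)
  status match {
    case 0 => None                 // inputs identical
    case 1 => Some(out.toString)   // inputs differ
    case n => throw new RuntimeException(s"diff failed with exit status $n")
  }
}
```

Because one and two are still passed as separate arguments, a value such as "$(id)" is compared as literal text rather than executed.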

Note that invoking a sequence rather than a string also removes the need for a shell in many cases: Seq("hello", "world").! doesn't invoke any shell, but directly starts an executable named hello with the single argument world. A plain string such as "hello world".! is not handed to a shell by scala.sys.process either; it is only split into words, which is why the <(...) in your attempt reached diff as literal arguments. In interfaces that do pass a command string to sh -c (C's system(), for example), the extra executable invocation carries both a performance cost and potential security vulnerabilities.

See Shellshock for an example of a (now near-universally patched) case where a shell invocation with no explicit user-controlled parameters could still be vulnerable in practice (when invoked from a web server following CGI conventions for exporting request parameters as environment variables). Avoiding unnecessary shells is thus preferable where feasible.
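For this particular task the shell can be avoided entirely by writing each input to a temporary file and running diff directly. This is a rough sketch rather than a drop-in replacement: tempFileWith and diffNoShell are hypothetical names, and diff is assumed to be on the PATH.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import scala.sys.process._

// Hypothetical helper: writes `content` plus a trailing newline to a new temp
// file, mirroring what printf '%s\n' produces in the bash version.
def tempFileWith(content: String): Path = {
  val path = Files.createTempFile("diff-input", ".txt")
  Files.write(path, (content + "\n").getBytes(UTF_8))
  path
}

// Hypothetical helper: runs diff with no shell involved and returns its exit
// status together with everything it wrote to stdout.
def diffNoShell(one: String, two: String): (Int, String) = {
  val a = tempFileWith(one)
  val b = tempFileWith(two)
  try {
    val out = new StringBuilder
    val logger = ProcessLogger(
      line => out.append(line).append('\n'),
      err => System.err.println(err)
    )
    // A Seq is executed directly: "diff" is the program, the paths are its arguments.
    val status = Process(Seq("diff", a.toString, b.toString)).!(logger)
    (status, out.toString)
  } finally {
    // Clean up the temp files whether or not diff succeeded.
    Files.deleteIfExists(a)
    Files.deleteIfExists(b)
  }
}
```

The trade-off is that the inputs touch the filesystem, which process substitution avoids. In exchange, no shell syntax is parsed at all.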