Stéphane Laurent - 5 months ago
Linux Question

Failure of readProcess "grep" in Haskell because of a strange file

I have a file MAIL.txt containing strange characters (this file comes from Windows and I am on Linux). It contains the character string "rec". When I run
grep "rec" MAIL.txt
in a terminal, I get the expected output.

However this command fails in Haskell:

Prelude System.Process> r <- readProcessWithExitCode "grep" ["rec", "MAIL.txt"] ""
*** Exception: fd:13: hGetContents: invalid argument (invalid byte sequence)

What is the explanation and is there a way to avoid that (without modifying the strange file)?

Here is how the file is displayed, where you can see the strange characters:

[screenshot of the file contents]

In fact these should be accented letters.

I cannot read the file with readFile either:

> r <- readFile "MAIL.txt"
> r
"Bonjour,\r\n\r\n Quelques remarques sur cette fiche :\r\n\r\n- *** Exception: MAIL.txt: hGetContents: invalid argument (invalid byte sequence)

Maybe there's a way to detect the problem with Data.Binary?


To detail @ErikR's answer:

import System.Process.ByteString (readProcessWithExitCode)
import Data.ByteString (empty)
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = do
  -- stdout and stderr come back as raw ByteStrings, so nothing is decoded
  (_exitCode, out, _err) <- readProcessWithExitCode "grep" ["rec", "MAIL.txt"] empty
  B.putStrLn out
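
Once the output is a ByteString, it still has to be decoded if you want the accented letters back. The following is only a sketch: the helper name decodeLenient is hypothetical, and the Latin-1 fallback is an assumption about how the Windows file was encoded (it may well be another code page). It also shows one way to detect the problem, since decodeUtf8' reports invalid UTF-8 instead of throwing.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeLatin1, decodeUtf8')

-- Hypothetical helper: try UTF-8 first, fall back to Latin-1.
-- The Bool tells the caller whether the bytes were valid UTF-8.
-- Latin-1 is an assumption here; the real file may use another encoding.
decodeLenient :: BS.ByteString -> (Bool, T.Text)
decodeLenient bytes =
  case decodeUtf8' bytes of
    Right txt -> (True, txt)
    Left _    -> (False, decodeLatin1 bytes)

main :: IO ()
main = do
  -- 0xE9 is 'é' in Latin-1, but on its own it is not valid UTF-8
  let sample = BS.pack [0x72, 0xE9, 0x63]
      (wasUtf8, txt) = decodeLenient sample
  print wasUtf8
  print (T.unpack txt)
```

The same helper could be applied to the bytes returned by Data.ByteString.readFile "MAIL.txt", which reads the file without attempting any decoding.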


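The String-based readProcessWithExitCode from System.Process decodes the process output with the current locale encoding, which fails on bytes that are invalid in that encoding. Use readProcessWithExitCode or readCreateProcessWithExitCode from the process-extras package (module System.Process.ByteString) instead. They return ByteStrings: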

readProcessWithExitCode :: FilePath -> [String] -> ByteString
       -> IO (ExitCode, ByteString, ByteString)

Other versions exist that return lazy ByteStrings and Text.

Edit: Updated links since process-listlike is deprecated in favor of process-extras.
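
If adding a dependency is not an option, roughly the same effect can be had with only the process and bytestring packages. This is a minimal sketch, not what process-extras does internally: the name readProcessBytes is hypothetical, it captures stdout only, and stdin and stderr are simply inherited from the parent process.

```haskell
import qualified Data.ByteString as BS
import System.Exit (ExitCode (..))
import System.Process

-- Hypothetical helper: run a command and capture its stdout as raw bytes.
-- No text decoding happens, so invalid byte sequences cannot raise an error.
readProcessBytes :: FilePath -> [String] -> IO (ExitCode, BS.ByteString)
readProcessBytes cmd args = do
  (_, Just hOut, _, ph) <-
    createProcess (proc cmd args) { std_out = CreatePipe }
  out  <- BS.hGetContents hOut  -- strict read; the handle is closed afterwards
  code <- waitForProcess ph
  return (code, out)

main :: IO ()
main = do
  (code, out) <- readProcessBytes "grep" ["rec", "MAIL.txt"]
  print code
  BS.putStr out
```

Because stderr is not piped here, grep's own error messages go straight to the terminal. Capturing them as well would need a second pipe that is read concurrently.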