FUZxxl FUZxxl - 4 months ago 11
Brainfuck Question

Of which things should I take care if I'm using unboxed type (like Int#) in Haskell / GHC?

I'm trying to write a small script which parses and executes Brainfuck code, to understand the GHC options of optimization, I'm trying to optimize the code in order to be a bit faster and to understand what's going on there.

On of the parts is the internal represantation of BF-code, I use a special datatype for this. Here's the sourcecode, included the two functions which are doing the conversions:

data BFinstruction
= AdjustValue Int
| MovePointer Int
| GetChar
| PutChar
| Loop BFcode
deriving (Eq)

type BFcode = [BFinstruction]

unsafeCompileBrainfuck :: String -> BFcode
unsafeCompileBrainfuck = fst . parse [] where
-- arguments: input string, built code; output: output code, rest of input
parse :: BFcode -> String -> (BFcode,String)
parse c ('+':s) = parse (AdjustValue 1 :c) s
parse c ('-':s) = parse (AdjustValue (-1):c) s
parse c ('>':s) = parse (MovePointer 1 :c) s
parse c ('<':s) = parse (MovePointer (-1):c) s
parse c ('.':s) = parse (PutChar :c) s
parse c (',':s) = parse (GetChar :c) s
parse c (']':s) = (reverse c, s)
parse c ('[':s) = parse (Loop l :c) s' where (l,s') = parse [] s
parse c [] = (reverse c ,"")
parse c ( _ :s) = parse c s

simplifyBrainfuck :: BFcode -> BFcode
simplifyBrainfuck ((AdjustValue x):(AdjustValue y):zs) = if x + y /= 0
then simplifyBrainfuck (AdjustValue (x + y):zs)
else simplifyBrainfuck zs
simplifyBrainfuck ((MovePointer x):(MovePointer y):zs) = if x + y /= 0
then simplifyBrainfuck (MovePointer (x + y):zs)
else simplifyBrainfuck zs
simplifyBrainfuck (x :zs) = x: simplifyBrainfuck zs
simplifyBrainfuck [] = []

The idea is, that the code will be read from some input (string), preparsed and simplified by the above code and then executed by some other functions. (It is assumed, that the input is valid).

In order to optimize this example, I tried to unbox the Int params of the
constructors by doing domething like this:

data BFinstruction -- BangPatterns
= AdjustValue {-# UNPACK #-} !Int
| MovePointer {-# UNPACK #-} !Int
| GetChar
| PutChar
| Loop BFcode
deriving (Eq)

This will turn the boxed
type into an unboxed, raw
type, which is an implementation detail of GHc. As I read, this option is only good in a few cases, so I want to ask, of which things I have to pay attention if I want to perform this kind of optimization. My goal is to allow the execution of BF-code using the benefits of Haskell - laziness (I want to archieve, that the code may only be kept as needed in memory) and easiness.


This really looks like a premature optimization to me. UNPACK is mostly useful when you have a very large number of BFInstructions sitting around. I doubt you'll ever have enough brainf**k code to make it worthwhile. I agree with Gian, it should be simple enough to test, so do that first.

In any case, the thing to be aware of with UNPACK'ed values is that you don't want the compiler to have to rebox them. You should be ok with numerical ops, but other than that you'd have to look carefully at the functions you're using to see if you ever use the unpacked values as a non-strict argument. The only way to be sure is to look at the core, or the interface files, to see what parameters are strict and which aren't. As always, be sure to compile with at least "-O".