speeder - 4 months ago 14
Brainfuck Question

How does the Brainfuck Hello World actually work?

Someone sent this to me and claimed it is a hello world in Brainfuck (and I hope so...)

++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.

I know the basics that it works by moving a pointer and increment and decrementing stuff...

Yet I still want to know, how does it actually work? How does it print anything on the screen in the first place? How does it encode the text? I do not understand at all...

1. Basics

To understand Brainfuck you must imagine infinite array of cells initialized by `0` each.

``````...[0][0][0][0][0]...
``````

When brainfuck program starts, it points to any cell.

``````...[0][0][*0*][0][0]...
``````

If you move pointer right `>` you are moving pointer from cell X to cell X+1

``````...[0][0][0][*0*][0]...
``````

If you increase cell value `+` you get:

``````...[0][0][0][*1*][0]...
``````

If you increase cell value again `+` you get:

``````...[0][0][0][*2*][0]...
``````

If you decrease cell value `-` you get:

``````...[0][0][0][*1*][0]...
``````

If you move pointer left `<` you are moving pointer from cell X to cell X-1

``````...[0][0][*0*][1][0]...
``````

2. Input

To read character you use comma `,`. What it does is: Read character from standard input and write its decimal ASCII code to the actual cell.

Take a look at ASCII table. For example, decimal code of `!` is `33`, while `a` is `97`.

Well, lets imagine your BF program memory looks like:

``````...[0][0][*0*][0][0]...
``````

Assuming standard input stands for `a`, if you use comma `,` operator, what BF does is read `a` decimal ASCII code `97` to memory:

``````...[0][0][*97*][0][0]...
``````

You generally want to think that way, however the truth is a bit more complex. The truth is BF does not read a character but a byte (whatever that byte is). Let me show you example:

In linux

``````\$ printf ł
``````

prints:

``````ł
``````

which is specific polish character. This character is not encoded by ASCII encoding. In this case it's UTF-8 encoding, so it used to take more than one byte in computer memory. We can prove it by making a hexadecimal dump:

``````\$ printf ł | hd
``````

which shows:

``````00000000  c5 82                                             |..|
``````

Zeroes are offset. `82` is first and `c5` is second byte representing `ł` (in order we will read them). `|..|` is graphical representation which is not possible in this case.

Well, if you pass `ł` as input to your BF program that reads single byte, program memory will look like:

``````...[0][0][*197*][0][0]...
``````

Why `197` ? Well `197` decimal is `c5` hexadecimal. Seems familiar ? Of course. It's first byte of `ł` !

3. Output

To print character you use dot `.` What it does is: Assuming we treat actual cell value like decimal ASCII code, print corresponding character to standard output.

Well, lets imagine your BF program memory looks like:

``````...[0][0][*97*][0][0]...
``````

If you use dot (.) operator now, what BF does is print:

a

Because `a` decimal code in ASCII is `97`.

So for example BF program like this (97 pluses 2 dots):

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++..

Will increase value of the cell it points to up to 97 and print it out 2 times.

aa

4. Loops

In BF loop consists of loop begin `[` and loop end `]`. You can think it's like while in C/C++ where the condition is actual cell value.

Take a look BF program below:

``````++[]
``````

`++` increments actual cell value twice:

``````...[0][0][*2*][0][0]...
``````

And `[]` is like `while(2) {}`, so it's infinite loop.

Let's say we don't want this loop to be infinite. We can do for example:

``````++[-]
``````

So each time a loop loops it decrements actual cell value. Once actual cell value is `0` loop ends:

``````...[0][0][*2*][0][0]...        loop starts
...[0][0][*1*][0][0]...        after first iteration
...[0][0][*0*][0][0]...        after second iteration (loop ends)
``````

Let's consider yet another example of finite loop:

``````++[>]
``````

This example shows, we haven't to finish loop at cell that loop started on:

``````...[0][0][*2*][0][0]...        loop starts
...[0][0][2][*0*][0]...        after first iteration (loop ends)
``````

However it is good practice to end where we started. Why ? Because if loop ends another cell it started, we can't assume where will cell pointer be. To be honest, this practice makes brainfuck less brainfuck.