Home

Awesome

Whitespace parser Build Status

Associated blog post: https://www.sderosiaux.com/2018/06/15/a-simple-way-to-write-parsers-using-the-state-monad/

Example of a parser implementation of the Whitespace programming language using only StateT to convey the state, the stack and the output of the program.

Whitespace deals with 3 characters only: " ", "\t", and "\n".

This parser relies massively upon StateT and misc combinators, as we can see in the root parser:

def imp: StateT[F, (String, Stack), String] = for {
   output <- stackCommands.all <+>
     ioCommands.all <+>
     arithmeticCommands.all <+>
     heapCommands.all <+>
     flowCommands.all
   rest <- imp <+> StateT.pure[F, (String, Stack), String]("")
} yield output + rest

The state of our parser contains the program and the stack of execution. The result is a String which is what is print to the screen.

Whitespace has 5 big chunks of commands to determine how to interpret the following characters.

For instance, one of the ioCommands (triggered by "\t\n") is " " which means: print the character on top of the stack.

Hello, world!

As a typical example, here we are parsing the Hello world from Wikipedia:

val helloWorld = "   \t  \t   \n\t\n     \t\t  \t \t\n\t\n     \t\t \t\t  \n\t\n     \t\t \t\t  \n\t\n     \t\t \t\t\t\t\n\t\n     \t \t\t  \n\t\n     \t     \n\t\n     \t\t\t \t\t\t\n\t\n     \t\t \t\t\t\t\n\t\n     \t\t\t  \t \n\t\n     \t\t \t\t  \n\t\n     \t\t  \t  \n\t\n     \t    \t\n\t\n  \n\n\n"
val Some(((rest, stack), output)) = new WhitespaceParser[Option].eval(helloWorld)
println(s"Stack: $stack")
println(s"Output: $output")
Stack: List(33, 100, 108, 114, 111, 119, 32, 44, 111, 108, 108, 101, 72)
Output: Hello, world!

Syntax

Details about the syntax here: https://hackage.haskell.org/package/whitespace-0.4/src/docs/tutorial.html

Notes

The parser is not complete, some commands are missing, but you get the idea.