flatparse

flatparse is a high-performance parsing library for programming languages, human-readable data and machine-readable data. The "flat" in the name refers to the ByteString parsing input, which is pinned contiguous data, and also to the library internals, which avoid indirections and heap allocations whenever possible. flatparse is generally lower-level than parsec-style libraries, but it is possible to build higher-level features (such as source spans, hints, indentation parsing) on top of it without compromising performance.

LLVM

It is advised to build with the -fllvm option when using this package, since that can result in significant speedups (20-40% from what I've seen). Additionally, you can enable -fllvm for flatparse itself by enabling the llvm package flag. However, this has only a minor impact, since almost all parser code is typically inlined into modules outside flatparse and compiled there.
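
As a rough sketch of where these options could go, assuming a Cabal-based project: the first fragment belongs in your own package's .cabal file and the second in cabal.project. The stanza placement and the -O2 setting are assumptions about your project. The llvm flag name is the one mentioned above.

```cabal
-- In your own package's .cabal file, inside the library or executable stanza.
-- Assumes an LLVM toolchain supported by your GHC version is installed.
  ghc-options: -O2 -fllvm

-- In cabal.project, to also build flatparse itself with its llvm flag enabled.
package flatparse
  flags: +llvm
```

Since most parser code is inlined into your own modules, the first fragment is the one that is expected to matter most.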

Features and non-features

flatparse comes in two flavors: FlatParse.Basic and FlatParse.Stateful. Both support a custom error type. Both also come in three modes, in which parsers can respectively run IO actions, run ST actions, or have no side effects. The mode is selected by a state token type parameter on the parser types.
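
To make this more concrete, here is a minimal sketch of a pure-mode parser written against FlatParse.Basic. It assumes the 0.4-era API (Parser, runParser, Result, strToUtf8, anyAsciiDecimalInt, eof and the Template Haskell char splice), so check the names against the Hackage documentation for your version. The pair parser and its input are made up for illustration, and () stands in for a custom error type.

```haskell
{-# LANGUAGE TemplateHaskell #-}

module Main where

import qualified FlatParse.Basic as FP
import FlatParse.Basic (Parser, Result (..), runParser)

-- | Hypothetical example parser: two decimal integers separated by a comma,
-- followed by end of input. The error type is (), meaning no custom errors.
-- 'Parser' is the side-effect-free mode of the parser type.
pair :: Parser () (Int, Int)
pair = do
  a <- FP.anyAsciiDecimalInt
  $(FP.char ',')            -- literal characters are Template Haskell splices
  b <- FP.anyAsciiDecimalInt
  FP.eof
  pure (a, b)

main :: IO ()
main =
  -- The input must be a ByteString; strToUtf8 converts a String for convenience.
  case runParser pair (FP.strToUtf8 "12,34") of
    OK (a, b) _rest -> print (a + b)
    Fail            -> putStrLn "parse failure"
    Err e           -> putStrLn ("parse error: " ++ show e)
```

The three constructors of Result separate success, recoverable failure (which allows backtracking), and an unrecoverable error carrying the custom error type. The FlatParse.Stateful flavor is used in a similar way, but its parsers additionally thread state, so its run function takes extra arguments.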

Tutorial

Informative tutorials are a work in progress. See src/FlatParse/Examples for a lexer/parser example with acceptably good error messages.

Contribution

Pull requests are welcome. I'm fairly quick to add PR authors as collaborators.

Some benchmarks

Execution times are listed below. See the source code in bench. Compiled with GHC 9.4.4 using -O2 -fllvm. Executed on an Intel 1165G7 CPU at 28W power draw. The involved packages come from the nightly-2023-02-06 Stackage snapshot.

| benchmark | runtime |
|---|---|
| sexp/fpbasic | 1.93 ms |
| sexp/fpstateful | 2.00 ms |
| sexp/attoparsec | 21.82 ms |
| sexp/megaparsec | 59.60 ms |
| sexp/parsec | 79.81 ms |
| long keyword/fpbasic | 0.1 ms |
| long keyword/fpstateful | 0.1 ms |
| long keyword/attoparsec | 2.43 ms |
| long keyword/megaparsec | 5.2 ms |
| long keyword/parsec | 10.02 ms |
| numeral csv/fpbasic | 0.72 ms |
| numeral csv/fpstateful | 0.56 ms |
| numeral csv/attoparsec | 10.52 ms |
| numeral csv/megaparsec | 19.77 ms |
| numeral csv/parsec | 26.46 ms |

Object file sizes for each module containing the s-exp, long keyword and numeral csv benchmarks.

| library | object file size (bytes) |
|---|---|
| fpbasic | 20656 |
| fpstateful | 26664 |
| attoparsec | 69384 |
| megaparsec | 226232 |
| parsec | 117696 |