Awesome

Python Parsing Benchmarks

This repository contains a suite of test cases and implementations for parsing text using Python. Both speed an memory usage are compared for each implementation. Each implementation must be:

Correct -- it implements the specification exactly
- Syntax -- it parses all valid inputs and rejects bad ones
- Semantics -- it produces the expected data structures
Plausible -- it only uses generally known parser features or parsing strategies

Validation tests can ensure the correctness of an implementation, but its plausibility is harder to pin down. Basically, grammars that require knowlege of a parser's internals or undocumented features to increase performance are discouraged. The grammar should be understandable by someone with access to the parser's documentation and examples.

Implementations

This table lists the tasks performed and the libraries benchmarked.

Implementation	Algorithm	JSON	Arithmetic	INI
stdlib	handwritten	yes	yes	yes
Lark	LALR	yes	yes	yes
parsimonious	Recursive Descent	yes
pe	Recursive Descent	yes	yes	yes
pyparsing	Recursive Descent	yes
SLY	LALR	yes
textX	Recursive Descent	yes

Results

The following bar chart shows the time in milliseconds to parse a ~5MB JSON file using both CPython and PyPy.

[cpython] stdlib      : ▏ 109 ms
[cpython] pe          : ▇▇ 819 ms
[cpython] sly         : ▇▇▇▇▇▇▇▇▇▇▇▇ 3716 ms
[cpython] lark        : ▇▇▇▇▇▇▇▇▇▇▇▇▇ 3967 ms
[cpython] parsimonious: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 8150 ms
[cpython] pyparsing   : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 11518 ms
[cpython] textx       : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 14738 ms
[pypy]    stdlib      : ▇ 338 ms
[pypy]    pe          : ▇▇ 599 ms
[pypy]    sly         : ▇▇ 763 ms
[pypy]    lark        : ▇▇ 796 ms
[pypy]    pyparsing   : ▇▇▇▇▇▇▇ 2181 ms
[pypy]    parsimonious: ▇▇▇▇▇▇▇▇▇ 2669 ms
[pypy]    textx       : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5242 ms

Here are the results for parsing 10k complicated (from a parsing point of view) arithmetic expressions:

[cpython] stdlib: ▇▇ 101 ms
[cpython] pe    : ▇▇▇▇▇▇▇▇▇▇▇▇▇ 551 ms
[cpython] lark  : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 2109 ms
[pypy]    lark  : ▇▇▇▇▇ 253 ms
[pypy]    stdlib: ▇▇▇▇▇▇ 260 ms
[pypy]    pe    : ▇▇▇▇▇▇▇▇ 344 ms

And here is an INI file with 1000 sections:

[cpython] stdlib: ▇▇▇▇▇▇▇▇▇▇▇▇ 91 ms
[cpython] pe    : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 125 ms
[cpython] lark  : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 352 ms
[pypy]    stdlib: ▇▇▇▇ 34 ms
[pypy]    pe    : ▇▇▇▇▇▇ 44 ms
[pypy]    lark  : ▇▇▇▇▇▇▇ 53 ms

Charts produced with termgraph

These benchmarks were run on a Lenovo Thinkpad with an Intel Core-i5 6200U with 8GB memory running Pop!_OS Linux 21.04. The millisecond values will change across systems but the relative performance should be similar (but I'd be interested in hearing if you find otherwise!).

Setup

Python 3.6+ is required.

First create a virtual environment (recommended) and install the package and requirements:

$ python3 -m venv cpy
$ source cpy/bin/activate
(cpy) $ pip install -r requirements.txt

You can also use PyPy by creating its own virtual environment:

$ pypy3 -m venv pypy
$ source pypy/bin/activate
(pypy) $ pip install -r requirements.txt

From here on it's assumed you're in one of the (cpy) or (pypy) environments.

Run Benchmarks

The benchmarks are implemented using pytest and pytest-benchmark, so you can run the tests with pytest if you adjust PYTHONPATH:

$ PYTHONPATH=. pytest

But it might be easier to just run the included validate.py and benchmark.py scripts, which pass the appropriate options on to pytest:

$ python validate.py            # skip performance tests (they take a while)
$ python benchmark.py           # skip validity tests

You can give specific library names to limit the tests that are run:

$ python validate.py pe stdlib  # only run validation for pe and stdlib

Some tests print to stdout diagnostic information that can be useful, such as the recursion limit in JSON parsing. Use the following option to see that information:

$ python validate.py -rP        # print stdout for each test

Disclaimer

Performance benchmarks are not the only criterion one should use when choosing a parsing library, and this repository is not meant to declare some winner. On the one hand, there are many other valid criteria (ease of use, stability, security, availability, support, etc.), but on the other hand we can't discuss relative performance without numbers, so here we are.