markdown-it-pyrs

<p align="center"> <img alt="markdown-it-pyrs icon" src="https://raw.githubusercontent.com/chrisjsewell/markdown-it-pyrs/main/docs/_static/markdown-it.rs.svg"> </p>

Currently in Beta, feedback welcome!

A Python interface for markdown-it.rs (and plugins), using Rust for blazingly fast Markdown parsing ⚡️

The goal of this project is to provide a fast, safe, extensible, and easy-to-use Markdown parser for Python. It is complementary to markdown-it-py, which is a pure Python implementation of markdown-it, and here we aim to follow the API of that package as closely as possible.

If you care primarily about speed, this is the library for you. For example, when benchmarking the two libraries on parsing the CommonMark Spec file, markdown-it-pyrs is roughly 20x faster than markdown-it-py (mean times of about 6 ms versus about 131 ms):

Name (time, ms)     Min       Max       Mean      Rounds
markdown-it-pyrs    5.217     7.969     5.968     85
markdown-it-py      122.696   143.246   131.431   7
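
A comparison of this kind can be sketched with the standard library alone. The helper names (`benchmark`, `speedup`) and the sample text are assumptions made for illustration; this is not the harness that produced the table above, and the numbers it prints will differ from machine to machine.

```python
import statistics
import time


def benchmark(render, text, rounds=10):
    """Time render(text) over several rounds; return min/max/mean in milliseconds."""
    timings_ms = []
    for _ in range(rounds):
        start = time.perf_counter()
        render(text)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "min": min(timings_ms),
        "max": max(timings_ms),
        "mean": statistics.mean(timings_ms),
        "rounds": rounds,
    }


def speedup(slow_mean_ms, fast_mean_ms):
    """Ratio of two mean timings."""
    return slow_mean_ms / fast_mean_ms


# The mean timings quoted in the table give a ratio of about 22.
print(round(speedup(131.431, 5.968), 1))

# Stand-in input (an assumption); the table above used the CommonMark Spec file.
sample = "# Hello, world!\n\nSome *emphasis* and a [link](https://example.com).\n" * 500

try:
    from markdown_it_pyrs import MarkdownIt as RustMarkdownIt
    from markdown_it import MarkdownIt as PyMarkdownIt
except ImportError:
    print("install markdown-it-pyrs and markdown-it-py to run the comparison")
else:
    fast = benchmark(RustMarkdownIt("commonmark").render, sample)
    slow = benchmark(PyMarkdownIt("commonmark").render, sample)
    print(f"speedup: {speedup(slow['mean'], fast['mean']):.1f}x")
```

Timing whole `render` calls keeps the comparison end to end, so it includes the cost of crossing the Python/Rust boundary.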

The drawback is that the library vendors compiled Rust code, and so:

  1. Parser plugins cannot currently be written in Python and added dynamically to the parser.
  2. It can be more difficult to integrate into environments like Pyodide and PyScript (but maybe not for long: https://discuss.python.org/t/support-wasm-wheels-on-pypi/21924/3).

Usage

First install the package:

pip install markdown-it-pyrs

Then use it like you would markdown-it-py:

from markdown_it_pyrs import MarkdownIt

md = MarkdownIt("commonmark").enable("table")
md.render("# Hello, world!")
# '<h1>Hello, world!</h1>\n'

markdown-it.rs does not generate a token stream, but instead directly generates a Node tree. This is similar to markdown-it-py's SyntaxTreeNode class, although the API is not identical. Source mapping is also provided by byte offset, rather than by line only.

md = (
  MarkdownIt("commonmark")
  .enable("table")
  .enable_many(["linkify", "strikethrough"])
)
node = md.tree("# Hello, world!")
print(node.walk())
# [Node(root), Node(heading), Node(text)]
print(node.pretty(srcmap=True, meta=True))
# <root srcmap="0:15">
#   <heading srcmap="0:15">
#     level: 1
#     <text srcmap="2:15">
#       content: Hello, world!
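
As a small illustration of working with the tree, the sketch below tallies node types. It relies only on `walk()` and `Node.name`, both of which appear in this README; the helper name `count_node_names` and the sample document are assumptions made for the example.

```python
from collections import Counter


def count_node_names(root):
    """Tally node types in a tree, using only root.walk() and node.name."""
    return Counter(node.name for node in root.walk())


try:
    from markdown_it_pyrs import MarkdownIt
except ImportError:
    print("markdown-it-pyrs is not installed; skipping the live example")
else:
    root = MarkdownIt("commonmark").tree("# Title\n\nSome text.\n")
    print(count_node_names(root))
```

Because the helper is duck-typed, it works on any object that exposes `walk()` and yields items with a `name` attribute.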

Note: Attributes of the Node class, such as Node.attrs, return a copy of the underlying data, so mutating the returned value will not affect what is stored on the node, e.g.

from markdown_it_pyrs import Node
node = Node("name")
# don't do this!
node.attrs["key"] = "value"
print(node.attrs) # {}
# do this instead (Python 3.9+)
node.attrs = node.attrs | {"key": "value"}
print(node.attrs) # {"key": "value"}
# Node.children is only a shallow copy though, so this is fine
child = Node("child")
node.children = [child]
node.children[0].name = "other"
print(child.name) # "other"
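
To show why the in-place mutation is lost, here is a pure-Python stand-in that mimics the copy-on-read behaviour. `CopyOnReadNode` and `set_attr` are hypothetical names for illustration only; this is not how the library implements `Node`. The helper only reads and reassigns `attrs`, so it should apply to a real `Node` as well, and it avoids the Python 3.9+ `|` operator.

```python
class CopyOnReadNode:
    """Hypothetical stand-in that mimics the copy-on-read behaviour described above."""

    def __init__(self, name):
        self.name = name
        self._attrs = {}

    @property
    def attrs(self):
        # Every access hands back a fresh copy of the stored mapping.
        return dict(self._attrs)

    @attrs.setter
    def attrs(self, value):
        self._attrs = dict(value)


def set_attr(node, key, value):
    """Update one attribute by reassigning the whole mapping."""
    node.attrs = {**node.attrs, key: value}


node = CopyOnReadNode("name")
node.attrs["key"] = "value"  # mutates a throwaway copy
print(node.attrs)            # {}
set_attr(node, "key", "value")
print(node.attrs)            # {'key': 'value'}
```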

Command Line Interface

A CLI is also provided, which can be used like this:

echo "# Hello, world!" | markdown-it-pyrs html -
# <h1>Hello, world!</h1>
echo "# Hello, world!" | markdown-it-pyrs ast -
# <root>
#   <heading>
#     <text>

Replace - with a filename to read from a file, and see markdown-it-pyrs --help for more options, including initial configuration and enabling plugins.
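
The CLI can also be driven from a script. The sketch below pipes text to the `html` and `ast` subcommands shown above through `subprocess`; the wrapper name `render_via_cli` is an assumption, and it returns `None` when the executable is not on `PATH`.

```python
import shutil
import subprocess


def render_via_cli(markdown_text, command="html"):
    """Pipe text to `markdown-it-pyrs <command> -`; return stdout, or None if the CLI is missing."""
    executable = shutil.which("markdown-it-pyrs")
    if executable is None:
        return None
    completed = subprocess.run(
        [executable, command, "-"],
        input=markdown_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


html = render_via_cli("# Hello, world!\n")
print(html if html is not None else "markdown-it-pyrs CLI not found on PATH")
```

For in-process use, calling `MarkdownIt.render` directly avoids the overhead of spawning a process; the wrapper is mainly useful for shell-style pipelines and smoke tests.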

Initial Configuration

Initialising MarkdownIt("zero") will not enable any plugins, and so you can add only the ones you need.

Use MarkdownIt("commonmark") to enable all the CommonMark plugins.

Use MarkdownIt("gfm") to enable all the CommonMark plugins, plus the GitHub Flavoured Markdown plugins.
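
One way to see what each preset changes is to render the same input under all three. In this sketch the helper name `render_with_presets` and the sample input are assumptions; the parser class is passed in as an argument so the helper itself has no hard dependency on the package.

```python
PRESETS = ("zero", "commonmark", "gfm")


def render_with_presets(make_parser, text, presets=PRESETS):
    """Render text once per preset; make_parser is called as make_parser(preset)."""
    return {preset: make_parser(preset).render(text) for preset in presets}


try:
    from markdown_it_pyrs import MarkdownIt
except ImportError:
    print("markdown-it-pyrs is not installed; skipping the live example")
else:
    results = render_with_presets(MarkdownIt, "# Hello, ~~world~~!\n")
    for preset, html in results.items():
        print(f"{preset}: {html!r}")
```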

Plugins

All syntax rules in markdown-it.rs are implemented as plugins. Plugins can be added to the parser by calling enable or enable_many with the name of the plugin. The following plugins are currently supported:

CommonMark Blocks:

CommonMark Inlines:

GitHub Flavoured Markdown (https://github.github.com/gfm):

Others:

Development

I'm quite new to Rust, so if you see something that could be improved, issues and PRs are welcome!

PyO3 and Maturin are used to build the Python package, by wrapping markdown-it.rs in a Python module.

pre-commit is used to run code formatting and linting checks, and tox is used to run tests.

TODO

Improvements:

Open issue upstream:

Maintenance: