Commonmarker

Ruby wrapper for Rust's comrak crate.

It passes all of the CommonMark test suite, and is therefore spec-complete. It also includes extensions to the CommonMark spec as documented in the GitHub Flavored Markdown spec, such as support for tables, strikethroughs, and autolinking.

For more information on available extensions, see the documentation below.

Installation

Add this line to your application's Gemfile:

gem 'commonmarker'

And then execute:

$ bundle

Or install it yourself as:

$ gem install commonmarker

Usage

Converting to HTML

Call to_html on a string to convert it to HTML:

require 'commonmarker'
Commonmarker.to_html('"Hi *there*"', options: {
    parse: { smart: true }
})
# => <p>“Hi <em>there</em>”</p>\n

(The second argument is optional--see below for more information.)

Generating a document

You can also parse a string to receive a :document node. You can then print that node to HTML, iterate over the children, and do other fun node stuff. For example:

require 'commonmarker'

doc = Commonmarker.parse("*Hello* world", options: {
    parse: { smart: true }
})
puts(doc.to_html) # => <p><em>Hello</em> world</p>\n

doc.walk do |node|
  puts node.type # => [:document, :paragraph, :emph, :text, :text]
end

(The second argument is optional--see below for more information.)

You can also modify the document, for example with insert_before and delete, both of which appear in the examples below.

You can also get the source position of a node by calling source_position:

doc = Commonmarker.parse("*Hello* world")
puts doc.first_child.first_child.source_position
# => {:start_line=>1, :start_column=>1, :end_line=>1, :end_column=>7}

You can also modify node attributes, such as the url of a :link node.

Example: Walking the AST

You can use walk or each to iterate over nodes:

require 'commonmarker'

# parse some string
doc = Commonmarker.parse("# The site\n\n [GitHub](https://www.github.com)")

# Walk tree and print out URLs for links
doc.walk do |node|
  if node.type == :link
    printf("URL = %s\n", node.url)
  end
end
# => URL = https://www.github.com

# Transform links to regular text
doc.walk do |node|
  if node.type == :link
    node.insert_before(node.first_child)
    node.delete
  end
end
# => <h1><a href=\"#the-site\"></a>The site</h1>\n<p>GitHub</p>\n

Example: Converting a document back into raw CommonMark

You can use to_commonmark on a node to render it as raw text:

require 'commonmarker'

# parse some string
doc = Commonmarker.parse("# The site\n\n [GitHub](https://www.github.com)")

# Transform links to regular text
doc.walk do |node|
  if node.type == :link
    node.insert_before(node.first_child)
    node.delete
  end
end

doc.to_commonmark
# => # The site\n\nGitHub\n

Options and plugins

Options

Commonmarker accepts the same parse, render, and extension options that comrak does, as a hash with symbol keys:

Commonmarker.to_html('"Hi *there*"', options: {
  parse: { smart: true },
  render: { hardbreaks: false }
})

Note that there is a distinction in comrak for "parse" options and "render" options, which are represented in the tables below.

Parse options

| Name | Description | Default |
| --- | --- | --- |
| smart | Punctuation (quotes, full-stops and hyphens) are converted into 'smart' punctuation. | false |
| default_info_string | The default info string for fenced code blocks. | "" |
| relaxed_tasklist_matching | Enables relaxing of the tasklist extension matching, allowing any non-space to be used for the "checked" state instead of only x and X. | false |
| relaxed_autolinks | Enable relaxing of the autolink extension parsing, allowing links to be recognized when in brackets, as well as permitting any url scheme. | false |

Render options

| Name | Description | Default |
| --- | --- | --- |
| hardbreaks | Soft line breaks translate into hard line breaks. | true |
| github_pre_lang | GitHub-style <pre lang="xyz"> is used for fenced code blocks with info tags. | true |
| full_info_string | Gives info string data after a space in a data-meta attribute on code blocks. | false |
| width | The wrap column when outputting CommonMark. | 80 |
| unsafe | Allow rendering of raw HTML and potentially dangerous links. | false |
| escape | Escape raw HTML instead of clobbering it. | false |
| sourcepos | Include source position attribute in HTML and XML output. | false |
| escaped_char_spans | Wrap escaped characters in span tags. | true |
| ignore_setext | Ignores setext-style headings. | false |
| ignore_empty_links | Ignores empty links, leaving the Markdown text in place. | false |
| gfm_quirks | Outputs HTML with GFM-style quirks; namely, not nesting <strong> inlines. | false |
| prefer_fenced | Always output fenced code blocks, even where an indented one could be used. | false |

As well, there are several extensions which you can toggle in the same manner:

Commonmarker.to_html('"Hi *there*"', options: {
    extension: { footnotes: true, description_lists: true },
    render: { hardbreaks: false }
})

Extension options

| Name | Description | Default |
| --- | --- | --- |
| strikethrough | Enables the strikethrough extension from the GFM spec. | true |
| tagfilter | Enables the tagfilter extension from the GFM spec. | true |
| table | Enables the table extension from the GFM spec. | true |
| autolink | Enables the autolink extension from the GFM spec. | true |
| tasklist | Enables the task list extension from the GFM spec. | true |
| superscript | Enables the superscript Comrak extension. | false |
| header_ids | Enables the header IDs Comrak extension. | "" |
| footnotes | Enables the footnotes extension per cmark-gfm. | false |
| description_lists | Enables the description lists extension. | false |
| front_matter_delimiter | Enables the front matter extension. | "" |
| multiline_block_quotes | Enables the multiline block quotes extension. | false |
| math_dollars, math_code | Enables the math extension. | false |
| shortcodes | Enables the shortcodes extension. | true |
| wikilinks_title_before_pipe | Enables the wikilinks extension, placing the title before the dividing pipe. | false |
| wikilinks_title_after_pipe | Enables the wikilinks extension, placing the title after the dividing pipe. | false |
| underline | Enables the underline extension. | false |
| spoiler | Enables the spoiler extension. | false |
| greentext | Enables the greentext extension. | false |

For more information on these options, see the comrak documentation.

Plugins

In addition to the possibilities provided by generic CommonMark rendering, Commonmarker also supports plugins as a means of providing further niceties.

Syntax Highlighter Plugin

The library comes with a set of pre-existing themes for highlighting code:

code = <<~CODE
  ```ruby
  def hello
    puts "hello"
  end
  ```
CODE

# pass in a theme name from a pre-existing set
puts Commonmarker.to_html(code, plugins: { syntax_highlighter: { theme: "InspiredGitHub" } })

# <pre style="background-color:#ffffff;" lang="ruby"><code>
# <span style="font-weight:bold;color:#a71d5d;">def </span><span style="font-weight:bold;color:#795da3;">hello
# </span><span style="color:#62a35c;">puts </span><span style="color:#183691;">&quot;hello&quot;
# </span><span style="font-weight:bold;color:#a71d5d;">end
# </span>
# </code></pre>

By default, the plugin uses the "base16-ocean.dark" theme to syntax highlight code.

To disable this plugin, set the value to nil:

code = <<~CODE
  ```ruby
  def hello
    puts "hello"
  end
  ```
CODE

Commonmarker.to_html(code, plugins: { syntax_highlighter: nil })

# <pre lang="ruby"><code>def hello
#   puts &quot;hello&quot;
# end
# </code></pre>

To output CSS classes instead of style attributes, set the theme key to "":

code = <<~CODE
  ```ruby
  def hello
    puts "hello"
  end
  ```
CODE

Commonmarker.to_html(code, plugins: { syntax_highlighter: { theme: "" } })

# <pre class="syntax-highlighting"><code><span class="source ruby"><span class="meta function ruby"><span class="keyword control def ruby">def</span></span><span class="meta function ruby"> <span class="entity name function ruby">hello</span></span>
#   <span class="support function builtin ruby">puts</span> <span class="string quoted double ruby"><span class="punctuation definition string begin ruby">&quot;</span>hello<span class="punctuation definition string end ruby">&quot;</span></span>
# <span class="keyword control ruby">end</span>\n</span></code></pre>

To use a custom theme, you can provide a path to a directory containing .tmtheme files to load:

Commonmarker.to_html(code, plugins: { syntax_highlighter: { theme: "Monokai", path: "./themes" } })

Output formats

Commonmarker can currently only generate output in one format: HTML.

HTML

puts Commonmarker.to_html('*Hello* world!')

# <p><em>Hello</em> world!</p>

Developing locally

After cloning the repo:

script/bootstrap
bundle exec rake compile

If there were no errors, you're done! Otherwise, make sure to follow the comrak dependency instructions.

Benchmarks

❯ bundle exec rake benchmark
input size = 11064832 bytes

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
Warming up --------------------------------------
  Markly.render_html     1.000 i/100ms
Markly::Node#to_html     1.000 i/100ms
Commonmarker.to_html     1.000 i/100ms
Commonmarker::Node.to_html
                         1.000 i/100ms
Kramdown::Document#to_html
                         1.000 i/100ms
Calculating -------------------------------------
  Markly.render_html     15.606 (±25.6%) i/s -     71.000 in   5.047132s
Markly::Node#to_html     15.692 (±25.5%) i/s -     72.000 in   5.095810s
Commonmarker.to_html      4.482 (± 0.0%) i/s -     23.000 in   5.137680s
Commonmarker::Node.to_html
                          5.092 (±19.6%) i/s -     25.000 in   5.072220s
Kramdown::Document#to_html
                          0.379 (± 0.0%) i/s -      2.000 in   5.277770s

Comparison:
Markly::Node#to_html:       15.7 i/s
  Markly.render_html:       15.6 i/s - same-ish: difference falls within error
Commonmarker::Node.to_html:        5.1 i/s - 3.08x  slower
Commonmarker.to_html:        4.5 i/s - 3.50x  slower
Kramdown::Document#to_html:        0.4 i/s - 41.40x  slower