Blackfriday: a markdown processor for Go

Blackfriday is a Markdown processor implemented in Go. It is paranoid about its input (so you can safely feed it user-supplied data), it is fast, it supports common extensions (tables, smart punctuation substitutions, etc.), and it is safe for all utf-8 (unicode) input.

HTML output is currently supported, along with Smartypants extensions.

It started as a translation from C of Sundown.

Installation

Blackfriday is compatible with modern Go releases in module mode. With Go installed:

go get github.com/russross/blackfriday

will resolve and add the package to the current development module, then build and install it. Alternatively, you can achieve the same if you import it in a package:

import "github.com/russross/blackfriday"

and run go get without parameters.

Old versions of Go and legacy GOPATH mode might work, but no effort is made to keep them working.

Versions

The currently maintained and recommended version of Blackfriday is v2. It is developed on its own branch (https://github.com/russross/blackfriday/tree/v2), and the documentation is available at https://pkg.go.dev/github.com/russross/blackfriday/v2.

It is go get-able in module mode at github.com/russross/blackfriday/v2.

Version 2 offers a number of improvements over v1:

  • Cleaned up API
  • A separate call to Parse, which produces an abstract syntax tree for the document
  • Latest bug fixes
  • Flexibility to easily add your own rendering extensions

Potential drawbacks:

  • Our benchmarks show v2 to be slightly slower than v1, currently by around 15%.
  • API breakage. If you can't afford modifying your code to adhere to the new API and don't care too much about the new features, v2 is probably not for you.
  • Several bug fixes are trailing behind and still need to be forward-ported to v2. See issue #348 for tracking.

If you are still interested in the legacy v1, you can import it from github.com/russross/blackfriday. Documentation for the legacy v1 can be found here: https://pkg.go.dev/github.com/russross/blackfriday.

Usage

v1

For basic usage, it is as simple as getting your input into a byte slice and calling:

output := blackfriday.MarkdownBasic(input)

This renders it with no extensions enabled. To get a more useful feature set, use this instead:

output := blackfriday.MarkdownCommon(input)

v2

For the most sensible markdown processing, it is as simple as getting your input into a byte slice and calling:

output := blackfriday.Run(input)

Your input will be parsed and the output rendered with a set of most popular extensions enabled. If you want the most basic feature set, corresponding with the bare Markdown specification, use:

output := blackfriday.Run(input, blackfriday.WithNoExtensions())

Sanitize untrusted content

Blackfriday itself does nothing to protect against malicious content. If you are dealing with user-supplied markdown, we recommend running Blackfriday's output through an HTML sanitizer such as Bluemonday.

Here's an example of simple usage of Blackfriday together with Bluemonday:

import (
    "github.com/microcosm-cc/bluemonday"
    "github.com/russross/blackfriday/v2" // Run is part of the v2 API
)

// ...
unsafe := blackfriday.Run(input)
html := bluemonday.UGCPolicy().SanitizeBytes(unsafe)

Custom options, v1

If you want to customize the set of options, first get a renderer (currently only the HTML output engine), then use it to call the more general Markdown function. For examples, see the implementations of MarkdownBasic and MarkdownCommon in markdown.go.

Custom options, v2

If you want to customize the set of options, pass blackfriday.WithExtensions, blackfriday.WithRenderer and blackfriday.WithRefOverride as additional arguments to blackfriday.Run.
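These With* functions follow Go's functional-options pattern: each one returns a value that adjusts the processor's configuration. The sketch below is a self-contained miniature of that pattern, not Blackfriday's code. The `config` type, the `newConfig` function, the extension constants and their default are all hypothetical stand-ins, chosen so the example runs without any dependency.

```go
package main

import "fmt"

// Extensions is a bit set of parser features (hypothetical stand-in).
type Extensions int

const (
	Tables Extensions = 1 << iota
	FencedCode
	Strikethrough
)

// config collects the settings that options are allowed to change.
type config struct {
	extensions Extensions
}

// Option mutates a config; each With* helper returns one.
type Option func(*config)

// WithExtensions replaces the enabled extension set.
func WithExtensions(e Extensions) Option {
	return func(c *config) { c.extensions = e }
}

// newConfig starts from an illustrative default and applies options in order.
func newConfig(opts ...Option) config {
	c := config{extensions: Tables | FencedCode}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(WithExtensions(Tables | Strikethrough))
	fmt.Println(c.extensions&Strikethrough != 0) // true
	fmt.Println(c.extensions&FencedCode != 0)    // false
}
```

The appeal of this design is that the zero-argument call keeps sensible defaults, while new options can be added later without changing the signature of the entry point.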

blackfriday-tool

You can also check out blackfriday-tool for a more complete example of how to use it. Download and install it using:

go get github.com/russross/blackfriday-tool

This is a simple command-line tool that allows you to process a markdown file using a standalone program. You can also browse the source directly on GitHub if you are just looking for some example code: https://github.com/russross/blackfriday-tool

Note that if you have not already done so, installing blackfriday-tool will be sufficient to download and install blackfriday in addition to the tool itself. The tool binary will be installed in $GOPATH/bin. This is a statically-linked binary that can be copied to wherever you need it without worrying about dependencies and library versions.

Sanitized anchor names

Blackfriday includes an algorithm for creating sanitized anchor names corresponding to a given input text. This algorithm is used to create anchors for headings when EXTENSION_AUTO_HEADER_IDS is enabled. The algorithm has a specification, so that other packages can create compatible anchor names and links to those anchors.

The specification is located at https://pkg.go.dev/github.com/russross/blackfriday#hdr-Sanitized_Anchor_Names.

SanitizedAnchorName exposes this functionality, and can be used to create compatible links to the anchor names generated by blackfriday. This algorithm is also implemented in a small standalone package at github.com/shurcooL/sanitized_anchor_name. It can be useful for clients that want a small package and don't need full functionality of blackfriday.
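As a rough illustration of what such an algorithm can look like, here is a minimal sketch in Go. It assumes the rules are: keep letters and digits in lower case, collapse every run of other characters into a single dash, and emit no leading or trailing dash. That is one reading of the behaviour, not a copy of Blackfriday's implementation. The function name `anchorName` is hypothetical, and the linked specification remains authoritative.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// anchorName is a hypothetical helper that sketches anchor sanitization:
// letters and digits are lowercased and kept, and any run of other
// characters becomes one dash between kept characters.
func anchorName(text string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsNumber(r) {
			// Only write a separator once we know more text follows,
			// and never at the very start.
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(unicode.ToLower(r))
		} else {
			pendingDash = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(anchorName("Sanitized anchor names")) // sanitized-anchor-names
	fmt.Println(anchorName("  Hello, World!  "))      // hello-world
}
```

In real code, prefer calling SanitizedAnchorName itself so that generated links stay compatible with the anchors Blackfriday emits.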

Features

All features of Sundown are supported, including:

  • Compatibility. The Markdown v1.0.3 test suite passes with the --tidy option. Without --tidy, the differences are mostly in whitespace and entity escaping, where blackfriday is more consistent and cleaner.

  • Common extensions, including table support, fenced code blocks, autolinks, strikethroughs, non-strict emphasis, etc.

  • Safety. Blackfriday is paranoid when parsing, making it safe to feed untrusted user input without fear of bad things happening. The test suite stress tests this and there are no known inputs that make it crash. If you find one, please let me know and send me the input that does it.

    NOTE: "safety" in this context means runtime safety only. In order to protect yourself against JavaScript injection in untrusted content, see the "Sanitize untrusted content" section above.

  • Fast processing. It is fast enough to render on-demand in most web applications without having to cache the output.

  • Thread safety. You can run multiple parsers in different goroutines without ill effect. There is no dependence on global shared state.

  • Minimal dependencies. Blackfriday only depends on standard library packages in Go. The source code is pretty self-contained, so it is easy to add to any project, including Google App Engine projects.

  • Standards compliant. Output successfully validates using the W3C validation tool for HTML 4.01 and XHTML 1.0 Transitional.

Extensions

In addition to the standard markdown syntax, this package implements the following extensions:

  • Intra-word emphasis suppression. The _ character is commonly used inside words when discussing code, so having markdown interpret it as an emphasis command is usually the wrong thing. Blackfriday lets you treat all emphasis markers as normal characters when they occur inside a word.

  • Tables. Tables can be created by drawing them in the input using a simple syntax:

    Name    | Age
    --------|------
    Bob     | 27
    Alice   | 23
    
  • Fenced code blocks. In addition to the normal 4-space indentation to mark code blocks, you can explicitly mark them and supply a language (to make syntax highlighting simple). Just mark it like this:

    ```go
    func getTrue() bool {
        return true
    }
    ```
    

    You can use 3 or more backticks to mark the beginning of the block, and the same number to mark the end of the block.

    To preserve classes of fenced code blocks while using the bluemonday HTML sanitizer, use the following policy:

    p := bluemonday.UGCPolicy()
    p.AllowAttrs("class").Matching(regexp.MustCompile("^language-[a-zA-Z0-9]+$")).OnElements("code")
    html := p.SanitizeBytes(unsafe)

  • Definition lists. A simple definition list is made of a single-line term followed by a colon and the definition for that term.

    Cat
    : Fluffy animal everyone likes
    
    Internet
    : Vector of transmission for pictures of cats
    

    Terms must be separated from the previous definition by a blank line.

  • Footnotes. A marker in the text that will become a superscript number; a footnote definition that will be placed in a list of footnotes at the end of the document. A footnote looks like this:

    This is a footnote.[^1]
    
    [^1]: the footnote text.
    
  • Autolinking. Blackfriday can find URLs that have not been explicitly marked as links and turn them into links.

  • Strikethrough. Use two tildes (~~) to mark text that should be crossed out.

  • Hard line breaks. With this extension enabled (it is off by default in the MarkdownBasic and MarkdownCommon convenience functions), newlines in the input translate into line breaks in the output.

  • Smart quotes. Smartypants-style punctuation substitution is supported, turning normal double- and single-quote marks into curly quotes, etc.

  • LaTeX-style dash parsing is an additional option, where -- is translated into –, and --- is translated into —. This differs from most smartypants processors, which turn a single hyphen into an ndash and a double hyphen into an mdash.

  • Smart fractions, where anything that looks like a fraction is translated into suitable HTML (instead of just a few special cases like most smartypants processors). For example, 4/5 becomes <sup>4</sup>&frasl;<sub>5</sub>, which renders as 4⁄5.
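The dash and fraction substitutions described in the list above can be sketched with the standard library alone. This is only an approximation written for illustration, not Blackfriday's Smartypants code. The names `latexDashes` and `smartFractions` are hypothetical, and emitting the `&ndash;` and `&mdash;` entities rather than literal characters is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// latexDashes maps "---" to an em dash entity and "--" to an en dash entity.
// When several patterns match at the same position, strings.NewReplacer
// prefers the pair listed first, so the longer "---" must come first.
var latexDashes = strings.NewReplacer("---", "&mdash;", "--", "&ndash;")

// fraction matches digits/digits delimited by word boundaries.
var fraction = regexp.MustCompile(`\b(\d+)/(\d+)\b`)

// smartFractions wraps numerator and denominator in <sup> and <sub>.
func smartFractions(s string) string {
	return fraction.ReplaceAllString(s, "<sup>${1}</sup>&frasl;<sub>${2}</sub>")
}

func main() {
	fmt.Println(latexDashes.Replace("pages 3--7 --- roughly"))
	// pages 3&ndash;7 &mdash; roughly
	fmt.Println(smartFractions("about 4/5 done"))
	// about <sup>4</sup>&frasl;<sub>5</sub> done
}
```

A pattern this simple would also rewrite dates or paths that happen to contain digits around a slash, so a production implementation needs more context than this sketch uses.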

Other renderers

Blackfriday is structured to allow alternative rendering engines. Here are a few of note:

  • github_flavored_markdown: provides a GitHub Flavored Markdown renderer with fenced code block highlighting and clickable heading anchor links.

    It's not customizable, and its goal is to produce HTML output equivalent to the GitHub Markdown API endpoint, except the rendering is performed locally.

  • markdownfmt: like gofmt, but for markdown.

  • LaTeX output: renders output as LaTeX.

  • bfchroma: provides convenience integration with the Chroma code highlighting library. bfchroma is only compatible with v2 of Blackfriday and provides a drop-in renderer ready to use with Blackfriday, as well as options and means for further customization.

  • Blackfriday-Confluence: provides a Confluence Wiki Markup renderer.

  • Blackfriday-Slack: converts Markdown to Slack message style.

TODO

  • More unit testing
  • Improve Unicode support. It does not understand all Unicode rules (about what constitutes a letter, a punctuation symbol, etc.), so it may fail to detect word boundaries correctly in some instances. It is safe on all UTF-8 input.
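To make the word-boundary caveat concrete, the sketch below contrasts a byte-wise ASCII check with the Unicode-aware functions in Go's `unicode` package. It is only an illustration of the class of problem, assuming a parser that classifies one byte at a time. It does not reproduce Blackfriday's internals, and both helper names are hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCIIWordByte reports whether b is an ASCII letter or digit.
// A check like this sees every byte of a multi-byte UTF-8 sequence
// as a non-word character.
func isASCIIWordByte(b byte) bool {
	return (b >= '0' && b <= '9') || (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// isUnicodeWordRune consults the Unicode tables instead.
func isUnicodeWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsNumber(r)
}

func main() {
	s := "\u00e9" // "é", encoded as two bytes in UTF-8

	fmt.Println(isASCIIWordByte(s[0])) // false
	for _, r := range s {
		fmt.Println(isUnicodeWordRune(r)) // true
	}
}
```

Both approaches are safe on arbitrary UTF-8 input. They differ only in whether a character such as "é" counts as part of a word.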

License

Blackfriday is distributed under the Simplified BSD License

Comments
  • Discussion: initial work on v2

    I have rebased my v2 experiments on top of the latest master and pushed them to the v2 branch for eyeballing.

    Here's what I was doing and where I stand:

    • All constants renamed from C_STYLE to IdiomaticGo (and plain ints to typed constants). The rename was mechanical so far, I didn't try to take a chance and look for opportunities to eliminate naming inconsistencies; but we probably should do that while we're breaking the compatibility. Suggestions welcome.
    • Renderer interface overhaul. Eliminated callbacks from renderer methods. #120 suggested changing *bytes.Buffer to io.Writer, but I tried another approach: leaving all writing business to the renderer internals.
    • #189 is probably fixed as side effect: no more callbacks, no more dead loops.
    • When the table of contents is enabled, it still behaves the old way: it backs up the whole document in a temp buffer, truncates it, writes out the TOC, then writes back the temp buffer. I don't see an easy way around it.

    The next big thing I'm going to try is constructing an actual AST before rendering. This should allow us to clean up our internal code substantially in some places and further simplify the renderer interface (and properly fix the TOC problem).

    Comments welcome.

  • v2: API Enhancement: Why is RenderNode part of the Renderer interface?

    Hello, I have a v2 question.

    The interface for a Renderer is:

    type Renderer interface {
    	Render(ast *Node) []byte
    	RenderNode(w io.Writer, node *Node, entering bool) WalkStatus
    }
    

    Why is RenderNode part of the interface? I don't see any direct usage of it. I do see Render called. It's not clear to me why implementers of a new renderer need to use this method name or signature (and make it public). But perhaps I missed something, as node.Walk and RenderNode are similar.

    thanks!

  • v2: decide on import path and make a new release

    The latest release is from September 2018, maybe it's time to make a new one?

    The primary reasons for a new release are #509 and #515; this will make the package not depend on any other packages. Including #586 would be nice, too.

  • v2: Why the distinct types `Processor` and `Parser`?

    Sorry for being a bit late on this: I've just found time to follow up with Blackfriday's recent changes and update blackfriday-latex.

    I like the new simplifications that have been done, in particular that Extensions only belong to the parser while Flags only belong to the renderer. It makes the whole picture much clearer.

    I am not sure I understand the purpose of the distinction between the Parser and the Processor types however.

    In fact NewParser() has side effects on the processor. Parser contains both Extensions and ReferenceOverrideFunc like Processor, only Renderer is specific to the latter, but it is only ever used in Markdown() which could just as easily instantiate its own.

    NewHTMLRenderer(HTMLRendererParameters{
    	Flags: CommonHTMLFlags,
    }),
    

    The way I see it, the Renderer and Parser types are distinct objects that only interact through the Render() function.

    I find the current API somewhat confusing because of this.

  • Consider other days of the week?

    I've recently stumbled upon https://github.com/microcosm-cc/bluemonday, which seems to be a Go library for HTML sanitizing.

    Would it be a good idea or a bad idea to use it?

    I haven't really looked at it closely yet, but I just wanted to start the discussion here.

  • Going to tag v2.0.0

    Hey folks. I've submitted what I hope to be the last API-affecting PRs, #381 and #382. After they're in, I'm going to tag the tip of the v2 branch with v2.0.0, migrate the relevant README changes to master and call v2 done.

    So this is the last call to shout out whatever API-affecting issues, concerns or questions you might have.

    I plan to do that some time within two weeks if nothing comes up.

  • Discussion: v2 performance tweaks

    Hi all,

    I have recently pushed a branch, v2-perf-tweaks, with some initial performance improvements for v2. I'm opening this discussion thread for discussing whatever issues people might find reviewing that code and/or suggestions/observations for further improvement.

  • Fenced code block with list fails

    Given the following fenced code block,

    1. Foo
    
        ```
        + bar
        ```
    

    blackfriday renders:

    <ol>\n<li><p>Foo</p>\n\n<p>```</p>\n\n<ul>\n<li>bar\n```</li>\n</ul></li>\n</ol>\n
    

    I expect it to render:

    <ol>\n<li><p>Foo</p>\n\n<pre><code>+ bar\n</code></pre></li>\n</ol>\n
    

    It appears to occur when the first line of the code block matches a list item pattern (i.e. -, +, *, or 1.). If there's a blank line or any content that does not match a list item pattern on the first line, the code fencing works properly.

    Here's a test case for TestFenceCodeBlockWithinList:

            "1. Foo\n\n    ```\n    + bar\n    ```\n",
            `<ol>
    <li><p>Foo</p>
    
    <pre><code>+ bar
    </code></pre></li>
    </ol>
    `,
    

    I haven't been able to figure out where it's breaking down.

  • Preserve code block language

    Right now the following block of code:

    ```go
    func main() {
      fmt.Println("Hello world")
    }
    ```
    

    will be translated into

    <pre>
      <code>
       ... code goes here...
      </code>
    </pre>
    

    Can the language be preserved, for example like this:

    <pre>
      <code lang="go">
       ... code goes here...
      </code>
    </pre>
    
  • Add PostRefOverride functionality

    I mimicked the WithRefOverride option for this functionality, and hope that it or something like it will be accepted. I would like to be able to override refs but only if the user has not specified a reference, which means that ref overriding needs to happen after the refs have been resolved.

    Fixes #467

  • v2: Remove the LaTeX renderer stub and point to Ambrevar's implementation

    I've implemented an almost fully functional LaTeX renderer.

    The main extension that is broken at the moment is footnote support, but this needs upstream changes to the parser first.

  • enclosed parentheses in markdown links not rendered properly in html

    When given the following input:

    [some link](www.foo.com?param=(null))
    

    blackfriday converts that input into:

    <p><a href="www.foo.com?param=(null">some link</a>)</p>
    

    Visually this will look like some link) and the link url will be www.foo.com?param=(null

    Expected behavior would be: some link

  • What flags to (un)set if i want no header tags?

    I'm trying to enforce an effect such that user content is somewhat consistent with font sizes.

    blackfriday.AutoHeadingIDs and blackfriday.SpaceHeadings do not seem to give me what I want. Any clues as to how to do this?

  • bugfix: Resolve #701

    isFenceLine was being called within listItem for finding code blocks. It was unable to detect code blocks with an info string since a nil string pointer was being passed in. Now the pointer is not nil so info-string code blocks are correctly parsed within lists. Resolves #701

  • Buggy, fragile list behavior

    Using blackfriday v2.1.0, default parameters

    The following markdown has 3 list items.

    7. Defina
    	```python
    	> greet_user("esteban")
    	```
    
    11. Dado
    	```python
    	resultado = simple()
    	```
    
    12. Examine
    	```python
    	def convertir_a_float(cadena):
    	```
    
    

    Here is the html result. Some observations:

    • It unexpectedly nests a list
    • Second item code is inline, inconsistent with other code blocks (and unexpected)

    <ol>
    <li><p>Defina</p>
    
    <pre><code class="language-python">&gt; greet_user(&quot;esteban&quot;)
    </code></pre>
    <ol>
    <li>Dado
    <code>python
    resultado = simple()
    </code></li>
    </ol></li>
    
    <li><p>Examine</p>
    
    <pre><code class="language-python">def convertir_a_float(cadena):
    </code></pre></li>
    </ol>
    

    If we remove the third item from the markdown list we get the following. Few observations:

    • First item info string escaping the code block? what?
    • Second item is still inline.
    • Now the second item is no longer a list item. The whole markdown content has been interpreted as a single list node. Not sure what could possibly be causing this...

    <ol>
    <li>Defina
    <code>python
    &gt; greet_user(&quot;esteban&quot;)
    </code>
    11. Dado
    <code>python
    resultado = simple()
    </code></li>
    </ol>
    

    Something similar happens if we delete the first item of the list.

    <ol>
    <li>Dado
    <code>python
    resultado = simple()
    </code>
    12. Examine
    <code>python
    def convertir_a_float(cadena):
    </code></li>
    </ol>
    

    Please note the tabs in front of the code blocks have no effect on the bug.

  • Add math extension

    Hey, I built this PR to discuss what form a Math extension could have. This would be useful for rendering LaTeX output and for having HTML content that can be displayed using MathJax or KaTeX.

    There is an issue which is nearly four years old on this matter (https://github.com/russross/blackfriday/issues/504). I believe with the new v2 engine this would be a change that does not break existing programs since math nodes would not be parsed without enabling this extension. @j2kun also created an issue on this subject a while ago and another in the Hugo project and the issue has still not been solved.

    It is worth noting the LaTeX renderer link by @Ambrevar in this project's README is broken. I would like to replace it with a working LaTeX renderer in the future, but before that I'd like to have a few opinions on what best to do about the math extension functionality.

Mmark: a powerful markdown processor in Go geared towards the IETF

Mmark is a powerful markdown processor written in Go, geared towards writing IETF documents

Dec 30, 2022
Enhanced Markdown template processor for golang

emd Enhanced Markdown template processor. See emd README file TOC Install glide

Jan 2, 2022
🚩 TOC, zero configuration table of content generator for Markdown files, create table of contents from any Markdown file with ease.

toc toc TOC, table of content generator for Markdown files Table of Contents Table of Contents Usage Installation Packages Arch Linux Homebrew Docker

Dec 29, 2022
Markdown - Markdown converter for golang

markdown Talks Join Youtube ❤️ Sponsor Install via nami nami install ma

Jun 2, 2022
Mdfmt - A Markdown formatter that follow the CommonMark. Like gofmt, but for Markdown

Introduction A Markdown formatter that follow the CommonMark. Like gofmt, but fo

Dec 18, 2022
pdfcpu is a PDF processor written in Go.

pdfcpu is a PDF processing library written in Go supporting encryption. It provides both an API and a CLI. Supported are all versions up to PDF 1.7 (ISO-32000).

Jan 4, 2023
โš™๏ธ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
โš™๏ธ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.

html-to-markdown Convert HTML into Markdown with Go. It is using an HTML Parser to avoid the use of regexp as much as possible. That should prevent so

Jan 6, 2023
Produces a set of tags from given source. Source can be either an HTML page, Markdown document or a plain text. Supports English, Russian, Chinese, Hindi, Spanish, Arabic, Japanese, German, Hebrew, French and Korean languages.

Tagify Gets STDIN, file or HTTP address as an input and returns a list of most popular words ordered by popularity as an output. More info about what

Dec 19, 2022
Upskirt markdown library bindings for Go

Goskirt Package goskirt provides Go-bindings for the excellent Sundown Markdown parser. (F/K/A Upskirt). To use goskirt, create a new Goskirt-value wi

Oct 23, 2022
A CLI markdown converter written in Go.

MDConv is a markdown converter written in Go. It is able to create PDF and HTML files from Markdown without using LaTeX. Instead MDConv u

Dec 20, 2022
A markdown renderer package for the terminal

go-term-markdown go-term-markdown is a go package implementing a Markdown renderer for the terminal. Note: Markdown being originally designed to rende

Nov 25, 2022
A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.

goldmark A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured. goldmark is compliant with CommonMark 0.29. Motivation

Dec 29, 2022
:triangular_ruler:gofmtmd formats go source code block in Markdown. detects fenced code & formats code using gofmt.

gofmtmd gofmtmd formats go source code block in Markdown. detects fenced code & formats code using gofmt. Installation $ go get github.com/po3rin/gofm

Oct 31, 2022
Convert Microsoft Word Document to Markdown

docx2md Convert Microsoft Word Document to Markdown Usage $ docx2md NewDocument.docx Installation $ go get github.com/mattn/docx2md Supported Styles

Jan 4, 2023
Stylesheet-based markdown rendering for your CLI apps 💇🏻‍♀️

Glamour Write handsome command-line tools with Glamour. glamour lets you render markdown documents & templates on ANSI compatible terminals. You can c

Jan 1, 2023
go-md2man - Convert Markdown to man page content

go-md2man Converts markdown into roff (man pages). Uses blackfriday to process markdown into man pages. Usage ./md2man -in /path/to/markdownfile.md -o

Dec 22, 2022
A PDF renderer for the goldmark markdown parser.

goldmark-pdf goldmark-pdf is a renderer for goldmark that allows rendering to PDF. Reference See https://pkg.go.dev/github.com/stephenafamo/goldmark-p

Jan 7, 2023
Markdown to Webpage app

mark2web Markdown to webpage link Usage $ mark2web test.md https://mark2web.test/aa32d8f230ef9d44c3a7acb55b572c8599502701 $ mark2web /tmp/session/test

Apr 18, 2021
Markdown Powered Graph API

What is Arachne? Arachne, (Greek: “Spider”) in [[greek/mythology]], the [[Arachne:daughter of:Idmon of Colophon]] in Lydia, a dyer in purple. Arachne

Dec 19, 2021