goldmark-pdf

goldmark-pdf is a renderer for goldmark that allows rendering to PDF.

Reference

See https://pkg.go.dev/github.com/stephenafamo/goldmark-pdf

Usage

Care has been taken to match the semantics of goldmark and its extensions.

The PDF renderer is created with pdf.New(). The returned value satisfies goldmark's renderer.Renderer interface, so it can be passed to goldmark.New() using the goldmark.WithRenderer() option.

markdown := goldmark.New(
    goldmark.WithRenderer(pdf.New()),
)

Options can also be passed to pdf.New(). Each option must satisfy the following interface:

// An Option interface is a functional option type for the Renderer.
type Option interface {
	SetConfig(*Config)
}

Here is the Config struct that the options modify:

type Config struct {
	Context context.Context

	PDF PDF

	// A source for images
	ImageFS fs.FS

	// All other options have sensible defaults
	Styles Styles

	// A cache for the fonts
	FontsCache fonts.Cache

	// For debugging
	TraceWriter io.Writer

	NodeRenderers util.PrioritizedSlice
}

Some helper functions for adding options are already provided; see option.go.

An example with some more options:

goldmark.New(
    goldmark.WithRenderer(
        pdf.New(
            pdf.WithTraceWriter(os.Stdout),
            pdf.WithContext(context.Background()),
            pdf.WithImageFS(os.DirFS(".")),
            pdf.WithLinkColor("cc4578"),
            pdf.WithHeadingFont(pdf.GetTextFont("IBM Plex Serif", pdf.FontLora)),
            pdf.WithBodyFont(pdf.GetTextFont("Open Sans", pdf.FontRoboto)),
            pdf.WithCodeFont(pdf.GetCodeFont("Inconsolata", pdf.FontRobotoMono)),
        ),
    ),
)

Fonts

The fonts that can be used in the PDF are based on the Font struct:

// Represents a font.
type Font struct {
	CanUseForText bool
	CanUseForCode bool

	Category string
	Family   string

	FileRegular    string
	FileItalic     string
	FileBold       string
	FileBoldItalic string

	Type fontType
}

To be used for text, a font should have regular, italic, bold and bold-italic styles. Each of these has to be loaded separately.

To ease this process, variables have been generated for all the Google fonts that have these styles. For example:

var FontRoboto = Font{
	CanUseForCode:  false,
	CanUseForText:  true,
	Category:       "sans-serif",
	Family:         "Roboto",
	FileBold:       "700",
	FileBoldItalic: "700italic",
	FileItalic:     "italic",
	FileRegular:    "regular",
	Type:           fontTypeGoogle,
}

For code blocks, any style that is missing falls back to the regular font:

var FontMajorMonoDisplay = Font{
	CanUseForCode:  true,
	CanUseForText:  false,
	Category:       "monospace",
	Family:         "Major Mono Display",
	FileBold:       "regular",
	FileBoldItalic: "regular",
	FileItalic:     "regular",
	FileRegular:    "regular",
	Type:           fontTypeGoogle,
}

When fonts are loaded, they are downloaded on the fly using the fonts package.

If you'd like to use a font outside of these, pass your own Font struct for a font that has already been loaded into the PDF object you set in the Config. Be sure to set the FontType to FontTypeCustom so that no download is attempted.

Contributing

Here's a list of things that I'd love help with:

  • More documentation
  • Testing
  • Finish the (currently buggy) implementation based on gopdf

License

MIT

Author

Stephen Afam-Osemene

Comments
  • Tables do not render correctly

    If a cell's contents are longer than the column heading, the next column overwrites the end of the previous one. Column widths are determined from the header row only, and cell contents do not wrap (at least not without changing the row height, which I have not tested).

    This makes the table functionality not very useful except for the simplest of tables, which is a shame since otherwise the generated PDF looks very good.
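
    One possible direction, sketched here under assumptions and not taken from the renderer's code, is to size each column from its widest cell rather than from the header alone. The sketch measures width in runes as a stand-in for measured text width in the PDF; columnWidths is a hypothetical helper.

    ```go
    package main

    import (
    	"fmt"
    	"unicode/utf8"
    )

    // columnWidths returns, for each column, the width of its widest cell
    // across the header and all body rows. Width is counted in runes here,
    // a stand-in for the rendered text width a PDF backend would report.
    func columnWidths(rows [][]string) []int {
    	var widths []int
    	for _, row := range rows {
    		for i, cell := range row {
    			// Grow the result when a row has more cells than seen so far.
    			if i >= len(widths) {
    				widths = append(widths, 0)
    			}
    			if n := utf8.RuneCountInString(cell); n > widths[i] {
    				widths[i] = n
    			}
    		}
    	}
    	return widths
    }

    func main() {
    	rows := [][]string{
    		{"ID", "Name"},
    		{"1", "A much longer cell"},
    	}
    	fmt.Println(columnWidths(rows))
    }
    ```

    Columns sized this way can still exceed the page width, so wrapping or proportional shrinking would be needed on top of it.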

  • Local links/navigation is not working

    Having local navigation, like:

    # <a name="top"></a>Markdown Test Page
    
    * [Headings](#Headings)
    * [Paragraphs](#Paragraphs)
    * [Blockquotes](#Blockquotes)
    * [Lists](#Lists)
    * [Horizontal rule](#Horizontal)
    * [Table](#Table)
    * [Code](#Code)
    * [Inline elements](#Inline)
    
    ***
    
    # <a name="Headings"></a>Headings
    
    # Heading one
    
    Sint sit cillum pariatur eiusmod nulla pariatur ipsum. Sit laborum anim qui mollit tempor pariatur nisi minim dolor. Aliquip et adipisicing sit sit fugiat commodo id sunt. Nostrud enim ad commodo incididunt cupidatat in ullamco ullamco Lorem cupidatat velit enim et Lorem. Ut laborum cillum laboris fugiat culpa sint irure do reprehenderit culpa occaecat. Exercitation esse mollit tempor magna aliqua in occaecat aliquip veniam reprehenderit nisi dolor in laboris dolore velit.
    
    ## Heading two
    
    [[Top]](#top)
    

    will render links but they won't work.

  • Image ALT is ignored

    When an image fails to load, its alt text is ignored:

    ![The San Juan Mountains are beautiful!](san-juan-mountains.jpggg "San Juan Mountains Alt Text")

  • Image caption is not centered

    When an image has a caption, the caption is rendered aligned to the left of the document instead of centered with the image.

    For example: ![The San Juan Mountains are beautiful!](/assets/images/san-juan-mountains.jpg "San Juan Mountains")

    will end up like: [image]

  • Small images are stretched for the entire width of the document

    I have a test document that contains a small image of 176px*176px. The image is scaled to the entire width of the document, which does not look good and is most likely not desired: it would not be rendered that way in an HTML document unless specifically configured with CSS.

    The code responsible for this behavior is: https://github.com/stephenafamo/goldmark-pdf/blob/master/renderer_funcs.go#L581

    I think that an image narrower than the document should be kept at its original size, and only a wider image should be scaled down to fit the document.
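
    The proposed rule can be sketched in a few lines. This is only an illustration of the suggestion above, not code from the renderer; fitWidth is a hypothetical helper, and it assumes the image width and the available width are already expressed in the same unit.

    ```go
    package main

    import "fmt"

    // fitWidth returns the width at which to draw an image: its natural
    // width when it already fits, otherwise the maximum available width.
    func fitWidth(imgW, maxW float64) float64 {
    	if imgW <= maxW {
    		return imgW
    	}
    	return maxW
    }

    func main() {
    	fmt.Println(fitWidth(176, 500))  // a small image keeps its size
    	fmt.Println(fitWidth(1200, 500)) // a wide image is scaled down
    }
    ```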

  • Lack of image mime detection

    The image renderer:

    func (r *nodeRederFuncs) renderImage(w *Writer, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
    	// while this has entering and leaving states, it doesn't appear
    	// to be useful except for other markup languages to close the tag
    	n := node.(*ast.Image)
    
    	if entering {
    		w.LogDebug("Image (entering)", fmt.Sprintf("Destination[%v] Title[%v]", string(n.Destination), string(n.Title)))
    		// following changes suggested by @sirnewton01, issue #6
    		// does file exist?
    			imgPath := string(n.Destination) // <-- the path comes from the link destination
    		imgFile, err := w.ImageFS.Open(imgPath)
    		if err == nil {
    			defer imgFile.Close()
    
    			width, _ := w.Pdf.GetPageSize()
    			mleft, _, mright, _ := w.Pdf.GetMargins()
    			maxw := width - (mleft * 2) - (mright * 2)
    
    			format := strings.ToUpper(strings.Trim(filepath.Ext(imgPath), ".")) // <-- the format comes from the file extension
    			w.Pdf.RegisterImage(imgPath, format, imgFile)
    			w.Pdf.UseImage(imgPath, (mleft * 2), w.Pdf.GetY(), maxw, 0)
    		} else {
    			log.Printf("IMAGE ERROR: %v", err)
    			w.LogDebug("Image (file error)", err.Error())
    		}
    	} else {
    		w.LogDebug("Image (leaving)", "")
    	}
    
    	return ast.WalkContinue, nil
    }
    

    relies on the path to determine the MIME type of the file. But if I use an HTTP-backed file system to embed images that are linked via HTTP rather than stored locally, this fails unless the URL ends in a file extension matching the type, which is rarely the case.

    Hence, there should be a built-in HTTP file system, and the MIME type should be determined from the file contents using github.com/gabriel-vasile/mimetype or a similar library.

    My quickly built fs is:

    
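    import (
    	"io/fs"
    	"mime"
    	"net/http"
    	"path/filepath"
    	"strings"
    	"time"
    )

    // HttpFs is an fs.FS whose Open fetches the given name as a URL.
    type HttpFs struct{}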
    
    func (f *HttpFs) Open(name string) (fs.File, error) {
    	res, err := http.Get(name)
    	if err != nil {
    		return nil, err
    	}
    	return &HttpFile{r: res}, nil
    }
    
    type HttpFile struct {
    	r *http.Response
    }
    
    func (f *HttpFile) Stat() (fs.FileInfo, error) {
    	return &HttpInfo{r: f.r}, nil
    }
    
    func (f *HttpFile) Read(p []byte) (int, error) {
    	return f.r.Body.Read(p)
    }
    
    func (f *HttpFile) Close() error {
    	return f.r.Body.Close()
    }
    
    type HttpInfo struct {
    	r *http.Response
    }
    
    func (i *HttpInfo) Name() string {
    	fn := strings.TrimPrefix(i.r.Request.URL.Path, "/")
    	if fn == "" {
    		if _, params, err := mime.ParseMediaType(i.r.Header.Get("Content-Disposition")); err == nil {
    			fn = params["filename"]
    		}
    	}
    	if filepath.Ext(fn) == "" {
    		mt, _, _ := mime.ParseMediaType(i.r.Header.Get("Content-Type"))
    		if spl := strings.Split(mt, "/"); len(spl) > 0 {
    			if fn == "" {
    				fn = spl[0]
    			}
    			fn += "." + spl[len(spl)-1]
    		}
    	}
    	return filepath.Base(fn)
    }
    
    func (i *HttpInfo) Size() int64 {
    	return i.r.ContentLength
    }
    
    func (i *HttpInfo) Mode() fs.FileMode {
    	return fs.ModeIrregular
    }
    
    func (i *HttpInfo) ModTime() time.Time {
    	if t, err := time.Parse(time.RFC1123, i.r.Header.Get("Last-Modified")); err == nil {
    		return t
    	}
    	return time.Time{}
    }
    
    func (i *HttpInfo) IsDir() bool {
    	return false
    }
    
    func (i *HttpInfo) Sys() any {
    	return i.r
    }
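
    As a rough sketch of content-based detection, the standard library's http.DetectContentType can stand in for the mimetype library suggested above. It sniffs the leading bytes of the data rather than trusting the path. imageFormat is a hypothetical helper, and the upper-case names it returns are an assumption modelled on the upper-cased file extension used in renderImage.

    ```go
    package main

    import (
    	"fmt"
    	"net/http"
    )

    // imageFormat sniffs the leading bytes of data and returns an upper-case
    // format name ("PNG", "JPG" or "GIF"), or "" when the content is not
    // recognized as one of those image types.
    func imageFormat(data []byte) string {
    	switch http.DetectContentType(data) {
    	case "image/png":
    		return "PNG"
    	case "image/jpeg":
    		return "JPG"
    	case "image/gif":
    		return "GIF"
    	}
    	return ""
    }

    func main() {
    	// The eight-byte PNG signature is enough for detection.
    	pngHeader := []byte("\x89PNG\r\n\x1a\n")
    	fmt.Println(imageFormat(pngHeader))
    }
    ```

    Because the reader has to be consumed to sniff it, a caller would need to buffer the image bytes and hand a fresh reader over them to the PDF backend afterwards.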
    