goldmark-pdf is a renderer for goldmark that allows rendering to PDF.

Stephen Afam-Osemene

Last update: Dec 27, 2022

Comments: 6

goldmark-pdf

Reference

See https://pkg.go.dev/github.com/stephenafamo/goldmark-pdf

Usage

Care has been taken to match the semantics of goldmark and its extensions.

The PDF renderer is created with pdf.New(). The returned value satisfies goldmark's renderer.Renderer interface, so it can be passed to goldmark.New() using the goldmark.WithRenderer() option.

markdown := goldmark.New(
    goldmark.WithRenderer(pdf.New()),
)

Options can also be passed to pdf.New(). Each option must satisfy the following interface:

// An Option interface is a functional option type for the Renderer.
type Option interface {
	SetConfig(*Config)
}

Here is the Config struct that is to be modified:

type Config struct {
	Context context.Context

	PDF PDF

	// A source for images
	ImageFS fs.FS

	// All other options have sensible defaults
	Styles Styles

	// A cache for the fonts
	FontsCache fonts.Cache

	// For debugging
	TraceWriter io.Writer

	NodeRenderers util.PrioritizedSlice
}

Some helper functions for adding options are already provided. See option.go
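When none of the provided helpers fit, a custom option can be written against the Option interface. The sketch below illustrates the pattern with local stand-ins: Config is cut down to a single field, and OptionFunc, withTrace and apply are hypothetical names that are not taken from the library.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// Config is a trimmed-down local stand-in for the renderer's Config struct.
// Only the field this sketch touches is reproduced.
type Config struct {
	TraceWriter io.Writer
}

// Option mirrors the interface shown above.
type Option interface {
	SetConfig(*Config)
}

// OptionFunc is a hypothetical adapter that lets a plain function
// satisfy Option.
type OptionFunc func(*Config)

// SetConfig calls the wrapped function.
func (f OptionFunc) SetConfig(c *Config) { f(c) }

// withTrace is a hypothetical option that sets the debug writer.
func withTrace(w io.Writer) Option {
	return OptionFunc(func(c *Config) { c.TraceWriter = w })
}

// apply starts from a default Config and runs every option against it,
// which is roughly what a constructor taking ...Option is expected to do.
func apply(opts ...Option) *Config {
	c := &Config{TraceWriter: io.Discard}
	for _, o := range opts {
		o.SetConfig(c)
	}
	return c
}

func main() {
	c := apply(withTrace(os.Stderr))
	fmt.Println(c.TraceWriter == os.Stderr)
}
```

With the real package, a value built this way would be passed to pdf.New() alongside the provided helpers, assuming the option type satisfies the library's own Option interface.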

An example with some more options:

goldmark.New(
    goldmark.WithRenderer(
        pdf.New(
            pdf.WithTraceWriter(os.Stdout),
            pdf.WithContext(context.Background()),
            pdf.WithImageFS(os.DirFS(".")),
            pdf.WithLinkColor("cc4578"),
            pdf.WithHeadingFont(pdf.GetTextFont("IBM Plex Serif", pdf.FontLora)),
            pdf.WithBodyFont(pdf.GetTextFont("Open Sans", pdf.FontRoboto)),
            pdf.WithCodeFont(pdf.GetCodeFont("Inconsolata", pdf.FontRobotoMono)),
        ),
    ),
)

Fonts

The fonts that can be used in the PDF are based on the Font struct:

// Represents a font.
type Font struct {
	CanUseForText bool
	CanUseForCode bool

	Category string
	Family   string

	FileRegular    string
	FileItalic     string
	FileBold       string
	FileBoldItalic string

	Type fontType
}

To be used for text, a font should have regular, italic, bold and bold-italic styles. Each of these has to be loaded separately.

To ease this process, variables have been generated for all the Google fonts that have these styles. For example:

var FontRoboto = Font{
	CanUseForCode:  false,
	CanUseForText:  true,
	Category:       "sans-serif",
	Family:         "Roboto",
	FileBold:       "700",
	FileBoldItalic: "700italic",
	FileItalic:     "italic",
	FileRegular:    "regular",
	Type:           fontTypeGoogle,
}

For code blocks, if any style other than regular is missing, the regular font is used in its place.

var FontMajorMonoDisplay = Font{
	CanUseForCode:  true,
	CanUseForText:  false,
	Category:       "monospace",
	Family:         "Major Mono Display",
	FileBold:       "regular",
	FileBoldItalic: "regular",
	FileItalic:     "regular",
	FileRegular:    "regular",
	Type:           fontTypeGoogle,
}
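The fallback rule for code fonts can be pictured with the sketch below. It is an illustration only: Font here is a local stand-in with just the four style fields, and withRegularFallback is a hypothetical helper, not the code that generated the variables above.

```go
package main

import "fmt"

// Font is a local stand-in holding only the four style fields of the
// Font struct shown earlier.
type Font struct {
	FileRegular    string
	FileItalic     string
	FileBold       string
	FileBoldItalic string
}

// withRegularFallback returns a copy of f in which every empty style
// points at the regular file. Hypothetical helper for illustration.
func withRegularFallback(f Font) Font {
	if f.FileItalic == "" {
		f.FileItalic = f.FileRegular
	}
	if f.FileBold == "" {
		f.FileBold = f.FileRegular
	}
	if f.FileBoldItalic == "" {
		f.FileBoldItalic = f.FileRegular
	}
	return f
}

func main() {
	// A font that ships a bold file but no italic variants.
	f := withRegularFallback(Font{FileRegular: "regular", FileBold: "700"})
	fmt.Printf("%+v\n", f)
}
```

Applied to a font that only ships a regular file, this yields the same shape as FontMajorMonoDisplay above, where all four fields are "regular".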

When the fonts are loaded, they are downloaded on the fly using the fonts package.

If you'd like to use a font outside of these, pass your own Font struct whose files have already been loaded into the PDF object you set in the Config. Be sure to set the FontType to FontTypeCustom so that we do not attempt to download it.

Contributing

Here's a list of things that I'd love help with:

More documentation
Testing
Finish the (currently buggy) implementation based on gopdf

License

MIT

Author

Stephen Afam-Osemene

Owner

Stephen Afam-Osemene

Part Programmer, Part Engineer, Part Entrepreneur. I have many interests that converge on improving lives with technology

https://github.com/stephenafamo/goldmark-pdf

Comments

Tables do not render correctly

If the cell contents are longer than the heading length of the column then the next column will overwrite the last part of the previous column. Column widths are determined based on the values in the header row. Column contents do not wrap (at least not without changing the row height, which I have not tested).
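One way to avoid the overlap would be to size each column from its widest cell rather than from the header alone. The sketch below is not the library's code: columnWidths is a hypothetical function, and it approximates width by rune count, whereas a real renderer would measure each string in the current font.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// columnWidths returns, for each column, the widest entry found in the
// header or in any row. Width is approximated by rune count.
func columnWidths(header []string, rows [][]string) []int {
	widths := make([]int, len(header))
	for i, h := range header {
		widths[i] = utf8.RuneCountInString(h)
	}
	for _, row := range rows {
		for i, cell := range row {
			if i >= len(widths) {
				break // ignore cells beyond the header's column count
			}
			if n := utf8.RuneCountInString(cell); n > widths[i] {
				widths[i] = n
			}
		}
	}
	return widths
}

func main() {
	header := []string{"ID", "Name"}
	rows := [][]string{{"1", "Alexandria"}, {"1000", "Bo"}}
	fmt.Println(columnWidths(header, rows))
}
```

This only addresses column sizing. Cells wider than the page would still need wrapping, which the comment above notes is not handled.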

This makes the table functionality not very useful except for the simplest of tables, which is a shame since otherwise the generated PDF looks very good.

Local links/navigation is not working

Local navigation such as:

# <a name="top"></a>Markdown Test Page

* [Headings](#Headings)
* [Paragraphs](#Paragraphs)
* [Blockquotes](#Blockquotes)
* [Lists](#Lists)
* [Horizontal rule](#Horizontal)
* [Table](#Table)
* [Code](#Code)
* [Inline elements](#Inline)

***

# <a name="Headings"></a>Headings

# Heading one

Sint sit cillum pariatur eiusmod nulla pariatur ipsum. Sit laborum anim qui mollit tempor pariatur nisi minim dolor. Aliquip et adipisicing sit sit fugiat commodo id sunt. Nostrud enim ad commodo incididunt cupidatat in ullamco ullamco Lorem cupidatat velit enim et Lorem. Ut laborum cillum laboris fugiat culpa sint irure do reprehenderit culpa occaecat. Exercitation esse mollit tempor magna aliqua in occaecat aliquip veniam reprehenderit nisi dolor in laboris dolore velit.

## Heading two

[[Top]](#top)

will render the links, but they won't work.

Image ALT is ignored

When an image cannot be loaded, its alt text is ignored:

![The San Juan Mountains are beautiful!](san-juan-mountains.jpggg "San Juan Mountains Alt Text")

Image caption is not centered

The caption of an image is rendered aligned to the left of the document instead of centered with the image.

For example: ![The San Juan Mountains are beautiful!](/assets/images/san-juan-mountains.jpg "San Juan Mountains")

will end up with the caption aligned to the left.

Small images are stretched for the entire width of the document

I have a test document that contains a small 176×176 px image. The image is scaled to the full width of the document, which does not look good and is most likely not desired: it would not be rendered that way in an HTML document unless specifically configured with CSS.

The code responsible for this behavior is: https://github.com/stephenafamo/goldmark-pdf/blob/master/renderer_funcs.go#L581

I think that if the image is narrower than the width of the document, it should be kept as is. Only if the image is wider should it be scaled down to fit the document.
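The proposed rule can be sketched as a small function. fitSize is a hypothetical name, and the sketch assumes the image size and the available width are already expressed in the same unit, so converting pixels to PDF units is left out.

```go
package main

import "fmt"

// fitSize returns the size at which to draw an image. Images that fit
// within maxW keep their natural size. Wider images are scaled down
// proportionally so that their width equals maxW.
func fitSize(imgW, imgH, maxW float64) (w, h float64) {
	if imgW <= maxW || imgW <= 0 {
		return imgW, imgH
	}
	scale := maxW / imgW
	return maxW, imgH * scale
}

func main() {
	fmt.Println(fitSize(176, 176, 500))  // small image, left untouched
	fmt.Println(fitSize(1000, 500, 500)) // wide image, scaled down
}
```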

Lack of image mime detection

The image renderer:

func (r *nodeRederFuncs) renderImage(w *Writer, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
	// while this has entering and leaving states, it doesn't appear
	// to be useful except for other markup languages to close the tag
	n := node.(*ast.Image)

	if entering {
		w.LogDebug("Image (entering)", fmt.Sprintf("Destination[%v] Title[%v]", string(n.Destination), string(n.Title)))
		// following changes suggested by @sirnewton01, issue #6
		// does file exist?
		imgPath := string(n.Destination) // <-- the path is the raw link destination
		imgFile, err := w.ImageFS.Open(imgPath)
		if err == nil {
			defer imgFile.Close()

			width, _ := w.Pdf.GetPageSize()
			mleft, _, mright, _ := w.Pdf.GetMargins()
			maxw := width - (mleft * 2) - (mright * 2)

			format := strings.ToUpper(strings.Trim(filepath.Ext(imgPath), ".")) // <-- the format is taken from the file extension
			w.Pdf.RegisterImage(imgPath, format, imgFile)
			w.Pdf.UseImage(imgPath, (mleft * 2), w.Pdf.GetY(), maxw, 0)
		} else {
			log.Printf("IMAGE ERROR: %v", err)
			w.LogDebug("Image (file error)", err.Error())
		}
	} else {
		w.LogDebug("Image (leaving)", "")
	}

	return ast.WalkContinue, nil
}

relies on the file extension in the path to determine the image type. If I use an HTTP-backed file system to render images that are linked via HTTP rather than stored locally, this fails unless the URL ends with the matching file extension, which is rarely the case.

Hence, there should be a built-in HTTP file system, and the MIME type should be detected from the file content using github.com/gabriel-vasile/mimetype or a similar library.
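As a rough illustration of content-based detection, the sketch below uses http.DetectContentType from the standard library in place of the mimetype library mentioned above. formatFromBytes and sniffFormat are hypothetical names, and the format labels are an assumption modelled on the upper-cased file extensions the renderer derives today.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
)

// formatFromBytes maps the leading bytes of a file to an upper-case
// format label. The labels are an assumption, not the library's API.
func formatFromBytes(head []byte) (string, bool) {
	switch http.DetectContentType(head) {
	case "image/png":
		return "PNG", true
	case "image/jpeg":
		return "JPG", true
	case "image/gif":
		return "GIF", true
	}
	return "", false
}

// sniffFormat reads at most 512 bytes from r, detects the format, and
// returns a reader that replays those bytes before the rest of r, so the
// whole image can still be consumed afterwards.
func sniffFormat(r io.Reader) (string, io.Reader, error) {
	head := make([]byte, 512)
	n, err := io.ReadFull(r, head)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return "", nil, err
	}
	head = head[:n]
	format, ok := formatFromBytes(head)
	if !ok {
		return "", nil, errors.New("unsupported image content")
	}
	return format, io.MultiReader(bytes.NewReader(head), r), nil
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n") // the 8-byte PNG signature
	format, _, err := sniffFormat(bytes.NewReader(png))
	fmt.Println(format, err)
}
```

Because the check looks at the bytes rather than the name, it would work the same for local files and for readers backed by an HTTP response.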

My quickly built fs is:


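import (
	"fmt"
	"io/fs"
	"mime"
	"net/http"
	"path/filepath"
	"strings"
	"time"
)

// HttpFs is an fs.FS that treats every opened name as a URL to fetch.
type HttpFs struct{}

func (f *HttpFs) Open(name string) (fs.File, error) {
	res, err := http.Get(name)
	if err != nil {
		return nil, err
	}
	// Reject non-2xx responses so that error pages are not read as images.
	if res.StatusCode < 200 || res.StatusCode > 299 {
		res.Body.Close()
		return nil, fmt.Errorf("GET %s: %s", name, res.Status)
	}
	return &HttpFile{r: res}, nil
}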

type HttpFile struct {
	r *http.Response
}

func (f *HttpFile) Stat() (fs.FileInfo, error) {
	return &HttpInfo{r: f.r}, nil
}

func (f *HttpFile) Read(p []byte) (int, error) {
	return f.r.Body.Read(p)
}

func (f *HttpFile) Close() error {
	return f.r.Body.Close()
}

type HttpInfo struct {
	r *http.Response
}

func (i *HttpInfo) Name() string {
	fn := strings.TrimPrefix(i.r.Request.URL.Path, "/")
	if fn == "" {
		if _, params, err := mime.ParseMediaType(i.r.Header.Get("Content-Disposition")); err == nil {
			fn = params["filename"]
		}
	}
	if filepath.Ext(fn) == "" {
		// Fall back to the Content-Type subtype, e.g. "image/png" gives ".png".
		mt, _, _ := mime.ParseMediaType(i.r.Header.Get("Content-Type"))
		if spl := strings.Split(mt, "/"); len(spl) == 2 {
			if fn == "" {
				fn = spl[0]
			}
			fn += "." + spl[1]
		}
	}
	return filepath.Base(fn)
}

func (i *HttpInfo) Size() int64 {
	return i.r.ContentLength
}

func (i *HttpInfo) Mode() fs.FileMode {
	return fs.ModeIrregular
}

func (i *HttpInfo) ModTime() time.Time {
	// http.ParseTime accepts each of the date formats allowed in HTTP headers.
	if t, err := http.ParseTime(i.r.Header.Get("Last-Modified")); err == nil {
		return t
	}
	return time.Time{}
}

func (i *HttpInfo) IsDir() bool {
	return false
}

func (i *HttpInfo) Sys() any {
	return i.r
}
