A Docker-powered stateless API for PDF files.

Gotenberg

Documentation · 🔥 Live Demo


Gotenberg provides a developer-friendly API to interact with powerful tools like Chromium and LibreOffice to convert many documents (HTML, Markdown, Word, Excel, etc.) to PDF, transform them, merge them, and more!

Quick Start

Open a terminal and run the following command:

docker run --rm -p 3000:3000 gotenberg/gotenberg:7

Alternatively, using the historic Docker repository from our sponsor TheCodingMachine:

docker run --rm -p 3000:3000 thecodingmachine/gotenberg:7

The API is now available on your host at http://localhost:3000.
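
As a rough illustration of talking to the running container, the sketch below posts a URL to the Chromium route and saves the response. The route `/forms/chromium/convert/url` and the `url` form field are taken from the curl example in the comments further down; the helper names `encode_multipart` and `convert_url_to_pdf` are hypothetical, and the documentation remains the reference for the actual API.

```python
import urllib.request
import uuid

# Assumed base URL: the Quick Start container published on port 3000.
BASE_URL = "http://localhost:3000"


def encode_multipart(fields, boundary=None):
    """Encode plain text form fields as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def convert_url_to_pdf(page_url, out_path, timeout=60):
    """POST a URL to the Chromium route and save the response body."""
    body, content_type = encode_multipart({"url": page_url})
    request = urllib.request.Request(
        BASE_URL + "/forms/chromium/convert/url",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(out_path, "wb") as out_file:
            out_file.write(response.read())
```

With the container running, `convert_url_to_pdf("https://example.com", "example.pdf")` would be the call to try; nothing is sent until that function is invoked.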

Head to the documentation to learn how to interact with it 🚀

Sponsors

TheCodingMachine

Comments
  • Anyone tried this on Cloudrun? GCP.

    Cloud Run is a managed compute platform that automatically scales your stateless containers. Cloud Run is serverless: it abstracts away all infrastructure management, so you can focus on what matters most: building great applications.

    Has anyone tried Gotenberg on Cloud Run? I'm assuming it would work, as it's stateless. It could bring huge cost savings...

  • Latest 7.5.0 docker image intermittently crashes

    Every so often, the latest 7.5 docker image will crash on startup. Here's the docker logs:

    pdfgen_1     | 
    pdfgen_1     |   _____     __           __               
    pdfgen_1     |  / ___/__  / /____ ___  / /  ___ _______ _
    pdfgen_1     | / (_ / _ \/ __/ -_) _ \/ _ \/ -_) __/ _ '/
    pdfgen_1     | \___/\___/\__/\__/_//_/_.__/\__/_/  \_, / 
    pdfgen_1     |                                    /___/
    pdfgen_1     | 
    pdfgen_1     | A Docker-powered stateless API for PDF files.
    pdfgen_1     | Version: 7.5.0
    pdfgen_1     | -------------------------------------------------------
    pdfgen_1     | [SYSTEM] modules: api chromium gc libreoffice logging pdfcpu pdfengines pdftk prometheus qpdf uno uno-pdfengine webhook 
    pdfgen_1     | [SYSTEM] gc: application started
    pdfgen_1     | [SYSTEM] api: server listening on port 3000
    pdfgen_1     | [SYSTEM] prometheus: collecting metrics
    pdfgen_1     | [SYSTEM] pdfengines: pdfcpu pdftk qpdf uno-pdfengine
    pdfgen_1     | [FATAL] starting uno: start long-running LibreOffice listener: waiting for the LibreOffice listener socket to be available: context deadline exceeded
    

    Is this a timeout issue? I don't believe the previous docker image (7.4.3) had this issue.
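
For readers unfamiliar with the message: "context deadline exceeded" is the standard Go error for a context whose deadline has passed, so the fatal line above says the wait for the listener socket ran out of time. The sketch below shows the general "poll until ready or give up at a deadline" pattern in Python. It is not Gotenberg's code; `port_is_open` and `wait_until` are hypothetical names.

```python
import socket
import time


def port_is_open(host, port, connect_timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_until(probe, deadline_seconds, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Call probe() repeatedly until it returns True or the deadline passes.

    Returns True if the probe succeeded in time, False otherwise (the
    analogue of a "deadline exceeded" error).
    """
    give_up_at = clock() + deadline_seconds
    while True:
        if probe():
            return True
        if clock() >= give_up_at:
            return False
        sleep(interval)
```

A caller would write something like `wait_until(lambda: port_is_open("127.0.0.1", port), 10)`, where the port number and the 10-second budget are placeholders. A slow machine or a cold start can make such a wait fail intermittently even when nothing is broken.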

  • Performance of Libreoffice PDF generation

    We are currently using a LibreOffice + unoconv + API (Node) architecture to generate our PDFs, and we are considering switching to the more recent, maintained and beautifully coded Gotenberg ;) but we noticed some performance drawbacks.

    The doc says "It starts a dedicated LibreOffice instance for each request." This seems to add about 2 to 3 s to the generation time, compared with our current architecture, where the LibreOffice instance is constantly idling, waiting for a new request (unoconv in listener mode).

    3 additional seconds per conversion feels like quite a long time =/ Is there any way to improve that? Do you envision an option in the future to keep the LibreOffice instance open for longer than one request?

    Also, it says "It starts a dedicated LibreOffice instance for each REQUEST", but according to our preliminary tests it seems to do so for every conversion, not every request: if we make a request with multiple files to convert (and merge), the conversion time seems to be linear in the number of docs (versus our expectation of faster conversion for subsequent files once the LibreOffice instance has been started for the first one). Can you confirm that this is what is happening?
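
To make the two scenarios in this report concrete, here is a small back-of-the-envelope model. The 3 s startup cost is the upper end of the figure reported above; the 1 s of conversion work per file is a made-up placeholder, and `total_seconds` is a hypothetical helper, not a measurement of Gotenberg.

```python
def total_seconds(files, startup, convert, reuse_instance):
    """Estimated wall-clock time to convert `files` documents in sequence."""
    if files == 0:
        return 0.0
    # One startup per file, or a single startup shared by the whole request.
    starts = 1 if reuse_instance else files
    return starts * startup + files * convert


# Assumed numbers: 3 s startup and 1 s of conversion work per file.
for n in (1, 5, 10):
    per_conversion = total_seconds(n, startup=3.0, convert=1.0, reuse_instance=False)
    shared = total_seconds(n, startup=3.0, convert=1.0, reuse_instance=True)
    print(n, per_conversion, shared)
```

Under these assumptions, five files cost 20 s with one instance per conversion and 8 s with a shared instance. Both curves are linear in the number of files, so the reporter's linear timings alone do not settle the question. What distinguishes the two cases is the slope, which is startup plus conversion time versus conversion time alone.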

  • `unoconv` listener process may become a zombie

    On the Live Demo, the unoconv listener process became a zombie for some reason.

    While unoconv is still working (it creates a dedicated LibreOffice instance in such a scenario), the conversions are way slower.

    In order to prevent this from happening, we have to:

    1. Check the state of the unoconv listener process.
    2. If state is invalid (no such process / zombie), try to restart the listener.
    3. If restart fails, /health endpoint should return an error.
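
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: it assumes a Linux host where `/proc/<pid>/stat` is readable, the function names are hypothetical, and the 503 status is one possible choice for "return an error".

```python
def parse_proc_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split after the LAST closing parenthesis.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid):
    """Return the state letter for pid, or None if there is no such process."""
    try:
        with open(f"/proc/{pid}/stat", encoding="utf-8", errors="replace") as stat_file:
            return parse_proc_state(stat_file.read())
    except (FileNotFoundError, ProcessLookupError):
        return None


def health_check(get_state, restart):
    """Steps 1 to 3: check the state, try a restart, report health.

    get_state() returns a state letter or None; restart() returns True on
    success. Returns an (http_status, message) pair for a /health handler.
    """
    state = get_state()
    if state is not None and state != "Z":  # "Z" marks a zombie on Linux
        return 200, "listener running"
    if restart():
        return 200, "listener restarted"
    return 503, "listener down and restart failed"
```

Passing `get_state` and `restart` as callables keeps the decision logic separate from process handling, so it can be exercised without a real listener.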
  • feat: add qpdf engine

    I never programmed in Go before, so I don't clearly understand what I am doing.

    This PR aims to add QPDF as a tool to merge PDFs, because it can deal with invalid PDF files, which PDFtk cannot do.

    Also, I think it would be cool if Gotenberg had another tool in it.
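
For context, merging whole files with the qpdf command-line tool looks roughly like the sketch below. It shows the qpdf CLI driven from Python and is not necessarily how the PR invokes it; `qpdf_merge_command` and `merge` are hypothetical names.

```python
import subprocess


def qpdf_merge_command(inputs, output):
    """Build the argv for merging whole PDFs, in order, with the qpdf CLI."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    # --empty starts from a blank document; --pages ... -- lists the sources.
    return ["qpdf", "--empty", "--pages", *inputs, "--", output]


def merge(inputs, output):
    """Run qpdf and fail only on real errors."""
    result = subprocess.run(qpdf_merge_command(inputs, output))
    # qpdf exits with 3 when it only emitted warnings, which is likely
    # when the inputs are damaged files that it managed to repair.
    if result.returncode not in (0, 3):
        raise RuntimeError(f"qpdf failed with exit status {result.returncode}")
```

Accepting exit status 3 matters for the use case named in the PR: a strict "non-zero means failure" check would reject exactly the damaged-but-recoverable files that motivate using qpdf.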

  • Support Proxy Server for Chromium Headless

    Hi

    Is there any way to support a proxy server for the chromium module? I have tried setting the system proxy, but that doesn't seem to do the trick. So there's still the --proxy-server flag for chromium but I guess that's not supported, is it?

  • HTML to PDF rendering fails with 'rpc error: Printing failed (code = -32000)'

    When I send a significantly large payload of HTML (about 800kb) that contains a lot of images (say ~3-4 per page, 250 pages in total) - rendering fails with the following error message:

    {
        "level": "error",
        "msg": "cdp.Page: PrintToPDF: rpc error: Printing failed (code = -32000)",
        "op": "xhttp.htmlHandler: xhttp.convertSync: printer.chromePrinter.Print",
        "time": "2020-09-22T18:04:50Z",
        "trace": "s1azQwaNSYBE6zK3y5R7VhstCYbPtrix"
    }
    

    and

    {
        "level": "debug",
        "msg": "[0922/180450.372244:ERROR:print_render_frame_helper.cc(1889)] Printing failed.",
        "op": "stderr.google-chrome-stable.--no-sandbox.--headless.--disable-dev-shm-usage.--font-render-hinting=none.--remote-debugging-port=9222.--disable-gpu.--disable-translate.--disable-extensions.--disable-background-networking.--safebrowsing-disable-auto-update.--disable-sync.--disable-default-apps.--hide-scrollbars.--metrics-recording-only.--mute-audio.--no-first-run",
        "time": "2020-09-22T18:04:50Z",
        "trace": "system"
    }
    

    Is there anything I can do / tweak in the environment variables to mitigate this? I've already increased the RPC buffer without success. System is running on AWS Fargate, 2vCPUs, 8GB of RAM.

    Expected Behavior

    Rendered document is returned.

    Current Behavior

    Rendering fails with HTTP 500.

    Your Environment

    • Version used: 6.3.0
    • Operating System and version: Docker container running on AWS Fargate 2/8GB
  • Very slow POST request to gotenberg to convert html -> pdf

    I noticed that sending a POST request with a file of 150,344 bytes is slow. In my local environment it takes about 4 seconds to get the response. In my production environment it takes about 40 seconds! We use the Python requests library and send just the attached file to the server. Do we need to use some extra options, such as zipping the HTML file?

  • Memory usage constantly increasing

    Hey there,

    We're experiencing an issue with Gotenberg whereby the memory usage constantly increases over time, up to the point that the container is killed by the container orchestrator, ECS in this instance. Here's a graph of Gotenberg's memory usage over time. [image: graph of memory usage over time]

    The large drops are ECS killing the container when the memory limit is reached. (the last restart is a manual one by me) We've given the container 512MB of memory in this example.

    Have you experienced this behaviour yourselves? Is 512MB sufficient? (We used 512MB as it is in the docs as a 'production' config) Is there a possible memory leak somewhere?

    Many thanks in advance!

    Your Environment

    • Version used: 5.1.0
    • Operating System and version: vendor provided docker image.
  • Bug 7.4.1 - Context deadline exceeded

    Hi @gulien, everything was fine with 7.4.0, but we just upgraded to 7.4.1, and when running the container with these parameters we get this error:

    docker run --rm -d -p xxxx:3000 --name gotenberg xxxxx/gotenberg gotenberg --chromium-disable-routes --pdfengines-disable-routes --webhook-disable --prometheus-disable-collect --prometheus-disable-route-logging

    [image: screenshot of the error]

    If we downgrade back to 7.4.0, it starts working again (same parameters). Since the latest release is all about timeouts, it seems like a big coincidence ^^

    Is this intended, by the way? https://github.com/gotenberg/gotenberg/compare/v7.4.0...v7.4.1#diff-51bc8303a14a97d22bd6cf1de23da23a89d419b96fe2e37972552a0b4cec8f77R393

  • go mod incompatibility

    Trying to use this module in another go mod enabled repo of ours is causing issues with some of the unused dependencies (particularly delve). Running go mod tidy fixes these issues.

    Expected Behavior

    GO111MODULE=on go get -u github.com/thecodingmachine/[email protected]

    This should allow us to use the go module.

    Current Behavior

    The gotenberg module is marked as incompatible.

    Possible Solution

    run go mod tidy

    Context

    We are unable to use the go module for talking to the gotenberg service in our k8s cluster.

    I have a commit ready on a fork. Wanted to raise an issue first, happy to raise PR if accepted.

  • Converter doesn't work properly with DOCX files

    Hello!

    I run Gotenberg 7.6.0 in k8s. I noticed that handling small DOCX files (~200-500KB) fails with status 503 and the following error message:

    {"level":"error","ts":1663864359.1665294,"logger":"api","msg":"convert to PDF: unoconv PDF: context done: context deadline exceeded","trace":"16a6c3a6-61e9-4e42-bda6-94aa43d3f46f","remote_ip":"172.17.0.1","host":"localhost:3000","uri":"/forms/libreoffice/convert","method":"POST","path":"/forms/libreoffice/convert","referer":"","user_agent":"Apache-HttpClient/4.5.10 (Java/1.8.0_202)","status":503,"latency":600003005400,"latency_human":"10m0.0030054s","bytes_in":295267,"bytes_out":19}

    I have also applied all suggestions from https://gotenberg.dev/docs/get-started/kubernetes. The combination of the following parameters doesn't help:

    • --uno-listener-start-timeout=60s
    • --uno-listener-restart-threshold=0

    Finally, I created a single dummy Test.docx file and started Gotenberg in Docker on my local machine (i9 CPU, 16GB RAM) with the following command:

    docker run --rm -p 3000:3000 gotenberg/gotenberg:7.6.0 gotenberg --api-disable-health-check-logging --uno-listener-restart-threshold=0 --api-timeout 1200s

    With curl, I tried to convert my dummy file:

    curl --noproxy '*' --request POST 'http://localhost:3000/forms/libreoffice/convert' --form '[email protected]"Test.docx"' -o Test.pdf

    After 20 minutes, I still had not received my PDF file.

    Why is it not able to convert the file, which is so small? Or maybe I'm doing something wrong?
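
One detail worth reading off the log entry in this report: the `latency` value lines up with `latency_human` only if it is counted in nanoseconds, which is how Go durations are stored. The sketch below does that conversion on a shortened copy of the entry; `latency_of` is a hypothetical helper, and the nanosecond unit is an inference from the two fields, not something the report states.

```python
import json
from datetime import timedelta

# Shortened copy of the log entry quoted in the report.
LOG_LINE = '{"status":503,"latency":600003005400,"latency_human":"10m0.0030054s"}'


def latency_of(log_line):
    """Return the request latency as a timedelta, assuming nanoseconds."""
    entry = json.loads(log_line)
    # timedelta has microsecond resolution, so drop the sub-microsecond part.
    return timedelta(microseconds=entry["latency"] // 1000)


print(latency_of(LOG_LINE))
```

The result is ten minutes plus about 3 ms, so the conversion did not fail quickly. The request appears to have run until a ten-minute limit was reached and was then answered with the 503.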

  • Cloud Run error 'failed to load /usr/bin/tini'

    https://github.com/gotenberg/gotenberg/blob/main/build/Dockerfile.cloudrun seems like it should work out of the box, but it doesn't.

    Getting this error: terminated: Application failed to start: Failed to create init process: failed to load /usr/bin/tini: exec format error

    What can I do about it?

  • feat(chromium): add ability to export images

    Hi, this PR implements https://github.com/gotenberg/gotenberg/issues/110

    First time using go, hope the code is clear enough, didn't add any test though

    How

    • Based on a format parameter:
      • pdf: default if not specified
      • img: exports a png if quality parameter is 100 (default), otherwise a jpeg

    Example

    curl --request POST 'http://localhost:3000/forms/chromium/convert/url' \
    --form 'url="https://www.google.com"' --form 'landscape="true"' --form 'marginTop="1"' \
    --form 'marginBottom="1"' --form 'format="img"' -o my.png
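
The selection rule listed under "How" can be written down as a small function. This is a paraphrase of the PR description, not the PR's Go code; `output_type` is a hypothetical name, and rejecting unknown format values is an assumption.

```python
def output_type(fmt=None, quality=100):
    """Map the format and quality form fields to the kind of file produced."""
    if fmt is None or fmt == "pdf":
        # pdf is the default when no format is specified.
        return "pdf"
    if fmt == "img":
        # Quality 100 (the default) means PNG; anything else means JPEG.
        return "png" if quality == 100 else "jpeg"
    # Assumption: any other value is treated as an error.
    raise ValueError(f"unsupported format: {fmt!r}")
```

In the curl example, `format="img"` with no quality field therefore yields a PNG, which matches the `-o my.png` output name.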
    

    Thank you for the work on gotenberg !

  • What are the differences between version 6.x and 7.x?

    I can't find any documentation on the differences between 6.x and 7.x. I'm curious what has changed and what has improved (as in: is it processing faster or better? Is the API slightly different?).

  • add encrypt route

    First time even touching Go. Implemented encryption of PDFs for qpdf and pdfcpu. PDFtk also supports encryption, but its behaviour was weird, so I dropped it for now.

    The API description was extended, but there are no tests so far, because I have no idea how they work in Go.

    Please give feedback :)
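
For reference, password-protecting a file with the qpdf command-line tool takes a user password, an owner password and a key length. The sketch below only builds that command line; it does not reflect the PR's route or parameters, and `qpdf_encrypt_command` is a hypothetical name.

```python
def qpdf_encrypt_command(src, dst, user_password, owner_password):
    """Build the argv for encrypting src into dst with a 256-bit key."""
    return [
        "qpdf", "--encrypt", user_password, owner_password, "256",
        "--", src, dst,
    ]
```

Passing the result to `subprocess.run` as a list, without a shell, avoids any quoting issues with passwords that contain spaces or shell metacharacters.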

goldmark-pdf is a renderer for goldmark that allows rendering to PDF.

A PDF renderer for the goldmark markdown parser.

Aug 27, 2022
Golang wrapper for Exiftool : extract as much metadata as possible (EXIF, ...) from files (pictures, pdf, office documents, ...)

go-exiftool go-exiftool is a golang library that wraps ExifTool. ExifTool's purpose is to extract as much metadata as possible (EXIF, IPTC, XMP, GPS,

Sep 17, 2022
Go-wk - PDF Generation API with wkhtmltopdf

Simple PDF Generation API with wkhtmltopdf Quick start Clone the repo locally an

Jan 25, 2022
A PDF processor written in Go.

pdfcpu: a Go PDF processor pdfcpu is a PDF processing library written in Go supporting encryption. It provides both an API and a CLI. Supported are al

Sep 21, 2022
A simple library for generating PDF written in Go lang

gopdf gopdf is a simple library for generating PDF document written in Go lang. Features Unicode subfont embedding. (Chinese, Japanese, Korean, etc.)

Sep 21, 2022
A PDF document generator with high level support for text, drawing and images

GoFPDF document generator Package go-pdf/fpdf implements a PDF document generator with high level support for text, drawing and images. Features UTF-8

Sep 7, 2022
PDF tools for reMarkable tablets

rm-pdf-tools - PDF tools for reMarkable Disclaimer: rm-pdf-tools is currently in a very early version, bugs are to be expected. Furthermore, the inten

Sep 4, 2022
A command line tool for mainly exporting logbook records from Google Spreadsheet to PDF file in EASA format

Logbook CLI This is a command line tool for mainly exporting logbook records from Google Spreadsheet to PDF file in EASA format. It also supports rend

Feb 6, 2022
PDF file parser

#pdf A pdf document parsing and modifying library The libary provides functions to parse and show elements in PDF documents. It checks the validity

Nov 7, 2021
create PDF from ASCII File for Cable labels

CableLable create PDF from ASCII File for Cable labels file format is one label per line, a line containing up to 3 words, each word is a line on the

Nov 8, 2021
Convert document to pdf with golang

Convert document to pdf Build docker: docker build --pull --rm -f "Dockerfile" -t convertdocument:latest "." docker run -p 3000:3000 registry.gitlab.

Nov 29, 2021
Ghostinthepdf - This is a small tool that helps to embed a PostScript file into a PDF

This is a small tool that helps to embed a PostScript file into a PDF in a way that GhostScript will run the PostScript code during the

Aug 10, 2022
Read data from rss, convert in pdf and send to kindle. Amazon automatically convert them in azw3.

Kindle-RSS-PDF-AZW3 The Kindle RSS PDF AZW3 is a personal project. The Kindle RSS PDF AZW3 is a personal project. I received a Kindle for Christmas, a

Jan 10, 2022
Newser is a simple utility to generate a pdf with you favorite news articles

Newser A simple utility to crawl some news sites or other resources and download content into a pdf Building Make sure you have config.yaml setup and

Aug 7, 2022
PDF Annotator of Nightmares 🎃

PDFrankenstein is a GUI tool that intends to fill the gap on Linux where a good capable PDF annotator like Adobe Acrobat does not exist. What can you

Sep 24, 2022
Split text files into gzip files with x lines

hakgzsplit split lines of text into multiple gzip files

Jun 21, 2022
Easily create Go files from stub files

go-stubs Easily create .go files from stub files in your projects. Usage go get github.com/nwby/go-stubs Create a stub file: package stubs type {{.Mo

Jan 27, 2022
app-services-go-linter plugin analyze source tree of Go files and validates the availability of i18n strings in *.toml files

app-services-go-linter app-services-go-linter plugin analyze source tree of Go files and validates the availability of i18n strings in *.toml files. A

Nov 29, 2021