Newser is a simple utility to generate a PDF with your favorite news articles

Newser

A simple utility that crawls news sites or other resources and downloads their content into a PDF

Building

Make sure you have config.yaml set up and Go installed, then run go build cmd/newser.go, or run it straight from source with go run cmd/newser.go

Configuration

The configuration file guides the PDF building process. Right now only website parsing is supported.

The configuration file must have top-level defs (definitions), font and output properties. Right now defs must have a website property that contains the website definitions.
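As a rough illustration of the required shape, the check below walks an already decoded config and reports the first missing property. It is a minimal sketch, not Newser's actual code: validateConfig and hasKey are hypothetical names, the config is assumed to be decoded into a generic map, and the font and output values are placeholders because their real types are not described here.

```go
package main

import (
	"errors"
	"fmt"
)

// validateConfig checks the top-level shape described above: defs, font and
// output must exist, and defs must contain a website property.
// Hypothetical helper; cfg stands in for the decoded config.yaml.
func validateConfig(cfg map[string]interface{}) error {
	for _, key := range []string{"defs", "font", "output"} {
		if _, ok := cfg[key]; !ok {
			return fmt.Errorf("config: missing top-level property %q", key)
		}
	}
	if !hasKey(cfg["defs"], "website") {
		return errors.New("config: defs must contain a website property")
	}
	return nil
}

// hasKey reports whether v is a map that contains key. Both map shapes are
// accepted because YAML decoders differ in the key type they produce.
func hasKey(v interface{}, key string) bool {
	switch m := v.(type) {
	case map[string]interface{}:
		_, ok := m[key]
		return ok
	case map[interface{}]interface{}:
		_, ok := m[key]
		return ok
	}
	return false
}

func main() {
	// Placeholder values; only the presence of the keys is checked.
	cfg := map[string]interface{}{
		"defs":   map[string]interface{}{"website": []interface{}{}},
		"font":   "placeholder",
		"output": "placeholder",
	}
	if err := validateConfig(cfg); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("config ok")
}
```

Checking presence only keeps the sketch independent of how font and output are actually typed in the real config.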

A default config is included in the source repo.

Website Definitions

-   index: "index-page-url"
    indexSelector: "css-selector-for-articles-index"
    titleSelector: "title-selector-for-articles"
    linkSelector: "selector-for-the-link-for-the-article-content"
    linkAttr: "attribute-to-gather-from-link-selector"
    articleContainerSelector: "article-container-selector"
    articleContentSelector: "article-content-selector"
    ignoreString: "if-found-in-article-article-will-be-ignored"
    removeElems:
        - "selector-in-article-html-to-remove"
        - "someother-selector-in-article-html-to-remove"
    collectOnly: 0 # 0 if you want to collect all articles, or limit to N articles
    disable: 0 # 1 if you want to disable this entry 

The good thing is that you can be as specific with selectors as you want. If a website has multiple sections that contain articles, you can write multiple definitions for it and get only the articles that you want.
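To make the collectOnly and disable fields concrete, here is a minimal sketch of how two definitions for the same site could be filtered. The WebsiteDef struct, its Go field names, the helper functions and the example.com URLs are all hypothetical. Keeping the first N articles is an assumption about what "limit to N articles" means, and the struct tags only mirror the YAML keys shown above.

```go
package main

import "fmt"

// WebsiteDef mirrors a subset of the keys from a website definition.
// Hypothetical type; the real struct in Newser may differ.
type WebsiteDef struct {
	Index       string `yaml:"index"`
	CollectOnly int    `yaml:"collectOnly"`
	Disable     int    `yaml:"disable"`
}

// enabledDefs drops entries whose disable flag is set to 1.
func enabledDefs(defs []WebsiteDef) []WebsiteDef {
	var out []WebsiteDef
	for _, d := range defs {
		if d.Disable == 1 {
			continue
		}
		out = append(out, d)
	}
	return out
}

// limitArticles applies collectOnly: 0 keeps every link, N keeps the first N
// (assumed interpretation of the limit).
func limitArticles(links []string, collectOnly int) []string {
	if collectOnly <= 0 || collectOnly >= len(links) {
		return links
	}
	return links[:collectOnly]
}

func main() {
	// Two definitions for the same (made-up) site, one per section.
	defs := []WebsiteDef{
		{Index: "https://example.com/world", CollectOnly: 2},
		{Index: "https://example.com/sport", Disable: 1},
	}
	links := []string{"/a", "/b", "/c"} // stand-in for links found on an index page

	for _, d := range enabledDefs(defs) {
		fmt.Println(d.Index, limitArticles(links, d.CollectOnly))
	}
	// prints: https://example.com/world [/a /b]
}
```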

Deps

Top-level deps are:

  • fpdf - "github.com/go-pdf/fpdf" - For generating PDFs
  • yaml - "gopkg.in/yaml.v2" - For parsing YAML
  • colly - "github.com/gocolly/colly/v2" - For crawling websites

Contributing

Right now the project mostly exists to serve my own desire to read news on my Supernote (awesome gadget btw), so if you want to do something clever, just create a PR.

Licence

The licence is free for personal use but paid for commercial use. Get in touch if you want to use the utility or its code for commercial purposes.

Owner
Nenad