export stripTags from html/template as strip.StripTags

HTML StripTags for Go


This is a Go package containing an extracted version of the unexported stripTags function in html/template/html.go.

⚠️ This package does not protect against untrusted input. Please use bluemonday if you have untrusted data ⚠️

Background

  • The stripTags function in html/template/html.go is very useful, but it is not exported.
  • Requests to export it were made on GitHub without success.
  • This package hosts the work done by Christopher Hesse, originally provided in this Gist.

Installation

$ go get github.com/grokify/html-strip-tags-go

Usage

package main

import (
    "fmt"

    strip "github.com/grokify/html-strip-tags-go" // package name is strip
)

func main() {
    original := "<h1>Hello World</h1>"
    stripped := strip.StripTags(original) // => "Hello World"
    fmt.Println(stripped)
}
Owner
John Wang
Creator, Developer, PM @ RingCentral
Similar Resources

network .md into .html with plaintext files

plain network markdown files into html with plaintext files plain is a static-site generator operating on plaintext files containing a small set of co

Dec 10, 2022

golang program that simply converts html into markdown

Simply converts html to markdown Just a simple project I wrote in golang to convert html to markdown, surprisingly works decent for a lot of websites

Oct 23, 2021

Simple Markdown to Html converter in Go.

Markdown To Html Converter Simple Example package main import ( "github.com/gopherzz/MTDGo/pkg/lexer" "github.com/gopherzz/MTDGo/pkg/parser" "fm

Jan 29, 2022

This command line tool converts Thunderbird's exported RSS .eml file to .html file

thunderbird-rss-html This command line tool converts .html to .epub with images fetching. Install go get github.com/gonejack/thunderbird-rss-html Us

Dec 15, 2021

Develop Sites Faster with HTML-Includer!

HTML Includer Develop Sites Faster with HTML Includer! How to Install Install HTML Includer on your machine: go install github.com/GameWorkstore/html-

Jan 1, 2022

HTML, CSS and SVG static renderer in pure Go

Web render This module implements a static renderer for the HTML, CSS and SVG formats. It consists for the main part of a Golang port of the awesome W

Apr 19, 2022

Golang library for converting Markdown to HTML. Good documentation is included.

md2html is a golang library for converting Markdown to HTML. Install go get github.com/wallblog/md2html Example package main import( "github.com/wa

Jan 11, 2022

Godown - Markdown to HTML converter made with Go

Godown Godown is a tiny-teeny utility that helps you convert your Markdown files

Jan 18, 2022

A complete Liquid template engine in Go

Liquid Template Parser liquid is a pure Go implementation of Shopify Liquid templates. It was developed for use in the Gojekyll port of the Jekyll sta

Dec 15, 2022
Comments
  • project idea sounds good, but suffers from possible security issues

    Hi,

    I just wanted to let you know that I made a project similar to yours but, after discussing it with other gophers on reddit, decided to deprecate it.

    Here's why:

    Using the stripTags function could be dangerous. From https://golang.org/pkg/html/template/#hdr-Security_Model:

    This package assumes that template authors are trusted

    stripTags resides within html/template and works according to those guarantees. This means that certain XSS attacks might go through undetected: we strip HTML, but attackers have crafted many payloads that circumvent simple sanitizers.

    A fast, reliable and already battle-worn library to strip HTML tags is bluemonday.

    They've got the bluemonday.StrictPolicy() mode:

    bluemonday.StrictPolicy() is a mode which can be thought of as equivalent to stripping all HTML elements and their attributes, as it has nothing on its whitelist. An example usage scenario would be blog post titles where HTML tags are not expected at all and, if they are present, the elements and the content of the elements should be stripped. This is a very strict policy.

    Example:

    stripped := bluemonday.StrictPolicy().Sanitize(`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`)
    // Output: Google
    

    That is exactly what you want when stripping arbitrary HTML content: a library that understands XSS attacks and knows how to defuse them, even to the point of stripping all tags and leaving only plain text. No tags, no worry 😄

    I just wanted to raise awareness that there may be a reason why stripTags is not exported, and that there might be hidden pitfalls.

    Greetings Denis

  • !DOCTYPE html tag in result

    With this site, http://cocon.se/, <!DOCTYPE html> shows up in the resulting text content. Is that expected?

    Test

    	originalTest := "<!DOCTYPE html><h1>Hello World</h1>"
    	strippedTest := strip.StripTags(originalTest) // => "<!DOCTYPE html>Hello WorldHello World"
    	println(strippedTest)
    
  • escapes characters in result

    For some HTML content, escaped characters remain in the resulting text content.

    Test

    	originalTest := "<h1>J&rsquo;aime la visualisation simple et explicite du maillage.</h1><p>Très intéressant. C&rsquo;est l&rsquo;application au e-commerce qui m’intéresse maintenant !</p>"
    	strippedTest := strip.StripTags(originalTest) // => "J&rsquo;aime la visualisation simple et explicite du maillage.Très intéressant. C&rsquo;est l&rsquo;application au e-commerce qui m’intéresse maintenant !"
    	println(strippedTest)
    

    Do I need to call html.UnescapeString() first? Or is there a function or parameter in the html-strip-tags-go package that does this?
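
A minimal sketch of the html.UnescapeString() approach raised in the question above, using only the standard library. The helper name decodeEntities is hypothetical and not part of html-strip-tags-go, and the input string stands in for text that has already had its tags stripped. Decoding after stripping, rather than before, is an assumption made here so that encoded sequences such as &lt;b&gt; are not turned into real tags and then removed.

```go
package main

import (
	"fmt"
	"html"
)

// decodeEntities is a hypothetical helper: it decodes HTML character
// references (for example &rsquo; or &amp;) in text whose tags have
// already been stripped.
func decodeEntities(stripped string) string {
	return html.UnescapeString(stripped)
}

func main() {
	// Stand-in for the output of a tag-stripping step; the entities are still encoded.
	stripped := "J&rsquo;aime la visualisation simple et explicite du maillage."
	fmt.Println(decodeEntities(stripped))
}
```

The decoded text can contain literal characters such as < and &, so it should be escaped again before being written back into an HTML page.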

bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS

bluemonday bluemonday is a HTML sanitizer implemented in Go. It is fast and highly configurable. bluemonday takes untrusted user generated content as

Jan 4, 2023
A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library

goq Example import ( "log" "net/http" "astuart.co/goq" ) // Structured representation for github file name table type example struct { Title str

Dec 12, 2022
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.

html-to-markdown Convert HTML into Markdown with Go. It is using an HTML Parser to avoid the use of regexp as much as possible. That should prevent so

Jan 6, 2023
htmlquery is golang XPath package for HTML query.

htmlquery Overview htmlquery is an XPath query package for HTML, lets you extract data or evaluate from HTML documents by an XPath expression. htmlque

Jan 4, 2023
Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and struct tags for golang crawler

Pagser Pagser inspired by page parser. Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and str

Dec 13, 2022
Produces a set of tags from given source. Source can be either an HTML page, Markdown document or a plain text. Supports English, Russian, Chinese, Hindi, Spanish, Arabic, Japanese, German, Hebrew, French and Korean languages.

Tagify Gets STDIN, file or HTTP address as an input and returns a list of most popular words ordered by popularity as an output. More info about what

Dec 19, 2022
Golang HTML to plaintext conversion library

html2text Converts HTML into text of the markdown-flavored variety Introduction Ensure your emails are readable by all! Turns HTML into raw text, usef

Dec 28, 2022
Templating system for HTML and other text documents - go implementation

FAQ What is Kasia.go? Kasia.go is a Go implementation of the Kasia templating system. Kasia is primarily designed for HTML, but you can use it for any

Mar 15, 2022
Take screenshots of websites and create PDF from HTML pages using chromium and docker

gochro is a small docker image with chromium installed and a golang based webserver to interact with it. It can be used to take screenshots of w

Nov 23, 2022
Frongo is a Golang package to create HTML/CSS components using only the Go language.

Frongo Frongo is a Go tool to make HTML/CSS document out of Golang code. It was designed with readability and usability in mind, so HTML objects are c

Jul 29, 2021