Hi,
I just wanted to let you know that I made a project similar to yours, but after discussing it with other gophers on Reddit, I decided to deprecate it.
Here's why:
Using the stripTags function could be dangerous. From https://golang.org/pkg/html/template/#hdr-Security_Model:
This package assumes that template authors are trusted
stripTags resides within html/template and works according to those guarantees. This means that certain XSS attacks might go through undetected: we strip HTML, but attackers have crafted many payloads specifically to circumvent simple sanitizers.
A fast, reliable and already battle-tested library for stripping HTML tags is bluemonday.
They've got the bluemonday.StrictPolicy() mode:
bluemonday.StrictPolicy() is a mode which can be thought of as equivalent to stripping all HTML elements and their attributes, as it has nothing on its whitelist. An example usage scenario would be blog post titles, where HTML tags are not expected at all; if they are present, the elements and the content of the elements should be stripped. This is a very strict policy.
Example:
stripped := bluemonday.StrictPolicy().Sanitize(`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`)
// Output: Google
That is exactly what you want when stripping arbitrary HTML content: a library that understands XSS attacks and knows how to defuse them, even to the point of stripping all tags and leaving only plain text. No tags, no worry 😄
I just wanted to raise awareness that there may be a reason why stripTags is not exported, and that there might be hidden pitfalls.
Greetings
Denis