A light libxml wrapper for Go


LibXML bindings for the Go programming language.

By Zhigang Chen and Hampton Catlin

This is a major rewrite of v0 in the following areas:

  • Separation of XML and HTML
  • Put more burden of memory allocation/deallocation on Go
  • Fragment parsing -- no more deep-copy
  • Serialization
  • Some API adjustments


Installation

# Linux
sudo apt-get install libxml2-dev
# Mac
brew install libxml2

go get github.com/moovweb/gokogiri

Running tests

go test github.com/moovweb/gokogiri/...

Basic example

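package main

import (
  "io/ioutil"
  "net/http"
  "github.com/moovweb/gokogiri"
)

func main() {
  // fetch and read a web page
  resp, err := http.Get("http://www.google.com")
  if err != nil {
    panic(err)
  }
  defer resp.Body.Close()
  page, err := ioutil.ReadAll(resp.Body)
  if err != nil {
    panic(err)
  }

  // parse the web page
  doc, err := gokogiri.ParseHtml(page)
  if err != nil {
    panic(err)
  }
  // important -- don't forget to free the resources when you're done!
  defer doc.Free()

  // perform operations on the parsed page -- consult the tests for examples
}
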
  • memory leak under heavy load

    I am parsing around 200-300 HTML snippets of about 3 KB each per second, which in itself proves how cool your lib is ;) Sadly, it's leaking memory at around 1-2 MB/min. Not constantly though, so I am guessing it could be some kind of error while parsing.

    If I can help you fix this, let me know.



  • can't seem to easily build on OS X

    The README is obviously outdated since the makefile is gone, but I still didn't manage to build/install on Mountain Lion:


    I installed libxml2 from Homebrew and updated the xpath import statement to reflect the path of the brew files. I then tried to build and got an error.

    An updated README would be much appreciated, since this lib seems very useful.


    • Matt
  • change import paths; fix `go get`

    Currently this package refers to its imports by a path, gokogiri/..., that doesn't exist. This makes it impossible to install under any other path (e.g. "github.com/moovweb/gokogiri"). I think this can be resolved by turning all local references to gokogiri components into relative imports.

    For example, in gokogiri/html/document.go the reference to util should be "../util".

    ~$ pkg-config --cflags libxml-2.0 libxml-2.0
    ~$ go get github.com/moovweb/gokogiri
    package gokogiri/html: unrecognized import path "gokogiri/html"
    package gokogiri/xml: unrecognized import path "gokogiri/xml"
    $ go get github.com/moovweb/gokogiri/html
    package gokogiri/util: unrecognized import path "gokogiri/util"
    package gokogiri/xml: unrecognized import path "gokogiri/xml"
  • TestDisableOutputEscaping fails in Darwin

    Not sure why; it seems to work fine on other platforms (Windows and Linux included).

    Below is the output:

    gokogiri/xml $ go test .
    Testing: Basic Parsing [....]
    All (4) tests passed!
    Testing: Buffered Parsing [....]
    All (4) tests passed!
    --- FAIL: TestDisableOutputEscaping (0.00 seconds)
        node_test.go:364: TestDisableOutputEscaping (escaping disabled) Expected: <br/>
            Actual: &lt;br/&gt;
    FAIL    github.com/moovweb/gokogiri/xml 0.134s
  • clang: error: argument unused during compilation: '-fno-eliminate-unused-debug-types'

    I seem to get this both when trying to use Gokogiri and when I tried to go get gokogiri again. :S Hope this isn't just me being stupid haha.



  • Better XPath support

    This pull request addresses both #42 and #39.

    Node.EvalXPath handles evaluating an XPath that returns a string or number instead of a nodeset. Unhandled return types are now coerced into a string.

    Node.SearchWithVariables and Node.EvalXPath both take a VariableScope that allows XPath expressions to resolve any variable names. This is specifically needed for my XSLT processor and may be useful in other contexts.
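
The two ideas in this description, resolving variable names through a scope and coercing unhandled result types into a string, can be sketched without libxml. This is only an illustration under stated assumptions: the `scope` type and the `resolve` and `coerce` functions are hypothetical names made up here, not gokogiri's VariableScope or EvalXPath.

```go
package main

import (
	"fmt"
	"strconv"
)

// scope is a hypothetical stand-in for a variable scope: it maps
// XPath variable names (without the leading '$') to Go values.
type scope map[string]interface{}

// resolve returns the value bound to name, or nil when it is unbound.
func (s scope) resolve(name string) interface{} {
	return s[name]
}

// coerce mimics the fallback described above: strings and numbers pass
// through unchanged, and any other result type is turned into a string.
func coerce(result interface{}) interface{} {
	switch v := result.(type) {
	case string, float64:
		return v
	case bool:
		return strconv.FormatBool(v)
	default:
		return fmt.Sprint(v)
	}
}

func main() {
	vars := scope{"limit": 2.5, "debug": true}
	fmt.Println(coerce(vars.resolve("limit"))) // number, passed through
	fmt.Println(coerce(vars.resolve("debug"))) // boolean, coerced to a string
}
```

The type switch keeps the two supported scalar types intact and funnels everything else through one fallback, which is the behaviour the pull request describes for unhandled return types.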

  • Inject HTML into a node

    There should be a way to inject HTML into a node. For instance,

    node.String() // ""
    node.Inject("<div />")
    node.String() // "<div />"

    Furthermore, the new div has to be attached to the same document as the node it was injected into:

    node.FirstElement().Doc() == node.Doc()
    // and ensure this happens in C-world too!
  • Encoding support

    Gokogiri doesn't seem to support the encoding of some pages, although http://www.xmlsoft.org/encoding.html claims libxml will use iconv on unix systems. Here's a small test:

    package main

    import (
        "fmt"
        "io/ioutil"
        "net/http"
        "github.com/moovweb/gokogiri"
    )

    func get(url string) []byte {
        r, err := http.Get(url)
        if err != nil { panic(err) }
        defer r.Body.Close()
        body, err := ioutil.ReadAll(r.Body)
        if err != nil { panic(err) }
        return body
    }

    func main() {
        buf := get("http://bbs.chinaunix.net/thread-4080291-1-1.html")
        doc, err := gokogiri.ParseHtml(buf)
        if err != nil { panic(err) }
        defer doc.Free()
        fmt.Println("MetaEncoding:", doc.MetaEncoding())
        title, _ := doc.Search("//title")
        // assumption: the post's final print was lost; show the title nodes
        fmt.Println(title)
    }


    ~/gtest > go run gokogiritest.go
    MetaEncoding: gbk
    ~/gtest > go run gokogiritest.go | iconv -f gbk
    MetaEncoding: gbk

    Any idea why it's not working? Did I misunderstand the libxml page?

  • cannot build, test, or install gokogiri

    I've tried several avenues, including what's detailed in the README. Here are the steps I took:

    hobbsc@ea:~/incoming/gokogiri 1014:0% make test
    make: *** No rule to make target `test'.  Stop.
    hobbsc@ea:~/incoming/gokogiri 1015:2% go build
    gokogiri.go:4:2: import "gokogiri/html": cannot find package
    gokogiri.go:5:2: import "gokogiri/xml": cannot find package
    hobbsc@ea:~/incoming/gokogiri 1016:1% go get github.com/moovweb/gokogiri
    # pkg-config --cflags libxml-2.0 libxml-2.0
    exec: "pkg-config": executable file not found in $PATH
    hobbsc@ea:~/incoming/gokogiri 1017:2% make install
    make: *** No rule to make target `install'.  Stop.
    hobbsc@ea:~/incoming/gokogiri 1018:2% go test
    gokogiri.go:4:2: import "gokogiri/html": cannot find package
    gokogiri.go:5:2: import "gokogiri/xml": cannot find package
  • Make gokogiri compile with go 1.6

    In Go 1.6 it is basically forbidden to pass a Go pointer to Go functions that are used as callbacks from C.

    Fix this by funneling those pointers through global variables.

    Fixes #92
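
One common way to funnel Go values through a global instead of handing a Go pointer to C is a package-level handle table. The sketch below shows the idea in pure Go so it runs without cgo; the `registry` type and its `put`, `get` and `drop` methods are hypothetical names and are not taken from the actual patch.

```go
package main

import (
	"fmt"
	"sync"
)

// registry maps opaque integer handles to Go values. Only the handle
// would cross into C; the Go value itself never leaves Go memory.
type registry struct {
	mu     sync.Mutex
	next   uintptr
	values map[uintptr]interface{}
}

func newRegistry() *registry {
	return &registry{values: make(map[uintptr]interface{})}
}

// put stores v and returns the handle to pass to C in place of a pointer.
func (r *registry) put(v interface{}) uintptr {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.next++
	r.values[r.next] = v
	return r.next
}

// get is what a Go callback invoked from C would call to recover v.
func (r *registry) get(h uintptr) (interface{}, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.values[h]
	return v, ok
}

// drop removes the entry so the value can be garbage collected.
func (r *registry) drop(h uintptr) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.values, h)
}

// handles is the package-level (global) table the callbacks share.
var handles = newRegistry()

func main() {
	h := handles.put("callback state")
	v, ok := handles.get(h)
	fmt.Println(v, ok)
	handles.drop(h)
	_, ok = handles.get(h)
	fmt.Println(ok)
}
```

The mutex matters because C may invoke callbacks from several threads, and dropping the handle once the C side is done keeps the table from growing without bound.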

  • Node.Search() uses the wrong XPath context

    Node.Search() should create a new XPath context using the current node instead of using the document context to allow searching from the current node.
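
To see why the context node matters, here is a toy tree search in plain Go. It does not use libxml or gokogiri; the `node` type and the `find` function are hypothetical and only illustrate the difference between searching from the document and searching from the current node.

```go
package main

import "fmt"

// node is a toy element tree standing in for a parsed document.
type node struct {
	name     string
	children []*node
}

// find collects every descendant of ctx with the given name, in document
// order. ctx plays the role of the XPath context node.
func find(ctx *node, name string) []*node {
	var out []*node
	for _, c := range ctx.children {
		if c.name == name {
			out = append(out, c)
		}
		out = append(out, find(c, name)...)
	}
	return out
}

func main() {
	first := &node{name: "ul", children: []*node{{name: "li"}, {name: "li"}}}
	second := &node{name: "ul", children: []*node{{name: "li"}}}
	root := &node{name: "body", children: []*node{first, second}}

	fmt.Println(len(find(root, "li")))   // document as context
	fmt.Println(len(find(second, "li"))) // current node as context
}
```

With the document as context the search sees all three li elements; with the second list as context it sees only its own one, which is the behaviour a caller of a node-level search would expect.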

  • Get error when start Docker container

    I install libxml2 in the Dockerfile:

    RUN apt-get update && apt-get install -y build-essential libxml2 libxml2-dev libxmlsec1-dev

    When I start the container, I get the error: error while loading shared libraries: libxml2.so.2: cannot open shared object file: No such file or directory. How can I fix it?

  • identifier "_Ctype_struct__xmlDoc" may conflict with identifiers generated by cgo

    I'm trying to install gokogiri on macOS 10.14.4 (Mojave) with Go 1.12.3. I've installed libxml2 using brew. Installing gokogiri with:

    LDFLAGS="-L/usr/local/opt/libxml2/lib" CPPFLAGS="-I/usr/local/opt/libxml2/include" PKG_CONFIG_PATH="/usr/local/opt/libxml2/lib/pkgconfig" go get github.com/moovweb/gokogiri

    Outputs the error:

    # github.com/moovweb/gokogiri/xml
    ../../github.com/moovweb/gokogiri/xml/document.go:330:19: identifier "_Ctype_struct__xmlDoc" may conflict with identifiers generated by cgo

    How may I compile gokogiri?

  • pkg-config: exec: "pkg-config": executable file not found in %PATH%

    go get github.com/moovweb/gokogiri

    pkg-config --cflags libxml-2.0

    pkg-config: exec: "pkg-config": executable file not found in %PATH%

    pkg-config --cflags libxml-2.0 libxml-2.0

    pkg-config: exec: "pkg-config": executable file not found in %PATH%

    Is it normal for a go get statement to require a dependency in the environment path?
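
For context, a cgo package can name libraries in a `#cgo pkg-config:` directive, and the go tool then runs pkg-config at build time, which is why it has to be on the PATH. A small diagnostic along these lines can confirm whether the tool and the libxml-2.0 entry are visible; `findTool` is a hypothetical helper name, and the check is only a sketch, not part of gokogiri.

```go
package main

import (
	"fmt"
	"os/exec"
)

// findTool reports where an executable lives on PATH, or an error if absent.
func findTool(name string) (string, error) {
	return exec.LookPath(name)
}

func main() {
	path, err := findTool("pkg-config")
	if err != nil {
		fmt.Println("pkg-config not on PATH:", err)
		return
	}
	fmt.Println("pkg-config found at", path)

	// Ask pkg-config for the same flags the go tool requests in the log above.
	out, err := exec.Command(path, "--cflags", "libxml-2.0").CombinedOutput()
	if err != nil {
		fmt.Println("libxml-2.0 not known to pkg-config:", err)
		return
	}
	fmt.Printf("cflags: %s", out)
}
```

If the first check fails, installing pkg-config (or adding it to the PATH) is the step to take before retrying go get; if only the second fails, the libxml2 development files are the missing piece.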

  • build constraints exclude all Go files in /moovweb/gokogiri/help, failed to build with arch=386

    $ GOOS=windows GOARCH=386 go build -o anan
    go build github.com/moovweb/gokogiri/help: build constraints exclude all Go files in /home/javier/go/src/github.com/moovweb/gokogiri/help
    go build github.com/moovweb/gokogiri/xpath: build constraints exclude all Go files in /home/javier/go/src/github.com/moovweb/gokogiri/xpath

    Any thoughts?

