An (almost) compliant XPath 1.0 library.

xsel


xsel is a library that (almost) implements the XPath 1.0 specification. The non-compliant bits are:

  • xsel does not implement the id function.
  • The grammar as defined in the XPath 1.0 spec doesn't explicitly allow function calls in the middle of a path expression, such as /path/function-call()/path. xsel allows function calls in the middle of path expressions.
  • xsel allows name lookups with a wildcard for the namespace, such as /*:path.
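
To make the wildcard-namespace lookup concrete, here is a minimal sketch of what a name test like *:a matches. It uses only Go's encoding/xml package, not xsel, and it compares elements by local name while ignoring the namespace. The helper localNameMatches and the sample document are illustrative assumptions and are not part of xsel.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// localNameMatches is a hypothetical helper (not an xsel API). It returns the
// text of every element whose local name equals local, whatever namespace the
// element is in, which is what a wildcard-namespace name test selects.
// Elements nested inside a match are consumed with it and not matched again.
func localNameMatches(doc, local string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var out []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok || start.Name.Local != local {
			continue
		}
		// Decoding into a string collects the element's character data.
		var text string
		if err := dec.DecodeElement(&text, &start); err != nil {
			return nil, err
		}
		out = append(out, text)
	}
}

func main() {
	doc := `<root>
  <a xmlns="http://a">Element a</a>
  <a xmlns="http://b">Other a</a>
  <b>Element b</b>
</root>`

	matches, err := localNameMatches(doc, "a")
	if err != nil {
		panic(err)
	}
	fmt.Println(matches) // [Element a Other a]
}
```

Both a elements match even though they live in different namespaces; a plain name test without a prefix would only match elements in no namespace.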

Basic usage

package main

import (
	"bytes"
	"fmt"

	"github.com/ChrisTrenkamp/xsel/exec"
	"github.com/ChrisTrenkamp/xsel/grammar"
	"github.com/ChrisTrenkamp/xsel/parser"
	"github.com/ChrisTrenkamp/xsel/store"
)

func main() {
	xml := `
<root>
	<a>This is an XML node.</a>
</root>
`

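	// Compile the expression, parse the document into an in-memory store,
	// then evaluate the expression from the cursor that the store returns.
	xpath := grammar.MustBuild(`/root/a`)
	parser := parser.ReadXml(bytes.NewBufferString(xml))
	cursor, err := store.CreateInMemory(parser)
	if err != nil {
		panic(err)
	}
	result, err := exec.Exec(cursor, &xpath)
	if err != nil {
		panic(err)
	}

	fmt.Println(result) // This is an XML node.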
}

Binding variables and namespaces

package main

import (
	"bytes"
	"fmt"

	"github.com/ChrisTrenkamp/xsel/exec"
	"github.com/ChrisTrenkamp/xsel/grammar"
	"github.com/ChrisTrenkamp/xsel/parser"
	"github.com/ChrisTrenkamp/xsel/store"
)

func main() {
	xml := `
<root>
	<node>2.50</node>
	<node>3.14</node>
	<node>0.30</node>
</root>
`

	contextSettings := func(c *exec.ContextSettings) {
		c.NamespaceDecls["ns"] = "http://some.namespace.com"
		c.Variables[exec.Name("http://some.namespace.com", "mynum")] = exec.Number(3.14)
	}

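	// $ns:mynum resolves through the "ns" prefix to the variable bound above.
	xpath := grammar.MustBuild(`//node()[. = $ns:mynum]`)
	parser := parser.ReadXml(bytes.NewBufferString(xml))
	cursor, err := store.CreateInMemory(parser)
	if err != nil {
		panic(err)
	}
	result, err := exec.Exec(cursor, &xpath, contextSettings)
	if err != nil {
		panic(err)
	}

	fmt.Println(result) // 3.14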
}
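
The predicate `. = $ns:mynum` compares a node with a number, so XPath 1.0 first converts the node's string-value to a number. The following is a minimal sketch of that conversion as the XPath 1.0 specification describes it. It is written with the Go standard library only and is not taken from xsel's source; toNumber and equalsNumber are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strconv"
	"strings"
)

// xpathNumberRe matches the XPath 1.0 string form of a number: optional
// surrounding whitespace, an optional minus sign, then digits with an
// optional fraction, or a fraction with a leading dot.
var xpathNumberRe = regexp.MustCompile(`^[ \t\r\n]*-?([0-9]+(\.[0-9]*)?|\.[0-9]+)[ \t\r\n]*$`)

// toNumber converts a string-value to a number following the XPath 1.0
// number() rules. Strings that do not match the grammar become NaN.
func toNumber(s string) float64 {
	if !xpathNumberRe.MatchString(s) {
		return math.NaN()
	}
	// The syntax is already validated, so the only possible error is a range
	// error, in which case ParseFloat returns an infinity.
	n, _ := strconv.ParseFloat(strings.Trim(s, " \t\r\n"), 64)
	return n
}

// equalsNumber reports whether a node with the given string-value would
// satisfy ". = n". NaN never compares equal to anything.
func equalsNumber(stringValue string, n float64) bool {
	return toNumber(stringValue) == n
}

func main() {
	for _, s := range []string{"2.50", "3.14", "0.30", "2.50 3.14"} {
		fmt.Printf("%q -> %v\n", s, equalsNumber(s, 3.14))
	}
}
```

Under these rules the concatenated string-value of the root element is not a valid number, so it converts to NaN and never equals 3.14. Only nodes whose string-value is 3.14 satisfy the predicate.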

Binding custom functions

package main

import (
	"bytes"
	"fmt"

	"github.com/ChrisTrenkamp/xsel/exec"
	"github.com/ChrisTrenkamp/xsel/grammar"
	"github.com/ChrisTrenkamp/xsel/node"
	"github.com/ChrisTrenkamp/xsel/parser"
	"github.com/ChrisTrenkamp/xsel/store"
)

func main() {
	xml := `
<root>
	<a>This is an element.</a>
	<!-- This is a comment. -->
</root>
`

	isComment := func(context exec.Context, args ...exec.Result) (exec.Result, error) {
		nodeSet, isNodeSet := context.Result().(exec.NodeSet)

		if !isNodeSet || len(nodeSet) == 0 {
			return exec.Bool(false), nil
		}

		_, isComment := nodeSet[0].Node().(node.Comment)
		return exec.Bool(isComment), nil
	}

	contextSettings := func(c *exec.ContextSettings) {
		c.FunctionLibrary[exec.Name("", "is-comment")] = isComment
	}

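	// is-comment() is looked up in the function library bound above.
	xpath := grammar.MustBuild(`//node()[is-comment()]`)
	parser := parser.ReadXml(bytes.NewBufferString(xml))
	cursor, err := store.CreateInMemory(parser)
	if err != nil {
		panic(err)
	}
	result, err := exec.Exec(cursor, &xpath, contextSettings)
	if err != nil {
		panic(err)
	}

	fmt.Println(result) // This is a comment.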
}

Extensible

xsel supplies an XML parser (using the encoding/xml package) out of the box, but the XPath logic does not depend directly on XML. It instead depends on the interfaces defined in the node and store packages. This means it's possible to use xsel for querying against non-XML documents, such as JSON.

To build a custom document, implement your own Parser method and build Elements, Attributes, Character Data, Comments, Processing Instructions, and Namespaces.
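
As a rough illustration of the idea, the sketch below maps a JSON document onto an element-like tree using only the Go standard library. The Elem type, fromJSON, and parseJSON are hypothetical stand-ins and do not implement xsel's node or store interfaces. The mapping itself (array entries become elements named "item") is just one assumed convention. A real integration would produce xsel's own node types from a custom Parser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Elem is a hypothetical stand-in for an element node. It is not xsel's type.
type Elem struct {
	Name     string
	Text     string // character data, set for scalar JSON values
	Children []*Elem
}

// fromJSON maps a decoded JSON value onto Elem nodes: objects become child
// elements (keys sorted for stable output), array entries become repeated
// elements named "item", and scalars become character data.
func fromJSON(name string, v interface{}) *Elem {
	e := &Elem{Name: name}
	switch val := v.(type) {
	case map[string]interface{}:
		keys := make([]string, 0, len(val))
		for k := range val {
			keys = append(keys, k)
		}
		sort.Strings(keys)
		for _, k := range keys {
			e.Children = append(e.Children, fromJSON(k, val[k]))
		}
	case []interface{}:
		for _, item := range val {
			e.Children = append(e.Children, fromJSON("item", item))
		}
	case nil:
		// JSON null: leave the element empty.
	default:
		e.Text = fmt.Sprint(val)
	}
	return e
}

// parseJSON decodes src and wraps the result in a root element named "doc".
func parseJSON(src string) (*Elem, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(src), &v); err != nil {
		return nil, err
	}
	return fromJSON("doc", v), nil
}

// String renders the tree in an XML-like form for inspection only; text is
// not escaped, so this is not a serializer.
func (e *Elem) String() string {
	var b strings.Builder
	b.WriteString("<" + e.Name + ">" + e.Text)
	for _, c := range e.Children {
		b.WriteString(c.String())
	}
	b.WriteString("</" + e.Name + ">")
	return b.String()
}

func main() {
	tree, err := parseJSON(`{"root": {"a": "text", "n": [1, 2]}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(tree)
	// <doc><root><a>text</a><n><item>1</item><item>2</item></n></root></doc>
}
```

With a tree of this shape, a path such as /doc/root/n/item has an obvious meaning, which is the property a custom document needs for XPath queries to be useful.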

Commandline Utility

xsel supplies a grep-like commandline utility for querying XML documents:

$ go get github.com/ChrisTrenkamp/xsel
$ xsel -h
Usage of xsel:
  -a    If the result is a NodeSet, print the string value of all the nodes instead of just the first
  -c    Execute XPath queries concurrently on files (beware that results will have no predictable order)
  -e value
        Bind an entity value e.g. entityname=entityval
  -m    If the result is a NodeSet, print all the results as XML
  -n    Suppress filenames
  -r    Recursively traverse directories
  -s value
        Namespace mapping. e.g. -ns companyns=http://company.com
  -u    Turns off strict XML decoding
  -v value
        Bind a variable (all variables are bound as string types) e.g. -v var=value or -v companyns:var=value
  -x string
        XPath expression to execute (required)

CLI examples

$ cat test.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <a xmlns="http://a">Element a</a>
  <b>Element b</b>
</root>

This is a basic query:

$ xsel -x '/root/b' test.xml
test.xml: Element b

This is a basic query on stdin:

$ cat test.xml | xsel -x '/root/b' -
Element b

This query has multiple results, but only the first value is printed:

$ xsel -x '/root/*' test.xml
test.xml: Element a

This query has multiple results, and all values are printed:

$ xsel -x '/root/*' -a test.xml
test.xml: Element a
test.xml: Element b

Print all results as XML:

$ xsel -x '/root/*' -m test.xml
test.xml: <a xmlns="http://a">Element a</a>
test.xml: <b>Element b</b>

Suppress the filename when printing results:

$ xsel -x '/root/*' -m -n test.xml
<a xmlns="http://a">Element a</a>
<b>Element b</b>

Bind a namespace:

$ xsel -x '//a:*' -s a='http://a' -m test.xml
test.xml: <a xmlns="http://a">Element a</a>

Bind a variable (variables are bound as strings):

$ xsel -x '//*[. = $textval]' -v textval="Element b" test.xml
test.xml: Element b