Cascadia package implements CSS selectors

cascadia

The Cascadia package implements CSS selectors for use with the parse trees produced by the html package.

To test CSS selectors without writing Go code, check out the cascadia command line tool, a thin wrapper around this package.

Refer to godoc here.

Comments
  • Use Matcher interface for selectors

    Hi! This PR adds support for specificity of CSS selectors.

    The package API doesn't change; one method is added on the top-level Selector type.

    Internally, one type oneSelector is added, which wraps a matching function and a specificity. The new Selector type becomes a list of oneSelector, to support comma-separated groups of selectors.

    I added a few tests for specificity; all older tests pass.
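The structure described in this comment could look roughly like the sketch below. It is not the PR's code: the `specificity`, `node`, and `selectorGroup` types and the `bestMatch` helper are assumptions made for illustration, and nodes are reduced to a tag name and an ID so that no HTML parser is needed. Specificity is modeled as the usual (ID, class, type) triple, compared from left to right.

```go
package main

import "fmt"

// specificity is the (ID, class, type) count triple used by CSS.
// Hypothetical type, not the package's own definition.
type specificity [3]int

// less reports whether s is strictly less specific than other,
// comparing the three counts from left to right.
func (s specificity) less(other specificity) bool {
	for i := range s {
		if s[i] != other[i] {
			return s[i] < other[i]
		}
	}
	return false
}

// node is a stand-in for an HTML element.
type node struct{ tag, id string }

// oneSelector pairs a matching function with its specificity.
type oneSelector struct {
	match func(n node) bool
	spec  specificity
}

// selectorGroup models a comma-separated group of selectors.
type selectorGroup []oneSelector

// bestMatch returns the highest specificity among the members that
// match n, and whether any member matched at all.
func (g selectorGroup) bestMatch(n node) (specificity, bool) {
	var best specificity
	found := false
	for _, s := range g {
		if !s.match(n) {
			continue
		}
		if !found || best.less(s.spec) {
			best = s.spec
		}
		found = true
	}
	return best, found
}

func main() {
	// Models the group "li, #first".
	group := selectorGroup{
		{match: func(n node) bool { return n.tag == "li" }, spec: specificity{0, 0, 1}},
		{match: func(n node) bool { return n.id == "first" }, spec: specificity{1, 0, 0}},
	}
	for _, n := range []node{{tag: "li", id: "first"}, {tag: "li"}, {tag: "p"}} {
		spec, ok := group.bestMatch(n)
		fmt.Println(n.tag, n.id, spec, ok)
	}
}
```

Keeping the specificity next to each matching function means a comma-separated group can report the specificity of whichever member actually matched, rather than one value for the whole group.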

  • Add support for pseudo elements

    Hi again, this PR adds support for pseudo elements.

    The approach taken boils down to adding a method PseudoElement on Sel, even for simple selectors, where the implementation is "empty".
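As a rough illustration of that approach, the sketch below puts a `PseudoElement` method on a selector interface and gives simple selectors an "empty" implementation. The `tagSel` and `pseudoSel` types, and the reduction of nodes to tag names, are assumptions made to keep the example self-contained; they are not taken from the PR.

```go
package main

import "fmt"

// Sel is a simplified stand-in for the selector interface mentioned above.
type Sel interface {
	Match(tag string) bool
	// PseudoElement returns the pseudo-element name, or "" if there is none.
	PseudoElement() string
}

// tagSel is a simple selector; its PseudoElement implementation is "empty".
type tagSel struct{ tag string }

func (s tagSel) Match(tag string) bool { return s.tag == tag }
func (s tagSel) PseudoElement() string { return "" }

// pseudoSel wraps another selector and records a pseudo-element name.
type pseudoSel struct {
	inner Sel
	name  string
}

func (s pseudoSel) Match(tag string) bool { return s.inner.Match(tag) }
func (s pseudoSel) PseudoElement() string { return s.name }

func main() {
	sels := []Sel{
		tagSel{tag: "p"}, // models "p"
		pseudoSel{inner: tagSel{tag: "p"}, name: "before"}, // models "p::before"
	}
	for _, s := range sels {
		fmt.Printf("matches <p>: %v, pseudo-element: %q\n", s.Match("p"), s.PseudoElement())
	}
}
```

Because every selector answers the same question, callers can ask for the pseudo-element without type-switching on the concrete selector.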

  • cannot find package

    cannot find package "golang.org/x/net/html"

    ../../go/src/github.com/andybalholm/cascadia/parser.go:11:2: cannot find package "golang.org/x/net/html"

    Need to update to https://github.com/golang/net?

  • use html.Render() to replace nodeString()

    • [*] use html.Render() to replace nodeString(). Closed #23.
    • [+] add two more multiple attribute selector test cases

    You may want to take a closer look at the following code,

    	{
    		`<ul>
    			<li><a id="a1" href="http://www.google.com/finance"></a>
    			<li><a id="a2" href="http://finance.yahoo.com/"></a>
    			<li><a id="a2" href="http://finance.untrusted.com/"/>
    			<li><a id="a3" href="https://www.google.com/news"/>
    			<li><a id="a4" href="http://news.yahoo.com"/>
    		</ul>`,
    		`[href#=(fina)]:not([href#=(\/\/[^\/]+untrusted)])`,
    		[]string{
    			`<a id="a1" href="http://www.google.com/finance"></a>`,
    			`<a id="a2" href="http://finance.yahoo.com/"></a>`,
    		},
    	},

    and make some changes, and of course you may leave it as-is as well.

  • Case-insensitive selectors without using regex under the hood

    It would be nice to add a case-insensitive attribute search. Right now this is only possible through a regex, which performs poorly. Instead, it would be possible to add the CSS 4 syntax (https://css4-selectors.com/selector/css4/attribute-case-sensitivity/). Under the hood, you could check equality with Go's EqualFold function, which performs much better than a regex.

    For example, these tags

    <div class="Red">
    <div class="red">
    

    can be found by the div[class="red" i] selector.
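A minimal sketch of the comparison suggested here, using `strings.EqualFold` from the standard library. The `attrEquals` helper and its `ignoreCase` parameter are hypothetical names standing in for however a parser would record the `i` flag. Note that `EqualFold` applies Unicode case folding, which is broader than the ASCII-only folding the selector syntax calls for, so treat this as an approximation.

```go
package main

import (
	"fmt"
	"strings"
)

// attrEquals compares an attribute value with the wanted value.
// When ignoreCase is true (the `i` flag in `[class="red" i]`), it uses
// strings.EqualFold instead of a regular expression.
func attrEquals(got, want string, ignoreCase bool) bool {
	if ignoreCase {
		return strings.EqualFold(got, want)
	}
	return got == want
}

func main() {
	for _, class := range []string{"Red", "red", "blue"} {
		fmt.Println(class, attrEquals(class, "red", true), attrEquals(class, "red", false))
	}
}
```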

  • Identifiers prefixed with multiple dashes don't work

    It looks like this is the offending line:

    https://github.com/andybalholm/cascadia/blob/master/parser.go#L101

    Simple test case:

    func TestDoubleDash(t *testing.T) {
    	// Works
    	_, err := cascadia.Compile(".-foobar")
    	if err != nil {
    		t.Error("Should succeed with single dash")
    	}
    
    	// Doesn't work
    	_, err = cascadia.Compile(".--foobar")
    	if err != nil {
    		t.Error("Should succeed with double dash")
    	}
    }
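For reference, the sketch below shows one way to decide whether a string begins like an identifier, following my reading of the CSS Syntax Level 3 rule that a leading dash may be followed by a second dash. The function names are hypothetical, backslash escapes are deliberately ignored, and this is not the code at the line linked above.

```go
package main

import "fmt"

// isIdentStart reports whether r may begin an identifier:
// an ASCII letter, an underscore, or any non-ASCII code point.
func isIdentStart(r rune) bool {
	return r == '_' || (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || r >= 0x80
}

// startsIdentifier reports whether s begins like an identifier.
// A leading dash may be followed by an identifier-start character
// or by a second dash. Backslash escapes are ignored in this sketch.
func startsIdentifier(s string) bool {
	rs := []rune(s)
	if len(rs) == 0 {
		return false
	}
	if rs[0] == '-' {
		return len(rs) > 1 && (rs[1] == '-' || isIdentStart(rs[1]))
	}
	return isIdentStart(rs[0])
}

func main() {
	for _, s := range []string{"foobar", "-foobar", "--foobar", "-1", "9a"} {
		fmt.Printf("%q: %v\n", s, startsIdentifier(s))
	}
}
```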
    
  • nodeString vs html.Render in test

    Hi Andy,

    I'm wondering what your considerations were when choosing to use the nodeString() function (instead of html.Render()) in your selector_test.go file. I mean, html.Render() would work as well, right?

    The reason I'm asking is that I think people would be more interested in the actual effect on html.Render(). Thx.

  • Top level > in query

    I'm using goquery and have retrieved a selection. I want to run a further query on that selection that matches only direct descendants of the selected nodes, so I start my query with '>'. For example, if I find a particular 'ul' element and later want to target its direct child 'li' elements, I'd pass '> li' to a query on the selection containing the 'ul'. Perhaps I'm not doing this right, but jQuery seems to support it and cascadia does not. Thanks.

    https://github.com/PuerkitoBio/goquery/issues/117

  • Handle colon in elementid

    Similar to this problem: http://stackoverflow.com/questions/5552462/handling-colon-in-element-id-with-jquery

    Double backslashes don't work.

    <div id="test:abc" value="123">
    
    value, exists := doc.Find("div#test:abc").Attr("value")
    if !exists {
    	log.Fatal("Not found\n")
    } else {
    	log.Printf("Value: %s\n", value)
    }

    panic: unknown pseudoclass :abc
    
    goroutine 1 [running]:
    github.com/andybalholm/cascadia.MustCompile(0x7f1020, 0x23, 0xc82006a380)
            /go/src/github.com/andybalholm/cascadia/selector.go:59 +0x72
    github.com/PuerkitoBio/goquery.(*Selection).Find(0xc8201887b0, 0x7f1020, 0x23, 0x0)
            /go/src/github.com/PuerkitoBio/goquery/traversal.go:27 +0x38
    main.ExampleScrape()
            /go/src/test/test.go:16 +0x7e
    main.main()
            /go/src/test/test.go:42 +0x14
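One possible way to sidestep the problem, assuming the selector engine accepts quoted attribute values, is to match the ID with an attribute selector such as `div[id="test:abc"]`, where the colon sits inside a string and cannot be read as a pseudo-class. The `idSelector` helper below is a hypothetical convenience for building such a selector; it escapes only backslashes and double quotes and does not handle newlines in the ID.

```go
package main

import (
	"fmt"
	"strings"
)

// idSelector builds an attribute selector such as div[id="test:abc"].
// Backslashes and double quotes in id are escaped so that the value
// remains a single quoted string.
func idSelector(tag, id string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return tag + `[id="` + r.Replace(id) + `"]`
}

func main() {
	fmt.Println(idSelector("div", "test:abc")) // prints div[id="test:abc"]
}
```

The resulting string could then be passed to a call like the `doc.Find` one above in place of `div#test:abc`.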
    
  • Add :input selector

    This adds an :input selector that matches input, select, textarea, and button elements, like Sizzle (https://github.com/jquery/sizzle/blob/a6ca3e9919c7a77001bdec01b3579e4bafd73a84/src/sizzle.js#L129)

    @andybalholm are you okay with this?
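The matching rule itself is small. A sketch, with hypothetical names and operating on a bare tag name rather than a parsed node, might look like this:

```go
package main

import (
	"fmt"
	"strings"
)

// formControls lists the element names that :input is described as matching.
var formControls = map[string]bool{
	"input":    true,
	"select":   true,
	"textarea": true,
	"button":   true,
}

// matchesInput reports whether an element with the given tag name
// would be selected by :input. The comparison ignores case.
func matchesInput(tag string) bool {
	return formControls[strings.ToLower(tag)]
}

func main() {
	for _, tag := range []string{"input", "SELECT", "textarea", "button", "div"} {
		fmt.Println(tag, matchesInput(tag))
	}
}
```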

  • Null pointer dereference

    Hi,

    My project uses this package as a library. After some slight changes to the webpage that it parses, there is a panic: https://github.com/Komosa/cf/issues/4

    I suppose it just needs to be handled around the lines from the stack trace, but I'm filing an issue for reference. Offending line: gopath.../github.com/andybalholm/cascadia/selector.go:217
