Home

Awesome

goq

Build Status GoDoc Coverage Status Go Report Card

Example

import (
	"log"
	"net/http"

	"astuart.co/goq"
)

// Structured representation for github file name table
type example struct {
	Title string `goquery:"h1"`
	Files []string `goquery:"table.files tbody tr.js-navigation-item td.content,text"`
}

func main() {
	res, err := http.Get("https://github.com/andrewstuart/goq")
	if err != nil {
		log.Fatal(err)
	}
	defer res.Body.Close()

	var ex example
	
	err = goq.NewDecoder(res.Body).Decode(&ex)
	if err != nil {
		log.Fatal(err)
	}

	log.Println(ex.Title, ex.Files)
}

Details

goq

-- import "astuart.co/goq"

Package goq was built to allow users to declaratively unmarshal HTML into go structs using struct tags composed of css selectors.

I've made a best effort to behave very similarly to JSON and XML decoding as well as exposing as much information as possible in the event of an error to help you debug your Unmarshaling issues.

When creating struct types to be unmarshaled into, the following general rules apply:

[foo] will be used to determine the string map key,but [bar] and [baz] will be ignored, with the [bang] tag present S struct type taking precedence.

Usage

func NodeSelector

func NodeSelector(nodes []*html.Node) *goquery.Selection

NodeSelector is a quick utility function to get a goquery.Selection from a slice of *html.Node. Useful for performing unmarshaling, since the decision was made to use []*html.Node for maximum flexibility.

func Unmarshal

func Unmarshal(bs []byte, v interface{}) error

Unmarshal takes a byte slice and a destination pointer to any interface{}, and unmarshals the document into the destination based on the rules above. Any error returned here will likely be of type CannotUnmarshalError, though an initial goquery error will pass through directly.

func UnmarshalSelection

func UnmarshalSelection(s *goquery.Selection, iface interface{}) error

UnmarshalSelection will unmarshal a goquery.goquery.Selection into an interface appropriately annoated with goquery tags.

type CannotUnmarshalError

type CannotUnmarshalError struct {
	Err      error
	Val      string
	FldOrIdx interface{}
}

CannotUnmarshalError represents an error returned by the goquery Unmarshaler and helps consumers in programmatically diagnosing the cause of their error.

func (*CannotUnmarshalError) Error

func (e *CannotUnmarshalError) Error() string

type Decoder

type Decoder struct {
}

Decoder implements the same API you will see in encoding/xml and encoding/json except that we do not currently support proper streaming decoding as it is not supported by goquery upstream.

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a new decoder given an io.Reader

func (*Decoder) Decode

func (d *Decoder) Decode(dest interface{}) error

Decode will unmarshal the contents of the decoder when given an instance of an annotated type as its argument. It will return any errors encountered during either parsing the document or unmarshaling into the given object.

type Unmarshaler

type Unmarshaler interface {
	UnmarshalHTML([]*html.Node) error
}

Unmarshaler allows for custom implementations of unmarshaling logic

TODO