<picture> <source media="(prefers-color-scheme: dark)" srcset="http://svg.wiersma.co.za/hamba/project?title=avro&tag=A%20fast%20Go%20avro%20codec&mode=dark"> <source media="(prefers-color-scheme: light)" srcset="http://svg.wiersma.co.za/hamba/project?title=avro&tag=A%20fast%20Go%20avro%20codec"> <img alt="Logo" src="http://svg.wiersma.co.za/hamba/project?title=avro&tag=A%20fast%20Go%20avro%20codec"> </picture>

A fast Go avro codec

Overview

Install with:

go get github.com/hamba/avro/v2

Note: This project has renamed the default branch from master to main. You will need to update your local environment.

Usage

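package main

import (
	"fmt"
	"log"

	"github.com/hamba/avro/v2"
)

// SimpleRecord maps the Avro record fields to Go fields through avro tags.
type SimpleRecord struct {
	A int64  `avro:"a"`
	B string `avro:"b"`
}

func main() {
	// Parse the Avro schema that describes SimpleRecord.
	schema, err := avro.Parse(`{
	    "type": "record",
	    "name": "simple",
	    "namespace": "org.hamba.avro",
	    "fields" : [
	        {"name": "a", "type": "long"},
	        {"name": "b", "type": "string"}
	    ]
	}`)
	if err != nil {
		log.Fatal(err)
	}

	in := SimpleRecord{A: 27, B: "foo"}

	// Encode the struct to Avro binary.
	data, err := avro.Marshal(schema, in)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(data)
	// Outputs: [54 6 102 111 111]

	// Decode the bytes back into a struct.
	out := SimpleRecord{}
	err = avro.Unmarshal(schema, data, &out)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(out)
	// Outputs: {27 foo}
}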

More examples in the godoc.
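
The byte output in the usage example can be reproduced by hand, which may help when inspecting encoded payloads. The following is a minimal sketch that uses only the Go standard library and is not this library's implementation. It assumes the binary encoding from the Avro specification (a long is a zig-zag variable-length integer, a string is a long length followed by its UTF-8 bytes, and a record is its fields in schema order) and relies on encoding/binary using the same zig-zag scheme for signed varints. The helper name encodeSimpleRecord is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeSimpleRecord is a hypothetical helper that hand-encodes the two
// fields of the "simple" record in schema order: a (long), then b (string).
func encodeSimpleRecord(a int64, b string) []byte {
	var buf []byte
	// An Avro long is a zig-zag varint; binary.AppendVarint uses the same scheme.
	buf = binary.AppendVarint(buf, a)
	// An Avro string is its byte length (as a long) followed by the UTF-8 bytes.
	buf = binary.AppendVarint(buf, int64(len(b)))
	buf = append(buf, b...)
	return buf
}

func main() {
	fmt.Println(encodeSimpleRecord(27, "foo"))
	// Prints: [54 6 102 111 111]
}
```

Here 27 becomes 54 after zig-zag encoding, the string length 3 becomes 6, and the remaining three bytes are the ASCII codes of "foo".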

Type Conversions

| Avro                        | Go Struct                                         | Go Interface       |
|-----------------------------|---------------------------------------------------|--------------------|
| null                        | nil                                               | nil                |
| boolean                     | bool                                              | bool               |
| bytes                       | []byte                                            | []byte             |
| float                       | float32                                           | float32            |
| double                      | float64                                           | float64            |
| long                        | int*, int64, uint32**                             | int, int64, uint32 |
| int                         | int*, int32, int16, int8, uint8**, uint16**       | int, uint8, uint16 |
| fixed                       | uint64                                            | uint64             |
| string                      | string                                            | string             |
| array                       | []T                                               | []any              |
| enum                        | string                                            | string             |
| fixed                       | [n]byte                                           | [n]byte            |
| map                         | map[string]T{}                                    | map[string]any     |
| record                      | struct                                            | map[string]any     |
| union                       | see below                                         | see below          |
| int.date                    | time.Time                                         | time.Time          |
| int.time-millis             | time.Duration                                     | time.Duration      |
| long.time-micros            | time.Duration                                     | time.Duration      |
| long.timestamp-millis       | time.Time                                         | time.Time          |
| long.timestamp-micros       | time.Time                                         | time.Time          |
| long.local-timestamp-millis | time.Time                                         | time.Time          |
| long.local-timestamp-micros | time.Time                                         | time.Time          |
| bytes.decimal               | *big.Rat                                          | *big.Rat           |
| fixed.decimal               | *big.Rat                                          | *big.Rat           |
| string.uuid                 | string                                            | string             |

* Please note that the size of the Go type int is platform dependent. Decoding an Avro long into a Go int is only allowed on 64-bit platforms and will result in an error on 32-bit platforms. Similarly, be careful when encoding a Go int using Avro int on a 64-bit platform, as that can result in an integer overflow causing misinterpretation of the data.

** Please note that when the Go type is an unsigned integer, care must be taken to ensure that information is not lost when converting between the Avro type and the Go type. For example, an Avro int of -100 would be interpreted as uint16 = 65,436 in Go. Similarly, an Avro int of 256, which is larger than the maximum uint8 value, would be interpreted as uint8 = 0.
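
The two examples in the note on unsigned integers can be checked with plain Go integer conversions. This is only a sketch of the arithmetic, assuming the decoded Avro int is held in an int32 before being narrowed. The helper names toUint16 and toUint8 are hypothetical and are not part of this library.

```go
package main

import "fmt"

// toUint16 narrows an Avro int value to uint16, keeping only the low 16 bits.
func toUint16(v int32) uint16 { return uint16(v) }

// toUint8 narrows an Avro int value to uint8, keeping only the low 8 bits.
func toUint8(v int32) uint8 { return uint8(v) }

func main() {
	fmt.Println(toUint16(-100)) // Prints: 65436
	fmt.Println(toUint8(256))   // Prints: 0
}
```

Neither conversion reports an error, which is why the range has to be checked by the caller when unsigned Go types are used.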

Unions

The following union types are accepted: map[string]any, *T and any.

TextMarshaler and TextUnmarshaler

The encoding.TextMarshaler and encoding.TextUnmarshaler interfaces are supported for the string schema type. For a string schema, a value is first checked for an implementation of these interfaces before regular encoding and decoding is attempted.

Enums may also implement TextMarshaler and TextUnmarshaler, and must resolve to valid symbols in the given enum schema.
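
As an illustration, a type that satisfies both interfaces might look like the sketch below. The Level type and its LOW and HIGH symbols are hypothetical. For an enum schema, the text returned by MarshalText would need to match one of the schema's symbols. The example calls the methods directly and does not use this library.

```go
package main

import (
	"encoding"
	"fmt"
)

// Level is a hypothetical type that is represented as text.
type Level int

const (
	Low Level = iota
	High
)

// MarshalText implements encoding.TextMarshaler.
func (l Level) MarshalText() ([]byte, error) {
	switch l {
	case Low:
		return []byte("LOW"), nil
	case High:
		return []byte("HIGH"), nil
	}
	return nil, fmt.Errorf("unknown level %d", int(l))
}

// UnmarshalText implements encoding.TextUnmarshaler.
func (l *Level) UnmarshalText(text []byte) error {
	switch string(text) {
	case "LOW":
		*l = Low
	case "HIGH":
		*l = High
	default:
		return fmt.Errorf("unknown level %q", text)
	}
	return nil
}

// Compile-time checks that Level satisfies both interfaces.
var (
	_ encoding.TextMarshaler   = Level(0)
	_ encoding.TextUnmarshaler = (*Level)(nil)
)

func main() {
	text, err := High.MarshalText()
	if err != nil {
		panic(err)
	}
	var l Level
	if err := l.UnmarshalText(text); err != nil {
		panic(err)
	}
	fmt.Println(string(text), l == High)
	// Prints: HIGH true
}
```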

Identical Underlying Types

One type can be ConvertibleTo another type if they have identical underlying types. A non-native type may be used if it is convertible to time.Time, big.Rat or avro.LogicalDuration for the corresponding logical types.

Ex.: type Timestamp time.Time
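
The convertibility rule can be observed with the standard reflect package. The sketch below is an illustration only: whether the library performs exactly this check internally is an assumption, and the helper name convertibleToTime is hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// Timestamp shares its underlying type with time.Time.
type Timestamp time.Time

// convertibleToTime reports whether values of v's type can be converted
// to time.Time. v must not be nil.
func convertibleToTime(v any) bool {
	return reflect.TypeOf(v).ConvertibleTo(reflect.TypeOf(time.Time{}))
}

func main() {
	fmt.Println(convertibleToTime(Timestamp{}))  // Prints: true
	fmt.Println(convertibleToTime("2024-01-01")) // Prints: false

	// The conversion itself is an ordinary Go type conversion.
	ts := Timestamp(time.Unix(0, 0))
	fmt.Println(time.Time(ts).Unix()) // Prints: 0
}
```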

Untrusted Input With Bytes and Strings

For security reasons, the configuration Config.MaxByteSliceSize restricts the maximum size of bytes and string types created by the Reader. The default maximum size is 1MiB and is configurable. This is required to stop untrusted input from consuming all memory and crashing the application. If this is not needed, setting a negative number disables the behaviour.
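
To show why such a limit matters, the sketch below reads a length-prefixed byte field and checks the declared length before allocating. It uses only the standard library and is not this library's Reader. The names readBytes, readBytesFrom and errTooLarge are hypothetical, and the negative-value behaviour simply mirrors the description above.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// errTooLarge is a hypothetical error returned when a declared length
// exceeds the configured maximum.
var errTooLarge = errors.New("declared size exceeds maximum")

// readBytes is a hypothetical reader for one length-prefixed byte field.
// It compares the declared length with maxSize before allocating, so a
// forged length cannot force a huge allocation. A negative maxSize
// disables the check.
func readBytes(r *bufio.Reader, maxSize int) ([]byte, error) {
	size, err := binary.ReadVarint(r) // zig-zag varint length prefix
	if err != nil {
		return nil, err
	}
	if size < 0 {
		return nil, errors.New("negative size")
	}
	if maxSize >= 0 && size > int64(maxSize) {
		return nil, errTooLarge
	}
	buf := make([]byte, size)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// readBytesFrom is a convenience wrapper for in-memory input.
func readBytesFrom(data []byte, maxSize int) ([]byte, error) {
	return readBytes(bufio.NewReader(bytes.NewReader(data)), maxSize)
}

func main() {
	input := []byte{6, 'f', 'o', 'o'} // length 3 (zig-zag encoded as 6), then "foo"

	b, err := readBytesFrom(input, 1<<20) // 1 MiB limit: accepted
	fmt.Println(string(b), err)

	_, err = readBytesFrom(input, 2) // limit below the declared length: rejected
	fmt.Println(err)

	b, err = readBytesFrom(input, -1) // negative limit: check disabled
	fmt.Println(string(b), err)
}
```

Without the size check, a few bytes of input declaring a multi-gigabyte length would trigger the allocation before any payload is read.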

Benchmark

Benchmark source code can be found at: https://github.com/nrwiersma/avro-benchmarks

BenchmarkGoAvroDecode-8      	  788455	      1505 ns/op	     418 B/op	      27 allocs/op
BenchmarkGoAvroEncode-8      	  624343	      1908 ns/op	     806 B/op	      63 allocs/op
BenchmarkGoGenAvroDecode-8   	 1360375	       876.4 ns/op	     320 B/op	      11 allocs/op
BenchmarkGoGenAvroEncode-8   	 2801583	       425.9 ns/op	     240 B/op	       3 allocs/op
BenchmarkHambaDecode-8       	 5046832	       238.7 ns/op	      47 B/op	       0 allocs/op
BenchmarkHambaEncode-8       	 6017635	       196.2 ns/op	     112 B/op	       1 allocs/op
BenchmarkLinkedinDecode-8    	 1000000	      1003 ns/op	    1688 B/op	      35 allocs/op
BenchmarkLinkedinEncode-8    	 3170553	       381.5 ns/op	     248 B/op	       5 allocs/op

Always benchmark with your own workload. The result depends heavily on the data input.

Go structs generation

Go structs can be generated for you from the schema. The generated types follow the same logic as the type conversions described above.

Install the struct generator with:

go install github.com/hamba/avro/v2/cmd/avrogen@<version>

Example usage assuming there's a valid schema in in.avsc:

avrogen -pkg avro -o bla.go -tags json:snake,yaml:upper-camel in.avsc

Tip: Omit -o FILE to dump the generated Go structs to stdout instead of a file.

Check the options and usage with -h:

avrogen -h

Or use it as a library in internal commands: it is the gen package.

Avro schema validation

avrosv

A small Avro schema validation command-line utility is also available. This simple tool leverages the schema parsing functionality of the library, showing validation errors or optionally dumping parsed schemas to the console. It can be used in CI/CD pipelines to validate schema changes in a repository.

Install the Avro schema validator with:

go install github.com/hamba/avro/v2/cmd/avrosv@<version>

Example usage assuming there's a valid schema in in.avsc (exit status code is 0):

avrosv in.avsc

An invalid schema will result in a diagnostic output and a non-zero exit status code:

avrosv bad-default-schema.avsc; echo $?
Error: avro: invalid default for field someString. <nil> not a string
2

Schemas referencing other schemas can also be validated by providing all of them (schemas are parsed in order):

avrosv base-schema.avsc schema-withref.avsc

Check the options and usage with -h:

avrosv -h

Name Validation

Avro names are validated according to the Avro specification.

However, the official Java library does not validate names accordingly, so some files in the wild contain invalid names. This library therefore has a configuration option that allows these invalid names to be parsed.

avro.SkipNameValidation = true

Note that this variable is global, so ideally you should unset it once you are done with the invalid schema.

Go Version Support

This library supports the last two versions of Go. While the minimum Go version is not guaranteed to increase alongside Go, it may jump from time to time to support additional features. This will not be considered a breaking change.

Who uses hamba/avro?