avro-cli-examples

Examples of how to use the command-line tools in Avro Tools to read and write Avro files.

See my original article Reading and Writing Avro Files From the Command Line for more information on using Avro Tools.



<a name="prerequisites"></a>

Getting Avro Tools

You can download the latest stable Avro Tools jar file from the Avro Releases page. The avro-tools-*.jar file is located in the java/ subdirectory of each Avro release.

Here is a direct link to avro-tools-1.11.3.jar (53 MB).

# Download the Avro Tools jar to the current local directory.
# The examples below assume the jar is in the current directory.
$ curl -O -J https://dlcdn.apache.org/avro/avro-1.11.3/java/avro-tools-1.11.3.jar
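
Before running the examples, it can help to confirm that Java is available and that the download produced a non-empty jar file. The following is a minimal sketch; `check_avro_tools` is a hypothetical helper name, not part of Avro Tools, and the default jar name simply matches the download command above.

```shell
# Hypothetical helper (not part of Avro Tools): verify the prerequisites.
# Usage: check_avro_tools [path-to-jar]
check_avro_tools() {
  jar="${1:-avro-tools-1.11.3.jar}"

  # The examples run the jar via "java -jar", so java must be on the PATH.
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi

  # -s is true only if the file exists and is not empty.
  if [ ! -s "$jar" ]; then
    echo "missing or empty jar: $jar" >&2
    return 1
  fi

  echo "ok: $jar"
}
```

Calling `check_avro_tools` without arguments checks the jar in the current directory.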

File overview
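
The examples below read two input files from the current directory: twitter.avsc (the Avro schema) and twitter.json (the records, one JSON object per line). If you do not have them at hand, the following minimal sketch recreates them from the schema and sample records shown in this document. It leaves out the schema's top-level documentation string, so treat it as an approximation rather than the exact files.

```shell
# Recreate the schema file (approximation: top-level doc string omitted).
cat > twitter.avsc <<'EOF'
{
  "type" : "record",
  "name" : "twitter_schema",
  "namespace" : "com.miguno.avro",
  "fields" : [ {
    "name" : "username",
    "type" : "string",
    "doc" : "Name of the user account on Twitter.com"
  }, {
    "name" : "tweet",
    "type" : "string",
    "doc" : "The content of the user's Twitter message"
  }, {
    "name" : "timestamp",
    "type" : "long",
    "doc" : "Unix epoch time in seconds"
  } ]
}
EOF

# Recreate the sample records, one JSON object per line.
cat > twitter.json <<'EOF'
{"username":"miguno","tweet":"Rock: Nerf paper, scissors is fine.","timestamp":1366150681}
{"username":"BlizzardCS","tweet":"Works as intended.  Terran is IMBA.","timestamp":1366154481}
EOF
```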

<a name="json-to-avro"></a>

JSON to binary Avro

Without compression:

$ java -jar avro-tools-1.11.3.jar fromjson --schema-file twitter.avsc twitter.json > twitter.avro

With Snappy compression:

$ java -jar avro-tools-1.11.3.jar fromjson --codec snappy --schema-file twitter.avsc twitter.json > twitter.snappy.avro
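
Because the commands above redirect standard output into a file, the output file is created even if the conversion fails. One quick sanity check is to look at the first four bytes: an Avro object container file starts with the ASCII characters `Obj` followed by the byte 0x01. The sketch below assumes `head -c` and `od` are available; `is_avro_container` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file starts with the Avro
# container magic bytes 'O' 'b' 'j' 0x01 (hex 4f 62 6a 01).
is_avro_container() {
  # Dump the first four bytes as hex and strip spaces and newlines.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = "4f626a01" ]
}
```

For example, `is_avro_container twitter.avro && echo "looks like Avro"` prints the message only for a file with the expected header. The check does not validate the rest of the file.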

<a name="avro-to-json"></a>

Binary Avro to JSON

The same command works on both uncompressed and compressed data.

$ java -jar avro-tools-1.11.3.jar tojson twitter.avro > twitter.json
$ java -jar avro-tools-1.11.3.jar tojson twitter.snappy.avro > twitter.json

Example:

$ java -jar avro-tools-1.11.3.jar tojson twitter.avro

returns

{"username":"miguno","tweet":"Rock: Nerf paper, scissors is fine.","timestamp": 1366150681 }
{"username":"BlizzardCS","tweet":"Works as intended.  Terran is IMBA.","timestamp": 1366154481 }

You can also pretty-print the JSON output with the -pretty parameter:

$ java -jar avro-tools-1.11.3.jar tojson -pretty twitter.avro > twitter.pretty.json
$ java -jar avro-tools-1.11.3.jar tojson -pretty twitter.snappy.avro > twitter.pretty.json

Example:

$ java -jar avro-tools-1.11.3.jar tojson -pretty twitter.avro

returns

{
  "username" : "miguno",
  "tweet" : "Rock: Nerf paper, scissors is fine.",
  "timestamp" : 1366150681
}
{
  "username" : "BlizzardCS",
  "tweet" : "Works as intended.  Terran is IMBA.",
  "timestamp" : 1366154481
}
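
Without -pretty, tojson prints one record per line, so the number of output lines equals the number of records in the file. The sketch below wraps this in a small function; `count_records` and the `AVRO_TOOLS_JAR` variable are hypothetical names introduced for illustration.

```shell
# Path to the Avro Tools jar; override by setting AVRO_TOOLS_JAR beforehand.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools-1.11.3.jar}"

# Hypothetical helper: print the number of records in an Avro file by
# counting the lines of the non-pretty JSON output.
count_records() {
  # tr strips the padding that some wc implementations add.
  java -jar "$AVRO_TOOLS_JAR" tojson "$1" | wc -l | tr -d ' '
}
```

Do not combine this with -pretty, because pretty-printed records span several lines each.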

<a name="retrieve-avro-schema"></a>

Retrieve Avro schema from binary Avro

The same command works on both uncompressed and compressed data.

$ java -jar avro-tools-1.11.3.jar getschema twitter.avro > twitter.avsc
$ java -jar avro-tools-1.11.3.jar getschema twitter.snappy.avro > twitter.avsc

Example:

$ java -jar avro-tools-1.11.3.jar getschema twitter.avro

returns

{
  "type" : "record",
  "name" : "twitter_schema",
  "namespace" : "com.miguno.avro",
  "fields" : [ {
    "name" : "username",
    "type" : "string",
    "doc" : "Name of the user account on Twitter.com"
  }, {
    "name" : "tweet",
    "type" : "string",
    "doc" : "The content of the user's Twitter message"
  }, {
    "name" : "timestamp",
    "type" : "long",
    "doc" : "Unix epoch time in seconds"
  } ],
  "doc:" : "A basic schema for storing Twitter messages"
}
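
The two getschema commands above both write to twitter.avsc, so the second overwrites the first. If you want to keep one schema file per data file, for instance to confirm that compression does not change the schema, a small loop can derive the output name from the input name. This is a sketch; `dump_schemas` and `AVRO_TOOLS_JAR` are hypothetical names introduced for illustration.

```shell
# Path to the Avro Tools jar; override by setting AVRO_TOOLS_JAR beforehand.
AVRO_TOOLS_JAR="${AVRO_TOOLS_JAR:-avro-tools-1.11.3.jar}"

# Hypothetical helper: write the schema of each given .avro file to a
# sibling .avsc file, e.g. twitter.snappy.avro -> twitter.snappy.avsc.
dump_schemas() {
  for avro_file in "$@"; do
    # Replace the trailing ".avro" with ".avsc".
    schema_file="${avro_file%.avro}.avsc"
    java -jar "$AVRO_TOOLS_JAR" getschema "$avro_file" > "$schema_file" || return 1
    echo "wrote $schema_file"
  done
}
```

After `dump_schemas twitter.avro twitter.snappy.avro`, running `cmp twitter.avsc twitter.snappy.avsc` should report no difference when both files were created from the same schema. Note that this overwrites any existing twitter.avsc.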

<a name="related-tools"></a>

Related tools

You can also take a look at the CLI tools avrocat, avromod, and avropipe that are part of the Avro suite. You must build these tools yourself by following their respective INSTALL instructions.