philipjkim - 6 months ago
Java Question

NullPointerException when creating SpecificDatumWriter<T>

While learning Apache Avro from Tom White's book, Hadoop: The Definitive Guide, I ran into an error.

The example has 3 steps:


  1. Create an Avro schema file (Pair.avsc):

    {
    "type":"record",
    "name":"Pair",
    "doc":"A pair of strings.",
    "fields":[
    { "name":"left", "type":"string" },
    { "name":"right", "type":"string" }
    ]
    }

  2. Compile the schema file to create a Java class (Pair.java) using:

    $ java -jar $AVRO_HOME/avro-tools-1.6.2.jar compile schema src/main/resources/Pair.avsc src/main/java/

  3. Use SpecificDatumWriter<Pair> and SpecificDatumReader<Pair> to serialize/deserialize data.



The original example method is testPairSpecific() in https://github.com/tomwhite/hadoop-book/blob/master/avro/src/main/java/AvroTest.java .

I rewrote the example code (createPairAndSerializeThenDeserialize() in https://github.com/philipjkim/avro-examples/blob/master/src/test/java/org/sooo/AvroTest.java), which is almost identical to the original. The differences are:


  1. I used Avro 1.6.2; the original uses 1.3.2.

  2. The contents of Pair.java created by avro-tools.jar differ (original: https://github.com/tomwhite/hadoop-book/blob/master/avro/src/main/java/Pair.java , mine: https://github.com/philipjkim/avro-examples/blob/master/src/main/java/org/sooo/Pair.java )



After running the test, I got an error:

    java.lang.NullPointerException
        at java.lang.String.replace(String.java:2228)
        at org.apache.avro.specific.SpecificData.createSchema(SpecificData.java:195)
        at org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:140)
        at org.apache.avro.specific.SpecificDatumWriter.<init>(SpecificDatumWriter.java:33)
        at org.sooo.AvroTest.createPairAndSerializeThenDeserialize(AvroTest.java:86)
        ...


AvroTest.createPairAndSerializeThenDeserialize() is:

    @Test
    public void createPairAndSerializeThenDeserialize() throws IOException {
        // given
        Pair datum = new Pair();
        datum.setLeft(new Utf8("L"));
        datum.setRight(new Utf8("R"));

        // serialize
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<Pair> writer = new SpecificDatumWriter<Pair>(Pair.class); /* Line 86 */
        Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(datum, encoder);
        encoder.flush();
        out.close();

        // deserialize
        DatumReader<Pair> reader = new SpecificDatumReader<Pair>(Pair.class);
        Decoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        Pair result = reader.read(null, decoder);

        // then
        assertThat(result.getLeft().toString(), is("L"));
        assertThat(result.getRight().toString(), is("R"));
    }


I'd like to know what is wrong with this example. Thanks for any comments.

FYI, my example project repo is https://github.com/philipjkim/avro-examples .

Answer

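Your Pair.avsc file is missing a namespace field that matches the Java package of the generated class. Pair.java lives in package org.sooo, but the schema declares no namespace, so the schema's full name is just Pair rather than org.sooo.Pair. Add the namespace to the schema, then regenerate Pair.java: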

...
  "namespace": "org.sooo",
...
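
To see why a missing namespace could surface as a NullPointerException in String.replace, here is a minimal sketch that runs without Avro on the classpath. The class NamespaceMismatchDemo and both helper methods are hypothetical. In particular, rewriteNamespace is only an assumption about what happens at the String.replace frame in the stack trace, not Avro's actual code. What is certain is that Avro builds a record's full name as namespace + "." + name, and that String.replace throws if its target argument is null.

```java
// Avro-free sketch of the suspected failure mode. All names here are
// hypothetical; nothing below is Avro's actual code.
final class NamespaceMismatchDemo {

    // Avro composes a record's full name as "namespace.name", or just
    // "name" when the schema declares no namespace.
    static String schemaFullName(String namespace, String name) {
        return (namespace == null || namespace.isEmpty())
                ? name
                : namespace + "." + name;
    }

    // ASSUMPTION: stands in for a fallback that rewrites the schema's
    // namespace to the Java package when the two names disagree.
    // String.replace throws NullPointerException if its target is null.
    static String rewriteNamespace(String schemaJson, String namespace, String javaPackage) {
        return schemaJson.replace(namespace, javaPackage);
    }

    public static void main(String[] args) {
        String className = "org.sooo.Pair"; // generated class from the question

        // With the namespace declared, the names agree and no rewrite is needed.
        System.out.println(className.equals(schemaFullName("org.sooo", "Pair")));

        // Without it, the names disagree and the rewrite receives a null namespace.
        System.out.println(className.equals(schemaFullName(null, "Pair")));
        try {
            rewriteNamespace("{\"type\":\"record\",\"name\":\"Pair\"}", null, "org.sooo");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from String.replace");
        }
    }
}
```

If this reading is right, declaring "namespace": "org.sooo" makes the two names agree, so the rewrite path is never taken.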