org.apache.avro.mapreduce
Class AvroRecordReaderBase<K,V,T>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.RecordReader<K,V>
      extended by org.apache.avro.mapreduce.AvroRecordReaderBase<K,V,T>
Type Parameters:
K - The type of key the record reader should generate.
V - The type of value the record reader should generate.
T - The type of the entries within the Avro container file being read.
All Implemented Interfaces:
Closeable
Direct Known Subclasses:
AvroKeyRecordReader, AvroKeyValueRecordReader

public abstract class AvroRecordReaderBase<K,V,T>
extends org.apache.hadoop.mapreduce.RecordReader<K,V>

Abstract base class for RecordReaders that read Avro container files.


Constructor Summary
protected AvroRecordReaderBase(org.apache.avro.Schema readerSchema)
          Constructs a record reader that uses the given reader schema.
 
Method Summary
 void close()
          Closes the record reader and releases its underlying resources.
protected  org.apache.avro.file.DataFileReader<T> createAvroFileReader(org.apache.avro.file.SeekableInput input, org.apache.avro.io.DatumReader<T> datumReader)
          Creates an Avro container file reader from a seekable input stream.
protected  org.apache.avro.file.SeekableInput createSeekableInput(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path)
          Creates a seekable input stream to an Avro container file.
protected  T getCurrentRecord()
          Gets the current record read from the Avro container file.
 float getProgress()
          Reports the reader's progress through its input as a fraction between 0.0 and 1.0.
 void initialize(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext context)
          Prepares the reader to read the given input split.
 boolean nextKeyValue()
          Advances to the next record, if there is one.
 
Methods inherited from class org.apache.hadoop.mapreduce.RecordReader
getCurrentKey, getCurrentValue
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AvroRecordReaderBase

protected AvroRecordReaderBase(org.apache.avro.Schema readerSchema)
Constructs a record reader that uses the given reader schema.

Parameters:
readerSchema - The reader schema for the records of the Avro container file.
Method Detail

initialize

public void initialize(org.apache.hadoop.mapreduce.InputSplit inputSplit,
                       org.apache.hadoop.mapreduce.TaskAttemptContext context)
                throws IOException,
                       InterruptedException
Prepares the reader to read the given input split. Called once, before any records are read.

Specified by:
initialize in class org.apache.hadoop.mapreduce.RecordReader<K,V>
Throws:
IOException
InterruptedException

nextKeyValue

public boolean nextKeyValue()
                     throws IOException,
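                            InterruptedException
Advances to the next record, if there is one.

Specified by:
nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<K,V>
Returns:
true if a record was read, false if there are no more records.
Throws: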
IOException
InterruptedException

getProgress

public float getProgress()
                  throws IOException,
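                         InterruptedException
Reports the reader's progress through its input.

Specified by:
getProgress in class org.apache.hadoop.mapreduce.RecordReader<K,V>
Returns:
A number between 0.0 and 1.0 giving the fraction of the input that has been read.
Throws: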
IOException
InterruptedException
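
RecordReader.getProgress() is expected to return a fraction between 0.0 and 1.0. The sketch below shows one way a split-based reader could derive such a value from byte offsets. This page does not state how AvroRecordReaderBase computes progress, so the class SplitProgressSketch, its method, and its parameters are hypothetical and are not part of Avro or Hadoop.

```java
// Hypothetical helper for illustration only; not part of Avro or Hadoop.
final class SplitProgressSketch {
    private SplitProgressSketch() {}

    // Returns the fraction of the byte range [start, end) consumed so far,
    // clamped to [0.0, 1.0]. An empty range reports 0.0 (an assumption).
    static float progress(long start, long end, long position) {
        if (end <= start) {
            return 0.0f;
        }
        float fraction = (float) (position - start) / (float) (end - start);
        return Math.max(0.0f, Math.min(1.0f, fraction));
    }

    public static void main(String[] args) {
        // A reader halfway through a split that covers bytes 100..200.
        System.out.println(progress(100L, 200L, 150L)); // prints 0.5
    }
}
```

Clamping keeps the result inside the documented range even if the read position runs slightly past the end of the split.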

close

public void close()
           throws IOException
Closes the record reader and releases its underlying resources.

Specified by:
close in interface Closeable
Specified by:
close in class org.apache.hadoop.mapreduce.RecordReader<K,V>
Throws:
IOException

getCurrentRecord

protected T getCurrentRecord()
Gets the current record read from the Avro container file.

Calling nextKeyValue() advances the reader to the next record.

Returns:
The current Avro record (may be null if no record has been read).
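
The relationship between nextKeyValue() and getCurrentRecord() can be sketched with a simplified, dependency-free model. Everything below is a hypothetical stand-in: CursorReaderSketch plays the role of the base class and reads from an in-memory list rather than an Avro container file, and RecordAsKeyReaderSketch shows how a subclass might expose the current record as its key. It is not the actual Avro implementation.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for AvroRecordReaderBase<K,V,T>; reads from an
// in-memory list instead of an Avro container file.
abstract class CursorReaderSketch<K, V, T> {
    private final Iterator<T> source;
    private T currentRecord; // null until the first successful nextKeyValue()

    protected CursorReaderSketch(List<T> records) {
        this.source = records.iterator();
    }

    // Advances to the next record; returns false when the input is exhausted.
    public boolean nextKeyValue() {
        if (!source.hasNext()) {
            return false;
        }
        currentRecord = source.next();
        return true;
    }

    // Mirrors the protected accessor documented above.
    protected T getCurrentRecord() {
        return currentRecord;
    }

    public abstract K getCurrentKey();

    public abstract V getCurrentValue();
}

// Hypothetical subclass: exposes each record as the key and carries no value.
class RecordAsKeyReaderSketch<T> extends CursorReaderSketch<T, Void, T> {
    RecordAsKeyReaderSketch(List<T> records) {
        super(records);
    }

    @Override
    public T getCurrentKey() {
        return getCurrentRecord();
    }

    @Override
    public Void getCurrentValue() {
        return null;
    }
}

class CursorReaderDemo {
    public static void main(String[] args) {
        RecordAsKeyReaderSketch<String> reader =
                new RecordAsKeyReaderSketch<>(Arrays.asList("a", "b"));
        // Typical consumption loop: advance, then read the current key.
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey());
        }
    }
}
```

In this model the subclass only decides how the current record maps to a key and a value, while the base class owns the cursor.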

createSeekableInput

protected org.apache.avro.file.SeekableInput createSeekableInput(org.apache.hadoop.conf.Configuration conf,
                                                                 org.apache.hadoop.fs.Path path)
                                                          throws IOException
Creates a seekable input stream to an Avro container file.

Parameters:
conf - The Hadoop configuration.
path - The path to the Avro container file.
Returns:
A seekable input stream for the file at the given path.
Throws:
IOException - If there is an error reading from the path.

createAvroFileReader

protected org.apache.avro.file.DataFileReader<T> createAvroFileReader(org.apache.avro.file.SeekableInput input,
                                                                      org.apache.avro.io.DatumReader<T> datumReader)
                                                               throws IOException
Creates an Avro container file reader from a seekable input stream.

Parameters:
input - The input containing the Avro container file.
datumReader - The reader to use for the individual records in the Avro container file.
Returns:
An Avro container file reader for the given input.
Throws:
IOException - If there is an error reading from the input stream.


Copyright © 2009-2013 The Apache Software Foundation. All Rights Reserved.