Kunal Kunal - 27 days ago 21
Java Question

Skipping the first line of a .csv file in MapReduce (Java)

The mapper function runs once for every line of the input. How can I skip the first line? Some files start with a column header that I want to ignore.

Answer

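When the mapper reads the file with the default TextInputFormat, each line arrives as a key-value pair. The key is the byte offset at which that line starts in the file, and the value is the text of the line. For the first line of a file the offset is always zero, so in the mapper function do the following: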

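    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key is the byte offset of this line; 0 means the first line of the file.
        // Replace contains("header") with a condition that matches your actual header.
        if (key.get() == 0 && value.toString().contains("header")) {
            return; // skip the header line
        }
        // Process every other line here, e.g. parse the fields and call context.write(...).
        // Exceptions are left to propagate so that a failing record fails the task visibly.
    }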
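
To see why the offset check only catches the first line, here is a minimal sketch in plain Java, without Hadoop, that walks a small CSV string and tracks the byte offset of each line the way the mapper key would report it. The class name `HeaderSkipSketch`, the methods `isHeader` and `dataLines`, and the `headerPrefix` parameter are all hypothetical names made up for this illustration; it also assumes UTF-8 text with `\n` line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class HeaderSkipSketch {

    // Same shape as the mapper's check: offset 0 AND the line looks like a header.
    // headerPrefix is a hypothetical stand-in for "some condition satisfying it is header".
    static boolean isHeader(long byteOffset, String line, String headerPrefix) {
        return byteOffset == 0 && line.startsWith(headerPrefix);
    }

    // Returns every line except the header, tracking byte offsets by hand.
    static List<String> dataLines(String fileContents, String headerPrefix) {
        List<String> kept = new ArrayList<>();
        long offset = 0; // byte offset at which the current line starts
        for (String line : fileContents.split("\n")) {
            if (!isHeader(offset, line, headerPrefix)) {
                kept.add(line);
            }
            // Advance past this line's bytes plus the '\n' terminator (assumed 1 byte).
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return kept;
    }

    public static void main(String[] args) {
        String csv = "id,name\n1,alice\n2,bob\n";
        System.out.println(dataLines(csv, "id,"));
    }
}
```

A line that merely resembles the header but sits later in the file has a non-zero offset, so it is kept. Checking the content as well as the offset protects files that have no header row at all.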