Yu Gu - 9 days ago
Java Question

What's the reason for this failure in hadoop?

[screenshot of the task failure message]

This failure occurs frequently in my Hadoop job while executing the reduce task.
One common explanation is that the reducer goes too long without reporting progress, in which case you should call context.progress() in your code. But in my reduce function, context.write() is called frequently. Here is my reduce function:


public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    Text s = new Text();
    Text exist = new Text("e");
    ArrayList<String> T = new ArrayList<String>();

    // Record each value and emit an "exist" marker for the key-value pair.
    for (Text val : values) {
        String value = val.toString();
        T.add(value);
        s.set(key.toString() + "-" + value);
        context.write(s, exist);
    }

    // Emit a "need" marker for every pair of distinct values,
    // with the pair ordered lexicographically.
    Text need = new Text("n");
    for (int i = 0; i < T.size(); ++i) {
        String a = T.get(i);
        for (int j = i + 1; j < T.size(); ++j) {
            String b = T.get(j);
            int f = a.compareTo(b);
            if (f < 0) {
                s.set(a + "-" + b);
                context.write(s, need);
            }
            if (f > 0) {
                s.set(b + "-" + a);
                context.write(s, need);
            }
        }
    }
}


You can see that context.write() is called frequently inside the loops.
What is the reason for this failure, and how can I handle it?

Answer

Your task is taking more than 600 seconds to complete.

You can find more details on the Apache documentation page:

mapreduce.task.timeout

600000 (default value, in milliseconds)

The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string. A value of 0 disables the timeout.
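In other words, any of reading input, writing output, or reporting progress resets the timer. If some stretch of your reduce function can run for a long time without writing (for example, when many values compare equal, so neither branch of the pair loop emits anything), you can keep the task alive by reporting progress explicitly. A minimal sketch of that idea, where PROGRESS_INTERVAL is an illustrative constant and not a Hadoop setting:

```java
// Sketch: call context.progress() periodically inside a long-running loop.
// This updates the task's liveness without writing any output.
private static final int PROGRESS_INTERVAL = 10000;  // illustrative value

public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    // ... collect values into T as before ...
    for (int i = 0; i < T.size(); ++i) {
        if (i % PROGRESS_INTERVAL == 0) {
            context.progress();  // tell the framework the task is still alive
        }
        // ... pair-generation work as before ...
    }
}
```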

Possible options:

  1. Fine-tune your application so the task completes within 600 seconds

    OR

  2. Increase the timeout by setting the mapreduce.task.timeout parameter in mapred-site.xml
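For option 2, an entry in mapred-site.xml might look like the following; the value is in milliseconds, and 1800000 (30 minutes) is just an illustrative choice:

```xml
<property>
  <name>mapreduce.task.timeout</name>
  <value>1800000</value>
  <description>Kill a task only after 30 minutes without progress.</description>
</property>
```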