How Do You Troubleshoot the Hadoop Error “Could not find or load main class jar”?

Problem scenario
You are trying to run a Hadoop operation.  You issue your "hadoop jar" or "hdfs jar" command.  You get one of these errors:

"Error: Could not find or load main class jar"

or this error

"Error: Could not find or load main class hadoop-streaming-2.8.1.jar"

What is wrong?

Solution
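1.  Use "hadoop jar ..." instead of "hdfs jar ...".  The "hdfs" script has no "jar" subcommand, so the word "jar" ends up being treated as a Java class name, which matches the first error.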

2.  Make sure you have the hadoop-streaming*.jar file.  You could run this command to find it:

sudo find / -name "hadoop-streaming-*.jar"

If you cannot find it, you can download it from here.

3.  Remember to put the full path to the jar file right after "hadoop jar".  Here is an example:

hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.8.1.jar -file /home/hduser/mapper.py -mapper /home/hduser/mapper.py -file /home/hduser/reducer.py -reducer /home/hduser/reducer.py -input /user/hduser/contint/* -output /user/hduser/gooddir-output

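Alternatively, you could first change to the directory that contains the jar file, shown here with the placeholder "cd /to/the/path/with/hadoop-streaming-2.8.1.jar" (where 2.8.1 is just an example), and then give just the jar's file name after "hadoop jar".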
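
The steps above can be sketched as a small shell helper.  This is only a sketch under some assumptions: the function names find_streaming_jar and streaming_command are hypothetical (they are not part of Hadoop), the search root you pass in is an absolute directory, and the second function only prints the "hadoop jar" command so that you can check the jar path before running anything.

```shell
# Hypothetical helpers (not part of Hadoop): locate a hadoop-streaming jar
# and build the "hadoop jar" command line with its full path.

# find_streaming_jar ROOT
# Prints the first hadoop-streaming-*.jar found under ROOT (default: /).
# Returns 1 if no jar is found.
find_streaming_jar() {
  local root="${1:-/}"
  local jar
  # The pattern is quoted so the shell hands it to find unexpanded.
  jar="$(find "$root" -type f -name 'hadoop-streaming-*.jar' 2>/dev/null | sort | head -n 1)"
  if [ -z "$jar" ]; then
    return 1
  fi
  printf '%s\n' "$jar"
}

# streaming_command JAR [ARGS...]
# Prints (does not run) "hadoop jar JAR ARGS...".
# Returns 1 if JAR is not an existing file.
streaming_command() {
  local jar="$1"
  shift
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  printf 'hadoop jar %s' "$jar"
  if [ "$#" -gt 0 ]; then
    printf ' %s' "$@"
  fi
  printf '\n'
}

# Example (paths are illustrative):
#   jar="$(find_streaming_jar /usr/local/hadoop)" &&
#     streaming_command "$jar" -input /user/hduser/contint -output /user/hduser/gooddir-output
```

Printing the command first is a deliberate choice here: if the jar path is wrong, you see it before a job is submitted.  Once the printed line looks right, you can run it as is.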
