Where Is the core-site.xml File in a Hadoop Installation?

Problem scenario
You downloaded and installed Hadoop core (the open source version), but you cannot find the core-site.xml file. What should you do?

Possible Solution #1
Run this: sudo find / -name core-site.xml
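
In a typical open source installation, the file lives in the configuration directory under the Hadoop home; a hedged check, assuming a conventional install path:

ls -l $HADOOP_HOME/etc/hadoop/core-site.xml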

Possible Solution #2
Did you download installation media with “-site” in its file name? That archive contains the project documentation, not the software itself.

Try again with a .tar.gz file without “-site” (and without “-src”) in its name:
https://dlcdn.apache.org/hadoop/core/stable/

(This applies to the open source version of Hadoop, not a specific vendor’s implementation.)
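
A minimal sketch of downloading and unpacking a binary release; the version number below is an assumption, so substitute the current stable release listed at the URL above:

wget https://dlcdn.apache.org/hadoop/core/stable/hadoop-3.3.6.tar.gz   # hypothetical version number
tar -xzf hadoop-3.3.6.tar.gz
ls hadoop-3.3.6/etc/hadoop/core-site.xml   # the template configuration file ships here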

How Do You Troubleshoot “ERROR: Hadoop common not found” when Running Hadoop?

Problem Scenario
You run an HDFS command, but you get this message: “ERROR: Hadoop common not found”

What should you do?

Solution
Log in as hduser (or whichever user runs hdfs). Run “echo $HADOOP_HOME”.
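
A hedged example of the expected output, assuming the installation path used elsewhere in this article:

$ echo $HADOOP_HOME
/usr/local/hadoop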

That directory should have a libexec directory with a file called hadoop-config.sh.

Run this: ls -lh $HADOOP_HOME/libexec
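
If the file is present, the listing should include hadoop-config.sh; an illustrative example, with owners, sizes, and dates as placeholders:

$ ls -lh $HADOOP_HOME/libexec
-rwxr-xr-x 1 hduser hadoop  12K Jan  1 00:00 hadoop-config.sh
-rwxr-xr-x 1 hduser hadoop  95K Jan  1 00:00 hadoop-functions.sh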

If hadoop-config.sh is missing, one way to create it is this: 1) find a hadoop-config.sh file (e.g.,
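
A minimal sketch of that approach, assuming a copy of hadoop-config.sh exists elsewhere on the system (the source path below is hypothetical):

sudo find / -name hadoop-config.sh 2>/dev/null                         # locate an existing copy
cp /some/other/hadoop/libexec/hadoop-config.sh $HADOOP_HOME/libexec/   # hypothetical source path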

How Do You Troubleshoot “Segmentation fault” Errors in Hadoop/HDFS?

Problem scenario
You get a “Segmentation fault” or “Segmentation fault (core dumped)” error when you run any “hdfs” command. (This applies to open source Hadoop, not a proprietary vendor’s version.) What should you do?

Solution

Root cause
There is probably an infinite loop/recursion problem: each recursive call writes another frame to the stack, the stack eventually overflows, and the next write to unmapped memory triggers the segmentation fault. The underlying issue is usually a configuration problem with your Hadoop/HDFS installation.
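
A hedged diagnostic sketch for spotting runaway recursion in the launcher scripts; the grep target is an assumption about where a self-referencing setting might live:

bash -x $HADOOP_HOME/bin/hdfs version 2>&1 | head -50   # trace the wrapper script; a repeating cycle of lines suggests recursion
grep -n HADOOP $HADOOP_HOME/etc/hadoop/hadoop-env.sh    # look for self-referencing environment settings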

What is the Difference between Yarn, the Package Manager, and YARN, the Hadoop Framework Tool?

Problem scenario
You have heard of two different software tools called Yarn, but you are not sure what each does or how they differ.

Solution / Yarn Disambiguation
There is a package manager called Yarn that also has some project-management functionality; it is most often used with JavaScript projects.
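
A few typical commands with this Yarn, assuming a JavaScript project; lodash is an arbitrary example dependency:

yarn init -y       # create a package.json with default values
yarn add lodash    # install a dependency and record it in package.json
yarn install       # install everything listed in package.json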

This tool has an icon of a one-line drawing of a cat; the original Yarn website had other cat imagery,

How Do You Troubleshoot the Hadoop Error “ApplicationClientProtocolPBClientImpl.getApplicationReport”?

Problem scenario
You are running a Hadoop command, but you get this message:

java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over null after 3 failover attempts. Trying to failover after sleeping for 22088ms.
2020-12-20 18:54:36,679 INFO ipc.Client:

What should you do?

Possible Solution
Is the ResourceManager running? Open a new terminal and run “jps” to find out.
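
A hedged illustration of jps output when the ResourceManager is up (the process IDs are placeholders); if the ResourceManager line is absent, starting YARN with the stock script may resolve the error:

$ jps
2112 NameNode
2345 DataNode
2678 ResourceManager
2890 NodeManager
3001 Jps

bash $HADOOP_HOME/sbin/start-yarn.sh   # starts the ResourceManager and NodeManager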

How Do You Troubleshoot the Hadoop Error “Connecting to ResourceManager”?

Problem scenario
You run a Hadoop command, but you get this error:

2020-12-20 18:19:33,706 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2020-12-20 18:19:36,068 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

You tried restarting the DFS daemons with this command: bash /usr/local/hadoop/sbin/start-dfs.sh

It did not help. You ran “jps” to see if the ResourceManager was running.
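
One relevant detail: start-dfs.sh starts only the HDFS daemons (NameNode, DataNode, and SecondaryNameNode); the ResourceManager that listens on port 8032 is started by the YARN script. A minimal sketch, assuming the same installation path as above:

bash /usr/local/hadoop/sbin/start-yarn.sh   # starts the ResourceManager (and NodeManager)
jps                                         # verify that ResourceManager now appears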