How Do You Troubleshoot an hdfs Command with the Error “java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)…ConnectionRefused”?

Problem scenario
When trying to run an hdfs command, you see this message:

"WARN ipc.Client: Failed to connect to server: localhost/127.0.0.1:54310: try once and fail.
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:681)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:777)
        at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:409)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1542)
        at org.apache.hadoop.ipc.Client.call(Client.java:1373)
        at org.apache.hadoop.ipc.Client.call(Client.java:1337)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
        at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:787)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:398)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:335)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1700)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1436)
        at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1433)
        at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:64)
        at org.apache.hadoop.fs.Globber.doGlob(Globber.java:269)
        at org.apache.hadoop.fs.Globber.glob(Globber.java:148)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1685)
        at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
        at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
        at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
        at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:103)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:378)
ls: Call From contint/10.10.10.10 to localhost:54310 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused"

What do you do?

Solution

See if the regular NameNode (not the SecondaryNameNode) is running.  Run this command (as the hduser account or a similar Hadoop user):  jps

You should see "NameNode" in the results.  If NameNode is not running, then you need to start it.
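
As a point of reference, this is roughly what jps prints on a healthy single-node cluster (the process IDs below are illustrative, and your list of daemons may differ):

su - hduser
jps
# Example output; the PIDs will vary:
# 4821 NameNode
# 4967 DataNode
# 5150 SecondaryNameNode
# 5743 Jps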

To start the NameNode (along with the other HDFS daemons), run this command:  bash /usr/local/hadoop/sbin/start-dfs.sh
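
If NameNode shows up in jps but connections are still refused, the daemon may be listening on a different address or port than the client expects. Here is a quick check, assuming the /usr/local/hadoop install path used above (on older systems, substitute netstat -tlnp for ss -tlnp):

# See whether anything is listening on the port from the error message (54310 here):
ss -tlnp | grep 54310

# Confirm the configured NameNode address matches; the property is fs.defaultFS
# on Hadoop 2.x and later (fs.default.name on older releases):
grep -B1 -A2 "fs.default" /usr/local/hadoop/etc/hadoop/core-site.xml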

If start-dfs.sh completes without errors but the problem remains, see one of these links: one, two, or three.

If start-dfs.sh is having issues and you are running a single-node cluster, can you SSH into localhost as root?  From within that session, can you then SSH to localhost as root again?  If you cannot log in over SSH as root twice consecutively like this, see whether getting that to work allows start-dfs.sh to succeed.

If you can do both of these logins without any problems, then the links above should be what you try next:

ssh root@localhost
ssh root@localhost    # run this second login from inside the first SSH session
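
If either login fails or prompts for a password, a common fix on a single-node setup is passwordless SSH to localhost. This is a minimal sketch assuming OpenSSH; many installations run the Hadoop daemons as hduser rather than root, so run it as whichever user starts the cluster:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa         # key with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  # authorize the key for this host
chmod 600 ~/.ssh/authorized_keys                 # SSH refuses overly permissive files
ssh root@localhost                               # should now log in without a password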
