Big Data – Page 10 – CONTINUAL INTEGRATION

How Do You Copy a File into HDFS without the Error “No such file or directory”?

Posted 09/26/2017; updated 01/07/2021

Problem scenario
You want to add a file to Hadoop. You run a basic Hadoop command to copy a file into HDFS, but you get this error: copyFromLocal: `hdfs://localhost:54310/user/…': No such file or directory

How do you copy a file from your OS into HDFS?

Solution
Do one of the following:
Option 1. Run this command to create a new directory (substitute “jdoe” with the name of your user):

hdfs dfs -mkdir -p /user/jdoe/contint
# Now repeat your copy command
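
Option 1 can also be wrapped in a small helper so that the directory check and the copy happen together. This is only a sketch: put_into_hdfs is a hypothetical function name, and the file and directory in the example call are placeholders. It relies on the standard hdfs dfs subcommands -test, -mkdir and -copyFromLocal.

```shell
# Sketch: make sure the HDFS target directory exists, then copy a local file.
# put_into_hdfs is a hypothetical helper name, not a Hadoop command.
put_into_hdfs() {
    local_file=$1
    target_dir=$2
    # "hdfs dfs -test -d" exits 0 only when the path exists and is a directory
    if ! hdfs dfs -test -d "$target_dir"; then
        # -p creates missing parent directories as well
        hdfs dfs -mkdir -p "$target_dir" || return 1
    fi
    hdfs dfs -copyFromLocal "$local_file" "$target_dir/"
}

# Example call (both arguments are placeholders):
# put_into_hdfs ./sample.txt /user/jdoe/contint
```

Checking first keeps the helper safe to run repeatedly, although -mkdir -p on its own also succeeds when the directory already exists.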

Option 2.

…


How Do You Troubleshoot a Fatal HDFS Error?

Posted 09/24/2017; updated 06/03/2019

Problem scenario
You run an hdfs command and you get this:

[Fatal Error] core-site.xml:2:6: The processing instruction target matching "[xX][mM][lL]" is not allowed.
17/09/25 04:21:00 FATAL conf.Configuration: error parsing conf core-site.xml
org.xml.sax.SAXParseException; systemId: file:/home/hadoop/hadoop/etc/hadoop/core-site.xml; lineNumber: 2; columnNumber: 6; The processing instruction target matching "[xX][mM][lL]" is not allowed.
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2531)
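
As a general diagnostic, XML parsers raise this message when the <?xml ...?> declaration is not at the very start of the file, for example when a blank line or stray text comes before it. The sketch below only reports where the declaration sits. xml_decl_position is a hypothetical helper name, and the path in the example call is the one shown in the error above. It is not necessarily the fix this post goes on to describe.

```shell
# Sketch: print "line:column" of the first "<?xml" found in a file.
# xml_decl_position is a hypothetical helper name.
xml_decl_position() {
    awk '{
        col = index($0, "<?xml")      # 0 when the line has no declaration
        if (col) { print NR ":" col; exit }
    }' "$1"
}

# Example call (path taken from the error message above):
# xml_decl_position /home/hadoop/hadoop/etc/hadoop/core-site.xml
```

A result other than 1:1 suggests that something precedes the declaration and is worth inspecting in an editor.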

…


How Do You Install R on a RHEL Instance of AWS?

Posted 08/19/2017; updated 06/01/2019

Problem scenario
You have a Red Hat Enterprise Linux (RHEL) server. You want to install the R programming language. What do you do?

Solution
1. Run these six commands:

sudo yum -y install wget
cd /tmp
wget ftp://rpmfind.net/linux/epel/7/x86_64/e/epel-release-7-10.noarch.rpm
sudo rpm -ivh epel-release-7-10.noarch.rpm
sudo yum-config-manager --enable rhui-REGION-rhel-server-extras rhui-REGION-rhel-server-optional
sudo yum -y install R

2. This is an optional step to complete these instructions.

…


How Do You Know If Apache Spark Has Been Installed?

Posted 08/16/2017; updated 06/03/2019

Problem scenario
You are looking for the Hadoop components’ versions. You run these commands:

hadoop version
hdfs version
yarn version

You notice the output is the same for each of the three commands above. You are not sure if Apache Spark has been installed. What do you do?

Solution
Run this command:

spark-submit --version
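
If the shell answers "command not found", it helps to confirm whether spark-submit is on your PATH before concluding that Spark is missing. A minimal sketch, assuming a POSIX shell (report_tool is a hypothetical helper name):

```shell
# Sketch: report whether a command can be resolved by the current shell.
# report_tool is a hypothetical helper name.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

# Example call:
# report_tool spark-submit
```

A "not found" result does not prove Spark is absent. The binary may exist in a directory that is not on your PATH, such as the bin directory of the Spark installation.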

…


How Do You Know What Version of R Is Installed on Your Linux System?

Posted 08/16/2017; updated 06/01/2019

Problem scenario
You want to determine which version of R is installed. How do you find this out in Linux?

Solution
Run this command:
R --version
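
In a script you may want only the version number. The sketch below assumes the first line of the command's output begins with the words "R version" followed by the number, and prints nothing if that assumption does not hold. r_version_number is a hypothetical helper name.

```shell
# Sketch: read the output of "R --version" on stdin and print only the number.
# r_version_number is a hypothetical helper name.
r_version_number() {
    # The third field of the first line is taken to be the version number
    awk 'NR == 1 && $1 == "R" && $2 == "version" { print $3 }'
}

# Example call:
# R --version | r_version_number
```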

…


How Do You Connect to Your Apache Spark Deployment in AWS?

Posted 08/15/2017; updated 06/03/2019

Problem scenario
You have recently deployed Apache Spark to AWS. You can see that the EC2 instances were created, but you cannot reach them through the web UI (even on ports 4140, 8088, or 50070), and you cannot connect to them with PuTTY. You have already changed your normal Security Group to allow TCP traffic from your workstation's IP address. What should you do to connect to your new Spark instance for the first time?

…
