How Do You Troubleshoot the Spark-Shell Error “A JNI error has occurred”?

Problem scenario
You run spark-shell in a Debian distribution of Linux (e.g., Ubuntu) but you receive this error:

Error: A JNI error has occurred, please check your installation and try again
Exception in thread “main” java.lang.ArrayIndexOutOfBoundsException: 64
        at java.util.jar.JarFile.match(java.base@9-internal/JarFile.java:983)
        at java.util.jar.JarFile.checkForSpecialAttributes(java.base@9-internal/JarFile.java:1017)
        at java.util.jar.JarFile.isMultiRelease(java.base@9-internal/JarFile.java:399)
        at java.util.jar.JarFile.getEntry(java.base@9-internal/JarFile.java:524)
        at java.util.jar.JarFile.getJarEntry(java.base@9-internal/JarFile.java:480)
at jdk.internal.util.jar.JarIndex.getJarIndex(java.base@9-internal/JarIndex.java:114)
        at jdk.internal.loader.URLClassPath$JarLoader$1.run(java.base@9-internal/URLClassPath.java:640)
        at jdk.internal.loader.URLClassPath$JarLoader$1.run(java.base@9-internal/URLClassPath.java:632)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.URLClassPath$JarLoader.ensureOpen(java.base@9-internal/URLClassPath.java:631)
        at jdk.internal.loader.URLClassPath$JarLoader.<init(java.base@9-internal/URLClassPath.java:606)
        at jdk.internal.loader.URLClassPath$3.run(java.base@9-internal/URLClassPath.java:386)
        at jdk.internal.loader.URLClassPath$3.run(java.base@9-internal/URLClassPath.java:376)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.URLClassPath.getLoader(java.base@9-internal/URLClassPath.java:375)
        at jdk.internal.loader.URLClassPath.getLoader(java.base@9-internal/URLClassPath.java:352)
        at jdk.internal.loader.URLClassPath.getResource(java.base@9-internal/URLClassPath.java:218)
        at jdk.internal.loader.BuiltinClassLoader$3.run(java.base@9-internal/BuiltinClassLoader.java:463)
        at jdk.internal.loader.BuiltinClassLoader$3.run(java.base@9-internal/BuiltinClassLoader.java:460)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(java.base@9-internal/BuiltinClassLoader.java:459)
        at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@9-internal/BuiltinClassLoader.java:406)
        at jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@9-internal/BuiltinClassLoader.java:364)
        at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@9-internal/ClassLoaders.java:184)
        at java.lang.ClassLoader.loadClass(java.base@9-internal/ClassLoader.java:419)
        at sun.launcher.LauncherHelper.loadMainClass(java.base@9-internal/LauncherHelper.java:585)
        at sun.launcher.LauncherHelper.checkAndLoadMain(java.base@9-internal/LauncherHelper.java:497)

How do you solve this?

How Do You Install Apache Spark on Any Type of Linux?

Problem scenario
You want a generic script that can install open source Apache Spark on Debian/Ubuntu, CentOS/RedHat/Fedora or SUSE distributions of Linux.  How do you do this?

Solution
1.  Create a script such as this in /tmp/ (e.g., /tmp/spark.sh).

#!/bin/bash
# Written by www.continualintegration.com

sparkversion=2.2.1  # Change this version as necessary

distro=$(cat /etc/*-release | grep NAME)

debflag=$(echo $distro | grep -i “ubuntu”)
if [ -z “$debflag” ]
then  

How Do You Install Apache Solr on Any Type of Linux?

Problem scenario
You want a generic script that can install open source Apache Solr on Debian/Ubuntu, CentOS/RedHat/Fedora or SUSE distributions of Linux.  How do you do this with the same bash script?

Solution
1.  Create a script such as this in /tmp/ (e.g., /tmp/solr.sh).

#!/bin/bash
# Written by www.continualintegration.com

solrversion=7.2.1  # Change this version as necessary

distro=$(cat /etc/*-release | grep NAME)

debflag=$(echo $distro | grep -i “ubuntu”)
if [ -z “$debflag” ]
then  

How Do You Troubleshoot an Empty Multi-node Hadoop Cluster?

Problem scenario
One or more of the following is happening:
   1) There are 0 DataNodes in your Hadoop cluster according to an error message
   2) There is 0 B configured as capacity (as shown from a “hdfs dfsadmin -report” command).
   3)  There is one fewer DataNode in your Hadoop cluster than you expect.
   4)  You run “hdfs dfsadmin -report | grep Hostname” and do not see a node that has its DataNode service (as seen with the 

What Are the Different Acronym Stacks in I.T.?

Question
What are the different acronym stacks in I.T.?

Answer
There are many open source combinations of technologies that are in wide use.  These acronyms referred to as “full stacks” or “stacks” appear in articles and job descriptions.  A full stack is a bundle of software that (includes an OS and) can create a complete and functional product when properly configured. 

How Do You Install Hadoop with a Script for Any Type of Linux Server?

Updated on 1/6/21

Problem scenario
You want to install open source Hadoop.  You may want a single-node or multi-node deployment with CentOS/RedHat/Fedora, Debian/Ubuntu, and/or SUSE Linux distributions.  You want to have most of it scripted and have the same script work on any variety of Linux.  How do you install Hadoop quickly with a script that works on almost any type of Linux?

Solution
1. 

How Do You Install Apache Parquet?

Problem scenario
You want to install Apache Parquet on the Hadoop namenode.  What do you do?

Solution
Prerequisite
This assumes that you have installed Hadoop.  For directions, see this posting.

Procedure
Run these commands:

sudo su –
apt-get -y install pip
pip install thriftpy
pip install snappy
exit

sudo apt-get -y install libsnappy-dev thrift-compiler

curl https://pypi.python.org/packages/74/b5/bc459aab0566fc3cf3397467922c37411ab6e3361bab9e0ca165e1089ce8/parquet-1.2.tar.gz#md5=05aacec0620ac63ecd7dd77bf7fb9fee >