How Do You Troubleshoot the Spark-Shell Error “A JNI error has occurred”?

Problem scenario
You run spark-shell on a Debian-based distribution of Linux (e.g., Ubuntu), but you receive this error:

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 64
        at java.util.jar.JarFile.match(java.base@9-internal/JarFile.java:983)
        at java.util.jar.JarFile.checkForSpecialAttributes(java.base@9-internal/JarFile.java:1017)
        at java.util.jar.JarFile.isMultiRelease(java.base@9-internal/JarFile.java:399)
        at java.util.jar.JarFile.getEntry(java.base@9-internal/JarFile.java:524)
        at java.util.jar.JarFile.getJarEntry(java.base@9-internal/JarFile.java:480)
        at jdk.internal.util.jar.JarIndex.getJarIndex(java.base@9-internal/JarIndex.java:114)
        at jdk.internal.loader.URLClassPath$JarLoader$1.run(java.base@9-internal/URLClassPath.java:640)
        at jdk.internal.loader.URLClassPath$JarLoader$1.run(java.base@9-internal/URLClassPath.java:632)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.URLClassPath$JarLoader.ensureOpen(java.base@9-internal/URLClassPath.java:631)
        at jdk.internal.loader.URLClassPath$JarLoader.<init>(java.base@9-internal/URLClassPath.java:606)
        at jdk.internal.loader.URLClassPath$3.run(java.base@9-internal/URLClassPath.java:386)
        at jdk.internal.loader.URLClassPath$3.run(java.base@9-internal/URLClassPath.java:376)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.URLClassPath.getLoader(java.base@9-internal/URLClassPath.java:375)
        at jdk.internal.loader.URLClassPath.getLoader(java.base@9-internal/URLClassPath.java:352)
        at jdk.internal.loader.URLClassPath.getResource(java.base@9-internal/URLClassPath.java:218)
        at jdk.internal.loader.BuiltinClassLoader$3.run(java.base@9-internal/BuiltinClassLoader.java:463)
        at jdk.internal.loader.BuiltinClassLoader$3.run(java.base@9-internal/BuiltinClassLoader.java:460)
        at java.security.AccessController.doPrivileged(java.base@9-internal/Native Method)
        at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(java.base@9-internal/BuiltinClassLoader.java:459)
        at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@9-internal/BuiltinClassLoader.java:406)
        at jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@9-internal/BuiltinClassLoader.java:364)
        at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@9-internal/ClassLoaders.java:184)
        at java.lang.ClassLoader.loadClass(java.base@9-internal/ClassLoader.java:419)
        at sun.launcher.LauncherHelper.loadMainClass(java.base@9-internal/LauncherHelper.java:585)
        at sun.launcher.LauncherHelper.checkAndLoadMain(java.base@9-internal/LauncherHelper.java:497)

How do you solve this?
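
Possible solution

Every frame in the stack trace references “java.base@9-internal”, an early-access build of Java 9 that shipped as a default JDK on some Ubuntu releases.  Spark 2.x runs on Java 8, not Java 9, so pointing spark-shell at a Java 8 installation typically resolves the error.  A minimal sketch for Ubuntu, assuming the openjdk-8-jdk package and the usual amd64 install path (verify both on your system):

java -version                       # confirm whether "9-internal" is the active JVM
sudo apt-get update
sudo apt-get -y install openjdk-8-jdk
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # assumed path; check with ls /usr/lib/jvm
export PATH=$JAVA_HOME/bin:$PATH

With JAVA_HOME pointing at Java 8, rerun spark-shell.  To make the change permanent, add the two export lines to ~/.bashrc, or select Java 8 with “sudo update-alternatives --config java”.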

How Do You Install Apache Spark on Any Type of Linux?

Problem scenario
You want a generic script that can install open source Apache Spark on Debian/Ubuntu, CentOS/RedHat/Fedora or SUSE distributions of Linux.  How do you do this?

Solution
1.  Create a script such as this in /tmp/ (e.g., /tmp/spark.sh).

#!/bin/bash
# Written by www.continualintegration.com

sparkversion=2.2.1  # Change this version as necessary

distro=$(cat /etc/*-release | grep NAME)

debflag=$(echo "$distro" | grep -i "ubuntu")
if [ -z "$debflag" ]
then
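  # A sketch of how the script could continue (the package names and the
  # download URL below are assumptions; adjust them for your environment):
  suseflag=$(echo "$distro" | grep -i "suse")
  if [ -z "$suseflag" ]
  then
    sudo yum -y install java-1.8.0-openjdk wget     # CentOS/RedHat/Fedora
  else
    sudo zypper -n install java-1_8_0-openjdk wget  # SUSE
  fi
else
  sudo apt-get -y install openjdk-8-jdk wget        # Debian/Ubuntu
fi

# Download and unpack the Spark binaries (the archive URL is illustrative)
cd /tmp
wget https://archive.apache.org/dist/spark/spark-$sparkversion/spark-$sparkversion-bin-hadoop2.7.tgz
tar -xzf spark-$sparkversion-bin-hadoop2.7.tgz -C /opt

2.  Make the script executable and run it (e.g., chmod +x /tmp/spark.sh, then sudo /tmp/spark.sh).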

What Are the Different Acronym Stacks in I.T.?

Question
What are the different acronym stacks in I.T.?

Answer
There are many open source combinations of technologies that are in wide use.  These acronyms, referred to as “full stacks” or simply “stacks”, appear in articles and job descriptions.  A full stack is a bundle of software (often including an OS) that, when properly configured, can deliver a complete and functional product.  LAMP (Linux, Apache, MySQL, and PHP/Perl/Python) is a classic example.

How Do You Connect to Your Apache Spark Deployment in AWS?

Problem scenario
You have recently deployed Apache Spark to AWS.  You see that the EC2 instances were created, but you cannot access them over the web UI (even over ports 4140, 8088, or 50070).  You cannot access the instances via PuTTY, either.  You changed your normal Security Group to allow TCP communication from your workstation’s IP address.  What should you do to connect to your new Spark instance for the first time?
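
Possible solution

One common cause, assuming the cluster was deployed through EMR, is that the master node sits behind its own security group (typically named “ElasticMapReduce-master”) rather than the group you modified; add an inbound rule there allowing TCP port 22 from your workstation’s IP address.  A sketch of a first connection from a Linux workstation (the key file and hostname below are placeholders; PuTTY users would convert the same .pem key to .ppk with PuTTYgen):

ssh -i /path/to/your-key.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com   # "hadoop" is the default EMR user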

How Do You Deploy an Apache Spark Cluster in AWS?

Problem scenario
You want to deploy Apache Spark to AWS.  How do you do this?

Solution
1.  Log into the AWS management console.  Once in, go to the EMR (Elastic MapReduce) service.

2.  Click “Create cluster” and then “Quick Create”.

3.  For “Software Configuration”, choose “Spark:…”

4.  For “Security and access”, for “EC2 key pair”, choose the key pair you desire.
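
If you prefer the command line, the same deployment can be sketched with the AWS CLI (the release label, instance type, instance count, and key pair name below are illustrative values):

aws emr create-cluster \
    --name "Spark cluster" \
    --release-label emr-5.11.0 \
    --applications Name=Spark \
    --ec2-attributes KeyName=your-key-pair \
    --instance-type m4.large \
    --instance-count 3 \
    --use-default-roles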

A List of Apache Spark Books

99 Apache Spark Interview Questions for Professionals by Yogesh Kumar
Advanced Analytics with Spark: Patterns for Learning from Data at Scale by Juliet Hougland, Uri Laserson, Sean Owen, Sandy Ryza and Josh Wills
Apache Spark 2 for Beginners by Rajanarayanan Thottuvaikkatumana
Apache Spark in 24 Hours, Sams Teach Yourself by Jeffrey Aven
Apache Spark for Data Science Cookbook by Padma Priya Chitturi
Apache Spark for Java Developers by Sumit Kumar and Sourav Gulati
Apache Spark Graph Processing by Rindra Ramamonjison
Apache Spark Interview Question &