How Do You Install Apache Parquet?

Problem scenario
You want to install Apache Parquet on the Hadoop namenode.  What do you do?

Solution
Prerequisite
This assumes that you have installed Hadoop.  For directions, see this posting.

Procedure
Run these commands:

sudo su -
apt-get -y install pip
pip install thriftpy
pip install snappy
exit

sudo apt-get -y install libsnappy-dev thrift-compiler

curl https://pypi.python.org/packages/74/b5/bc459aab0566fc3cf3397467922c37411ab6e3361bab9e0ca165e1089ce8/parquet-1.2.tar.gz#md5=05aacec0620ac63ecd7dd77bf7fb9fee > /tmp/parquet-1.2.tar.gz
sudo cp /tmp/parquet-1.2.tar.gz /opt/
cd /opt
sudo tar -xvf parquet-1.2.tar.gz
cd parquet-1.2
sudo python setup.py build
sudo python setup.py install

Leave a comment

Your email address will not be published.