Problem scenario
You want to install Apache Parquet on the Hadoop namenode. What do you do?
Solution
Prerequisite
This assumes that you have installed Hadoop. For directions, see this posting.
Procedure
Run these commands:
sudo su -
apt-get -y install pip
pip install thriftpy
pip install snappy
exit
sudo apt-get -y install libsnappy-dev thrift-compiler
curl https://pypi.python.org/packages/74/b5/bc459aab0566fc3cf3397467922c37411ab6e3361bab9e0ca165e1089ce8/parquet-1.2.tar.gz#md5=05aacec0620ac63ecd7dd77bf7fb9fee > /tmp/parquet-1.2.tar.gz
sudo cp /tmp/parquet-1.2.tar.gz /opt/
cd /opt
sudo tar -xvf parquet-1.2.tar.gz
cd parquet-1.2
sudo python setup.py build
sudo python setup.py install