Saturday, August 10, 2013

Vertica installation tutorial

        

How to install Vertica Analytic Database

Download Vertica

Download the Vertica RPM from http://www.vertica.com/
For this tutorial we are using Community Edition 6.1.2-0, the latest version at the time of writing. This version has a bug that must be fixed manually for multi-node installs (see Common problems); the fix is expected in a later version.

Server preparation

Before installing Vertica, a few things must be done on the server, and some others are optional (but strongly suggested). All of these steps must be done on every node when installing a multi-node solution.

Must configurations

  • Install a Linux OS - we are using CentOS 6.4 for this tutorial
  • Check that the server has at least 1 GB of free RAM (the minimum for installation; we suggest more memory for normal usage)
[root@vertica01 ~]# free -m
            total       used       free     shared    buffers     cached
Mem:          1877        465       1412          0         14        343
-/+ buffers/cache:        106       1770
Swap:         3039          0       3039
  • The server needs at least 2 GB of swap
[root@vertica01 ~]# free -m
            total       used       free     shared    buffers     cached
Mem:          1877        465       1412          0         14        343
-/+ buffers/cache:        106       1770
Swap:         3039          0       3039
  • Disable SELinux (the change takes effect after a reboot).
[root@vertica01 ~]# vi /etc/sysconfig/selinux
SELINUX=disabled
  • Disable the firewall (for multi-node installs). If your system must have a firewall, ensure that these ports are open and not used by other services:

    Port    Protocol    Description
    22      TCP         SSH
    5433    TCP         Vertica Client
    5433    UDP         Vertica Spread
    5434    TCP         Vertica cluster communication
    5444    TCP         Vertica Management Console
    5450    TCP         Vertica Management Console
    4803    TCP/UDP     Spread
    4804    UDP         Spread
    4805    UDP         Spread

  • Verify that the pam_limits.so module is configured for the su command
[root@vertica01 ~]# vi /etc/pam.d/su
session required pam_limits.so

  • For multi-node installs, add the node names and IPs to the /etc/hosts file for name resolution. Add them even if you’re using a DNS server, for faster resolution. Also add the master's own host name and IP on the master itself; Vertica does not check which host it’s running on
[root@vertica01 ~]# vi /etc/hosts
192.168.122.1   vertica01
192.168.122.2   vertica02
192.168.122.3   vertica03


  • For multi-node installs, make sure that root and the DB management user (default is dbadmin) can SSH between the nodes without a password. Root SSH access is used only during installation
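
Several of the required settings above can be checked with a short script before starting. The following is a minimal sketch, not part of Vertica: the function names are hypothetical, each function accepts an optional file path so it can be tried against a copy, and free RAM is read from the MemFree line of /proc/meminfo, which corresponds to the "free" column of free -m.

```shell
#!/bin/sh
# Pre-install checks sketched from the list above. The function names and
# the optional file arguments are hypothetical; they are not Vertica tools.

# Print the value of a /proc/meminfo key (e.g. MemFree) in whole megabytes.
meminfo_mb() {
    key=$1
    file=${2:-/proc/meminfo}
    awk -v k="$key:" '$1 == k { printf "%d\n", $2 / 1024 }' "$file"
}

# Succeed when free RAM is at least 1 GB and swap is at least 2 GB.
check_memory() {
    file=${1:-/proc/meminfo}
    [ "$(meminfo_mb MemFree "$file")" -ge 1024 ] &&
        [ "$(meminfo_mb SwapTotal "$file")" -ge 2048 ]
}

# Print the SELINUX= setting from the SELinux config file.
selinux_mode() {
    file=${1:-/etc/sysconfig/selinux}
    sed -n 's/^SELINUX=//p' "$file"
}

# Succeed when pam_limits.so is enabled (not commented out) for su.
pam_limits_enabled() {
    file=${1:-/etc/pam.d/su}
    grep -Eq '^[[:space:]]*session[[:space:]]+required[[:space:]]+pam_limits\.so' "$file"
}
```

After sourcing the file on a node, something like check_memory && echo "memory ok", selinux_mode, and pam_limits_enabled && echo "pam_limits ok" gives a quick pass over each item.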

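If the firewall has to stay enabled, the port table in the list above can be turned into iptables rules. This is only a sketch: the function name is hypothetical, the script prints the commands instead of running them, and inserting at the top of the INPUT chain is an assumption about the local rule set.

```shell
#!/bin/sh
# Print iptables commands that open the ports from the table above.
# Nothing is executed; review the output, then run it as root.
# The function name is hypothetical, and inserting at the top of the
# INPUT chain is an assumption about the local rule set.
vertica_iptables_rules() {
    for rule in tcp:22 tcp:5433 udp:5433 tcp:5434 tcp:5444 tcp:5450 \
                tcp:4803 udp:4803 udp:4804 udp:4805; do
        proto=${rule%%:*}   # part before the colon
        port=${rule##*:}    # part after the colon
        echo "iptables -I INPUT -p $proto --dport $port -j ACCEPT"
    done
}

vertica_iptables_rules
```

On CentOS 6 the rules would normally be made persistent afterwards with service iptables save.
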
Suggested configurations

  • Configure and start the NTP service
  • Disable CPU frequency scaling in the BIOS
  • Set the I/O scheduler to deadline, noop or cfq
[root@vertica01 ~]# vi /boot/grub/grub.conf
Add elevator=<name> to the kernel line
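
To see which I/O scheduler is currently active, the kernel exposes it per block device under /sys/block, with the active one shown in square brackets. The sketch below only reads those files; the function name is hypothetical.

```shell
#!/bin/sh
# Extract the active scheduler from a line such as
# "noop anticipatory deadline [cfq]"; the kernel marks it with brackets.
# The function name is hypothetical.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Report the active scheduler of every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}   # strip the leading path
    dev=${dev%%/*}         # keep only the device name
    echo "$dev: $(active_scheduler "$(cat "$f")")"
done
```

The scheduler can also be switched at runtime by writing its name into the same file as root (for example, echo deadline into /sys/block/sda/queue/scheduler, where sda is an assumed device name), but only the elevator= kernel option survives a reboot.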

Single Node

Once the system is ready for installation, install the RPM you downloaded from the Vertica site:
[root@vertica01 ~]# rpm -ivh vertica-6.1.2-0.x86_64.RHEL5.rpm
The RPM adds a new directory under /opt containing the Vertica installation and management scripts. For a basic install, run the install script with one parameter, -u, which names the DB admin user; the script will try to create this user:
[root@vertica01 ~]# /opt/vertica/sbin/install_vertica -u dbadmin
When the script finishes, Vertica is installed on a single node. You can use the admin tools to create a new DB:
[root@vertica01 ~]# /opt/vertica/bin/adminTools

Multiple Nodes

A multi-node installation uses the same script. It is run on one server only (the master), which then installs Vertica on the rest of the nodes. To get the installation script, install the same RPM on the master:
[root@vertica01 ~]# rpm -ivh vertica-6.1.2-0.x86_64.RHEL5.rpm
The script will copy the RPM to the rest of the nodes, so it's important to have passwordless SSH between the nodes for the root user and to list all the nodes in /etc/hosts. Run the installation script with a few more parameters:
-s     – comma-separated list of nodes
-r     – path to the Vertica installation RPM file, to install on the rest of the nodes
-u     – DB admin user with passwordless SSH login between the nodes
-T     – point-to-point node communication, used when the nodes are not on the same subnet or when the nodes are virtual machines
[root@vertica01 ~]# /opt/vertica/sbin/install_vertica -s vertica01,vertica02,vertica03 -r ~/vertica-6.1.2-0.x86_64.RHEL5.rpm -u dbadmin -T
The installation takes longer because it installs on several nodes and runs system and network tests. Once it is done, you can create a new DB and check that it’s working on all the nodes:
[root@vertica01 ~]# /opt/vertica/bin/adminTools
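
Because the installer depends on passwordless SSH to every node, it can help to test that before running it. The sketch below is an assumption about how one might do this, with hypothetical function names: BatchMode makes ssh fail instead of prompting for a password or a fingerprint, so any node it prints still needs attention.

```shell
#!/bin/sh
# Check SSH access to the nodes before running install_vertica.
# The function names are hypothetical; they are not Vertica tools.

# Join the arguments with commas, the form expected by the -s parameter.
# The body runs in a subshell so the IFS change does not leak out.
join_nodes() (
    IFS=,
    printf '%s\n' "$*"
)

# Print every node that cannot be reached without a prompt.
unreachable_nodes() {
    for node in "$@"; do
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null ||
            echo "$node"
    done
}
```

Run as root on the master, unreachable_nodes vertica01 vertica02 vertica03 should print nothing; join_nodes vertica01 vertica02 vertica03 then produces the value for -s.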

Common problems

Installation bug – installation fails with Error: invalid literal for int() with base 10: '8%'
This is a bug in the CentOS/Red Hat installation, confirmed by Vertica; the current solution is a manual fix. Open /opt/vertica/oss/python/lib/python2.7/site-packages/vertica/network/SSH.py and, on line 1982, change
df /tmp | tail -1 | awk '{print $4}'
to
df -P /tmp | tail -1 | awk '{print $4}'
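
The reason -P helps is that plain df wraps a long device name onto its own line, so the last line loses a field and the fourth column becomes the Use% value. The sketch below reproduces this with made-up sample output (the device name and numbers are illustrative, not taken from a real node) and adds a hypothetical helper, patch_df_call, that applies the same one-line change with sed.

```shell
#!/bin/sh
# Without -P, df wraps a long device name onto its own line, so the last
# line has one field fewer and $4 becomes the Use% column.
# The device name and numbers below are made-up sample data.
wrapped='Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_vertica01-lv_root
                      17938864   1310300  15717304   8% /'
posix='Filesystem                       1024-blocks    Used Available Capacity Mounted on
/dev/mapper/vg_vertica01-lv_root    17938864 1310300  15717304       8% /'

echo "$wrapped" | tail -1 | awk '{print $4}'   # prints 8%
echo "$posix"   | tail -1 | awk '{print $4}'   # prints 15717304

# Patch the df call in the given file in place; sed keeps a .bak backup.
# patch_df_call is a hypothetical helper, not a Vertica tool.
patch_df_call() {
    sed -i.bak 's|df /tmp |df -P /tmp |' "$1"
}
```

Calling patch_df_call with the SSH.py path given above would make the edit without opening an editor; running it a second time changes nothing, because the patched line no longer matches.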

Network tests fail without an error:
The installation will not succeed if SSH fails between the nodes. If you have never connected by SSH from vertica01 to vertica02, SSH will ask to add the host's fingerprint to the known_hosts file, and the installation script can't answer that prompt on its own. There are two workarounds for this:
  • SSH manually between all the nodes once before installing
  • Add or change StrictHostKeyChecking no in /etc/ssh/ssh_config on all the nodes; this makes SSH add new host keys to known_hosts automatically instead of asking. You can read more about it here: http://linux.die.net/man/5/ssh_config
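
A third option, not described in the original steps, is to record the host keys up front with ssh-keyscan so that the first connection has nothing to ask. This is a sketch with hypothetical function names; like StrictHostKeyChecking no, it accepts whatever key the node presents without verifying it.

```shell
#!/bin/sh
# Record each node's host key up front so the first SSH connection does
# not stop at the fingerprint prompt. The function names are hypothetical.

# Print the host names from a hosts-style file, skipping comments,
# blank lines, and localhost entries.
node_names() {
    awk '$1 !~ /^#/ && NF >= 2 && $2 !~ /^localhost/ { print $2 }' "${1:-/etc/hosts}"
}

# Append the hashed host keys of the given nodes to a known_hosts file.
add_host_keys() {
    known=$1
    shift
    for node in "$@"; do
        ssh-keyscan -H "$node" >> "$known" 2>/dev/null
    done
}
```

For example, add_host_keys ~/.ssh/known_hosts $(node_names) could be run on each node, once as root and once as the DB admin user.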




Provided by: ForthScale systems, scalable infrastructure experts

1 comment:

Lalit Jha said...

Thanks....this helped me a lot