How to install Vertica Analytic Database
Download Vertica
Download the Vertica RPM from http://www.vertica.com/
For this tutorial we are using Community Edition 6.1.2-0, the latest version at the time of writing. This version has a bug that must be fixed manually for multi-node installs (see Common problems); it is expected to be fixed in later versions.
Server preparation
Before installing Vertica, a few things must be done on the server; some of them are optional (but strongly suggested). All of these steps must be done on every node when installing a multi-node solution.
Required configurations
- Install a Linux OS - we are using CentOS 6.4 for this tutorial
- Check that the server has at least 1 GB of RAM free (the minimum for install; we suggest more memory for normal usage)
[root@vertica01 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          1877        465       1412          0         14        343
-/+ buffers/cache:        106       1770
Swap:         3039          0       3039
- The server needs at least 2 GB of swap
[root@vertica01 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          1877        465       1412          0         14        343
-/+ buffers/cache:        106       1770
Swap:         3039          0       3039
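The two memory checks above can also be scripted. The following is a minimal sketch, not part of the Vertica tooling: check_memory is a hypothetical helper name, and the thresholds (1024 MB free RAM, 2048 MB swap) are simply the figures from this tutorial expressed in MB. It assumes the free -m column layout shown above, where the fourth field of the Mem: row is free memory.

```shell
# check_memory: read `free -m` output on stdin and compare it with the
# minimums used in this tutorial (1024 MB free RAM, 2048 MB swap).
# Hypothetical helper, not shipped with Vertica.
check_memory() {
    awk '
        $1 == "Mem:"  { free_mb = $4 + 0 }   # 4th field: free RAM in MB
        $1 == "Swap:" { swap_mb = $2 + 0 }   # 2nd field: total swap in MB
        END {
            ok = 1
            if (free_mb < 1024) { print "FAIL: free RAM " free_mb " MB"; ok = 0 }
            else                { print "OK: free RAM " free_mb " MB" }
            if (swap_mb < 2048) { print "FAIL: swap " swap_mb " MB"; ok = 0 }
            else                { print "OK: swap " swap_mb " MB" }
            exit (ok ? 0 : 1)
        }'
}

# Typical use on a node:
#   free -m | check_memory
```

The function exits with a non-zero status when either check fails, so it can be used in a loop over all nodes.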
- Disable SELinux.
[root@vertica01 ~]# vi /etc/sysconfig/selinux
SELINUX=disabled
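If you prefer not to edit the file by hand on every node, the change can be scripted. This is a rough sketch under a few assumptions: disable_selinux is a hypothetical helper name, and the file path is passed as a parameter so you can try it on a copy first. On CentOS, /etc/sysconfig/selinux is usually a symlink to /etc/selinux/config, so the sketch writes the result back with cat, which keeps a symlink intact.

```shell
# disable_selinux FILE: rewrite the SELINUX= line of FILE to "disabled".
# Hypothetical helper; SELINUXTYPE= and comment lines are left untouched.
disable_selinux() {
    cfg=$1
    tmp=$(mktemp) || return 1
    # Edit a temporary copy, then write it back through the original path.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Typical use on a node (as root):
#   disable_selinux /etc/selinux/config
# The setting is read at boot; `getenforce` shows the current mode.
```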
- Disable the firewall (for multi-node installs). If your system must have a firewall, ensure that these ports are open and not used by other services:
Port | Protocol | Description
22 | TCP | SSH
5433 | TCP | Vertica Client
5433 | UDP | Vertica Spread
5434 | TCP | Vertica cluster communication
5444 | TCP | Vertica Management Console
5450 | TCP | Vertica Management Console
4803 | TCP/UDP | Spread
4804 | UDP | Spread
4805 | UDP | Spread
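If the firewall has to stay on, the port list above translates into one iptables rule per port. The sketch below only prints the commands so you can review them before running anything; the variable and function names are made up for this example, and inserting ACCEPT rules into the INPUT chain is an assumption about how your firewall is organised.

```shell
# Port list copied from the table above, as "port/protocol" pairs.
VERTICA_PORTS="22/tcp 5433/tcp 5433/udp 5434/tcp 5444/tcp 5450/tcp 4803/tcp 4803/udp 4804/udp 4805/udp"

# print_iptables_rules: print one iptables command per port without running it.
print_iptables_rules() {
    for entry in $VERTICA_PORTS; do
        port=${entry%/*}     # text before the slash
        proto=${entry#*/}    # text after the slash
        echo "iptables -I INPUT -p $proto --dport $port -j ACCEPT"
    done
}

print_iptables_rules
```

On CentOS 6 the rules can be made persistent with service iptables save once they have been applied.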
- Verify that the pam_limits.so module is configured for the su command
[root@vertica01 ~]# vi /etc/pam.d/su
session required pam_limits.so
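The same check can be done without opening an editor. A small sketch, with has_pam_limits as a hypothetical helper name; it only looks for an uncommented session line that mentions pam_limits.so and does not validate the rest of the PAM configuration.

```shell
# has_pam_limits FILE: succeed if FILE contains an uncommented "session"
# line that loads pam_limits.so. Hypothetical helper for this tutorial.
has_pam_limits() {
    grep -Eq '^[[:space:]]*session[[:space:]]+[^#]*pam_limits\.so' "$1"
}

# Typical use on a node (as root), adding the line only when it is missing:
#   has_pam_limits /etc/pam.d/su || echo 'session required pam_limits.so' >> /etc/pam.d/su
```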
- For a multi-node install, add the node names and IPs to the /etc/hosts file for name resolution. Add them even if you’re using a DNS server, for faster resolution. Also add the master’s own host name and IP on the master itself; Vertica does not check which host it is running on
[root@vertica01 ~]# vi /etc/hosts
192.168.122.01 vertica01
192.168.122.02 vertica02
192.168.122.03 vertica03
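Since the same entries go into /etc/hosts on every node, it can help to add them with a small script that is safe to run more than once. This is a sketch only: add_host is a hypothetical helper, and it treats a host name as present if it appears as a name on any uncommented line of the file.

```shell
# add_host FILE IP NAME: append "IP NAME" to FILE unless NAME is already
# listed as a host name on an uncommented line. Hypothetical helper.
add_host() {
    file=$1 ip=$2 name=$3
    if awk -v n="$name" '
           $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
           END { exit (found ? 0 : 1) }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Typical use on each node (as root):
#   add_host /etc/hosts 192.168.122.01 vertica01
#   add_host /etc/hosts 192.168.122.02 vertica02
#   add_host /etc/hosts 192.168.122.03 vertica03
```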
- For a multi-node install, make sure that root and the DB management user (default is dbadmin) can SSH between the nodes without a password. Root SSH access is used only during the install
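Before running the installer it is worth confirming that passwordless SSH really works from the master to every node. The sketch below uses the standard OpenSSH BatchMode option, which makes ssh fail instead of prompting for a password or a host-key confirmation. check_ssh and the SSH_CMD override are names invented for this example; SSH_CMD only exists so the loop can be dry-run.

```shell
# check_ssh HOST...: try a non-interactive login to each host and report
# the ones that would still prompt. Hypothetical helper for this tutorial.
check_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables all prompts; ConnectTimeout avoids long hangs.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "OK: $host"
        else
            echo "FAIL: $host"
            failed=1
        fi
    done
    return $failed
}

# Typical use, once as root and once as dbadmin:
#   check_ssh vertica01 vertica02 vertica03
# Keys are usually created with ssh-keygen and distributed with ssh-copy-id.
```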
Suggested configurations
- Configure and start the NTP service
- Disable CPU frequency scaling in the BIOS
- Set the I/O scheduler to deadline, noop or cfq
[root@vertica01 ~]# vi /boot/grub/grub.conf
Add elevator=<name> to the kernel line
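To see which scheduler a disk is using right now, Linux exposes a scheduler file under /sys/block for each device, with the active scheduler shown in square brackets. A small sketch for reading it; active_scheduler is a hypothetical helper name and sda is only an example device.

```shell
# active_scheduler: read the contents of a /sys/block/<dev>/queue/scheduler
# file on stdin and print the scheduler shown in square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Typical use on a node (sda is an example device name):
#   active_scheduler < /sys/block/sda/queue/scheduler
```

Writing a scheduler name into the same file (for example echo deadline > /sys/block/sda/queue/scheduler, as root) changes it until the next reboot; the elevator= kernel parameter makes the choice permanent.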
Single Node
Once the system is ready for installation, install the RPM you downloaded from the Vertica site:
[root@vertica01 ~]# rpm -ivh vertica-6.1.2-0.x86_64.RHEL5.rpm
The RPM adds a new directory under /opt containing the Vertica installation and management scripts. For a basic install, run the install script with one parameter that names the DB admin user; the script will try to create this user:
[root@vertica01 ~]# /opt/vertica/sbin/install_vertica -u dbadmin
When the script finishes, Vertica is installed on a single node. You can use the admin tools to create a new DB:
[root@vertica01 ~]# /opt/vertica/bin/adminTools
Multiple Nodes
A multi-node installation uses the same script, run on only one server (the master), which then installs Vertica on the rest of the nodes. To get the installation script, install the same RPM on the master:
[root@vertica01 ~]# rpm -ivh vertica-6.1.2-0.x86_64.RHEL5.rpm
The script copies the RPM to the rest of the nodes, so it's important to have passwordless SSH between the nodes for the root user and to list all the nodes in /etc/hosts. Run the installation script with a few more parameters:
-s – comma-separated list of nodes
-r – path to the Vertica installation RPM file, to install on the rest of the nodes
-u – DB admin user with passwordless SSH login between the nodes
-T – point-to-point node communication, used when nodes are not on the same subnet or when nodes are virtual machines
[root@vertica01 ~]# /opt/vertica/sbin/install_vertica -s vertica01,vertica02,vertica03 -r ~/vertica-6.1.2-0.x86_64.RHEL5.rpm -u dbadmin -T
Installation takes longer because it installs on several nodes and runs system and network tests. Once it is done, you can create a new DB and check that it’s working on all the nodes:
[root@vertica01 ~]# /opt/vertica/bin/adminTools
Common problems
Installation bug – installation fails with Error: invalid literal for int() with base 10: '8%'
This is a bug in the CentOS/Red Hat installation, confirmed by Vertica; the current solution is a manual fix. Open /opt/vertica/oss/python/lib/python2.7/site-packages/vertica/network/SSH.py and on line 1982 change
df /tmp | tail -1 | awk '{print $4}'
to
df -P /tmp | tail -1 | awk '{print $4}'
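A likely reason the -P flag matters: when the device name is long, plain df wraps the row onto two lines, so the last line has one field fewer and the fourth field becomes the Use% column, which matches the '8%' in the error. With -P (POSIX format) each filesystem stays on one line. The sketch below reproduces this with sample text rather than a real df call; the device name and numbers are illustrative, not taken from an actual Vertica node.

```shell
# Sample df output for a long device name: the row wraps onto two lines.
wrapped='Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root
                      51606140   3783896  45200804   8% /'

# The same data in POSIX format (df -P): one line per filesystem.
posix='Filesystem         1024-blocks      Used Available Capacity Mounted on
/dev/mapper/VolGroup-lv_root  51606140   3783896  45200804       8% /'

# Same pipeline as in SSH.py, minus the df call (tail -n 1 equals tail -1).
fourth_field() { tail -n 1 | awk '{print $4}'; }

printf '%s\n' "$wrapped" | fourth_field   # prints 8%
printf '%s\n' "$posix"   | fourth_field   # prints 45200804
```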
Network tests fail without an error:
Installation will not succeed if SSH fails between the nodes. If you have never connected over SSH from vertica01 to vertica02, SSH will ask to add the host's fingerprint to the known_hosts file, but the installation script can't answer that prompt on its own. There are two workarounds for this:
- Manually SSH between all the nodes once before installing
- Add or change StrictHostKeyChecking no in /etc/ssh/ssh_config on all the nodes; this stops SSH from checking server fingerprints against known_hosts. You can read more about it here: http://linux.die.net/man/5/ssh_config
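The second workaround can be applied with a short script instead of editing the file on each node. This is a sketch under stated assumptions: set_no_strict_host_key is a hypothetical helper, it only recognises the keyword in its usual spelling, and it takes the file path as a parameter so it can be tried on a copy first.

```shell
# set_no_strict_host_key FILE: make sure FILE has an active
# "StrictHostKeyChecking no" line, replacing an existing uncommented
# setting or appending one. Hypothetical helper for this tutorial.
set_no_strict_host_key() {
    cfg=$1
    pattern='^[[:space:]]*StrictHostKeyChecking[[:space:]=].*'
    if grep -q "$pattern" "$cfg"; then
        tmp=$(mktemp) || return 1
        # Rewrite the existing setting in a temporary copy, then write it back.
        sed "s/$pattern/StrictHostKeyChecking no/" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
        rc=$?
        rm -f "$tmp"
        return $rc
    fi
    printf 'StrictHostKeyChecking no\n' >> "$cfg"
}

# Typical use on each node (as root):
#   set_no_strict_host_key /etc/ssh/ssh_config
```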
Provided by: ForthScale systems, scalable infrastructure experts