Installing Zabbix cluster in Azure Cloud on CentOS VMs

Z for Zabbix

As we all know, monitoring is a huge part of operations and, in my case, DevOps activities. I would say it's one of the elephants on which all software maintenance rests.

Back on the ground: Azure Cloud is a nice platform, but it still doesn't have a "real" monitoring solution, neither for VMs nor for any of its other resources. You could say it has OMS and Azure Monitor, but I've encountered so many limitations and issues with them... Well, my point is that you can build your own monitoring system (reinvent the bicycle, if you like) to match your environments and needs.

Today I'll try to introduce my solution, which is highly available, free and cloud-based.

Highly available Zabbix architecture in Azure

A monitoring system should be highly available, because it is the thing that tells you your environment is down.

Here is the diagram. Notice that I'm using an Internal Load Balancer as the endpoint for VMs. That's because of how VNets are implemented in Azure: the IP address of the Load Balancer must be equal to the Zabbix cluster IP address.

The Zabbix cluster will operate in active-passive mode and will be based on the following packages: pacemaker, corosync and pcs (which provides the pcsd service).

[Diagram: highly available Zabbix architecture in Azure]

Also note the External Load Balancer: it will route requests from your office network to the Zabbix front end.
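To make the Internal Load Balancer part more concrete, here is a minimal sketch using Azure CLI. Everything in it beyond the 10.1.1.200 address and port 10051 is an assumption for illustration: the resource group, VNet, subnet and resource names are hypothetical, and the TCP health probe on port 10051 (so that only the node currently running Zabbix Server receives traffic) is one plausible setup, not necessarily how my environment was built. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: internal load balancer whose frontend IP equals the cluster IP.
# All names below (RG, VNET, SUBNET, LB, zabbix-*) are hypothetical placeholders.
RG="zabbix-rg"
VNET="zabbix-vnet"
SUBNET="zabbix-subnet"
LB="zabbix-ilb"
CLUSTER_IP="10.1.1.200"   # must match the pacemaker virtual IP
ZBX_PORT=10051            # Zabbix Server port

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_ilb() {
  # Load balancer with a static private frontend IP inside the VNet.
  run az network lb create --resource-group "$RG" --name "$LB" \
      --vnet-name "$VNET" --subnet "$SUBNET" \
      --private-ip-address "$CLUSTER_IP" \
      --frontend-ip-name zabbix-frontend --backend-pool-name zabbix-backend
  # Assumption: probe the Zabbix Server port, which is open only on the active node.
  run az network lb probe create --resource-group "$RG" --lb-name "$LB" \
      --name zabbix-probe --protocol Tcp --port "$ZBX_PORT"
  # Forward the Zabbix Server port to whichever backend passes the probe.
  run az network lb rule create --resource-group "$RG" --lb-name "$LB" \
      --name zabbix-rule --protocol Tcp \
      --frontend-port "$ZBX_PORT" --backend-port "$ZBX_PORT" \
      --frontend-ip-name zabbix-frontend --backend-pool-name zabbix-backend \
      --probe-name zabbix-probe
}

create_ilb
```

Both VMs still have to be added to the backend pool; that step depends on how the NICs were created, so it is left out here.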

Last but not least, the database. I'm using Azure Database for MySQL as the back end. It's quite flexible in terms of performance, and the price is affordable.

Configuring Zabbix cluster

Assuming you've set up your VMs and the other resources, let's move on to the configuration of our servers. Remember, we're deploying CentOS images.

The following script should be run on both servers. This article was actually prepared at the end of 2017, so package versions may have changed by the time you read it. I suggest you run the script line by line to catch any problems early. Also, I assume that a set of config files for Apache and Zabbix has been placed in your user's home directory, so I'll put sample contents of those configs in the comments.

PLEASE READ COMMENTS BELOW !!!

 

###
### Create users
###
# Users:
#   - zabbix        To run Zabbix Server
#   - zabbixagent   To run Zabbix Agent on CentOS server
groupadd zabbix
useradd -m -s /bin/bash -g zabbix zabbixagent
useradd -m -s /bin/bash -g zabbix zabbix

###
### Disable SELINUX
###
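# SELinux is a security subsystem that complicates this cluster setup, so it is disabled here.
# Note: setenforce 0 only switches to permissive mode; the config change takes effect after the next reboot.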
sed -i 's/^SELINUX=enforcing.*/SELINUX=disabled/' /etc/sysconfig/selinux
sed -i 's/^SELINUX=enforcing.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0

###
### Update all packages prior to initial installation.
###
# This must be done ONLY ONCE, right after server deployment. All other updates MUST be performed manually and granularly.
yum update -y

###
###  Install mandatory packages
###
# Download and import repository to be able to install "python-ldap", "python-pyzabbix" and "python-docopt" packages.
# These packages are prerequisites for "zabbix-ldap-sync".
wget http://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-7-11.noarch.rpm
rpm -Uvh epel-release-7-11.noarch.rpm
yum install python-ldap python-docopt python-pyzabbix -y

# Install other prerequisites and tools.
yum install httpd mc policycoreutils-python telnet php php-cli php-common php-devel php-pear php-gd php-mbstring php-mysql php-xml mod_ssl openssl git swaks -y

# Enable and configure auto-start Apache web-server.
systemctl enable httpd
systemctl start httpd

# Install PowerShell and Azure CLI 2.0
curl https://packages.microsoft.com/config/rhel/7/prod.repo | sudo tee /etc/yum.repos.d/microsoft.repo
yum install -y powershell
rpm --import https://packages.microsoft.com/keys/microsoft.asc
sh -c 'echo -e "[azure-cli]\nname=Azure CLI\nbaseurl=https://packages.microsoft.com/yumrepos/azure-cli\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/azure-cli.repo'
yum check-update
yum install azure-cli -y

###
### Install ODBC support and drivers
###
yum install -y unixODBC unixODBC-devel mssql-tools.x86_64

###
### Configure Apache
###
# Modify PHP settings
cp /etc/php.ini /etc/php.ini_initial
sed -i 's/^max_execution_time.*/max_execution_time=600/' /etc/php.ini
sed -i 's/^max_input_time.*/max_input_time=600/' /etc/php.ini
sed -i 's/^memory_limit.*/memory_limit=256M/' /etc/php.ini
sed -i 's/^post_max_size.*/post_max_size=32M/' /etc/php.ini
sed -i 's/^upload_max_filesize.*/upload_max_filesize=16M/' /etc/php.ini
sed -i "s/^\;date.timezone.*/date.timezone=\'Europe\/London\'/" /etc/php.ini

###
### Install Zabbix
###
# !!! Check that the URL of the Zabbix repository is still correct !!!
# This is important because later we will need to import the database script.
rpm --import http://repo.zabbix.com/RPM-GPG-KEY-ZABBIX
rpm -ivh http://repo.zabbix.com/zabbix/3.4/rhel/7/x86_64/zabbix-release-3.4-3.el7.centos.noarch.rpm
yum install zabbix-server-mysql zabbix-web-mysql zabbix-agent zabbix-get zabbix-sender zabbix-java-gateway -y

###
### Configure Zabbix Server
###
# Modify settings of Zabbix frontend to connect to database. Here is an example of its contents.
#    <?php
#    // Zabbix GUI configuration file.
#    global $DB;
#
#    $DB['TYPE']     = 'MYSQL';
#    $DB['SERVER']   = 'mysqlservername.mysql.database.azure.com';
#    $DB['PORT']     = '0';
#    $DB['DATABASE'] = 'zabbixdb';
#    $DB['USER']     = 'zabbixuser@mysqlservername';
#    $DB['PASSWORD'] = '******';
#
#    // Schema name. Used for IBM DB2 and PostgreSQL.
#    $DB['SCHEMA'] = '';
#
#    $ZBX_SERVER      = '10.1.1.200';
#    $ZBX_SERVER_PORT = '10051';
#    $ZBX_SERVER_NAME = 'Zabbix Server Name';
#
#    $IMAGE_FORMAT_DEFAULT = IMAGE_FORMAT_PNG;
#
cp /etc/zabbix/web/zabbix.conf.php /etc/zabbix/web/zabbix.conf.php_initial
cp /home/zabbixadmin*/zabbix.conf.php /etc/zabbix/web/zabbix.conf.php

# Modify Zabbix config
# The Zabbix Server config file is huge and varies from one environment to another, so I'll cover just the most important setting.
# ListenIP=0.0.0.0 - this is the default value and should remain as is.
cp /etc/zabbix/zabbix_server.conf /etc/zabbix/zabbix_server.conf_initial
yes | cp /home/zabbixadmin*/zabbix_server.conf /etc/zabbix/zabbix_server.conf

###
### Configure Zabbix Cluster
###
# Modify hosts file for Zabbix Cluster. This will ensure proper name resolution.
# Use the real IPs of the nodes (10.1.1.18 and 10.1.1.19 in this article).
echo "10.1.1.18 zabbixserver1" >> /etc/hosts
echo "10.1.1.19 zabbixserver2" >> /etc/hosts

# Install cluster components
yum install pacemaker pcs -y

# Set password for cluster user
echo "*****" | passwd "hacluster" --stdin

# Enable and start cluster services
systemctl start pcsd.service
systemctl enable pcsd.service
systemctl enable corosync.service
systemctl enable pacemaker.service

# All other cluster configuration must be done on only one of the cluster nodes.
# Script provided in zabbix2.sh.

###
### Configure Zabbix Agent
###
# Configure permissions for Zabbix Agent (zabbixagent user)
# Ensure that the whole zabbix group has permission to put .pid and .log files.
chmod g+rw /var/run/zabbix/
chmod g+rw /var/log/zabbix/

# The two lines above configure folder permissions, but they will be overwritten after the server reboots.
# Below we instruct CentOS to configure such permissions every time the server starts.
rm -f /usr/lib/tmpfiles.d/zabbix-agent.conf
rm -f /usr/lib/tmpfiles.d/zabbix-server.conf
touch /usr/lib/tmpfiles.d/zabbix-agent.conf
touch /usr/lib/tmpfiles.d/zabbix-server.conf
echo "d /run/zabbix 0775 zabbix zabbix - -" >> /usr/lib/tmpfiles.d/zabbix-agent.conf
echo "d /run/zabbix 0775 zabbix zabbix - -" >> /usr/lib/tmpfiles.d/zabbix-server.conf

# Modify Zabbix Agent config
# Note the IPs: 10.1.1.200 - cluster IP address, equal to the IP of the Internal Load Balancer; you will also put this IP into your agents.
# 10.1.1.18 and 10.1.1.19 - real servers' IPs
cp /etc/zabbix/zabbix_agentd.conf /etc/zabbix/zabbix_agentd.conf_initial
sed -i "s/^Server=127\.0\.0\.1$/Server=10.1.1.200,10.1.1.18,10.1.1.19/" /etc/zabbix/zabbix_agentd.conf
sed -i "s/^ServerActive=127\.0\.0\.1$/ServerActive=10.1.1.200/" /etc/zabbix/zabbix_agentd.conf
sed -i "s/^Hostname=Zabbix server$/Hostname=$HOSTNAME/" /etc/zabbix/zabbix_agentd.conf
sed -i "s/^# User=zabbix$/User=zabbixagent/" /etc/zabbix/zabbix_agentd.conf
sed -i "s/^# RefreshActiveChecks=120$/RefreshActiveChecks=60/" /etc/zabbix/zabbix_agentd.conf

# Enable Zabbix Agent service
systemctl enable zabbix-agent
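
The sed expressions for the agent config only match the stock default lines exactly, so it can be worth trying them on a scratch copy before touching the real file. A minimal sketch, assuming GNU sed (as on CentOS) and a hypothetical sample file holding the default values; the helper name patch_agent_conf and the temp file are illustrative only:

```shell
#!/usr/bin/env bash
# Apply the agent substitutions from the script above to any config file.
# patch_agent_conf is a hypothetical helper name, not part of Zabbix.
patch_agent_conf() {
  local conf="$1" host="$2"
  sed -i \
    -e 's/^Server=127\.0\.0\.1$/Server=10.1.1.200,10.1.1.18,10.1.1.19/' \
    -e 's/^ServerActive=127\.0\.0\.1$/ServerActive=10.1.1.200/' \
    -e "s/^Hostname=Zabbix server\$/Hostname=${host}/" \
    -e 's/^# User=zabbix$/User=zabbixagent/' \
    -e 's/^# RefreshActiveChecks=120$/RefreshActiveChecks=60/' \
    "$conf"
}

# Scratch file containing the stock default lines the expressions expect.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Server=127.0.0.1
ServerActive=127.0.0.1
Hostname=Zabbix server
# User=zabbix
# RefreshActiveChecks=120
EOF

# Patch the scratch copy and show the result for review.
patch_agent_conf "$sample" "zabbixserver1"
cat "$sample"
```

If a line in your real config differs from the default (extra whitespace, an already edited value), the matching expression silently does nothing, so check the result with grep before starting the agent.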

After you finish here, don't forget to create the database and its user, and to grant that user permissions. Then import the default schema of the Zabbix Server database.
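
The database step can be sketched roughly as follows. The database and user names are taken from the sample zabbix.conf.php above; the admin login mysqladmin@mysqlservername, the 'change-me' password and the helper name zabbix_db_sql are hypothetical, and the schema path is the one the Zabbix 3.4 packages normally install, so check it on your system. The commands that actually touch the server are commented out on purpose.

```shell
#!/usr/bin/env bash
# Sketch: create the Zabbix database and user on Azure Database for MySQL,
# then import the default schema. Run this from ONE server only.
DB_HOST="mysqlservername.mysql.database.azure.com"
DB_NAME="zabbixdb"
DB_USER="zabbixuser"

# Emit the SQL that creates the database, the user and its permissions.
zabbix_db_sql() {
  local password="$1"
  cat <<EOF
CREATE DATABASE ${DB_NAME} CHARACTER SET utf8 COLLATE utf8_bin;
CREATE USER '${DB_USER}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON ${DB_NAME}.* TO '${DB_USER}'@'%';
EOF
}

# Review the statements first:
zabbix_db_sql 'change-me'

# Then apply them as the server admin (hypothetical login) and import the schema:
# zabbix_db_sql 'change-me' | mysql -h "$DB_HOST" -u 'mysqladmin@mysqlservername' -p
# zcat /usr/share/doc/zabbix-server-mysql*/create.sql.gz | \
#   mysql -h "$DB_HOST" -u "${DB_USER}@mysqlservername" -p "$DB_NAME"
```

The schema import is needed only once, because both cluster nodes share the same database.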

OK, we've run this script on both servers. Now we need to configure the cluster.

###
### This script must be invoked only on ONE server AND ONLY after installing pacemaker and corosync
###
# Set up authentication between nodes
pcs cluster auth zabbixserver1 zabbixserver2 -u hacluster -p '*****'

# Create cluster
pcs cluster setup --name zabbixserver zabbixserver1 zabbixserver2 --force

# Start cluster service on both nodes
pcs cluster start --all

# Disable stonith
pcs property set stonith-enabled=false

# Ignore loss of quorum (a two-node cluster loses quorum as soon as one node fails)
pcs property set no-quorum-policy=ignore

# Enable infinite restart of cluster's services (virtual IP and Zabbix Server) if they're down
pcs property set start-failure-is-fatal=false

# Create cluster virtual IP
pcs resource create cluster_vip ocf:heartbeat:IPaddr2 ip=10.1.1.200

# Create clustered Zabbix service
pcs resource create zabbix_server systemd:zabbix-server op monitor interval=10s

# Ensure that virtual IP and Zabbix service will run on the same node
pcs constraint colocation add zabbix_server cluster_vip

# Ensure that virtual IP will be allocated first and only after that - Zabbix service will be started
pcs constraint order cluster_vip then zabbix_server

# Ensure that Zabbix service will prefer "zabbixserver1"
pcs constraint location cluster_vip prefers zabbixserver1
pcs constraint location zabbix_server prefers zabbixserver1

Now you can reboot both servers. The commands below are useful for diagnostics and operations.

These will start/stop the cluster, and with it all its resources, on all nodes.

pcs cluster start --all
pcs cluster stop --all

This will display the status of the cluster.

pcs status
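
For a quick check of where the Zabbix service currently runs, the status output can be filtered with a small helper. This is only a sketch: active_node is a hypothetical function name, the sample text imitates the resource lines that pcs 0.9 on CentOS 7 prints (the layout differs between pcs versions), and the standby commands at the end are the pcs 0.9 way to force a failover for testing.

```shell
#!/usr/bin/env bash
# Print the node on which a given resource is reported as "Started".
# Reads `pcs status` text from stdin, so it can be tried on saved output.
active_node() {
  awk -v res="$1" '$1 == res {
    for (i = 2; i < NF; i++) if ($i == "Started") print $(i + 1)
  }'
}

# Illustrative sample of the resource section (not captured from a real cluster).
sample_status=' cluster_vip	(ocf::heartbeat:IPaddr2):	Started zabbixserver1
 zabbix_server	(systemd:zabbix-server):	Started zabbixserver1'

printf '%s\n' "$sample_status" | active_node zabbix_server   # prints: zabbixserver1

# On a real node:
# pcs status | active_node zabbix_server
#
# Failover test (pcs 0.9 syntax): move everything away from node 1, then bring it back.
# pcs cluster standby zabbixserver1
# pcs cluster unstandby zabbixserver1
```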

 That's it for now.

 

Tags: linux (en), azure (en), zabbix (en)
