Setup, problem and solution design.
The purpose of log aggregation is to provide a single point of access to server data (in our case, logs from nginx web servers).
We had a lot of web servers writing out huge amounts of log data and no real way to understand what was going on across them. Our initial solution was to have each system write a local log file, with a Munin agent running a custom Perl parser that transferred the data to a Munin server, where it was displayed as an RRDtool graph. It worked, but the servers generated so much log data that parsing it close to real time became impossible, forcing us to drop a significant amount of data.
After some internet research, and due to budget constraints, we decided to go with open source tools only. Those tools still had to handle high volume and high load, scale out, and support big data workloads.
We decided to set up a dedicated loghost, ship all the data to it, and parse it there into the results we needed. Our proposed solution also took future log indexing into account, for both technical and BI searchability and readability.
The proposed solution consisted of the following tools (a sketch of how they fit together follows the list):
Logstash - a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (for example, for searching).
Elasticsearch - a distributed, RESTful search engine.
Graylog2 - software to run analytics, alerting, monitoring and powerful searches over your whole log data.
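To make the pipeline concrete, below is a minimal Logstash configuration sketch along these lines: a file input tailing the nginx access log, a grok filter parsing each line, and outputs shipping the parsed events to Elasticsearch and to Graylog2 over GELF. The paths and hostnames are placeholders rather than our actual setup, and option names vary between Logstash versions, so treat this as an illustration:

input {
  file {
    # tail the local nginx access log
    path => "/var/log/nginx/access.log"
    type => "nginx-access"
  }
}

filter {
  grok {
    # nginx's default "combined" log format matches the Apache combined pattern
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}

output {
  # index the parsed events for searching
  elasticsearch {
    hosts => ["loghost.example.com:9200"]
  }
  # forward the same events to Graylog2 as GELF messages
  gelf {
    host => "loghost.example.com"
  }
}

In practice the input can run as a lightweight shipper on each web server while the filtering and outputs run on the dedicated loghost, which keeps the parsing load off the web servers.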
As a bonus, Logstash gave us the ability to export events to a monitoring system or to support-shift management, as sketched below.
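For instance, the COMBINEDAPACHELOG pattern used above extracts the HTTP status code into a "response" field, so a conditional statsd output could push nginx error counts toward a monitoring system. The monitoring hostname and metric name here are invented for illustration:

output {
  # count 5xx responses toward a monitoring system;
  # hostname and metric name are illustrative placeholders
  if [response] =~ /^5\d\d/ {
    statsd {
      host => "monitoring.example.com"
      increment => ["nginx.errors.5xx"]
    }
  }
}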
Next: implementation of log aggregation
Munin monitoring tool
Logstash
Elasticsearch
Graylog 2
Provided by: ForthScale systems, scalable infrastructure experts