Centreon “log” table getting insanely huge

Posted on 2016/12/08 by Le_Vert

Hi there,

I’m currently migrating some old Centreons 2.5/2.6 with Nagios/NDO to Centreon 2.7 with Centreon-Engine/Centreon-broker but I’m experiencing some issues with insanely large MySQL tables to migrate:

root@server:~# ls -lah /var/lib/mysql/centreon_storage/log.*
-rw-rw---- 1 mysql mysql  13K Apr 15  2015 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  16G Dec  8 09:18 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 6.0G Dec  8 09:18 /var/lib/mysql/centreon_storage/log.MYI

This table contains old Nagios logs and, according to a forum post, it’s used when clicking on Monitoring > Event logs and when doing reporting actions.
Fair enough, but I don’t care about what happened last year anyway: reporting is done on a monthly basis.

So let’s see what the oldest entry is:

root@server:~# echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage
2015-06-14 19:19:00

Sadly, ctime is stored as a Unix timestamp and not in MySQL datetime format, so we have to convert it with FROM_UNIXTIME() to get something human-readable.
To be honest, when I started the cleanup the oldest entry was even older.
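
The same conversion can be done from the shell, which is handy for double-checking a cutoff before putting it in a WHERE clause. This is only a sketch: it assumes GNU date and a server running in UTC (FROM_UNIXTIME() uses the MySQL session time zone), and the function names are made up for the example.

```shell
#!/bin/bash
# Convert a Unix timestamp to a readable UTC date (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Convert a "YYYY-MM-DD HH:MM:SS" UTC date back to a Unix timestamp.
utc_to_epoch() {
    date -u -d "$1" '+%s'
}

# Oldest ctime seen in the table, then the cutoff used later for the cleanup
epoch_to_utc 1434309540
utc_to_epoch '2016-06-08 00:00:00'
```

On a UTC server the first call prints 2015-06-14 19:19:00, which matches what MySQL returned.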

I’m not sure whether Centreon is supposed to clean this out. I guess it is, probably through one of the various cron jobs installed by Centreon, but in my experience this is highly borked and can easily leave old entries behind.

Let’s validate that we’re not going to delete the wrong entries by running a SELECT first:

root@server:~# echo 'SELECT FROM_UNIXTIME(ctime), ctime, output FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 5' | mysql -N centreon_storage
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached.  Nudging server1:traffic_eth0 by 11 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached.  Nudging server1:Ping by 7 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached.  Nudging server2:Memory by 12 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached.  Nudging server3:Processor by 6 seconds...
2015-06-14 19:19:01	1434309541	Max concurrent service checks (200) has been reached.  Nudging server3:Memory by 10 seconds...

Looks okay. Be sure to compare “ctime” with the converted date and play with the WHERE condition so you can be sure it’s really working properly.
For instance, if you swap “2016-06-08 00:00:00” with “2015-06-14 19:19:01”, the last line should disappear, since the comparison is strictly “less than”.

Once you’ve confirmed you’re not deleting anything useful, go ahead with a DELETE statement:

root@server:~# time echo 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 1000000' | mysql -N centreon_storage

real	0m51.884s
user	0m0.000s
sys	0m0.008s

I decided to use LIMIT here to avoid loading the server too much for an unknown amount of time. The “time” command is there to measure how long it takes to delete 1 000 000 entries (52s here).
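
With that measurement you can get a rough idea of the total duration. The sketch below assumes the time per batch stays roughly constant, which is not guaranteed, and the row count is a made-up figure: get the real one with a SELECT COUNT(*) using the same WHERE condition. The function name is hypothetical too.

```shell
#!/bin/bash
# Estimate the total purge time in seconds.
# $1 = rows to delete, $2 = rows per batch, $3 = seconds measured for one batch
estimate_purge_seconds() {
    local rows=$1 batch=$2 secs=$3
    # Round up: a partial last batch still costs one run
    echo $(( (rows + batch - 1) / batch * secs ))
}

# Hypothetical figure: 10 000 000 rows older than the cutoff,
# deleted 1 000 000 at a time at 52 seconds per batch.
estimate_purge_seconds 10000000 1000000 52
```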

You can now recheck the oldest log entry:

root@server:~# echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage
2015-06-19 21:29:54

It seems it’ll be a long way to go before getting to June 2016 😉

Bonus:
All in one command, so you just have to check your terminal when coming back from the coffee machine to see its progress:

root@server:~# while true; do echo 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 100000' | mysql -N centreon_storage && echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage && sleep 2; done
2015-06-21 01:47:32
2015-06-21 10:59:55
2015-06-21 19:57:21
2015-06-22 04:58:59
[...]

When the loop keeps outputting the same date, it means DELETE is not removing anything anymore: time to hit Ctrl+C!
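
If you’d rather not babysit the loop, here is a sketch of a variant that stops by itself. It relies on MySQL’s ROW_COUNT(), which has to run in the same session as the DELETE, so both statements go through a single mysql call. The function names and the PURGE_PAUSE variable are made up for this example, and it has not been run against the table above.

```shell
#!/bin/bash
# Delete one batch and print how many rows were removed.
# ROW_COUNT() must run in the same session as the DELETE.
delete_batch() {
    echo 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 100000; SELECT ROW_COUNT();' \
        | mysql -N centreon_storage
}

# Run the given batch command until it fails or reports 0 deleted rows.
# Prints the number of batches that actually deleted something.
purge_until_empty() {
    local batches=0 deleted
    while deleted=$("$@") && [ "${deleted:-0}" -gt 0 ]; do
        batches=$((batches + 1))
        # Let the server breathe between two batches
        sleep "${PURGE_PAUSE:-2}"
    done
    echo "$batches"
}

# Usage on the server:
#   purge_until_empty delete_batch
```

Taking the batch command as an argument keeps the loop itself independent from MySQL, so you can try it with a harmless command first.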

Let’s have a look at the table size now:

root@server:~# ls -lah /var/lib/mysql/centreon_storage/log.*
-rw-rw---- 1 mysql mysql  13K Apr 15  2015 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  16G Dec  8 10:25 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 6.0G Dec  8 10:25 /var/lib/mysql/centreon_storage/log.MYI

Uh ?

Thanks to Google, it seems I need to run “OPTIMIZE TABLE” to reclaim the freed disk space. But there are two things I know about OPTIMIZE and huge tables like this one:
* It will write lock the table
* It will last for ages (I mean up to *days*)

Let’s try to make this process a bit quicker… Ever heard of eatmydata?
It disables the fsync() system call, giving you some kind of write cache on steroids. The drawback: you’re no longer protected from file corruption in case of a crash.

For now, we’ll take the risk and hack the MySQL init script to run with eatmydata:

root@server:~# sed -i 's!/usr/bin/mysqld_safe > /dev/null!/usr/bin/eatmydata /usr/bin/mysqld_safe > /dev/null!' /etc/init.d/mysql
root@server:~# systemctl --system daemon-reload
root@server:~# systemctl restart mysql

It’s pretty hard to figure out whether the trick worked or not. What eatmydata actually does is set an LD_PRELOAD environment variable that overrides the libc sync calls with unprotected ones.
Thanks to /proc, we can check this by looking at the environment of the mysqld process:

root@server:~# cat /proc/`ps aux | grep /usr/sbin/mysql | grep -v grep | awk '{ print $2 }'`/environ | tr '\0' '\n'

(basically, I get the PID of the main MySQL server process by grepping for /usr/sbin/mysql, then read /proc/<pid>/environ)

If it worked, you should find a line like this:

LD_PRELOAD=/usr/lib/libeatmydata/libeatmydata.so /usr/lib/libeatmydata/libeatmydata.so
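
If you want a yes/no answer instead of eyeballing the whole environment, the check can be wrapped in a small function. This is a sketch: the function name is made up, and the usage line assumes the server binary is called mysqld and that pgrep is available.

```shell
#!/bin/bash
# Succeed if the given environ file (NUL-separated, as found in /proc/<pid>/environ)
# contains an LD_PRELOAD entry that references libeatmydata.
environ_has_eatmydata() {
    tr '\0' '\n' < "$1" | grep -q '^LD_PRELOAD=.*libeatmydata'
}

# Usage on the server (pgrep -o picks the oldest matching process):
#   environ_has_eatmydata "/proc/$(pgrep -o -x mysqld)/environ" && echo "eatmydata is active"
```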

We can now run optimize on this table:

root@server:~# echo "OPTIMIZE TABLE log" | mysql centreon_storage

You can see it processing by running:

watch -n 2 ls -lah /var/lib/mysql/centreon_storage/log.*

-rw-rw---- 1 mysql mysql  13K Dec  8 14:58 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  11G Dec  8 16:45 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 3.1G Dec  8 16:45 /var/lib/mysql/centreon_storage/log.MYI
-rw-rw---- 1 mysql mysql 710M Dec  8 16:53 /var/lib/mysql/centreon_storage/log.TMM

Now you will have to wait a couple of hours for the optimization to complete…

Nginx SSL vhosting using Server Name Indication

Posted on 2016/12/01 by Le_Vert

Here is the issue: I have a tcp/443 DNAT to a specific machine running some specific HTTPS app that does not work behind a reverse proxy.

Obviously, I want to run other applications on 443 and I’m not allowed to get any other port.

Sounds pretty bad, right ?
Actually, there’s a way out and it’s called “nginx-is-so-fuckin-powerfull” 😉

As you may know, a long time ago a feature called “Server Name Indication” was added to TLS. Before this, it was impossible to serve multiple virtual hosts on a single address, because the SSL session was negotiated before the client actually sent the requested vhost name.

With SNI, there’s a quick chat between your HTTPS server and the remote browser, something like:

- Client: hey I'm an HTTPS client
- Server: Ok, which server ?
- Client: blog.le-vert.net
- Server: Serving blog.le-vert.net certificate...
- Client: #*/-[}$$ (start talking SSL)

Ok, that’s probably not really accurate, but who cares about what exactly happens. The thing is: the requested domain name is known before the SSL certificate is served, so there’s room for routing at this point; and guess what: NGINX can route on the SNI name!

First thing… You need a fairly recent NGINX version (1.11.5 or later), but if your distro doesn’t have it you can use the official NGINX repositories.
Second, you must understand that very old clients may not use SNI. Clients that don’t send it will be routed to the default destination, so make sure to keep the old behavior as the default, just in case.
Here is the client compatibility list for SNI: https://en.wikipedia.org/wiki/Server_Name_Indication
I leave it to you to decide if you care about handling Internet Explorer < 7.

So let's configure NGINX correctly. You need to define a stream {} section at the top level of nginx.conf, just like the http one:

stream {
    include /etc/nginx/stream.conf.d/*.conf;
}

Of course, you need to stop the default http/server from listening on port 443 (comment out lines like "listen 443 ssl" in all your existing configuration).

Now we'll create a stream server, which is a plain TCP proxy. In /etc/nginx/stream.conf.d/443.conf:

map $ssl_preread_server_name $name {
    default original_dest;
    new.hostname.com local_https;
}

upstream original_dest {
    server 1.2.3.4:443;
}

upstream local_https {
    server 127.0.0.1:8443;
}

log_format stream_routing '$remote_addr [$time_local] '
                          'with SNI name "$ssl_preread_server_name" '
                          'proxying to "$name" '
                          '$protocol $status $bytes_sent $bytes_received '
                          '$session_time';

server {
    listen 443;
    ssl_preread on;
    proxy_pass $name;
    access_log /var/log/nginx/stream_443.log stream_routing;
}

And that's it 😀 You can now create a new http/server instance on port 8443 to serve your new HTTPS vhosts, but I suggest starting with the default virtual host (/etc/nginx/conf.d/default.conf) by adding "listen 8443 ssl default_server" and some SSL certificate and key directives.

Here is an example of the stream_443.log:

192.168.0.100 [01/Dec/2016:11:16:53 +0100] with SNI name "" proxying to "original_dest" TCP 200 3135 1161 10.256
192.168.0.100 [01/Dec/2016:11:17:56 +0100] with SNI name "new.hostname.com" proxying to "local_https" TCP 200 1467 747 0.070
192.168.0.100 [01/Dec/2016:11:18:12 +0100] with SNI name "new.hostname.com" proxying to "local_https" TCP 200 16505 1365 16.178
192.168.0.100 [01/Dec/2016:11:18:15 +0100] with SNI name "local.server.hostname" proxying to "original_dest" TCP 200 2461 557 25.59
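
Since the log format puts the chosen upstream between quotes after "proxying to", a quick summary of where connections end up can be pulled out with standard tools. A small sketch (the function name is made up), fed here with the four sample lines above:

```shell
#!/bin/bash
# Count connections per upstream in a stream_routing log.
# Reads the files given as arguments, or stdin when there are none.
count_routes() {
    sed -n 's/.* proxying to "\([^"]*\)".*/\1/p' "$@" \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

count_routes <<'EOF'
192.168.0.100 [01/Dec/2016:11:16:53 +0100] with SNI name "" proxying to "original_dest" TCP 200 3135 1161 10.256
192.168.0.100 [01/Dec/2016:11:17:56 +0100] with SNI name "new.hostname.com" proxying to "local_https" TCP 200 1467 747 0.070
192.168.0.100 [01/Dec/2016:11:18:12 +0100] with SNI name "new.hostname.com" proxying to "local_https" TCP 200 16505 1365 16.178
192.168.0.100 [01/Dec/2016:11:18:15 +0100] with SNI name "local.server.hostname" proxying to "original_dest" TCP 200 2461 557 25.59
EOF
```

On the real server you would run it as: count_routes /var/log/nginx/stream_443.log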

Nice work NGINX, as usual!

Going further:
There's just one little issue here: the real HTTPS server on port 8443 will always see the incoming IP address as 127.0.0.1. However, there's a mechanism called "proxy_protocol" that passes the original connection details between NGINX servers, but the equipment running behind original_dest doesn't like it.
So the idea is to use proxy_protocol between my stream/443 and http/8443 instances, and to strip it when proxying to original_dest, using a dummy stream server that does nothing but pop out the proxy_protocol data and forward to the real server. Then I restore remote_addr in http/8443.

The new config file is now:

map $ssl_preread_server_name $name {
    default original_dest;
    new.hostname.com local_https;
}

upstream original_dest {
    # Forward to a dummy server to strip out proxy_protocol
    # Otherwise original_dest won't work
    server 127.0.0.1:8080;
}

upstream local_https {
    server 127.0.0.1:8443;
}

log_format stream_routing '$remote_addr [$time_local] '
                          'with SNI name "$ssl_preread_server_name" '
                          'proxying to "$name" '
                          '$protocol $status $bytes_sent $bytes_received '
                          '$session_time';
server {
    listen 443;
    ssl_preread on;
    proxy_pass $name;
    proxy_protocol on;
    access_log /var/log/nginx/stream_443.log stream_routing;
}

# Dummy server to strip out proxy_protocol before sending to original_dest
server {
    listen 8080 proxy_protocol;
    proxy_pass 1.2.3.4:443;
}

In the http/8443 vhost, we set the following to restore original client IP address:

listen 8443 default_server proxy_protocol ssl;
set_real_ip_from 127.0.0.1/32;
real_ip_header proxy_protocol;

Nginx -_-

Bonus stuff:
In case you're having issues with SELinux (and you will: for instance, it will deny NGINX opening a connection from port 8080 to a remote host), you can use the following to extract failures from audit.log and turn them into a permanent SELinux exception:

# Look at the latest denials (adjust the number of lines to what you see happening)
tail -n 2 /var/log/audit/audit.log
# Print a plain text SELinux rule, so you can see what's going to be done
tail -n 2 /var/log/audit/audit.log | audit2allow -m nginx_proxy_connect
# Create the real SELinux rule
tail -n 2 /var/log/audit/audit.log | audit2allow -M nginx_proxy_connect
# Install the rule
semodule -i nginx_proxy_connect.pp

blog.le-vert.net

No bullshit, only Linux stuff
