Centreon “log” table getting insanely huge

Hi there,

I’m currently migrating some old Centreons 2.5/2.6 with Nagios/NDO to Centreon 2.7 with Centreon-Engine/Centreon-broker but I’m experiencing some issues with insanely large MySQL tables to migrate:

This table contains old Nagios logs and according to a forum post it’s being use when clicking on Monitoring > Event logs and is used when doing reporting actions.
Fair enough, I don’t mind anyway of what happened last year, reporting is done on a monthly basis.

So let’s see what is the oldest entry there:

Sadly, it’s using unix timestamp and not MySQL datetime format, so we’ll have to do some conversion to get it humanely-readable.
To be honest, when I started the cleanup the oldest entry was even older.

I’m not sure if Centreon is supposed to clean this out. I guess it does, probably using one of the various cron jobs installed by Centreon but according to my experience this is highly borked and can surely lead to uncleaned entries.

Let’s validate we’re not going to delete bad entries by running a select first

Looks okay. Be sure to compare “ctime” and the converted date and play with the WHERE condition so you can be sure it’s really working properly.
For instance, if you swap “2016-06-08 00:00:00” with “2015-06-14 19:19:01” the last line should disappear.

Once you’ve confirmed your not deleting anything useful, go ahead with a DELETE statement:

I decided to use LIMIT here, to avoid loading too much the server for an unknown time. “time” command has been added here so you can have a measurement of the time required to delete 1 000 000 entries (52s here).

You can now recheck the oldest log you have now:

It seems it’ll be a long way to go before getting to june, 2016 😉

All in one command, so you just have too check your term when coming back from the coffee machine to see its progress:

When the loop keeps outputing the same date, it means DELETE is not removing anything anymore, time to hit ctrl+c !

Let’s have a look to the table size now:

Uh ?

Thanks to Google, it seems I need to run “OPTIMIZE TABLE” to reclaim the freed disk space. But there’re two thing I know about optimize and huge tables like this one:
* It will write lock the table
* It will last for ages (I mean up to *days*)

Let’s try to make this process a bit quicker… Ever heard about eatmydata ?
It’ll will disable fsync() system call, giving you some kind of write cache on steroids; drawbacks: you’re not protected anymore from file corruption in case of a crash.

For now, we’ll take the risk and hack mysql init script to run with eatmydata:

It’s pretty hard to figure out if the trick worked or not. Actually, it’ll set a LD_PRELOAD env variable to override libc calls with the unprotected ones.
Thanks to /proc, we can check this by looking at the mysqld PID attributes

(basically, I get /usr/sbin/mysql pid which is the main MySQL server process and check /proc//environ)

If it worked, you should find a line like this:

We can now run optimize on this table:

You can see it processing by running:

Now you will have to wait a couple of hours for the optimization to complete…

Nginx SSL vhosting using Server Name Indication

Here is the issue: I have a tcp/443 DNAT to a specific machine running some specific HTTPS app that does not work behind a reverse proxy.

Obviously, I want to run others application on 443 and I’m not allowed to get any other port.

Sounds pretty bad, right ?
Actually, there’s a way out and it’s called “nginx-is-so-fuckin-powerfull” 😉

As you may know, a long time ago a feature has been added to TLS which is called “Server Name Indication”. Before this it was impossible to serve multiple virtual hosts on a single address because SSL session was negociated before the client actually sends the requested vhost name.

With SNI, there’s a quick chat between your HTTPS server and the remote browser, something like:

Ok that’s probably not really accurate but who cares about what exactly happens. The thing is: there’s a routing capability before serving the SSL certificate and we know the requested domain name at this point; and guess what: NGINX offers routing possibility using SNI name !!

First thing… You need a really really new NGINX version (1.11.5), but if your distro doesn’t have it you can use NGINX repositories.
Second, you must understand that very old clients may not use SNI. If it doesn’t it will hit the NGINX default vhost. So make sure to keep the old behavior as default, just in case.
Here is the client compatibility list for SNI: https://en.wikipedia.org/wiki/Server_Name_Indication
I leave it to you to decide if you care about handling Internet Explorer < 7. So let's configure NGINX correctly: You need to define a stream {} section on nginx.conf top, just like the http one.

Of course, you need to disable default http/server to listen on port 443 (comment lines like "listen 443 ssl" in all your existing configuration). Now we'll create a stream server, which is a plain TCP proxy: In /etc/nginx/stream.conf.d/443.conf:

And that's it 😀 You can now create a new http/server instance on port 8443 to serve your different new https vhosts but I suggest starting with the default virtual host (/etc/nginx/conf.d/default.conf) by adding "listen 8443 ssl default_server" and some ssl cert and key directives. Here is a example of the stream_443.log:

Nice work NGINX, as usual ! Going further: There's just a little issue here: The real HTTPS on port 8443 will always see incoming IP address as Howerver, there's an overhead called "proxy_protocol" that can help passing proxying related things between NGINX servers but my equipment running behind doesn't like this. So the idea here is to use proxy_protocol between my stream/443 and http/8443 instances and strip it when proxying to original_dest using a dummy stream server that does nothing else that popping out the proxy_protocol data and forwarding to the real server. Then I will restore remote_addr in http/8443. The new config file is now:

In the http/8443 vhost, we set the following to restore original client IP address:

Nginx -_- Bonus stuff: I case you're having issue with SELinux (and you will, for instance it will deny NGINX to start a connection from port 8080 to a remote host), you can use the following to extract failures from audit.log and turn them into a permanent SELinux exception