Hi there,
I’m currently migrating some old Centreon 2.5/2.6 setups with Nagios/NDO to Centreon 2.7 with Centreon Engine/Centreon Broker, but I’m running into an issue with some insanely large MySQL tables to migrate:
root@server:~# ls -lah /var/lib/mysql/centreon_storage/log.*
-rw-rw---- 1 mysql mysql  13K Apr 15  2015 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  16G Dec  8 09:18 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 6.0G Dec  8 09:18 /var/lib/mysql/centreon_storage/log.MYI
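By the way, if you don’t have shell access to the database host, roughly the same numbers can be pulled from information_schema (a quick sketch; sizes are reported by the storage engine and may lag slightly behind the actual files):

root@server:~# echo 'SELECT table_name, ROUND(data_length/1024/1024/1024, 1) AS data_gb, ROUND(index_length/1024/1024/1024, 1) AS index_gb FROM information_schema.TABLES WHERE table_schema = "centreon_storage" ORDER BY data_length DESC' | mysql -t centreon_storage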
This table contains old Nagios logs; according to a forum post, it’s used when clicking on Monitoring > Event logs and when running reporting actions.
Fair enough. I don’t really care about what happened last year anyway; reporting is done on a monthly basis.
So let’s see what the oldest entry in there is:
root@server:~# echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage
2015-06-14 19:19:00
Sadly, it’s stored as a Unix timestamp and not in MySQL datetime format, so we’ll have to do some conversion to get it human-readable.
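FROM_UNIXTIME() and UNIX_TIMESTAMP() are each other’s inverse (modulo the server time zone), so you can sanity-check the conversion both ways; the values below are the ones from my oldest entry:

root@server:~# echo 'SELECT UNIX_TIMESTAMP("2015-06-14 19:19:00"), FROM_UNIXTIME(1434309540)' | mysql -N centreon_storage
1434309540	2015-06-14 19:19:00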
To be honest, when I started the cleanup the oldest entry was even older.
I’m not sure if Centreon is supposed to clean this out. I guess it does, probably using one of the various cron jobs installed by Centreon, but in my experience these are highly borked and can easily leave entries uncleaned.
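If you want to check what your own install is doing, the purge jobs should live somewhere under /etc/cron.d (exact file names are a guess, adjust to your setup):

root@server:~# ls /etc/cron.d/ | grep -i cent
root@server:~# grep -ri purge /etc/cron.d/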
Let’s validate we’re not going to delete the wrong entries by running a SELECT first:
root@server:~# echo 'SELECT FROM_UNIXTIME(ctime), ctime, output FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 5' | mysql -N centreon_storage
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached. Nudging server1:traffic_eth0 by 11 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached. Nudging server1:Ping by 7 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached. Nudging server2:Memory by 12 seconds...
2015-06-14 19:19:00	1434309540	Max concurrent service checks (200) has been reached. Nudging server3:Processor by 6 seconds...
2015-06-14 19:19:01	1434309541	Max concurrent service checks (200) has been reached. Nudging server3:Memory by 10 seconds...
Looks okay. Be sure to compare “ctime” with the converted date, and play with the WHERE condition so you can be sure it’s really working properly.
For instance, if you replace “2016-06-08 00:00:00” with “2015-06-14 19:19:01”, the last line should disappear.
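Something like this (the exact rows returned obviously depend on your data):

root@server:~# echo 'SELECT FROM_UNIXTIME(ctime), ctime, output FROM log WHERE ctime < UNIX_TIMESTAMP("2015-06-14 19:19:01") LIMIT 5' | mysql -N centreon_storage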
Once you’ve confirmed you’re not deleting anything useful, go ahead with a DELETE statement:
root@server:~# time echo 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 1000000' | mysql -N centreon_storage

real	0m51.884s
user	0m0.000s
sys	0m0.008s
I decided to use LIMIT here to avoid loading the server too heavily for an unknown amount of time. The “time” command was added so you get a measurement of how long it takes to delete 1,000,000 entries (about 52 seconds here).
You can now recheck the oldest log entry you have:
root@server:~# echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage
2015-06-19 21:29:54
It seems there’s still a long way to go before reaching June 2016 😉
Bonus:
All in one command, so you just have to check your terminal when coming back from the coffee machine to see its progress:
root@server:~# while true; do echo 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 100000' | mysql -N centreon_storage && echo 'SELECT FROM_UNIXTIME(ctime) FROM log ORDER BY ctime ASC LIMIT 1' | mysql -N centreon_storage && sleep 2; done
2015-06-21 01:47:32
2015-06-21 10:59:55
2015-06-21 19:57:21
2015-06-22 04:58:59
[...]
When the loop keeps outputting the same date, it means the DELETE is not removing anything anymore: time to hit Ctrl+C!
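If you’d rather not babysit it at all, here’s a variant (a sketch, same queries as above) that stops by itself: ROW_COUNT() returns the number of rows touched by the last statement, as long as it runs in the same session as the DELETE:

while true; do
    # DELETE and ROW_COUNT() must run in the same mysql session
    deleted=$(mysql -N centreon_storage -e 'DELETE FROM log WHERE ctime < UNIX_TIMESTAMP("2016-06-08 00:00:00") LIMIT 100000; SELECT ROW_COUNT();')
    echo "deleted $deleted rows"
    [ "$deleted" -eq 0 ] && break  # nothing left to delete, we are done
    sleep 2
done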
Let’s have a look at the table size now:
root@server:~# ls -lah /var/lib/mysql/centreon_storage/log.*
-rw-rw---- 1 mysql mysql  13K Apr 15  2015 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  16G Dec  8 10:25 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 6.0G Dec  8 10:25 /var/lib/mysql/centreon_storage/log.MYI
Huh?
Thanks to Google, it seems I need to run “OPTIMIZE TABLE” to reclaim the freed disk space. But there are two things I know about OPTIMIZE and huge tables like this one:
* It will write-lock the table
* It will last for ages (I mean up to *days*)
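Before locking the table for that long, you can at least estimate how much space there is to reclaim: for MyISAM tables, the Data_free field of SHOW TABLE STATUS counts the deleted-but-not-yet-reclaimed bytes:

root@server:~# echo 'SHOW TABLE STATUS LIKE "log"' | mysql -E centreon_storage | grep -E 'Data_length|Data_free'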
Let’s try to make this process a bit quicker… Ever heard of eatmydata?
It will disable the fsync() system call, giving you some kind of write cache on steroids; the drawback is that you’re no longer protected from file corruption in case of a crash.
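If it’s not installed yet, it’s a regular package on Debian-like systems (the package name may differ on your distribution):

root@server:~# apt-get install eatmydata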
For now, we’ll take the risk and hack the MySQL init script to run with eatmydata:
root@server:~# sed -i 's!/usr/bin/mysqld_safe > /dev/null!/usr/bin/eatmydata /usr/bin/mysqld_safe > /dev/null!' /etc/init.d/mysql
root@server:~# systemctl --system daemon-reload
root@server:~# systemctl restart mysql
It’s pretty hard to figure out whether the trick worked or not. Under the hood, eatmydata sets an LD_PRELOAD environment variable to override the libc calls with unprotected ones.
Thanks to /proc, we can check this by looking at the mysqld process attributes:
root@server:~# cat /proc/`ps aux | grep /usr/sbin/mysql | grep -v grep | awk '{ print $2 }'`/environ | tr '\0' '\n' |
(basically, I grab the PID of the /usr/sbin/mysqld process, which is the main MySQL server, and dump /proc/<PID>/environ, where its environment variables live)
If it worked, you should find a line like this:
LD_PRELOAD=/usr/lib/libeatmydata/libeatmydata.so
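For the record, a shorter way to run the same check, assuming pgrep is available:

root@server:~# tr '\0' '\n' < /proc/$(pgrep -o mysqld)/environ | grep LD_PRELOAD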
We can now run OPTIMIZE on this table:
root@server:~# echo "OPTIMIZE TABLE log" | mysql centreon_storage |
You can watch its progress by running:
watch -n 2 ls -lah /var/lib/mysql/centreon_storage/log.* |
-rw-rw---- 1 mysql mysql  13K Dec  8 14:58 /var/lib/mysql/centreon_storage/log.frm
-rw-rw---- 1 mysql mysql  11G Dec  8 16:45 /var/lib/mysql/centreon_storage/log.MYD
-rw-rw---- 1 mysql mysql 3.1G Dec  8 16:45 /var/lib/mysql/centreon_storage/log.MYI
-rw-rw---- 1 mysql mysql 710M Dec  8 16:53 /var/lib/mysql/centreon_storage/log.TMM
Now you will have to wait a couple of hours for the optimization to complete…
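One last thing: once the optimization is over, don’t forget to revert the eatmydata hack so you get your fsync() protection back (this just undoes the sed from earlier):

root@server:~# sed -i 's!/usr/bin/eatmydata /usr/bin/mysqld_safe > /dev/null!/usr/bin/mysqld_safe > /dev/null!' /etc/init.d/mysql
root@server:~# systemctl --system daemon-reload
root@server:~# systemctl restart mysql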