Fixing VMware vCenter template customization for Debian Stretch (NIC detected as “ether”)

Hello,

I’m a big fan of Foreman: I use it everywhere to spawn my virtual machines (mostly on VMware vCenter or AWS) and then apply Puppet classes directly to get a fully configured new host in a few clicks. Maybe I’ll write about it one day, let’s see.

Anyway, this week’s theme was mostly “let’s upgrade from Jessie to Stretch, I’m craving Python 3.5 and the new async/await syntax”.
Sadly, it went wrong: I couldn’t use my Foreman against ESX 6.0 anymore, because when the customization XML file (used to define IP settings within the VM through open-vm-tools) was injected, the resulting VM had no network configured.
After looking at what happened, I figured out that /etc/network/interfaces had been generated wrong: instead of using eth0 (yes, I disabled predictable interface names in my template), everything was set up as if the interface were named ether. Huh?

A quick Google search for “debian stretch vmware ether” led me to a GitHub issue opened against open-vm-tools. Sadly, the problem doesn’t actually come from open-vm-tools: it comes from a VMware script that doesn’t parse the current ifconfig output correctly (yeah, I added net-tools to my template too).

The net-tools package NEWS.Debian file gives the context: the new upstream snapshot changed the output format of ifconfig and friends, so any script parsing that output is likely to break.

Wow, that’s a pretty dangerous move right there…

The script creating the network configuration is actually a piece of Perl crap copied directly from the vCenter server into the VM filesystem. Yeah, that sounds like black magic, but the good news is that it’s Perl, so it’s fixable.

So I searched for this “Customization.pm” file on my vCenter Windows server and I found it here:
C:\Program Files\VMware\vCenter Server\vpxd\imgcust\linux\imgcust-scripts\Customization.pm

Understanding what was wrong was quite easy, and I must say the original output parsing was pretty cheap.
Anyway, here’s a better one that just works.
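The core idea (shown here as a sketch with a hypothetical function name; the attached diff below is the authoritative version): track which interface block each line of ifconfig output belongs to, instead of grabbing whatever token sits next to the MAC address, which is precisely “ether” in the new format.

    # Resolve an interface name from its MAC address by walking the output of
    # `ifconfig -a` line by line. An interface block starts in column 0
    # ("eth0: flags=..." in the new net-tools format, "eth0    Link encap:..."
    # in the old one) while the MAC sits on an indented line ("ether aa:bb..."
    # or "HWaddr aa:bb..."), so the token next to the MAC must never be used.
    sub GetInterfaceNameByMac
    {
       my ($macAddress) = @_;
       my $currentInterface;

       foreach my $line (split(/\n/, `/sbin/ifconfig -a`)) {
          # A line starting in column 0 opens a new interface block.
          if ($line =~ /^([^\s:]+)/) {
             $currentInterface = $1;
          }
          # The line carrying the MAC belongs to the current block.
          if (defined($currentInterface) && $line =~ /\Q$macAddress\E/i) {
             return $currentInterface;
          }
       }
       return undef;
    }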

Nothing to restart: this file is copied every time you apply customization to a template. You’ll find attached a text version of the patch: vcenter_Customization_pm.diff

Good luck!

Policy Based Routing (PBR) with Shorewall to migrate a server

Hey,

Today I’m doing pure sysadmin work: I’ve been asked to migrate several servers from an obsolete IP range (192.168.x) to a new one (10.x). Things were quite easy until I reached the internal mail server, which is used as a relay by hundreds of devices in the field. Everybody is supposed to use the DNS entry, but I won’t trust that.

So my idea is to switch eth0 to the new network and add an eth1 in the old one, to keep the service working and to be able to log what’s still using the obsolete address.
There’s just a little problem: if my default gateway is on eth0, any packet entering eth1 from a routed network (it would work for hosts directly connected to the legacy local network) will be answered through the default gateway on eth0. That’s asymmetric routing, and it just doesn’t work.

Okay, so how do I solve that? With Shorewall of course! The idea is to tag any packet entering eth1 with a different mark than the ones coming through eth0, and to provide a different routing table for each mark. I’ll do this on CentOS today, but it should be basically the same on any Linux system. Shorewall is usually available everywhere, but you can try doing this by hand with “ip” and “iptables”. Looks like a lot of pain to me, though.
Having both addresses routed and working is a nice step, but it’s pretty useless if I have no way to find out who’s still using the obsolete address, so we’ll also use Shorewall to log these accesses and create a specific rsyslog/logrotate configuration to get a dedicated log file.

First, we’ll change the network configuration to have both interfaces up, with a default gateway only on the first interface (connected to the new network). The gateway will later be overridden by Shorewall, but it’s always saner to have a working default configuration, even with limited features.

So make sure to create proper ifcfg-eth0 and ifcfg-eth1 files in /etc/sysconfig/network-scripts, with GATEWAY defined only on the new network. You should also make sure that the server is reachable on its new address, and on the old address from a machine directly connected to the legacy network.
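For reference, a minimal pair of files could look like this (the 10.x addresses and the prefixes are placeholders for this example; 192.168.0.10 matches the legacy address showing up in the logs later):

    # /etc/sysconfig/network-scripts/ifcfg-eth0 -- new network, carries the gateway
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=10.0.0.10
    PREFIX=24
    GATEWAY=10.0.0.1

    # /etc/sysconfig/network-scripts/ifcfg-eth1 -- legacy network, no GATEWAY here
    DEVICE=eth1
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=192.168.0.10
    PREFIX=24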

Let’s continue with a very basic Shorewall configuration: yum -y install shorewall, then make sure the following files exist in /etc/shorewall:

  • interfaces – List of network adapters handled by Shorewall
  • policy – Default firewall policies between each zone
  • providers – This one is PBR-specific, we’ll use it to mark packets
  • rules – Overrides the default policies with port/host rules
  • shorewall.conf – Global settings
  • zones – Declares the firewall zones

If you’re missing one, copy it from /usr/share/shorewall/configfiles/.

So let’s do a few adjustments in shorewall.conf first:

    IP_FORWARDING=No (this machine should NEVER be used as a gateway between the legacy and new networks, we're not here to create security flaws ;-))
    DISABLE_IPV6=Yes (sadly, there's no IPv6 here, so it's better to let Shorewall kill the whole stack)
    LOGTAGONLY=Yes (changes the way Shorewall generates log prefixes, otherwise ours would be too long and get shortened)

Now let’s define the interfaces in the interfaces file.
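A minimal version, assuming the zone names loc (new network) and old (legacy network) used throughout this post:

    #ZONE   INTERFACE   OPTIONS
    loc     eth0
    old     eth1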

And map them to IPv4 zones.
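A matching zones file could be:

    #ZONE   TYPE
    fw      firewall
    loc     ipv4
    old     ipv4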

fw is a default zone meaning “myself”.

And we create a default policy allowing the machine itself to reach the legacy and new network zones while blocking any incoming packets.
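A policy file along these lines should do it:

    #SOURCE   DEST   POLICY   LOGLEVEL
    fw        all    ACCEPT
    loc       all    DROP     info
    old       all    DROP     info
    all       all    REJECT   info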

Finally, we’ll add a set of default rules to at least be able to SSH into the server again.
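For example, using Shorewall’s stock Ping and SSH macros:

    #ACTION        SOURCE   DEST
    Ping(ACCEPT)   loc      $FW
    SSH(ACCEPT)    loc      $FW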

Just like in the policy file, you can use loc,old as the source if you want to permit ping and SSH from the old network too.

I’ll also add a few rules to permit mail-related services from the new zone.
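For instance, with the stock SMTP macro for tcp/25 (the Submission line is an assumption for this example, keep only the services your server actually provides):

    #ACTION              SOURCE   DEST
    SMTP(ACCEPT)         loc      $FW
    Submission(ACCEPT)   loc      $FW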

Okay, now we can enable and start Shorewall.
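On a systemd-based CentOS:

    systemctl enable shorewall
    systemctl start shorewall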

Now we’ll ask Shorewall to mark packets differently according to the incoming interface. This will be done in the providers file.
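Something along these lines (the gateway addresses are placeholders for this example; the track option is what makes Shorewall mark incoming packets so replies leave through the interface they came in on):

    #NAME   NUMBER   MARK   DUPLICATE   INTERFACE   GATEWAY       OPTIONS
    new     1        1      main        eth0        10.0.0.1      track
    old     2        2      main        eth1        192.168.0.1   track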

The GATEWAY column is the gateway to use on each network.

Let’s permit mail-related traffic from the legacy network, but ask Shorewall to log these packets. Add the following to the rules file.
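The MailMigration log tag, combined with LOGTAGONLY=Yes, produces the log prefix you’ll see below:

    #ACTION                           SOURCE   DEST
    SMTP(ACCEPT:info:MailMigration)   old      $FW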

Reload Shorewall and try to telnet to tcp/25 from a routed network: both IPs now work!

If you check /var/log/messages, you will see log entries like:

    Jul 24 16:55:39 mailsrv kernel: Shorewall:MailMigration:ACCE IN=eth1 OUT= MAC=XXX SRC=192.168.55.4 DST=192.168.0.10 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=22405 DF PROTO=TCP SPT=39474 DPT=25 WINDOW=29200 RES=0x00 SYN URGP=0 MARK=0x2

You can also check your routing tables with ip route show table:

  • ip route show table main no longer shows a default gateway.
  • ip route show table 1 shows the local route for the eth0 network and the default gateway of the new network; it will be used for packets marked 1.
  • ip route show table 2 shows the local route for the eth1 network and the default gateway of the legacy network; it will be used for packets marked 2 (note the MARK=0x2 in the log above).

Your server is now completely accessible from both networks, and you can easily monitor the log file to find clients still using the legacy address. But we can make that a lot easier by asking rsyslog to put these specific messages in a separate log file.

Create /etc/rsyslog.d/mailmigration.conf with the following content.
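A property-based filter on the Shorewall log tag is enough:

    # Send kernel messages carrying our Shorewall log tag to a dedicated file
    :msg, contains, "MailMigration"    /var/log/mailmigration.log
    # and stop processing them so they don't also land in /var/log/messages
    & stop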

And the associated logrotate file, /etc/logrotate.d/mailmigration, to avoid having a single never-ending file.
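For example (the retention values here are just an illustration, tune them to your needs):

    /var/log/mailmigration.log {
        weekly
        rotate 8
        compress
        missingok
        notifempty
        postrotate
            /bin/systemctl kill -s HUP rsyslog.service >/dev/null 2>&1 || true
        endscript
    }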

If you want to go further with a more automated way of handling this, I’d definitely suggest having a look at the rsyslog AMQP output module to publish events to a RabbitMQ instance, and writing a quick Python consumer using Pika to parse them and notify “someone” (may I suggest calling some API to create an internal support ticket?). The “worker.py” file should be enough for testing; just wrap your handler in a try/except that calls ch.basic_nack, so the message goes back into the queue in case of failure.