Diagnose Network Problems In A Container Using The Filesystem

So you need to figure out why your application is not working, but your toolset is very limited because you are using either a minimal installation of Linux or you are in a container. This technique should work either way.

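You can use the /dev/tcp pseudo-device to check network connections. Bash handles these paths itself, so nothing has to exist under /dev, but the shell must be Bash.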

Example 1:
You would like to know if you can reach the host 172.16.0.10 on port 443.

ip=172.16.0.10
port=443
# Run the redirect in a subshell so a failed connection only sets the exit status
if (echo > "/dev/tcp/$ip/$port") &>/dev/null
then echo "TCP port $port is open"
else echo "TCP port $port is closed"
fi

The above command will give you the answer.

You can also write this as a one-liner:

ip=172.16.0.10;port=443; (echo > /dev/tcp/$ip/$port) &>/dev/null && echo "TCP port $port is open" || echo "TCP port $port is closed"
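
A filtered port can leave this check hanging until the kernel gives up on the connection. The sketch below wraps the same test in a function with a time limit. It assumes that bash and the timeout command (coreutils or BusyBox) are available in the container; check_tcp and port_state are names made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; assumes bash and the `timeout` command are present.

# check_tcp HOST PORT [SECONDS]: succeeds if a TCP connection can be opened.
check_tcp() {
  local host=$1 port=$2 secs=${3:-3}
  # A child bash performs the /dev/tcp redirect so that timeout can kill it.
  timeout "$secs" bash -c 'echo > "/dev/tcp/$1/$2"' _ "$host" "$port" >/dev/null 2>&1
}

# port_state HOST PORT: prints an open/closed message for one port.
port_state() {
  if check_tcp "$1" "$2"
  then echo "TCP port $2 is open"
  else echo "TCP port $2 is closed"
  fi
}

# Check a few ports on the local machine.
for p in 22 80 443; do
  port_state 127.0.0.1 "$p"
done
```

To test a remote host, call port_state with its address, for example port_state 172.16.0.10 443. If timeout is missing, the function reports every port as closed, so check for that first.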

Example 2:

You would like to know what connections are being made to and from the container or host, but you do not have netstat.
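
The excerpt ends before showing the method. One common approach that fits the filesystem theme, and which may differ from what the full post does, is to read /proc/net/tcp, where the kernel lists IPv4 TCP sockets with their addresses in hexadecimal. The sketch below assumes bash and a little-endian machine such as x86-64; decode_addr and list_tcp are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumptions: bash, a little-endian CPU, and the standard /proc/net/tcp layout.

# decode_addr HEXIP:HEXPORT -> dotted IPv4 address and decimal port.
decode_addr() {
  local hex_ip=${1%%:*} hex_port=${1##*:}
  # The IPv4 bytes are stored in reverse order on little-endian machines.
  printf '%d.%d.%d.%d:%d\n' \
    "0x${hex_ip:6:2}" "0x${hex_ip:4:2}" "0x${hex_ip:2:2}" "0x${hex_ip:0:2}" \
    "0x${hex_port}"
}

# list_tcp [FILE]: prints local address, remote address and raw state code.
list_tcp() {
  local file=${1:-/proc/net/tcp} sl local_addr remote_addr state rest
  while read -r sl local_addr remote_addr state rest; do
    [ "$sl" = "sl" ] && continue   # skip the header line
    echo "$(decode_addr "$local_addr") -> $(decode_addr "$remote_addr") state=$state"
  done < "$file"
}

if [ -r /proc/net/tcp ]; then
  list_tcp
fi
```

In the state column, 0A means LISTEN and 01 means ESTABLISHED. IPv6 sockets are listed separately in /proc/net/tcp6, which this sketch does not decode.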


NSX-T Troubleshooting IDFW rules

So you have migrated to NSX-T 3.2 and you are using IDFW rules to allow users to dynamically gain access when they log in to any physical device in the domain.

The only trouble is that now it is not really working, and VMware has not yet implemented a way in the GUI to see the effective members of Groups that contain Active Directory members.

Well, there is a way to see who is in the group at least, but it takes a couple of steps.

How to find the effective group members

Step one is to identify the rule you are troubleshooting. Make a note of the rule id.

Next, find the host the destination VM is running on. You can do this manually in vCenter or use PowerShell. That’s up to you.


NSX-T Traffic Analysis Traceflow fails

So you are trying to do a traceflow to see where the traffic is going, but as it turns out it is not working. If this is the case, you are running NSX-T 3.2, and the segment that your VMs belong to is VLAN backed, then I might have a solution for you.

ERROR

Traceflow request failed. The request might be cancelled because it took more time than normal. Please retry.Error Message: Error: Traceflow intent /infra/traceflows/<guid> realized on enforcement point /infra/sites/default/enforcement-points/default with error Traceflow on VLAN logical port LogicalPort/<guid> requires INT (In-band Network Telemetry) to be enabled (Error code: 500060)

Powershell: Migrate Standard Portgroups

Hi, just wanted to share this piece of code with you. This short script creates a mirror of virtual portgroups from one vSphere ESXi host to another.

It only takes the name and VLAN ID into account, so policies, NICs and other settings are not migrated at this time. That would be easy to add, but I do not always want it.

You need the VMware.PowerCLI module for it to work.

Use it at your own risk.

Import-Module VMware.PowerCLI

connect-viserver <vCenter Name>

$srcHost = Get-VMHost <Source Host Name>
$dstHost = Get-VMHost <Destination Host Name>

# Change the source and destination switch names if you need to
$srcSwitchName = "vSwitch0"
$dstSwitchName = "vSwitch0"

$srcSwitch = $srcHost | Get-VirtualSwitch -Name $srcSwitchName
$srcPGs = $srcSwitch | Get-VirtualPortGroup

# Create the destination switch if it does not exist yet
$dstSwitch = $dstHost | Get-VirtualSwitch -Name $dstSwitchName -ErrorAction SilentlyContinue
if ($null -eq $dstSwitch) {
  $dstSwitch = $dstHost | New-VirtualSwitch -Name $dstSwitchName
}

# Portgroups that already exist on the destination switch
$dstPGs = $dstSwitch | Get-VirtualPortGroup

foreach ($srcPg in $srcPGs) {
  $pgName = $srcPg.name
  $pgVlan = $srcPg.VLanId

  $pgExists = $false
  foreach ($dstPg in $dstPGs) {
    if ($dstPg.Name -eq $pgName) {
      $pgExists = $true
    }
  }
  if (-not $pgExists) {
    #Write-Host "Creating portgroup $pgName with vlan $pgVlan"
    $dstSwitch | New-VirtualPortGroup -Name $pgName -VLanId $pgVlan
  }
}

ESXi 6.7 PSOD with qfle3 driver version above 1.0.69.1

I had an ESXi PSOD today. That does not happen very often, so I was quite surprised to find out that the root cause was not a hardware issue.

VMware did an analysis of the memory dump, and it turned out to be a faulty driver. That made sense, since a PSOD often comes from drivers or agents when it is not a hardware issue.

The PSOD I got was the following:

#PF Exception 14 in World xxxxxxx:vmnicX-pollw IP xxxxxxxxxx addr xxxxxxxx

Host xxx.xxx.xxx.xxx is not compatible with the VDS version

Just had an odd issue today.

A customer had created a Virtual Distributed Switch, but was unable to add his ESXi hosts to the vDS. It said that: “Host is not compatible with the VDS version.”

He was only able to join his version 6.5 ESXi host to a 5.5 vDS. If the vDS was upgraded to version 6.0 or 6.5, it did not work.

There are multiple reports of this online related to upgraded hosts and vCenters. I suspect that it is an issue that you only run into if you do major upgrades without reinstalling ESXi, and since I never do that I have not had that problem before.

The quick solution to this problem is:

VMware NSX Active Directory Groups Missing

Today I was doing some micro-segmentation at a customer site, and I was having an issue with new Active Directory groups not showing up when I wanted to add them to a Security Group.

It turned out that a lot of groups were missing. I checked the synchronization, but there were no errors and no pattern in which groups were missing and which were there. If you do have errors, this article might be relevant for you: https://kb.vmware.com/s/article/2150678

I checked the release notes for recent updates, but it did not look like this was a known bug.

Solution:
I did find a workaround. I deleted the Domain from NSX and set it up again. Now all groups were available. This is not a very good solution, since all of your existing identity-based rules in the distributed firewall, and all security groups with AD Group members, need to be redone. So be careful to document everything before you delete the domain.

Relevant NSX version: 6.3.2.5672532

Should you enable Network health check for your Distributed Virtual Switches

Network health check is a very useful feature that was introduced with vSphere 5.1 vDS.

What does it do?

The purpose is to test whether the VLAN, MTU and load balancing settings you defined are actually working. The old way of doing this would be to disconnect all ports except one, by either shutting down the switch port or pulling out the cable, and then testing with a VM whether every VLAN still works. This can be a very lengthy process if you have many adapters, but it is also a necessary step if you want a stable environment.

If you want to know more about how it checks the different settings, Joseph Griffiths wrote a good article on this that you can read here: http://blog.jgriffiths.org/?p=877

So why would you ever disable this feature?

Well, the health check feature generates a lot of MAC table entries, as explained in VMware KB 2034795.

An example given is that you have 35 hosts with 2 network adapters each, and 60 VLANs. This will generate 35 * 2 * 60 = 4200 MAC table entries in your physical switches. As you can see, this number grows quickly. Some switches only have room for 32,000 entries or fewer.
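
The arithmetic in the KB example can be sketched as a small shell function, which makes it easy to plug in your own numbers. mac_entries is a name made up for this example, and the 32,000 limit is simply the figure quoted above, not a property of any particular switch.

```shell
#!/usr/bin/env bash
# mac_entries HOSTS ADAPTERS_PER_HOST VLANS: hypothetical helper for the estimate.
mac_entries() {
  echo $(( $1 * $2 * $3 ))
}

# The example from the KB: 35 hosts, 2 adapters each, 60 VLANs.
mac_entries 35 2 60

# Whole hosts that fit in a 32,000-entry table with 2 adapters and 60 VLANs.
echo $(( 32000 / (2 * 60) ))
```

The first line printed is the 4200 from the example; the second shows that the table would be full at 266 hosts even if the health check were the only source of entries.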

Free IPAM solution – 2. Enabling LDAP/AD Authentication

Trying to remember the password for all of your different web interfaces?

Remember one less password, by enabling AD (Active Directory) authentication for phpIPAM.


Enable SSL on Apache2 (Self-Signed)

Quick guide to getting a self-signed certificate configured for Apache on Ubuntu 16.04.

All credits go to Justin Ellingwood. There is a link to his article at the bottom of this page. This is just a quick summary of what needs to be done to get SSL working, based on his article.

I do not recommend using self-signed certificates in production. The traffic is still encrypted, but clients have no way to verify that they are talking to the right server. Deploy a signed certificate from your internal 2-Tier PKI infrastructure. If you do not have an internal PKI infrastructure, you need to get one!
