Tuesday, 19 March 2013

VMs, snapshots and domain computer accounts.

Have you ever reverted a VM to a snapshot only to find that its computer account password is out of sync with the domain? I have, loads of times. It usually shows up as the following message when you attempt to log in: "Windows cannot connect to the domain, either because the domain controller is down or otherwise unavailable, or because your computer account was not found."

The problem is that Windows machines on a domain change their computer account password with a domain controller every 30 days by default. If a machine changes its computer password and you then revert to a snapshot that was taken before the change, the computer account will no longer be able to authenticate on the network and domain users won't be able to log on.

How do you get around this? One way is in Microsoft's KB article 154501.

This can reduce the security of your domain, or at least the security between the DC and the workstation you make the following registry change on, but in a testing setup like mine this is not much of a problem and the convenience easily outweighs any security issues (in my opinion :).
You can set the DisablePasswordChange registry entry to 1 in:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters

That stops this machine from changing its domain password every 30 days. Now we need to get the machine back onto the domain. To achieve this, do one of the following:

A. Remove the computer from the domain and add it back in again. Easy enough, you say. However, I have a lot of test machines floating about that don't always have the same local administrator password, so it is a good idea to make sure that you know a local administrator's password before you remove the machine from the domain. Otherwise you will end up with a machine you cannot access via Windows (all is not necessarily lost, Trinity Rescue Kit can help here).

B. Remove and rejoin the machine to the domain using the netdom.exe command:
netdom remove <machine name> /Domain:<domain name> /userd:<domain name>\<domain administrator account name> /passwordd:<domain administrator password>
Wait for the response:
The command completed successfully.
Then run:
netdom join <machine name> /Domain:<domain name> /userd:<domain name>\<domain administrator account name> /passwordd:<domain administrator password>
Again we are looking for the response:
The command completed successfully.
It is probably best to reboot the machine at this point. It is a Windows machine after all, we were messing with the domain membership, and I simply wouldn't trust a machine in this state to function as expected unless it is rebooted.

Wednesday, 14 November 2012

ESXi not accepting previously used volume

I recently had a problem on an ESXi 5.0 box I was trying to add an iSCSI target to. I had set up a 6 TB iSCSI LUN, but every time I tried adding it to ESXi I got the following error:

Call "HostDatastoreSystem.QueryVmfsDatastoreCreateOptions" for object "ha-datastoresystem" on ESXi "ESX HOSTNAME" failed. 

It turned out that the setup wizard on the device had happily created a volume and formatted it with ext4 before I realised what it was doing. Although I removed this volume the partition information wasn't removed from the array and this was upsetting ESXi.

How do you fix it? As there was no obvious way to wipe the partition info from within the array's management interface, I decided to do it from a Debian VM running on that host.

Firstly I removed the discovery information from the ESXi server so that it wouldn't try to interfere. Then, on the Debian VM, I installed open-iscsi with:

apt-get install open-iscsi

Start the open-iscsi daemon with:

/etc/init.d/open-iscsi start

Query the iSCSI target on the storage device:

iscsiadm -m discovery -t st -p 192.168.1.99

Which should return something like this:
192.168.1.99:3260,0 iqn.2010-12.com.manufacturer:nasdevice.name
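
The later commands need the portal and the iqn from that response. As a minimal sketch (the variable names are only illustrative, and the sample line is the placeholder response shown above rather than real output), the two values can be pulled apart with plain shell parameter expansion:

```shell
# Sample line in the format shown above; in practice this would be the
# output of the iscsiadm discovery command.
discovery_line='192.168.1.99:3260,0 iqn.2010-12.com.manufacturer:nasdevice.name'

# The line has the form "<portal>,<group tag> <iqn>".
portal_and_tag=${discovery_line%% *}   # everything before the first space
target_iqn=${discovery_line#* }        # everything after the first space
portal=${portal_and_tag%%,*}           # drop the ",<group tag>" suffix

echo "portal=$portal"
echo "target=$target_iqn"
```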

If you have set up iSCSI authentication on your storage device you will need to run something like the following commands using the iqn in the above response.

iscsiadm --mode node --targetname "iqn.2010-12.com.manufacturer:nasdevice.name" -p 192.168.1.99:3260 --op=update --name node.session.auth.authmethod --value=CHAP
iscsiadm --mode node --targetname "iqn.2010-12.com.manufacturer:nasdevice.name" -p 192.168.1.99:3260 --op=update --name node.session.auth.username --value=username
iscsiadm --mode node --targetname "iqn.2010-12.com.manufacturer:nasdevice.name" -p 192.168.1.99:3260 --op=update --name node.session.auth.password --value=password
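
Since those three commands differ only in the setting name and value, they could be wrapped in a small helper. This is just a sketch: set_node_param is a hypothetical name of my own, and the target and portal values are the placeholders from the discovery response above.

```shell
# Placeholder values taken from the discovery response; adjust to suit.
target='iqn.2010-12.com.manufacturer:nasdevice.name'
portal='192.168.1.99:3260'

# set_node_param NAME VALUE: update one setting on the node record,
# using the same iscsiadm options as the commands above.
set_node_param() {
    iscsiadm --mode node --targetname "$target" -p "$portal" \
        --op=update --name "$1" --value="$2"
}

# Intended use (left commented out so nothing runs by accident):
#   set_node_param node.session.auth.authmethod CHAP
#   set_node_param node.session.auth.username username
#   set_node_param node.session.auth.password password
```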

Log on to the storage device:

iscsiadm -m node --targetname "iqn.2010-12.com.manufacturer:nasdevice.name" --portal "192.168.1.99:3260" --login

All being well, this should now create a SCSI device as if you had attached a hard drive directly to the system. I looked at the bottom of the output from the dmesg command to find out which device had been created (/dev/sdb in my case).
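
Another way to double-check the device name, assuming udev is populating /dev/disk/by-path as it normally does on Debian, is to resolve the iSCSI links in that directory. The function name below is my own, and readlink -f is the GNU coreutils form; treat it as a sketch rather than the one true method.

```shell
# list_iscsi_disks [DIR]: print "link name -> device" for every iSCSI
# link in DIR (defaults to /dev/disk/by-path).
list_iscsi_disks() {
    dir=${1:-/dev/disk/by-path}
    for link in "$dir"/*-iscsi-*; do
        [ -e "$link" ] || continue   # nothing matched, or a dangling link
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

list_iscsi_disks
```

The link names contain the portal address and the iqn, so it is easy to see that the device belongs to the target you just logged on to.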

I used the following command to write zeros to the first 512 KiB of the disk, which overwrites the partition table information at the start of the disk. Take care that you have the correct device when using this command; it will eat your drive:

 dd if=/dev/zero of=/dev/sdb bs=512 count=1024
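
If you would rather rehearse this before pointing dd at a real device, the same wipe can be tried on a scratch file. The sketch below is not what I ran at the time: wipe_start and starts_zeroed are hypothetical helper names, and head -c is assumed to be available (it is on Debian).

```shell
# wipe_start TARGET: zero the first 512 KiB (1024 x 512-byte blocks).
# conv=notrunc makes no difference on a block device and stops dd from
# truncating a regular file, so a scratch image keeps its size.
wipe_start() {
    target=$1
    if [ ! -b "$target" ] && [ ! -f "$target" ]; then
        echo "refusing: $target is not a block device or regular file" >&2
        return 1
    fi
    dd if=/dev/zero of="$target" bs=512 count=1024 conv=notrunc 2>/dev/null
}

# starts_zeroed TARGET: succeed only if no non-zero byte is found in the
# first 512 KiB of TARGET.
starts_zeroed() {
    [ $(head -c 524288 "$1" | tr -d '\000' | wc -c) -eq 0 ]
}

# Rehearsal on a 1 MiB scratch file filled with random data.
img=$(mktemp)
dd if=/dev/urandom of="$img" bs=512 count=2048 2>/dev/null
wipe_start "$img" && starts_zeroed "$img" && echo "first 512 KiB zeroed"
rm -f "$img"
```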

Now log out of the storage device with the following command:

iscsiadm -m node --targetname "iqn.2010-12.com.manufacturer:nasdevice.name" --portal "192.168.1.99:3260" --logout

ESXi should now happily accept the iSCSI LUN when you attempt to add it.