Basics

I won’t go into too much detail about the ESXi automated build process. VMware do a good job of that in the ESXi 5.5 installation guide, and the default kickstart file /etc/vmware/weasel/ks.cfg is a good starting point.

Creating dynamic zero touch builds is possibly overkill for most SME environments, but can be useful in large scale estates like service provider environments. Ideally these should tie into proper CMDB / inventory databases, source control, continuous integration environments, etc., but this all depends on how much time/manpower/money you have available.

GitHub

All files, including the Apache license, can be found at https://github.com/dagsonstebo/VMware-ESXi-5.5-zero-touch-build-scripts.

Host specific PXE boot process

The standard VMware build PXE menu configuration is roughly as follows:

LABEL Manual ESXi 5.5 Install
MENU LABEL ESXi5.5 manual install
KERNEL esxi55-install/mboot.c32
APPEND -c esxi55-install/boot.cfg

I.e. PXE loads the boot.cfg file, which in turn specifies the path to the kickstart file. The problem is that to get host specific builds we need to create separate boot.cfg and kickstart files for each host, i.e. not very pretty. An undocumented method is to specify the kickstart file from within the PXE menu:

LABEL Kickstarted ESXi 5.5 ESXI1
MENU LABEL ESXi5.5.0 kickstart ESXI1
KERNEL esxi55-install/mboot.c32
APPEND -c esxi55-install/boot.cfg pxebooting ks=http://192.168.0.100/esxi55ks.cfg

Since we don’t have to specify the kickstart file within boot.cfg, this method cuts down on the number of boot.cfg files, but it still requires a kickstart file per host. As kickstart files can be of considerable length this is still not very pretty.

The ideal scenario is to download a single boot.cfg file and a single kickstart file, which in turn loads host specific configuration data from a file or an external URL. It turns out the ESXi build doesn’t check or reject any URL given in the APPEND statement above. In other words we can specify anything we want, as long as the name resolves and the URL returns data. The solution (yes, I’m aware this might be construed as a bodge or cheat, semantics…) is to append the host name to the kickstart URL as an argument, i.e. a query string. A web server serving a static file will normally ignore the query string, so every host still receives the same kickstart file:

LABEL Kickstarted ESXi 5.5 ESXICN1
MENU LABEL ESXi5.5.0 kickstart ESXICN1
KERNEL esxi55-install/mboot.c32
APPEND -c esxi55-install/boot.cfg pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local
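Entries like this differ only in the label and the hostname argument, so they lend themselves to being generated from a host list. The following is a rough sketch only, meant to run on the PXE server rather than on ESXi. The host list, the domain and the temporary output file are assumptions for illustration; the menu lines themselves are copied from the entry above.

```shell
# Assumed values for illustration; adjust to your own environment.
strDomain="mylab.local"
strKSurl="http://192.168.0.100/esxi55ks.cfg"
strMenufile=`mktemp`

# Hypothetical host list; in practice this could come from an inventory export.
for strHost in esxi1 esxi2; do
  # Append one menu entry per host, passing the FQDN as a URL argument.
  cat >> "${strMenufile}" <<EOF
LABEL Kickstarted ESXi 5.5 ${strHost}
MENU LABEL ESXi5.5.0 kickstart ${strHost}
KERNEL esxi55-install/mboot.c32
APPEND -c esxi55-install/boot.cfg pxebooting ks=${strKSurl}?hostname=${strHost}.${strDomain}

EOF
done

# Show the generated entries; these would be merged into the PXE menu file.
cat "${strMenufile}"
```

The generated text would still need to be placed in whatever menu file your PXE setup uses.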

In other words the build is now cut back to a single boot.cfg and a single kickstart file. All that is required is a separate PXE boot menu entry per host (or, if you’ve got a full overview of your MAC addresses, PXE can be configured to build hosts without any input). The URL can later be parsed in the kickstart build script from the /var/log/esxi_install.log file on the host:

2014-02-12T22:04:42.859Z DEBUG    Executing: /sbin/bootOption -roC
2014-02-12T22:04:42.874Z INFO       vmbTrustedBoot=false tboot=0x0x101b000 runweasel pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local BOOTIF=01-00-0d-21-84-61-61

All that is now required is to make sure the build script uses this URL to download host specific configuration details.
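As a rough sketch of that parsing step, the snippet below pulls the host name out of a boot option line by matching on the hostname= argument rather than on a fixed line or field position. It assumes the log line format shown above; the strLogline variable is a stand-in for illustration, and on a real host the line would be read from /var/log/esxi_install.log.

```shell
# Sample boot option line, copied from the log excerpt above.
strLogline='2014-02-12T22:04:42.874Z INFO       vmbTrustedBoot=false tboot=0x0x101b000 runweasel pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local BOOTIF=01-00-0d-21-84-61-61'

# Capture whatever follows "hostname=" up to the next space or "&".
strFQDN=`echo "${strLogline}" | sed -n 's/.*[?&]hostname=\([^ &]*\).*/\1/p'`

# The short host name is everything before the first dot.
strHostname=`echo "${strFQDN}" | cut -d"." -f1`

echo "FQDN: ${strFQDN}, short name: ${strHostname}"
```

Matching on the argument name means the parse should keep working if extra boot options are added or the line moves within the log, which a positional parse would not tolerate.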

Busybox shortcomings

Just a quick note on the ASH shell used in the Busybox environment the build script is running in. To put it mildly this has very limited functionality compared to your standard BASH shell: things like arrays, associative arrays and pointers (indirect variable references) just don’t work, and finding workarounds can be a bit of a pain. The alternative is to use Python, but since the automagic is generally carried out by vim-cmd and esxcli commands this means a lot of subprocesses etc.
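One common workaround for the missing arrays and indirect references is to number the variable names and dereference them with eval. The sketch below is only an illustration of that idea, not necessarily how the build script does it. The CFG_VSWITCHn_ names follow the host configuration file shown later in this post; the loop and the strSummary variable are assumptions.

```shell
# Numbered settings in the style of a sourced host config file.
CFG_VSWITCH0_NAME="vSwitch0"
CFG_VSWITCH0_MTU="1500"
CFG_VSWITCH1_NAME="vSwitch1"
CFG_VSWITCH1_MTU="9000"

strSummary=""
i=0
while true; do
  # Build the variable name from the index, then dereference it with eval.
  eval "strName=\${CFG_VSWITCH${i}_NAME:-}"
  # Stop at the first index that has no name defined.
  [ -z "${strName}" ] && break
  eval "strMTU=\${CFG_VSWITCH${i}_MTU:-}"
  echo "vSwitch ${i}: ${strName}, MTU ${strMTU}"
  strSummary="${strSummary}${strName}:${strMTU} "
  i=$((i + 1))
done
```

Because eval executes whatever string it is given, this only makes sense when the variable names and values come from a source you control.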

Basics

This part of the build script is closely related to the vanilla VMware kickstart file normally found in /etc/vmware/weasel/ks.cfg on a newly built host.

################################################################################################
# Accept EULA, set root password, reboot after install and set license.
#dryrun
vmaccepteula
keyboard 'United Kingdom'
rootpw password
reboot
#vmserialnum --esx=

# Install to local disk, ignoring SSD drives, overwriting existing VMFS datastores.
# Use with caution as does not work/apply to all storage controllers.
# No issue with NFS / software iSCSI environments.
# Omitted options:
# --overwritevsan (causes install to fail when no vsan partition found).
# --novmfsondisk (prevents local VMFS datastore from being created).
install --firstdisk=local --ignoressd --overwritevmfs

# Set IP addressing and NIC for the duration of the build; this is changed later once the full config is downloaded.
# --addvmportgroup is set to 0 to prevent creation of a portgroup that would need changing later on.
network --bootproto=dhcp --device=vmnic0 --addvmportgroup=0

###############################################################################################
# Pre and post installation scripts, interpreter options are [python|busybox]
# All automagic happens on firstboot hence nothing added.
# %pre --interpreter=busybox
%post --interpreter=busybox --ignorefailure=true

Firstboot, build folder and basic host configuration

All of the above is fairly standard fodder. All the automagic happens during firstboot, starting off with:

  • Set general build variables.
  • Create build folder.
  • Configure logging (/scratch/build/build.log).
  • Parse hostname (as described above).
  • Rename local VMFS datastore.
  • Capture hardware config (this can be used later to install model / hardware specific drivers).
  • Set hostname.
  • Download and parse the host specific configuration file (I’ll get back to this shortly).
  • Clear the temporary build network configuration (to be replaced by details in the host configuration file).

########################################################################################################################
# Firstboot
# All configuration of network is done during firstboot, and depends on the host specific configuration file.
# Host configuration file is downloaded and parsed based on argument passed in the KS download URL.
# Interpreter is [busybox|python].
%firstboot --interpreter=busybox

# Constants
BUILDVERSION="VMware vSphere 5.5 build v1.0";
DOWNLOADURL="http://192.168.0.100";
CONFIGURL="${DOWNLOADURL}/ks"
WAITFORHOSTD=120;
WAITFORREMOVAL=15;
DELIMITER1="---";

########################################################################################################################
# Firstboot sleeps for WAITFORHOSTD seconds to allow the host daemon to fully start. This is an issue on certain blade server models.
sleep ${WAITFORHOSTD};

# Create a build working folder and initial build logging.
# To prevent any persistency issues the scratch partition is used.
strScratchfolder=`cat /etc/vmware/locker.conf | cut -d" " -f1`;
# -p avoids an error if the folder is left over from a previous build.
mkdir -p ${strScratchfolder}/build;
strBuildfolder="${strScratchfolder}/build";
strLogfile="${strBuildfolder}/build.log";

# Change to the build working folder, all build actions now done from here
cd ${strBuildfolder};
echo "INFO  Build folder: ${strBuildfolder}." >> ${strLogfile};

# Parse kickstart URL and determine host name
strFQDN=`awk 'FNR==2 {print $7}' /var/log/esxi_install.log | cut -d"=" -f3`;
strHostname=`echo ${strFQDN} | cut -d"." -f1`;

# Parse all hardware information and log. This information can be used for customised configuration if required.
strManufacturer=`esxcli hardware platform get | grep Vendor\ Name | cut -d":" -f2 | sed -e 's/^\ //'`;
strModel=`esxcli hardware platform get | grep Product\ Name | cut -d":" -f2 | sed -e 's/^\ //'`;
strSerial=`esxcli hardware platform get | grep Serial\ Number | cut -d":" -f2 | sed -e 's/^\ //'`;
strUUID=`smbiosDump | grep -A 5 System\ Info | grep UUID | cut -d":" -f2 | sed -e 's/[ ]*//'`;
echo "INFO  System manufacturer/model: ${strManufacturer} ${strModel}" >> ${strLogfile};
echo "INFO  System serial: ${strSerial}" >> ${strLogfile};
echo "INFO  System UUID (from BIOS): ${strUUID}" >> ${strLogfile};
########################################################################################################################
# Basic config
# Set system hostname
echo "INFO  Configuring hostname: ${strFQDN}" >> ${strLogfile};
esxcli system hostname set --fqdn=${strFQDN};

# Rename local datastore to be more descriptive.
# Datastore name is persistent across ALL rebuilds hence find old name first then rename it.
strCurrentdsname=`esxcli --formatter=csv storage filesystem list | grep VMFS | cut -d"," -f7`;
vim-cmd hostsvc/datastore/rename ${strCurrentdsname} "${strHostname}_localds1";
echo "INFO  Local datastore renamed to ${strHostname}_localds1." >> ${strLogfile};

# Determine ESXi version
strESXiversion=`vmware -v`;
strESXinoversion=`vmware -v | cut -d" " -f3`;
echo "INFO  VMware ESXi version installed: ${strESXiversion}" >> ${strLogfile};

# Download all config files
strHostconfigfile="${CONFIGURL}/${strHostname}.cfg";
echo "INFO  Downloading host config file ${strHostconfigfile}." >> ${strLogfile};
wget ${strHostconfigfile};

cd /scratch/build;

########################################################################################################################
# Parse all host configuration data
# Since the firstboot part is using busybox the file can simply be included using the source statement thereby loading
# all configuration options as variables
echo "INFO  Applying host source config file ${strHostname}.cfg." >> ${strLogfile};
source ${strHostname}.cfg;

########################################################################################################################
# Clear temporary build network config
echo "INFO  Removing temporary build networking vmk0." >> ${strLogfile};
esxcli network ip interface remove --interface-name=vmk0 >> ${strLogfile} 2>&1;
echo "INFO  Removing temporary build networking vSwitch0." >> ${strLogfile};
esxcli network vswitch standard remove --vswitch-name=vSwitch0 >> ${strLogfile} 2>&1;
sleep ${WAITFORREMOVAL};
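The script above sources the downloaded file without checking that the download worked. A minimal sketch of a guard is shown below, assuming an empty or missing file should stop the configuration. The temporary folder and the printf line are stand-ins so the snippet runs anywhere; in the real script the file would come from the wget call, and strConfigloaded is a hypothetical flag.

```shell
# Stand-ins for the build folder and host name set earlier in the script.
strBuildfolder=`mktemp -d`
strHostname="esxi1"
strLogfile="${strBuildfolder}/build.log"

# Stand-in for the wget download of the host config file.
printf 'CFG_DG="192.168.0.1";\n' > "${strBuildfolder}/${strHostname}.cfg"

# -s is true only if the file exists and is not empty.
if [ -s "${strBuildfolder}/${strHostname}.cfg" ]; then
  # "." is the portable spelling of "source".
  . "${strBuildfolder}/${strHostname}.cfg"
  echo "INFO  Host config ${strHostname}.cfg loaded." >> "${strLogfile}"
  strConfigloaded="true"
else
  echo "ERROR Host config ${strHostname}.cfg missing or empty." >> "${strLogfile}"
  strConfigloaded="false"
fi
```

The rest of the firstboot script could then test strConfigloaded before removing the temporary build network, so a failed download does not leave the host unreachable.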

Host configuration file

The format of the configuration file doesn’t really matter as long as it can be parsed – JSON, XML, comma separated, etc. Ideally I’d like to tie this into a configuration management database backend, but this will take a bit more time and coding. The easiest way is to simply write this as a shell script and use source (or “.”) to load it (see above). These files can easily be created in large numbers with a few lines of code, e.g. an Excel macro does the trick.

So – here goes:

# General settings - self explanatory
CFG_DG="192.168.0.1";
CFG_DNS1="192.168.0.2";
CFG_DNS2="192.168.0.2";
CFG_SEARCHDOMAIN="mylab.local";
CFG_NTP1="ntp.cis.strath.ac.uk";
CFG_PASSWORD="Password123";

# Storage - switch on software iscsi, specify which vmk's to run this on
CFG_ENABLEISCSI="true";
CFG_ISCSIVMKS="vmk1,vmk2";

######################################################
# vSwitch configuration
CFG_VSWITCH0_NAME="vSwitch0";
CFG_VSWITCH0_UPLINK1="vmnic0";
CFG_VSWITCH0_UPLINK2="vmnic1";
CFG_VSWITCH0_ACTIVE="vmnic0,vmnic1";
CFG_VSWITCH0_STANDBY="";
# Failback and switch notify should be true or false
CFG_VSWITCH0_FAILBACK="false";
CFG_VSWITCH0_NOTIFY="true";
#Failuredetection: link / beacon
CFG_VSWITCH0_FAILUREDETECTION="link";
#Load balancing: iphash / mac / portid
CFG_VSWITCH0_LOADBAL="mac";
#Ports: 8,24,56,120,248,504,1016,2040,4088
CFG_VSWITCH0_PORTS="128";
CFG_VSWITCH0_MTU="1500";

#vSwitch1 - here used for iscsi
CFG_VSWITCH1_NAME="vSwitch1";
CFG_VSWITCH1_UPLINK1="vmnic2";
CFG_VSWITCH1_UPLINK2="vmnic3";
CFG_VSWITCH1_ACTIVE="vmnic2,vmnic3";
CFG_VSWITCH1_STANDBY="";
CFG_VSWITCH1_FAILBACK="true";
#Failuredetection: link / beacon
CFG_VSWITCH1_FAILUREDETECTION="link";
CFG_VSWITCH1_NOTIFY="true";
#Load balancing: iphash / mac / portid
CFG_VSWITCH1_LOADBAL="mac";
CFG_VSWITCH1_PORTS="128";
CFG_VSWITCH1_MTU="9000";

######################################################
# Port groups
CFG_PORTGROUP0_NAME="Management Network";
CFG_PORTGROUP0_VSWITCH="vSwitch0";
CFG_PORTGROUP0_VLAN="0";
# Configure active uplink for a port group - use this for iSCSI binding
CFG_PORTGROUP0_ACTIVE="";

CFG_PORTGROUP1_NAME="iSCSI_vmnic2";
CFG_PORTGROUP1_VSWITCH="vSwitch1";
CFG_PORTGROUP1_VLAN="100";
CFG_PORTGROUP1_ACTIVE="vmnic2";

CFG_PORTGROUP2_NAME="iSCSI_vmnic3";
CFG_PORTGROUP2_VSWITCH="vSwitch1";
CFG_PORTGROUP2_VLAN="100";
CFG_PORTGROUP2_ACTIVE="vmnic3";

CFG_PORTGROUP3_NAME="DATA_VSW0_VLAN130";
CFG_PORTGROUP3_VSWITCH="vSwitch0";
CFG_PORTGROUP3_VLAN="130";
CFG_PORTGROUP3_ACTIVE="";

######################################################
# VMkernel interfaces
CFG_VMK0_NAME="vmk0";
CFG_VMK0_PORTGROUP="Management Network";
CFG_VMK0_TYPE="static";
CFG_VMK0_IP="192.168.0.50";
CFG_VMK0_NETMASK="255.255.255.0";
CFG_VMK0_MTU="1500";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK0_TAGS="Management,VMotion";

CFG_VMK1_NAME="vmk1";
CFG_VMK1_PORTGROUP="iSCSI_vmnic2";
CFG_VMK1_TYPE="static";
CFG_VMK1_IP="192.168.0.60";
CFG_VMK1_NETMASK="255.255.255.0";
CFG_VMK1_MTU="9000";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK1_TAGS="";

CFG_VMK2_NAME="vmk2";
CFG_VMK2_PORTGROUP="iSCSI_vmnic3";
CFG_VMK2_TYPE="static";
CFG_VMK2_IP="192.168.0.61";
CFG_VMK2_NETMASK="255.255.255.0";
CFG_VMK2_MTU="9000";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK2_TAGS="";
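As mentioned above, files like this can be created in bulk with a few lines of code. The sketch below does it in shell instead of an Excel macro, writing one .cfg file per host from a small inventory list. The hosts.csv layout, the output folder and the reduced set of settings are assumptions for illustration; a real generator would write the full set of options shown above.

```shell
strOutdir=`mktemp -d`

# Hypothetical inventory: short host name, management IP.
cat > "${strOutdir}/hosts.csv" <<EOF
esxi1,192.168.0.50
esxi2,192.168.0.51
EOF

# Write one host config file per inventory line.
while IFS=, read -r strHost strIP; do
  cat > "${strOutdir}/${strHost}.cfg" <<EOF
CFG_DG="192.168.0.1";
CFG_SEARCHDOMAIN="mylab.local";
CFG_VMK0_NAME="vmk0";
CFG_VMK0_PORTGROUP="Management Network";
CFG_VMK0_TYPE="static";
CFG_VMK0_IP="${strIP}";
CFG_VMK0_NETMASK="255.255.255.0";
EOF
done < "${strOutdir}/hosts.csv"

ls "${strOutdir}"
```

The generated files would then be copied to the web server folder the build script downloads from.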

Next – network configuration…

Posted by Dag
