Basics
I won’t go into too much detail about the ESXi automated build process, as VMware do a good job of that in the ESXi 5.5 installation guide. On top of this, the default kickstart file /etc/vmware/weasel/ks.cfg is a good starting point.
Creating dynamic zero touch builds is possibly overkill for most SME environments, but can be useful in large scale estates such as service provider environments. Ideally these should tie into proper CMDB / inventory databases, source control, continuous integration environments, etc., but this all depends on how much time/manpower/money you have available.
GitHub
All files, including the Apache license, can be found at https://github.com/dagsonstebo/VMware-ESXi-5.5-zero-touch-build-scripts.
Host specific PXE boot process
The standard VMware build PXE menu configuration is roughly as follows:
LABEL Manual ESXi 5.5 Install
  MENU LABEL ESXi5.5 manual install
  KERNEL esxi55-install/mboot.c32
  APPEND -c esxi55-install/boot.cfg
In other words, PXE loads the boot.cfg file, which in turn specifies the path to the kickstart file. The problem is that to get host specific builds we need to create a separate boot.cfg and a separate kickstart file for each host, which is not very pretty. An undocumented method is to specify the kickstart file from within the PXE menu:
LABEL Kickstarted ESXi 5.5 ESXI1
  MENU LABEL ESXi5.5.0 kickstart ESXI1
  KERNEL esxi55-install/mboot.c32
  APPEND -c esxi55-install/boot.cfg pxebooting ks=http://192.168.0.100/esxi55ks.cfg
Since we don’t have to specify the kickstart file within boot.cfg, this method cuts down on the number of boot.cfg files but still requires a kickstart file per host. As kickstart files can be of considerable length, this is still not very pretty.
The ideal scenario is to download a single boot.cfg file and a single kickstart file which in turn loads host specific configuration data from a file or an external URL. It turns out the ESXi build doesn’t check or reject any URL given in the APPEND statement above. In other words we can specify anything we want, as long as the host name resolves and the URL returns data. The solution (yes, I’m aware this might be construed as a bodge or a cheat; semantics…) is to pass the host name as a query string argument on the kickstart URL:
LABEL Kickstarted ESXi 5.5 ESXICN1
  MENU LABEL ESXi5.5.0 kickstart ESXICN1
  KERNEL esxi55-install/mboot.c32
  APPEND -c esxi55-install/boot.cfg pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local
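Since the per-host entries differ only in the label and the hostname argument, they could be generated rather than typed by hand. The following is a minimal sketch run on the PXE server, not part of the original build scripts; the host list and the ks_url variable are assumptions made for illustration.

```shell
# Hypothetical inputs: adjust the kickstart URL and host list to your estate.
ks_url="http://192.168.0.100/esxi55ks.cfg"
hosts="esxi1.mylab.local esxi2.mylab.local"

# Build one PXE menu entry per host; only the label and hostname argument vary.
pxe_entries=$(
    for fqdn in ${hosts}; do
        short=${fqdn%%.*}    # strip the domain part: esxi1.mylab.local -> esxi1
        printf 'LABEL Kickstarted ESXi 5.5 %s\n' "${short}"
        printf '  MENU LABEL ESXi5.5.0 kickstart %s\n' "${short}"
        printf '  KERNEL esxi55-install/mboot.c32\n'
        printf '  APPEND -c esxi55-install/boot.cfg pxebooting ks=%s?hostname=%s\n\n' "${ks_url}" "${fqdn}"
    done
)
printf '%s\n' "${pxe_entries}"
```

The output can be appended to the PXE menu file; where exactly that file lives depends on your PXE server layout.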
The build is now cut back to a single boot.cfg and a single kickstart file. All that is required is a separate PXE boot menu entry per host (or, if you have a full overview of your MAC addresses, PXE can be configured to build hosts without any input). The URL can later be parsed in the kickstart build script from the /var/log/esxi_install.log file on the host:
2014-02-12T22:04:42.859Z DEBUG Executing: /sbin/bootOption -roC
2014-02-12T22:04:42.874Z INFO vmbTrustedBoot=false tboot=0x0x101b000 runweasel pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local BOOTIF=01-00-0d-21-84-61-61
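As a rough sketch of the parsing step, a log line like the second one above can be reduced to the host name by matching on the hostname= argument itself rather than on a field position. The log_line variable below is a stand-in for the line read from /var/log/esxi_install.log, and the variable names are illustrative only.

```shell
# Sample boot option line, copied from the log excerpt above.
log_line='2014-02-12T22:04:42.874Z INFO vmbTrustedBoot=false tboot=0x0x101b000 runweasel pxebooting ks=http://192.168.0.100/esxi55ks.cfg?hostname=esxi1.mylab.local BOOTIF=01-00-0d-21-84-61-61'

# Capture everything after "?hostname=" (or "&hostname=") up to the next space or "&".
fqdn=$(printf '%s\n' "${log_line}" | sed -n 's/.*[?&]hostname=\([^ &]*\).*/\1/p')

# Short host name is the part before the first dot.
short=${fqdn%%.*}

echo "FQDN: ${fqdn}, short name: ${short}"
```

Matching on the argument name means the result does not depend on how many boot options precede the ks= entry.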
All that is now required is to make sure the build script uses this URL to download host specific configuration details.
Busybox shortcomings
Just a quick note on the ash shell used in the Busybox environment the build script runs in. To put it mildly, this has very limited functionality compared to your standard bash shell: things like arrays, associative arrays and pointers (indirect variable references) just don’t work, and finding workarounds can be a bit of a pain. The alternative is to use Python, but since the automagic is generally carried out by vim-cmd and esxcli commands this means a lot of subprocesses etc.
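One common workaround for the missing arrays, sketched here under the assumption that settings are held in numbered variables, is to build the variable name as a string and resolve it with eval. The variable names mirror the CFG_VSWITCHn_* style used in the host configuration file later in this post; the loop itself is illustrative and not taken from the build scripts.

```shell
# Numbered variables stand in for an array of vSwitch definitions.
CFG_VSWITCH0_NAME="vSwitch0"; CFG_VSWITCH0_MTU="1500";
CFG_VSWITCH1_NAME="vSwitch1"; CFG_VSWITCH1_MTU="9000";

summary=""
i=0
while :; do
    # Indirect lookup: expands to e.g. name=${CFG_VSWITCH0_NAME:-}
    eval "name=\${CFG_VSWITCH${i}_NAME:-}"
    # Stop at the first index that has no name defined.
    [ -z "${name}" ] && break
    eval "mtu=\${CFG_VSWITCH${i}_MTU:-1500}"
    summary="${summary}${name}:${mtu} "
    i=$((i + 1))
done
echo "${summary}"
```

Only use eval on variable names you construct yourself; the values are never evaluated as code here, only the generated name is.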
Basics
This part of the build script is closely related to the vanilla VMware kickstart file normally found in /etc/vmware/weasel/ks.cfg on a newly built host.
################################################################################################
# Accept EULA, set root password, reboot after install and set license.
#dryrun
vmaccepteula
keyboard 'United Kingdom'
rootpw password
reboot
#vmserialnum --esx=
# Install to local disk, ignoring SSD drives, overwriting existing VMFS datastores.
# Use with caution as does not work/apply to all storage controllers.
# No issue with NFS / software iSCSI environments.
# Omitted options:
# --overwritevsan (causes install to fail when no vsan partition found).
# --novmfsondisk (prevents local VMFS datastore from being created).
install --firstdisk=local --ignoressd --overwritevmfs
# Set IP addressing and NIC for the duration of the build, this is changed later once full config downloaded.
# --addvmportgroup set to 0 to prevent creation of portgroup needing changed later on.
network --bootproto=dhcp --device=vmnic0 --addvmportgroup=0
###############################################################################################
# Pre and post installation scripts, interpreter options are [python|busybox]
# All automagic happens on firstboot hence nothing added.
#
%pre --interpreter=busybox
%post --interpreter=busybox --ignorefailure=true
Firstboot, build folder and basic host configuration
All of the above is fairly standard fodder; all the automagic happens during firstboot, starting off with:
- Set general build variables.
- Create build folder.
- Configure logging (/scratch/build/build.log)
- Parse hostname (as described above).
- Rename local VMFS datastore.
- Capture hardware config; this can be used later to install model / hardware specific drivers.
- Set hostname.
- Download and parse the host specific configuration file (I’ll get back to this shortly).
- Clear the temporary build network configuration (to be replaced by details in the host configuration file).
########################################################################################################################
# Firstboot
# All configuration of network is done during firstboot, and depends on the host specific configuration file.
# Host configuration file is downloaded and parsed based on argument passed in the KS download URL.
# Interpreter is [busybox|python].
%firstboot --interpreter=busybox
# Constants
BUILDVERSION="VMware vSphere 5.5 build v1.0";
DOWNLOADURL="http://192.168.0.100";
CONFIGURL="${DOWNLOADURL}/ks"
WAITFORHOSTD=120;
WAITFORREMOVAL=15;
DELIMITER1="---";
########################################################################################################################
# Firstboot set to sleep X mins to allow the host daemon to fully start. This is an issue on certain blade server models.
sleep ${WAITFORHOSTD};
# Create a build working folder and initial build logging.
# To prevent any persistency issues the scratch partition is used.
strScratchfolder=`cat /etc/vmware/locker.conf | cut -d" " -f1`;
mkdir ${strScratchfolder}/build;
strBuildfolder="${strScratchfolder}/build";
strLogfile="${strBuildfolder}/build.log";
# Change to the build working folder, all build actions now done from here
cd ${strBuildfolder};
echo "INFO Build folder: ${strBuildfolder}." >> ${strLogfile};
# Parse kickstart URL and determine host name
strFQDN=`awk 'FNR==2 {print $7}' /var/log/esxi_install.log | cut -d"=" -f3`;
strHostname=`echo ${strFQDN} | cut -d"." -f1`;
# Parse all hardware information and log. This information can be used for customised configuration if required.
strManufacturer=`esxcli hardware platform get | grep Vendor\ Name | cut -d":" -f2 | sed -e 's/^\ //'`;
strModel=`esxcli hardware platform get | grep Product\ Name | cut -d":" -f2 | sed -e 's/^\ //'`;
strSerial=`esxcli hardware platform get | grep Serial\ Number | cut -d":" -f2 | sed -e 's/^\ //'`;
strUUID=`smbiosDump | grep -A 5 System\ Info | grep UUID | cut -d":" -f2 | sed -e 's/[ ]*//'`;
echo "INFO System manufacturer/model: ${strManufacturer} ${strModel}" >> ${strLogfile};
echo "INFO System serial: ${strSerial}" >> ${strLogfile};
echo "INFO System UUID (from BIOS): ${strUUID}" >> ${strLogfile};
########################################################################################################################
# Basic config
# Set system hostname
echo "INFO Configuring hostname: ${strFQDN}" >> ${strLogfile};
esxcli system hostname set --fqdn=${strFQDN};
# Rename local datastore to be more descriptive.
# Datastore name is persistent across ALL rebuilds hence find old name first then rename it.
strCurrentdsname=`esxcli --formatter=csv storage filesystem list | grep VMFS | cut -d"," -f7`;
vim-cmd hostsvc/datastore/rename ${strCurrentdsname} "${strHostname}_localds1";
echo "INFO Local datastore renamed to ${strHostname}_localds1." >> ${strLogfile};
# Determine ESXi version
strESXiversion=`vmware -v`;
strESXinoversion=`vmware -v | cut -d" " -f3`;
echo "INFO VMware ESXi version installed: ${strESXiversion}" >> ${strLogfile};
# Download all config files
strHostconfigfile="${CONFIGURL}/${strHostname}.cfg";
echo "INFO Downloading host config file ${strHostconfigfile}." >> ${strLogfile};
wget ${strHostconfigfile};
cd /scratch/build;
########################################################################################################################
# Parse all host configuration data
# Since the firstboot part is using busybox the file can simply be included using the source statement thereby loading
# all configuration options as variables
echo "INFO Applying host source config file ${strHostname}.cfg." >> ${strLogfile};
source ${strHostname}.cfg;
########################################################################################################################
# Clear temporary build network config
echo "INFO Removing temporary build networking vmk0." >> ${strLogfile};
esxcli network ip interface remove --interface-name=vmk0 >> ${strLogfile} 2>&1;
echo "INFO Removing temporary build networking vSwitch0." >> ${strLogfile};
esxcli network vswitch standard remove --vswitch-name=vSwitch0 >> ${strLogfile} 2>&1;
sleep ${WAITFORREMOVAL};
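Because the script above removes the temporary build network right after sourcing the host file, an empty or truncated download would leave the host with no configuration to apply. One possible safeguard, sketched here as an assumption rather than something the original script does, is to check the file before sourcing it. The temporary file below stands in for the downloaded ${strHostname}.cfg so the example runs on its own.

```shell
# Stand-in for the downloaded host file; path and contents are illustrative only.
cfgfile="${TMPDIR:-/tmp}/esxi1.cfg.$$"
printf '%s\n' 'CFG_DG="192.168.0.1";' 'CFG_DNS1="192.168.0.2";' > "${cfgfile}"

# Only source the file if it is non-empty and contains at least one CFG_ assignment.
if [ -s "${cfgfile}" ] && grep -q '^CFG_[A-Z0-9_]*=' "${cfgfile}"; then
    . "${cfgfile}"
    cfg_status="loaded"
else
    cfg_status="missing"
fi
echo "INFO Host config status: ${cfg_status}, default gateway ${CFG_DG:-unset}."

# Clean up the stand-in file.
rm -f "${cfgfile}"
```

In a real build the "missing" branch would be the place to log an error and stop before the temporary network is torn down.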
Host configuration file
The format of the configuration file doesn’t really matter as long as it can be parsed – JSON, XML, comma separated, etc. Ideally I’d like to tie this into a configuration management database backend, but this will take a bit more time and coding. The easiest way is to simply write this as a shell script and use source (or “.”) to load it (see above). These files can easily be created in large numbers with a few lines of code, e.g. an Excel macro does the trick.
So – here goes:
# General settings - self explanatory
CFG_DG="192.168.0.1";
CFG_DNS1="192.168.0.2";
CFG_DNS2="192.168.0.2";
CFG_SEARCHDOMAIN="mylab.local";
CFG_NTP1="ntp.cis.strath.ac.uk";
CFG_PASSWORD="Password123";
# Storage - switch on software iscsi, specify which vmk's to run this on
CFG_ENABLEISCSI="true";
CFG_ISCSIVMKS="vmk1,vmk2";
######################################################
# vSwitch configuration
CFG_VSWITCH0_NAME="vSwitch0";
CFG_VSWITCH0_UPLINK1="vmnic0";
CFG_VSWITCH0_UPLINK2="vmnic1";
CFG_VSWITCH0_ACTIVE="vmnic0,vmnic1";
CFG_VSWITCH0_STANDBY="";
# Failback and switch notify should be true or false
CFG_VSWITCH0_FAILBACK="false";
CFG_VSWITCH0_NOTIFY="true";
#Failuredetection: link / beacon
CFG_VSWITCH0_FAILUREDETECTION="link";
#Load balancing: iphash / mac / portid
CFG_VSWITCH0_LOADBAL="mac";
#Ports: 8,24,56,120,248,504,1016,2040,4088
CFG_VSWITCH0_PORTS="128";
CFG_VSWITCH0_MTU="1500";
#vSwitch1 - here used for iscsi
CFG_VSWITCH1_NAME="vSwitch1";
CFG_VSWITCH1_UPLINK1="vmnic2";
CFG_VSWITCH1_UPLINK2="vmnic3";
CFG_VSWITCH1_ACTIVE="vmnic2,vmnic3";
CFG_VSWITCH1_STANDBY="";
CFG_VSWITCH1_FAILBACK="true";
#Failuredetection: link / beacon
CFG_VSWITCH1_FAILUREDETECTION="link";
CFG_VSWITCH1_NOTIFY="true";
#Load balancing: iphash / mac / portid
CFG_VSWITCH1_LOADBAL="mac";
CFG_VSWITCH1_PORTS="128";
CFG_VSWITCH1_MTU="9000";
######################################################
# Port groups
CFG_PORTGROUP0_NAME="Management Network";
CFG_PORTGROUP0_VSWITCH="vSwitch0";
CFG_PORTGROUP0_VLAN="0";
# Configure active uplink for a port group - use this for iSCSI binding
CFG_PORTGROUP0_ACTIVE="";
CFG_PORTGROUP1_NAME="iSCSI_vmnic2";
CFG_PORTGROUP1_VSWITCH="vSwitch1";
CFG_PORTGROUP1_VLAN="100";
CFG_PORTGROUP1_ACTIVE="vmnic2";
CFG_PORTGROUP2_NAME="iSCSI_vmnic3";
CFG_PORTGROUP2_VSWITCH="vSwitch1";
CFG_PORTGROUP2_VLAN="100";
CFG_PORTGROUP2_ACTIVE="vmnic3";
CFG_PORTGROUP3_NAME="DATA_VSW0_VLAN130";
CFG_PORTGROUP3_VSWITCH="vSwitch0";
CFG_PORTGROUP3_VLAN="130";
CFG_PORTGROUP3_ACTIVE="";
######################################################
#Port vmkernel interfaces
CFG_VMK0_NAME="vmk0";
CFG_VMK0_PORTGROUP="Management Network";
CFG_VMK0_TYPE="static";
CFG_VMK0_IP="192.168.0.50";
CFG_VMK0_NETMASK="255.255.255.0";
CFG_VMK0_MTU="1500";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK0_TAGS="Management,VMotion";
CFG_VMK1_NAME="vmk1";
CFG_VMK1_PORTGROUP="iSCSI_vmnic2";
CFG_VMK1_TYPE="static";
CFG_VMK1_IP="192.168.0.60";
CFG_VMK1_NETMASK="255.255.255.0";
CFG_VMK1_MTU="9000";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK1_TAGS="";
CFG_VMK2_NAME="vmk2";
CFG_VMK2_PORTGROUP="iSCSI_vmnic3";
CFG_VMK2_TYPE="static";
CFG_VMK2_IP="192.168.0.61";
CFG_VMK2_NETMASK="255.255.255.0";
CFG_VMK2_MTU="9000";
# Management, VMotion, faultToleranceLogging, vSphereReplication
CFG_VMK2_TAGS="";
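Files like the one above lend themselves to being generated from an inventory. As a minimal sketch of that idea, the loop below writes one .cfg file per host from a comma separated list; the inventory layout (hostname, management IP, two iSCSI IPs), the output folder and the second host are all assumptions made up for the example, and only the per-host VMkernel addresses are written.

```shell
# Hypothetical output folder and inventory: hostname,mgmt IP,iSCSI IP 1,iSCSI IP 2
outdir="${TMPDIR:-/tmp}/ks-demo"
mkdir -p "${outdir}"
inventory="esxi1,192.168.0.50,192.168.0.60,192.168.0.61
esxi2,192.168.0.51,192.168.0.62,192.168.0.63"

# Write the host specific values for each inventory line to <hostname>.cfg.
printf '%s\n' "${inventory}" | while IFS=, read -r host mgmt iscsi1 iscsi2; do
    {
        printf 'CFG_VMK0_IP="%s";\n' "${mgmt}"
        printf 'CFG_VMK1_IP="%s";\n' "${iscsi1}"
        printf 'CFG_VMK2_IP="%s";\n' "${iscsi2}"
    } > "${outdir}/${host}.cfg"
done

ls "${outdir}"
```

Settings shared by every host (gateway, DNS, vSwitch layout) could be kept in a common template and prepended to each generated file.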
Next – network configuration…