VMware ESXi 6.0, Patch ESXi-6.0.0-20170604001-standard (2149958)

A lot of security patches and bug fixes for ESXi 6.0 were released today in Patch ESXi-6.0.0-20170604001-standard.

VMware ESXi 6.0, Patch ESXi-6.0.0-20170604001-standard (2149958)

Details

  Release date: June 6, 2017

Profile Name: ESXi-6.0.0-20170604001-standard
Build: For build information, see KB 2149954.
Vendor: VMware, Inc.
Release Date: June 6, 2017
Acceptance Level: PartnerSupported
Affected Hardware: N/A
Affected Software: N/A
Affected VIBs
VMware_bootbank_esx-base_6.0.0-3.69.5572656
VMware_bootbank_vsan_6.0.0-3.69.5568629
VMware_bootbank_vsanhealth_6.0.0-3000000.3.0.3.69.5572665
VMware_bootbank_ehci-ehci-hcd_1.0-4vmw.600.3.69.5572656
VMware_bootbank_misc-drivers_6.0.0-3.69.5572656
VMware_bootbank_xhci-xhci_1.0-3vmw.600.3.69.5572656
VMware_bootbank_esx-ui_1.19.0-5387100
PRs Fixed
1394451, 1410026, 1458558, 1664328, 1665732, 1668042, 1693612, 1701193, 1719837, 1742050, 1759902, 1765482, 1768322, 1769030, 1776387, 1777480, 1789705, 1790760, 1793825, 1794229, 1795756, 1795898, 1796950, 1797808, 1798708, 1799234, 1799321, 1799532, 1800279, 1800710, 1802377, 1804891, 1805516, 1806868, 1807032, 1807049, 1808533, 1809707, 1811355, 1823083, 1825991, 1827677, 1829156, 1830129, 1832833, 1832953, 1834405, 1835945, 1838497, 1843748, 1848150, 1848283, 1851131, 1822118, 1838537, 1790630, 1434443, 1801457, 1804667, 1832410, 1841523, 1843319, 1843969, 1848854, 1850816, 1856312, 1847505
Related CVE numbers: N/A

 

Solution
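
This patch can be applied with vSphere Update Manager, or the image profile can be installed directly from the ESXi shell with esxcli. A minimal sketch, assuming the host is in maintenance mode and can reach the public VMware online depot (alternatively, point -d at a downloaded offline bundle ZIP on a local datastore):

    # Apply the image profile from the VMware online depot
    esxcli software profile update -p ESXi-6.0.0-20170604001-standard \
        -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml
    # Reboot the host so the updated VIBs take effect
    reboot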

Summaries and Symptoms

Issues fixed in this patch (and their relevant symptoms, if applicable) include:
  • When you create a linked clone from a digest VMDK file, vCenter Server marks the digest disk file as non-deletable. Thus, when you delete the respective VM, the digest VMDK file is not deleted from the VM folder because of the ddb.deletable = FALSE ddb entry in the descriptor file.
  • The vm-support command uses a script called smartinfo.sh to collect SMART data for all storage devices on the ESXi host. The vm-support command imposes a 20-second timeout for every command that collects support data. However, smartinfo.sh takes more than 20 seconds to complete, which causes the vm-support command to report the following error:

    cmd /usr/lib/vmware/vm-support/bin/smartinfo.sh timed out after 20 seconds due to lack of progress in last 10 seconds (0 bytes read)
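
    If you need SMART data for a single device outside of vm-support, it can be queried directly from the ESXi shell (a sketch; naa.xxxxxxxxxxxxxxxx is a placeholder device identifier):

    # List storage devices to find the device identifier
    esxcli storage core device list
    # Read the SMART attributes of one device
    esxcli storage core device smart get -d naa.xxxxxxxxxxxxxxxx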

  • When you attempt to cancel a snapshot creation task, but the VASA provider is unable to cancel the related underlying operations on a VVoL-backed disk, a snapshotted VVoL is created and it exists until garbage collection cleans it up.
  • When you attempt to cancel a clone creation task, but the VASA provider is unable to cancel the related underlying operations, vCenter Server creates a new VVoL, copies all the data, and reports that the clone creation has been successful.
  • When you use PCI passthru with devices that use MSI-X and newer Linux kernels, a purple diagnostic screen that shows VMKPCIPassthru_SetupIntrProxy appears. This issue is due to the code in PCIPassthruChangeIntrSettings.
  • When you try to re-add a host to vCenter Server, hostd might crash if the host has IOFilter enabled and if VMs with enabled Changed Block Tracking (CBT) reside on that host. The filter library uses the poll and worker libraries. When the filter library is initialized before the poll and worker libraries, it cannot work properly and crashes.
  • When you take a snapshot of a virtual machine, the virtual machine might become unresponsive.
  • If you are using an IPv6 address type on an ESXi host, the host might become unavailable during shutdown.
  • If the active memory of a virtual machine that runs on an ESXi host falls under 1% and drops to zero, the host might start reclaiming memory even if the host has enough free memory.
    This issue is resolved in this release. On hosts without the patch, you can work around it by setting the Mem.SampleActivePctMin advanced setting to 1:

    1. Connect to vCenter Server by using the vSphere Web Client.
    2. Select an ESXi host in the inventory.
    3. Power off all virtual machines on the ESXi host.
    4. Click Settings.
    5. Under the System heading, click Advanced System Settings.
    6. Search for the Mem.SampleActivePctMin setting.
    7. Click Edit.
    8. Change the value to 1.
    9. Click OK to accept the changes.
    10. Power on the virtual machines.
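
    The same change can also be made from the ESXi shell (a sketch; the esxcli option path /Mem/SampleActivePctMin corresponds to the Mem.SampleActivePctMin setting referenced in the steps above):

    # Set the minimum sampled active memory percentage to 1
    esxcli system settings advanced set -o /Mem/SampleActivePctMin -i 1
    # Verify the new value
    esxcli system settings advanced list -o /Mem/SampleActivePctMin
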
  • If you disconnect your ESXi host from vCenter Server and some of the virtual machines on that host are using LAG, your ESXi host might become unresponsive when you reconnect it to vCenter Server after recreating the same LAG on the vCenter Server side. You might see an error, such as the following:

    0x439116e1aeb0:[0x418004878a9c]LACPScheduler@#+0x3c stack: 0x417fcfa00040
    0x439116e1aed0:[0x418003df5a26]Net_TeamScheduler@vmkernel#nover+0x7a stack: 0x43070000003c
    0x439116e1af30:[0x4180044f5004]TeamES_Output@#+0x410 stack: 0x4302c435d958
    0x439116e1afb0:[0x4180044e27a7]EtherswitchPortDispatch@#+0x633 stack: 0x0

  • After you create a virtual machine snapshot in SEsparse format, you might hit a rare race condition if there are significant but varying write IOPS to the snapshot. This race condition might make the ESXi host stop responding.
  • The network packet count calculation might be handled by multiple CPUs. This calculation might introduce a network statistics calculation error, and might display the wrong number in the network performance chart.
  • Your Guest OS might appear to slow down or might experience a CPU spike that disappears after you disable ASLR in the Guest OS and perform an FSR. The following processes might cause such behavior:
    • Your translation cache fills with translations for numerous user-level CPUID/RDTSC instructions that are encountered at different virtual addresses in the Guest OS.
    • Your virtual machine monitor uses a hash function with poor dispersion when checking for existing translations.
  • You are prompted for a password twice when connecting to an ESXi host through SSH if the ESXi host is upgraded from vSphere version 5.5 to 6.0 while being part of a domain.
  • The Simple Network Management Protocol (SNMP) agent reports the same value for both the ifOutErrors and ifOutOctets counters, when they should be different.
  • If a VVol VASA Provider returns an error during a storage profile change operation, vSphere tries to undo the operation, but the profile ID gets corrupted in the process.
  • Because of a memory leak, the hostd process might crash with the following error:

    "Memory exceeds hard limit. Panic".

    The hostd logs report numerous errors such as:

    Unable to build Durable Name

    This causes the host to get disconnected from vCenter Server.

  • The Pixman library is updated to version 0.35.1.
  • Per host Read/Write latency displayed for VVol datastores in the vSphere Web Client is incorrect.
  • A virtual machine configured to use EFI firmware fails to obtain an IP address when trying to PXE boot if the DHCP environment responds by IP unicast. The EFI firmware was not capable of receiving a DHCP reply sent by IP unicast.
  • The host profile operations check compliance, remediation, and cloning fail in an Auto Deploy environment. The following scenarios are observed:
    • During a fresh installation of ESXi hosts using Auto Deploy:
      • Check compliance for host profiles fails with a message similar to:

        Host is unavailable for checking compliance

      • Host profile remediation (apply host profiles) fails with the following error:

        Call "HostProfileManager.GenerateConfigTaskList" for object "HostProfileManager" on vCenter Server<vcenter_hostname> failed.</vcenter_hostname>

    • Changing the reference host of a host profile fails with the following error:

      Call "HostProfileManager.CreateProfile" for object "HostProfileManager" on vCenter Server failed.

    • Cloning a host profile fails with the following error:

      Call "HostProfileManager.CreateProfile" for object "HostProfileManager" on vCenter Server failed. The profile does not have an associated reference host.

    In the /var/log/syslog.log file, the failed operations appear with the following error:

    Error: profileData from only a single profile instance supported in VerifyMyProfilesPolicies.

  • An ESXi host with a virtual machine configured to use VMware vFlash Read Cache (VFRC) might fail with a purple diagnostic screen when the backend storage becomes slow or inaccessible. This failure is due to a locking defect in the VFRC code.
  • Using SEsparse both for creating snapshots and for cloning virtual machines might cause a corrupted Guest OS file system.
  • Because of incorrect accounting of overhead memory, a PSOD or a warning message occurs at unload time when a physically contiguous VMkernel heap with an initial allocated size of 64 MB or more is destroyed. The following warning message is observed:

    Heap: 2781: Non-empty heap () being destroyed (avail is , should be ).

  • Intelligent Platform Management Interface (IPMI) stack is unresponsive after a hard Baseboard Management Controller (BMC) reset.
  • In order to update the last-seen time stamp for each LUN on an ESXi host, a process has to acquire a lock on the /etc/vmware/lunTimestamps.log file. The lock is held longer than necessary in each process. If too many such processes try to update the /etc/vmware/lunTimestamps.log file, lock contention on this file might occur. If hostd is one of the processes trying to acquire the lock, the ESXi host might get disconnected from vCenter Server or become unresponsive, with lock contention error messages (on the lunTimestamps.log file) in the hostd logs. You might see an error message similar to:

    Error interacting with configuration file /etc/vmware/lunTimestamps.log: Timeout while waiting for lock, /etc/vmware/lunTimestamps.log.LOCK, to be released. Another process has kept this file locked for more than 30 seconds. The process currently holding the lock is (). This is likely a temporary condition. Please try your operation again.

    Note:

    • process_name is the process or service that is currently holding the lock on the /etc/vmware/lunTimestamps.log. For example, smartd, esxcfg-scsidevs, localcli, etc.
    • PID is the process ID for any of these services. 
  • DDR4 memory modules of Dell 13G servers are displayed as Unknown on the Hardware Status page in the vSphere Web Client.
  • During snapshot consolidation, a precise calculation might be performed to determine the storage space required for the consolidation. This precise calculation can cause the virtual machine to stop responding because it takes a long time to complete.
  • Virtual machines with SEsparse-based snapshots might stop responding during I/O operations with a specific type of I/O workload in multiple threads.
  • There is a memory leak in the Etherswitch heap for allocations between 32 and 63 bytes. When the heap runs out of memory, the VMs lose connectivity.
  • UTF-8 characters are not handled properly before being passed to a VVol VASA provider. As a result, VM storage profiles that use international characters are either not recognized by the VASA provider, or are treated or displayed incorrectly by it.
  • The ESXi600-201611001 patch includes a change that allows ESXi to disable Intel® IOMMU (also known as VT-d) interrupt remapper functionality. In HPE ProLiant Gen8 servers, disabling this functionality causes PCI errors. As a result of these errors, the platform generates an NMI that causes the ESXi host to fail with a purple diagnostic screen.

    With this ESXi600-201706001 patch release, ESXi re-enables the Intel IOMMU’s interrupt remapper functionality by default to avoid the failures in HPE ProLiant Gen8 servers.

  • When the Storage I/O Control (SIOC) changes a LUN's maximum queue depth parameter, an event notification is sent from the Pluggable Storage Architecture (PSA) to hostd. In setups where the queue depth parameter changes dynamically, the resulting flood of event notifications causes performance issues such as slow vSphere tasks or hostd disconnecting from vCenter Server. With this fix, PSA no longer sends these event notifications to hostd.
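
    One way to inspect a LUN's device properties, including its maximum queue depth, from the ESXi shell (a sketch; naa.xxxxxxxxxxxxxxxx is a placeholder device identifier):

    # Show device details, including "Device Max Queue Depth"
    esxcli storage core device list -d naa.xxxxxxxxxxxxxxxx
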
  • The CBRC filter uses 32-bit computation to perform calculations and returns a completion percentage for every digest recompute request. For large disks, the number of hashes is large enough to overflow the 32-bit calculation, resulting in an incorrect completion percentage.
  • When you reboot the ESXi host under the following conditions:
    • You use the vSphere Network Appliance (DVFilter) in an NSX environment
    • You migrate a virtual machine with vMotion under DVFilter control

    the host might fail with a purple diagnostic screen and the error PCPU xxx: no heartbeat.

  • During normal VM operation, VMware Tools services (version 9.10.0 and later) create vSocket connections to exchange data with the hypervisor. When a large number of such connections are made, the hypervisor may run out of lock serial numbers and the virtual machine powers off with an error.
  • Under certain conditions, a mandatory field in the VMODL object of the profile path is left unset. This condition triggers a serialization issue during answer file validation for the network configuration, causing a vpxd service crash.
  • Existing VMs using Instant Clone and new VMs, created with or without Instant Clone, lose connection with the Guest Introspection host module. As a result, the VMs are not protected and no new Guest Introspection configurations can be forwarded to the ESXi host. You are also presented with a “Guest introspection not ready” warning in the vCenter Server UI.
  • Multiple Couldn't enable keep alive warnings occur during communication between VMware NSX and partner solutions through a VMCI socket (vsock). The VMkernel log now omits these repeated warnings because they can be safely ignored.
  • The current calculation of the resync bytes overestimates the full resync traffic for RAID5/6 configurations. This may happen when either of the following is present:
    • Placing the node in maintenance mode with either “Full Data Migration” or “Ensure Accessibility” evacuation mode.
    • Creating a complete mirror of a component after losing it due to a failure in the cluster.
  • The Windows 2012 domain controller supports SMBv2, whereas the Likewise stack on ESXi supports only SMBv1.
    With this release, the Likewise stack on ESXi is enabled to support SMBv2.
  • For a VM with e1000/e1000e vNIC, when the e1000/e1000e driver tells the e1000/e1000e VMkernel emulation to skip a descriptor (the transmit descriptor address and length are 0), a loss of connectivity can occur and the VM can enter kernel panic state.
  • If a VM has a driver (especially a graphics driver) or an application that pins too much memory, it creates a sticky page in the VM. If such a VM is about to be migrated with vMotion to another host, the migration process is suspended and later fails due to an incorrect computation of pending input/output.
  • Every time the kernel API vmk_ScsiCmdGetVMUuid fails to obtain a valid UUID, it prints an error message in the system logs similar to the following:

    2016-06-30T16:46:08.749Z cpu6:33528)WARNING: World: vm 0: 11020: vm not found

    The issue is resolved by conditionally invoking the function World_GetVcUuid which caused the log spew from the kernel API vmk_ScsiCmdGetVMUuid.

  • For Pure Storage FlashArray devices, you previously had to add the SATP rule manually to set the SATP, PSP, and IOPS. The issue is resolved in this release, and a new SATP rule is added to ESXi to set SATP to VMW_SATP_ALUA, PSP to VMW_PSP_RR, and IOPS to 1 for all Pure Storage FlashArray models.

    Note: In case of a stateless ESXi installation, if an old host profile is applied, it overwrites the new rules after upgrade.
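
    On hosts that do not yet have this patch, an equivalent rule can be added manually from the ESXi shell (a sketch based on the rule described above; verify the vendor and model strings against your arrays):

    # Add a claim rule for Pure Storage FlashArray devices: ALUA SATP, Round Robin PSP, IOPS=1
    esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"
    # Confirm the rule is present
    esxcli storage nmp satp rule list | grep -i PURE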

  • Sometimes, when you try to unmount an NFS datastore from vCenter Server, the operation might fail with the error:

    NFS datastore unmount failure - Datastore has open files, cannot be unmounted.
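
    For reference, NFS datastores can also be listed and unmounted directly from the ESXi shell (a sketch; MyNFSDatastore is a placeholder volume name):

    # List mounted NFS datastores and their volume names
    esxcli storage nfs list
    # Unmount the NFS datastore by volume name
    esxcli storage nfs remove -v MyNFSDatastore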

  • When you use Storage vMotion on vSphere Virtual Volumes storage, the UUID of a virtual disk might change. The UUID identifies the virtual disk, and a changed UUID makes the virtual disk appear as a new and different disk. The UUID is also visible to the guest OS and might cause drive misidentification.
  • When you try to use host profiles to join an ESXi 6.x host to an Active Directory domain, the application hangs or fails with an error.
  • You see Failed to open file error messages in the vmkernel.log file when a vSAN-enabled ESXi host boots up or during a manual disk group mount in vSAN.
  • When I/O operations hang or drop at the HBA driver layer because of driver, firmware, connectivity, or storage issues, the stuck I/O is not aborted, which causes the VM to hang.
  • After you use vSphere Auto Deploy to add hosts to an Active Directory domain using vSphere Authentication Proxy, you cannot log in to the host with AD credentials.
  • When you try to use vSphere Auto Deploy to add hosts to an Active Directory domain using vSphere Authentication Proxy, the hosts cannot be joined to the domain.
  • If an ESXi host is connected to a vSphere Distributed Switch configured with LACP and you try to use vSphere vMotion, you see a warning similar to:

    Currently connected network interface 'Network Adapter 1' uses network 'DSwitchName', which is not accessible.

  • When the unmap commands fail, the ESXi host might stop responding due to a memory leak in the failure path. You might receive the following error message in the vmkernel.log file:

    FSDisk: 300: Issue of delete blocks failed [sync:0] and the host gets unresponsive.
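
    For context, on ESXi 6.0 dead space reclamation on a VMFS datastore is triggered manually with a command like the following (a sketch; MyDatastore is a placeholder datastore label):

    # Reclaim unused blocks on a VMFS datastore (issues SCSI UNMAP to the backing LUN)
    esxcli storage vmfs unmap -l MyDatastore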

  • If you use SEsparse and enable the unmapping operation, the file system of the guest OS might become corrupted. This issue occurs only when you use SEsparse to create snapshots and clones of virtual machines; after the wipe operation (the storage unmap) completes, the file system of the guest OS might be corrupted. A full clone of the virtual machine works correctly.
  • When you hot-add an existing or new virtual disk to a CBT-enabled VM residing on a VVol datastore, the guest operating system might stop responding until the hot-add process completes. The duration of the VM unresponsiveness depends on the size of the virtual disk being added. The VM automatically recovers once the hot-add completes.
  • To define the storage I/O scheduling policy for a virtual machine (VM), you can configure the I/O throughput for each VM disk by modifying the IOPS limit. When you edit the IOPS limit and CBT is enabled for the VM, the operation fails with the error The scheduling parameter change failed. Because of this problem, the scheduling policies of the VM cannot be altered. The error message appears in the vSphere Recent Tasks pane. You can see the following errors in the /var/log/vmkernel.log file:

    2016-11-30T21:01:56.788Z cpu0:136101)VSCSI: 273: handle 8194(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000
    2016-11-30T21:01:56.788Z cpu0:136101)ScsiSched: 2760: Invalid Bandwidth Cap Configuration
    2016-11-30T21:01:56.788Z cpu0:136101)WARNING: VSCSI: 337: handle 8194(vscsi0:0):Failed to invert policy
    .

  • A few races happen between multiple LSOM internal code paths. Freeing up a region in the caching tier twice leads to a stack trace and panic of the type:

    PanicvPanicInt@vmkernel#nover+0x36b stack: 0x417ff6af0980, 0x4180368
    2015-04-20T16:27:38.399Z cpu7:1000015002)0x439124d1a780:[0x4180368ad6b7]Panic_vPanic@vmkernel#nover+0x23 stack: 0x46a, 0x4180368d7bc1, 0x43a
    2015-04-20T16:27:38.411Z cpu7:1000015002)0x439124d1a7a0:[0x4180368d7bc1]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x439124d1a800, 0x4
    2015-04-20T16:27:38.423Z cpu7:1000015002)0x439124d1a800:[0x418037cc6d46]SSDLOG_FreeLogEntry@LSOMCommon#1+0xb6e stack: 0x6, 0x4180368dd0f4, 0
    2015-04-20T16:27:38.435Z cpu7:1000015002)0x439124d1a880:[0x418037d3c351]PLOGCommitDispatch@com.vmware.plog#0.0.0.1+0x849 stack: 0x46a7500, 0

    The races happen between PLOG Relog, PLOG Probe, and PLOG Decommission workflows.

  • In some cases, the system might display a generic error message rather than a specific message for identifying out of space issues. For example, when a failure is caused by insufficient disk space, you can see an error message such as:

    Storage policy change failure: 12 (Cannot allocate memory).

  • Under heavy I/O workload, a vSAN process might occupy the CPU cycles for a long time, which results in a brief PCPU lockup. This leads to a non-maskable interrupt and a log spew in the vmkernel log files.
  • A vSAN-enabled ESXi host might fail with a purple diagnostic screen showing the following backtrace:

    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bd20:[0x418032a77f83]Panic_vPanic@vmkernel#nover+0x23 stack: 0x4313df6720ba, 0x418032a944
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bd40:[0x418032a944a9]vmk_PanicWithModuleID@vmkernel#nover+0x41 stack: 0x43911b29bda0, 0x4
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bda0:[0x41803387b46c]vs_space_mgmt_svc_start@com.vmware.virsto#0.0.0.1+0x414 stack: 0x100
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29be00:[0x41803384266d]Virsto_StartInstance@com.vmware.virsto#0.0.0.1+0x68d stack: 0x4312df
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bf00:[0x4180338f138f]LSOMMountHelper@com.vmware.lsom#0.0.0.1+0x19b stack: 0x43060d72b980,
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bf30:[0x418032a502c2]helpFunc@vmkernel#nover+0x4e6 stack: 0x0, 0x43060d6a60a0, 0x35, 0x0,
    2017-02-19T09:58:26.778Z cpu17:33637)0x43911b29bfd0:[0x418032c14c1e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0

  • If you use objtool on a vSAN witness host, it performs an ioctl call that leads to a NULL pointer dereference on the witness host. As a result, the ESXi host fails with a purple diagnostic screen.
  • During the decommissioning of a vSAN disk group with deduplication and compression enabled, the issue occurs when the disk group contains disks on which device access commands fail. The failures can be verified by vmkernel log messages such as:

    Partition: 914: Read of GPT header (hdrlba = 1) failed on "naa.55cd2e404c185332" : I/O error

    This results in a failure of the vSAN host when decommissioning.
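
    A quick way to confirm whether a disk's GPT header can still be read is to query the partition table directly from the ESXi shell (a sketch that reuses the device identifier from the log line above; substitute your own device):

    # Print the partition table of the device; an I/O error here indicates a failing disk
    partedUtil getptbl /vmfs/devices/disks/naa.55cd2e404c185332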

  • Sometimes, the vSAN health user interface might report an incorrect status for the network health check of the type All hosts have a Virtual SAN vmknic configured and then trigger a false vCenter Server alarm.
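
    To cross-check this health finding against the host itself, the VMkernel interfaces configured for vSAN traffic can be listed from the ESXi shell (a sketch):

    # List the vmknics that carry vSAN traffic on this host
    esxcli vsan network list
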
  • Depending on the workload and the number of virtual machines, diskgroups on the host might go into permanent device loss (PDL) state. This causes the diskgroups to not admit further IOs, rendering them unusable until manual intervention is performed.
  • Removing a vSAN component that is in an invalid state from a vSAN cluster might cause a virtual machine to stop responding or a host to become disconnected from vCenter Server.
  • An ESXi host might fail with a PSOD because of a race between the distributed object manager client initialization and distributed object manager VMkernel sysinfo interface codepaths.
  • When you try to log in to the VMware Host Client by using Chrome 57, the VMware Host Client immediately reports an error. The reported error is an Angular Digest in progress error.

source:

http://kb.vmware.com