In January 2018, Microsoft released an advisory and security updates for a new class of hardware vulnerabilities involving speculative execution side channels, known as Spectre and Meltdown. In this blog post, we will provide a technical analysis of a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF), which has been assigned CVE-2018-3615 (for SGX), CVE-2018-3620 (for operating systems and SMM), and CVE-2018-3646 (for virtualization). This vulnerability affects Intel Core and Xeon processors.
This post is primarily geared toward security researchers and engineers who are interested in a technical analysis of L1TF and the mitigations that are relevant to it. If you are interested in more general guidance, please refer to Microsoft's security advisory for L1TF.
Please note that the information in this post is accurate as of the time of writing.
Overview of L1 Terminal Fault (L1TF)
In a previous post, we defined four categories of speculation primitives that provide the fundamental methods for creating the conditions for speculative execution side channels: conditional branch misprediction, indirect branch misprediction, exception delivery or deferral, and memory access misprediction. L1TF belongs to the exception delivery or deferral category of speculation primitives (along with Meltdown and Lazy FP State Restore) because it deals with speculative (or out-of-order) execution related to logic that gives rise to an architectural exception. In this post, we will provide a general overview of L1TF. For a more thorough analysis, please refer to the advisory and whitepaper that Intel has published for this vulnerability.
L1TF arises due to a CPU optimization related to the handling of address translation during a page table walk. When translating a linear address, the CPU may encounter a terminal page fault, which occurs when the paging structure entry for a virtual address is not present (the Present bit is 0) or is otherwise invalid. Architecturally, this will result in a page fault or a TSX transaction abort. However, a CPU that is vulnerable to L1TF may initiate a read from the linear address being translated before either of these occurs. For this speculative-only read, the page frame bits of the terminal page table entry are treated as a system physical address, even for guest page table entries. If the cache line for the physical address is present in the L1 data cache, the data for that line may be forwarded to dependent operations, which may execute speculatively before the instruction that triggered the terminal page fault retires. L1TF behavior can occur for page table walks involving both conventional and extended page tables (the latter of which are used for virtualization).
To illustrate how this can happen, consider the following simplified example. In this example, an attacker-controlled virtual machine (VM) has constructed a page table hierarchy with the goal of reading a desired system (host) physical address. The following diagram shows a hierarchy for the virtual address 0x12345000 where the terminal page table entry is not present but has page frame bits of 0x9a0:
After establishing this hierarchy, the VM could then attempt to read from system physical addresses within [0x9a0000, 0x9a1000) using the following instruction sequence:
```
01: 4C0FB600 movzx r8,byte [rax]   ; rax = 0x12345040
02: 49C1E00C shl r8,byte 0xc
03: 428B0402 mov eax,[rdx+r8]      ; rdx = address of signal array
```
By executing these instructions within a TSX transaction (or by handling the architectural page fault), the VM could trigger a speculative load from the L1 data cache line associated with system physical address 0x9a0040 (if that line is present in L1) and have the first byte of the cache line forwarded to a dependent out-of-order load that uses the byte as an offset into a signal array. If this system physical address has not been assigned to the VM, this would create the conditions for using a disclosure primitive such as FLUSH+RELOAD to observe the byte value, thus disclosing information across a security boundary.
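To make the disclosure primitive more concrete, the following is a minimal, hypothetical C sketch of the measurement side of FLUSH+RELOAD (the `signal_array` name, page-sized stride, and threshold parameter are illustrative assumptions, not code from the original example):

```c
#include <stdint.h>
#include <x86intrin.h>

#define STRIDE 0x1000  /* one page per entry, matching the "shl r8,0xc" above */

/* 256 entries, one per possible byte value; hypothetical name. */
extern uint8_t signal_array[256 * STRIDE];

/* Time a single load; a fast load implies the line was already cached. */
static uint64_t time_access(volatile uint8_t *p)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*p;
    return __rdtscp(&aux) - start;
}

/* Flush every probe line, trigger the speculative read, then reload. */
int recover_byte(uint64_t hit_threshold)
{
    for (int i = 0; i < 256; i++)
        _mm_clflush(&signal_array[i * STRIDE]);

    /* ... trigger the speculative read here, e.g. within a TSX
       transaction executing the instruction sequence shown above ... */

    for (int i = 0; i < 256; i++)
        if (time_access(&signal_array[i * STRIDE]) < hit_threshold)
            return i;   /* cache hit reveals the inferred byte value */
    return -1;          /* no hit observed */
}
```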
While the scenario above illustrates how L1TF can be used to infer physical memory across a virtual machine boundary (where the virtual machine has full control over the guest page tables), it is also possible for L1TF to be used in other scenarios. For example, a user mode application could attempt to use L1TF to read from physical addresses referred to by not-present terminal page table entries within its own address space. In practice, operating systems often use the software bits in the not-present page table entry format to store metadata, and this metadata may correspond to valid physical page frames. This could enable a process to read physical memory that has not been assigned to it (or to its VM, in the case of virtualization), or that is not supposed to be accessible (such as Windows' PAGE_NOACCESS memory).
L1 Terminal Fault (L1TF) mitigations
There are multiple mitigations for L1TF, and the mitigations that are needed depend on the category of attack being mitigated. To illustrate this, we will describe the software security models that L1TF is relevant to and the specific tactics that can be used to mitigate it, using the mitigation taxonomy from our previous post on mitigating speculative execution side channels. The mitigations described in this section frequently need to be used in combination with one another in order to provide a comprehensive defense against L1TF.
Relevance to software security models
The following table summarizes the potential relevance of L1TF to the various intra-device attack scenarios that software security models are typically concerned with. Unlike Meltdown (CVE-2017-5754), which was only relevant to the kernel-to-user scenario, L1TF is applicable to all intra-device attack scenarios, as indicated by the orange cells (gray cells would have indicated not applicable). This is because L1TF can potentially be used to read any physical memory on the system.
| Attack category | Attack scenario | L1TF |
|---|---|---|
| Inter-VM | Hypervisor-to-guest | CVE-2018-3646 |
| Inter-VM | Host-to-guest | CVE-2018-3646 |
| Inter-VM | Guest-to-guest | CVE-2018-3646 |
| Intra-OS | Kernel-to-user | CVE-2018-3620 |
| Intra-OS | Process-to-process | CVE-2018-3620 |
| Intra-OS | Intra-process | CVE-2018-3620 |
| Enclave | Enclave-to-any | CVE-2018-3615 |
| Enclave | VSM-to-any | CVE-2018-3646 |
Preventing L1TF speculation techniques
As we have noted before, one of the best ways to mitigate a vulnerability is to address the issue as close to the root cause as possible. In the case of L1TF, there are multiple mitigations that can prevent its speculation techniques from being used.
Safe page frame bits in not-present page table entries
For an L1TF attack to occur, the page frame bits of a terminal page table entry must refer to a valid physical page that contains sensitive data from another security domain. This means that a compliant hypervisor and operating system kernel can mitigate certain L1TF attack scenarios by ensuring that the page frame bits of not-present page table entries either 1) always refer to benign data or 2) are set such that high order bits refer to physical memory that is not accessible. In the case of #2, the Windows kernel uses a bit that is less than the number of physical address bits implemented and supported by the processor in order to avoid the possibility of physical address truncation (e.g., dropping of the high order bit).
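As a rough illustration of strategy #2, the following hypothetical C sketch shows how a kernel might sanitize a not-present PTE by setting a high order physical address bit (bit 45 here, matching the Windows example shown later). The constants and helper name are illustrative assumptions, not the actual Windows implementation:

```c
#include <stdint.h>

#define PTE_PRESENT_BIT   (1ull << 0)
/* Bits 12-51 of a 4-level x64 PTE map directly to physical address bits,
 * so setting PTE bit 45 sets physical address bit 45. The chosen bit must
 * be above all accessible physical memory yet below the processor's
 * implemented physical address width so it cannot be truncated away. */
#define SAFE_HIGH_PA_BIT  45

/* Hypothetical helper: make a not-present PTE safe against L1TF reads. */
static inline uint64_t make_safe_not_present_pte(uint64_t pte)
{
    pte &= ~PTE_PRESENT_BIT;           /* ensure the Present bit is clear  */
    pte |= 1ull << SAFE_HIGH_PA_BIT;   /* point the PFN at inaccessible PA */
    return pte;
}
```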
Beginning with the August 2018 Windows security updates, all supported versions of the Windows kernel and the Hyper-V hypervisor enforce #1 and #2 automatically on hardware that is vulnerable to L1TF. This is enforced both for not-present conventional page table entries and for not-present extended page table entries. Note that, per our published guidance for Windows Server, speculative execution side channel mitigations are not enabled by default and need to be enabled.
To illustrate how this works, consider the following example of a user mode virtual address that does not have a valid PTE because it is inaccessible. In this case, the page frame bits still refer to what could be a valid physical address and are therefore relevant to L1TF:
```
26: kd> !pte 0x00000281`d84c0000
... PTE at FFFFB30140EC2600
... contains 0000000356CDEB00
... not valid
...  Transition: 356cde
...  Protect: 18 - No Access

26: kd> dt nt!HARDWARE_PTE FFFFB30140EC2600
   +0x000 Valid            : 0y0
   +0x000 Write            : 0y0
   +0x000 Owner            : 0y0
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y0
   +0x000 Dirty            : 0y0
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y1
   +0x000 CopyOnWrite      : 0y1
   +0x000 Prototype        : 0y0
   +0x000 reserved0        : 0y1
   +0x000 PageFrameNumber  : 0y000000000000001101010110110011011110 (0x356cde)
   +0x000 reserved1        : 0y0000
   +0x000 SoftwareWsIndex  : 0y00000000000 (0)
   +0x000 NoExecute        : 0y0
```
With the August 2018 Windows security updates in effect, we can observe the new behavior of setting a high order bit (in this case, bit 45) in the not-present page table entry, such that it refers to physical memory that is either inaccessible or guaranteed to be benign. Since this does not correspond to an accessible physical address, any attempt to read from it using L1TF will fail.
```
17: kd> !pte 0x00000196`04840000
... PTE at FFFF8000CB024200
... contains 0000200129CB2B00
... not valid
...  Transition: 200129cb2
...  Protect: 18 - No Access

17: kd> dt nt!HARDWARE_PTE FFFF8000CB024200
   +0x000 Valid            : 0y0
   +0x000 Write            : 0y0
   +0x000 Owner            : 0y0
   +0x000 WriteThrough     : 0y0
   +0x000 CacheDisable     : 0y0
   +0x000 Accessed         : 0y0
   +0x000 Dirty            : 0y0
   +0x000 LargePage        : 0y0
   +0x000 Global           : 0y1
   +0x000 CopyOnWrite      : 0y1
   +0x000 Prototype        : 0y0
   +0x000 reserved0        : 0y1
   +0x000 PageFrameNumber  : 0y001000000000000100101001110010110010 (0x200129cb2)
   +0x000 reserved1        : 0y0000
   +0x000 SoftwareWsIndex  : 0y00000000000 (0)
   +0x000 NoExecute        : 0y0
```
To provide virtual machines with a portable method of determining the number of implemented physical address bits on a system, the Hyper-V hypervisor Top-Level Functional Specification (TLFS) has been updated to include a defined interface through which VMs can query this information. This makes it possible for virtual machines to migrate safely within a migration pool.
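For reference, the implemented physical address width can also be read directly from the processor via CPUID leaf 0x80000008 (EAX bits 7:0), as in the C sketch below. Note that this reports what the local processor implements, whereas the TLFS interface described above exists so that guests in a migration pool receive a consistent answer across hosts:

```c
#include <cpuid.h>

/* Query MAXPHYADDR, the number of implemented physical address bits. */
static unsigned physical_address_bits(void)
{
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
        return 36;       /* conservative fallback if the leaf is absent */
    return eax & 0xff;   /* EAX[7:0] = physical address width in bits   */
}
```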
Flush L1 data cache on security domain transition
For sensitive data from a victim security domain to be disclosed using L1TF, it must be present in the L1 data cache (note that the L1 data cache is shared by all logical processors (LPs) on a physical core). This means that flushing the L1 data cache when transitioning between security domains can prevent disclosure. To make this possible, Intel has released a microcode update that supports an architectural interface for flushing the L1 data cache.
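The architectural interface is documented by Intel: support is enumerated via CPUID.(EAX=7,ECX=0):EDX[28] (L1D_FLUSH), and the flush is requested by writing bit 0 of the IA32_FLUSH_CMD MSR (0x10B). Below is a minimal kernel-mode sketch in C, with `wrmsr` standing in for whatever MSR helper a given kernel provides:

```c
#include <stdint.h>

#define MSR_IA32_FLUSH_CMD  0x10B
#define L1D_FLUSH           (1ull << 0)

/* Minimal wrmsr wrapper; must execute at CPL0. */
static inline void wrmsr(uint32_t msr, uint64_t value)
{
    __asm__ __volatile__("wrmsr"
                         :
                         : "c"(msr),
                           "a"((uint32_t)value),
                           "d"((uint32_t)(value >> 32)));
}

/* Write back and invalidate the L1 data cache. The caller is expected
 * to have verified CPUID.(EAX=7,ECX=0):EDX[28] (L1D_FLUSH) first. */
static void flush_l1_data_cache(void)
{
    wrmsr(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
}
```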
As of the August 2018 Windows security updates, the Hyper-V hypervisor uses the new L1 data cache flush feature, when present, to ensure that VM data is removed from the L1 data cache at critical points. On Windows Server 2016+ and Windows 10 1607+, the flush occurs when switching between virtual processor contexts of different VMs. This minimizes the performance impact of the flush by reducing the frequency at which it needs to occur. On previous versions of Windows, the flush occurs before executing a virtual processor (e.g., before VMENTRY).
To ensure the robustness of the L1 data cache flush in the Hyper-V hypervisor, the flush is performed in combination with the safe use or disablement of HyperThreading and per-virtual-processor hypervisor address spaces, which are described in the following sections.
For SGX scenarios, Intel's microcode update ensures that the L1 data cache is flushed whenever the logical processor exits enclave execution mode. The microcode update also makes it possible to attest to whether HyperThreading (HT) has been enabled by the BIOS. When HT is enabled, there is a risk that a sibling logical processor may be able to attack enclave secrets in the L1 data cache before they are flushed or removed. If the entity verifying an attestation considers the risk of L1TF attacks from a sibling logical processor to be unacceptable, it may choose to reject attestations from systems with HT enabled.
Safe scheduling of sibling logical processors
Intel's HyperThreading (HT) technology, also known as simultaneous multithreading (SMT), allows multiple logical processors (LPs) to execute simultaneously on a physical core. Each sibling LP can simultaneously execute code in different security domains and privilege modes. For example, one LP may execute in the hypervisor while a sibling LP executes code within a virtual machine. This has implications for the L1 data cache flush because sensitive data may be able to re-enter the L1 data cache via a sibling LP after the flush has occurred.
This can be mitigated by either safely scheduling the code that executes on sibling LPs or by disabling HT. Both of these approaches ensure that the L1 data cache of a core is not polluted with data from another security domain after a flush occurs.
On Windows Server 2016 and above, the Hyper-V hypervisor supports the core scheduler, which ensures that the virtual processors scheduled to run on a physical core (i.e., on sibling hyperthreads) always belong to the same VM. This feature requires an administrator opt-in on Windows Server 2016 and is enabled by default beginning with Windows Server 2019. In combination with per-virtual-processor hypervisor address spaces, this makes it possible to defer the L1 data cache flush to the point where a core begins executing virtual processors from a different VM, rather than performing the flush on every VMENTRY. For more details on how this is implemented, please refer to the in-depth Hyper-V technical blog on this topic.
The following diagram illustrates the difference between the two virtual processor scheduling policies in a scenario involving two VMs (VM 1 and VM 2). As the diagram shows, when core scheduling is enabled, it is not possible for code from two different VMs to execute concurrently on a core (in this case, core 2).
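To make the scheduling constraint concrete, here is a conceptual C sketch (hypothetical types and names, not Hyper-V's implementation) of the invariant the core scheduler enforces: a virtual processor may only run on a logical processor if every sibling LP on that core is idle or running a virtual processor from the same VM:

```c
#include <stdbool.h>
#include <stddef.h>

#define LPS_PER_CORE 2                 /* two HT siblings per core */

struct virtual_processor { int vm_id; };
struct logical_processor { struct virtual_processor *current; /* NULL = idle */ };
struct physical_core     { struct logical_processor lps[LPS_PER_CORE]; };

/* May 'candidate' run on LP 'lp_index' without mixing VMs on the core? */
static bool core_scheduler_allows(const struct physical_core *core,
                                  size_t lp_index,
                                  const struct virtual_processor *candidate)
{
    for (size_t i = 0; i < LPS_PER_CORE; i++) {
        const struct virtual_processor *sibling = core->lps[i].current;
        if (i != lp_index && sibling != NULL &&
            sibling->vm_id != candidate->vm_id)
            return false;  /* sibling is running a different VM */
    }
    return true;
}
```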
On Windows versions prior to Windows Server 2016 and on all versions of Windows client with virtualization enabled, it may be necessary to disable HT in order to ensure the robustness of the L1 data cache flush for inter-VM isolation. This is also currently necessary on Windows Server 2016+ for scenarios where Virtual Secure Mode (VSM) is used to isolate secrets. When HT is disabled, sibling logical processors cannot execute concurrently on a physical core. Please refer to our advisory for guidance on how to disable HT on Windows.
Removing sensitive content from memory
Another tactic for mitigating speculative execution side channels is to remove sensitive content from the address space so that it cannot be disclosed through speculative execution.
Per-virtual-processor address spaces
Prior to the advent of speculative execution side channels, hypervisors did not have a compelling need to partition their virtual address space on a per-VM basis. As a result, hypervisors have often maintained a virtual mapping of all physical memory to simplify memory accesses. With the existence of L1TF and other speculative execution side channels, however, cross-VM secrets should not be mapped into the hypervisor's virtual address space when it is acting on behalf of a VM.
Beginning with the August 2018 security updates, the Hyper-V hypervisor on Windows Server 2016+ and Windows 10 1607+ no longer maps all of physical memory into its virtual address space. Instead, it uses per-virtual-processor (and therefore per-VM) address spaces. This ensures that a given virtual processor can only access the memory that is allocated to its VM and to the hypervisor acting on the VM's behalf. This mitigation works in combination with the L1 data cache flush and the safe use or disablement of HT to ensure that no sensitive cross-VM information can be disclosed via L1TF.
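Conceptually, this mitigation replaces a single hypervisor address space that maps all physical memory with one page-table root per virtual processor that maps only the hypervisor plus that VP's VM memory. A hypothetical sketch (all names are illustrative, not Hyper-V internals):

```c
#include <stdint.h>

struct vp_address_space {
    /* Maps only the hypervisor itself plus the owning VM's memory. */
    uint64_t page_table_root;
};

struct virtual_processor {
    int vm_id;
    struct vp_address_space aspace;
};

/* Hypothetical primitive, e.g. a mov to CR3 on x64. */
extern void load_page_table_root(uint64_t root);

static void switch_to_virtual_processor(const struct virtual_processor *vp)
{
    /* Because other VMs' secrets are not mapped here, the hypervisor
     * cannot pull cross-VM data into the L1 data cache while acting
     * on this VP's behalf. */
    load_page_table_root(vp->aspace.page_table_root);
}
```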
Applicability of mitigations
The mitigations described in the previous sections combine to provide broad protection against L1TF. The following tables summarize the attack scenarios, the mitigations that are relevant to them, and the default settings for these mitigations across versions of Windows Server and Windows client:
| Attack category | Windows Server 2016+ | Pre-Windows Server 2016 | Windows 10 1607+ | Pre-Windows 10 1607 |
|---|---|---|---|---|
| Inter-VM | Enabled: safe page frame bits. Opt-in: L1 data cache flush; enable core scheduler or disable HT | Enabled: safe page frame bits. Opt-in: L1 data cache flush; disable HT | Enabled: safe page frame bits, L1 data cache flush, per-virtual-processor address spaces. Opt-in: disable HT | Enabled: safe page frame bits, L1 data cache flush. Opt-in: disable HT |
| Intra-OS | Enabled: safe page frame bits | Enabled: safe page frame bits | Enabled: safe page frame bits | Enabled: safe page frame bits |
| Enclave | Enabled: L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled: L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled: L1 data cache flush. Opt-in (SGX/VSM): disable HT | Enabled: L1 data cache flush. Opt-in (SGX/VSM): disable HT |
The following table provides a more concise summary of the relationship between the L1TF attack scenarios and the mitigations that apply to them:
| Mitigation tactic | Mitigation name | Inter-VM | Intra-OS | Enclave |
|---|---|---|---|---|
| Prevent speculation techniques | Flush L1 data cache on security domain transition | ✔ | | ✔ |
| Prevent speculation techniques | Safe scheduling of sibling logical processors | ✔ | | ✔ |
| Prevent speculation techniques | Safe page frame bits in not-present page table entries | ✔ | ✔ | |
| Remove sensitive content from memory | Per-virtual-processor address spaces | ✔ | | |
Wrapping up
In this post, we analyzed a new speculative execution side channel vulnerability known as L1 Terminal Fault (L1TF). This vulnerability affects a broad range of attack scenarios, and mitigating it on systems with affected Intel processors requires a combination of software and firmware (microcode) updates. The discovery of L1TF demonstrates that research into speculative execution side channels is ongoing, and we will continue to evolve our response and mitigation strategy as a result. We continue to encourage researchers to report new discoveries through our Speculative Execution Side Channel Bounty Program.
Matt Miller, Microsoft Security Response Center (MSRC)