254 lines
13 KiB
ReStructuredText
254 lines
13 KiB
ReStructuredText
======================================================
|
||
Confidential Computing in Linux for x86 virtualization
|
||
======================================================
|
||
|
||
.. contents:: :local:
|
||
|
||
By: Elena Reshetova <elena.reshetova@intel.com> and Carlos Bilbao <carlos.bilbao.osdev@gmail.com>
|
||
|
||
Motivation
|
||
==========
|
||
|
||
Kernel developers working on confidential computing for virtualized
|
||
environments in x86 operate under a set of assumptions regarding the Linux
|
||
kernel threat model that differ from the traditional view. Historically,
|
||
the Linux threat model acknowledges attackers residing in userspace, as
|
||
well as a limited set of external attackers that are able to interact with
|
||
the kernel through various networking or limited HW-specific exposed
|
||
interfaces (USB, thunderbolt). The goal of this document is to explain
|
||
additional attack vectors that arise in the confidential computing space
|
||
and discuss the proposed protection mechanisms for the Linux kernel.
|
||
|
||
Overview and terminology
|
||
========================
|
||
|
||
Confidential Computing (CoCo) is a broad term covering a wide range of
|
||
security technologies that aim to protect the confidentiality and integrity
|
||
of data in use (vs. data at rest or data in transit). At its core, CoCo
|
||
solutions provide a Trusted Execution Environment (TEE), where secure data
|
||
processing can be performed and, as a result, they are typically further
|
||
classified into different subtypes depending on the SW that is intended
|
||
to be run in TEE. This document focuses on a subclass of CoCo technologies
|
||
that are targeting virtualized environments and allow running Virtual
|
||
Machines (VM) inside TEE. From now on in this document will be referring
|
||
to this subclass of CoCo as 'Confidential Computing (CoCo) for the
|
||
virtualized environments (VE)'.
|
||
|
||
CoCo, in the virtualization context, refers to a set of HW and/or SW
|
||
technologies that allow for stronger security guarantees for the SW running
|
||
inside a CoCo VM. Namely, confidential computing allows its users to
|
||
confirm the trustworthiness of all SW pieces to include in its reduced
|
||
Trusted Computing Base (TCB) given its ability to attest the state of these
|
||
trusted components.
|
||
|
||
While the concrete implementation details differ between technologies, all
|
||
available mechanisms aim to provide increased confidentiality and
|
||
integrity for the VM's guest memory and execution state (vCPU registers),
|
||
more tightly controlled guest interrupt injection, as well as some
|
||
additional mechanisms to control guest-host page mapping. More details on
|
||
the x86-specific solutions can be found in
|
||
:doc:`Intel Trust Domain Extensions (TDX) </arch/x86/tdx>` and
|
||
`AMD Memory Encryption <https://www.amd.com/system/files/techdocs/sev-snp-strengthening-vm-isolation-with-integrity-protection-and-more.pdf>`_.
|
||
|
||
The basic CoCo guest layout includes the host, guest, the interfaces that
|
||
communicate guest and host, a platform capable of supporting CoCo VMs, and
|
||
a trusted intermediary between the guest VM and the underlying platform
|
||
that acts as a security manager. The host-side virtual machine monitor
|
||
(VMM) typically consists of a subset of traditional VMM features and
|
||
is still in charge of the guest lifecycle, i.e. create or destroy a CoCo
|
||
VM, manage its access to system resources, etc. However, since it
|
||
typically stays out of CoCo VM TCB, its access is limited to preserve the
|
||
security objectives.
|
||
|
||
In the following diagram, the "<--->" lines represent bi-directional
|
||
communication channels or interfaces between the CoCo security manager and
|
||
the rest of the components (data flow for guest, host, hardware) ::
|
||
|
||
+-------------------+ +-----------------------+
|
||
| CoCo guest VM |<---->| |
|
||
+-------------------+ | |
|
||
| Interfaces | | CoCo security manager |
|
||
+-------------------+ | |
|
||
| Host VMM |<---->| |
|
||
+-------------------+ | |
|
||
| |
|
||
+--------------------+ | |
|
||
| CoCo platform |<--->| |
|
||
+--------------------+ +-----------------------+
|
||
|
||
The specific details of the CoCo security manager vastly diverge between
|
||
technologies. For example, in some cases, it will be implemented in HW
|
||
while in others it may be pure SW.
|
||
|
||
Existing Linux kernel threat model
|
||
==================================
|
||
|
||
The overall components of the current Linux kernel threat model are::
|
||
|
||
+-----------------------+ +-------------------+
|
||
| |<---->| Userspace |
|
||
| | +-------------------+
|
||
| External attack | | Interfaces |
|
||
| vectors | +-------------------+
|
||
| |<---->| Linux Kernel |
|
||
| | +-------------------+
|
||
+-----------------------+ +-------------------+
|
||
| Bootloader/BIOS |
|
||
+-------------------+
|
||
+-------------------+
|
||
| HW platform |
|
||
+-------------------+
|
||
|
||
There is also communication between the bootloader and the kernel during
|
||
the boot process, but this diagram does not represent it explicitly. The
|
||
"Interfaces" box represents the various interfaces that allow
|
||
communication between kernel and userspace. This includes system calls,
|
||
kernel APIs, device drivers, etc.
|
||
|
||
The existing Linux kernel threat model typically assumes execution on a
|
||
trusted HW platform with all of the firmware and bootloaders included on
|
||
its TCB. The primary attacker resides in the userspace, and all of the data
|
||
coming from there is generally considered untrusted, unless userspace is
|
||
privileged enough to perform trusted actions. In addition, external
|
||
attackers are typically considered, including those with access to enabled
|
||
external networks (e.g. Ethernet, Wireless, Bluetooth), exposed hardware
|
||
interfaces (e.g. USB, Thunderbolt), and the ability to modify the contents
|
||
of disks offline.
|
||
|
||
Regarding external attack vectors, it is interesting to note that in most
|
||
cases external attackers will try to exploit vulnerabilities in userspace
|
||
first, but that it is possible for an attacker to directly target the
|
||
kernel; particularly if the host has physical access. Examples of direct
|
||
kernel attacks include the vulnerabilities CVE-2019-19524, CVE-2022-0435
|
||
and CVE-2020-24490.
|
||
|
||
Confidential Computing threat model and its security objectives
|
||
===============================================================
|
||
|
||
Confidential Computing adds a new type of attacker to the above list: a
|
||
potentially misbehaving host (which can also include some part of a
|
||
traditional VMM or all of it), which is typically placed outside of the
|
||
CoCo VM TCB due to its large SW attack surface. It is important to note
|
||
that this doesn’t imply that the host or VMM are intentionally
|
||
malicious, but that there exists a security value in having a small CoCo
|
||
VM TCB. This new type of adversary may be viewed as a more powerful type
|
||
of external attacker, as it resides locally on the same physical machine
|
||
(in contrast to a remote network attacker) and has control over the guest
|
||
kernel communication with most of the HW::
|
||
|
||
+------------------------+
|
||
| CoCo guest VM |
|
||
+-----------------------+ | +-------------------+ |
|
||
| |<--->| | Userspace | |
|
||
| | | +-------------------+ |
|
||
| External attack | | | Interfaces | |
|
||
| vectors | | +-------------------+ |
|
||
| |<--->| | Linux Kernel | |
|
||
| | | +-------------------+ |
|
||
+-----------------------+ | +-------------------+ |
|
||
| | Bootloader/BIOS | |
|
||
+-----------------------+ | +-------------------+ |
|
||
| |<--->+------------------------+
|
||
| | | Interfaces |
|
||
| | +------------------------+
|
||
| CoCo security |<--->| Host/Host-side VMM |
|
||
| manager | +------------------------+
|
||
| | +------------------------+
|
||
| |<--->| CoCo platform |
|
||
+-----------------------+ +------------------------+
|
||
|
||
While traditionally the host has unlimited access to guest data and can
|
||
leverage this access to attack the guest, the CoCo systems mitigate such
|
||
attacks by adding security features like guest data confidentiality and
|
||
integrity protection. This threat model assumes that those features are
|
||
available and intact.
|
||
|
||
The **Linux kernel CoCo VM security objectives** can be summarized as follows:
|
||
|
||
1. Preserve the confidentiality and integrity of CoCo guest's private
|
||
memory and registers.
|
||
|
||
2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
|
||
While it is true that the host (and host-side VMM) requires some level of
|
||
privilege to create, destroy, or pause the guest, part of the goal of
|
||
preventing privileged escalation is to ensure that these operations do not
|
||
provide a pathway for attackers to gain access to the guest's kernel.
|
||
|
||
The above security objectives result in two primary **Linux kernel CoCo
|
||
VM assets**:
|
||
|
||
1. Guest kernel execution context.
|
||
2. Guest kernel private memory.
|
||
|
||
The host retains full control over the CoCo guest resources, and can deny
|
||
access to them at any time. Examples of resources include CPU time, memory
|
||
that the guest can consume, network bandwidth, etc. Because of this, the
|
||
host Denial of Service (DoS) attacks against CoCo guests are beyond the
|
||
scope of this threat model.
|
||
|
||
The **Linux CoCo VM attack surface** is any interface exposed from a CoCo
|
||
guest Linux kernel towards an untrusted host that is not covered by the
|
||
CoCo technology SW/HW protection. This includes any possible
|
||
side-channels, as well as transient execution side channels. Examples of
|
||
explicit (not side-channel) interfaces include accesses to port I/O, MMIO
|
||
and DMA interfaces, access to PCI configuration space, VMM-specific
|
||
hypercalls (towards Host-side VMM), access to shared memory pages,
|
||
interrupts allowed to be injected into the guest kernel by the host, as
|
||
well as CoCo technology-specific hypercalls, if present. Additionally, the
|
||
host in a CoCo system typically controls the process of creating a CoCo
|
||
guest: it has a method to load into a guest the firmware and bootloader
|
||
images, the kernel image together with the kernel command line. All of this
|
||
data should also be considered untrusted until its integrity and
|
||
authenticity is established via attestation.
|
||
|
||
The table below shows a threat matrix for the CoCo guest Linux kernel but
|
||
does not discuss potential mitigation strategies. The matrix refers to
|
||
CoCo-specific versions of the guest, host and platform.
|
||
|
||
.. list-table:: CoCo Linux guest kernel threat matrix
|
||
:widths: auto
|
||
:align: center
|
||
:header-rows: 1
|
||
|
||
* - Threat name
|
||
- Threat description
|
||
|
||
* - Guest malicious configuration
|
||
- A misbehaving host modifies one of the following guest's
|
||
configuration:
|
||
|
||
1. Guest firmware or bootloader
|
||
|
||
2. Guest kernel or module binaries
|
||
|
||
3. Guest command line parameters
|
||
|
||
This allows the host to break the integrity of the code running
|
||
inside a CoCo guest, and violates the CoCo security objectives.
|
||
|
||
* - CoCo guest data attacks
|
||
- A misbehaving host retains full control of the CoCo guest's data
|
||
in-transit between the guest and the host-managed physical or
|
||
virtual devices. This allows any attack against confidentiality,
|
||
integrity or freshness of such data.
|
||
|
||
* - Malformed runtime input
|
||
- A misbehaving host injects malformed input via any communication
|
||
interface used by the guest's kernel code. If the code is not
|
||
prepared to handle this input correctly, this can result in a host
|
||
--> guest kernel privilege escalation. This includes traditional
|
||
side-channel and/or transient execution attack vectors.
|
||
|
||
* - Malicious runtime input
|
||
- A misbehaving host injects a specific input value via any
|
||
communication interface used by the guest's kernel code. The
|
||
difference with the previous attack vector (malformed runtime input)
|
||
is that this input is not malformed, but its value is crafted to
|
||
impact the guest's kernel security. Examples of such inputs include
|
||
providing a malicious time to the guest or the entropy to the guest
|
||
random number generator. Additionally, the timing of such events can
|
||
be an attack vector on its own, if it results in a particular guest
|
||
kernel action (i.e. processing of a host-injected interrupt).
|
||
resistant to supplied host input.
|
||
|