Commit c7e48b1

Merge: Backport TDX support to RHEL 10.1 + assorted minor KVM bugfixes

MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-10/-/merge_requests/1212

This patch series backports TDX support to RHEL 10.1

JIRA: https://issues.redhat.com/browse/RHEL-47242

Sanity tested on AMD and Intel machines.

Omitted-fix: c126b46 ("Avoid calling kvm_is_mmio_pfn() when kvm_x86_ops.get_mt_mask is NULL")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Paolo Bonzini <bonzini@gnu.org>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>
Merged-by: Julio Faracco <jfaracco@redhat.com>

2 parents 26e49a1 + 4cd012e commit c7e48b1

59 files changed, 7301 additions and 746 deletions

Documentation/virt/kvm/api.rst

Lines changed: 109 additions & 5 deletions
@@ -1411,6 +1411,9 @@ the memory region are automatically reflected into the guest. For example, an
 mmap() that affects the region will be made visible immediately. Another
 example is madvise(MADV_DROP).

+For TDX guest, deleting/moving memory region loses guest memory contents.
+Read only region isn't supported. Only as-id 0 is supported.
+
 Note: On arm64, a write generated by the page-table walker (to update
 the Access and Dirty flags, for example) never results in a
 KVM_EXIT_MMIO exit when the slot has the KVM_MEM_READONLY flag. This
@@ -2005,6 +2008,13 @@ If the KVM_CAP_VM_TSC_CONTROL capability is advertised, this can also
 be used as a vm ioctl to set the initial tsc frequency of subsequently
 created vCPUs.

+For TSC protected Confidential Computing (CoCo) VMs where TSC frequency
+is configured once at VM scope and remains unchanged during VM's
+lifetime, the vm ioctl should be used to configure the TSC frequency
+and the vcpu ioctl is not supported.
+
+Example of such CoCo VMs: TDX guests.
+
 4.56 KVM_GET_TSC_KHZ
 --------------------

@@ -4780,17 +4790,19 @@ H_GET_CPU_CHARACTERISTICS hypercall.

 :Capability: basic
 :Architectures: x86
-:Type: vm
+:Type: vm ioctl, vcpu ioctl
 :Parameters: an opaque platform specific structure (in/out)
 :Returns: 0 on success; -1 on error

 If the platform supports creating encrypted VMs then this ioctl can be used
 for issuing platform-specific memory encryption commands to manage those
 encrypted VMs.

-Currently, this ioctl is used for issuing Secure Encrypted Virtualization
-(SEV) commands on AMD Processors. The SEV commands are defined in
-Documentation/virt/kvm/x86/amd-memory-encryption.rst.
+Currently, this ioctl is used for issuing both Secure Encrypted Virtualization
+(SEV) commands on AMD Processors and Trusted Domain Extensions (TDX) commands
+on Intel Processors. The detailed commands are defined in
+Documentation/virt/kvm/x86/amd-memory-encryption.rst and
+Documentation/virt/kvm/x86/intel-tdx.rst.

 4.111 KVM_MEMORY_ENCRYPT_REG_REGION
 -----------------------------------
@@ -6640,7 +6652,8 @@ to the byte array.
 .. note::

       For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR, KVM_EXIT_XEN,
-      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
+      KVM_EXIT_EPR, KVM_EXIT_HYPERCALL, KVM_EXIT_TDX,
+      KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
       operations are complete (and guest state is consistent) only after userspace
       has re-entered the kernel with KVM_RUN. The kernel side will first finish
       incomplete operations and then check for pending signals.
@@ -6839,6 +6852,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
   #define KVM_SYSTEM_EVENT_WAKEUP 4
   #define KVM_SYSTEM_EVENT_SUSPEND 5
   #define KVM_SYSTEM_EVENT_SEV_TERM 6
+  #define KVM_SYSTEM_EVENT_TDX_FATAL 7
   __u32 type;
   __u32 ndata;
   __u64 data[16];
@@ -6865,6 +6879,11 @@ Valid values for 'type' are:
   reset/shutdown of the VM.
 - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
   The guest physical address of the guest's GHCB is stored in `data[0]`.
+- KVM_SYSTEM_EVENT_TDX_FATAL -- a TDX guest reported a fatal error state.
+  KVM doesn't do any parsing or conversion, it just dumps 16 general-purpose
+  registers to userspace, in ascending order of the 4-bit indices for x86-64
+  general-purpose registers in instruction encoding, as defined in the Intel
+  SDM.
 - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and
   KVM has recognized a wakeup event. Userspace may honor this event by
   marking the exiting vCPU as runnable, or deny it and call KVM_RUN again.
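
Note on the hunk above: the KVM_SYSTEM_EVENT_TDX_FATAL text says the 16 `data[]` entries follow the 4-bit register encoding order from the Intel SDM. A minimal sketch of how a userspace VMM might label those entries is shown here. The ordering (RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8 to R15) is the standard x86-64 encoding order; the helper names `tdx_fatal_reg_name` and `tdx_fatal_dump` are hypothetical and do not come from this commit or from any existing VMM.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* x86-64 GPR names in ascending order of their 4-bit instruction encoding. */
static const char *const tdx_fatal_gpr_names[16] = {
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
    "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15",
};

/* Hypothetical helper: name of the register held in data[idx], or NULL. */
static const char *tdx_fatal_reg_name(size_t idx)
{
    return idx < 16 ? tdx_fatal_gpr_names[idx] : NULL;
}

/*
 * Hypothetical helper: print up to 16 entries of the system_event data
 * array, one register per line. Returns the number of entries printed.
 */
static size_t tdx_fatal_dump(FILE *out, const uint64_t *data, uint32_t ndata)
{
    assert(out != NULL && data != NULL);

    size_t n = ndata < 16 ? ndata : 16;
    for (size_t i = 0; i < n; i++)
        fprintf(out, "%-3s = 0x%016" PRIx64 "\n",
                tdx_fatal_reg_name(i), data[i]);
    return n;
}
```

In a real VMM the `data` and `ndata` arguments would presumably come from the `system_event` member of `struct kvm_run`; how the register values are interpreted is left to userspace, since KVM does no parsing.
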
@@ -7163,6 +7182,69 @@ The valid value for 'flags' is:
 - KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and not valid
   in VMCS. It would run into unknown result if resume the target VM.

+::
+
+    /* KVM_EXIT_TDX */
+    struct {
+      __u64 flags;
+      __u64 nr;
+      union {
+        struct {
+          u64 ret;
+          u64 data[5];
+        } unknown;
+        struct {
+          u64 ret;
+          u64 gpa;
+          u64 size;
+        } get_quote;
+        struct {
+          u64 ret;
+          u64 leaf;
+          u64 r11, r12, r13, r14;
+        } get_tdvmcall_info;
+        struct {
+          u64 ret;
+          u64 vector;
+        } setup_event_notify;
+      };
+    } tdx;
+
+Process a TDVMCALL from the guest. KVM forwards select TDVMCALL based
+on the Guest-Hypervisor Communication Interface (GHCI) specification;
+KVM bridges these requests to the userspace VMM with minimal changes,
+placing the inputs in the union and copying them back to the guest
+on re-entry.
+
+Flags are currently always zero, whereas ``nr`` contains the TDVMCALL
+number from register R11. The remaining field of the union provide the
+inputs and outputs of the TDVMCALL. Currently the following values of
+``nr`` are defined:
+
+* ``TDVMCALL_GET_QUOTE``: the guest has requested to generate a TD-Quote
+  signed by a service hosting TD-Quoting Enclave operating on the host.
+  Parameters and return value are in the ``get_quote`` field of the union.
+  The ``gpa`` field and ``size`` specify the guest physical address
+  (without the shared bit set) and the size of a shared-memory buffer, in
+  which the TDX guest passes a TD Report. The ``ret`` field represents
+  the return value of the GetQuote request. When the request has been
+  queued successfully, the TDX guest can poll the status field in the
+  shared-memory area to check whether the Quote generation is completed or
+  not. When completed, the generated Quote is returned via the same buffer.
+
+* ``TDVMCALL_GET_TD_VM_CALL_INFO``: the guest has requested the support
+  status of TDVMCALLs. The output values for the given leaf should be
+  placed in fields from ``r11`` to ``r14`` of the ``get_tdvmcall_info``
+  field of the union.
+
+* ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
+  set up a notification interrupt for vector ``vector``.
+
+KVM may add support for more values in the future that may cause a userspace
+exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
+it will enter with output fields already valid; in the common case, the
+``unknown.ret`` field of the union will be ``TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED``.
+Userspace need not do anything if it does not wish to support a TDVMCALL.
 ::

     /* Fix the size of the union. */
@@ -8206,6 +8288,28 @@ KVM_X86_QUIRK_STUFF_FEATURE_MSRS By default, at vCPU creation, KVM sets the
                                     and 0x489), as KVM does now allow them to
                                     be set by userspace (KVM sets them based on
                                     guest CPUID, for safety purposes).
+
+KVM_X86_QUIRK_IGNORE_GUEST_PAT      By default, on Intel platforms, KVM ignores
+                                    guest PAT and forces the effective memory
+                                    type to WB in EPT. The quirk is not available
+                                    on Intel platforms which are incapable of
+                                    safely honoring guest PAT (i.e., without CPU
+                                    self-snoop, KVM always ignores guest PAT and
+                                    forces effective memory type to WB). It is
+                                    also ignored on AMD platforms or, on Intel,
+                                    when a VM has non-coherent DMA devices
+                                    assigned; KVM always honors guest PAT in
+                                    such case. The quirk is needed to avoid
+                                    slowdowns on certain Intel Xeon platforms
+                                    (e.g. ICX, SPR) where self-snoop feature is
+                                    supported but UC is slow enough to cause
+                                    issues with some older guests that use
+                                    UC instead of WC to map the video RAM.
+                                    Userspace can disable the quirk to honor
+                                    guest PAT if it knows that there is no such
+                                    guest software, for example if it does not
+                                    expose a bochs graphics device (which is
+                                    known to have had a buggy driver).
 =================================== ============================================

 7.32 KVM_CAP_MAX_VCPU_ID

Documentation/virt/kvm/x86/index.rst

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ KVM for x86 systems
    cpuid
    errata
    hypercalls
+   intel-tdx
    mmu
    msr
    nested-vmx
