Commit 5aacef9

author
Herton R. Krzesinski
committed
Merge: TDX core kernel enabling (support running Linux as guest)
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/875
Bugzilla: http://bugzilla.redhat.com/1955275
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1359
Signed-off-by: Wander Lairson Costa <wander@redhat.com>

Omitted-fix: 4b3f764 ("tools headers cpufeatures: Sync with the kernel sources")
Unnecessary to this MR.
Omitted-fix: 5ced812 ("tools headers cpufeatures: Sync with the kernel sources")
Unnecessary to this MR.

d86d28c25344 (Wander Lairson Costa) config: Enable TDX Guest
3f18abeb0dff (Wander Lairson Costa) x86/hyperv: Initialize shared memory boundary in the Isolation VM.
2d58bdec9d0c (Wander Lairson Costa) Documentation/x86: Document TDX kernel architecture
7b2226734233 (Wander Lairson Costa) ACPICA: Avoid cache flush inside virtual machines
04420296b84a (Wander Lairson Costa) x86/tdx/ioapic: Add shared bit for IOAPIC base address
8ef8f998f932 (Wander Lairson Costa) x86/mm: Make DMA memory shared for TD guest
9768974a2782 (Wander Lairson Costa) x86/mm/cpa: Add support for TDX shared memory
1fcb980c6af2 (Wander Lairson Costa) x86/tdx: Make pages shared in ioremap()
60e4f2d0615c (Wander Lairson Costa) x86/topology: Disable CPU online/offline control for TDX guests
0ef54dd634a9 (Wander Lairson Costa) x86/acpi/x86/boot: Add multiprocessor wake-up support
dcc0b0bb9386 (Wander Lairson Costa) x86/boot: Avoid #VE during boot for TDX platforms
d9f6dbc65b99 (Wander Lairson Costa) x86/boot: Set CR0.NE early and keep it set during the boot
9918045fab36 (Wander Lairson Costa) x86/acpi/x86/boot: Add multiprocessor wake-up support
0089010a4c0d (Wander Lairson Costa) x86/boot: Add a trampoline for booting APs via firmware handoff
ace0733b9895 (Wander Lairson Costa) x86/tdx: Wire up KVM hypercalls
2be45c89174c (Wander Lairson Costa) x86/tdx: Port I/O: Add early boot support
e4d019d0ab47 (Wander Lairson Costa) x86/tdx: Port I/O: Add runtime hypercalls
4defa6a6e997 (Wander Lairson Costa) x86/boot: Port I/O: Add decompression-time support for TDX
4c739b45c9d8 (Wander Lairson Costa) x86/boot: Port I/O: Allow to hook up alternative helpers
965b5f7d6581 (Wander Lairson Costa) x86: Consolidate port I/O helpers
74ed20bd60d2 (Wander Lairson Costa) x86: Adjust types used in port I/O helpers
1ae2ce7ee424 (Wander Lairson Costa) x86/tdx: Detect TDX at early kernel decompression time
0089726a21e8 (Wander Lairson Costa) x86/tdx: Handle in-kernel MMIO
1c50354f716a (Wander Lairson Costa) x86/tdx: Handle CPUID via #VE
407152b69b48 (Wander Lairson Costa) x86/tdx: Add MSR support for TDX guests
17e87935337c (Wander Lairson Costa) x86/tdx: Add HLT support for TDX guests
1bf9e16304ff (Wander Lairson Costa) x86/traps: Add #VE support for TDX guest
25aeaa15a0cb (Wander Lairson Costa) x86/traps: Refactor exc_general_protection()
69558857494c (Wander Lairson Costa) x86/tdx: Exclude shared bit from __PHYSICAL_MASK
cd447f3886c1 (Wander Lairson Costa) x86/tdx: Extend the confidential computing API to support TDX guests
b684ce611a45 (Wander Lairson Costa) x86/tdx: Add __tdx_module_call() and __tdx_hypercall() helper functions
161598be64cd (Wander Lairson Costa) x86/tdx: Provide common base for SEAMCALL and TDCALL C wrappers
b140c1045eef (Wander Lairson Costa) x86/tdx: Detect running as a TDX guest in early boot
995afcdd46f7 (Wander Lairson Costa) x86/ibt: Disable IBT around firmware
045dfa4ec296 (Wander Lairson Costa) x86/ibt,kexec: Disable CET on kexec
496a5f1f0623 (Wander Lairson Costa) x86/ibt: Add IBT feature, MSR and #CP handling
5d67f5402e4f (Wander Lairson Costa) x86/ibt: Base IBT bits
1521fcb8488c (Wander Lairson Costa) Documentation: Add x86/amd_hsmp driver
c844f1ba98c8 (Wander Lairson Costa) x86/mm/cpa: Generalize __set_memory_enc_pgtable()
44c6c035ecbb (Wander Lairson Costa) x86/coco: Add API to handle encryption mask
953a82ec52c4 (Wander Lairson Costa) x86/coco: Explicitly declare type of confidential computing platform
f9ac78c1574b (Wander Lairson Costa) x86/cc: Move arch/x86/{kernel/cc_platform.c => coco/core.c}
01e18facbab7 (Wander Lairson Costa) hyper-v: Enable swiotlb bounce buffer for Isolation VM
90e8d9c572bc (Wander Lairson Costa) x86/hyper-v: Add hyperv Isolation VM check in the cc_platform_has()
3de5845b13de (Wander Lairson Costa) swiotlb: Add swiotlb bounce buffer remap function for HV IVM
aa622fccff1b (Wander Lairson Costa) x86/sev: Move common memory encryption code to mem_encrypt.c
a97b7c77b835 (Wander Lairson Costa) x86/sev: Rename mem_encrypt.c to mem_encrypt_amd.c
61db1fc94f8f (Wander Lairson Costa) x86/sev: Use CC_ATTR attribute to generalize string I/O unroll
8aaa3ebc1c97 (Wander Lairson Costa) x86/insn-eval: Introduce insn_decode_mmio()
81d528215bf4 (Wander Lairson Costa) x86/insn-eval: Introduce insn_get_modrm_reg_ptr()
880e1d2cd664 (Wander Lairson Costa) x86/sev: Remove do_early_exception() forward declarations
7b6d4f843752 (Wander Lairson Costa) x86/head64: Carve out the guest encryption postprocessing into a helper
4a1dcbfcb64d (Wander Lairson Costa) x86/sev: Get rid of excessive use of defines
5611b4e29bdb (Wander Lairson Costa) x86/sev: Shorten GHCB terminate macro names
63ca1c66f961 (Wander Lairson Costa) x86/kvm: Add guest support for detecting and enabling SEV Live Migration feature.
547ffb035c0e (Wander Lairson Costa) EFI: Introduce the new AMD Memory Encryption GUID.
2f1f679e8f94 (Wander Lairson Costa) x86/hyperv: Initialize GHCB page in Isolation VM
d983844169c3 (Wander Lairson Costa) x86/iopl: Fake iopl(3) CLI/STI usage
67a8e0127390 (Wander Lairson Costa) mm: x86: Invoke hypercall when page encryption status is changed
2ff153eba629 (Wander Lairson Costa) x86/kvm: Add AMD SEV specific Hypercall3

Documentation/x86/amd_hsmp.rst | 86 +++
Documentation/x86/index.rst | 2 +
Documentation/x86/tdx.rst | 218 +++++++
arch/x86/Kbuild | 2 +
arch/x86/Kconfig | 45 +-
arch/x86/Makefile | 16 +-
arch/x86/boot/boot.h | 37 +-
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/head_64.S | 27 +-
arch/x86/boot/compressed/misc.c | 12 +
arch/x86/boot/compressed/misc.h | 4 +-
arch/x86/boot/compressed/pgtable.h | 2 +-
arch/x86/boot/compressed/sev.c | 6 +-
arch/x86/boot/compressed/tdcall.S | 3 +
arch/x86/boot/compressed/tdx.c | 77 +++
arch/x86/boot/compressed/tdx.h | 13 +
arch/x86/boot/cpuflags.c | 3 +-
arch/x86/boot/cpuflags.h | 1 +
arch/x86/boot/io.h | 41 ++
arch/x86/boot/main.c | 4 +
arch/x86/coco/Makefile | 8 +
arch/x86/coco/core.c | 137 ++++
arch/x86/coco/tdx/Makefile | 3 +
arch/x86/coco/tdx/tdcall.S | 204 ++++++
arch/x86/coco/tdx/tdx.c | 692 +++++++++++++++++++++
arch/x86/hyperv/hv_init.c | 80 ++-
arch/x86/include/asm/acenv.h | 14 +-
arch/x86/include/asm/apic.h | 7 +
arch/x86/include/asm/coco.h | 32 +
arch/x86/include/asm/cpu.h | 4 +
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/efi.h | 9 +-
arch/x86/include/asm/ibt.h | 93 +++
arch/x86/include/asm/idtentry.h | 9 +
arch/x86/include/asm/insn-eval.h | 14 +
arch/x86/include/asm/io.h | 62 +-
arch/x86/include/asm/kvm_para.h | 34 +
arch/x86/include/asm/mem_encrypt.h | 10 +-
arch/x86/include/asm/mshyperv.h | 4 +
arch/x86/include/asm/msr-index.h | 20 +-
arch/x86/include/asm/paravirt.h | 6 +
arch/x86/include/asm/paravirt_types.h | 1 +
arch/x86/include/asm/pgtable.h | 13 +-
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/realmode.h | 1 +
arch/x86/include/asm/sev-common.h | 55 +-
arch/x86/include/asm/shared/io.h | 34 +
arch/x86/include/asm/shared/tdx.h | 40 ++
arch/x86/include/asm/tdx.h | 91 +++
arch/x86/include/asm/traps.h | 2 +
arch/x86/include/asm/x86_init.h | 16 +
arch/x86/include/uapi/asm/processor-flags.h | 2 +
arch/x86/kernel/Makefile | 5 -
arch/x86/kernel/acpi/boot.c | 100 ++-
arch/x86/kernel/apic/apic.c | 10 +
arch/x86/kernel/apic/io_apic.c | 18 +-
arch/x86/kernel/apm_32.c | 7 +
arch/x86/kernel/asm-offsets.c | 17 +
arch/x86/kernel/cc_platform.c | 69 --
arch/x86/kernel/cpu/common.c | 59 +-
arch/x86/kernel/cpu/mshyperv.c | 24 +
arch/x86/kernel/head64.c | 67 +-
arch/x86/kernel/head_64.S | 28 +-
arch/x86/kernel/idt.c | 7 +
arch/x86/kernel/kvm.c | 82 +++
arch/x86/kernel/machine_kexec_64.c | 4 +-
arch/x86/kernel/paravirt.c | 1 +
arch/x86/kernel/process.c | 5 +
arch/x86/kernel/relocate_kernel_64.S | 8 +
arch/x86/kernel/sev-shared.c | 2 +-
arch/x86/kernel/sev.c | 11 +-
arch/x86/kernel/smpboot.c | 12 +-
arch/x86/kernel/traps.c | 249 +++++++-
arch/x86/kernel/x86_init.c | 16 +-
arch/x86/lib/insn-eval.c | 106 +++-
arch/x86/mm/Makefile | 7 +-
arch/x86/mm/ioremap.c | 5 +
arch/x86/mm/mem_encrypt.c | 392 +-----------
arch/x86/mm/mem_encrypt_amd.c | 466 ++++++++++++++
arch/x86/mm/mem_encrypt_identity.c | 12 +-
arch/x86/mm/pat/set_memory.c | 21 +-
arch/x86/realmode/rm/header.S | 1 +
arch/x86/realmode/rm/trampoline_64.S | 57 +-
arch/x86/realmode/rm/trampoline_common.S | 12 +-
arch/x86/realmode/rm/wakemain.c | 4 +
arch/x86/virt/vmx/tdx/tdxcall.S | 96 +++
include/asm-generic/mshyperv.h | 18 +-
include/linux/cc_platform.h | 21 +
include/linux/efi.h | 1 +
include/linux/swiotlb.h | 6 +
kernel/cpu.c | 7 +
kernel/dma/swiotlb.c | 43 +-
.../configs/common/generic/CONFIG_INTEL_TDX_GUEST | 1 +
.../configs/common/generic/CONFIG_X86_KERNEL_IBT | 1 +
95 files changed, 3691 insertions(+), 695 deletions(-)

Approved-by: Rafael Aquini <aquini@redhat.com>
Approved-by: David Arcari <darcari@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2 parents 5e8cf7d + 7776679 commit 5aacef9

63 files changed, +2549 -294 lines changed

Documentation/x86/amd_hsmp.rst

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
============================================
4+
AMD HSMP interface
5+
============================================
6+
7+
Newer Fam19h EPYC server line of processors from AMD support system
8+
management functionality via HSMP (Host System Management Port).
9+
10+
The Host System Management Port (HSMP) is an interface to provide
11+
OS-level software with access to system management functions via a
12+
set of mailbox registers.
13+
14+
More details on the interface can be found in chapter
15+
"7 Host System Management Port (HSMP)" of the family/model PPR
16+
Eg: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
17+
18+
HSMP interface is supported on EPYC server CPU models only.
19+
20+
21+
HSMP device
22+
============================================
23+
24+
The amd_hsmp driver under drivers/platform/x86/ creates the miscdevice
25+
/dev/hsmp to let user space programs run hsmp mailbox commands.
26+
27+
$ ls -al /dev/hsmp
28+
crw-r--r-- 1 root root 10, 123 Jan 21 21:41 /dev/hsmp
29+
30+
Characteristics of the dev node:
31+
* Write mode is used for running set/configure commands
32+
* Read mode is used for running get/status monitor commands
33+
34+
Access restrictions:
35+
* Only root user is allowed to open the file in write mode.
36+
* The file can be opened in read mode by all the users.
37+
38+
In-kernel integration:
39+
* Other subsystems in the kernel can use the exported transport
40+
function hsmp_send_message().
41+
* Locking across callers is taken care of by the driver.
42+
43+
44+
An example
45+
==========
46+
47+
To access the hsmp device from a C program:
48+
First, you need to include the headers::
49+
50+
#include <linux/amd_hsmp.h>
51+
52+
Which defines the supported messages/message IDs.
53+
54+
Next thing, open the device file, as follows::
55+
56+
int file;
57+
58+
file = open("/dev/hsmp", O_RDWR);
59+
if (file < 0) {
60+
/* ERROR HANDLING; you can check errno to see what went wrong */
61+
exit(1);
62+
}
63+
64+
The following IOCTL is defined:
65+
66+
``ioctl(file, HSMP_IOCTL_CMD, struct hsmp_message *msg)``
67+
The argument is a pointer to a::
68+
69+
struct hsmp_message {
70+
__u32 msg_id; /* Message ID */
71+
__u16 num_args; /* Number of input argument words in message */
72+
__u16 response_sz; /* Number of expected output/response words */
73+
__u32 args[HSMP_MAX_MSG_LEN]; /* argument/response buffer */
74+
__u16 sock_ind; /* socket number */
75+
};
76+
77+
The ioctl returns a non-zero value on failure; you can read errno to see
78+
what happened. The transaction returns 0 on success.
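
As a minimal sketch of how a caller might fill in the message before issuing the ioctl, the helper below prepares a request carrying one argument word and expecting one response word. The struct is a local mirror of the layout documented above so that the snippet compiles without the kernel header. The helper name ``sketch_hsmp_prepare()`` and the buffer length of 8 are assumptions made for illustration; real code should take ``struct hsmp_message`` and ``HSMP_MAX_MSG_LEN`` from the header named above.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for HSMP_MAX_MSG_LEN; check the real header for the value. */
#define SKETCH_HSMP_MAX_MSG_LEN 8

/* Local mirror of the documented struct hsmp_message (illustration only). */
struct sketch_hsmp_message {
	uint32_t msg_id;                        /* Message ID */
	uint16_t num_args;                      /* Number of input argument words */
	uint16_t response_sz;                   /* Number of expected response words */
	uint32_t args[SKETCH_HSMP_MAX_MSG_LEN]; /* argument/response buffer */
	uint16_t sock_ind;                      /* socket number */
};

/*
 * Hypothetical helper: zero the message, then describe a request with a
 * single input word and a single expected output word.
 * Returns 0 on success, -1 if msg is NULL.
 */
static int sketch_hsmp_prepare(struct sketch_hsmp_message *msg,
			       uint32_t msg_id, uint32_t arg, uint16_t socket)
{
	if (!msg)
		return -1;

	memset(msg, 0, sizeof(*msg));
	msg->msg_id = msg_id;
	msg->num_args = 1;
	msg->response_sz = 1;
	msg->args[0] = arg;
	msg->sock_ind = socket;
	return 0;
}
```

With the real header, the filled message would then be passed as ``ioctl(file, HSMP_IOCTL_CMD, &msg)``, and on success the response words are read back from ``args``.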
79+
80+
More details on the interface and message definitions can be found in chapter
81+
"7 Host System Management Port (HSMP)" of the respective family/model PPR
82+
eg: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
83+
84+
User space C-APIs are made available by linking against the esmi library,
85+
which is provided by the E-SMS project https://developer.amd.com/e-sms/.
86+
See: https://github.com/amd/esmi_ib_library

Documentation/x86/index.rst

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ x86-specific Documentation
2525
intel-iommu
2626
intel_txt
2727
amd-memory-encryption
28+
amd_hsmp
29+
tdx
2830
pti
2931
mds
3032
microcode

Documentation/x86/tdx.rst

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
=====================================
4+
Intel Trust Domain Extensions (TDX)
5+
=====================================
6+
7+
Intel's Trust Domain Extensions (TDX) protect confidential guest VMs from
8+
the host and physical attacks by isolating the guest register state and by
9+
encrypting the guest memory. In TDX, a special module running in a special
10+
mode sits between the host and the guest and manages the guest/host
11+
separation.
12+
13+
Since the host cannot directly access guest registers or memory, much
14+
of the normal functionality of a hypervisor must be moved into the guest. This is
15+
implemented using a Virtualization Exception (#VE) that is handled by the
16+
guest kernel. Some #VEs are handled entirely inside the guest kernel, but others
17+
require the hypervisor to be consulted.
18+
19+
TDX includes new hypercall-like mechanisms for communicating from the
20+
guest to the hypervisor or the TDX module.
21+
22+
New TDX Exceptions
23+
==================
24+
25+
TDX guests behave differently from bare-metal and traditional VMX guests.
26+
In TDX guests, otherwise normal instructions or memory accesses can cause
27+
#VE or #GP exceptions.
28+
29+
Instructions marked with an '*' conditionally cause exceptions. The
30+
details for these instructions are discussed below.
31+
32+
Instruction-based #VE
33+
---------------------
34+
35+
- Port I/O (INS, OUTS, IN, OUT)
36+
- HLT
37+
- MONITOR, MWAIT
38+
- WBINVD, INVD
39+
- VMCALL
40+
- RDMSR*,WRMSR*
41+
- CPUID*
42+
43+
Instruction-based #GP
44+
---------------------
45+
46+
- All VMX instructions: INVEPT, INVVPID, VMCLEAR, VMFUNC, VMLAUNCH,
47+
VMPTRLD, VMPTRST, VMREAD, VMRESUME, VMWRITE, VMXOFF, VMXON
48+
- ENCLS, ENCLU
49+
- GETSEC
50+
- RSM
51+
- ENQCMD
52+
- RDMSR*,WRMSR*
53+
54+
RDMSR/WRMSR Behavior
55+
--------------------
56+
57+
MSR access behavior falls into three categories:
58+
59+
- #GP generated
60+
- #VE generated
61+
- "Just works"
62+
63+
In general, the #GP MSRs should not be used in guests. Their use likely
64+
indicates a bug in the guest. The guest may try to handle the #GP with a
65+
hypercall but it is unlikely to succeed.
66+
67+
The #VE MSRs are typically able to be handled by the hypervisor. Guests
68+
can make a hypercall to the hypervisor to handle the #VE.
69+
70+
The "just works" MSRs do not need any special guest handling. They might
71+
be implemented by directly passing through the MSR to the hardware or by
72+
trapping and handling in the TDX module. Other than possibly being slow,
73+
these MSRs appear to function just as they would on bare metal.
74+
75+
CPUID Behavior
76+
--------------
77+
78+
For some CPUID leaves and sub-leaves, the virtualized bit fields of CPUID
79+
return values (in guest EAX/EBX/ECX/EDX) are configurable by the
80+
hypervisor. For such cases, the Intel TDX module architecture defines two
81+
virtualization types:
82+
83+
- Bit fields for which the hypervisor controls the value seen by the guest
84+
TD.
85+
86+
- Bit fields for which the hypervisor configures the value such that the
87+
guest TD either sees their native value or a value of 0. For these bit
88+
fields, the hypervisor can mask off the native values, but it can not
89+
turn *on* values.
90+
91+
A #VE is generated for CPUID leaves and sub-leaves that the TDX module does
92+
not know how to handle. The guest kernel may ask the hypervisor for the
93+
value with a hypercall.
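
The two virtualization types can be modeled with plain bit masks. The sketch below is a simplified illustration rather than the TDX module's actual logic; the function names and the idea of passing a per-register mask are assumptions introduced here.

```c
#include <stdint.h>

/*
 * Type 1 (hypothetical model): for the bits selected by field_mask, the
 * guest sees the hypervisor-chosen value; all other bits keep the native value.
 */
static uint32_t sketch_cpuid_configured(uint32_t native, uint32_t field_mask,
					uint32_t hv_value)
{
	return (native & ~field_mask) | (hv_value & field_mask);
}

/*
 * Type 2 (hypothetical model): the hypervisor can only mask bits off.
 * A bit that is 0 natively stays 0 regardless of allow_mask.
 */
static uint32_t sketch_cpuid_native_or_zero(uint32_t native, uint32_t allow_mask)
{
	return native & allow_mask;
}
```

In the second helper, a native value of ``0x0F`` combined with an allow mask of ``0xF3`` yields ``0x03``: the two masked-off bits read as 0, and the upper bits cannot be turned on because they are 0 natively.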
94+
95+
#VE on Memory Accesses
96+
======================
97+
98+
There are essentially two classes of TDX memory: private and shared.
99+
Private memory receives full TDX protections. Its content is protected
100+
against access from the hypervisor. Shared memory is expected to be
101+
shared between guest and hypervisor and does not receive full TDX
102+
protections.
103+
104+
A TD guest is in control of whether its memory accesses are treated as
105+
private or shared. It selects the behavior with a bit in its page table
106+
entries. This helps ensure that a guest does not place sensitive
107+
information in shared memory, exposing it to the untrusted hypervisor.
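
A minimal sketch of the "shared bit" idea follows. The bit position is passed in as a parameter because it depends on the guest physical address width; the value 51 used in the note after the code is illustrative only, and the helper names are hypothetical rather than kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mark a physical address as shared by setting the bit. */
static uint64_t sketch_mk_shared(uint64_t pa, unsigned int shared_bit)
{
	return pa | (1ULL << shared_bit);
}

/* Hypothetical helper: mark a physical address as private by clearing the bit. */
static uint64_t sketch_mk_private(uint64_t pa, unsigned int shared_bit)
{
	return pa & ~(1ULL << shared_bit);
}

/* Hypothetical helper: report whether the shared bit is set. */
static bool sketch_is_shared(uint64_t pa, unsigned int shared_bit)
{
	return (pa & (1ULL << shared_bit)) != 0;
}
```

For example, ``sketch_mk_shared(0x1000, 51)`` gives ``0x0008000000001000``, and applying ``sketch_mk_private()`` to that result returns ``0x1000``.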
108+
109+
#VE on Shared Memory
110+
--------------------
111+
112+
Access to shared mappings can cause a #VE. The hypervisor ultimately
113+
controls whether a shared memory access causes a #VE, so the guest must be
114+
careful to only reference shared pages for which it can safely handle a #VE. For
115+
instance, the guest should be careful not to access shared memory in the
116+
#VE handler before it reads the #VE info structure (TDG.VP.VEINFO.GET).
117+
118+
Shared mapping content is entirely controlled by the hypervisor. The guest
119+
should only use shared mappings for communicating with the hypervisor.
120+
Shared mappings must never be used for sensitive memory content like kernel
121+
stacks. A good rule of thumb is that hypervisor-shared memory should be
122+
treated the same as memory mapped to userspace. Both the hypervisor and
123+
userspace are completely untrusted.
124+
125+
MMIO for virtual devices is implemented as shared memory. The guest must
126+
be careful not to access device MMIO regions unless it is also prepared to
127+
handle a #VE.
128+
129+
#VE on Private Pages
130+
--------------------
131+
132+
An access to private mappings can also cause a #VE. Since all kernel
133+
memory is also private memory, the kernel might theoretically need to
134+
handle a #VE on arbitrary kernel memory accesses. This is not feasible, so
135+
TDX guests ensure that all guest memory has been "accepted" before memory
136+
is used by the kernel.
137+
138+
A modest amount of memory (typically 512M) is pre-accepted by the firmware
139+
before the kernel runs to ensure that the kernel can start up without
140+
being subjected to a #VE.
141+
142+
The hypervisor is permitted to unilaterally move accepted pages to a
143+
"blocked" state. However, if it does this, page access will not generate a
144+
#VE. It will, instead, cause a "TD Exit" where the hypervisor is required
145+
to handle the exception.
146+
147+
Linux #VE handler
148+
=================
149+
150+
Just like page faults or #GPs, #VE exceptions can either be handled or be
151+
fatal. Typically, an unhandled userspace #VE results in a SIGSEGV.
152+
An unhandled kernel #VE results in an oops.
153+
154+
Handling nested exceptions on x86 is typically nasty business. A #VE
155+
could be interrupted by an NMI which triggers another #VE and hilarity
156+
ensues. The TDX #VE architecture anticipated this scenario and includes a
157+
feature to make it slightly less nasty.
158+
159+
During #VE handling, the TDX module ensures that all interrupts (including
160+
NMIs) are blocked. The block remains in place until the guest makes a
161+
TDG.VP.VEINFO.GET TDCALL. This allows the guest to control when interrupts
162+
or a new #VE can be delivered.
163+
164+
However, the guest kernel must still be careful to avoid potential
165+
#VE-triggering actions (discussed above) while this block is in place.
166+
While the block is in place, any #VE is elevated to a double fault (#DF)
167+
which is not recoverable.
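
The ordering constraint can be pictured with a toy state machine. This is a user-space model written for illustration, not kernel code: the ``sketch_*`` names are hypothetical, and the single flag stands in for the block that the TDX module keeps in place until TDG.VP.VEINFO.GET is called.

```c
#include <stdbool.h>

enum sketch_result { SKETCH_VE_DELIVERED, SKETCH_DOUBLE_FAULT };

/* Toy model of one vCPU: true while the #VE/interrupt block is in place. */
struct sketch_td_vcpu {
	bool ve_blocked;
};

/* A #VE has just been delivered: the block is now in place. */
static void sketch_ve_entry(struct sketch_td_vcpu *vcpu)
{
	vcpu->ve_blocked = true;
}

/* Models the TDG.VP.VEINFO.GET TDCALL: reading the info lifts the block. */
static void sketch_veinfo_get(struct sketch_td_vcpu *vcpu)
{
	vcpu->ve_blocked = false;
}

/* Models a #VE-triggering action, such as touching a shared page. */
static enum sketch_result sketch_ve_source(const struct sketch_td_vcpu *vcpu)
{
	return vcpu->ve_blocked ? SKETCH_DOUBLE_FAULT : SKETCH_VE_DELIVERED;
}
```

In this model, calling ``sketch_ve_source()`` between ``sketch_ve_entry()`` and ``sketch_veinfo_get()`` reports a double fault, while the same call after ``sketch_veinfo_get()`` reports an ordinary #VE delivery.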
168+
169+
MMIO handling
170+
=============
171+
172+
In non-TDX VMs, MMIO is usually implemented by giving a guest access to a
173+
mapping which will cause a VMEXIT on access, and then the hypervisor
174+
emulates the access. That is not possible in TDX guests because VMEXIT
175+
will expose the register state to the host. TDX guests don't trust the host
176+
and can't have their state exposed to the host.
177+
178+
In TDX, MMIO regions typically trigger a #VE exception in the guest. The
179+
guest #VE handler then emulates the MMIO instruction inside the guest and
180+
converts it into a controlled TDCALL to the host, rather than exposing
181+
guest state to the host.
182+
183+
MMIO addresses on x86 are just special physical addresses. They can
184+
theoretically be accessed with any instruction that accesses memory.
185+
However, the kernel instruction decoding method is limited. It is only
186+
designed to decode instructions like those generated by io.h macros.
187+
188+
MMIO access via other means (like structure overlays) may result in an
189+
oops.
190+
191+
Shared Memory Conversions
192+
=========================
193+
194+
All TDX guest memory starts out as private at boot. This memory can not
195+
be accessed by the hypervisor. However, some kernel users like device
196+
drivers might have a need to share data with the hypervisor. To do this,
197+
memory must be converted between shared and private. This can be
198+
accomplished using some existing memory encryption helpers:
199+
200+
* set_memory_decrypted() converts a range of pages to shared.
201+
* set_memory_encrypted() converts memory back to private.
202+
203+
Device drivers are the primary user of shared memory, but there's no need
204+
to touch every driver. DMA buffers and ioremap() do the conversions
205+
automatically.
206+
207+
TDX uses SWIOTLB for most DMA allocations. The SWIOTLB buffer is
208+
converted to shared on boot.
209+
210+
For coherent DMA allocation, the DMA buffer gets converted on the
211+
allocation. Check force_dma_unencrypted() for details.
212+
213+
References
214+
==========
215+
216+
TDX reference material is collected here:
217+
218+
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html

arch/arm/xen/mm.c

Lines changed: 12 additions & 14 deletions
@@ -23,22 +23,20 @@
2323
#include <asm/xen/hypercall.h>
2424
#include <asm/xen/interface.h>
2525

26-
unsigned long xen_get_swiotlb_free_pages(unsigned int order)
26+
static gfp_t xen_swiotlb_gfp(void)
2727
{
2828
phys_addr_t base;
29-
gfp_t flags = __GFP_NOWARN|__GFP_KSWAPD_RECLAIM;
3029
u64 i;
3130

3231
for_each_mem_range(i, &base, NULL) {
3332
if (base < (phys_addr_t)0xffffffff) {
3433
if (IS_ENABLED(CONFIG_ZONE_DMA32))
35-
flags |= __GFP_DMA32;
36-
else
37-
flags |= __GFP_DMA;
38-
break;
34+
return __GFP_DMA32;
35+
return __GFP_DMA;
3936
}
4037
}
41-
return __get_free_pages(flags, order);
38+
39+
return GFP_KERNEL;
4240
}
4341

4442
static bool hypercall_cflush = false;
@@ -122,10 +120,7 @@ int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
122120
unsigned int address_bits,
123121
dma_addr_t *dma_handle)
124122
{
125-
if (!xen_initial_domain())
126-
return -EINVAL;
127-
128-
/* we assume that dom0 is mapped 1:1 for now */
123+
/* the domain is 1:1 mapped to use swiotlb-xen */
129124
*dma_handle = pstart;
130125
return 0;
131126
}
@@ -143,10 +138,13 @@ static int __init xen_mm_init(void)
143138
if (!xen_swiotlb_detect())
144139
return 0;
145140

146-
rc = xen_swiotlb_init();
147141
/* we can work with the default swiotlb */
148-
if (rc < 0 && rc != -EEXIST)
149-
return rc;
142+
if (!io_tlb_default_mem.nslabs) {
143+
rc = swiotlb_init_late(swiotlb_size_or_default(),
144+
xen_swiotlb_gfp(), NULL);
145+
if (rc < 0)
146+
return rc;
147+
}
150148

151149
cflush.op = 0;
152150
cflush.a.dev_bus_addr = 0;
