Age | Commit message | Author |
|
In order to fix a VT-x bug, and support MSR_SPEC_CTRL on AMD, move
MSR_SPEC_CTRL handling into the new {pv,hvm}_{get,set}_reg() infrastructure.
Duplicate the msrs->spec_ctrl.raw accesses in the PV and VT-x paths for now.
The SVM path is currently unreachable because of the CPUID policy.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 6536688439dbca1d08fd6db5be29c39e3917fb2f
master date: 2022-01-20 16:32:11 +0000
|
|
In the original change I neglected to consider the case of us running as
L1 under another Xen. In this case we're not Dom0, so the underlying Xen
wouldn't permit us access to these MSRs. As an immediate workaround use
rdmsr_safe(); I don't view this as the final solution though, as the
original problem the earlier change tried to address also applies when
running nested. Yet it is then unclear to me how to properly address the
issue: We shouldn't generally expose the MSR values, but handing back
zero (or effectively any other static value) doesn't look appropriate
either.
Fixes: bfcdaae9c210 ("x86/AMD: expose SYSCFG, TOM, TOM2, and IORRs to Dom0")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
|
|
Sufficiently old Linux (3.12-ish) accesses these MSRs (with the
exception of IORRs) in an unguarded manner. Furthermore these same MSRs,
at least on Fam11 and older CPUs, are also consulted by modern Linux,
and their (bogus) built-in zapping of #GP faults from MSR accesses leads
to it effectively reading zero instead of the intended values, which are
relevant for PCI BAR placement (which ought to all live in MMIO-type
space, not in DRAM-type one).
For SYSCFG, only certain bits get exposed. Since MtrrVarDramEn also
covers the IORRs, expose them as well. Introduce (consistently named)
constants for the bits we're interested in and use them in pre-existing
code as well. While there also drop the unused and somewhat questionable
K8_MTRR_RDMEM_WRMEM_MASK. To complete the set of memory type and DRAM vs
MMIO controlling MSRs, also expose TSEG_{BASE,MASK} (the former also
gets read by Linux, dealing with which was already the subject of
6eef0a99262c ["x86/PV: conditionally avoid raising #GP for early guest
MSR reads"]).
As a welcome side effect, verbosity on/of debug builds gets (perhaps
significantly) reduced.
Note that at least as far as those MSR accesses by Linux are concerned,
there's no similar issue for DomU-s, as the accesses sit behind PCI
device matching logic. The checked for devices would never be exposed to
DomU-s in the first place. Nevertheless I think that at least for HVM we
should return sensible values, not 0 (as svm_msr_read_intercept() does
right now). The intended values may, however, need to be determined by
hvmloader, and then get made known to Xen.
Fixes: 322ec7c89f66 ("x86/pv: disallow access to unknown MSRs")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
|
|
MSR_ARCH_CAPS is still not supported for guests yet (other than the hardware
domain), until the toolstack learns how to construct an MSR policy.
However, we want access to the host ARCH_CAPS_TSX_CTRL value in particular for
testing purposes.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
* Introduce cpu_has_arch_caps and replace boot_cpu_has(X86_FEATURE_ARCH_CAPS)
* Read CPUID data into the appropriate boot_cpu_data.x86_capability[]
element, as subsequent changes are going to need more cpu_has_* logic.
* Use the hi/lo MSR helpers, which substantially improves code generation.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
|
|
In hindsight, this was a poor move. Some of these MSRs require probing for,
cause unhelpful spew into xl dmesg, or cause spew from unit tests explicitly
checking behaviour.
This restores behaviour close to that of Xen 4.14, meaning in particular
that for all of the MSRs getting re-added explicitly a #GP fault will get
raised irrespective of the new "msr_relaxed" setting.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
|
|
Linux has been warning ("firmware bug") about this bit being clear for a
long time. While writable in older hardware it has been readonly on more
than just most recent hardware. For simplicity report it always set (if
anything we may want to log the issue ourselves if it turns out to be
clear on older hardware) on CPU families 10h and up (in family 0fh the
bit is part of a larger field of different purpose).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
|
|
Windows 10 will triple fault if #GP is injected when attempting to
read the FEATURE_CONTROL MSR on Intel or compatible hardware. Fix this
by injecting a #GP only when the vendor doesn't support the MSR, even
if there are no features to expose.
Fixes: 39ab598c50a2 ('x86/pv: allow reading FEATURE_CONTROL MSR')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Extended comment]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to not explicitly handled MSRs, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.
Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:
vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0
Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.
Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests will already be silently dropped.
Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
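
The read and write behaviour described in this message can be sketched as
follows. This is a minimal illustration, not the Xen code: the function names,
the EMUL_* codes and the boolean parameters are hypothetical, and raising #GP
on non-Intel vendors is an assumption rather than something the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator return codes. */
enum { EMUL_OKAY, EMUL_EXCEPTION };

/*
 * Read of MSR_IA32_PERF_{STATUS,CTL}: guests permitted to manage CPU
 * frequency see the hardware value, everyone else reads 0.
 * Assumption: vendors without these MSRs get a #GP instead.
 */
int sketch_rdmsr_perf(bool vendor_has_msr, bool may_manage_cpufreq,
                      uint64_t hw_value, uint64_t *val)
{
    if ( !vendor_has_msr )
        return EMUL_EXCEPTION;

    *val = may_manage_cpufreq ? hw_value : 0;
    return EMUL_OKAY;
}

/* Write of MSR_IA32_PERF_CTL: silently dropped for unprivileged guests. */
int sketch_wrmsr_perf_ctl(bool may_manage_cpufreq, uint64_t val,
                          uint64_t *hw_value)
{
    if ( may_manage_cpufreq )
        *hw_value = val;

    return EMUL_OKAY;
}
```

Returning 0 on reads pairs with the dropped writes: an unprivileged guest
never faults, and never observes or changes host frequency state.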
|
|
Windows 8 will attempt to read MSR_IA32_THERM_STATUS and panic if a
#GP fault is injected as a result:
vmx.c:3035:d8v0 RDMSR 0x0000019c unimplemented
d8v0 VIRIDIAN CRASH: 3b c0000096 fffff8061de31651 fffff4088a613720 0
So handle the MSR and return 0 instead.
Note that this is done in the generic MSR handler, and PV guests will
also get 0 back when trying to read the MSR. There doesn't seem to be
much value in handling the MSR for HVM guests only.
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
Now that the main PV/HVM MSR handlers raise #GP for all unknown MSRs, there is
no need to special case these MSRs any more.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
|
|
Linux PV guests will attempt to read the FEATURE_CONTROL MSR, so move
the handling done in VMX code into guest_rdmsr as it can be shared
between PV and HVM guests that way.
Note that there's a slight behavior change and attempting to read the
MSR when no features are available will result in a fault.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
|
|
Report LFENCE_SERIALISE unconditionally for DE_CFG on AMD hardware and
silently drop writes.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
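
A minimal sketch of the behaviour described above. The helper names and EMUL_*
codes are hypothetical, the constants are believed to match the AMD
definitions but should be treated as assumptions, and faulting on non-AMD
hardware is an assumption the message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for the MSR index and the bit of interest. */
#define MSR_AMD64_DE_CFG              0xc0011029u
#define AMD64_DE_CFG_LFENCE_SERIALISE (UINT64_C(1) << 1)

/* Hypothetical stand-ins for the emulator return codes. */
enum { EMUL_OKAY, EMUL_EXCEPTION };

/* Read side: on AMD, always report LFENCE as serialising. */
int sketch_rdmsr_de_cfg(bool vendor_is_amd, uint64_t *val)
{
    if ( !vendor_is_amd )
        return EMUL_EXCEPTION;   /* assumption: the MSR does not exist */

    *val = AMD64_DE_CFG_LFENCE_SERIALISE;
    return EMUL_OKAY;
}

/* Write side: accept and discard, so the guest neither faults nor has effect. */
int sketch_wrmsr_de_cfg(bool vendor_is_amd, uint64_t val)
{
    (void)val;
    return vendor_is_amd ? EMUL_OKAY : EMUL_EXCEPTION;
}
```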
|
|
The overhead of (the lack of) MDS_NO alone has been measured at 30% on some
workloads. While we're not in a position yet to offer MSR_ARCH_CAPS generally
to guests, dom0 doesn't migrate, so we can pass a subset of hardware values
straight through.
This will cause PVH dom0's not to use KPTI by default, and all dom0's not to
use VERW flushing by default, and to use eIBRS in preference to retpoline on
recent Intel CPUs.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
... including serialisation/deserialisation logic and unit tests.
There is no current way to configure this MSR correctly for guests.
The toolstack side of this logic needs building, which is far easier to
do with it in place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
... rather than from the default clauses of the PV and HVM MSR handlers.
This means that we no longer take the vmce lock for any unknown MSR, and
accesses to architectural MCE banks outside of the subset implemented for the
guest no longer fall further through the unknown MSR path.
The bank limit of 32 isn't stated anywhere I can locate, but is a consequence
of the MSR layout described in SDM Volume 4.
With the vmce calls removed, the hvm alternative_call()'s expression can be
simplified substantially.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
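
One plausible reading of the "bank limit of 32" remark, worked as arithmetic.
The MSR indices are the architectural ones as I understand them (per-bank MSRs
start at 0x400 with four registers per bank, the next architectural block
starts at 0x480, and the CTL2 registers start at 0x280); the helper names are
hypothetical and the derivation is a sketch rather than the reasoning the
author necessarily had in mind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_MC0_CTL   0x400u  /* first per-bank MSR */
#define MSR_IA32_MC0_CTL2  0x280u  /* first CTL2 MSR, one per bank */
#define MSR_IA32_VMX_BASIC 0x480u  /* next architectural block after the banks */
#define MCE_MSRS_PER_BANK  4u      /* CTL, STATUS, ADDR, MISC */

/* Number of banks that fit between MC0_CTL and the next MSR block. */
unsigned int sketch_max_mce_banks(void)
{
    return (MSR_IA32_VMX_BASIC - MSR_IA32_MC0_CTL) / MCE_MSRS_PER_BANK;
}

/* True if msr falls in a per-bank range for any architecturally possible bank. */
bool sketch_is_mce_bank_msr(uint32_t msr)
{
    unsigned int banks = sketch_max_mce_banks();

    return (msr >= MSR_IA32_MC0_CTL &&
            msr <  MSR_IA32_MC0_CTL + banks * MCE_MSRS_PER_BANK) ||
           (msr >= MSR_IA32_MC0_CTL2 &&
            msr <  MSR_IA32_MC0_CTL2 + banks);
}
```

With these numbers the per-bank range is 0x400 to 0x47f and the CTL2 range is
0x280 to 0x29f, so a range check like this can claim the whole architectural
space up front instead of letting out-of-subset banks reach the unknown-MSR
path.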
|
|
We do not expose the feature to guests, so should disallow access to the
respective MSRs. For simplicity, drop the entire block of MSRs, not just the
subset which have been specified thus far.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wl@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Paul Durrant <paul@xen.org>
|
|
This is part of XSA-320 / CVE-2020-0543
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
|
|
The CET spec has been published and guest kernels are starting to get support.
Introduce the CPUID and MSRs, and fully block the MSRs from guest use.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
|
|
Drop #include-s not needed by the header itself. Put the ones needed
into whichever other files actually need them.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
Drop #include-s not needed by the header itself. Put the ones needed
into whichever other files actually need them.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
For now, the default and max policies remain identical, but this will change
in the future.
Update XEN_SYSCTL_get_cpu_policy and init_domain_msr_policy() to use the
default policies.
Take the opportunity to sort PV ahead of HVM, as is the prevailing style
elsewhere.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
Arrange to compile out the PV or HVM logic and objects as applicable. This
involves a bit of complexity in init_domain_msr_policy() as is_pv_domain()
can't be evaluated at compile time.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
It turns out that these are unused, and we dup a type-dependent block of
zeros. Use xzalloc() instead.
Read/write MSRs typically default 0, and non-zero defaults would need dealing
with at suitable INIT/RESET points (e.g. arch_vcpu_regs_init).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
|
|
A splitlock is an atomic operation which crosses a cache line boundary. It
serialises operations in the cache coherency fabric and comes with a
multi-thousand cycle stall.
Intel Tremont CPUs introduce MSR_CORE_CAPS to enumerate various core-specific
features, and MSR_TEST_CTRL to adjust the behaviour in the case of a
splitlock.
Virtualising this for guests is distinctly tricky owing to the fact that
MSR_TEST_CTRL has core rather than thread scope. In the meantime however,
prevent the MSR values leaking into guests.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wl@xen.org>
|
|
This is an Intel-only, read-only MSR related to microcode loading. Expose it
in similar circumstances as the PATCHLEVEL MSR.
This should have been alongside c/s 013896cb8b2 "x86/msr: Fix handling of
MSR_AMD_PATCHLEVEL/MSR_IA32_UCODE_REV"
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
To fulfill the "protected" in its name, don't let the real hardware
values leak. While we could report a control register value expressing
this (which I would have preferred), unconditionally raise #GP for all
accesses (in the interest of getting this done).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
|
|
To protect against the TSX Async Abort speculative vulnerability, Intel have
released new microcode for affected parts which introduce the MSR_TSX_CTRL
control, which allows TSX to be turned off. This will be architectural on
future parts.
Introduce tsx= to provide a global on/off for TSX, including its enumeration
via CPUID. Provide stub virtualisation of this MSR, as it is not exposed to
guests at the moment.
VMs may have booted before microcode is loaded, or before hosts have rebooted,
and they still want to migrate freely. A VM which booted seeing TSX can
migrate safely to hosts with TSX disabled - TSX will start unconditionally
aborting, but still behave in a manner compatible with the ABI.
The guest-visible behaviour is equivalent to late loading the microcode and
setting the RTM_DISABLE bit in the course of live patching.
This is part of XSA-305 / CVE-2019-11135
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
The domain builder no longer uses local CPUID instructions for policy
decisions. This resolves a key issue for PVH dom0's. However, as PV dom0's
have never had faulting enforced, leave a command line option to restore the
old behaviour.
Advertise virtualised faulting support to control domains unless the opt-out
has been used.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
|
|
The control domain exclusion for CPUID Faulting predates dom0 PVH, but the
reason for the exclusion (to allow the domain builder to see host CPUID
values) isn't applicable.
The domain builder *is* broken in PVH control domains, and restricting the use
of CPUID Faulting doesn't make it any less broken. Tweak the logic to only
exclude PV control domains.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
|
|
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
from Zen (Fam17h) processors. Furthermore, LWP was dropped from Fam15/16 CPUs
when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
used in practice.
As a result, CPUs which are operating within specification (i.e. with up to
date microcode) no longer have this feature, and therefore are not using it.
Drop support from Xen. The main motivation here is to remove unnecessary
complexity from CPUID handling, but it also tidies up the SVM code nicely.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
|
|
This is a model specific register which details the current configuration of
cores and threads in the package. Because of how Hyperthread and Core
configuration works in firmware, the MSR is de-facto constant and
will remain unchanged until the next system reset.
It is a read only MSR (so unilaterally reject writes), but for now retain its
leaky-on-read properties. Further CPUID/MSR work is required before we can
start virtualising a consistent topology to the guest, and retaining the old
behaviour is the safest course of action.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
* Fix the shim build by providing a !CONFIG_HVM declaration for
hvm_get_guest_bndcfgs(), and removing the introduced
ASSERT(is_hvm_domain(d))'s. They are needed for DCE to keep the build
working. Furthermore, in this way, the risk of runtime type confusion is
removed.
* Revert the de-const'ing of the vcpu pointer in vmx_get_guest_bndcfgs().
vmx_vmcs_enter() really does mutate the vcpu, and may cause it to undergo a
full de/reschedule, which is contrary to the programmer's expectation of
hvm_get_guest_bndcfgs(). guest_rdmsr() was always going to need to lose
its const parameter, and this was the correct time for it to happen.
* The MSRs in vcpu_msrs are in numeric order. Re-position XSS to match.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.
This patch also adds proper checks of CPUID policy in the new get/set code.
NOTE: MSR_IA32_XSS is the last MSR to be saved and restored by
implementation-specific code. This patch therefore removes the
(VMX) definitions of the init_msr(), save_msr() and
load_msr() hvm_funcs, as they are no longer necessary. The
declarations of and calls to those hvm_funcs will be cleaned up
by a subsequent patch.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
|
|
Saving and restoring the value of this MSR is currently handled by
implementation-specific code despite it being architectural. This patch
moves handling of accesses to this MSR from hvm.c into the msr.c, thus
allowing the common MSR save/restore code to handle it.
NOTE: Because vmx_get/set_guest_bndcfgs() call vmx_vmcs_enter(), the
struct vcpu pointer passed in, and hence the vcpu pointer passed to
guest_rdmsr() cannot be const.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
|
|
There are a number of bugs. There are no read/write hooks on the HVM side, so
guest accesses fall into the "read/write-discard" defaults, which bypass the
correct faulting behaviour and the Intel special case.
For the PV side, writes are discarded (again, bypassing proper faulting),
except for a pinned dom0, which is permitted to actually write the values
other than 0. This is pointless with the read hook implementing the Intel special
case.
However, implementing the Intel special case is itself pointless. First of
all, OS software can't guarantee to read back 0 in the first place, because a)
this behaviour isn't guaranteed in the SDM, and b) there are SMM handlers
which use the CPUID instruction. Secondly, when a guest executes CPUID, this
doesn't typically result in Xen executing a CPUID instruction in practice.
With the dom0 special case removed, there are now no writes to this MSR other
than Xen's microcode loading facilities, which means that the value held in
the MSR will be properly up-to-date. Forward it directly, without jumping
through any hoops.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
The CPUID bit and MSR are deliberately not exposed to guests, because they
won't exist on newer processors. As vPMU isn't security supported, the
misbehaviour of PCR3 isn't expected to impact production deployments.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
With PVRDTSCP mode removed, handling of MSR_TSC_AUX can move into the common
code. Move its storage into struct vcpu_msrs (dropping the HVM-specific
msr_tsc_aux), and add an RDPID feature check as this bit also enumerates the
presence of the MSR.
Introduce cpu_has_rdpid along with the synthesized cpu_has_msr_tsc_aux to
correct the context switch paths, as MSR_TSC_AUX is enumerated by either
RDTSCP or RDPID.
Drop hvm_msr_tsc_aux() entirely, and use v->arch.msrs->tsc_aux directly.
Update hvm_load_cpu_ctxt() to check that the incoming ctxt.msr_tsc_aux isn't
out of range. In practice, no previous version of Xen ever wrote an
out-of-range value. Add MSR_TSC_AUX to the list of MSRs migrated for PV
guests, but leave the HVM path using the existing space in hvm_hw_cpu.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
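
The two checks described above can be sketched as follows. The function names
are hypothetical; the only facts relied on are those in the message (the MSR
is enumerated by either RDTSCP or RDPID) plus MSR_TSC_AUX holding a 32-bit
value, which is what makes a wider incoming value "out of range".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR_TSC_AUX exists if either instruction that reads it is enumerated. */
bool sketch_has_msr_tsc_aux(bool has_rdtscp, bool has_rdpid)
{
    return has_rdtscp || has_rdpid;
}

/*
 * Validate a value taken from an incoming migration record, which stores
 * the MSR in a 64-bit field.  Anything that does not survive truncation
 * to 32 bits is rejected.
 */
int sketch_check_incoming_tsc_aux(uint64_t incoming, uint32_t *tsc_aux)
{
    if ( incoming != (uint32_t)incoming )
        return -EINVAL;

    *tsc_aux = (uint32_t)incoming;
    return 0;
}
```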
|
|
Dispatch from the guest_{rd,wr}msr() functions. The read side should be safe
outside of current context, but the write side is definitely not. As the
toolstack has no legitimate reason to access the APIC registers via this
interface (not least because whether they are accessible at all depends on
guest settings), unilaterally reject access attempts outside of current
context.
Rename to guest_{rd,wr}msr_x2apic() for consistency, and alter the functions
to use X86EMUL_EXCEPTION rather than X86EMUL_UNHANDLEABLE. The previous
callers turned UNHANDLEABLE into EXCEPTION, but using UNHANDLEABLE will now
interfere with the fallback to legacy MSR handling.
While altering guest_rdmsr_x2apic() make a couple of minor improvements.
Reformat the initialiser for readable[] so it indents in a more natural way,
and alter high to be a 64bit integer to avoid shifting 0 by 32 in the common
path.
Observant people might notice that we now don't let PV guests read the x2apic
MSRs. They should never have been able to in the first place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
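
A rough sketch of the dispatch rule described above: accesses are rejected
with an exception unless they come from the vcpu's own context. The names and
EMUL_* codes are hypothetical. The MSR range (0x800 to 0x8ff) and the mapping
from MSR index to APIC register offset follow the x2APIC architecture as I
understand it; the readable[] filtering mentioned in the message is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_X2APIC_FIRST 0x800u
#define MSR_X2APIC_LAST  0x8ffu

/* Hypothetical stand-ins for the emulator return codes. */
enum { EMUL_OKAY, EMUL_EXCEPTION };

/*
 * Translate an x2APIC MSR index into the equivalent xAPIC register offset.
 * Requests on behalf of a vcpu other than the current one are refused
 * outright rather than reported as unhandleable.
 */
int sketch_x2apic_msr_to_offset(bool v_is_current, uint32_t msr,
                                uint32_t *reg_offset)
{
    if ( !v_is_current )
        return EMUL_EXCEPTION;

    if ( msr < MSR_X2APIC_FIRST || msr > MSR_X2APIC_LAST )
        return EMUL_EXCEPTION;

    *reg_offset = (msr - MSR_X2APIC_FIRST) << 4;
    return EMUL_OKAY;
}
```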
|
|
This is a followup to c/s 96f235c26 which fulfils the remaining TODO item.
First of all, the pre-existing SVM code has a bug. The value in
msrs->dr_mask[] may be stale, as we allow direct access to these MSRs.
Resolve this in guest_rdmsr() by reading directly from hardware in the
affected case.
With the reading/writing logic moved to the common guest_{rd,wr}msr()
infrastructure, the migration logic can be simplified. The PV migration logic
drops all of its special casing, and SVM's entire {init,save,load}_msr()
infrastructure becomes unnecessary.
The resulting diffstat shows quite how expensive the PV special cases were in
arch_do_domctl().
add/remove: 0/3 grow/shrink: 4/6 up/down: 465/-1494 (-1029)
Function                    old     new   delta
guest_rdmsr                 252     484    +232
guest_wrmsr                 653     822    +169
msrs_to_send                  8      48     +40
hvm_load_cpu_msrs           489     513     +24
svm_init_msr                 21       -     -21
hvm_save_cpu_msrs           365     343     -22
read_msr                   1089    1001     -88
write_msr                  1829    1689    -140
svm_msr_read_intercept     1124     970    -154
svm_load_msr                195       -    -195
svm_save_msr                196       -    -196
svm_msr_write_intercept    1461    1265    -196
arch_do_domctl             9581    9099    -482
Total: Before=3314610, After=3313581, chg -0.03%
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
Rename them to guest_{rd,wr}msr_xen() for consistency, and because the _regs
suffix isn't very appropriate.
Update them to take a vcpu pointer rather than presuming that they act on
current, and switch to using X86EMUL_* return values.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
Rename the functions to guest_{rd,wr}msr_viridian() for consistency, and
because the _regs() suffix isn't very appropriate.
Update them to take a vcpu pointer rather than presuming that they act on
current, which is safe for all implemented operations, and switch their return
ABI to use X86EMUL_*.
The default cases no longer need to deal with MSRs out of the Viridian range,
but drop the printks to debug builds only and identify the value attempting to
be written.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
Despite the complicated diff in {svm,vmx}_msr_write_intercept(), it is just
the 0 case losing one level of indentation, as part of removing the call to
wrmsr_hypervisor_regs().
The case blocks in guest_{wr,rd}msr() use raw numbers, partly for consistency
with the CPUID side of things, but mainly because this is clearer code to
follow. In particular, the Xen block may overlap with the Viridian block if
Viridian is not enabled for the domain, and trying to express this with named
literals caused more confusion than it solved.
Future changes will clean up the individual APIs, including allowing these
MSRs to be usable for vcpus other than current (no callers exist with v !=
current).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
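
The overlap between the two blocks can be illustrated with a small dispatch
sketch. The handler enum and function name are hypothetical, and the exact
range boundaries are assumptions chosen for illustration (a Viridian block at
0x40000000 to 0x400001ff, with the Xen block following it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical handler identifiers, for illustration only. */
enum handler { HANDLER_OTHER, HANDLER_VIRIDIAN, HANDLER_XEN };

enum handler sketch_pick_handler(uint32_t msr, bool viridian_enabled)
{
    /*
     * 0x40000000 - 0x400001ff: belongs to Viridian when it is enabled for
     * the domain.  Otherwise the Xen MSRs live here, which is the overlap
     * that is awkward to express with named constants.
     */
    if ( msr >= 0x40000000u && msr <= 0x400001ffu )
        return viridian_enabled ? HANDLER_VIRIDIAN : HANDLER_XEN;

    /* 0x40000200 - 0x400002ff: always the Xen block (assumed size). */
    if ( msr >= 0x40000200u && msr <= 0x400002ffu )
        return HANDLER_XEN;

    return HANDLER_OTHER;
}
```

Written with raw numbers, the two ranges and the condition that links them
sit side by side, which is the readability argument made above.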
|
|
Guests (outside of the nested virt case, which isn't supported yet) don't need
L1D_FLUSH for their L1TF mitigations, but offering/emulating MSR_FLUSH_CMD is
easy and doesn't pose an issue for Xen.
The MSR is offered to HVM guests only. PV guests attempting to use it would
trap for emulation, and the L1D cache would fill long before the return to
guest context. As such, PV guests can't make any use of the L1D_FLUSH
functionality.
This is part of XSA-273 / CVE-2018-3646.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
After attempting to develop the infrastructure, it turns out that the choice
of naming is suboptimal.
Rename msr_domain_policy to just msr_policy to mirror the CPUID side of
things, and alter the 'dp' variable name convention to 'mp'. While altering
all the names, export all of the system msr_policy objects (which are already
global symbols).
Rename msr_vcpu_policy to vcpu_msrs and switch 'vp' to 'msrs' in code. Update
the arch_vcpu field name to match.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
|
|
... and use it in place of the opencoded instances.
For consistency, restructure init_domain_cpuid_policy() to be like
init_{domain,vcpu}_msr_policy() by operating on the local pointer where
possible.
No change in behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
|
|
This simplifies future interactions with the toolstack, by removing the need
for per-MSR custom accessors when shuffling data in/out of a policy.
Use a 32bit raw backing integer (for simplicity), and use a bitfield to move
the cpuid_faulting field to its appropriate position.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
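
A minimal sketch of the raw-plus-bitfield layout described above. The struct
and accessor names are hypothetical. It assumes a compiler that allocates
bitfields starting from the least significant bit, as GCC and Clang do on
x86, and uses a C11 anonymous struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy fragment: one MSR, viewable raw or by field. */
struct sketch_msr_policy {
    union {
        uint32_t raw;
        struct {
            uint32_t :31;               /* bits 0-30, not modelled here */
            uint32_t cpuid_faulting:1;  /* bit 31 (assumes LSB-first layout) */
        };
    } platform_info;
};

/* Generic accessor: hand the value out without any per-MSR knowledge. */
uint32_t sketch_policy_read_raw(const struct sketch_msr_policy *p)
{
    return p->platform_info.raw;
}
```

Hypervisor code can keep using the named field, while code that merely
shuffles policy data in and out only ever touches raw.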
|
|
These MSRs are non-architectural and the available booleans were used in lieu
of an architectural signal of availability.
However, in hindsight, the additional booleans make toolstack MSR interactions
more complicated. The MSRs are unconditionally available to HVM guests, but
currently for PV guests, are hidden when CPUID faulting is unavailable.
Instead, switch them to being unconditionally readable, even for PV guests.
The new behaviour is:
* PLATFORM_INFO is unconditionally readable even for PV guests and will
indicate the presence or absence of CPUID Faulting in bit 31.
* MISC_FEATURES_ENABLES is unconditionally readable, and bit 0 may be set
iff PLATFORM_INFO reports that CPUID Faulting is available.
As a minor bugfix, CPUID Faulting for HVM guests is not restricted to
Intel/AMD hardware. In particular, VIA have a VT-x implementation conforming
to the Intel specification.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
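
The new behaviour listed above can be sketched as a pair of handlers. The
struct, function names and EMUL_* codes are hypothetical; the bit positions
(31 in PLATFORM_INFO, 0 in MISC_FEATURES_ENABLES) are the ones given in the
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PLATFORM_INFO_CPUID_FAULTING (UINT64_C(1) << 31)
#define MISC_FEATURES_CPUID_FAULTING (UINT64_C(1) << 0)

/* Hypothetical stand-ins for the emulator return codes. */
enum { EMUL_OKAY, EMUL_EXCEPTION };

/* Hypothetical per-guest state. */
struct sketch_guest {
    bool faulting_available;  /* from the domain's policy */
    bool faulting_enabled;    /* current guest setting */
};

/* PLATFORM_INFO: always readable; bit 31 reports availability. */
uint64_t sketch_read_platform_info(const struct sketch_guest *g)
{
    return g->faulting_available ? PLATFORM_INFO_CPUID_FAULTING : 0;
}

/* MISC_FEATURES_ENABLES: bit 0 may be set iff PLATFORM_INFO advertises it. */
int sketch_write_misc_features(struct sketch_guest *g, uint64_t val)
{
    uint64_t allowed = g->faulting_available ? MISC_FEATURES_CPUID_FAULTING : 0;

    if ( val & ~allowed )
        return EMUL_EXCEPTION;

    g->faulting_enabled = (val & MISC_FEATURES_CPUID_FAULTING) != 0;
    return EMUL_OKAY;
}
```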
|
|
Almost all infrastructure is already in place. Update the reserved bits
calculation in guest_wrmsr(), and offer SSBD to guests by default.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
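
A sketch of what a reserved-bits calculation of this kind might look like.
The function names and EMUL_* codes are hypothetical. The bit positions are
the architectural MSR_SPEC_CTRL ones; treating IBRS and STIBP as always
writable is a simplifying assumption, not a statement about what
guest_wrmsr() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPEC_CTRL_IBRS  (UINT64_C(1) << 0)
#define SPEC_CTRL_STIBP (UINT64_C(1) << 1)
#define SPEC_CTRL_SSBD  (UINT64_C(1) << 2)

/* Hypothetical stand-ins for the emulator return codes. */
enum { EMUL_OKAY, EMUL_EXCEPTION };

/* Bits the guest may not set: SSBD leaves the set only if it is offered. */
uint64_t sketch_spec_ctrl_rsvd(bool guest_has_ssbd)
{
    return ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP |
             (guest_has_ssbd ? SPEC_CTRL_SSBD : 0));
}

/* Write handler: fault on any reserved bit, otherwise latch the value. */
int sketch_wrmsr_spec_ctrl(bool guest_has_ssbd, uint64_t val,
                           uint64_t *spec_ctrl)
{
    if ( val & sketch_spec_ctrl_rsvd(guest_has_ssbd) )
        return EMUL_EXCEPTION;

    *spec_ctrl = val;
    return EMUL_OKAY;
}
```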
|