# x86/HVM direct boot ABI #

Since the Xen entry point into the kernel can be different from the
native entry point, an `ELFNOTE` is used in order to tell the domain
builder how to load and jump into the kernel entry point:

    ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY,          .long,  xen_start32)

The presence of the `XEN_ELFNOTE_PHYS32_ENTRY` note indicates that the
kernel supports the boot ABI described in this document.

The domain builder must load the kernel into the guest memory space and
jump into the entry point defined at `XEN_ELFNOTE_PHYS32_ENTRY` with the
following machine state:

 * `ebx`: contains the physical memory address where the loader has placed
   the boot start info structure.

 * `cr0`: bit 0 (PE) must be set. All the other writeable bits are cleared.

 * `cr4`: all bits are cleared.

 * `cs`: must be a 32-bit read/execute code segment with a base of `0`
   and a limit of `0xFFFFFFFF`. The selector value is unspecified.

 * `ds`, `es`, `ss`: must be a 32-bit read/write data segment with a base of
   `0` and a limit of `0xFFFFFFFF`. The selector values are all unspecified.

 * `tr`: must be a 32-bit TSS (active) with a base of `0` and a limit of `0x67`.

 * `eflags`: bit 17 (VM) must be cleared. Bit 9 (IF) must be cleared.
   Bit 8 (TF) must be cleared. Other bits are all unspecified.

All other processor registers and flag bits are unspecified. The OS is in
charge of setting up its own stack, GDT and IDT.
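
The control-register and flag requirements above can be expressed as a small check, for example in an early-boot assertion or a domain builder self-test. This is a minimal sketch: the helper name `pvh_entry_state_ok` is hypothetical, and only `CR0.PE` is tested in `cr0` because which of the remaining bits count as writeable depends on the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 bit positions. */
#define CR0_PE     (1u << 0)   /* protected mode enable */
#define EFLAGS_TF  (1u << 8)   /* trap flag */
#define EFLAGS_IF  (1u << 9)   /* interrupt enable flag */
#define EFLAGS_VM  (1u << 17)  /* virtual-8086 mode */

/*
 * Hypothetical helper (not part of any Xen header): returns true if the
 * sampled register values are consistent with the entry state listed in
 * this document. Segment registers and tr are not covered here.
 */
static bool pvh_entry_state_ok(uint32_t cr0, uint32_t cr4, uint32_t eflags)
{
    if (!(cr0 & CR0_PE))
        return false;               /* protected mode must be enabled */
    if (cr4 != 0)
        return false;               /* all cr4 bits must be clear */
    if (eflags & (EFLAGS_VM | EFLAGS_IF | EFLAGS_TF))
        return false;               /* VM, IF and TF must be clear */
    return true;
}
```

Other `eflags` bits are deliberately ignored, since the ABI leaves them unspecified.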

The format of the boot start info structure (pointed to by `ebx`) can be found
in `xen/include/public/arch-x86/hvm/start_info.h`.
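
As an illustration, a kernel would typically sanity-check the structure before trusting any of its fields. The sketch below uses a local, illustrative copy of the leading fields and a magic constant that is assumed to match `XEN_HVM_START_MAGIC_VALUE`; the header named above remains the authoritative definition, and the names `start_info_head` and `start_info_valid` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match XEN_HVM_START_MAGIC_VALUE in start_info.h. */
#define START_MAGIC 0x336ec578u

/*
 * Local illustrative copy of the leading fields of the start info
 * structure. Real code should include the Xen public header instead.
 */
struct start_info_head {
    uint32_t magic;          /* identifies the structure */
    uint32_t version;        /* structure version */
    uint32_t flags;
    uint32_t nr_modules;     /* number of boot modules */
    uint64_t modlist_paddr;  /* physical address of the module list */
    uint64_t cmdline_paddr;  /* physical address of the command line */
    uint64_t rsdp_paddr;     /* physical address of the RSDP, if any */
};

/* Returns true if the pointer is non-NULL and carries the expected magic. */
static bool start_info_valid(const struct start_info_head *si)
{
    return si != NULL && si->magic == START_MAGIC;
}
```

At the entry point the value in `ebx` is a physical address, so it can only be dereferenced directly while paging is disabled or the address is identity-mapped.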

Other relevant information needed in order to boot a guest kernel
(console page address, XenStore event channel...) can be obtained
using HVM parameters (HVMPARAMS), just as is done for HVM guests.

The setup of the hypercall page is also performed in the same way
as HVM guests, using the hypervisor CPUID leaves and MSR ranges.
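
The first step of that procedure is usually to locate the hypervisor CPUID leaves by looking for the Xen signature, which is returned as the ASCII string `XenVMMXenVMM` spread across `ebx`, `ecx` and `edx`. The sketch below only shows the signature comparison on values that have already been read with `cpuid`; the function name is hypothetical and the byte order assumes a little-endian host, which holds on x86.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: given the ebx/ecx/edx values returned by a
 * hypervisor base CPUID leaf, report whether they spell the Xen
 * signature. Assumes little-endian byte order.
 */
static bool is_xen_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[12];

    /* Concatenate the three registers in ebx, ecx, edx order. */
    memcpy(sig + 0, &ebx, sizeof(ebx));
    memcpy(sig + 4, &ecx, sizeof(ecx));
    memcpy(sig + 8, &edx, sizeof(edx));

    return memcmp(sig, "XenVMMXenVMM", sizeof(sig)) == 0;
}
```

Once the base leaf is known, a later leaf reports the MSR used to install the hypercall page; the exact leaf layout should be taken from the Xen public headers rather than from this sketch.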

## AP startup ##

AP startup can be performed using hypercalls or the local APIC if present.
The following VCPU hypercalls can be used in order to bring up secondary vCPUs:

 * `VCPUOP_initialise` is used to set the initial state of the vCPU. The
   argument passed to the hypercall must be of the type `vcpu_hvm_context`.
   See `public/hvm/hvm_vcpu.h` for the layout of the structure. Note that
   this hypercall allows starting the vCPU in several modes (16-bit, 32-bit
   or 64-bit), regardless of the mode the BSP is currently running in.

 * `VCPUOP_up` is used to launch the vCPU once the initial state has been
   set using `VCPUOP_initialise`.

 * `VCPUOP_down` is used to bring down a vCPU.

 * `VCPUOP_is_up` reports whether a given vCPU is up, and can be used to scan
   the number of available vCPUs.
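
The order in which these hypercalls are issued can be sketched as follows. This is an illustration only: the command numbers are believed to mirror `xen/include/public/vcpu.h` but should be taken from that header, `vcpu_op_fn` is a hypothetical stand-in for the guest's hypercall wrapper, and `mock_vcpu_op` is a fake backend so the sequence can be exercised outside a guest.

```c
#include <stddef.h>

/* Command numbers, believed to mirror xen/include/public/vcpu.h. */
enum {
    VCPUOP_initialise = 0,
    VCPUOP_up         = 1,
    VCPUOP_down       = 2,
    VCPUOP_is_up      = 3
};

/* Hypothetical stand-in for the guest's vcpu_op hypercall wrapper. */
typedef long (*vcpu_op_fn)(int cmd, unsigned int vcpu, void *arg);

/*
 * Set the initial state of `vcpu` from `ctx` (a vcpu_hvm_context in real
 * code), launch it, then confirm it is up. Returns 0 on success or a
 * negative value on failure.
 */
static long bring_up_ap(vcpu_op_fn vcpu_op, unsigned int vcpu, void *ctx)
{
    long rc = vcpu_op(VCPUOP_initialise, vcpu, ctx);
    if (rc < 0)
        return rc;

    rc = vcpu_op(VCPUOP_up, vcpu, NULL);
    if (rc < 0)
        return rc;

    rc = vcpu_op(VCPUOP_is_up, vcpu, NULL);
    if (rc < 0)
        return rc;
    return rc > 0 ? 0 : -1;
}

/* Fake backend for host-side experiments: tracks one vCPU's state. */
static int mock_state; /* 0 = uninitialised, 1 = initialised, 2 = up */

static long mock_vcpu_op(int cmd, unsigned int vcpu, void *arg)
{
    (void)vcpu;
    (void)arg;
    switch (cmd) {
    case VCPUOP_initialise: mock_state = 1; return 0;
    case VCPUOP_up:
        if (mock_state < 1)
            return -1;              /* must be initialised first */
        mock_state = 2;
        return 0;
    case VCPUOP_down:
        if (mock_state == 2)
            mock_state = 1;
        return 0;
    case VCPUOP_is_up: return mock_state == 2;
    default: return -1;
    }
}
```

Passing the hypercall as a function pointer keeps the sequencing logic separate from the hypercall mechanism, which is guest specific.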

## Hardware description ##

PVH guests that have access to hardware (either emulated or real) will also
have ACPI tables with the description of the hardware that's available to the
guest. This applies to both privileged and unprivileged guests. A pointer to
the position of the RSDP in memory (if present) can be fetched from the start
info structure that's passed at boot time (field `rsdp_paddr`).
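
Before walking the ACPI tables, a guest may want to verify that `rsdp_paddr` really points at an RSDP. The sketch below applies the checks defined by the ACPI specification for the ACPI 1.0 portion of the structure: the 8-byte signature `"RSD PTR "` and a checksum over the first 20 bytes that must sum to zero. The helper name is hypothetical, and the pointer is assumed to be already mapped and readable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RSDP_V1_LEN 20  /* bytes covered by the ACPI 1.0 checksum */

/*
 * Hypothetical helper: check the RSDP signature and the ACPI 1.0
 * checksum. `rsdp` must point to at least RSDP_V1_LEN readable bytes.
 */
static bool rsdp_looks_valid(const uint8_t *rsdp)
{
    uint8_t sum = 0;

    if (rsdp == NULL || memcmp(rsdp, "RSD PTR ", 8) != 0)
        return false;

    /* All bytes of the ACPI 1.0 structure must sum to zero modulo 256. */
    for (size_t i = 0; i < RSDP_V1_LEN; i++)
        sum = (uint8_t)(sum + rsdp[i]);

    return sum == 0;
}
```

ACPI 2.0 and later add an extended checksum over the full structure, which this sketch does not verify.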

Description of paravirtualized devices will come from XenStore, just as it's
done for HVM guests.

## Interrupts ##

### Interrupts from physical devices ###

Interrupts from physical devices are delivered using native methods. This is
done in order to take advantage of new hardware-assisted virtualization
functions, like posted interrupts. This implies that PVH guests with physical
devices will also have the necessary interrupt controllers in order to manage
the delivery of interrupts from those devices, using the same interfaces that
are available on native hardware.

### Interrupts from paravirtualized devices ###

Interrupts from paravirtualized devices are delivered using event channels. See
[Event Channel Internals][event_channels] for more detailed information about
event channels. Delivery of those interrupts can be configured in the same way
as for HVM guests; see `xen/include/public/hvm/params.h` and
`xen/include/public/hvm/hvm_op.h` for more information about available delivery
methods.
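
One of those delivery methods is a direct vector callback, requested through the `HVM_PARAM_CALLBACK_IRQ` parameter. As a hedged sketch, the value is assumed here to carry the delivery type in bits 63:56 and, for the vector type, the vector number in bits 7:0; the constants below are local stand-ins believed to match `params.h`, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Local stand-ins, assumed to match the definitions in params.h. */
#define CALLBACK_TYPE_SHIFT   56
#define CALLBACK_TYPE_VECTOR  2ull

/* Build a callback parameter value requesting delivery on `vector`. */
static uint64_t callback_via_vector(uint8_t vector)
{
    return (CALLBACK_TYPE_VECTOR << CALLBACK_TYPE_SHIFT) | vector;
}

/* Extract the delivery type from a callback parameter value. */
static unsigned int callback_type(uint64_t val)
{
    return (unsigned int)(val >> CALLBACK_TYPE_SHIFT);
}
```

The resulting value would then be handed to the hypervisor with the HVM parameter interface, using whatever hypercall wrapper the guest provides.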

## MTRR ##

### Unprivileged guests ###

PVH guests are currently booted with the default MTRR type set to write-back
and MTRR enabled. This allows DomUs to start with a sane MTRR state. Note that
this will have to be revisited when PCI passthrough is added to PVH in order to
set MMIO regions as UC.

Xen guarantees that RAM regions will always have the WB cache type set in the
initial MTRR state, either set by the default MTRR type or by other means.
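
A guest that wants to confirm the default type at boot can decode the `IA32_MTRR_DEF_TYPE` MSR. The sketch below uses the architectural layout from the Intel SDM (type in bits 7:0, fixed-range enable in bit 10, MTRR enable in bit 11) and operates on a value that has already been read with `rdmsr`; the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_MTRR_DEF_TYPE   0x2ffu        /* IA32_MTRR_DEF_TYPE */
#define MTRR_DEF_TYPE_MASK  0xffull       /* default memory type, bits 7:0 */
#define MTRR_DEF_FE         (1ull << 10)  /* fixed-range MTRRs enable */
#define MTRR_DEF_E          (1ull << 11)  /* MTRR enable */

#define MTRR_TYPE_UC  0u  /* uncacheable */
#define MTRR_TYPE_WB  6u  /* write-back */

/*
 * Hypothetical helper: true if MTRRs are enabled and the default memory
 * type is write-back, given the raw MSR value.
 */
static bool mtrr_default_is_wb_enabled(uint64_t def_type)
{
    return (def_type & MTRR_DEF_E) != 0 &&
           (def_type & MTRR_DEF_TYPE_MASK) == MTRR_TYPE_WB;
}
```

The fixed-range enable bit is not tested, since the text above does not state how it is set.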

### Hardware domain ###

A PVH hardware domain is booted with the same MTRR state as the one found on
the host. This is done because the hardware domain memory map is already a
modified copy of the host memory map, so the same MTRR setup should work.
