# Xen Hypervisor Command Line Options

This document covers the command line options which the Xen
Hypervisor accepts.
5
6## Types of parameter
7
8Most parameters take the form `option=value`.  Different options on
9the command line should be space delimited.  All options are case
10sensitive, as are all values unless explicitly noted.
11
12### Boolean (`<boolean>`)
13
All boolean options may be explicitly enabled using a `value` of
15> `yes`, `on`, `true`, `enable` or `1`
16
17They may be explicitly disabled using a `value` of
18> `no`, `off`, `false`, `disable` or `0`
19
20In addition, a boolean option may be enabled by simply stating its
21name, and may be disabled by prefixing its name with `no-`.
22
#### Examples
24
25Enable noreboot mode
26> `noreboot=true`
27
28Disable x2apic support (if present)
29> `x2apic=off`
30
31Enable synchronous console mode
32> `sync_console`
33
34Explicitly specifying any value other than those listed above is
35undefined, as is stacking a `no-` prefix with an explicit value.
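
The boolean rules above can be modelled with a few lines of Python.  This is
an illustrative sketch rather than Xen's actual parser: the helper name
`parse_boolean_option` is hypothetical, and raising an error for the cases
the text calls undefined is a choice made only for this example.

```python
TRUE_VALUES = {"yes", "on", "true", "enable", "1"}
FALSE_VALUES = {"no", "off", "false", "disable", "0"}


def parse_boolean_option(token):
    """Return (name, enabled) for one boolean command line token.

    Hypothetical helper modelling the documented rules only.
    """
    name, sep, value = token.partition("=")
    if sep:
        # An explicit value was given; values are case sensitive.
        if name.startswith("no-"):
            raise ValueError("'no-' prefix with an explicit value is undefined")
        if value in TRUE_VALUES:
            return name, True
        if value in FALSE_VALUES:
            return name, False
        raise ValueError(f"undefined boolean value: {value!r}")
    if name.startswith("no-"):
        # Bare name with a 'no-' prefix disables the option.
        return name[len("no-"):], False
    # Bare name enables the option.
    return name, True
```

With this model, `noreboot=true` and `sync_console` both enable their option,
while `x2apic=off` and `no-x2apic` both disable `x2apic`.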
36
37### Integer (`<integer>`)
38
39An integer parameter will default to decimal and may be prefixed with
40a `-` for negative numbers.  Alternatively, a hexadecimal number may be
41used by prefixing the number with `0x`, or an octal number may be used
42if a leading `0` is present.
43
44Providing a string which does not validly convert to an integer is
45undefined.
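
As a rough illustration of the prefix rules, the following Python sketch
converts a string the way the text describes.  The function name
`parse_integer` is hypothetical, and two things are assumptions of the sketch
rather than documented behaviour: a `-` sign may be combined with a `0x` or
`0` prefix, and invalid input raises an error.

```python
def parse_integer(text):
    """Convert text using the documented prefixes (hypothetical helper)."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    # Reject empty strings, extra signs, whitespace and separators up front.
    if not (digits.isascii() and digits.isalnum()):
        raise ValueError(f"not a valid integer: {text!r}")
    if digits.lower().startswith("0x"):
        value = int(digits[2:], 16)   # hexadecimal
    elif len(digits) > 1 and digits.startswith("0"):
        value = int(digits[1:], 8)    # octal, because of the leading 0
    else:
        value = int(digits, 10)       # decimal is the default
    return -value if negative else value
```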
46
47### Size (`<size>`)
48
49A size parameter may be any integer, with a single size suffix
50
51* `T` or `t`: TiB (2^40)
52* `G` or `g`: GiB (2^30)
53* `M` or `m`: MiB (2^20)
54* `K` or `k`: KiB (2^10)
55* `B` or `b`: Bytes
56
Without a size suffix, the value is interpreted in KiB.  Providing a suffix
other than those listed above is undefined.
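
The suffix rules can be sketched as follows.  This is a simplified model, not
Xen's parser: `parse_size` is a hypothetical name, and the sketch assumes a
decimal number so that a trailing `b` is never confused with a hexadecimal
digit.

```python
# Shift applied for each (case-insensitive) suffix.
SIZE_SHIFTS = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def parse_size(text):
    """Return a size in bytes; a bare number is taken as KiB."""
    suffix = text[-1:].lower()
    if suffix in SIZE_SHIFTS:
        number, shift = text[:-1], SIZE_SHIFTS[suffix]
    else:
        number, shift = text, 10  # no suffix: KiB
    return int(number, 10) << shift
```

For example, `parse_size("128M")` gives 134217728 bytes, and
`parse_size("512")` gives 524288 bytes because a bare number counts as KiB.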
59
60### String
61
Many parameters are more complicated and require more intricate
configuration.  The detailed description of each individual parameter
specifies which values are valid.
65
66### List
67
68Some options take a comma separated list of values.
69
70### Combination
71
72Some parameters act as combinations of the above, most commonly a mix
73of Boolean and String.  These are noted in the relevant sections.
74
75## Parameter details
76
77### acpi
78> `= force | ht | noirq | <boolean>`
79
80**String**, or **Boolean** to disable.
81
82The **acpi** option is used to control a set of four related boolean
83flags; `acpi_force`, `acpi_ht`, `acpi_noirq` and `acpi_disabled`.
84
85By default, Xen will scan the DMI data and blacklist certain systems
86which are known to have broken ACPI setups.  Providing `acpi=force`
87will cause Xen to ignore the blacklist and attempt to use all ACPI
88features.
89
90Using `acpi=ht` causes Xen to parse the ACPI tables enough to
91enumerate all CPUs, but will not use other ACPI features.  This is not
92common, and only has an effect if your system is blacklisted.
93
94The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
95looking for IO-APIC entries.  This is also not common, and any system
96which requires this option to function should be blacklisted.
97Additionally, this will not prevent Xen from finding IO-APIC entries
98from the MP tables.
99
100Finally, any of the boolean false options can be used to disable ACPI
101usage entirely.
102
Because responsibility for ACPI processing is shared between Xen and
the domain 0 kernel, this option is automatically propagated to the
domain 0 command line.
106
107### acpi_apic_instance
108> `= <integer>`
109
110Specify which ACPI MADT table to parse for APIC information, if more
111than one is present.
112
113### acpi_pstate_strict (x86)
114> `= <boolean>`
115
116> Default: `false`
117
Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.
121
122### acpi_skip_timer_override (x86)
123> `= <boolean>`
124
125Instruct Xen to ignore timer-interrupt override.
126
127### acpi_sleep (x86)
128> `= s3_bios | s3_mode`
129
130`s3_bios` instructs Xen to invoke video BIOS initialization during S3
131resume.
132
133`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
134mode during S3 resume.
135
136### allow_unsafe (x86)
137> `= <boolean>`
138
139> Default: `false`
140
141Force boot on potentially unsafe systems. By default Xen will refuse
142to boot on systems with the following errata:
143
144* AMD Erratum 121. Processors with this erratum are subject to a guest
145  triggerable Denial of Service. Override only if you trust all of
146  your PV guests.
147
148### altp2m (Intel)
149> `= <boolean>`
150
151> Default: `false`
152
153Permit multiple copies of host p2m.
154
155### apic (x86)
156> `= bigsmp | default`
157
158Override Xen's logic for choosing the APIC driver.  By default, if
159there are more than 8 CPUs, Xen will switch to `bigsmp` over
160`default`.
161
162### apicv (Intel)
163> `= <boolean>`
164
165> Default: `true`
166
Permit Xen to use APIC Virtualisation Extensions.  This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
handling, rather than requiring emulation in Xen.
170
171### apic_verbosity (x86)
172> `= verbose | debug`
173
174Increase the verbosity of the APIC code from the default value.
175
176### arat (x86)
177> `= <boolean>`
178
179> Default: `true`
180
181Permit Xen to use "Always Running APIC Timer" support on compatible hardware
182in combination with cpuidle.  This option is only expected to be useful for
183developers wishing Xen to fall back to older timing methods on newer hardware.
184
185### argo
186    = List of [ <bool>, mac-permissive=<bool> ]
187
188Controls for the Argo hypervisor-mediated interdomain communication service.
189
The functionality that this option controls is only available when Xen has been
compiled with Argo enabled in the build configuration.

Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority.  Guests may register memory rings to receive messages,
query the status of other domains, and send messages by hypercall, all subject
to appropriate auditing by Xen.  Argo is disabled by default.
197
198*   The `mac-permissive` boolean controls whether wildcard receive rings may be
199    registered (`mac-permissive=1`) or may not be registered
200    (`mac-permissive=0`).
201
202    This option is disabled by default, to protect domains from a DoS by a
203    buggy or malicious other domain spamming the ring.
204
205### asid (x86)
206> `= <boolean>`
207
208> Default: `true`
209
210Permit Xen to use Address Space Identifiers.  This is an optimisation which
211tags the TLB entries with an ID per vcpu.  This allows for guest TLB flushes
212to be performed without the overhead of a complete TLB flush.
213
214### async-show-all (x86)
215> `= <boolean>`
216
217> Default: `false`
218
219Forces all CPUs' full state to be logged upon certain fatal asynchronous
220exceptions (watchdog NMIs and unexpected MCEs).
221
222### ats (x86)
223> `= <boolean>`
224
225> Default: `false`
226
227Permits Xen to set up and use PCI Address Translation Services.  This is a
228performance optimisation for PCI Passthrough.
229
230**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
231loops for Queued Invalidation completions.**
232
233### availmem
234> `= <size>`
235
236> Default: `0` (no limit)
237
238Specify a maximum amount of available memory, to which Xen will clamp
239the e820 table.
240
241### badpage
242> `= List of [ <integer> | <integer>-<integer> ]`
243
244Specify that certain pages, or certain ranges of pages contain bad
245bytes and should not be used.  For example, if your memory tester says
246that byte `0x12345678` is bad, you would place `badpage=0x12345` on
247Xen's command line.
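
The conversion in the example above is a shift from a byte address to a page
number.  The following sketch assumes 4 KiB pages (a shift of 12, which
matches the example); the helper name `badpage_option` is hypothetical.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as in the example above


def badpage_option(bad_addresses):
    """Build a badpage= argument from a list of bad byte addresses."""
    # Several bad bytes may share a page, so deduplicate the page numbers.
    pages = sorted({addr >> PAGE_SHIFT for addr in bad_addresses})
    return "badpage=" + ",".join(hex(page) for page in pages)
```

Calling `badpage_option([0x12345678])` yields `badpage=0x12345`.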
248
249### bootscrub
250> `= idle | <boolean>`
251
252> Default: `idle`
253
254Scrub free RAM during boot.  This is a safety feature to prevent
255accidentally leaking sensitive VM data into other VMs if Xen crashes
256and reboots.
257
258In `idle` mode, RAM is scrubbed in background on all CPUs during idle-loop
259with a guarantee that memory allocations always provide scrubbed pages.
260This option reduces boot time on machines with a large amount of RAM while
261still providing security benefits.
262
263### bootscrub_chunk
264> `= <size>`
265
266> Default: `128M`
267
268Maximum RAM block size chunks to be scrubbed whilst holding the page heap lock
269and not running softirqs. Reduce this if softirqs are not being run frequently
270enough. Setting this to a high value may cause boot failure, particularly if
271the NMI watchdog is also enabled.
272
273### cet
274    = List of [ shstk=<bool> ]
275
276    Applicability: x86
277
Controls for the use of Control-flow Enforcement Technology.  CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
call/jmp COP/JOP) attacks.
281
282*   The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
283    protection.
284
285    The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
286    defaults to `true` on hardware supporting CET-SS.  Specifying
287    `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
288    is available in hardware.
289
290    Shadow Stacks are incompatible with 32bit PV guests.  This option will
291    override the `pv=32` boolean to false.  Backwards compatibility can be
292    maintained with the `pv-shim` mechanism.
293
294### clocksource (x86)
295> `= pit | hpet | acpi | tsc`
296
297If set, override Xen's default choice for the platform timer.
Using TSC as the platform timer requires it to be set explicitly. This is because
299TSC can only be safely used if CPU hotplug isn't performed on the system. On
300some platforms, the "maxcpus" option may need to be used to further adjust
301the number of allowed CPUs.  When running on platforms that can guarantee a
302monotonic TSC across sockets you may want to adjust the "tsc" command line
303parameter to "stable:socket".
304
305### cmci-threshold (Intel)
306> `= <integer>`
307
308> Default: `2`
309
310Specify the event count threshold for raising Corrected Machine Check
311Interrupts.  Specifying zero disables CMCI handling.
312
313### cmos-rtc-probe (x86)
314> `= <boolean>`
315
316> Default: `false`
317
318Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
319ACPI indicating none to be there.
320
321### com1
322### com2
323> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`
324
Both options, `com1` and `com2`, follow the same format.
326
327* `<baud>` may be either an integer baud rate, or the string `auto` if
328  the bootloader or other earlier firmware has already set it up.
329* Optionally, the base baud rate (usually the highest baud rate the
330  device can communicate at) can be specified.
331* `DPS` represents the number of data bits, the parity, and the number
332  of stop bits.
333  * `D` is an integer between 5 and 8 for the number of data bits.
334  * `P` is a single character representing the type of parity:
335      * `n` No
336      * `o` Odd
337      * `e` Even
338      * `m` Mark
339      * `s` Space
340  * `S` is an integer 1 or 2 for the number of stop bits.
341* `<io-base>` is an integer which specifies the IO base port for UART
342  registers.
343* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
344  mode only, or `msi` to set up a Message Signaled Interrupt.
345* `<port-bdf>` is the PCI location of the UART, in
346  `<bus>:<device>.<function>` notation.
347* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
348  `<bus>:<device>.<function>` notation.
349* `pci` indicates that Xen should scan the PCI bus for the UART,
350  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.
353
354A typical setup for most situations might be `com1=115200,8n1`
355
In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
flexibility for UART devices which require additional UART parameter
configurations.
360
361The comma separation still delineates positional parameters. Hence,
362unless the parameter is explicitly specified with name=value option, it
363will be considered a positional parameter.
364
The syntax consists of

    com1=(comma-separated positional parameters),(comma-separated name-value pairs)
367
368The accepted name keywords for name=value pairs are:
369
370* `baud` - accepts integer baud rate (eg. 115200) or `auto`
371* `bridge`- Similar to bridge-bdf in positional parameters.
372            Used to determine the PCI bridge to access the UART device.
373            Notation is xx:xx.x `<bus>:<device>.<function>`
374* `clock-hz`- accepts large integers to setup UART clock frequencies.
375              Do note - these values are multiplied by 16.
376* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` OR `amt`. This option is used to
          specify that the serial device is PCI-based. The `io-base`
          cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART registers
381* `irq` - IRQ number to use
382* `parity` - accepted values are same as positional parameters
383* `port` - Used to specify which port the PCI serial device is located on
384           Notation is xx:xx.x `<bus>:<device>.<function>`
385* `reg-shift` - register shifts required to set UART registers
386* `reg-width` - register width required to set UART registers
387                (only accepts 1 and 4)
388* `stop-bits` - only accepts 1 or 2 for the number of stop bits
389
390The following are examples of correct specifications:
391
392    com1=115200,8n1,0x3f8,4
393    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
394    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4
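
To illustrate how the two styles combine, the sketch below separates the
fields of a `com1=`/`com2=` value into positional parameters and name=value
pairs.  It is a simplified model only: `split_com_params` is a hypothetical
helper, and it does not validate keywords or values.

```python
def split_com_params(value):
    """Split a com1=/com2= value into (positional, named) parts."""
    positional, named = [], {}
    for field in value.split(","):
        name, sep, val = field.partition("=")
        if sep:
            # Explicit name=value pair.
            named[name] = val
        else:
            # Anything without '=' is treated as positional.
            positional.append(field)
    return positional, named
```

For the second example above, this returns the positional fields
`115200`, `8n1`, `0x3f8` and `4`, plus the pairs `reg-width=4` and
`reg-shift=2`.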
395
396### conring_size
397> `= <size>`
398
399> Default: `conring_size=16k`
400
401Specify the size of the console ring buffer.
402
403### console
404> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | none ]`
405
406> Default: `console=com1,vga`
407
408Specify which console(s) Xen should use.
409
410`vga` indicates that Xen should try and use the vga graphics adapter.
411
`com1` and `com2` indicate that Xen should use serial ports 1 and 2
413respectively.  Optionally, these arguments may be followed by an `H` or
414`L`.  `H` indicates that transmitted characters will have their MSB
415set, while received characters must have their MSB set.  `L` indicates
416the converse; transmitted and received characters will have their MSB
417cleared.  This allows a single port to be shared by two subsystems
418(e.g. console and debugger).
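
The `H` and `L` sharing scheme can be pictured as tagging each byte with its
most significant bit.  The sketch below is a conceptual model, not Xen's
serial driver: the function names are hypothetical, and stripping the MSB
from accepted input is an assumption of the sketch.

```python
MSB = 0x80


def encode_tx(byte, mode=None):
    """Apply the H/L convention to a byte about to be transmitted."""
    if mode == "H":
        return byte | MSB    # H: MSB set on transmit
    if mode == "L":
        return byte & 0x7F   # L: MSB cleared on transmit
    return byte


def accept_rx(byte, mode=None):
    """Return the 7-bit character if the byte matches the mode, else None."""
    if mode is None:
        return byte
    msb_set = bool(byte & MSB)
    if msb_set != (mode == "H"):
        return None          # byte belongs to the other subsystem
    return byte & 0x7F
```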
419
420`pv` indicates that Xen should use Xen's PV console. This option is
421only available when used together with `pv-in-pvh`.
422
423`dbgp` indicates that Xen should use a USB debug port.
424
425`none` indicates that Xen should not use a console.  This option only
426makes sense on its own.
427
428### console_timestamps
429> `= none | date | datems | boot | raw`
430
431> Default: `none`
432
433> Can be modified at runtime
434
435Specify which timestamp format Xen should use for each console line.
436
437* `none`: No timestamps
438* `date`: Date and time information
439    * `[YYYY-MM-DD HH:MM:SS]`
440* `datems`: Date and time, with milliseconds
441    * `[YYYY-MM-DD HH:MM:SS.mmm]`
442* `boot`: Seconds and microseconds since boot
443    * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
445    * `[XXXXXXXXXXXXXXXX]`
446
447For compatibility with the older boolean parameter, specifying
448`console_timestamps` alone will enable the `date` option.
449
450### console_to_ring
451> `= <boolean>`
452
453> Default: `false`
454
455Flag to indicate whether all guest console output should be copied
456into the console ring buffer.
457
458### conswitch
459> `= <switch char>[x]`
460
461> Default: `conswitch=a`
462
463> Can be modified at runtime
464
465Specify which character should be used to switch serial input between
466Xen and dom0.  The required sequence is CTRL-&lt;switch char&gt; three
467times.
468
469The optional trailing `x` indicates that Xen should not automatically
470switch the console input to dom0 during boot.  Any other value,
471including omission, causes Xen to automatically switch to the dom0
console during dom0 boot.  Use `conswitch=ax` to keep the default switch
character, but have Xen keep the console.
474
475### core_parking
476> `= power | performance`
477
478> Default: `power`
479
480### cpu_type (x86)
481> `= arch_perfmon`
482
483If set, force use of the performance counters for oprofile, rather than detecting
484available support.
485
486### cpufreq
487> `= none | {{ <boolean> | xen } [:[powersave|performance|ondemand|userspace][,<maxfreq>][,[<minfreq>][,[verbose]]]]} | dom0-kernel`
488
489> Default: `xen`
490
491Indicate where the responsibility for driving power states lies.  Note that the
492choice of `dom0-kernel` is deprecated and not supported by all Dom0 kernels.
493
494* Default governor policy is ondemand.
495* `<maxfreq>` and `<minfreq>` are integers which represent max and min processor frequencies
496  respectively.
497* `verbose` option can be included as a string or also as `verbose=<integer>`
498
499### cpuid (x86)
500> `= List of comma separated booleans`
501
502This option allows for fine tuning of the facilities Xen will use, after
503accounting for hardware capabilities as enumerated via CPUID.
504
Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s).  Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.
508
509Currently accepted:
510
511The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
512`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
513applicable.  They can all be ignored.
514
515`rdrand` and `rdseed` have multiple interactions.
516
517*   For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
518    RDRAND and RDSEED can be ignored.
519
520    Due to the absence of microcode to address SRBDS on IvyBridge client
521    hardware, the RDRAND feature is hidden by default for guests, unless
522    `rdrand` is used in its positive form.  Irrespective of the setting here,
523    VMs can use RDRAND if explicitly enabled in guest config file, and VMs
524    already using RDRAND can migrate in.
525
526*   The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
527    possible malfunctions after ACPI S3 suspend/resume.  `rdrand` may be used
528    in its positive form to override Xen's default behaviour on these systems,
529    and make the feature fully usable.
530
531### cpuid_mask_cpu
532> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`
533
534> Applicability: AMD
535
If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be the specified
family/revision.
539
540See below for general information on masking.
541
542**Warning: This option is not fully effective on Family 15h processors or
543later.**
544
545### cpuid_mask_ecx
546### cpuid_mask_edx
547### cpuid_mask_ext_ecx
548### cpuid_mask_ext_edx
549### cpuid_mask_l7s0_eax
550### cpuid_mask_l7s0_ebx
551### cpuid_mask_thermal_ecx
552### cpuid_mask_xsave_eax
553> `= <integer>`
554
555> Applicability: x86.  Default: `~0` (all bits set)
556
The availability of these options is model specific.  Some processors don't
558support any of them, and no processor supports all of them.  Xen will ignore
559options on processors which are lacking support.
560
561These options can be used to alter the features visible via the `CPUID`
562instruction.  Settings applied here take effect globally, including for Xen
563and all guests.
564
565Note: Since Xen 4.7, it is no longer necessary to mask a host to create
566migration safety in heterogeneous scenarios.  All necessary CPUID settings
567should be provided in the VM configuration file.  Furthermore, it is
568recommended not to use this option, as doing so causes an unnecessary
569reduction of features at Xen's disposal to manage guests.
570
571### cpuidle (x86)
572> `= <boolean>`
573
574### cpuinfo (x86)
575> `= <boolean>`
576
577### crashinfo_maxaddr
578> `= <size>`
579
580> Default: `4G`
581
582Specify the maximum address to allocate certain structures, if used in
583combination with the **low_crashinfo** command line option.
584
585### crashkernel
586> `= <ramsize-range>:<size>[,...][{@,<}<offset>]`
587> `= <size>[{@,<}<offset>]`
588> `= <size>,below=offset`
589
590Specify sizes and optionally placement of the crash kernel reservation
591area.  The `<ramsize-range>:<size>` pairs indicate how much memory to
592set aside for a crash kernel (`<size>`) for a given range of installed
593RAM (`<ramsize-range>`).  Each `<ramsize-range>` is of the form
594`<start>-[<end>]`.
595
596A trailing `@<offset>` specifies the exact address this area should be
597placed at, whereas `<` in place of `@` just specifies an upper bound of
598the address range the area should fall into.
599
`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` character.
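
As an illustration of the `<ramsize-range>:<size>` form, the sketch below
picks the reservation size for a given amount of installed RAM.  It is not
Xen's implementation.  The helper names are hypothetical, and the sketch
assumes that the first matching range wins, that `<end>` is exclusive, that
no reservation is made when nothing matches, and that any trailing offset has
already been removed from the string.

```python
UNIT_SHIFTS = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}


def size_to_bytes(text):
    """Decimal size with an optional suffix; a bare number is KiB."""
    suffix = text[-1:].lower()
    if suffix in UNIT_SHIFTS:
        return int(text[:-1]) << UNIT_SHIFTS[suffix]
    return int(text) << 10


def crash_area_size(spec, installed_ram):
    """Return the crash area size in bytes for the installed RAM."""
    for pair in spec.split(","):
        ram_range, size = pair.split(":")
        start, _, end = ram_range.partition("-")
        low = size_to_bytes(start)
        high = size_to_bytes(end) if end else None  # open-ended range
        if installed_ram >= low and (high is None or installed_ram < high):
            return size_to_bytes(size)
    return 0  # assumption: no matching range means no reservation
```

Under these assumptions, `512M-2G:64M,2G-:128M` reserves 64 MiB on a host
with 1 GiB of RAM and 128 MiB on a host with 8 GiB.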
602
603
604### credit2_balance_over
605> `= <integer>`
606
607### credit2_balance_under
608> `= <integer>`
609
610### credit2_cap_period_ms
611> `= <integer>`
612
613> Default: `10`
614
615Domains subject to a cap receive a replenishment of their runtime budget
616once every cap period interval. Default is 10 ms. The amount of budget
617they receive depends on their cap. For instance, a domain with a 50% cap
618will receive 50% of 10 ms, so 5 ms.
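
The arithmetic can be written out as below.  This is a sketch of the
calculation only; the function name `cap_budget_us` and the choice of
microseconds as the unit are illustrative and not taken from the scheduler
source.

```python
def cap_budget_us(cap_percent, period_ms=10):
    """Budget in microseconds replenished once per cap period."""
    return period_ms * 1000 * cap_percent // 100
```

A 50% cap with the default 10 ms period gives 5000 us, that is, 5 ms.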
619
620### credit2_load_precision_shift
621> `= <integer>`
622
623> Default: `18`
624
625Specify the number of bits to use for the fractional part of the
626values involved in Credit2 load tracking and load balancing math.
627
628### credit2_load_window_shift
629> `= <integer>`
630
631> Default: `30`
632
633Specify the number of bits to use to represent the length of the
634window (in nanoseconds) we use for load tracking inside Credit2.
635This means that, with the default value (30), we use
6362^30 nsec ~= 1 sec long window.
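
The relationship between the shift and the window length is a power of two.
The hypothetical helper below only restates that arithmetic.

```python
def load_window_ns(shift=30):
    """Length of the load tracking window in nanoseconds."""
    return 1 << shift


# Window length in seconds for the default shift of 30.
default_window_seconds = load_window_ns() / 1e9
```

The default shift gives 1073741824 ns, slightly over one second, and each
decrement of the shift halves the window.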
637
638Load tracking is done by means of a variation of exponentially
639weighted moving average (EWMA). The window length defined here
640is what tells for how long we give value to previous history
641of the load itself. In fact, after a full window has passed,
642what happens is that we discard all previous history entirely.
643
644A short window will make the load balancer quick at reacting
645to load changes, but also short-sighted about previous history
646(and hence, e.g., long term load trends). A long window will
647make the load balancer thoughtful of previous history (and
648hence capable of capturing, e.g., long term load trends), but
649also slow in responding to load changes.
650
651The default value of `1 sec` is rather long.
652
653### credit2_runqueue
654> `= cpu | core | socket | node | all`
655
656> Default: `socket`
657
Specify how host CPUs are arranged in runqueues. Runqueues are kept
balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
balancing (for instance, it will deal better with hyperthreading),
but also more overhead.
663
Available alternatives, with their meaning, are:

* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
            but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
         the host.
672
673### dbgp
674> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`
675
676Specify the USB controller to use, either by instance number (when going
677over the PCI busses sequentially) or by PCI device (must be on segment 0).
678
679### debug_stack_lines
680> `= <integer>`
681
682> Default: `20`
683
Limits the number of lines printed in Xen stack traces.
685
686### debugtrace
687> `= [cpu:]<size>`
688
689> Default: `128`
690
Specify the size of the console debug trace buffer. If `cpu:` is additionally
specified, a trace buffer of the specified size is allocated per CPU.
The debug trace feature is only enabled in debugging builds of Xen.
694
695### dma_bits
696> `= <integer>`
697
698Specify the bit width of the DMA heap.
699
700### dom0
701    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
702                cpuid-faulting=<bool> ]
703
704    Applicability: x86
705
706Controls for how dom0 is constructed on x86 systems.
707
708*   The `pv` and `pvh` options select the virtualisation mode of dom0.
709
710    The `pv` option is only available when `CONFIG_PV` is compiled in.  The
711    `pvh` option is only available when `CONFIG_HVM` is compiled in.  When
712    both options are compiled in, the default is PV.
713
714    In addition, the following requirements must be met:
715
716    *   The dom0 kernel selected by the boot loader must be capable of the
717        selected mode.
718    *   For a PVH dom0, the hardware must have VT-x/SVM extensions available.
719
720*   The `shadow` boolean allows dom0 to be explicitly constructed using shadow
721    paging.  This option is unavailable when `CONFIG_SHADOW_PAGING` is
722    disabled.
723
724    For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
725    shadow paging otherwise.  A PVH dom0 cannot be used if Xen is compiled
726    without shadow paging support, and the hardware lacks HAP support.
727
    For PV, the use of dom0 shadow mode is only for development purposes.  PV
    guests do not require any paging support by default.
730
731*   The `verbose` boolean is intended for diagnostics, and prints out extra
732    information during the dom0 build.  It defaults to the compile time choice
733    of `CONFIG_VERBOSE_DEBUG`.
734
735*   The `cpuid-faulting` boolean is an interim option, is only applicable to
736    PV dom0, and defaults to true.
737
738    Before Xen 4.13, the domain builder logic for guest construction depended
739    on seeing host CPUID values to function correctly.  As a result, CPUID
740    Faulting was never activated for PV dom0's, even on capable hardware.
741
742    In Xen 4.13, the domain builder logic has been fixed, and no longer has
743    this dependency.  As a consequence, CPUID Faulting is activated by default
744    even for PV dom0's.
745
746    However, as PV dom0's have always seen host CPUID data in the past, there
747    is a chance that further dependencies exist.  This boolean can be used to
748    restore the pre-4.13 behaviour.  If specifying `no-cpuid-faulting` fixes
749    an issue in dom0, please report a bug.
750
751### dom0-iommu
752    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
753                map-reserved=<bool>, none ]
754
755Controls for the dom0 IOMMU setup.
756
757*   The `passthrough` boolean controls whether IOMMU translation functionality
758    is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
759    used to ensure that dom0 can only DMA to its permitted areas of RAM
760    (`passthrough=0`).
761
762    This option is only applicable to x86 PV dom0's, and defaults to false.
763
764    Some older Intel VT-d hardware isn't capable of disabling translation
765    functionality on a per-device basis, and will cause this option to be
766    ignored and assumed to be 0.  Similar behaviour on such systems is only
767    available by fully disabling all IOMMUs.
768
769    This option is hardwired to false for x86 PVH dom0's (where a non-identity
770    transform is required for dom0 to function), and is ignored for ARM.
771
772*   The `strict` boolean is applicable to x86 PV dom0's only and defaults to
773    false.  It controls whether dom0 can have IOMMU mappings for all domain
774    RAM in the system, or only for its allocated RAM (and grant mappings etc.)
775
    This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
    other domains in the system doesn't live in a compatible address space), and
    is ignored for ARM.
779
780*   The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
781    identity IOMMU mappings for all non-RAM regions below 4GB except for
782    unusable ranges, and ranges belonging to Xen.
783
784    Typically, some devices in a system use bits of RAM for communication, and
785    these areas should be listed as reserved in the E820 table and identified
    via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
787    are identity-mapped in the IOMMU.  However, some firmware makes mistakes,
788    and this option is a coarse-grain workaround for those errors.
789
790    Where possible, finer grain corrections should be made with the `rmrr=`,
791    `ivrs_hpet=` or `ivrs_ioapic=` command line options.
792
793    This option is disabled by default, and deprecated and intended for
794    removal in future versions of Xen.  If specifying `map-inclusive` is the
795    only way to make your system boot, please report a bug.
796
797*   The `map-reserved` functionality is very similar to `map-inclusive`.
798
799    The differences from `map-inclusive` are that `map-reserved` is applicable
800    to both x86 PV and PVH dom0's, is enabled by default, and represents a
801    subset of the correction by only mapping reserved memory regions rather
802    than all non-RAM regions.
803
804*   The `none` option is intended for development purposes only, and skips
805    certain safety checks pertaining to the correct IOMMU configuration for
806    dom0 to boot.
807
808    Incorrect use of this option may result in a malfunctioning system.
809
810### dom0_ioports_disable (x86)
811> `= List of <hex>-<hex>`
812
813Specify a list of IO ports to be excluded from dom0 access.
814
815### dom0_max_vcpus
816
817Either:
818
819> `= <integer>`.
820
821The number of VCPUs to give to dom0.  This number of VCPUs can be more
822than the number of PCPUs on the host.  The default is the number of
823PCPUs.
824
825Or:
826
827> `= <min>-<max>` where `<min>` and `<max>` are integers.
828
829Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
830at least `<min>` and no more than `<max>`.  Using `<min>` may give
831more VCPUs than PCPUs.  `<min>` or `<max>` may be omitted and the
832defaults of 1 and unlimited respectively are used instead.
833
834For example, with `dom0_max_vcpus=4-8`:
835
836>        Number of
837>     PCPUs | Dom0 VCPUs
838>      2    |  4
839>      4    |  4
840>      6    |  6
841>      8    |  8
842>     10    |  8
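
The table above follows from clamping the number of PCPUs to the
`<min>-<max>` range.  A minimal sketch of that rule, with the hypothetical
helper name `dom0_vcpus`:

```python
def dom0_vcpus(pcpus, min_vcpus=1, max_vcpus=None):
    """Number of dom0 VCPUs for a <min>-<max> specification."""
    vcpus = max(pcpus, min_vcpus)        # never fewer than <min>
    if max_vcpus is not None:
        vcpus = min(vcpus, max_vcpus)    # never more than <max>
    return vcpus
```

Evaluating `dom0_vcpus(n, 4, 8)` for 2, 4, 6, 8 and 10 PCPUs reproduces the
rows of the table.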
843
844### dom0_mem (ARM)
845> `= <size>`
846
847Set the amount of memory for the initial domain (dom0). It must be
848greater than zero. This parameter is required.
849
850### dom0_mem (x86)
851> `= List of ( min:<sz> | max:<sz> | <sz> )`
852
853Set the amount of memory for the initial domain (dom0). If a size is
854positive, it represents an absolute value.  If a size is negative, it
855is subtracted from the total available memory.
856
857* `<sz>` specifies the exact amount of memory.
858* `min:<sz>` specifies the minimum amount of memory.
859* `max:<sz>` specifies the maximum amount of memory.
860
861If `<sz>` is not specified, the default is all the available memory
862minus some reserve.  The reserve is 1/16 of the available memory or
863128 MB (whichever is smaller).
864
865The amount of memory will be at least the minimum but never more than
866the maximum (i.e., `max` overrides the `min` option).  If there isn't
867enough memory then as much as possible is allocated.
868
869`max:<sz>` also sets the maximum reservation (the maximum amount of
870memory dom0 can balloon up to).  If this is omitted then the maximum
871reservation is unlimited.
872
873For example, to set dom0's initial memory allocation to 512MB but
874allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`
875
876> `<sz>` is: `<size> | [<size>+]<frac>%`
877> `<frac>` is an integer < 100
878
879* `<frac>` specifies a fraction of host memory size in percent.
880
881So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.
882
883If you use this option then it is highly recommended that you disable
884any dom0 autoballooning feature present in your toolstack. See the
885_xl.conf(5)_ man page or [Xen Best
886Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).
887
This option has no effect if pv-shim mode is enabled.
889
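The `<sz>` grammar and the default reserve described for `dom0_mem` (x86) can be checked with a small Python sketch. It is an illustration under stated assumptions, not Xen's parser: the helper names are hypothetical, only non-negative decimal sizes are handled, "128 MB" is taken to mean 128 MiB, and the percentage is applied to a host memory size given in bytes.

```python
UNITS = {"b": 1, "k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}
MIB = 2**20
GIB = 2**30


def parse_size(text: str) -> int:
    """Parse a decimal <size>; a missing suffix means KiB."""
    suffix = text[-1].lower()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text) * UNITS["k"]


def parse_sz(text: str, host_bytes: int) -> int:
    """Evaluate <size> | [<size>+]<frac>% against the host memory size."""
    if not text.endswith("%"):
        return parse_size(text)
    base, _, frac = text[:-1].rpartition("+")
    percent = int(frac)
    if not 0 <= percent < 100:
        raise ValueError("<frac> must be an integer below 100")
    fixed = parse_size(base) if base else 0
    return fixed + host_bytes * percent // 100


def default_dom0_mem(avail_bytes: int) -> int:
    """All available memory minus the smaller of 1/16 of it and 128 MiB."""
    return avail_bytes - min(avail_bytes // 16, 128 * MIB)


print(parse_sz("1G+25%", 256 * GIB) // GIB)  # 65
print(default_dom0_mem(16 * GIB) // MIB)     # 16256
```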
890### dom0_nodes (x86)
891
892> `= List of [ <integer> | relaxed | strict ]`
893
894> Default: `strict`
895
Specify the NUMA nodes to place Dom0 on. Defaults for vCPUs created
and memory assigned to Dom0 will be adjusted to match the node
restrictions set up here. Note that the values to be specified here are
ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
affinities to prefer, but not be limited to, the specified node(s).
901
902### dom0_vcpus_pin
903> `= <boolean>`
904
905> Default: `false`
906
Pin dom0 vcpus to their respective pcpus.
908
909### dtuart (ARM)
910> `= path [:options]`
911
912> Default: `""`
913
914Specify the full path in the device tree for the UART.  If the path doesn't
915start with `/`, it is assumed to be an alias.  The options are device specific.
916
917### e820-mtrr-clip (x86)
918> `= <boolean>`
919
920Flag that specifies if RAM should be clipped to the highest cacheable
921MTRR.
922
923> Default: `true` on Intel CPUs, otherwise `false`
924
925### e820-verbose (x86)
926> `= <boolean>`
927
928> Default: `false`
929
930Flag that enables verbose output when processing e820 information and
931applying clipping.
932
933### edd (x86)
934> `= off | on | skipmbr`
935
Control retrieval of Enhanced Disk Drive (EDD) data from the BIOS during
boot.
938
939### edid (x86)
940> `= no | force`
941
942Either force retrieval of monitor EDID information via VESA DDC, or
943disable it (edid=no). This option should not normally be required
944except for debugging purposes.
945
946### efi
947    = List of [ rs=<bool>, attr=no|uc ]
948
Controls for interacting with the system Extensible Firmware Interface.
950
951*   The `rs` boolean controls whether Runtime Services are used.  By default,
952    Xen uses Runtime Services itself, and proxies certain calls on behalf of
953    dom0.  Selecting `rs=0` prohibits all use of Runtime Services.
954
955*   The `attr=` string exists to specify what to do with memory regions of
956    unknown/unrecognised cacheability.  `attr=no` is the default and will
957    leave the memory regions unmapped, while `attr=uc` will map them as fully
958    uncacheable.
959
960### ept
961> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`
962
963> Applicability: Intel
964
965Extended Page Tables are a feature of Intel's VT-x technology, whereby
966hardware manages the virtualisation of HVM guest pagetables.  EPT was
967introduced with the Nehalem architecture.
968
969*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
970    EPT pagetables, and was first introduced in Broadwell Server.
971
972    By default, Xen will use A/D tracking when available in hardware, except
973    on Avoton processors affected by erratum AVR41.  Explicitly choosing
974    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
975    choosing `ad=1` will cause tracking to be used even on AVR41-affected
976    hardware.
977
*   The `pml` boolean controls the use of Page Modification Logging, which
    was also introduced in Broadwell Server.
980
981    PML is a feature whereby the processor generates a list of pages which
982    have been dirtied.  This is necessary information for operations such as
983    live migration, and having the processor maintain the list of dirtied
984    pages is more efficient than traditional software implementations where
985    all guest writes trap into Xen so the dirty bitmap can be maintained.
986
987    By default, Xen will use PML when it is available in hardware.  PML
988    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
989    disable PML.  `pml=0` can be used to prevent the use of PML on otherwise
990    capable hardware.
991
992*   The `exec-sp` boolean controls whether EPT superpages with execute
993    permissions are permitted.  In general this is good for performance.
994
    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels can
996    use executable superpages to crash the host.  By default, executable
997    superpages are disabled on affected hardware.
998
    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.
1001
1002    This boolean may be modified at runtime using `xl set-parameters
1003    ept=[no-]exec-sp` to switch between fast and secure.
1004
1005    *   When switching from secure to fast, preexisting HVM domains will run
1006        at their current performance until they are rebooted; new domains will
1007        run without any overhead.
1008
1009    *   When switching from fast to secure, all HVM domains will immediately
1010        suffer a performance penalty.
1011
1012    **Warning: No guarantee is made that this runtime option will be retained
1013      indefinitely, or that it will retain this exact behaviour.  It is
1014      intended as an emergency option for people who first chose fast, then
1015      change their minds to secure, and wish not to reboot.**
1016
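The interaction between `ad` and `pml` described above can be summarised as a small decision function. This Python sketch only restates the documented rules; the function name, the `hw_ad`/`hw_pml`/`avr41` inputs, and the use of `None` for "not specified" are hypothetical and do not mirror Xen's internals.

```python
from typing import Optional, Tuple


def effective_ept(
    ad: Optional[bool] = None,
    pml: Optional[bool] = None,
    *,
    hw_ad: bool,
    hw_pml: bool,
    avr41: bool = False,
) -> Tuple[bool, bool]:
    """Return (use_ad, use_pml); None models an option left unspecified."""
    if ad is None:
        use_ad = hw_ad and not avr41  # default: skip AVR41-affected parts
    else:
        use_ad = ad and hw_ad         # explicit ad=1 overrides the erratum check
    # PML functionally depends on A/D tracking.
    use_pml = use_ad and hw_pml and pml is not False
    return use_ad, use_pml


print(effective_ept(hw_ad=True, hw_pml=True))             # (True, True)
print(effective_ept(ad=False, hw_ad=True, hw_pml=True))   # (False, False)
print(effective_ept(pml=False, hw_ad=True, hw_pml=True))  # (True, False)
```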
1017### extra_guest_irqs
1018> `= [<domU number>][,<dom0 number>]`
1019
1020> Default: `32,<variable>`
1021
1022Change the number of PIRQs available for guests.  The optional first number is
1023common for all domUs, while the optional second number (preceded by a comma)
1024is for dom0.  Changing the setting for domU has no impact on dom0 and vice
versa.  For example, to change dom0 without changing domU, use
1026`extra_guest_irqs=,512`.  The default value for Dom0 and an eventual separate
1027hardware domain is architecture dependent.
1028Note that specifying zero as domU value means zero, while for dom0 it means
1029to use the default.
1030
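How the two optional fields of `extra_guest_irqs` combine can be illustrated with a short Python sketch. It is not Xen's parser; the function name is hypothetical and `None` is used here to mean "keep the default".

```python
from typing import Optional, Tuple


def parse_extra_guest_irqs(value: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (domU, dom0); None means "not given, keep the default"."""
    domu_text, _, dom0_text = value.partition(",")
    domu = int(domu_text) if domu_text else None
    dom0 = int(dom0_text) if dom0_text else None
    if dom0 == 0:
        dom0 = None  # zero for dom0 means "use the default"
    return domu, dom0


print(parse_extra_guest_irqs(",512"))  # (None, 512)
print(parse_extra_guest_irqs("64"))    # (64, None)
print(parse_extra_guest_irqs("0,0"))   # (0, None)
```

The last call shows the asymmetry noted above: zero is a real value for domUs but selects the default for dom0.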
1031### flask
1032> `= permissive | enforcing | late | disabled`
1033
1034> Default: `enforcing`
1035
1036Specify how the FLASK security server should be configured.  This option is only
1037available if the hypervisor was compiled with FLASK support.  This can be
1038enabled by running either:
- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support' and 'Xen Security Modules support'.
1041
1042* `permissive`: This is intended for development and is not suitable for use
1043  with untrusted guests.  If a policy is provided by the bootloader, it will be
1044  loaded; errors will be reported to the ring buffer but will not prevent
1045  booting.  The policy can be changed to enforcing mode using "xl setenforce".
1046* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0.  If a valid policy is not provided by the
1048  bootloader and no built-in policy is present, the hypervisor will not continue
1049  booting.
1050* `late`: This disables loading of the built-in security policy or the policy
1051  provided by the bootloader.  FLASK will be enabled but will not enforce access
1052  controls until a policy is loaded by a domain using "xl loadpolicy".  Once a
1053  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
1054  changed that setting.
1055* `disabled`: This causes the XSM framework to revert to the dummy module.  The
1056  dummy module provides the same security policy as is used when compiling the
1057  hypervisor without support for XSM.  The xsm_op hypercall can also be used to
1058  switch to this mode after boot, but there is no way to re-enable FLASK once
1059  the dummy module is loaded.
1060
1061### font
1062> `= <height>` where height is `8x8 | 8x14 | 8x16`
1063
1064Specify the font size when using the VESA console driver.
1065
1066### force-ept (Intel)
1067> `= <boolean>`
1068
1069> Default: `false`
1070
1071Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
1072present.
1073
1074*Warning:*
1075Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
1076required as a prerequisite for using EPT.  If you are not using PCI Passthrough,
1077or trust the guest administrator who would be using passthrough, then the
1078requirement can be relaxed.  This option is particularly useful for nested
1079virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
1080does not provide `VM_ENTRY_LOAD_GUEST_PAT`.
1081
1082### gdb
1083> `= com1[H,L] | com2[H,L] | dbgp`
1084
1085> Default: ``
1086
1087Specify which console gdbstub should use. See **console**.
1088
1089### gnttab
1090> `= List of [ max-ver:<integer>, transitive=<bool> ]`
1091
1092> Default: `gnttab=max-ver:2,transitive`
1093
1094Control various aspects of the grant table behaviour available to guests.
1095
* `max-ver` Select the maximum grant table version to offer to guests.  Valid
versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants.  Note that the
use of grant table v2 without transitive grants is an ABI breakage from the
guest's point of view.
1101
1102The usage of gnttab v2 is not security supported on ARM platforms.
1103
1104### gnttab_max_frames
1105> `= <integer>`
1106
1107> Default: `64`
1108
1109> Can be modified at runtime
1110
1111Specify the maximum number of frames which any domain may use as part
1112of its grant table. This value is an upper boundary of the per-domain
1113value settable via Xen tools.
1114
Dom0 uses this value to size its grant table.
1116
1117### gnttab_max_maptrack_frames
1118> `= <integer>`
1119
1120> Default: `1024`
1121
1122> Can be modified at runtime
1123
Specify the maximum number of frames to use as part of a domain's
1125maptrack array. This value is an upper boundary of the per-domain
1126value settable via Xen tools.
1127
Dom0 uses this value to size its maptrack table.
1129
1130### global-pages
1131    = <boolean>
1132
1133    Applicability: x86
1134    Default: true unless running virtualized on AMD or Hygon hardware
1135
1136Control whether to use global pages for PV guests, and thus the need to
1137perform TLB flushes by writing to CR4.  This is a performance trade-off.
1138
AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, a cost which massively
outweighs the benefit of using global pages to begin with.  This case is easy
for Xen to spot, and is accounted for in the default setting.
1143
Other cases where this option might be a benefit are on VT-x hardware when
1145selective CR4 writes are not supported/enabled by the hypervisor, or in any
1146virtualised case using shadow paging.  These are not easy for Xen to spot, so
1147are not accounted for in the default setting.
1148
1149### guest_loglvl
1150> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`
1151
1152> Default: `guest_loglvl=none/warning`
1153
1154> Can be modified at runtime
1155
Set the logging level for Xen guests.  Any log message with equal or
more importance will be printed.
1158
1159The optional `<rate-limited level>` option instructs which severities
1160should be rate limited.
1161
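The `<level>[/<rate-limited level>]` syntax and the "equal or more importance" rule can be sketched as follows. This Python fragment is illustrative only: the function names are hypothetical, and it assumes that importance decreases from `error` to `debug` in the order the levels are listed, with `none` and `all` as the two extremes.

```python
from typing import Optional, Tuple

# Thresholds in the order listed for this option.
LEVELS = ("none", "error", "warning", "info", "debug", "all")


def parse_loglvl(value: str) -> Tuple[str, Optional[str]]:
    """Split <level>[/<rate-limited level>] and validate both parts."""
    level, _, limited = value.partition("/")
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if limited and limited not in LEVELS:
        raise ValueError(f"unknown rate-limited level: {limited!r}")
    return level, limited or None


def is_printed(message_level: str, threshold: str) -> bool:
    """A message is printed when it is at least as important as the threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(threshold)


print(parse_loglvl("none/warning"))    # ('none', 'warning')
print(is_printed("error", "warning"))  # True
print(is_printed("info", "warning"))   # False
```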
1162### hap (x86)
1163> `= <boolean>`
1164
1165> Default: `true`
1166
Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).
1169
1170### hap_1gb (x86)
1171> `= <boolean>`
1172
1173> Default: `true`
1174
1175Flag to enable 1 GB host page table support for Hardware Assisted
1176Paging (HAP).
1177
1178### hap_2mb (x86)
1179> `= <boolean>`
1180
1181> Default: `true`
1182
1183Flag to enable 2 MB host page table support for Hardware Assisted
1184Paging (HAP).
1185
1186### hardware_dom
1187> `= <domid>`
1188
1189> Default: `0`
1190
1191Enable late hardware domain creation using the specified domain ID.  This is
1192intended to be used when domain 0 is a stub domain which builds a disaggregated
1193system including a hardware domain with the specified domain ID.  This option is
1194supported only when compiled with XSM on x86.
1195
1196### hest_disable
> `= <boolean>`
1198
1199> Default: `false`
1200
Control Xen's use of the APEI Hardware Error Source Table, should one be found.
1202
1203### highmem-start (x86)
1204> `= <size>`
1205
1206Specify the memory boundary past which memory will be treated as highmem (x86
1207debug hypervisor only).
1208
1209### hmp-unsafe (arm)
1210> `= <boolean>`
1211
> Default: `false`
1213
Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE). This may result in an unstable and insecure
1216platform, unless you manually specify the cpu affinity of all domains so
1217that all vcpus are scheduled on the same class of pcpus (big or LITTLE
1218but not both). vcpu migration between big cores and LITTLE cores is not
1219supported. See docs/misc/arm/big.LITTLE.txt for more information.
1220
1221When the hmp-unsafe option is disabled (default), CPUs that are not
1222identical to the boot CPU will be parked and not used by Xen.
1223
1224### hpetbroadcast (x86)
1225> `= <boolean>`
1226
1227### hvm_debug (x86)
1228> `= <integer>`
1229
1230The specified value is a bit mask with the individual bits having the
1231following meaning:
1232
1233>     Bit  0 - debug level 0 (unused at present)
1234>     Bit  1 - debug level 1 (Control Register logging)
1235>     Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
1236>     Bit  3 - debug level 3 (unused at present)
1237>     Bit  4 - I/O operation logging
1238>     Bit  5 - vMMU logging
1239>     Bit  6 - vLAPIC general logging
1240>     Bit  7 - vLAPIC timer logging
1241>     Bit  8 - vLAPIC interrupt logging
1242>     Bit  9 - vIOAPIC logging
1243>     Bit 10 - hypercall logging
1244>     Bit 11 - MSR operation logging
1245
1246Recognized in debug builds of the hypervisor only.
1247
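Because `hvm_debug` is a bit mask, the value to pass is the bitwise OR of `1 << bit` for each wanted category. The Python sketch below builds such a mask from the bit table above; the short category names are hypothetical labels chosen for this example.

```python
# Bit positions taken from the table above; the keys are illustrative names.
HVM_DEBUG_BITS = {
    "cr": 1,
    "msr_restore": 2,
    "io": 4,
    "vmmu": 5,
    "vlapic": 6,
    "vlapic_timer": 7,
    "vlapic_interrupt": 8,
    "vioapic": 9,
    "hypercall": 10,
    "msr": 11,
}


def hvm_debug_mask(*names: str) -> int:
    """OR together the bits for the requested logging categories."""
    mask = 0
    for name in names:
        mask |= 1 << HVM_DEBUG_BITS[name]
    return mask


print(hex(hvm_debug_mask("io", "hypercall")))  # 0x410
```

Since integer parameters accept a `0x` prefix, the result could be passed as `hvm_debug=0x410`.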
1248### hvm_fep (x86)
1249> `= <boolean>`
1250
1251> Default: `false`
1252
1253Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
1254arbitrary instructions.
1255
1256This option is intended for development and testing purposes.
1257
*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, do not use it in production systems. No
security support is provided when this flag is set.
1262
1263### hvm_port80 (x86)
1264> `= <boolean>`
1265
1266> Default: `true`
1267
1268Specify whether guests are to be given access to physical port 80
1269(often used for debugging purposes), to override the DMI based
1270detection of systems known to misbehave upon accesses to that port.
1271
1272### idle_latency_factor (x86)
1273> `= <integer>`
1274
1275### ioapic_ack (x86)
1276> `= old | new`
1277
1278> Default: `new` unless directed-EOI is supported
1279
1280### iommu
1281    = List of [ <bool>, verbose, debug, force, required, quarantine,
1282                sharept, intremap, intpost, crash-disable,
1283                snoop, qinval, igfx, amd-iommu-perdev-intremap,
1284                dom0-{passthrough,strict} ]
1285
1286    All sub-options are boolean in nature.
1287
I/O Memory Management Units perform a function similar to the CPU MMU (hence the
1289name), but typically exist as a discrete device, integrated as part of a PCI
1290Root Complex.  The most common configuration is to have one IOMMU per package
1291(for on-die PCIe devices and directly attached PCIe lanes), and one IOMMU
1292covering the remaining I/O in the system.
1293
1294The functionality in an IOMMU commonly falls into two orthogonal categories:
1295
1.  DMA Remapping, which uses a pagetable-like hierarchical structure and maps
    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
    terminology).
1300
13012.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
1302    requests, including their routing to specific CPUs.
1303
1304IOMMU functionality can be used to provide a translation which the hardware
1305device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
1306the guest) and/or to enforce fine-grained control over the memory and
1307interrupts which a device is attempting to access.
1308
1309By default, IOMMUs are configured for use if they are available.  An overall
1310boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.
1311
1312*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information.  Neither is active by default.
1314
1315*   The `force` and `required` booleans are synonymous and, when requested,
1316    will prevent Xen from booting if IOMMUs aren't discovered and enabled
1317    successfully.
1318
1319*   The `quarantine` boolean can be used to control Xen's behavior when
1320    de-assigning devices from guests.  If enabled (the default), Xen always
1321    quarantines such devices; they must be explicitly assigned back to Dom0
1322    before they can be used there again.  If disabled, Xen will only
    quarantine devices the toolstack has arranged for getting quarantined.
1324
1325*   The `sharept` boolean controls whether the IOMMU pagetables are shared
1326    with the CPU-side HAP pagetables, or allocated separately.  Sharing
1327    reduces the memory overhead, but doesn't work in combination with CPU-side
1328    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
1329    assigned.
1330
1331    Due to implementation choices, sharing pagetables doesn't work on AMD
1332    hardware, and this option is ignored.  It is enabled by default on Intel
1333    systems.
1334
1335    This option is ignored on ARM, and the pagetables are always shared.
1336
1337*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
1338    is active by default on compatible hardware.  On x86 systems, the first
1339    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
1340    appeared in the second generation.
1341
1342    This option is only valid on x86.
1343
1344*   The `intpost` boolean controls the Posted Interrupt sub-feature.  In
1345    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
1346    be configured to deliver interrupts from assigned PCI devices directly
1347    into the guest, without trapping out into hypervisor context.
1348
1349    This option depends on `intremap`, and is disabled by default due to some
1350    corner cases in the implementation which have yet to be resolved.
1351
    This option is only valid on x86, and only in builds of Xen with HVM support.
1353
1354*   The `crash-disable` boolean controls disabling IOMMU functionality (DMAR/IR/QI)
1355    before switching to a crash kernel. This option is inactive by default and
1356    is for compatibility with older kdump kernels only. Modern kernels copy
1357    all the necessary tables from the previous one following kexec which makes
1358    the transition transparent for them with IOMMU functions still on.
1359
1360The following options are specific to Intel VT-d hardware:
1361
1362*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
1363    by default on compatible hardware.
1364
1365    An incoming DMA request may specify _Snooped_ (query the CPU caches for
1366    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
1367    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
1368    hypervisor activity can invalidate the expectations of the device driver,
1369    and Snoop Control allows the hypervisor to force DMA requests to be
1370    _Snooped_ when they would otherwise not be.
1371
1372*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
1373    active by default on compatible hardware.  Queued Invalidation is a
1374    feature in second-generation IOMMUs and is a functional prerequisite for
1375    Interrupt Remapping.
1376
1377*   The `igfx` boolean is active by default, and controls whether the IOMMU in
1378    front of an Intel Graphics Device is enabled or not.
1379
1380    It is intended as a debugging mechanism for graphics issues, and to be
1381    similar to Linux's `intel_iommu=igfx_off` option.  If specifying `no-igfx`
1382    fixes anything, please report the problem.
1383
1384The following options are specific to AMD-Vi hardware:
1385
1386*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
1387    remapping table is per device (the default), or a single global table for
1388    the entire system.
1389
1390    Using a global table is not security supported as it allows all devices to
    impersonate each other as far as interrupts are concerned (see XSA-36), but
1392    it is a workaround for SP5100 Erratum 28.
1393
1394**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
1395deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
1396using both the old and new command line options in combination is undefined.**
1397
1398### iommu_dev_iotlb_timeout
1399> `= <integer>`
1400
1401> Default: `1000`
1402
1403Specify the timeout of the device IOTLB invalidation in milliseconds.
1404By default, the timeout is 1000 ms. When you see error 'Queue invalidate
1405wait descriptor timed out', try increasing this value.
1406
1407### iommu_inclusive_mapping
1408> `= <boolean>`
1409
1410**WARNING: This command line option is deprecated, and superseded by
1411_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**
1412
1413### irq_ratelimit (x86)
1414> `= <integer>`
1415
### irq_vector_map (x86)

### ivrs_hpet[`<hpet>`] (AMD)
1418> `=[<seg>:]<bus>:<device>.<func>`
1419
1420Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
1421`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
1422ACPI table.
1423
1424### ivrs_ioapic[`<ioapic>`] (AMD)
1425> `=[<seg>:]<bus>:<device>.<func>`
1426
1427Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
1428`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
1429ACPI table.
1430
1431### lapic (x86)
1432> `= <boolean>`
1433
Force the use of the local APIC on a uniprocessor system, even
1435if left disabled by the BIOS.
1436
1437### lapic_timer_c2_ok (x86)
1438> `= <boolean>`
1439
1440### ler (x86)
1441> `= <boolean>`
1442
> Default: `false`
1444
1445This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
1446in hypervisor context to be able to dump the Last Interrupt/Exception To/From
1447record with other registers.
1448
1449### loglvl
1450> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`
1451
1452> Default: `loglvl=warning`
1453
1454> Can be modified at runtime
1455
Set the logging level for Xen.  Any log message with equal or more
importance will be printed.
1458
1459The optional `<rate-limited level>` option instructs which severities
1460should be rate limited.
1461
1462### low_crashinfo
1463> `= none | min | all`
1464
> Default: `none` if not specified at all, or `min` if **low_crashinfo** is present without qualification.
1466
1467This option is only useful for hosts with a 32bit dom0 kernel, wishing
1468to use kexec functionality in the case of a crash.  It represents
1469which data structures should be deliberately allocated in low memory,
so the crash kernel may find them.  Should be used in combination
1471with **crashinfo_maxaddr**.
1472
1473### low_mem_virq_limit
1474> `= <size>`
1475
1476> Default: `64M`
1477
1478Specify the threshold below which Xen will inform dom0 that the quantity of
1479free memory is getting low.  Specifying `0` will disable this notification.
1480
1481### maxcpus (x86)
1482> `= <integer>`
1483
1484Specify the maximum number of CPUs that should be brought up.
1485
1486This option is ignored in **pv-shim** mode.
1487
1488### max_cstate (x86)
1489> `= <integer>[,<integer>]`
1490
1491Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used.  The latter only applies
1493to the highest permitted C-state.
1494
1495### max_gsi_irqs (x86)
1496> `= <integer>`
1497
Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
1499based interrupts. Any higher IRQs will be available for use via PCI MSI.
1500
1501### max_lpi_bits (arm)
1502> `= <integer>`
1503
1504Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
1505presented as the number of bits needed to encode it. This must be at least
150614 and not exceed 32, and each LPI requires one byte (configuration) and
1507one pending bit to be allocated.
1508Defaults to 20 bits (to cover at most 1048576 interrupts).
1509
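The relationship between `max_lpi_bits`, the number of LPIs and the memory they need can be worked through with a short Python sketch. It only applies the figures stated above (one configuration byte and one pending bit per LPI) and treats every encodable ID as an LPI, so the result is an upper bound; the function name is hypothetical and alignment or other hardware requirements are ignored.

```python
from typing import Tuple


def lpi_table_bytes(bits: int = 20) -> Tuple[int, int, int]:
    """Return (max LPIs, configuration bytes, pending-bitmap bytes)."""
    if not 14 <= bits <= 32:
        raise ValueError("max_lpi_bits must be between 14 and 32")
    lpis = 1 << bits
    config_bytes = lpis        # one byte of configuration per LPI
    pending_bytes = lpis // 8  # one pending bit per LPI
    return lpis, config_bytes, pending_bytes


print(lpi_table_bytes(20))  # (1048576, 1048576, 131072)
```

With the default of 20 bits this comes to roughly 1 MiB of configuration data plus 128 KiB of pending bits.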
1510### mce (x86)
1511> `= <integer>`
1512
1513### mce_fb (Intel)
1514> `= <integer>`
1515
1516### mce_verbosity (x86)
1517> `= verbose`
1518
1519Specify verbose machine check output.
1520
1521### mem (x86)
1522> `= <size>`
1523
1524Specify the maximum address of physical RAM.  Any RAM beyond this
1525limit is ignored by Xen.
1526
1527### memop-max-order
1528> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`
1529
1530> x86 default: `9,18,12,12`
1531> ARM default: `9,18,10,10`
1532
1533Change the maximum order permitted for allocation (or allocation-like)
1534requests issued by the various kinds of domains (in this order:
1535ordinary DomU, control domain, hardware domain, and - when supported
1536by the platform - DomU with pass-through device assigned).
1537
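To get a feel for what the `memop-max-order` values mean in bytes, the following Python sketch fills in omitted fields from the defaults and converts an order to a size. The helper names are hypothetical, and the 4 KiB page size is an assumption made for this example.

```python
from typing import Tuple

PAGE_SIZE = 4096  # assumption: 4 KiB pages
X86_DEFAULTS = (9, 18, 12, 12)  # domU, ctldom, hwdom, ptdom


def parse_memop_max_order(value: str, defaults=X86_DEFAULTS) -> Tuple[int, ...]:
    """Override only the fields that are present; empty fields keep defaults."""
    orders = list(defaults)
    fields = value.split(",")
    if len(fields) > len(orders):
        raise ValueError("too many fields")
    for index, field in enumerate(fields):
        if field:
            orders[index] = int(field)
    return tuple(orders)


def order_to_bytes(order: int) -> int:
    """An order-n request covers 2**n contiguous pages."""
    return PAGE_SIZE << order


print(parse_memop_max_order(",,10"))  # (9, 18, 10, 12)
print(order_to_bytes(9))              # 2097152
print(order_to_bytes(18))             # 1073741824
```

Under the 4 KiB assumption, the default domU order of 9 corresponds to 2 MiB and the control domain order of 18 to 1 GiB.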
1538### mmcfg (x86)
1539> `= <boolean>[,amd-fam10]`
1540
1541> Default: `1`
1542
1543Specify if the MMConfig space should be enabled.
1544
1545### mmio-relax (x86)
1546> `= <boolean> | all`
1547
1548> Default: `false`
1549
1550By default, domains may not create cached mappings to MMIO regions.
1551This option relaxes the check for Domain 0 (or when using `all`, all PV
1552domains), to permit the use of cacheable MMIO mappings.
1553
1554### msi (x86)
1555> `= <boolean>`
1556
1557> Default: `true`
1558
1559Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.
1560
1561### mtrr.show (x86)
1562> `= <boolean>`
1563
1564> Default: `false`
1565
1566Print boot time MTRR state.
1567
1568### mwait-idle (x86)
1569> `= <boolean>`
1570
1571> Default: `true`
1572
1573Use the MWAIT idle driver (with model specific C-state knowledge) instead
1574of the ACPI based one.
1575
1576### nmi (x86)
1577> `= ignore | dom0 | fatal`
1578
1579> Default: `fatal` for a debug build, or `dom0` for a non-debug build
1580
1581Specify what Xen should do in the event of an NMI parity or I/O error.
1582`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while `fatal` causes Xen to print diagnostics and then hang.
1584
1585### noapic (x86)
1586
1587Instruct Xen to ignore any IOAPICs that are present in the system, and
1588instead continue to use the legacy PIC. This is _not_ recommended with
1589pvops type kernels.
1590
1591Because responsibility for APIC setup is shared between Xen and the
1592domain 0 kernel this option is automatically propagated to the domain
15930 command line.
1594
1595### invpcid (x86)
1596> `= <boolean>`
1597
1598> Default: `true`
1599
1600By default, Xen will use the INVPCID instruction for TLB management if
1601it is available.  This option can be used to cause Xen to fall back to
1602older mechanisms, which are generally slower.
1603
1604### noirqbalance (x86)
1605> `= <boolean>`
1606
1607Disable software IRQ balancing and affinity. This can be used on
1608systems such as Dell 1850/2850 that have workarounds in hardware for
1609IRQ routing issues.
1610
1611### nolapic (x86)
1612> `= <boolean>`
1613
1614> Default: `false`
1615
1616Ignore the local APIC on a uniprocessor system, even if enabled by the
1617BIOS.
1618
1619### no-real-mode (x86)
1620> `= <boolean>`
1621
1622Do not execute real-mode bootstrap code when booting Xen. This option
1623should not be used except for debugging. It will effectively disable
1624the **vga** option, which relies on real mode to set the video mode.
1625
1626### noreboot
1627> `= <boolean>`
1628
1629Do not automatically reboot after an error.  This is useful for
1630catching debug output.  Defaults to automatically reboot after 5
1631seconds.
1632
1633### nosmp (x86)
1634> `= <boolean>`
1635
1636Disable SMP support.  No secondary processors will be booted.
1637Defaults to booting secondary processors.
1638
1639This option is ignored in **pv-shim** mode.
1640
1641### nr_irqs (x86)
1642> `= <integer>`
1643
1644### numa (x86)
1645> `= on | off | fake=<integer> | noacpi`
1646
1647> Default: `on`
1648
1649### pci
1650    = List of [ serr=<bool>, perr=<bool> ]
1651
1652    Default: Signaling left as set by firmware.
1653
1654Override the firmware settings, and explicitly enable or disable the
1655signalling of PCI System and Parity errors.
1656
1657### pci-phantom
1658> `=[<seg>:]<bus>:<device>,<stride>`
1659
1660Mark a group of PCI devices as using phantom functions without actually
1661advertising so, so the IOMMU can create translation contexts for them.
1662
1663All numbers specified must be hexadecimal ones.
1664
1665This option can be specified more than once (up to 8 times at present).
1666
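The `pci-phantom` syntax, with all numbers in hexadecimal, can be illustrated with a small Python parser. This is a sketch rather than Xen's code: the `PhantomSpec` type and function name are hypothetical, and treating an omitted segment as segment 0 is an assumption.

```python
from typing import NamedTuple


class PhantomSpec(NamedTuple):
    seg: int
    bus: int
    device: int
    stride: int


def parse_pci_phantom(value: str) -> PhantomSpec:
    """Parse [<seg>:]<bus>:<device>,<stride>, reading every number as hex."""
    address, sep, stride = value.partition(",")
    if not sep:
        raise ValueError("missing ,<stride>")
    fields = address.split(":")
    if len(fields) == 2:
        fields.insert(0, "0")  # assumption: omitted segment means segment 0
    if len(fields) != 3:
        raise ValueError("expected [<seg>:]<bus>:<device>")
    seg, bus, device = (int(field, 16) for field in fields)
    return PhantomSpec(seg, bus, device, int(stride, 16))


print(parse_pci_phantom("03:1f,2"))
```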
1667### pcid (x86)
1668> `= <boolean> | xpti=<bool>`
1669
1670> Default: `xpti`
1671
1672> Can be modified at runtime (change takes effect only for domains created
1673  afterwards)
1674
1675If available, control usage of the PCID feature of the processor for
167664-bit pv-domains. PCID can be used either for no domain at all (`false`),
1677for all of them (`true`), only for those subject to XPTI (`xpti`) or for
1678those not subject to XPTI (`no-xpti`). The feature is used only in case
1679INVPCID is supported and not disabled via `invpcid=false`.
1680
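The four `pcid` settings and the INVPCID precondition amount to a small decision table, sketched here in Python for a single 64-bit PV domain. The function and parameter names are hypothetical; the logic simply restates the description above.

```python
def uses_pcid(setting: str, domain_has_xpti: bool, invpcid_usable: bool) -> bool:
    """Decide whether PCID is used for one 64-bit PV domain."""
    if not invpcid_usable:
        return False  # INVPCID unsupported or disabled via invpcid=false
    if setting == "true":
        return True
    if setting == "false":
        return False
    if setting == "xpti":
        return domain_has_xpti
    if setting == "no-xpti":
        return not domain_has_xpti
    raise ValueError(f"unknown pcid setting: {setting!r}")


print(uses_pcid("xpti", domain_has_xpti=True, invpcid_usable=True))     # True
print(uses_pcid("no-xpti", domain_has_xpti=True, invpcid_usable=True))  # False
print(uses_pcid("true", domain_has_xpti=False, invpcid_usable=False))   # False
```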
1681### pku (x86)
1682> `= <boolean>`
1683
1684> Default: `true`
1685
1686Flag to enable Memory Protection Keys.
1687
1688The protection-key feature provides an additional mechanism by which IA-32e
1689paging controls access to usermode addresses.
1690
1691### ple_gap
1692> `= <integer>`
1693
1694### ple_window (Intel)
1695> `= <integer>`
1696
1697### psr (Intel)
1698> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`
1699
1700> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`
1701
Platform Shared Resource (PSR) Services.  Intel Haswell and later server
1703platforms offer information about the sharing of resources.
1704
To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
1707resource.  RMID is a hardware-provided layer of abstraction between software
1708and logical processors.
1709
1710To use the PSR cache allocation service for a certain domain, a capacity
1711bitmasks(CBM) is used to bind the domain to corresponding shared resource.
1712CBM represents cache capacity and indicates the degree of overlap and isolation
1713between domains. In hypervisor a Class of Service(COS) ID is allocated for each
1714unique CBM.
1715
1716The following resources are available:
1717
1718* Cache Monitoring Technology (Haswell and later).  Information regarding the
1719  L3 cache occupancy.
1720  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
1721  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. Uses the same options as Cache Monitoring
  Technology.
1725
1726* Cache Allocation Technology (Broadwell and later).  Information regarding
1727  the cache allocation.
1728  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
1729  * `cos_max` indicates the max value for COS ID.
1730* Code and Data Prioritization Technology (Broadwell and later). Information
1731  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP differs slightly from `cos_max` of CAT. With CDP,
    one COS corresponds to two CBMs rather than one as with CAT. Because the
    total number of CBMs is fixed, the `cos_max` actually in use is
    automatically halved when CDP is enabled.
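
To make the `key:value` list syntax concrete, here is a minimal Python sketch that overlays a `psr=` value on the documented defaults. It is not Xen's parser: the function name is hypothetical, booleans follow the generic `<boolean>` spellings, and octal numbers written with a leading `0` are not handled.

```python
TRUE_WORDS = {"yes", "on", "true", "enable", "1"}
FALSE_WORDS = {"no", "off", "false", "disable", "0"}

# Defaults as documented: psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0
PSR_DEFAULTS = {"cmt": False, "rmid_max": 255, "cat": False, "cos_max": 255, "cdp": False}


def parse_psr(value: str) -> dict:
    """Overlay a comma separated list of key:value items on the defaults."""
    settings = dict(PSR_DEFAULTS)
    for item in value.split(","):
        key, _, raw = item.partition(":")
        if key not in settings:
            raise ValueError(f"unknown psr suboption: {key!r}")
        if isinstance(PSR_DEFAULTS[key], bool):
            if raw in TRUE_WORDS:
                settings[key] = True
            elif raw in FALSE_WORDS:
                settings[key] = False
            else:
                raise ValueError(f"not a boolean: {raw!r}")
        else:
            # Base 0 accepts decimal and 0x-prefixed hexadecimal.
            settings[key] = int(raw, 0)
    return settings
```

For example, `psr=cmt:1,rmid_max:63` would enable CMT with a lower RMID limit and leave the allocation features at their defaults.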
1737
1738### pv
1739    = List of [ 32=<bool> ]
1740
1741    Applicability: x86
1742
1743Controls for aspects of PV guest support.
1744
1745*   The `32` boolean controls whether 32bit PV guests can be created.  It
1746    defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.
1747
1748    32bit PV guests are incompatible with CET Shadow Stacks.  If Xen is using
1749    shadow stacks, this option will be overridden to `false`.  Backwards
1750    compatibility can be maintained with the `pv-shim` mechanism.
1751
1752### pv-linear-pt (x86)
1753> `= <boolean>`
1754
1755> Default: `true`
1756
1757Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
1758enabled.
1759
1760Allow PV guests to have pagetable entries pointing to other pagetables
1761of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
1762This technique is often called "linear pagetables", and is sometimes
1763used to allow operating systems a simple way to consistently map the
1764current process's pagetables into its own virtual address space.
1765
1766Linux and MiniOS don't use this technique.  NetBSD and Novell Netware
1767do; there may be other custom operating systems which do.  If you're
1768certain you don't plan on having PV guests which use this feature,
1769turning it off can reduce the attack surface.
1770
1771### pv-l1tf (x86)
1772> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`
1773
1774> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
1775>          `domu`  on believed-affected hardware.
1776
1777Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.
1778
1779For backwards compatibility, we may not alter an architecturally-legitimate
1780pagetable entry a PV guest chooses to write.  We can however force such a
1781guest into shadow mode so that Xen controls the PTEs which are reachable by
1782the CPU pagewalk.
1783
1784Shadowing is performed at the point where a PV guest first tries to write an
1785L1TF-vulnerable PTE.  Therefore, a PV guest kernel which has been updated with
1786its own L1TF mitigations will not trigger shadow mode if it is well behaved.
1787
1788If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
1789the guest when an L1TF-vulnerable PTE is written, which still allows updated,
1790well-behaved PV guests to run, despite Shadow being compiled out.
1791
1792In the pv-shim case, Shadow is expected to be compiled out, and a malicious
1793guest kernel can only leak data from the shim Xen, rather than the host Xen.
1794
1795### pv-shim (x86)
1796> `= <boolean>`
1797
1798> Default: `false`
1799
1800This option is intended for use by a toolstack, when choosing to run a PV
1801guest compatibly inside an HVM container.
1802
1803In this mode, the kernel and initrd passed as modules to the hypervisor are
1804constructed into a plain unprivileged PV domain.
1805
1806### rcu-idle-timer-period-ms
1807> `= <integer>`
1808
1809> Default: `10`
1810
1811How frequently a CPU which has gone idle, but with pending RCU callbacks,
1812should be woken up to check if the grace period has completed, and the
1813callbacks are safe to be executed. Expressed in milliseconds; maximum is
1814100, and it can't be 0.
1815
1816### reboot (x86)
1817> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
1818
1819> Default: `0`
1820
1821Specify the host reboot method.
1822
1823`warm` instructs Xen to not set the cold reboot flag.
1824
1825`cold` instructs Xen to set the cold reboot flag.
1826
1827`no` instructs Xen to not automatically reboot after panics or crashes.
1828
1829`triple` instructs Xen to reboot the host by causing a triple fault.
1830
1831`kbd` instructs Xen to reboot the host via the keyboard controller.
1832
1833`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT.
1834
1835`pci` instructs Xen to reboot the host using PCI reset register (port CF9).
1836
1837`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).
1838
`efi` instructs Xen to reboot using the EFI reboot call (in EFI mode, this
method is tried first by default).

`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).
1844
1845### rmrr
1846> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`
1847
Define RMRR units that are missing from the ACPI table, along with the devices
they belong to, and use them for 1:1 mapping. The end address can be omitted,
in which case one page will be mapped. The ranges are inclusive when both start
and end are specified.
If the segment of the first device is not specified, segment zero will be used.
If the segments of other devices are not specified, the first device's segment
will be used.
If a segment is specified for a device other than the first and it does not
match the one specified for the first device, an error will be reported.

`start` and `end` values are page numbers (not full physical addresses),
in hexadecimal format (optionally preceded by "0x").
1858
Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and device 0:0:1a.0 requires three pages (0xd5d46 through 0xd5d48)
to be reserved, one usage would be:

> `rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`
1864
Note: grub2 requires special characters such as ';' to be escaped or quoted.
Refer to the grub2 documentation if multiple ranges are specified.
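
The structure of an `rmrr` value can be sketched in Python as follows. This is an illustration of the syntax only; `parse_rmrr` is a hypothetical helper, and the segment defaulting and mismatch checks described above are deliberately left out.

```python
from typing import List, Tuple


def parse_rmrr(value: str) -> List[Tuple[int, int, List[str]]]:
    """Split an rmrr value into (first_page, last_page, devices) tuples.

    Page numbers are hexadecimal, with or without a leading "0x".
    Device strings are returned unparsed.
    """
    units = []
    for unit in value.split(";"):
        pages, _, devices = unit.partition("=")
        start, _, end = pages.partition("-")
        first = int(start, 16)
        # An omitted end means a single page; ranges are inclusive.
        last = int(end, 16) if end else first
        units.append((first, last, devices.split(",")))
    return units
```

Applied to the value `d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`, this yields one single-page unit and one unit spanning three pages.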
1867
1868### ro-hpet (x86)
1869> `= <boolean>`
1870
1871> Default: `true`
1872
1873Map the HPET page as read only in Dom0. If disabled the page will be mapped
1874with read and write permissions.
1875
1876### sched
1877> `= credit | credit2 | arinc653 | rtds | null`
1878
1879> Default: `sched=credit`
1880
1881Choose the default scheduler.
1882
1883### sched_credit2_max_cpus_runqueue
1884> `= <integer>`
1885
1886> Default: `16`
1887
1888Defines how many CPUs will be put, at most, in each Credit2 runqueue.
1889
Runqueues are still arranged according to the host topology (and following
what is indicated by the `credit2_runqueue` parameter), but there is also a cap
on the number of CPUs that share each runqueue.
1893
1894A value that is a submultiple of the number of online CPUs is recommended,
1895as that would likely produce a perfectly balanced runqueue configuration.
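
As a small aid for picking such a value, the Python sketch below lists the submultiples of a CPU count up to a chosen limit. The helper name and the `limit` parameter are hypothetical, and the host topology is not taken into account.

```python
from typing import List


def balanced_runqueue_caps(online_cpus: int, limit: int = 16) -> List[int]:
    """Return the divisors of `online_cpus` that do not exceed `limit`."""
    upper = min(limit, online_cpus)
    return [n for n in range(1, upper + 1) if online_cpus % n == 0]
```

On a host with 24 online CPUs, for example, 12 would be the largest candidate below the default of 16.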
1896
1897### sched_credit2_migrate_resist
1898> `= <integer>`
1899
1900### sched_credit_tslice_ms
1901> `= <integer>`
1902
1903Set the timeslice of the credit1 scheduler, in milliseconds.  The
1904default is 30ms.  Reasonable values may include 10, 5, or even 1 for
1905very latency-sensitive workloads.
1906
1907### sched-gran (x86)
1908> `= cpu | core | socket`
1909
1910> Default: `sched-gran=cpu`
1911
Set the scheduling granularity. If the granularity is larger than 1 (e.g.
`core` on an SMT-enabled system, or `socket`), multiple vcpus are assigned
statically to a "scheduling unit" which will then be subject to scheduling.
This assignment of vcpus to scheduling units is fixed.

`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
hyperthread using x86/Intel terminology).

`core`: As many vcpus as there are cpus on a physical core are scheduled
together on a physical core.

`socket`: As many vcpus as there are cpus on a physical socket are scheduled
together on a physical socket.
1925
1926Note: a value other than `cpu` will result in rejecting a runtime modification
1927attempt of the "smt" setting.
1928
Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" run in the
same "compute unit". Since AMD uses the same names as Intel ("thread" and
"core") from Fam17 onwards, the topology levels are named "cpu", "core" and
"socket" even on older AMD processors.
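
The size of a scheduling unit for each granularity can be illustrated with a short Python sketch. The function and its topology parameters are hypothetical, and a uniform topology is assumed.

```python
def vcpus_per_scheduling_unit(granularity: str,
                              threads_per_core: int,
                              cores_per_socket: int) -> int:
    """Return how many vcpus are grouped into one scheduling unit."""
    if granularity == "cpu":
        return 1
    if granularity == "core":
        return threads_per_core
    if granularity == "socket":
        return threads_per_core * cores_per_socket
    raise ValueError(f"unknown sched-gran value: {granularity!r}")
```

On a host with 2 threads per core and 8 cores per socket, `core` would group 2 vcpus per unit and `socket` would group 16.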
1934
1935### sched_ratelimit_us
1936> `= <integer>`
1937
1938In order to limit the rate of context switching, set the minimum
1939amount of time that a vcpu can be scheduled for before preempting it,
1940in microseconds.  The default is 1000us (1ms).  Setting this to 0
1941disables it altogether.
1942
1943### sched_smt_power_savings
1944> `= <boolean>`
1945
Normally Xen will try to maximize performance and cache utilization by
spreading out vcpus across as many different divisions as possible
(i.e., NUMA nodes, sockets, cores, threads, etc.).  This often maximizes
throughput, but also maximizes energy usage, since it reduces the
depth to which a processor can sleep.

This option inverts the logic, so that the scheduler in effect tries
to keep the vcpus on the smallest amount of silicon possible; i.e.,
first fill up sibling threads, then sibling cores, then sibling
sockets, etc.  This will reduce performance somewhat, particularly on
systems with hyperthreading enabled, but should reduce power by
enabling more sockets and cores to go into deeper sleep states.
1958
1959### scrub-domheap
1960> `= <boolean>`
1961
1962> Default: `false`
1963
1964Scrub domains' freed pages. This is a safety net against a (buggy) domain
1965accidentally leaking secrets by releasing pages without proper sanitization.
1966
1967### serial_tx_buffer
1968> `= <size>`
1969
1970> Default: `16kB`
1971
1972Set the serial transmit buffer size.
1973
1974### serrors (ARM)
1975> `= diverse | panic`
1976
1977> Default: `diverse`
1978
1979This parameter is provided to administrators to determine how the hypervisor
1980handles SErrors.
1981
1982* `diverse`:
1983  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
1984    - The guest generated SErrors will be forwarded to the currently running
1985      guest.
1986    - The hypervisor generated SErrors will cause the whole system to crash
1987
1988* `panic`:
1989  All SErrors will cause the whole system to crash. This option should only
1990  be used if you trust all your guests and/or they don't have a gadget (e.g.
1991  device) to generate SErrors in normal run.
1992
1993### shim_mem (x86)
1994> `= List of ( min:<size> | max:<size> | <size> )`
1995
1996Set the amount of memory that xen-shim uses. Only has effect if pv-shim mode is
1997enabled. Note that this value accounts for the memory used by the shim itself
1998plus the free memory slack given to the shim for runtime allocations.
1999
2000* `min:<size>` specifies the minimum amount of memory. Ignored if greater
2001   than max.
2002* `max:<size>` specifies the maximum amount of memory.
2003* `<size>` specifies the exact amount of memory. Overrides both min and max.
2004
2005By default, the amount of free memory slack given to the shim for runtime usage
2006is 1MB.
2007
2008### smap (x86)
2009> `= <boolean> | hvm`
2010
2011> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware
2012
Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false, due to
the significant performance impact in some cases and the generally lower
security risk.
2018
2019### smep (x86)
2020> `= <boolean> | hvm`
2021
2022> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware
2023
Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false, due to
the significant performance impact in some cases and the generally lower
security risk.
2029
2030### smt (x86)
2031> `= <boolean>`
2032
> Default: `true`
2034
2035Control bring up of multiple hyper-threads per CPU core.
2036
2037### snb_igd_quirk
2038> `= <boolean> | cap | <integer>`
2039
A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is interpreted as a custom timeout in milliseconds. Zero or boolean
false disables the quirk workaround, which is also the default.
2044
2045### spec-ctrl (Arm)
2046> `= List of [ ssbd=force-disable|runtime|force-enable ]`
2047
2048Controls for speculative execution sidechannel mitigations.
2049
2050The option `ssbd=` is used to control the state of Speculative Store
2051Bypass Disable (SSBD) mitigation.
2052
2053* `ssbd=force-disable` will keep the mitigation permanently off. The guest
2054will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn the mitigation on or off
for itself by using the firmware interface `ARCH_WORKAROUND_2`.
2058* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
2059not be able to control the state of the mitigation.
2060
By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).
2062
2063### spec-ctrl (x86)
2064> `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb,md-clear}=<bool>,
2065>              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
2066>              l1d-flush,branch-harden,srb-lock}=<bool> ]`
2067
2068Controls for speculative execution sidechannel mitigations.  By default, Xen
2069will pick the most appropriate mitigations based on compiled in support,
2070loaded microcode, and hardware details, and will virtualise appropriate
2071mitigations for guests to use.
2072
2073**WARNING: Any use of this option may interfere with heuristics.  Use with
2074extreme care.**
2075
2076An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
2077mitigations, including pieces of infrastructure used to virtualise certain
2078mitigation features for guests.  This also includes settings which `xpti`,
2079`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
2080specified earlier on the command line.
2081
2082Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
2083turn off all of Xen's mitigations, while leaving the virtualisation support
2084in place for guests to use.
2085
2086Use of a positive boolean value for either of these options is invalid.
2087
2088The booleans `pv=`, `hvm=`, `msr-sc=`, `rsb=` and `md-clear=` offer fine
2089grained control over the alternative blocks used by Xen.  These impact Xen's
2090ability to protect itself, and Xen's ability to virtualise support for guests
2091to use.
2092
2093* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
2094  respectively.
2095* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
2096  on entry and exit.  These blocks are necessary to virtualise support for
2097  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
2098* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
2099  Return Address Stack on entry to Xen.
2100* `md-clear=` offers control over whether to use VERW to flush
2101  microarchitectural buffers on idle and exit from Xen.  *Note: For
2102  compatibility with development versions of this fix, `mds=` is also accepted
2103  on Xen 4.12 and earlier as an alias.  Consult vendor documentation in
2104  preference to here.*
2105
2106If Xen was compiled with INDIRECT_THUNK support, `bti-thunk=` can be used to
2107select which of the thunks gets patched into the `__x86_indirect_thunk_%reg`
2108locations.  The default thunk is `retpoline` (generally preferred for Intel
2109hardware), with the alternatives being `jmp` (a `jmp *%reg` gadget, minimal
2110overhead), and `lfence` (an `lfence; jmp *%reg` gadget, preferred for AMD).
2111
2112On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
2113`ibrs=` option can be used to force or prevent Xen using the feature itself.
2114If Xen is not using IBRS itself, functionality is still set up so IBRS can be
2115virtualised for guests.
2116
2117On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
2118option can be used to force (the default) or prevent Xen from issuing branch
2119prediction barriers on vcpu context switches.
2120
2121On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
2122option can be used to force or prevent Xen using the feature itself.  On AMD
2123hardware, this is a global option applied at boot, and not virtualised for
2124guest use.  On Intel hardware, the feature is virtualised for guests,
2125independently of Xen's choice of setting.
2126
2127On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
2128from using fully eager FPU context switches.  This is currently implemented as
2129a global control.  By default, Xen will choose to use fully eager context
2130switches on hardware believed to speculate past #NM exceptions.
2131
2132On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
2133or prevent Xen from issuing an L1 data cache flush on each VMEntry.
2134Irrespective of Xen's setting, the feature is virtualised for HVM guests to
2135use.  By default, Xen will enable this mitigation on hardware believed to be
2136vulnerable to L1TF.
2137
2138If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
2139`branch-harden=` boolean can be used to force or prevent Xen from using
2140speculation barriers to protect selected conditional branches.  By default,
2141Xen will enable this mitigation.
2142
On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
2145data. By default, Xen will enable this mitigation, except on parts where MDS
2146is fixed and TAA is fixed/mitigated (in which case, there is believed to be no
2147way for an attacker to obtain the stale data).
2148
2149### sync_console
2150> `= <boolean>`
2151
2152> Default: `false`
2153
2154Flag to force synchronous console output.  Useful for debugging, but
2155not suitable for production environments due to incurred overhead.
2156
2157### tboot (x86)
2158> `= 0x<phys_addr>`
2159
2160Specify the physical address of the trusted boot shared page.
2161
2162### tbuf_size
2163> `= <integer>`
2164
2165Specify the per-cpu trace buffer size in pages.
2166
2167### tdt (x86)
2168> `= <boolean>`
2169
2170> Default: `true`
2171
2172Flag to enable TSC deadline as the APIC timer mode.
2173
2174### tevt_mask
2175> `= <integer>`
2176
2177Specify a mask for Xen event tracing. This allows Xen tracing to be
2178enabled at boot. Refer to the xentrace(8) documentation for a list of
2179valid event mask values. In order to enable tracing, a buffer size (in
2180pages) must also be specified via the tbuf_size parameter.
2181
2182### tickle_one_idle_cpu
2183> `= <boolean>`
2184
2185### timer_slop
2186> `= <integer>`
2187
2188### tsc (x86)
2189> `= unstable | skewed | stable:socket`
2190
2191### tsx
2192    = <bool>
2193
2194    Applicability: x86
2195    Default: false on parts vulnerable to TAA, true otherwise
2196
2197Controls for the use of Transactional Synchronization eXtensions.
2198
2199On Intel parts released in Q3 2019 (with updated microcode), and future parts,
2200a control has been introduced which allows TSX to be turned off.
2201
2202On systems with the ability to turn TSX off, this boolean offers system wide
2203control of whether TSX is enabled or disabled.
2204
2205On parts vulnerable to CVE-2019-11135 / TSX Asynchronous Abort, the following
2206logic applies:
2207
2208 * An explicit `tsx=` choice is honoured, even if it is `true` and would
2209   result in a vulnerable system.
2210
2211 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
2212   mitigated by disabling TSX, as this is the lowest overhead option.
2213
2214 * If the use of TSX is important, the more expensive TAA mitigations can be
2215   opted in to with `smt=0 spec-ctrl=md-clear`, at which point TSX will remain
2216   active by default.
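
The three rules above can be condensed into a small Python sketch. It is a reading of the documented behaviour rather than Xen's code; the function name and parameters are hypothetical, and it assumes a part that offers the control to turn TSX off.

```python
from typing import Optional


def tsx_enabled(explicit: Optional[bool],
                taa_vulnerable: bool,
                expensive_mitigations: bool = False) -> bool:
    """Return whether TSX would end up enabled.

    explicit: the value given via `tsx=`, or None when the option is absent.
    expensive_mitigations: True when `smt=0 spec-ctrl=md-clear` was given.
    """
    if explicit is not None:
        # An explicit choice is honoured, even if it leaves the system vulnerable.
        return explicit
    if taa_vulnerable and not expensive_mitigations:
        # Lowest overhead mitigation: disable TSX.
        return False
    return True
```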
2217
2218### ucode
2219> `= List of [ <integer> | scan=<bool>, nmi=<bool> ]`
2220
2221    Applicability: x86
2222    Default: `nmi`
2223
2224Controls for CPU microcode loading. For early loading, this parameter can
2225specify how and where to find the microcode update blob. For late loading,
2226this parameter specifies if the update happens within a NMI handler.
2227
`<integer>` specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GrUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GrUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
here (entry zero, i.e. the first module, is always the Dom0 kernel
image). Note further that use of this option has an unspecified effect
when used with xen.efi (there the concept of modules doesn't exist, and
the blob gets specified via the `ucode=<filename>` config file/section
entry; see [EFI configuration file description](efi.html)).
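
The index rules can be illustrated with a short Python sketch that maps a `ucode=<integer>` value to a zero-based module position. The helper is hypothetical, and rejecting any index that resolves to module zero or falls outside the module list is an assumption made for the example.

```python
def resolve_ucode_module(index: int, module_count: int) -> int:
    """Map a ucode module index to a zero-based position in the module list."""
    if index == 0:
        raise ValueError("module 0 is always the Dom0 kernel image")
    # Negative values count back from the end of the module list.
    resolved = index if index > 0 else module_count + index
    if not 0 < resolved < module_count:
        raise ValueError(f"index {index} does not name a usable module")
    return resolved
```

With three modules (kernel, initrd, microcode blob), `ucode=-1` and `ucode=2` would both select the last module.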
2238
`scan` instructs the hypervisor to scan the multiboot images for a cpio
image that contains microcode. Depending on the platform, the blob with the
microcode in the cpio name space must be:

  - on Intel: kernel/x86/microcode/GenuineIntel.bin
  - on AMD  : kernel/x86/microcode/AuthenticAMD.bin

When using xen.efi, the `ucode=<filename>` config file setting takes
precedence over `scan`.
2246
`nmi` determines whether late loading is performed in an NMI handler or just
in stop_machine context. In an NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.
2250
2251### unrestricted_guest (Intel)
2252> `= <boolean>`
2253
2254### vcpu_migration_delay
2255> `= <integer>`
2256
2257> Default: `0`
2258
2259Specify a delay, in microseconds, between migrations of a VCPU between
2260PCPUs when using the credit1 scheduler. This prevents rapid fluttering
2261of a VCPU between CPUs, and reduces the implicit overheads such as
2262cache-warming. 1ms (1000) has been measured as a good value.
2263
2264### vesa-map
2265> `= <integer>`
2266
2267### vesa-mtrr
2268> `= <integer>`
2269
2270### vesa-ram
2271> `= <integer>`
2272
2273### vga
2274> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`
2275
2276`ask` causes Xen to display a menu of available modes and request the
2277user to choose one of them.
2278
2279`current` causes Xen to use the graphics adapter in its current state,
2280without further setup.
2281
2282`text-80x<rows>` instructs Xen to set up text mode.  Valid values for
2283`<rows>` are `25, 28, 30, 34, 43, 50, 80`
2284
2285`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
2286with the specified width, height and depth.
2287
`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option.  (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number.)
2291
2292The optional `keep` parameter causes Xen to continue using the vga
2293console even after dom0 has been started.  The default behaviour is to
2294relinquish control to dom0.
2295
2296### viridian-spinlock-retry-count (x86)
2297> `= <integer>`
2298
2299> Default: `2047`
2300
2301Specify the maximum number of retries before an enlightened Windows
2302guest will notify Xen that it has failed to acquire a spinlock.
2303
2304### viridian-version (x86)
2305> `= [<major>],[<minor>],[<build>]`
2306
2307> Default: `6,0,0x1772`
2308
`<major>`, `<minor>` and `<build>` must be integers. The values will be
encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.
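
Since each field in the syntax is optional, a value can be thought of as overriding individual parts of the default. The Python sketch below illustrates this; the function name is hypothetical, and the assumption that an empty field keeps its default is inferred from the syntax rather than stated above.

```python
from typing import Tuple

# Documented default: 6,0,0x1772
DEFAULT_VIRIDIAN_VERSION = (6, 0, 0x1772)


def parse_viridian_version(value: str) -> Tuple[int, int, int]:
    """Parse '[<major>],[<minor>],[<build>]' into a (major, minor, build) tuple."""
    fields = value.split(",")
    if len(fields) != 3:
        raise ValueError(f"expected three comma separated fields: {value!r}")
    # Assumption: an empty field keeps the corresponding default.
    major, minor, build = (
        int(field, 0) if field else default
        for field, default in zip(fields, DEFAULT_VIRIDIAN_VERSION)
    )
    return (major, minor, build)
```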
2311
2312### vpid (Intel)
2313> `= <boolean>`
2314
2315> Default: `true`
2316
2317Use Virtual Processor ID support if available.  This prevents the need for TLB
2318flushes on VM entry and exit, increasing performance.
2319
2320### vpmu (x86)
2321    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]
2322
2323    Applicability: x86.  Default: false
2324
2325Controls for Performance Monitoring Unit virtualisation.
2326
2327Performance monitoring facilities tend to be very hardware specific, and
2328provide access to a wealth of low level processor information.
2329
2330*   An overall boolean can be used to enable or disable vPMU support.  vPMU is
2331    disabled by default.
2332
2333    When enabled, guests have full access to all performance counter settings,
2334    including model specific functionality.  This is a superset of the
2335    functionality offered by `ipc` and/or `arch`, but a subset of the
2336    functionality offered by `bts`.
2337
2338    Xen's watchdog functionality is implemented using performance counters.
2339    As a result, use of the **watchdog** option will override and disable
2340    vPMU.
2341
2342*   The `bts` option enables performance monitoring, and permits additional
2343    access to the Branch Trace Store controls.  BTS is an Intel feature where
2344    the processor can write data into a buffer whenever a branch occurs.
2345    However, as this feature isn't virtualised, a misconfiguration by the
2346    guest can lock the entire system up.
2347
2348*   The `ipc` option allows access to the most minimal set of counters
2349    possible: instructions, cycles, and reference cycles.  These can be used
2350    to calculate instructions per cycle (IPC).
2351
2352*   The `arch` option allows access to the pre-defined architectural events.
2353
2354*   The `rtm-abort` boolean controls a trade-off between working Restricted
2355    Transactional Memory, and working performance counters.
2356
2357    All processors released to date (Q1 2019) supporting Transactional Memory
2358    Extensions suffer an erratum which has been addressed in microcode.
2359
2360    Processors based on the Skylake microarchitecture with up-to-date
2361    microcode internally use performance counter 3 to work around the erratum.
2362    A consequence is that the counter gets reprogrammed whenever an `XBEGIN`
2363    instruction is executed.
2364
2365    An alternative mode exists where PCR3 behaves as before, at the cost of
2366    `XBEGIN` unconditionally aborting.  Enabling `rtm-abort` mode will
2367    activate this alternative mode.
2368
2369*Warning:*
2370As the virtualisation is not 100% safe, don't use the vpmu flag on
2371production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!
2372
2373### vwfi (arm)
2374> `= trap | native`
2375
2376> Default: `trap`
2377
2378WFI is the ARM instruction to "wait for interrupt". WFE is similar and
2379means "wait for event". This option, which is ARM specific, changes the
2380way guest WFI and WFE are implemented in Xen. By default, Xen traps both
2381instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
2383doesn't trap either instruction, running them in guest context. Setting
2384vwfi to `native` reduces irq latency significantly. It can also lead to
2385suboptimal scheduling decisions, but only when the system is
2386oversubscribed (i.e., in total there are more vCPUs than pCPUs).
2387
2388### watchdog (x86)
2389> `= force | <boolean>`
2390
2391> Default: `false`
2392
2393Run an NMI watchdog on each processor.  If a processor is stuck for
2394longer than the **watchdog_timeout**, a panic occurs.  When `force` is
2395specified, in addition to running an NMI watchdog on each processor,
2396unknown NMIs will still be processed.
2397
2398### watchdog_timeout (x86)
2399> `= <integer>`
2400
2401> Default: `5`
2402
2403Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
2404the watchdog.
2405
2406### x2apic (x86)
2407> `= <boolean>`
2408
2409> Default: `true`
2410
2411Permit use of x2apic setup for SMP environments.
2412
2413### x2apic_phys (x86)
2414> `= <boolean>`
2415
2416> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
2417>          is not available, `false` otherwise.
2418
2419In the case that x2apic is in use, this option switches between physical and
2420clustered mode.  The default, given no hint from the **FADT**, is cluster
2421mode.
2422
2423### xenheap_megabytes (arm32)
2424> `= <size>`
2425
2426> Default: `0` (1/32 of RAM)
2427
2428Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.
2429
2430By default will use 1/32 of the RAM up to a maximum of 1GB and with a
2431minimum of 32M, subject to a suitably aligned and sized contiguous
2432region of memory being available.
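
The default sizing rule can be approximated by the following Python sketch, working in megabytes. It only models the 1/32 fraction with the stated minimum and maximum; the helper name is hypothetical, and alignment and the availability of a contiguous region are ignored.

```python
def default_xenheap_mb(ram_mb: int) -> int:
    """Approximate the default Xenheap size in MB for a given amount of RAM."""
    minimum_mb = 32      # 32M lower bound
    maximum_mb = 1024    # 1GB upper bound
    return max(minimum_mb, min(maximum_mb, ram_mb // 32))
```

Under these assumptions, a host with 8GB of RAM would get a 256MB Xenheap, while hosts with 1GB of RAM or less would get the 32MB minimum.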
2433
2434### xpti (x86)
2435> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`
2436
2437> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
2438> Default: `true` everywhere else
2439
2440Override default selection of whether to isolate 64-bit PV guest page
2441tables.
2442
`true` activates page table isolation for all domains, even on hardware not
vulnerable to Meltdown.
2445
2446`false` deactivates page table isolation on all systems for all domains.
2447
2448`default` sets the default behaviour.
2449
2450With `dom0` and `domu` it is possible to control page table isolation
2451for dom0 or guest domains only.
2452
2453### xsave (x86)
2454> `= <boolean>`
2455
2456> Default: `true`
2457
2458Permit use of the `xsave/xrstor` instructions.
2459
2460### xsm
2461> `= dummy | flask | silo`
2462
2463> Default: selectable via Kconfig.  Depends on enabled XSM modules.
2464
2465Specify which XSM module should be enabled.  This option is only available if
2466the hypervisor was compiled with `CONFIG_XSM` enabled.
2467
2468* `dummy`: this is the default choice.  Basic restriction for common deployment
2469  (the dummy module) will be applied.  It's also used when XSM is compiled out.
* `flask`: this is the policy based access control.  To choose this, the
  separate option in Kconfig must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs.  To choose this, the separate option in Kconfig must also
  be enabled.
2475