# Xen Hypervisor Command Line Options

This document covers the command line options which the Xen
Hypervisor accepts.

## Types of parameter

Most parameters take the form `option=value`. Different options on
the command line should be space delimited. All options are case
sensitive, as are all values unless explicitly noted.

### Boolean (`<boolean>`)

All boolean options may be explicitly enabled using a `value` of
> `yes`, `on`, `true`, `enable` or `1`

They may be explicitly disabled using a `value` of
> `no`, `off`, `false`, `disable` or `0`

In addition, a boolean option may be enabled by simply stating its
name, and may be disabled by prefixing its name with `no-`.

#### Examples

Enable noreboot mode
> `noreboot=true`

Disable x2apic support (if present)
> `x2apic=off`

Enable synchronous console mode
> `sync_console`

Explicitly specifying any value other than those listed above is
undefined, as is stacking a `no-` prefix with an explicit value.

### Integer (`<integer>`)

An integer parameter will default to decimal and may be prefixed with
a `-` for negative numbers. Alternatively, a hexadecimal number may be
used by prefixing the number with `0x`, or an octal number may be used
if a leading `0` is present.

Providing a string which does not validly convert to an integer is
undefined.

### Size (`<size>`)

A size parameter may be any integer, with a single size suffix

* `T` or `t`: TiB (2^40)
* `G` or `g`: GiB (2^30)
* `M` or `m`: MiB (2^20)
* `K` or `k`: KiB (2^10)
* `B` or `b`: Bytes

Without a size suffix, the value is taken to be in KiB. Providing a
suffix other than those listed above is undefined.

### String

Many parameters are more complicated and require more intricate
configuration. The detailed description of each individual parameter
specifies which values are valid.

### List

Some options take a comma separated list of values.
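
The Boolean, Integer and Size rules above can be summarised with a short
Python sketch. This is only an illustration of the documented rules, not Xen's
actual parser; the function names (`parse_boolean`, `parse_integer`,
`parse_size`) are hypothetical, and inputs the text calls "undefined" are
simply rejected here.

```python
import re

# Illustrative only: mirrors the rules described in this section,
# not the parser implemented inside Xen.
TRUE_VALUES = {"yes", "on", "true", "enable", "1"}
FALSE_VALUES = {"no", "off", "false", "disable", "0"}
SIZE_SHIFTS = {"t": 40, "g": 30, "m": 20, "k": 10, "b": 0}
SIZE_RE = re.compile(r"(-?(?:0x[0-9a-fA-F]+|[0-9]+))([TtGgMmKkBb]?)")


def parse_boolean(arg):
    """Return (name, value) for a boolean option as written on the command line."""
    name, sep, value = arg.partition("=")
    if not sep:
        # A bare name enables the option; a `no-` prefix disables it.
        if name.startswith("no-"):
            return name[3:], False
        return name, True
    if value in TRUE_VALUES:
        return name, True
    if value in FALSE_VALUES:
        return name, False
    raise ValueError(f"undefined boolean value: {value!r}")


def parse_integer(text):
    """Decimal by default, `0x` prefix for hex, leading `0` for octal."""
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits.startswith("0x"):
        value = int(digits[2:], 16)
    elif len(digits) > 1 and digits.startswith("0"):
        value = int(digits[1:], 8)
    else:
        value = int(digits, 10)
    return -value if negative else value


def parse_size(text):
    """Return a size in bytes; without a suffix the value is taken as KiB."""
    match = SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"undefined size: {text!r}")
    number, suffix = match.groups()
    shift = SIZE_SHIFTS[suffix.lower()] if suffix else 10
    return parse_integer(number) << shift
```

For instance, `parse_size("128M")` yields the byte count for 128 MiB, while
`parse_size("16")` is read as 16 KiB because no suffix was given.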

### Combination

Some parameters act as combinations of the above, most commonly a mix
of Boolean and String. These are noted in the relevant sections.

## Parameter details

### acpi
> `= force | ht | noirq | <boolean>`

**String**, or **Boolean** to disable.

The **acpi** option is used to control a set of four related boolean
flags; `acpi_force`, `acpi_ht`, `acpi_noirq` and `acpi_disabled`.

By default, Xen will scan the DMI data and blacklist certain systems
which are known to have broken ACPI setups. Providing `acpi=force`
will cause Xen to ignore the blacklist and attempt to use all ACPI
features.

Using `acpi=ht` causes Xen to parse the ACPI tables enough to
enumerate all CPUs, but not to use other ACPI features. This is not
common, and only has an effect if your system is blacklisted.

The `acpi=noirq` option causes Xen to not parse the ACPI MADT table
looking for IO-APIC entries. This is also not common, and any system
which requires this option to function should be blacklisted.
Additionally, this will not prevent Xen from finding IO-APIC entries
from the MP tables.

Finally, any of the boolean false options can be used to disable ACPI
usage entirely.

Because responsibility for ACPI processing is shared between Xen and
the domain 0 kernel, this option is automatically propagated to the
domain 0 command line.

### acpi_apic_instance
> `= <integer>`

Specify which ACPI MADT table to parse for APIC information, if more
than one is present.

### acpi_pstate_strict (x86)
> `= <boolean>`

> Default: `false`

Enforce checking that P-state transitions by the ACPI cpufreq driver
actually result in the nominated frequency being established. A warning
message will be logged if that isn't the case.

### acpi_skip_timer_override (x86)
> `= <boolean>`

Instruct Xen to ignore timer-interrupt override.

### acpi_sleep (x86)
> `= s3_bios | s3_mode`

`s3_bios` instructs Xen to invoke video BIOS initialization during S3
resume.

`s3_mode` instructs Xen to set up the boot time (option `vga=`) video
mode during S3 resume.

### allow_unsafe (x86)
> `= <boolean>`

> Default: `false`

Force boot on potentially unsafe systems. By default Xen will refuse
to boot on systems with the following errata:

* AMD Erratum 121. Processors with this erratum are subject to a guest
  triggerable Denial of Service. Override only if you trust all of
  your PV guests.

### altp2m (Intel)
> `= <boolean>`

> Default: `false`

Permit multiple copies of host p2m.

### apic (x86)
> `= bigsmp | default`

Override Xen's logic for choosing the APIC driver. By default, if
there are more than 8 CPUs, Xen will switch to `bigsmp` over
`default`.

### apicv (Intel)
> `= <boolean>`

> Default: `true`

Permit Xen to use APIC Virtualisation Extensions. This is an optimisation
available as part of VT-x, and allows hardware to take care of the guest's APIC
handling, rather than requiring emulation in Xen.

### apic_verbosity (x86)
> `= verbose | debug`

Increase the verbosity of the APIC code from the default value.

### arat (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use "Always Running APIC Timer" support on compatible hardware
in combination with cpuidle. This option is only expected to be useful for
developers wishing Xen to fall back to older timing methods on newer hardware.

### argo
    = List of [ <bool>, mac-permissive=<bool> ]

Controls for the Argo hypervisor-mediated interdomain communication service.

The functionality that this option controls is only available when Xen has been
compiled with the build setting for Argo enabled in the build configuration.

Argo is an interdomain communication mechanism, where Xen acts as the central
point of authority. Guests may register memory rings to receive messages,
query the status of other domains, and send messages by hypercall, all subject
to appropriate auditing by Xen. Argo is disabled by default.

* The `mac-permissive` boolean controls whether wildcard receive rings may be
  registered (`mac-permissive=1`) or may not be registered
  (`mac-permissive=0`).

  This option is disabled by default, to protect domains from a DoS by a
  buggy or malicious other domain spamming the ring.

### asid (x86)
> `= <boolean>`

> Default: `true`

Permit Xen to use Address Space Identifiers. This is an optimisation which
tags the TLB entries with an ID per vcpu. This allows for guest TLB flushes
to be performed without the overhead of a complete TLB flush.

### async-show-all (x86)
> `= <boolean>`

> Default: `false`

Forces all CPUs' full state to be logged upon certain fatal asynchronous
exceptions (watchdog NMIs and unexpected MCEs).

### ats (x86)
> `= <boolean>`

> Default: `false`

Permits Xen to set up and use PCI Address Translation Services. This is a
performance optimisation for PCI Passthrough.

**WARNING: Xen cannot currently safely use ATS because of its synchronous wait
loops for Queued Invalidation completions.**

### availmem
> `= <size>`

> Default: `0` (no limit)

Specify a maximum amount of available memory, to which Xen will clamp
the e820 table.

### badpage
> `= List of [ <integer> | <integer>-<integer> ]`

Specify that certain pages, or certain ranges of pages contain bad
bytes and should not be used. For example, if your memory tester says
that byte `0x12345678` is bad, you would place `badpage=0x12345` on
Xen's command line.
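
The `badpage` example above drops the low bits of the byte address to obtain
a page number. A minimal Python sketch of that conversion follows, assuming
4 KiB pages (a shift of 12, which matches the example); the helper name
`badpage_option` is hypothetical and is not part of any Xen tooling.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, consistent with the example above


def badpage_option(bad_addresses):
    """Build a `badpage=` argument from a list of bad byte addresses."""
    # Convert each byte address to its page number and drop duplicates.
    frames = sorted({addr >> PAGE_SHIFT for addr in bad_addresses})

    # Merge runs of consecutive pages into <first>-<last> ranges.
    ranges = []
    for frame in frames:
        if ranges and frame == ranges[-1][1] + 1:
            ranges[-1][1] = frame
        else:
            ranges.append([frame, frame])

    parts = [hex(lo) if lo == hi else f"{hex(lo)}-{hex(hi)}" for lo, hi in ranges]
    return "badpage=" + ",".join(parts)
```

Calling `badpage_option([0x12345678])` reproduces the `badpage=0x12345`
argument from the text.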

### bootscrub
> `= idle | <boolean>`

> Default: `idle`

Scrub free RAM during boot. This is a safety feature to prevent
accidentally leaking sensitive VM data into other VMs if Xen crashes
and reboots.

In `idle` mode, RAM is scrubbed in the background on all CPUs during the
idle loop, with a guarantee that memory allocations always provide scrubbed
pages. This option reduces boot time on machines with a large amount of RAM
while still providing security benefits.

### bootscrub_chunk
> `= <size>`

> Default: `128M`

Maximum RAM block size chunks to be scrubbed whilst holding the page heap lock
and not running softirqs. Reduce this if softirqs are not being run frequently
enough. Setting this to a high value may cause boot failure, particularly if
the NMI watchdog is also enabled.

### cet
    = List of [ shstk=<bool> ]

    Applicability: x86

Controls for the use of Control-flow Enforcement Technology. CET is a group
of hardware features designed to combat Return-oriented Programming (ROP, also
call/jmp COP/JOP) attacks.

* The `shstk=` boolean controls whether Xen uses Shadow Stacks for its own
  protection.

  The option is available when `CONFIG_XEN_SHSTK` is compiled in, and
  defaults to `true` on hardware supporting CET-SS. Specifying
  `cet=no-shstk` will cause Xen not to use Shadow Stacks even when support
  is available in hardware.

  Shadow Stacks are incompatible with 32bit PV guests. This option will
  override the `pv=32` boolean to false. Backwards compatibility can be
  maintained with the `pv-shim` mechanism.

### clocksource (x86)
> `= pit | hpet | acpi | tsc`

If set, override Xen's default choice for the platform timer.
Having TSC as platform timer requires being explicitly set. This is because
TSC can only be safely used if CPU hotplug isn't performed on the system.
On some platforms, the "maxcpus" option may need to be used to further adjust
the number of allowed CPUs. When running on platforms that can guarantee a
monotonic TSC across sockets you may want to adjust the "tsc" command line
parameter to "stable:socket".

### cmci-threshold (Intel)
> `= <integer>`

> Default: `2`

Specify the event count threshold for raising Corrected Machine Check
Interrupts. Specifying zero disables CMCI handling.

### cmos-rtc-probe (x86)
> `= <boolean>`

> Default: `false`

Flag to indicate whether to probe for a CMOS Real Time Clock irrespective of
ACPI indicating none to be there.

### com1
### com2
> `= <baud>[/<base-baud>][,[DPS][,[<io-base>|pci|amt][,[<irq>|msi][,[<port-bdf>][,[<bridge-bdf>]]]]]]`

Both options `com1` and `com2` follow the same format.

* `<baud>` may be either an integer baud rate, or the string `auto` if
  the bootloader or other earlier firmware has already set it up.
* Optionally, the base baud rate (usually the highest baud rate the
  device can communicate at) can be specified.
* `DPS` represents the number of data bits, the parity, and the number
  of stop bits.
  * `D` is an integer between 5 and 8 for the number of data bits.
  * `P` is a single character representing the type of parity:
    * `n` No
    * `o` Odd
    * `e` Even
    * `m` Mark
    * `s` Space
  * `S` is an integer 1 or 2 for the number of stop bits.
* `<io-base>` is an integer which specifies the IO base port for UART
  registers.
* `<irq>` is the IRQ number to use, or `0` to use the UART in poll
  mode only, or `msi` to set up a Message Signaled Interrupt.
* `<port-bdf>` is the PCI location of the UART, in
  `<bus>:<device>.<function>` notation.
* `<bridge-bdf>` is the PCI bridge behind which is the UART, in
  `<bus>:<device>.<function>` notation.
* `pci` indicates that Xen should scan the PCI bus for the UART,
  avoiding Intel AMT devices.
* `amt` indicates that Xen should scan the PCI bus for the UART,
  including Intel AMT devices if present.

A typical setup for most situations might be `com1=115200,8n1`.

In addition to the above positional specification for UART parameters,
name=value pair specifications are also supported. This is used to add
flexibility for UART devices which require additional UART parameter
configurations.

The comma separation still delineates positional parameters. Hence,
unless the parameter is explicitly specified with a name=value option, it
will be considered a positional parameter.

The syntax consists of
`com1=(comma-separated positional parameters),(comma-separated name=value pairs)`

The accepted name keywords for name=value pairs are:

* `baud` - accepts integer baud rate (e.g. 115200) or `auto`
* `bridge` - Similar to bridge-bdf in positional parameters.
  Used to determine the PCI bridge to access the UART device.
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `clock-hz` - accepts large integers to set up UART clock frequencies.
  Do note - these values are multiplied by 16.
* `data-bits` - integer between 5 and 8
* `dev` - accepted values are `pci` or `amt`. This option
  is used to specify that the serial device is PCI-based. The `io-base`
  cannot be specified when `dev=pci` or `dev=amt` is used.
* `io-base` - accepts an integer which specifies the IO base port for UART
  registers
* `irq` - IRQ number to use
* `parity` - accepted values are same as positional parameters
* `port` - Used to specify which port the PCI serial device is located on
  Notation is xx:xx.x `<bus>:<device>.<function>`
* `reg-shift` - register shifts required to set UART registers
* `reg-width` - register width required to set UART registers
  (only accepts 1 and 4)
* `stop-bits` - only accepts 1 or 2 for the number of stop bits

The following are examples of correct specifications:

    com1=115200,8n1,0x3f8,4
    com1=115200,8n1,0x3f8,4,reg-width=4,reg-shift=2
    com1=baud=115200,parity=n,stop-bits=1,io-base=0x3f8,reg-width=4

### conring_size
> `= <size>`

> Default: `conring_size=16k`

Specify the size of the console ring buffer.

### console
> `= List of [ vga | com1[H,L] | com2[H,L] | pv | dbgp | none ]`

> Default: `console=com1,vga`

Specify which console(s) Xen should use.

`vga` indicates that Xen should try and use the vga graphics adapter.

`com1` and `com2` indicate that Xen should use serial ports 1 and 2
respectively. Optionally, these arguments may be followed by an `H` or
`L`. `H` indicates that transmitted characters will have their MSB
set, while received characters must have their MSB set. `L` indicates
the converse; transmitted and received characters will have their MSB
cleared. This allows a single port to be shared by two subsystems
(e.g. console and debugger).

`pv` indicates that Xen should use Xen's PV console. This option is
only available when used together with `pv-in-pvh`.

`dbgp` indicates that Xen should use a USB debug port.

`none` indicates that Xen should not use a console. This option only
makes sense on its own.
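
Returning to the `com1`/`com2` value format described earlier, the way
positional fields and name=value pairs combine can be sketched in Python. This
is a rough illustration under stated assumptions, not Xen's parser: the
function name `parse_com` is hypothetical, the internal label `dps` is a
placeholder, positional PCI locations are stored under the `port` and `bridge`
keywords, and the optional `/<base-baud>` part is left attached to the baud
string rather than split out.

```python
# Order of the positional fields; "dps" is a hypothetical internal label.
POSITIONAL_NAMES = ["baud", "dps", "io-base", "irq", "port", "bridge"]


def parse_com(value):
    """Split a com1/com2 value into a dict of settings (all values as strings)."""
    settings = {}
    position = 0
    for field in value.split(","):
        name, sep, val = field.partition("=")
        if sep:
            # Explicit name=value pair.
            settings[name] = val
            continue
        # Anything without `=` is treated as the next positional field.
        if position >= len(POSITIONAL_NAMES):
            raise ValueError("too many positional fields")
        if field:  # an empty field just skips that position
            settings[POSITIONAL_NAMES[position]] = field
        position += 1

    # The third positional slot may hold `pci` or `amt` instead of an IO base.
    if settings.get("io-base") in ("pci", "amt"):
        settings["dev"] = settings.pop("io-base")

    # Expand the combined DPS field, e.g. "8n1".
    dps = settings.pop("dps", None)
    if dps is not None:
        if len(dps) != 3:
            raise ValueError(f"malformed DPS field: {dps!r}")
        settings["data-bits"], settings["parity"], settings["stop-bits"] = dps
    return settings
```

With this sketch, `parse_com("115200,8n1,0x3f8,4,reg-width=4,reg-shift=2")`
returns the four positional settings plus the two named ones in a single
dictionary.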

### console_timestamps
> `= none | date | datems | boot | raw`

> Default: `none`

> Can be modified at runtime

Specify which timestamp format Xen should use for each console line.

* `none`: No timestamps
* `date`: Date and time information
    * `[YYYY-MM-DD HH:MM:SS]`
* `datems`: Date and time, with milliseconds
    * `[YYYY-MM-DD HH:MM:SS.mmm]`
* `boot`: Seconds and microseconds since boot
    * `[SSSSSS.uuuuuu]`
* `raw`: Raw platform ticks, architecture and implementation dependent
    * `[XXXXXXXXXXXXXXXX]`

For compatibility with the older boolean parameter, specifying
`console_timestamps` alone will enable the `date` option.

### console_to_ring
> `= <boolean>`

> Default: `false`

Flag to indicate whether all guest console output should be copied
into the console ring buffer.

### conswitch
> `= <switch char>[x]`

> Default: `conswitch=a`

> Can be modified at runtime

Specify which character should be used to switch serial input between
Xen and dom0. The required sequence is `CTRL-<switch char>` three
times.

The optional trailing `x` indicates that Xen should not automatically
switch the console input to dom0 during boot. Any other value,
including omission, causes Xen to automatically switch to the dom0
console during dom0 boot. Use `conswitch=ax` to keep the default switch
character, but for Xen to keep the console.

### core_parking
> `= power | performance`

> Default: `power`

### cpu_type (x86)
> `= arch_perfmon`

If set, force use of the performance counters for oprofile, rather than detecting
available support.

### cpufreq
> `= none | {{ <boolean> | xen } [:[powersave|performance|ondemand|userspace][,<maxfreq>][,[<minfreq>][,[verbose]]]]} | dom0-kernel`

> Default: `xen`

Indicate where the responsibility for driving power states lies.
Note that the choice of `dom0-kernel` is deprecated and not supported by all
Dom0 kernels.

* Default governor policy is ondemand.
* `<maxfreq>` and `<minfreq>` are integers which represent max and min processor frequencies
  respectively.
* `verbose` option can be included as a string or also as `verbose=<integer>`

### cpuid (x86)
> `= List of comma separated booleans`

This option allows for fine tuning of the facilities Xen will use, after
accounting for hardware capabilities as enumerated via CPUID.

Unless otherwise noted, options only have any effect in their negative form,
to hide the named feature(s). Ignoring a feature using this mechanism will
cause Xen not to use the feature, nor offer it as usable to guests.

Currently accepted:

The Speculation Control hardware features `srbds-ctrl`, `md-clear`, `ibrsb`,
`stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and
applicable. They can all be ignored.

`rdrand` and `rdseed` have multiple interactions.

* For Special Register Buffer Data Sampling (SRBDS, XSA-320, CVE-2020-0543),
  RDRAND and RDSEED can be ignored.

  Due to the absence of microcode to address SRBDS on IvyBridge client
  hardware, the RDRAND feature is hidden by default for guests, unless
  `rdrand` is used in its positive form. Irrespective of the setting here,
  VMs can use RDRAND if explicitly enabled in guest config file, and VMs
  already using RDRAND can migrate in.

* The RDRAND feature is disabled by default on AMD Fam15/16 systems, due to
  possible malfunctions after ACPI S3 suspend/resume. `rdrand` may be used
  in its positive form to override Xen's default behaviour on these systems,
  and make the feature fully usable.

### cpuid_mask_cpu
> `= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b`

> Applicability: AMD

If none of the other **cpuid_mask_\*** options are given, Xen has a set of
pre-configured masks to make the current processor appear to be the
family/revision specified.

See below for general information on masking.

**Warning: This option is not fully effective on Family 15h processors or
later.**

### cpuid_mask_ecx
### cpuid_mask_edx
### cpuid_mask_ext_ecx
### cpuid_mask_ext_edx
### cpuid_mask_l7s0_eax
### cpuid_mask_l7s0_ebx
### cpuid_mask_thermal_ecx
### cpuid_mask_xsave_eax
> `= <integer>`

> Applicability: x86. Default: `~0` (all bits set)

The availability of these options is model specific. Some processors don't
support any of them, and no processor supports all of them. Xen will ignore
options on processors which are lacking support.

These options can be used to alter the features visible via the `CPUID`
instruction. Settings applied here take effect globally, including for Xen
and all guests.

Note: Since Xen 4.7, it is no longer necessary to mask a host to create
migration safety in heterogeneous scenarios. All necessary CPUID settings
should be provided in the VM configuration file. Furthermore, it is
recommended not to use this option, as doing so causes an unnecessary
reduction of features at Xen's disposal to manage guests.

### cpuidle (x86)
> `= <boolean>`

### cpuinfo (x86)
> `= <boolean>`

### crashinfo_maxaddr
> `= <size>`

> Default: `4G`

Specify the maximum address to allocate certain structures, if used in
combination with the **low_crashinfo** command line option.

### crashkernel
> `= <ramsize-range>:<size>[,...][{@,<}<offset>]`
> `= <size>[{@,<}<offset>]`
> `= <size>,below=offset`

Specify sizes and optionally placement of the crash kernel reservation
area.
The `<ramsize-range>:<size>` pairs indicate how much memory to
set aside for a crash kernel (`<size>`) for a given range of installed
RAM (`<ramsize-range>`). Each `<ramsize-range>` is of the form
`<start>-[<end>]`.

A trailing `@<offset>` specifies the exact address this area should be
placed at, whereas `<` in place of `@` just specifies an upper bound of
the address range the area should fall into.

`<` and `below` are synonymous, the latter being useful for grub2 systems
which would otherwise require escaping of the `<` option.

### credit2_balance_over
> `= <integer>`

### credit2_balance_under
> `= <integer>`

### credit2_cap_period_ms
> `= <integer>`

> Default: `10`

Domains subject to a cap receive a replenishment of their runtime budget
once every cap period interval. Default is 10 ms. The amount of budget
they receive depends on their cap. For instance, a domain with a 50% cap
will receive 50% of 10 ms, so 5 ms.

### credit2_load_precision_shift
> `= <integer>`

> Default: `18`

Specify the number of bits to use for the fractional part of the
values involved in Credit2 load tracking and load balancing math.

### credit2_load_window_shift
> `= <integer>`

> Default: `30`

Specify the number of bits to use to represent the length of the
window (in nanoseconds) we use for load tracking inside Credit2.
This means that, with the default value (30), we use a
2^30 nsec ~= 1 sec long window.

Load tracking is done by means of a variation of exponentially
weighted moving average (EWMA). The window length defined here
is what tells for how long we give value to previous history
of the load itself. In fact, after a full window has passed,
what happens is that we discard all previous history entirely.
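
The arithmetic behind the cap period and the load window can be checked with
a few lines of Python. This is just a worked calculation of the figures quoted
above; the helper names `cap_budget_ms` and `load_window_seconds` are
hypothetical and do not correspond to scheduler functions.

```python
def cap_budget_ms(cap_percent, period_ms=10):
    """Budget replenished each cap period, given credit2_cap_period_ms."""
    return period_ms * cap_percent / 100


def load_window_seconds(shift=30):
    """Length of the load tracking window for a given credit2_load_window_shift."""
    return (1 << shift) / 1e9  # the window is 2^shift nanoseconds


print(cap_budget_ms(50))        # budget in ms for a 50% cap with the default period
print(load_window_seconds())    # default window, slightly over one second
print(load_window_seconds(28))  # a shorter window, roughly a quarter of a second
```

Each step down in the shift halves the window, so a small change to
`credit2_load_window_shift` has a large effect on how much history is kept.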

A short window will make the load balancer quick at reacting
to load changes, but also short-sighted about previous history
(and hence, e.g., long term load trends). A long window will
make the load balancer thoughtful of previous history (and
hence capable of capturing, e.g., long term load trends), but
also slow in responding to load changes.

The default value of `1 sec` is rather long.

### credit2_runqueue
> `= cpu | core | socket | node | all`

> Default: `socket`

Specify how host CPUs are arranged in runqueues. Runqueues are kept
balanced with respect to the load generated by the vCPUs running on
them. Smaller runqueues (as with `core`) mean more accurate load
balancing (for instance, it will deal better with hyperthreading),
but also more overhead.

Available alternatives, with their meaning, are:

* `cpu`: one runqueue per logical pCPU of the host;
* `core`: one runqueue per physical core of the host;
* `socket`: one runqueue per physical socket (which often,
            but not always, matches a NUMA node) of the host;
* `node`: one runqueue per NUMA node of the host;
* `all`: just one runqueue shared by all the logical pCPUs of
         the host

### dbgp
> `= ehci[ <integer> | @pci<bus>:<slot>.<func> ]`

Specify the USB controller to use, either by instance number (when going
over the PCI busses sequentially) or by PCI device (must be on segment 0).

### debug_stack_lines
> `= <integer>`

> Default: `20`

Limits the number of lines printed in Xen stack traces.

### debugtrace
> `= [cpu:]<size>`

> Default: `128`

Specify the size of the console debug trace buffer. By specifying `cpu:`
additionally a trace buffer of the specified size is allocated per cpu.
The debug trace feature is only enabled in debugging builds of Xen.

### dma_bits
> `= <integer>`

Specify the bit width of the DMA heap.

### dom0
    = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
                cpuid-faulting=<bool> ]

    Applicability: x86

Controls for how dom0 is constructed on x86 systems.

* The `pv` and `pvh` options select the virtualisation mode of dom0.

  The `pv` option is only available when `CONFIG_PV` is compiled in. The
  `pvh` option is only available when `CONFIG_HVM` is compiled in. When
  both options are compiled in, the default is PV.

  In addition, the following requirements must be met:

  * The dom0 kernel selected by the boot loader must be capable of the
    selected mode.
  * For a PVH dom0, the hardware must have VT-x/SVM extensions available.

* The `shadow` boolean allows dom0 to be explicitly constructed using shadow
  paging. This option is unavailable when `CONFIG_SHADOW_PAGING` is
  disabled.

  For PVH, dom0 defaults to using HAP on capable hardware, and falls back to
  shadow paging otherwise. A PVH dom0 cannot be used if Xen is compiled
  without shadow paging support, and the hardware lacks HAP support.

  For PV, the use of dom0 shadow mode is only for development purposes. PV
  guests do not require any paging support by default.

* The `verbose` boolean is intended for diagnostics, and prints out extra
  information during the dom0 build. It defaults to the compile time choice
  of `CONFIG_VERBOSE_DEBUG`.

* The `cpuid-faulting` boolean is an interim option, is only applicable to
  PV dom0, and defaults to true.

  Before Xen 4.13, the domain builder logic for guest construction depended
  on seeing host CPUID values to function correctly. As a result, CPUID
  Faulting was never activated for PV dom0's, even on capable hardware.

  In Xen 4.13, the domain builder logic has been fixed, and no longer has
  this dependency.
  As a consequence, CPUID Faulting is activated by default
  even for PV dom0's.

  However, as PV dom0's have always seen host CPUID data in the past, there
  is a chance that further dependencies exist. This boolean can be used to
  restore the pre-4.13 behaviour. If specifying `no-cpuid-faulting` fixes
  an issue in dom0, please report a bug.

### dom0-iommu
    = List of [ passthrough=<bool>, strict=<bool>, map-inclusive=<bool>,
                map-reserved=<bool>, none ]

Controls for the dom0 IOMMU setup.

* The `passthrough` boolean controls whether IOMMU translation functionality
  is disabled for devices in dom0 (`passthrough=1`) or whether the IOMMU is
  used to ensure that dom0 can only DMA to its permitted areas of RAM
  (`passthrough=0`).

  This option is only applicable to x86 PV dom0's, and defaults to false.

  Some older Intel VT-d hardware isn't capable of disabling translation
  functionality on a per-device basis, and will cause this option to be
  ignored and assumed to be 0. Similar behaviour on such systems is only
  available by fully disabling all IOMMUs.

  This option is hardwired to false for x86 PVH dom0's (where a non-identity
  transform is required for dom0 to function), and is ignored for ARM.

* The `strict` boolean is applicable to x86 PV dom0's only and defaults to
  false. It controls whether dom0 can have IOMMU mappings for all domain
  RAM in the system, or only for its allocated RAM (and grant mappings etc.)

  This option is hardwired to true for x86 PVH dom0's (as RAM belonging to
  other domains in the system doesn't live in a compatible address space), and
  is ignored for ARM.

* The `map-inclusive` boolean is applicable to x86 PV dom0's, and sets up
  identity IOMMU mappings for all non-RAM regions below 4GB except for
  unusable ranges, and ranges belonging to Xen.

  Typically, some devices in a system use bits of RAM for communication, and
  these areas should be listed as reserved in the E820 table and identified
  via RMRR or IVMD entries in the ACPI tables, so Xen can ensure that they
  are identity-mapped in the IOMMU. However, some firmware makes mistakes,
  and this option is a coarse-grain workaround for those errors.

  Where possible, finer grain corrections should be made with the `rmrr=`,
  `ivrs_hpet=` or `ivrs_ioapic=` command line options.

  This option is disabled by default, and deprecated and intended for
  removal in future versions of Xen. If specifying `map-inclusive` is the
  only way to make your system boot, please report a bug.

* The `map-reserved` functionality is very similar to `map-inclusive`.

  The differences from `map-inclusive` are that `map-reserved` is applicable
  to both x86 PV and PVH dom0's, is enabled by default, and represents a
  subset of the correction by only mapping reserved memory regions rather
  than all non-RAM regions.

* The `none` option is intended for development purposes only, and skips
  certain safety checks pertaining to the correct IOMMU configuration for
  dom0 to boot.

  Incorrect use of this option may result in a malfunctioning system.

### dom0_ioports_disable (x86)
> `= List of <hex>-<hex>`

Specify a list of IO ports to be excluded from dom0 access.

### dom0_max_vcpus

Either:

> `= <integer>`.

The number of VCPUs to give to dom0. This number of VCPUs can be more
than the number of PCPUs on the host. The default is the number of
PCPUs.

Or:

> `= <min>-<max>` where `<min>` and `<max>` are integers.

Gives dom0 a number of VCPUs equal to the number of PCPUs, but always
at least `<min>` and no more than `<max>`. Using `<min>` may give
more VCPUs than PCPUs.
`<min>` or `<max>` may be omitted and the
defaults of 1 and unlimited respectively are used instead.

For example, with `dom0_max_vcpus=4-8`:

| Number of PCPUs | Dom0 VCPUs |
|-----------------|------------|
| 2               | 4          |
| 4               | 4          |
| 6               | 6          |
| 8               | 8          |
| 10              | 8          |

### dom0_mem (ARM)
> `= <size>`

Set the amount of memory for the initial domain (dom0). It must be
greater than zero. This parameter is required.

### dom0_mem (x86)
> `= List of ( min:<sz> | max:<sz> | <sz> )`

Set the amount of memory for the initial domain (dom0). If a size is
positive, it represents an absolute value. If a size is negative, it
is subtracted from the total available memory.

* `<sz>` specifies the exact amount of memory.
* `min:<sz>` specifies the minimum amount of memory.
* `max:<sz>` specifies the maximum amount of memory.

If `<sz>` is not specified, the default is all the available memory
minus some reserve. The reserve is 1/16 of the available memory or
128 MB (whichever is smaller).

The amount of memory will be at least the minimum but never more than
the maximum (i.e., `max` overrides the `min` option). If there isn't
enough memory then as much as possible is allocated.

`max:<sz>` also sets the maximum reservation (the maximum amount of
memory dom0 can balloon up to). If this is omitted then the maximum
reservation is unlimited.

For example, to set dom0's initial memory allocation to 512MB but
allow it to balloon up as far as 1GB use `dom0_mem=512M,max:1G`

> `<sz>` is: `<size> | [<size>+]<frac>%`
> `<frac>` is an integer < 100

* `<frac>` specifies a fraction of host memory size in percent.

So `<sz>` being `1G+25%` on a 256 GB host would result in 65 GB.

If you use this option then it is highly recommended that you disable
any dom0 autoballooning feature present in your toolstack. See the
See the _xl.conf(5)_ man page or [Xen Best
Practices](https://wiki.xen.org/wiki/Xen_Best_Practices#Xen_dom0_dedicated_memory_and_preventing_dom0_memory_ballooning).

This option has no effect if pv-shim mode is enabled.

### dom0_nodes (x86)

> `= List of [ <integer> | relaxed | strict ]`

> Default: `strict`

Specify the NUMA nodes to place Dom0 on. Defaults for vCPUs created
and memory assigned to Dom0 will be adjusted to match the node
restrictions set up here. Note that the values to be specified here are
ACPI PXM ones, not Xen internal node numbers. `relaxed` sets up vCPU
affinities to prefer, but not be limited to, the specified node(s).

### dom0_vcpus_pin
> `= <boolean>`

> Default: `false`

Pin dom0 vcpus to their respective pcpus.

### dtuart (ARM)
> `= path [:options]`

> Default: `""`

Specify the full path in the device tree for the UART. If the path doesn't
start with `/`, it is assumed to be an alias. The options are device specific.

### e820-mtrr-clip (x86)
> `= <boolean>`

Flag that specifies if RAM should be clipped to the highest cacheable
MTRR.

> Default: `true` on Intel CPUs, otherwise `false`

### e820-verbose (x86)
> `= <boolean>`

> Default: `false`

Flag that enables verbose output when processing e820 information and
applying clipping.

### edd (x86)
> `= off | on | skipmbr`

Control retrieval of Enhanced Disk Drive (EDD) data from the BIOS during
boot.

### edid (x86)
> `= no | force`

Either force retrieval of monitor EDID information via VESA DDC, or
disable it (`edid=no`). This option should not normally be required
except for debugging purposes.

### efi
    = List of [ rs=<bool>, attr=no|uc ]

Controls for interacting with the system Extensible Firmware Interface.

*   The `rs` boolean controls whether Runtime Services are used.
    By default,
    Xen uses Runtime Services itself, and proxies certain calls on behalf of
    dom0. Selecting `rs=0` prohibits all use of Runtime Services.

*   The `attr=` string exists to specify what to do with memory regions of
    unknown/unrecognised cacheability. `attr=no` is the default and will
    leave the memory regions unmapped, while `attr=uc` will map them as fully
    uncacheable.

### ept
> `= List of [ ad=<bool>, pml=<bool>, exec-sp=<bool> ]`

> Applicability: Intel

Extended Page Tables are a feature of Intel's VT-x technology, whereby
hardware manages the virtualisation of HVM guest pagetables. EPT was
introduced with the Nehalem architecture.

*   The `ad` boolean controls hardware tracking of Access and Dirty bits in the
    EPT pagetables, and was first introduced in Broadwell Server.

    By default, Xen will use A/D tracking when available in hardware, except
    on Avoton processors affected by erratum AVR41. Explicitly choosing
    `ad=0` will disable the use of A/D tracking on capable hardware, whereas
    choosing `ad=1` will cause tracking to be used even on AVR41-affected
    hardware.

*   The `pml` boolean controls the use of Page Modification Logging, which was
    also introduced in Broadwell Server.

    PML is a feature whereby the processor generates a list of pages which
    have been dirtied. This is necessary information for operations such as
    live migration, and having the processor maintain the list of dirtied
    pages is more efficient than traditional software implementations where
    all guest writes trap into Xen so the dirty bitmap can be maintained.

    By default, Xen will use PML when it is available in hardware. PML
    functionally depends on A/D tracking, so choosing `ad=0` will implicitly
    disable PML. `pml=0` can be used to prevent the use of PML on otherwise
    capable hardware.

*   The `exec-sp` boolean controls whether EPT superpages with execute
    permissions are permitted. In general this is good for performance.

    However, on processors vulnerable to CVE-2018-12207, HVM guest kernels can
    use executable superpages to crash the host. By default, executable
    superpages are disabled on affected hardware.

    If HVM guest kernels are trusted not to mount a DoS against the system,
    this option can be enabled to regain performance.

    This boolean may be modified at runtime using
    `xl set-parameters ept=[no-]exec-sp` to switch between fast and secure.

    *   When switching from secure to fast, preexisting HVM domains will run
        at their current performance until they are rebooted; new domains will
        run without any overhead.

    *   When switching from fast to secure, all HVM domains will immediately
        suffer a performance penalty.

    **Warning: No guarantee is made that this runtime option will be retained
    indefinitely, or that it will retain this exact behaviour. It is
    intended as an emergency option for people who first chose fast, then
    change their minds to secure, and wish not to reboot.**

### extra_guest_irqs
> `= [<domU number>][,<dom0 number>]`

> Default: `32,<variable>`

Change the number of PIRQs available for guests. The optional first number is
common for all domUs, while the optional second number (preceded by a comma)
is for dom0. Changing the setting for domU has no impact on dom0 and vice
versa. For example, to change dom0 without changing domU, use
`extra_guest_irqs=,512`. The default value for Dom0 and an eventual separate
hardware domain is architecture dependent.
Note that specifying zero as the domU value means zero, while for dom0 it
means to use the default.
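
The `extra_guest_irqs` rules can be sketched in a few lines of Python. This is
only an illustration of the semantics described here, not Xen's parser: the
function name `parse_extra_guest_irqs` is hypothetical, only decimal numbers
are handled, and the architecture-dependent dom0 default is represented by
`None` because its value is not given in this document.

```python
from typing import Optional, Tuple

DOMU_DEFAULT = 32      # documented default for domUs
DOM0_DEFAULT = None    # placeholder: the real default is architecture dependent


def parse_extra_guest_irqs(value: str) -> Tuple[int, Optional[int]]:
    """Illustrative parse of "[<domU number>][,<dom0 number>]" (hypothetical helper)."""
    domu_text, _, dom0_text = value.partition(",")
    # An omitted number leaves the corresponding default untouched.
    domu = int(domu_text) if domu_text else DOMU_DEFAULT
    dom0 = int(dom0_text) if dom0_text else DOM0_DEFAULT
    # Zero means "zero" for domUs, but "use the default" for dom0.
    if dom0 == 0:
        dom0 = DOM0_DEFAULT
    return domu, dom0
```

With this sketch, `parse_extra_guest_irqs(",512")` leaves the domU value at 32
and sets dom0 to 512, matching the example in the text.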

### flask
> `= permissive | enforcing | late | disabled`

> Default: `enforcing`

Specify how the FLASK security server should be configured. This option is only
available if the hypervisor was compiled with FLASK support. This can be
enabled by running either:

- `make -C xen config` and enabling XSM and FLASK.
- `make -C xen menuconfig` and enabling 'FLux Advanced Security Kernel support' and 'Xen Security Modules support'.

* `permissive`: This is intended for development and is not suitable for use
  with untrusted guests. If a policy is provided by the bootloader, it will be
  loaded; errors will be reported to the ring buffer but will not prevent
  booting. The policy can be changed to enforcing mode using "xl setenforce".
* `enforcing`: This will cause the security server to enter enforcing mode prior
  to the creation of domain 0. If a valid policy is not provided by the
  bootloader and no built-in policy is present, the hypervisor will not continue
  booting.
* `late`: This disables loading of the built-in security policy or the policy
  provided by the bootloader. FLASK will be enabled but will not enforce access
  controls until a policy is loaded by a domain using "xl loadpolicy". Once a
  policy is loaded, FLASK will run in enforcing mode unless "xl setenforce" has
  changed that setting.
* `disabled`: This causes the XSM framework to revert to the dummy module. The
  dummy module provides the same security policy as is used when compiling the
  hypervisor without support for XSM. The xsm_op hypercall can also be used to
  switch to this mode after boot, but there is no way to re-enable FLASK once
  the dummy module is loaded.

### font
> `= <height>` where height is `8x8 | 8x14 | 8x16`

Specify the font size when using the VESA console driver.

### force-ept (Intel)
> `= <boolean>`

> Default: `false`

Allow EPT to be enabled when VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is not
present.

*Warning:*
Due to CVE-2013-2212, VMX feature `VM_ENTRY_LOAD_GUEST_PAT` is by default
required as a prerequisite for using EPT. If you are not using PCI Passthrough,
or trust the guest administrator who would be using passthrough, then the
requirement can be relaxed. This option is particularly useful for nested
virtualization, to allow the L1 hypervisor to use EPT even if the L0 hypervisor
does not provide `VM_ENTRY_LOAD_GUEST_PAT`.

### gdb
> `= com1[H,L] | com2[H,L] | dbgp`

> Default: ``

Specify which console gdbstub should use. See **console**.

### gnttab
> `= List of [ max-ver:<integer>, transitive=<bool> ]`

> Default: `gnttab=max-ver:2,transitive`

Control various aspects of the grant table behaviour available to guests.

* `max-ver` Select the maximum grant table version to offer to guests. Valid
  versions are 1 and 2.
* `transitive` Permit or disallow the use of transitive grants. Note that the
  use of grant table v2 without transitive grants is an ABI breakage from the
  guest's point of view.

The usage of gnttab v2 is not security supported on ARM platforms.

### gnttab_max_frames
> `= <integer>`

> Default: `64`

> Can be modified at runtime

Specify the maximum number of frames which any domain may use as part
of its grant table. This value is an upper boundary of the per-domain
value settable via Xen tools.

Dom0 uses this value to size its grant table.

### gnttab_max_maptrack_frames
> `= <integer>`

> Default: `1024`

> Can be modified at runtime

Specify the maximum number of frames to use as part of a domain's
maptrack array.
This value is an upper boundary of the per-domain
value settable via Xen tools.

Dom0 uses this value to size its maptrack table.

### global-pages
    = <boolean>

    Applicability: x86
    Default: true unless running virtualized on AMD or Hygon hardware

Control whether to use global pages for PV guests, and thus the need to
perform TLB flushes by writing to CR4. This is a performance trade-off.

AMD SVM does not support selective trapping of CR4 writes, which means that a
global TLB flush (two CR4 writes) takes two VMExits, and massively outweighs
the benefit of using global pages to begin with. This case is easy for Xen to
spot, and is accounted for in the default setting.

Other cases where this option might be a benefit are on VT-x hardware when
selective CR4 writes are not supported/enabled by the hypervisor, or in any
virtualised case using shadow paging. These are not easy for Xen to spot, so
are not accounted for in the default setting.

### guest_loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `guest_loglvl=none/warning`

> Can be modified at runtime

Set the logging level for Xen guests. Any log message with equal or
more importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### hap (x86)
> `= <boolean>`

> Default: `true`

Flag to globally enable or disable support for Hardware Assisted
Paging (HAP).

### hap_1gb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 1 GB host page table support for Hardware Assisted
Paging (HAP).

### hap_2mb (x86)
> `= <boolean>`

> Default: `true`

Flag to enable 2 MB host page table support for Hardware Assisted
Paging (HAP).

### hardware_dom
> `= <domid>`

> Default: `0`

Enable late hardware domain creation using the specified domain ID. This is
intended to be used when domain 0 is a stub domain which builds a disaggregated
system including a hardware domain with the specified domain ID. This option is
supported only when compiled with XSM on x86.

### hest_disable
> `= <boolean>`

> Default: `false`

Control Xen's use of the APEI Hardware Error Source Table, should one be found.

### highmem-start (x86)
> `= <size>`

Specify the memory boundary past which memory will be treated as highmem (x86
debug hypervisor only).

### hmp-unsafe (arm)
> `= <boolean>`

> Default: `false`

Say yes at your own risk if you want to enable heterogeneous computing
(such as big.LITTLE). This may result in an unstable and insecure
platform, unless you manually specify the cpu affinity of all domains so
that all vcpus are scheduled on the same class of pcpus (big or LITTLE
but not both). vcpu migration between big cores and LITTLE cores is not
supported. See docs/misc/arm/big.LITTLE.txt for more information.

When the hmp-unsafe option is disabled (default), CPUs that are not
identical to the boot CPU will be parked and not used by Xen.

### hpetbroadcast (x86)
> `= <boolean>`

### hvm_debug (x86)
> `= <integer>`

The specified value is a bit mask with the individual bits having the
following meaning:

> Bit  0 - debug level 0 (unused at present)
> Bit  1 - debug level 1 (Control Register logging)
> Bit  2 - debug level 2 (VMX logging of MSR restores when context switching)
> Bit  3 - debug level 3 (unused at present)
> Bit  4 - I/O operation logging
> Bit  5 - vMMU logging
> Bit  6 - vLAPIC general logging
> Bit  7 - vLAPIC timer logging
> Bit  8 - vLAPIC interrupt logging
> Bit  9 - vIOAPIC logging
> Bit 10 - hypercall logging
> Bit 11 - MSR operation logging

Recognized in debug builds of the hypervisor only.

### hvm_fep (x86)
> `= <boolean>`

> Default: `false`

Allow use of the Forced Emulation Prefix in HVM guests, to allow emulation of
arbitrary instructions.

This option is intended for development and testing purposes.

*Warning*
As this feature opens up the instruction emulator to arbitrary
instructions from an HVM guest, do not use it in production systems. No
security support is provided when this flag is set.

### hvm_port80 (x86)
> `= <boolean>`

> Default: `true`

Specify whether guests are to be given access to physical port 80
(often used for debugging purposes), to override the DMI based
detection of systems known to misbehave upon accesses to that port.

### idle_latency_factor (x86)
> `= <integer>`

### ioapic_ack (x86)
> `= old | new`

> Default: `new` unless directed-EOI is supported

### iommu
    = List of [ <bool>, verbose, debug, force, required, quarantine,
                sharept, intremap, intpost, crash-disable,
                snoop, qinval, igfx, amd-iommu-perdev-intremap,
                dom0-{passthrough,strict} ]

    All sub-options are boolean in nature.

I/O Memory Management Units perform a function similar to the CPU MMU (hence
the name), but typically exist as a discrete device, integrated as part of a PCI
Root Complex. The most common configuration is to have one IOMMU per package
(for on-die PCIe devices and directly attached PCIe lanes), and one IOMMU
covering the remaining I/O in the system.

The functionality in an IOMMU commonly falls into two orthogonal categories:

1.  DMA remapping which uses a pagetable-like hierarchical structure and maps
    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
    terminology).

2.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
    requests, including their routing to specific CPUs.

IOMMU functionality can be used to provide a translation which the hardware
device driver isn't aware of (e.g. PCI Passthrough and a native driver inside
the guest) and/or to enforce fine-grained control over the memory and
interrupts which a device is attempting to access.

By default, IOMMUs are configured for use if they are available. An overall
boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.

*   The `verbose` and `debug` booleans can be used to print additional
    diagnostic information. Neither is active by default.

*   The `force` and `required` booleans are synonymous and, when requested,
    will prevent Xen from booting if IOMMUs aren't discovered and enabled
    successfully.

*   The `quarantine` boolean can be used to control Xen's behavior when
    de-assigning devices from guests. If enabled (the default), Xen always
    quarantines such devices; they must be explicitly assigned back to Dom0
    before they can be used there again.
    If disabled, Xen will only
    quarantine devices the toolstack has arranged for getting quarantined.

*   The `sharept` boolean controls whether the IOMMU pagetables are shared
    with the CPU-side HAP pagetables, or allocated separately. Sharing
    reduces the memory overhead, but doesn't work in combination with CPU-side
    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
    assigned.

    Due to implementation choices, sharing pagetables doesn't work on AMD
    hardware, and this option is ignored. It is enabled by default on Intel
    systems.

    This option is ignored on ARM, and the pagetables are always shared.

*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and
    is active by default on compatible hardware. On x86 systems, the first
    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
    appeared in the second generation.

    This option is only valid on x86.

*   The `intpost` boolean controls the Posted Interrupt sub-feature. In
    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
    be configured to deliver interrupts from assigned PCI devices directly
    into the guest, without trapping out into hypervisor context.

    This option depends on `intremap`, and is disabled by default due to some
    corner cases in the implementation which have yet to be resolved.

    This option is only valid on x86, and only builds of Xen with HVM support.

*   The `crash-disable` boolean controls disabling IOMMU functionality (DMAR/IR/QI)
    before switching to a crash kernel. This option is inactive by default and
    is for compatibility with older kdump kernels only. Modern kernels copy
    all the necessary tables from the previous one following kexec which makes
    the transition transparent for them with IOMMU functions still on.

The following options are specific to Intel VT-d hardware:

*   The `snoop` boolean controls the Snoop Control sub-feature, and is active
    by default on compatible hardware.

    An incoming DMA request may specify _Snooped_ (query the CPU caches for
    the appropriate lines) or _Non-Snooped_ (don't query the CPU caches).
    _Non-Snooped_ accesses incur less latency, but behind-the-scenes
    hypervisor activity can invalidate the expectations of the device driver,
    and Snoop Control allows the hypervisor to force DMA requests to be
    _Snooped_ when they would otherwise not be.

*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
    active by default on compatible hardware. Queued Invalidation is a
    feature in second-generation IOMMUs and is a functional prerequisite for
    Interrupt Remapping.

*   The `igfx` boolean is active by default, and controls whether the IOMMU in
    front of an Intel Graphics Device is enabled or not.

    It is intended as a debugging mechanism for graphics issues, and to be
    similar to Linux's `intel_iommu=igfx_off` option. If specifying `no-igfx`
    fixes anything, please report the problem.

The following options are specific to AMD-Vi hardware:

*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
    remapping table is per device (the default), or a single global table for
    the entire system.

    Using a global table is not security supported as it allows all devices to
    impersonate each other as far as interrupts are concerned (see XSA-36), but
    it is a workaround for SP5100 Erratum 28.

**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
deprecated, and superseded by _dom0-iommu={passthrough,strict}_ respectively -
using both the old and new command line options in combination is undefined.**

### iommu_dev_iotlb_timeout
> `= <integer>`

> Default: `1000`

Specify the timeout of the device IOTLB invalidation in milliseconds.
By default, the timeout is 1000 ms. When you see error 'Queue invalidate
wait descriptor timed out', try increasing this value.

### iommu_inclusive_mapping
> `= <boolean>`

**WARNING: This command line option is deprecated, and superseded by
_dom0-iommu=map-inclusive_ - using both options in combination is undefined.**

### irq_ratelimit (x86)
> `= <integer>`

### irq_vector_map (x86)

### ivrs_hpet[`<hpet>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of HPET
`<hpet>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### ivrs_ioapic[`<ioapic>`] (AMD)
> `=[<seg>:]<bus>:<device>.<func>`

Force the use of `[<seg>:]<bus>:<device>.<func>` as device ID of IO-APIC
`<ioapic>` instead of the one specified by the IVHD sub-tables of the IVRS
ACPI table.

### lapic (x86)
> `= <boolean>`

Force the use of the local APIC on a uniprocessor system, even
if left disabled by the BIOS.

### lapic_timer_c2_ok (x86)
> `= <boolean>`

### ler (x86)
> `= <boolean>`

> Default: `false`

This option is intended for debugging purposes only. Enable MSR_DEBUGCTL.LBR
in hypervisor context to be able to dump the Last Interrupt/Exception To/From
record with other registers.

### loglvl
> `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`

> Default: `loglvl=warning`

> Can be modified at runtime

Set the logging level for Xen. Any log message with equal or more
importance will be printed.

The optional `<rate-limited level>` option instructs which severities
should be rate limited.

### low_crashinfo
> `= none | min | all`

> Default: `none` if not specified at all, or to `min` if **low_crashinfo** is present without qualification.

This option is only useful for hosts with a 32bit dom0 kernel, wishing
to use kexec functionality in the case of a crash. It represents
which data structures should be deliberately allocated in low memory,
so the crash kernel may find them. Should be used in combination
with **crashinfo_maxaddr**.

### low_mem_virq_limit
> `= <size>`

> Default: `64M`

Specify the threshold below which Xen will inform dom0 that the quantity of
free memory is getting low. Specifying `0` will disable this notification.

### maxcpus (x86)
> `= <integer>`

Specify the maximum number of CPUs that should be brought up.

This option is ignored in **pv-shim** mode.

### max_cstate (x86)
> `= <integer>[,<integer>]`

Specify the deepest C-state CPUs are permitted to be placed in, and
optionally the maximum sub C-state to be used. The latter only applies
to the highest permitted C-state.

### max_gsi_irqs (x86)
> `= <integer>`

Specifies the number of interrupts to be used for pin (IO-APIC or legacy PIC)
based interrupts. Any higher IRQs will be available for use via PCI MSI.

### max_lpi_bits (arm)
> `= <integer>`

Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
presented as the number of bits needed to encode it.
This must be at least
14 and not exceed 32, and each LPI requires one byte (configuration) and
one pending bit to be allocated.
Defaults to 20 bits (to cover at most 1048576 interrupts).

### mce (x86)
> `= <integer>`

### mce_fb (Intel)
> `= <integer>`

### mce_verbosity (x86)
> `= verbose`

Specify verbose machine check output.

### mem (x86)
> `= <size>`

Specify the maximum address of physical RAM. Any RAM beyond this
limit is ignored by Xen.

### memop-max-order
> `= [<domU>][,[<ctldom>][,[<hwdom>][,<ptdom>]]]`

> x86 default: `9,18,12,12`
> ARM default: `9,18,10,10`

Change the maximum order permitted for allocation (or allocation-like)
requests issued by the various kinds of domains (in this order:
ordinary DomU, control domain, hardware domain, and - when supported
by the platform - DomU with pass-through device assigned).

### mmcfg (x86)
> `= <boolean>[,amd-fam10]`

> Default: `1`

Specify if the MMConfig space should be enabled.

### mmio-relax (x86)
> `= <boolean> | all`

> Default: `false`

By default, domains may not create cached mappings to MMIO regions.
This option relaxes the check for Domain 0 (or when using `all`, all PV
domains), to permit the use of cacheable MMIO mappings.

### msi (x86)
> `= <boolean>`

> Default: `true`

Force Xen to (not) use PCI-MSI, even if ACPI FADT says otherwise.

### mtrr.show (x86)
> `= <boolean>`

> Default: `false`

Print boot time MTRR state.

### mwait-idle (x86)
> `= <boolean>`

> Default: `true`

Use the MWAIT idle driver (with model specific C-state knowledge) instead
of the ACPI based one.

### nmi (x86)
> `= ignore | dom0 | fatal`

> Default: `fatal` for a debug build, or `dom0` for a non-debug build

Specify what Xen should do in the event of an NMI parity or I/O error.
`ignore` discards the error; `dom0` causes Xen to report the error to
dom0, while `fatal` causes Xen to print diagnostics and then hang.

### noapic (x86)

Instruct Xen to ignore any IOAPICs that are present in the system, and
instead continue to use the legacy PIC. This is _not_ recommended with
pvops type kernels.

Because responsibility for APIC setup is shared between Xen and the
domain 0 kernel this option is automatically propagated to the domain
0 command line.

### invpcid (x86)
> `= <boolean>`

> Default: `true`

By default, Xen will use the INVPCID instruction for TLB management if
it is available. This option can be used to cause Xen to fall back to
older mechanisms, which are generally slower.

### noirqbalance (x86)
> `= <boolean>`

Disable software IRQ balancing and affinity. This can be used on
systems such as Dell 1850/2850 that have workarounds in hardware for
IRQ routing issues.

### nolapic (x86)
> `= <boolean>`

> Default: `false`

Ignore the local APIC on a uniprocessor system, even if enabled by the
BIOS.

### no-real-mode (x86)
> `= <boolean>`

Do not execute real-mode bootstrap code when booting Xen. This option
should not be used except for debugging. It will effectively disable
the **vga** option, which relies on real mode to set the video mode.

### noreboot
> `= <boolean>`

Do not automatically reboot after an error. This is useful for
catching debug output. Defaults to automatically reboot after 5
seconds.

### nosmp (x86)
> `= <boolean>`

Disable SMP support. No secondary processors will be booted.
Defaults to booting secondary processors.

This option is ignored in **pv-shim** mode.

### nr_irqs (x86)
> `= <integer>`

### numa (x86)
> `= on | off | fake=<integer> | noacpi`

> Default: `on`

### pci
    = List of [ serr=<bool>, perr=<bool> ]

    Default: Signaling left as set by firmware.

Override the firmware settings, and explicitly enable or disable the
signalling of PCI System and Parity errors.

### pci-phantom
> `=[<seg>:]<bus>:<device>,<stride>`

Mark a group of PCI devices as using phantom functions without actually
advertising so, so the IOMMU can create translation contexts for them.

All numbers specified must be hexadecimal ones.

This option can be specified more than once (up to 8 times at present).

### pcid (x86)
> `= <boolean> | xpti=<bool>`

> Default: `xpti`

> Can be modified at runtime (change takes effect only for domains created
  afterwards)

If available, control usage of the PCID feature of the processor for
64-bit pv-domains. PCID can be used either for no domain at all (`false`),
for all of them (`true`), only for those subject to XPTI (`xpti`) or for
those not subject to XPTI (`no-xpti`). The feature is used only in case
INVPCID is supported and not disabled via `invpcid=false`.

### pku (x86)
> `= <boolean>`

> Default: `true`

Flag to enable Memory Protection Keys.

The protection-key feature provides an additional mechanism by which IA-32e
paging controls access to usermode addresses.

### ple_gap
> `= <integer>`

### ple_window (Intel)
> `= <integer>`

### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`

> Default: `psr=cmt:0,rmid_max:255,cat:0,cos_max:255,cdp:0`

Platform Shared Resource (PSR) Services.
Intel Haswell and later server
platforms offer information about the sharing of resources.

To use the PSR monitoring service for a certain domain, a Resource
Monitoring ID (RMID) is used to bind the domain to the corresponding shared
resource. RMID is a hardware-provided layer of abstraction between software
and logical processors.

To use the PSR cache allocation service for a certain domain, a capacity
bitmask (CBM) is used to bind the domain to the corresponding shared resource.
CBM represents cache capacity and indicates the degree of overlap and isolation
between domains. In the hypervisor, a Class of Service (COS) ID is allocated
for each unique CBM.

The following resources are available:

* Cache Monitoring Technology (Haswell and later). Information regarding the
  L3 cache occupancy.
  * `cmt` instructs Xen to enable/disable Cache Monitoring Technology.
  * `rmid_max` indicates the max value for rmid.
* Memory Bandwidth Monitoring (Broadwell and later). Information regarding the
  total/local memory bandwidth. It uses the same options as Cache Monitoring
  Technology.

* Cache Allocation Technology (Broadwell and later). Information regarding
  the cache allocation.
  * `cat` instructs Xen to enable/disable Cache Allocation Technology.
  * `cos_max` indicates the max value for COS ID.
* Code and Data Prioritization Technology (Broadwell and later). Information
  regarding the code cache and the data cache allocation. CDP is based on CAT.
  * `cdp` instructs Xen to enable/disable Code and Data Prioritization. Note
    that `cos_max` of CDP is a little different from `cos_max` of CAT. With
    CDP, one COS corresponds to two CBMs rather than one as with CAT. Because
    the total number of CBMs is fixed, the actual `cos_max` in use is
    automatically halved when CDP is enabled.
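
The effect of CDP on `cos_max` can be illustrated with a small calculation.
The sketch below is not taken from Xen; the helper name `effective_cos_max` is
hypothetical, and it assumes that COS IDs are numbered from 0 (so `cos_max`
describes `cos_max + 1` classes) and that an odd class count rounds down.
Neither assumption is stated in this document.

```python
def effective_cos_max(cos_max: int, cdp_enabled: bool) -> int:
    """Highest usable COS ID under the stated assumptions (hypothetical helper)."""
    if not cdp_enabled:
        # With plain CAT, each COS consumes a single CBM.
        return cos_max
    # Assumed: COS IDs run from 0 to cos_max inclusive.
    classes = cos_max + 1
    # With CDP, each COS consumes two CBMs (code and data), halving the classes.
    return max(classes // 2 - 1, 0)
```

Under these assumptions, a `cos_max` of 15 (16 classes) leaves 8 usable
classes, i.e. a highest COS ID of 7, once CDP is enabled.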

### pv
    = List of [ 32=<bool> ]

    Applicability: x86

Controls for aspects of PV guest support.

* The `32` boolean controls whether 32bit PV guests can be created. It
  defaults to `true`, and is ignored when `CONFIG_PV32` is compiled out.

  32bit PV guests are incompatible with CET Shadow Stacks. If Xen is using
  shadow stacks, this option will be overridden to `false`. Backwards
  compatibility can be maintained with the `pv-shim` mechanism.

### pv-linear-pt (x86)
> `= <boolean>`

> Default: `true`

Only available if Xen is compiled with `CONFIG_PV_LINEAR_PT` support
enabled.

Allow PV guests to have pagetable entries pointing to other pagetables
of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
This technique is often called "linear pagetables", and is sometimes
used to allow operating systems a simple way to consistently map the
current process's pagetables into its own virtual address space.

Linux and MiniOS don't use this technique. NetBSD and Novell Netware
do; there may be other custom operating systems which do. If you're
certain you don't plan on having PV guests which use this feature,
turning it off can reduce the attack surface.

### pv-l1tf (x86)
> `= List of [ <bool>, dom0=<bool>, domu=<bool> ]`

> Default: `false` on believed-unaffected hardware, or in pv-shim mode.
> `domu` on believed-affected hardware.

Mitigations for L1TF / XSA-273 / CVE-2018-3620 for PV guests.

For backwards compatibility, we may not alter an architecturally-legitimate
pagetable entry a PV guest chooses to write. We can however force such a
guest into shadow mode so that Xen controls the PTEs which are reachable by
the CPU pagewalk.

Shadowing is performed at the point where a PV guest first tries to write an
L1TF-vulnerable PTE.
Therefore, a PV guest kernel which has been updated with
its own L1TF mitigations will not trigger shadow mode if it is well behaved.

If `CONFIG_SHADOW_PAGING` is not compiled in, this mitigation instead crashes
the guest when an L1TF-vulnerable PTE is written, which still allows updated,
well-behaved PV guests to run, despite Shadow being compiled out.

In the pv-shim case, Shadow is expected to be compiled out, and a malicious
guest kernel can only leak data from the shim Xen, rather than the host Xen.

### pv-shim (x86)
> `= <boolean>`

> Default: `false`

This option is intended for use by a toolstack, when choosing to run a PV
guest compatibly inside an HVM container.

In this mode, the kernel and initrd passed as modules to the hypervisor are
constructed into a plain unprivileged PV domain.

### rcu-idle-timer-period-ms
> `= <integer>`

> Default: `10`

How frequently a CPU which has gone idle, but with pending RCU callbacks,
should be woken up to check if the grace period has completed, and the
callbacks are safe to be executed. Expressed in milliseconds; maximum is
100, and it can't be 0.

### reboot (x86)
> `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`

> Default: `0`

Specify the host reboot method.

`warm` instructs Xen to not set the cold reboot flag.

`cold` instructs Xen to set the cold reboot flag.

`no` instructs Xen to not automatically reboot after panics or crashes.

`triple` instructs Xen to reboot the host by causing a triple fault.

`kbd` instructs Xen to reboot the host via the keyboard controller.

`acpi` instructs Xen to reboot the host using RESET_REG in the ACPI FADT.

`pci` instructs Xen to reboot the host using PCI reset register (port CF9).

`Power` instructs Xen to power-cycle the host using PCI reset register (port CF9).

`efi` instructs Xen to reboot using the EFI reboot call (in EFI mode by
default it will use that method first).

`xen` instructs Xen to reboot using Xen's SCHEDOP hypercall (this is the default
when running nested Xen).

### rmrr
> `= start<-end>=[s1]bdf1[,[s1]bdf2[,...]];start<-end>=[s2]bdf1[,[s2]bdf2[,...]]`

Define RMRR units that are missing from the ACPI table, along with the devices
they belong to, and use them for 1:1 mapping. End addresses can be omitted, in
which case one page will be mapped. The ranges are inclusive when start and end
are specified. If the segment of the first device is not specified, segment
zero will be used. If other segments are not specified, the first device's
segment will be used. If a segment is specified for a device other than the
first and it does not match the one specified for the first one, an error will
be reported.

'start' and 'end' values are page numbers (not full physical addresses),
in hexadecimal format (can optionally be preceded by "0x").

Usage example: If device 0:0:1d.0 requires one page (0xd5d45) to be
reserved, and device 0:0:1a.0 requires three pages (0xd5d46 thru 0xd5d48)
to be reserved, one usage would be:

> `rmrr=d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0`

Note: grub2 requires special characters, namely ';', to be escaped or quoted;
refer to the grub2 documentation if multiple ranges are specified.

### ro-hpet (x86)
> `= <boolean>`

> Default: `true`

Map the HPET page as read only in Dom0. If disabled the page will be mapped
with read and write permissions.

### sched
> `= credit | credit2 | arinc653 | rtds | null`

> Default: `sched=credit`

Choose the default scheduler.
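
Returning to the `rmrr` syntax described earlier, the following Python sketch shows how such a value breaks down into page ranges and devices. The function `parse_rmrr` is hypothetical and is not how Xen itself parses the option. It assumes the value has already been separated from the `rmrr=` prefix, and it leaves the device strings unparsed, so the segment defaulting rules are not modelled.

```python
def parse_rmrr(value):
    """Split an `rmrr=` value into (start_page, end_page, devices) tuples.

    Hypothetical illustration only; device strings are returned verbatim.
    """
    regions = []
    for unit in value.split(";"):
        # Each unit has the form start[-end]=bdf1[,bdf2[,...]]
        page_range, _, devices = unit.partition("=")
        start_text, _, end_text = page_range.partition("-")
        # Page numbers are hexadecimal, with an optional "0x" prefix.
        start = int(start_text, 16)
        # With no end given, a single page is mapped.
        end = int(end_text, 16) if end_text else start
        regions.append((start, end, devices.split(",")))
    return regions


if __name__ == "__main__":
    example = "d5d45=0:0:1d.0;0xd5d46-0xd5d48=0:0:1a.0"
    for start, end, devices in parse_rmrr(example):
        # Ranges are inclusive, so the page count is end - start + 1.
        print(hex(start), hex(end), end - start + 1, devices)
```

For the usage example from the `rmrr` section, the first unit covers one page and the second covers three, matching the description there.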

### sched_credit2_max_cpus_runqueue
> `= <integer>`

> Default: `16`

Defines how many CPUs will be put, at most, in each Credit2 runqueue.

Runqueues are still arranged according to the host topology (and following
what is indicated by the `credit2_runqueue` parameter). But there is also a cap
on the number of CPUs that share each runqueue.

A value that is a submultiple of the number of online CPUs is recommended,
as that would likely produce a perfectly balanced runqueue configuration.

### sched_credit2_migrate_resist
> `= <integer>`

### sched_credit_tslice_ms
> `= <integer>`

Set the timeslice of the credit1 scheduler, in milliseconds. The
default is 30ms. Reasonable values may include 10, 5, or even 1 for
very latency-sensitive workloads.

### sched-gran (x86)
> `= cpu | core | socket`

> Default: `sched-gran=cpu`

Set the scheduling granularity. In case the granularity is larger than 1 (e.g.
`core` on an SMT-enabled system, or `socket`) multiple vcpus are assigned
statically to a "scheduling unit" which will then be subject to scheduling.
This assignment of vcpus to scheduling units is fixed.

`cpu`: Vcpus will be scheduled individually on single cpus (e.g. a
hyperthread using x86/Intel terminology)

`core`: As many vcpus as there are cpus on a physical core are scheduled
together on a physical core.

`socket`: As many vcpus as there are cpus on a physical socket are scheduled
together on a physical socket.

Note: a value other than `cpu` will result in rejecting a runtime modification
attempt of the "smt" setting.

Note: for AMD x86 processors before Fam17 the terminology in the official data
sheets is different: a cpu is named "core" and multiple "cores" are running
in the same "compute unit".
Because AMD uses the same names as Intel ("thread" and "core") from Fam17
onwards, the topology levels are named "cpu", "core" and
"socket" even on older AMD processors.

### sched_ratelimit_us
> `= <integer>`

In order to limit the rate of context switching, set the minimum
amount of time that a vcpu can be scheduled for before preempting it,
in microseconds. The default is 1000us (1ms). Setting this to 0
disables it altogether.

### sched_smt_power_savings
> `= <boolean>`

Normally Xen will try to maximize performance and cache utilization by
spreading out vcpus across as many different divisions as possible
(i.e., NUMA nodes, sockets, cores, threads, etc.). This often maximizes
throughput, but also maximizes energy usage, since it reduces the
depth to which a processor can sleep.

This option inverts the logic, so that the scheduler in effect tries
to keep the vcpus on the smallest amount of silicon possible; i.e.,
first fill up sibling threads, then sibling cores, then sibling
sockets, etc. This will reduce performance somewhat, particularly on
systems with hyperthreading enabled, but should reduce power by
enabling more sockets and cores to go into deeper sleep states.

### scrub-domheap
> `= <boolean>`

> Default: `false`

Scrub domains' freed pages. This is a safety net against a (buggy) domain
accidentally leaking secrets by releasing pages without proper sanitization.

### serial_tx_buffer
> `= <size>`

> Default: `16kB`

Set the serial transmit buffer size.

### serrors (ARM)
> `= diverse | panic`

> Default: `diverse`

This parameter is provided to administrators to determine how the hypervisor
handles SErrors.

* `diverse`:
  The hypervisor will distinguish guest SErrors from hypervisor SErrors:
    - The guest generated SErrors will be forwarded to the currently running
      guest.
    - The hypervisor generated SErrors will cause the whole system to crash.

* `panic`:
  All SErrors will cause the whole system to crash. This option should only
  be used if you trust all your guests and/or they don't have a gadget (e.g.
  device) to generate SErrors in normal run.

### shim_mem (x86)
> `= List of ( min:<size> | max:<size> | <size> )`

Set the amount of memory that xen-shim uses. Only has effect if pv-shim mode is
enabled. Note that this value accounts for the memory used by the shim itself
plus the free memory slack given to the shim for runtime allocations.

* `min:<size>` specifies the minimum amount of memory. Ignored if greater
  than max.
* `max:<size>` specifies the maximum amount of memory.
* `<size>` specifies the exact amount of memory. Overrides both min and max.

By default, the amount of free memory slack given to the shim for runtime usage
is 1MB.

### smap (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Access Prevention.
Use `smap=hvm` to allow SMAP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false, due to
a significant performance impact in some cases and a generally lower security
risk.

### smep (x86)
> `= <boolean> | hvm`

> Default: `true` unless running in pv-shim mode on AMD or Hygon hardware

Flag to enable Supervisor Mode Execution Protection.
Use `smep=hvm` to allow SMEP use by HVM guests only.

In PV shim mode on AMD or Hygon hardware, the option defaults to false, due to
a significant performance impact in some cases and a generally lower security
risk.

### smt (x86)
> `= <boolean>`

> Default: `true`

Control bring up of multiple hyper-threads per CPU core.

### snb_igd_quirk
> `= <boolean> | cap | <integer>`

A true boolean value enables legacy behavior (1s timeout), while `cap`
enforces the maximum theoretically necessary timeout of 670ms. Any number
is interpreted as a custom timeout in milliseconds. Zero or boolean
false disables the quirk workaround, which is also the default.

### spec-ctrl (Arm)
> `= List of [ ssbd=force-disable|runtime|force-enable ]`

Controls for speculative execution sidechannel mitigations.

The option `ssbd=` is used to control the state of Speculative Store
Bypass Disable (SSBD) mitigation.

* `ssbd=force-disable` will keep the mitigation permanently off. The guest
will not be able to control the state of the mitigation.
* `ssbd=runtime` will always turn on the mitigation when running in the
hypervisor context. The guest will be able to turn the mitigation on/off for
itself by using the firmware interface `ARCH_WORKAROUND_2`.
* `ssbd=force-enable` will keep the mitigation permanently on. The guest will
not be able to control the state of the mitigation.

By default SSBD will be mitigated at runtime (i.e. `ssbd=runtime`).

### spec-ctrl (x86)
> `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb,md-clear}=<bool>,
> bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
> l1d-flush,branch-harden,srb-lock}=<bool> ]`

Controls for speculative execution sidechannel mitigations. By default, Xen
will pick the most appropriate mitigations based on compiled in support,
loaded microcode, and hardware details, and will virtualise appropriate
mitigations for guests to use.

**WARNING: Any use of this option may interfere with heuristics.
Use with
extreme care.**

An overall boolean value, `spec-ctrl=no`, can be specified to turn off all
mitigations, including pieces of infrastructure used to virtualise certain
mitigation features for guests. This also includes settings which `xpti`,
`smt`, `pv-l1tf`, `tsx` control, unless the respective option(s) have been
specified earlier on the command line.

Alternatively, a slightly more restricted `spec-ctrl=no-xen` can be used to
turn off all of Xen's mitigations, while leaving the virtualisation support
in place for guests to use.

Use of a positive boolean value for either of these options is invalid.

The booleans `pv=`, `hvm=`, `msr-sc=`, `rsb=` and `md-clear=` offer fine
grained control over the alternative blocks used by Xen. These impact Xen's
ability to protect itself, and Xen's ability to virtualise support for guests
to use.

* `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests
  respectively.
* `msr-sc=` offers control over Xen's support for manipulating `MSR_SPEC_CTRL`
  on entry and exit. These blocks are necessary to virtualise support for
  guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc.
* `rsb=` offers control over whether to overwrite the Return Stack Buffer /
  Return Address Stack on entry to Xen.
* `md-clear=` offers control over whether to use VERW to flush
  microarchitectural buffers on idle and exit from Xen. *Note: For
  compatibility with development versions of this fix, `mds=` is also accepted
  on Xen 4.12 and earlier as an alias. Consult vendor documentation in
  preference to here.*

If Xen was compiled with INDIRECT_THUNK support, `bti-thunk=` can be used to
select which of the thunks gets patched into the `__x86_indirect_thunk_%reg`
locations.
The default thunk is `retpoline` (generally preferred for Intel
hardware), with the alternatives being `jmp` (a `jmp *%reg` gadget, minimal
overhead), and `lfence` (an `lfence; jmp *%reg` gadget, preferred for AMD).

On hardware supporting IBRS (Indirect Branch Restricted Speculation), the
`ibrs=` option can be used to force or prevent Xen using the feature itself.
If Xen is not using IBRS itself, functionality is still set up so IBRS can be
virtualised for guests.

On hardware supporting IBPB (Indirect Branch Prediction Barrier), the `ibpb=`
option can be used to force (the default) or prevent Xen from issuing branch
prediction barriers on vcpu context switches.

On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
option can be used to force or prevent Xen using the feature itself. On AMD
hardware, this is a global option applied at boot, and not virtualised for
guest use. On Intel hardware, the feature is virtualised for guests,
independently of Xen's choice of setting.

On all hardware, the `eager-fpu=` option can be used to force or prevent Xen
from using fully eager FPU context switches. This is currently implemented as
a global control. By default, Xen will choose to use fully eager context
switches on hardware believed to speculate past #NM exceptions.

On hardware supporting L1D_FLUSH, the `l1d-flush=` option can be used to force
or prevent Xen from issuing an L1 data cache flush on each VMEntry.
Irrespective of Xen's setting, the feature is virtualised for HVM guests to
use. By default, Xen will enable this mitigation on hardware believed to be
vulnerable to L1TF.

If Xen is compiled with `CONFIG_SPECULATIVE_HARDEN_BRANCH`, the
`branch-harden=` boolean can be used to force or prevent Xen from using
speculation barriers to protect selected conditional branches.
By default,
Xen will enable this mitigation.

On hardware supporting SRBDS_CTRL, the `srb-lock=` option can be used to force
or prevent Xen from protecting the Special Register Buffer from leaking stale
data. By default, Xen will enable this mitigation, except on parts where MDS
is fixed and TAA is fixed/mitigated (in which case, there is believed to be no
way for an attacker to obtain the stale data).

### sync_console
> `= <boolean>`

> Default: `false`

Flag to force synchronous console output. Useful for debugging, but
not suitable for production environments due to incurred overhead.

### tboot (x86)
> `= 0x<phys_addr>`

Specify the physical address of the trusted boot shared page.

### tbuf_size
> `= <integer>`

Specify the per-cpu trace buffer size in pages.

### tdt (x86)
> `= <boolean>`

> Default: `true`

Flag to enable TSC deadline as the APIC timer mode.

### tevt_mask
> `= <integer>`

Specify a mask for Xen event tracing. This allows Xen tracing to be
enabled at boot. Refer to the xentrace(8) documentation for a list of
valid event mask values. In order to enable tracing, a buffer size (in
pages) must also be specified via the tbuf_size parameter.

### tickle_one_idle_cpu
> `= <boolean>`

### timer_slop
> `= <integer>`

### tsc (x86)
> `= unstable | skewed | stable:socket`

### tsx
    = <bool>

    Applicability: x86
    Default: false on parts vulnerable to TAA, true otherwise

Controls for the use of Transactional Synchronization eXtensions.

On Intel parts released in Q3 2019 (with updated microcode), and future parts,
a control has been introduced which allows TSX to be turned off.

On systems with the ability to turn TSX off, this boolean offers system wide
control of whether TSX is enabled or disabled.

On parts vulnerable to CVE-2019-11135 / TSX Asynchronous Abort, the following
logic applies:

 * An explicit `tsx=` choice is honoured, even if it is `true` and would
   result in a vulnerable system.

 * When no explicit `tsx=` choice is given, parts vulnerable to TAA will be
   mitigated by disabling TSX, as this is the lowest overhead option.

 * If the use of TSX is important, the more expensive TAA mitigations can be
   opted in to with `smt=0 spec-ctrl=md-clear`, at which point TSX will remain
   active by default.

### ucode
> `= List of [ <integer> | scan=<bool>, nmi=<bool> ]`

    Applicability: x86
    Default: `nmi`

Controls for CPU microcode loading. For early loading, this parameter can
specify how and where to find the microcode update blob. For late loading,
this parameter specifies if the update happens within a NMI handler.

'integer' specifies the CPU microcode update blob module index. When positive,
this specifies the n-th module (in the GrUB entry, zero based) to be used
for updating CPU microcode. When negative, counting starts at the end of
the modules in the GrUB entry (so with the blob commonly being last,
one could specify `ucode=-1`). Note that the value of zero is not valid
here (entry zero, i.e. the first module, is always the Dom0 kernel
image). Note further that use of this option has an unspecified effect
when used with xen.efi (there the concept of modules doesn't exist, and
the blob gets specified via the `ucode=<filename>` config file/section
entry; see [EFI configuration file description](efi.html)).

'scan' instructs the hypervisor to scan the multiboot images for a cpio
image that contains microcode.
Depending on the platform the blob with the
microcode in the cpio name space must be:

  - on Intel: kernel/x86/microcode/GenuineIntel.bin
  - on AMD : kernel/x86/microcode/AuthenticAMD.bin

When using xen.efi, the `ucode=<filename>` config file setting takes
precedence over `scan`.

'nmi' determines whether late loading is performed in the NMI handler or just
in stop_machine context. In the NMI handler, even NMIs are blocked, which is
considered safer. The default value is `true`.

### unrestricted_guest (Intel)
> `= <boolean>`

### vcpu_migration_delay
> `= <integer>`

> Default: `0`

Specify a delay, in microseconds, between migrations of a VCPU between
PCPUs when using the credit1 scheduler. This prevents rapid fluttering
of a VCPU between CPUs, and reduces the implicit overheads such as
cache-warming. 1ms (1000) has been measured as a good value.

### vesa-map
> `= <integer>`

### vesa-mtrr
> `= <integer>`

### vesa-ram
> `= <integer>`

### vga
> `= ( ask | current | text-80x<rows> | gfx-<width>x<height>x<depth> | mode-<mode> )[,keep]`

`ask` causes Xen to display a menu of available modes and request the
user to choose one of them.

`current` causes Xen to use the graphics adapter in its current state,
without further setup.

`text-80x<rows>` instructs Xen to set up text mode. Valid values for
`<rows>` are `25, 28, 30, 34, 43, 50, 80`

`gfx-<width>x<height>x<depth>` instructs Xen to set up graphics mode
with the specified width, height and depth.

`mode-<mode>` instructs Xen to use a specific mode, as shown with the
`ask` option. (N.B. menu modes are displayed in hex, so `<mode>`
should be a hexadecimal number)

The optional `keep` parameter causes Xen to continue using the vga
console even after dom0 has been started.
The default behaviour is to
relinquish control to dom0.

### viridian-spinlock-retry-count (x86)
> `= <integer>`

> Default: `2047`

Specify the maximum number of retries before an enlightened Windows
guest will notify Xen that it has failed to acquire a spinlock.

### viridian-version (x86)
> `= [<major>],[<minor>],[<build>]`

> Default: `6,0,0x1772`

`<major>`, `<minor>` and `<build>` must be integers. The values will be
encoded in guest CPUID 0x40000002 if viridian enlightenments are enabled.

### vpid (Intel)
> `= <boolean>`

> Default: `true`

Use Virtual Processor ID support if available. This prevents the need for TLB
flushes on VM entry and exit, increasing performance.

### vpmu (x86)
    = List of [ <bool>, bts, ipc, arch, rtm-abort=<bool> ]

    Applicability: x86. Default: false

Controls for Performance Monitoring Unit virtualisation.

Performance monitoring facilities tend to be very hardware specific, and
provide access to a wealth of low level processor information.

* An overall boolean can be used to enable or disable vPMU support. vPMU is
  disabled by default.

  When enabled, guests have full access to all performance counter settings,
  including model specific functionality. This is a superset of the
  functionality offered by `ipc` and/or `arch`, but a subset of the
  functionality offered by `bts`.

  Xen's watchdog functionality is implemented using performance counters.
  As a result, use of the **watchdog** option will override and disable
  vPMU.

* The `bts` option enables performance monitoring, and permits additional
  access to the Branch Trace Store controls. BTS is an Intel feature where
  the processor can write data into a buffer whenever a branch occurs.
  However, as this feature isn't virtualised, a misconfiguration by the
  guest can lock the entire system up.

* The `ipc` option allows access to the most minimal set of counters
  possible: instructions, cycles, and reference cycles. These can be used
  to calculate instructions per cycle (IPC).

* The `arch` option allows access to the pre-defined architectural events.

* The `rtm-abort` boolean controls a trade-off between working Restricted
  Transactional Memory, and working performance counters.

  All processors released to date (Q1 2019) supporting Transactional Memory
  Extensions suffer an erratum which has been addressed in microcode.

  Processors based on the Skylake microarchitecture with up-to-date
  microcode internally use performance counter 3 to work around the erratum.
  A consequence is that the counter gets reprogrammed whenever an `XBEGIN`
  instruction is executed.

  An alternative mode exists where PCR3 behaves as before, at the cost of
  `XBEGIN` unconditionally aborting. Enabling `rtm-abort` mode will
  activate this alternative mode.

*Warning:*
As the virtualisation is not 100% safe, don't use the vpmu flag on
production systems (see https://xenbits.xen.org/xsa/advisory-163.html)!

### vwfi (arm)
> `= trap | native`

> Default: `trap`

WFI is the ARM instruction to "wait for interrupt". WFE is similar and
means "wait for event". This option, which is ARM specific, changes the
way guest WFI and WFE are implemented in Xen. By default, Xen traps both
instructions. In the case of WFI, Xen blocks the guest vcpu; in the case
of WFE, Xen yields the guest vcpu. When setting vwfi to `native`, Xen
doesn't trap either instruction, running them in guest context. Setting
vwfi to `native` reduces irq latency significantly.
It can also lead to
suboptimal scheduling decisions, but only when the system is
oversubscribed (i.e., in total there are more vCPUs than pCPUs).

### watchdog (x86)
> `= force | <boolean>`

> Default: `false`

Run an NMI watchdog on each processor. If a processor is stuck for
longer than the **watchdog_timeout**, a panic occurs. When `force` is
specified, in addition to running an NMI watchdog on each processor,
unknown NMIs will still be processed.

### watchdog_timeout (x86)
> `= <integer>`

> Default: `5`

Set the NMI watchdog timeout in seconds. Specifying `0` will turn off
the watchdog.

### x2apic (x86)
> `= <boolean>`

> Default: `true`

Permit use of x2apic setup for SMP environments.

### x2apic_phys (x86)
> `= <boolean>`

> Default: `true` if **FADT** mandates physical mode or if interrupt remapping
> is not available, `false` otherwise.

In the case that x2apic is in use, this option switches between physical and
clustered mode. The default, given no hint from the **FADT**, is cluster
mode.

### xenheap_megabytes (arm32)
> `= <size>`

> Default: `0` (1/32 of RAM)

Amount of RAM to set aside for the Xenheap. Must be an integer multiple of 32.

By default, Xen will use 1/32 of the RAM up to a maximum of 1GB and with a
minimum of 32M, subject to a suitably aligned and sized contiguous
region of memory being available.

### xpti (x86)
> `= List of [ default | <boolean> | dom0=<bool> | domu=<bool> ]`

> Default: `false` on hardware known not to be vulnerable to Meltdown (e.g. AMD)
> Default: `true` everywhere else

Override default selection of whether to isolate 64-bit PV guest page
tables.

`true` activates page table isolation for all domains, even on hardware not
vulnerable to Meltdown.

`false` deactivates page table isolation on all systems for all domains.

`default` sets the default behaviour.

With `dom0` and `domu` it is possible to control page table isolation
for dom0 or guest domains only.

### xsave (x86)
> `= <boolean>`

> Default: `true`

Permit use of the `xsave/xrstor` instructions.

### xsm
> `= dummy | flask | silo`

> Default: selectable via Kconfig. Depends on enabled XSM modules.

Specify which XSM module should be enabled. This option is only available if
the hypervisor was compiled with `CONFIG_XSM` enabled.

* `dummy`: this is the default choice. Basic restriction for common deployment
  (the dummy module) will be applied. It's also used when XSM is compiled out.
* `flask`: this is the policy based access control. To choose this, the
  separate Kconfig option must also be enabled.
* `silo`: this will deny any unmediated communication channels between
  unprivileged VMs. To choose this, the separate Kconfig option must also
  be enabled.