# Intel Platform Shared Resource Monitoring/Control in xl

This document introduces Intel Platform Shared Resource Monitoring/Control
technologies, their basic concepts and the xl interfaces.

## Cache Monitoring Technology (CMT)

Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
and later server platforms that allows an OS or Hypervisor/VMM to determine
the usage of cache (currently only L3 cache is supported) by applications
running on the platform. A Resource Monitoring ID (RMID) is the abstraction
of the application(s) whose cache usage will be monitored. The CMT hardware
tracks the cache utilization of memory accesses according to the RMID and
reports the monitored data via a counter register.

For more detailed information please refer to the Intel SDM chapter
"Platform Shared Resource Monitoring: Cache Monitoring Technology".

In Xen's implementation, each domain in the system can be assigned an RMID
independently, while RMID=0 is reserved for monitoring domains that don't
have the CMT service attached. The RMID is opaque to xl/libxl and is only
used in the hypervisor.

### xl interfaces

A domain is assigned an RMID implicitly by attaching it to the CMT service:

`xl psr-cmt-attach <domid>`

After that, cache usage for the domain can be shown by:

`xl psr-cmt-show cache-occupancy <domid>`

Once monitoring is no longer needed, the domain can be detached from the
CMT service by:

`xl psr-cmt-detach <domid>`

An attach may fail because no free RMID is available. In such a case, unused
RMID(s) can be freed by detaching the corresponding domains from the CMT
service.
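The RMID lifecycle described above can be sketched as follows. This is a
hypothetical, purely illustrative Python model (class and method names are
invented, not Xen code): attach assigns a free RMID, detach frees it, attach
fails when no RMID is free, and RMID 0 stays reserved.

```python
class CmtService:
    """Illustrative model of RMID bookkeeping; not Xen's implementation."""

    RESERVED_RMID = 0  # reserved for domains with no CMT service attached

    def __init__(self, max_rmid):
        # RMIDs 1..max_rmid are available for assignment to domains.
        self.free_rmids = set(range(1, max_rmid + 1))
        self.domain_rmid = {}  # domid -> RMID

    def attach(self, domid):
        """Assign a free RMID to a domain; fail if none is available."""
        if domid in self.domain_rmid:
            raise ValueError("domain already attached")
        if not self.free_rmids:
            raise RuntimeError("no free RMID available")
        rmid = min(self.free_rmids)
        self.free_rmids.remove(rmid)
        self.domain_rmid[domid] = rmid
        return rmid

    def detach(self, domid):
        """Free a domain's RMID so another domain can be attached."""
        rmid = self.domain_rmid.pop(domid)
        self.free_rmids.add(rmid)

svc = CmtService(max_rmid=2)
svc.attach(1)  # gets RMID 1
svc.attach(2)  # gets RMID 2
# A third attach would fail here; detaching a domain frees its RMID:
svc.detach(1)
svc.attach(3)  # reuses RMID 1
```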
Maximum RMID and supported monitor types in the system can be obtained by:

`xl psr-hwinfo --cmt`

## Memory Bandwidth Monitoring (MBM)

Memory Bandwidth Monitoring (MBM) is a new hardware feature available on
Intel Broadwell and later server platforms which builds on the CMT
infrastructure to allow monitoring of system memory bandwidth. It introduces
two new monitoring event types to monitor system total/local memory
bandwidth. The same RMID can be used to monitor both cache usage and memory
bandwidth at the same time.

For more detailed information please refer to the Intel SDM chapter
"Overview of Cache Monitoring Technology and Memory Bandwidth Monitoring".

In Xen's implementation, MBM shares the same underlying monitoring service
with CMT and can be used to monitor memory bandwidth on a per-domain basis.

The xl interfaces are the same as those of CMT. The difference is that the
monitor type is the corresponding memory monitoring type (local-mem-bandwidth/
total-mem-bandwidth instead of cache-occupancy). E.g. after a
`xl psr-cmt-attach`:

`xl psr-cmt-show local-mem-bandwidth <domid>`

`xl psr-cmt-show total-mem-bandwidth <domid>`

## Cache Allocation Technology (CAT)

Cache Allocation Technology (CAT) is a new feature available on Intel
Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
partition cache allocation (i.e. L3/L2 cache) based on application priority
or Class of Service (COS). Each COS is configured using capacity bitmasks
(CBMs) which represent cache capacity and indicate the degree of overlap and
isolation between classes. The system cache is divided into a number of
minimum portions, which are then combined into subsets for cache
partitioning. Each portion corresponds to a bit in the CBM, and a set bit
indicates that the corresponding cache portion is available.
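The portion/bit correspondence, and the CBM validity rules enforced by
`xl psr-cat-set` (set bits must lie within [0, cbm_len) and be contiguous),
can be sketched as follows. This is an illustrative Python sketch with
invented helper names, not Xen code:

```python
def cbm_portions(cbm, cbm_len):
    """Return the cache portions (bit positions) a CBM grants."""
    return [bit for bit in range(cbm_len) if cbm & (1 << bit)]

def cbm_is_valid(cbm, cbm_len):
    """A CBM is valid if non-zero, within cbm_len bits, and contiguous."""
    if cbm <= 0 or cbm >> cbm_len:
        return False
    # Shift away trailing zeros; the remaining bits must be all-ones,
    # i.e. value & (value + 1) == 0, which means the set bits are contiguous.
    while cbm & 1 == 0:
        cbm >>= 1
    return (cbm & (cbm + 1)) == 0

# With 8 portions: 0x0f grants the low half, 0xf0 the high half.
print(cbm_portions(0x0f, 8))    # [0, 1, 2, 3]
print(cbm_is_valid(0xf0, 8))    # True
print(cbm_is_valid(0x99, 8))    # False: set bits are not contiguous
print(cbm_is_valid(0x100, 8))   # False: bit outside [0, cbm_len)
```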
For example, assuming a system with 8 portions and 3 domains:

 * A CBM of 0xff for every domain means each domain can access the whole
   cache. This is the default.

 * Giving one domain a CBM of 0x0f and the other two domains 0xf0 means that
   the first domain gets exclusive access to half of the cache (half of the
   portions) and the other two will share the other half.

 * Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0 would give the
   first domain exclusive access to half the cache, and the other two
   exclusive access to one quarter each.

For more detailed information please refer to the Intel SDM chapter
"Platform Shared Resource Control: Cache Allocation Technology".

In Xen's implementation, the CBM can be configured with libxl/xl interfaces
but the COS is maintained in the hypervisor only. The cache partition
granularity is per domain: each domain has COS=0 assigned by default, and
the corresponding CBM is all-ones, which means all the cache resource can be
used by default.

### xl interfaces

System CAT information such as maximum COS and CBM length can be obtained by:

`xl psr-hwinfo --cat`

The simplest way to change a domain's CBM from its default is running:

`xl psr-cat-set [OPTIONS] <domid> <cbm>`

where cbm is a number representing the cache subset that can be used.
A cbm is valid only when:

 * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
   obtained with `xl psr-hwinfo --cat`.
 * All the set bits are contiguous.

In a multi-socket system, the same cbm will be set on each socket by default.
A per-socket cbm can be specified with the `--socket SOCKET` option.

Different systems support different cache levels, e.g. L3 cache or L2 cache.
A per-cache-level cbm can be specified with the `--level LEVEL` option.

Setting the CBM may not be successful if insufficient COS is available.
In such a case, unused COS(es) may be freed by setting the CBM of all
related domains to its default value (all-ones).

Per-domain CBM settings can be shown by:

`xl psr-cat-show [OPTIONS] <domid>`

Different systems support different cache levels, e.g. L3 cache or L2 cache.
A per-cache-level cbm can be specified with the `--level LEVEL` option.

## Code and Data Prioritization (CDP)

Code and Data Prioritization (CDP) Technology is an extension of CAT, which
is available on Intel Broadwell and later server platforms. CDP enables
isolation and separate prioritization of code and data fetches to the L3
cache in a software-configurable manner, which can enable workload
prioritization and tuning of cache capacity to the characteristics of the
workload. CDP extends Cache Allocation Technology (CAT) by providing
separate code and data masks per Class of Service (COS).

CDP can be enabled by adding `psr=cdp` to the Xen command line.

When CDP is enabled,

 * the CAT masks are re-mapped into interleaved pairs of masks for data or
   code fetches.

 * the range of COS for CAT is re-indexed, with the lower half of the COS
   range available for CDP.

CDP allows the OS or Hypervisor to partition cache allocation in a more
fine-grained manner. Code cache and data cache can be specified
independently. With CDP enabled, one COS corresponds to two CBMs (a code
CBM and a data CBM). Since the total number of CBMs is fixed, the number of
available COSes is halved when CDP is on.

For more detailed information please refer to the Intel SDM chapter
"Platform Shared Resource Control: Cache Allocation Technology".

The xl interfaces are the same as those of CAT. The difference is that a
CBM type can be passed as an option to set the code CBM or the data CBM.

When CDP is enabled, the `-c` or `--code` option is available to set the
code CBM for the domain.
When CDP is enabled, the `-d` or `--data` option is available to set the
data CBM for the domain.

If neither the `-c` nor the `-d` option is specified when CDP is on, the
same code CBM and data CBM will be set for the domain. Passing both the
`-c` and `-d` options is invalid.

Example:

Setting the code CBM for a domain:
`xl psr-cat-set -c <domid> <cbm>`

Setting the data CBM for a domain:
`xl psr-cat-set -d <domid> <cbm>`

Setting the same code and data CBM for a domain:
`xl psr-cat-set <domid> <cbm>`

## Memory Bandwidth Allocation (MBA)

Memory Bandwidth Allocation (MBA) is a new feature available on Intel
Skylake and later server platforms that allows an OS or Hypervisor/VMM to
slow down misbehaving apps/VMs by using a credit-based throttling mechanism.
To enforce bandwidth on a specific domain, simply set a throttling value
(THRTL) in its Class of Service (COS). MBA provides two THRTL modes: linear
mode and non-linear mode.

In the linear mode the input precision is defined as 100-(THRTL_MAX). Values
that are not an even multiple of the precision (e.g., 12%) will be rounded
down (e.g., to a 10% delay) by the hardware.

If linear values are not supported then input delay values are powers of two
from zero to the THRTL_MAX value from CPUID. In this case any value that is
not a power of two will be rounded down to the next nearest power of two.

For example, assuming a system with 2 domains:

 * A THRTL of 0x0 for every domain means each domain can access memory
   without any delay. This is the default.

 * Linear mode: Giving one domain a THRTL of 0xC and the other domain 0
   means that the first domain gets a 10% delay when accessing memory and
   the other one gets no delay.
 * Non-linear mode: Giving one domain a THRTL of 0xC and the other domain 0
   means that the first domain gets an 8% delay when accessing memory and
   the other one gets no delay.

For more detailed information please refer to the Intel SDM chapter
"Introduction to Memory Bandwidth Allocation".

In Xen's implementation, the THRTL can be configured with libxl/xl
interfaces but the COS is maintained in the hypervisor only. The throttling
granularity is per domain: each domain has COS=0 assigned by default, and
the corresponding THRTL is 0, which means memory can be accessed without
delay by default.

### xl interfaces

System MBA information such as maximum COS and maximum THRTL can be obtained
by:

`xl psr-hwinfo --mba`

The simplest way to change a domain's THRTL from its default is running:

`xl psr-mba-set [OPTIONS] <domid> <thrtl>`

In a multi-socket system, the same thrtl will be set on each socket by
default. A per-socket thrtl can be specified with the `--socket SOCKET`
option.

Setting the THRTL may not be successful if insufficient COS is available.
In such a case, unused COS(es) may be freed by setting the THRTL of all
related domains to its default value (0).

Per-domain THRTL settings can be shown by:

`xl psr-mba-show [OPTIONS] <domid>`

For linear mode, it shows the decimal value. For non-linear mode, it shows
the hexadecimal value.

## Reference

[1] Intel SDM
(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).