# Intel Platform Shared Resource Monitoring/Control in xl

This document introduces Intel Platform Shared Resource Monitoring/Control
technologies, their basic concepts and the xl interfaces.

## Cache Monitoring Technology (CMT)

Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
and later server platforms that allows an OS or Hypervisor/VMM to determine
the usage of cache (currently only L3 cache is supported) by applications
running on the platform. A Resource Monitoring ID (RMID) is the abstraction
of the application(s) whose cache usage will be monitored. The CMT hardware
tracks cache utilization of memory accesses according to the RMID and reports
monitored data via a counter register.

For more detailed information please refer to Intel SDM chapter
"Platform Shared Resource Monitoring: Cache Monitoring Technology".

In Xen's implementation, each domain in the system can be assigned an RMID
independently, while RMID=0 is reserved for monitoring domains that don't
have the CMT service attached. RMID is opaque to xl/libxl and is used only
in the hypervisor.

### xl interfaces

A domain is assigned an RMID implicitly by attaching it to the CMT service:

`xl psr-cmt-attach <domid>`

After that, cache usage for the domain can be shown by:

`xl psr-cmt-show cache-occupancy <domid>`

Once monitoring is no longer needed, the domain can be detached from the
CMT service by:

`xl psr-cmt-detach <domid>`

Attaching may fail if no free RMID is available. In that case, unused
RMID(s) can be freed by detaching the corresponding domains from the CMT
service.

The maximum RMID and the supported monitor types in the system can be
obtained by:

`xl psr-hwinfo --cmt`

## Memory Bandwidth Monitoring (MBM)

Memory Bandwidth Monitoring (MBM) is a new hardware feature available on Intel
Broadwell and later server platforms which builds on the CMT infrastructure to
allow monitoring of system memory bandwidth. It introduces two new monitoring
event types to monitor system total/local memory bandwidth. The same RMID can
be used to monitor both cache usage and memory bandwidth at the same time.

For more detailed information please refer to Intel SDM chapter
"Overview of Cache Monitoring Technology and Memory Bandwidth Monitoring".

In Xen's implementation, MBM shares the same underlying monitoring service
as CMT and can be used to monitor memory bandwidth on a per-domain basis.

The xl interfaces are the same as those of CMT. The difference is that the
monitor type is the corresponding memory monitoring type
(local-mem-bandwidth/total-mem-bandwidth instead of cache-occupancy). E.g.
after an `xl psr-cmt-attach`:

`xl psr-cmt-show local-mem-bandwidth <domid>`

`xl psr-cmt-show total-mem-bandwidth <domid>`

## Cache Allocation Technology (CAT)

Cache Allocation Technology (CAT) is a new feature available on Intel
Broadwell and later server platforms that allows an OS or Hypervisor/VMM to
partition cache allocation (i.e. L3/L2 cache) based on application priority or
Class of Service (COS). Each COS is configured using capacity bitmasks (CBM)
which represent cache capacity and indicate the degree of overlap and
isolation between classes. The system cache resource is divided into a number
of minimum portions, and subsets of these portions form the cache partitions.
Each portion corresponds to a bit in the CBM, and a set bit means that the
corresponding cache portion is available.

For example, assuming a system with 8 portions and 3 domains:

 * A CBM of 0xff for every domain means each domain can access the whole cache.
   This is the default.

 * Giving one domain a CBM of 0x0f and the other two domains 0xf0 means that
   the first domain gets exclusive access to half of the cache (half of the
   portions) and the other two will share the other half.

 * Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0 would give the
   first domain exclusive access to half the cache, and the other two exclusive
   access to one quarter each.
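The arithmetic behind these examples can be checked with a short Python sketch. It is only an illustration and is not part of xl or libxl: the helper names `cache_share` and `is_exclusive` are hypothetical, and the 8-portion cache is the assumption from the example above.

```python
def cache_share(cbm, num_portions):
    """Fraction of the cache covered by the set bits of cbm."""
    return bin(cbm).count("1") / num_portions


def is_exclusive(index, cbms):
    """True if cbms[index] shares no portion with any other mask in cbms."""
    return all((cbms[index] & other) == 0
               for i, other in enumerate(cbms) if i != index)


NUM_PORTIONS = 8  # assumed, as in the example above

shared = [0x0f, 0xf0, 0xf0]  # second example: two domains share one half
split = [0x0f, 0x30, 0xc0]   # third example: every domain is isolated

for cbms in (shared, split):
    for i, cbm in enumerate(cbms):
        print(f"{cbm:#04x}: share={cache_share(cbm, NUM_PORTIONS):.2f} "
              f"exclusive={is_exclusive(i, cbms)}")
```

Two masks overlap exactly when their bitwise AND is non-zero, which is why the two domains with 0xf0 share their half while 0x30 and 0xc0 do not.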

For more detailed information please refer to Intel SDM chapter
"Platform Shared Resource Control: Cache Allocation Technology".

In Xen's implementation, CBM can be configured with libxl/xl interfaces but
COS is maintained in the hypervisor only. The cache partition granularity is
per domain. Each domain has COS=0 assigned by default, and the corresponding
CBM is all-ones, which means all the cache resource can be used by default.

### xl interfaces

System CAT information such as maximum COS and CBM length can be obtained by:

`xl psr-hwinfo --cat`

The simplest way to change a domain's CBM from its default is running:

`xl psr-cat-set [OPTIONS] <domid> <cbm>`

where cbm is a number representing the subset of the cache that can be used.
A cbm is valid only when:

 * Set bits only exist in the range of [0, cbm_len), where cbm_len can be
   obtained with `xl psr-hwinfo --cat`.
 * All the set bits are contiguous.
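The two rules above can be expressed as a small check. This is a minimal Python sketch and not the validation code used by Xen or libxl. The function name `is_valid_cbm` is hypothetical, and rejecting an empty mask (cbm of 0) is an extra assumption that the rules above do not state.

```python
def is_valid_cbm(cbm, cbm_len):
    """Check a candidate cbm against the two rules listed above."""
    if cbm <= 0:
        # Assumption of this sketch: a mask with no set bits is rejected.
        return False
    if cbm >> cbm_len:
        # Some bit is set at position cbm_len or higher.
        return False
    # Drop trailing zero bits; a contiguous run of ones then equals 2**k - 1.
    run = cbm // (cbm & -cbm)
    return (run & (run + 1)) == 0
```

With a cbm_len of 8, for instance, 0x0f and 0x30 pass, while 0x05 (non-contiguous) and 0x1ff (a bit outside the range) are rejected.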

In a multi-socket system, the same cbm will be set on each socket by default.
A per-socket cbm can be specified with the `--socket SOCKET` option.

Different systems support different cache levels, e.g. L3 cache or L2 cache.
A per-cache-level cbm can be specified with the `--level LEVEL` option.

Setting the CBM may fail if not enough COSes are available. In that case,
unused COS(es) may be freed by setting the CBM of all related domains to
its default value (all-ones).

Per-domain CBM settings can be shown by:

`xl psr-cat-show [OPTIONS] <domid>`

As with `xl psr-cat-set`, the cache level can be selected with the
`--level LEVEL` option.

## Code and Data Prioritization (CDP)

Code and Data Prioritization (CDP) Technology is an extension of CAT, which
is available on Intel Broadwell and later server platforms. CDP enables
isolation and separate prioritization of code and data fetches to the L3
cache in a software configurable manner, which can enable workload
prioritization and tuning of cache capacity to the characteristics of the
workload. CDP extends Cache Allocation Technology (CAT) by providing
separate code and data masks per Class of Service (COS).

CDP can be enabled by adding `psr=cdp` to the Xen command line.

When CDP is enabled,

 * the CAT masks are re-mapped into interleaved pairs of masks for data or
   code fetches.

 * the range of COS for CAT is re-indexed, with the lower-half of the COS
   range available for CDP.

CDP allows the OS or Hypervisor to partition cache allocation in a more
fine-grained manner. Code cache and data cache can be specified independently.
With CDP enabled, one COS corresponds to two CBMs (code CBM & data CBM).
Because the total number of CBMs is fixed, the number of available COSes is
halved when CDP is on.
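The halving follows from each COS consuming two masks once CDP is on. The Python sketch below illustrates the re-indexing under one assumption: that the interleaved layout places the data mask at the even index and the code mask at the odd index. That ordering and the helper names are illustrative only; the Intel SDM is the authoritative source for the actual layout.

```python
def cdp_mask_indices(cos):
    """Return (data_index, code_index) of the mask pair backing a COS.

    Assumed layout: masks are interleaved, data mask first, code mask second.
    """
    return 2 * cos, 2 * cos + 1


def cos_count(num_masks, cdp_enabled):
    """Number of usable COSes for a fixed number of hardware masks."""
    return num_masks // 2 if cdp_enabled else num_masks
```

For a hypothetical platform with 16 masks, `cos_count(16, False)` gives 16 and `cos_count(16, True)` gives 8.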

For more detailed information please refer to Intel SDM chapter
"Platform Shared Resource Control: Cache Allocation Technology".

The xl interfaces are the same as those of CAT. The difference is that
the CBM type can be passed as an option to set the code CBM or the data CBM.

When CDP is enabled, the `-c` or `--code` option is available to set the code
CBM for the domain.

When CDP is enabled, the `-d` or `--data` option is available to set the data
CBM for the domain.

If neither the `-c` nor the `-d` option is specified when CDP is on, the same
code CBM and data CBM will be set for the domain. Passing both `-c` and `-d`
options is invalid.

Example:

Setting code CBM for a domain:

`xl psr-cat-set -c <domid> <cbm>`

Setting data CBM for a domain:

`xl psr-cat-set -d <domid> <cbm>`

Setting the same code and data CBM for a domain:

`xl psr-cat-set <domid> <cbm>`

## Memory Bandwidth Allocation (MBA)

Memory Bandwidth Allocation (MBA) is a new feature available on Intel
Skylake and later server platforms that allows an OS or Hypervisor/VMM to
slow misbehaving apps/VMs by using a credit-based throttling mechanism. To
limit the bandwidth of a specific domain, set a throttling value (THRTL)
in its Class of Service (COS). MBA provides two THRTL modes: linear mode
and non-linear mode.

In linear mode the input precision is defined as 100-(THRTL_MAX). Values
that are not an even multiple of the precision (e.g., 12%) will be rounded
down by the hardware (e.g., to a 10% delay).

If linear values are not supported then input delay values are powers-of-two
from zero to the THRTL_MAX value from CPUID. In this case any value that is
not a power of two will be rounded down to the nearest power of two.

For example, assuming a system with 2 domains:

 * A THRTL of 0x0 for every domain means each domain runs without any
   throttling delay. This is the default.

 * Linear mode (with a precision of 10): Giving one domain a THRTL of 0xC and
   the other domain 0 means that the first domain is throttled with a 10%
   delay and the other one runs without any delay.

 * Non-linear mode: Giving one domain a THRTL of 0xC and the other domain 0
   means that the first domain is throttled with an 8% delay and the other
   one runs without any delay.
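The rounding in both modes can be reproduced with a short Python sketch. It only mirrors the rules as described here and is not the hardware or Xen implementation: the function names are hypothetical, the THRTL_MAX of 90 in the usage note is an assumed value, and no check against THRTL_MAX is performed.

```python
def round_linear(thrtl, thrtl_max):
    """Round down to a multiple of the precision, 100 - thrtl_max."""
    precision = 100 - thrtl_max
    return (thrtl // precision) * precision


def round_nonlinear(thrtl):
    """Round down to the nearest power of two (zero stays zero)."""
    if thrtl <= 0:
        return 0
    return 1 << (thrtl.bit_length() - 1)
```

With an assumed THRTL_MAX of 90 (precision 10), `round_linear(0xC, 90)` yields 10 and `round_nonlinear(0xC)` yields 8, matching the two examples above.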

For more detailed information please refer to Intel SDM chapter
"Introduction to Memory Bandwidth Allocation".

In Xen's implementation, THRTL can be configured with libxl/xl interfaces but
COS is maintained in the hypervisor only. The throttling granularity is per
domain. Each domain has COS=0 assigned by default, and the corresponding
THRTL is 0, which means the domain is not throttled by default.

### xl interfaces

System MBA information such as maximum COS and maximum THRTL can be obtained by:

`xl psr-hwinfo --mba`

The simplest way to change a domain's THRTL from its default is running:

`xl psr-mba-set [OPTIONS] <domid> <thrtl>`

In a multi-socket system, the same thrtl will be set on each socket by default.
A per-socket thrtl can be specified with the `--socket SOCKET` option.

Setting the THRTL may fail if not enough COSes are available. In that case,
unused COS(es) may be freed by setting the THRTL of all related domains to
its default value (0).

Per-domain THRTL settings can be shown by:

`xl psr-mba-show [OPTIONS] <domid>`

For linear mode, it shows the decimal value. For non-linear mode, it shows
the hexadecimal value.

## Reference

[1] Intel SDM
(http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html).