1.. SPDX-License-Identifier: GPL-2.0
2
3======
4Design
5======
6
7Configurable Layers
8===================
9
10DAMON provides data access monitoring functionality while making the accuracy
11and the overhead controllable.  The fundamental access monitorings require
12primitives that dependent on and optimized for the target address space.  On
13the other hand, the accuracy and overhead tradeoff mechanism, which is the core
14of DAMON, is in the pure logic space.  DAMON separates the two parts in
15different layers and defines its interface to allow various low level
16primitives implementations configurable with the core logic.
17
18Due to this separated design and the configurable interface, users can extend
19DAMON for any address space by configuring the core logics with appropriate low
20level primitive implementations.  If appropriate one is not provided, users can
21implement the primitives on their own.
22
23For example, physical memory, virtual memory, swap space, those for specific
24processes, NUMA nodes, files, and backing memory devices would be supportable.
25Also, if some architectures or devices support special optimized access check
26primitives, those will be easily configurable.
27
28
29Reference Implementations of Address Space Specific Primitives
30==============================================================
31
32The low level primitives for the fundamental access monitoring are defined in
33two parts:
34
351. Identification of the monitoring target address range for the address space.
362. Access check of specific address range in the target space.
37
38DAMON currently provides the implementations of the primitives for the physical
39and virtual address spaces. Below two subsections describe how those work.
40
41
42VMA-based Target Address Range Construction
43-------------------------------------------
44
45This is only for the virtual address space primitives implementation.  That for
46the physical address space simply asks users to manually set the monitoring
47target address ranges.
48
49Only small parts in the super-huge virtual address space of the processes are
50mapped to the physical memory and accessed.  Thus, tracking the unmapped
51address regions is just wasteful.  However, because DAMON can deal with some
52level of noise using the adaptive regions adjustment mechanism, tracking every
53mapping is not strictly required but could even incur a high overhead in some
54cases.  That said, too huge unmapped areas inside the monitoring target should
55be removed to not take the time for the adaptive mechanism.
56
57For the reason, this implementation converts the complex mappings to three
58distinct regions that cover every mapped area of the address space.  The two
59gaps between the three regions are the two biggest unmapped areas in the given
60address space.  The two biggest unmapped areas would be the gap between the
61heap and the uppermost mmap()-ed region, and the gap between the lowermost
62mmap()-ed region and the stack in most of the cases.  Because these gaps are
63exceptionally huge in usual address spaces, excluding these will be sufficient
64to make a reasonable trade-off.  Below shows this in detail::
65
66    <heap>
67    <BIG UNMAPPED REGION 1>
68    <uppermost mmap()-ed region>
69    (small mmap()-ed regions and munmap()-ed regions)
70    <lowermost mmap()-ed region>
71    <BIG UNMAPPED REGION 2>
72    <stack>
73
74
75PTE Accessed-bit Based Access Check
76-----------------------------------
77
78Both of the implementations for physical and virtual address spaces use PTE
79Accessed-bit for basic access checks.  Only one difference is the way of
80finding the relevant PTE Accessed bit(s) from the address.  While the
81implementation for the virtual address walks the page table for the target task
82of the address, the implementation for the physical address walks every page
83table having a mapping to the address.  In this way, the implementations find
84and clear the bit(s) for next sampling target address and checks whether the
85bit(s) set again after one sampling period.  This could disturb other kernel
86subsystems using the Accessed bits, namely Idle page tracking and the reclaim
87logic.  To avoid such disturbances, DAMON makes it mutually exclusive with Idle
88page tracking and uses ``PG_idle`` and ``PG_young`` page flags to solve the
89conflict with the reclaim logic, as Idle page tracking does.
90
91
92Address Space Independent Core Mechanisms
93=========================================
94
95Below four sections describe each of the DAMON core mechanisms and the five
96monitoring attributes, ``sampling interval``, ``aggregation interval``,
97``regions update interval``, ``minimum number of regions``, and ``maximum
98number of regions``.
99
100
101Access Frequency Monitoring
102---------------------------
103
104The output of DAMON says what pages are how frequently accessed for a given
105duration.  The resolution of the access frequency is controlled by setting
106``sampling interval`` and ``aggregation interval``.  In detail, DAMON checks
107access to each page per ``sampling interval`` and aggregates the results.  In
108other words, counts the number of the accesses to each page.  After each
109``aggregation interval`` passes, DAMON calls callback functions that previously
110registered by users so that users can read the aggregated results and then
111clears the results.  This can be described in below simple pseudo-code::
112
113    while monitoring_on:
114        for page in monitoring_target:
115            if accessed(page):
116                nr_accesses[page] += 1
117        if time() % aggregation_interval == 0:
118            for callback in user_registered_callbacks:
119                callback(monitoring_target, nr_accesses)
120            for page in monitoring_target:
121                nr_accesses[page] = 0
122        sleep(sampling interval)
123
124The monitoring overhead of this mechanism will arbitrarily increase as the
125size of the target workload grows.
126
127
128Region Based Sampling
129---------------------
130
131To avoid the unbounded increase of the overhead, DAMON groups adjacent pages
132that assumed to have the same access frequencies into a region.  As long as the
133assumption (pages in a region have the same access frequencies) is kept, only
134one page in the region is required to be checked.  Thus, for each ``sampling
135interval``, DAMON randomly picks one page in each region, waits for one
136``sampling interval``, checks whether the page is accessed meanwhile, and
137increases the access frequency of the region if so.  Therefore, the monitoring
138overhead is controllable by setting the number of regions.  DAMON allows users
139to set the minimum and the maximum number of regions for the trade-off.
140
141This scheme, however, cannot preserve the quality of the output if the
142assumption is not guaranteed.
143
144
145Adaptive Regions Adjustment
146---------------------------
147
148Even somehow the initial monitoring target regions are well constructed to
149fulfill the assumption (pages in same region have similar access frequencies),
150the data access pattern can be dynamically changed.  This will result in low
151monitoring quality.  To keep the assumption as much as possible, DAMON
152adaptively merges and splits each region based on their access frequency.
153
154For each ``aggregation interval``, it compares the access frequencies of
155adjacent regions and merges those if the frequency difference is small.  Then,
156after it reports and clears the aggregated access frequency of each region, it
157splits each region into two or three regions if the total number of regions
158will not exceed the user-specified maximum number of regions after the split.
159
160In this way, DAMON provides its best-effort quality and minimal overhead while
161keeping the bounds users set for their trade-off.
162
163
164Dynamic Target Space Updates Handling
165-------------------------------------
166
167The monitoring target address range could dynamically changed.  For example,
168virtual memory could be dynamically mapped and unmapped.  Physical memory could
169be hot-plugged.
170
171As the changes could be quite frequent in some cases, DAMON checks the dynamic
172memory mapping changes and applies it to the abstracted target area only for
173each of a user-specified time interval (``regions update interval``).
174