********************************************************************************
 A Rough Introduction to Using Grant Tables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                              Christopher Clark, March, 2005.

Grant tables are a mechanism for sharing and transferring frames between
domains, without requiring the participating domains to be privileged.

The first mode of use allows domA to grant domB access to a specific frame,
whilst retaining ownership. The block front driver uses this to grant memory
access to the block back driver, so that it may read or write as requested.

 1. domA creates a grant access reference, and transmits the ref id to domB.
 2. domB uses the reference to map the granted frame.
 3. domB performs the memory access.
 4. domB unmaps the granted frame.
 5. domA removes its grant.


The second mode allows domA to accept a transfer of ownership of a frame from
domB. The net front and back drivers will use this for packet tx/rx. This
mechanism is still being implemented, though the xen<->guest interface design
is complete.

 1. domA creates an accept transfer grant reference, and transmits it to domB.
 2. domB uses the ref to hand over a frame it owns.
 3. domA accepts the transfer.
 4. domA clears the used reference.


********************************************************************************
 Data structures
 ~~~~~~~~~~~~~~~

 The following data structures are used by Xen and the guests to implement
 grant tables:

 1. Shared grant entries
 2. Active grant entries
 3. Map tracking

 These are not the user's primary interface to grant tables, but are discussed
 because an understanding of how they work may be useful. Each of these is a
 finite resource.

 Shared grant entries
 ~~~~~~~~~~~~~~~~~~~~

 A set of pages is shared between Xen and a guest, holding the shared grant
 entries. The guest writes into these entries to create grant references. The
 index of the entry is transmitted to the remote domain: this is the
 reference used to activate an entry. Xen will write into a shared entry to
 indicate to a guest that its grant is in use.
  sha->domid : remote domain being granted rights
  sha->frame : machine frame being granted
  sha->flags : allow access, allow transfer, remote is reading/writing, etc.
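
 As a rough illustration of the above, the sketch below models a shared entry
 table in plain C: the guest fills in a free entry, writes the flags last, and
 hands out the index as the grant reference. All demo_* and DEMO_* names, the
 table size and the flag values are assumptions made for this example and are
 not the Xen interface. A real guest also needs memory barriers and atomic
 flag updates, which this single-threaded sketch omits.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_ENTRIES          8      /* assumed table size */
#define DEMO_FLAG_PERMIT_ACCESS  0x01u  /* assumed: entry grants access */
#define DEMO_FLAG_READONLY       0x04u  /* assumed: grant is read-only */
#define DEMO_FLAG_READING        0x08u  /* assumed: set by Xen while in use */
#define DEMO_FLAG_WRITING        0x10u  /* assumed: set by Xen while in use */

/* Hypothetical layout mirroring the sha->domid/frame/flags fields. */
struct demo_shared_entry {
    uint16_t flags;
    uint16_t domid;
    uint32_t frame;
};

static struct demo_shared_entry demo_shared[DEMO_NR_ENTRIES];

/* Fill in the first free entry; its index is the grant reference. */
int demo_grant_access(uint16_t domid, uint32_t frame, int readonly)
{
    for (int ref = 0; ref < DEMO_NR_ENTRIES; ref++) {
        if (demo_shared[ref].flags != 0)
            continue;                     /* entry already in use */
        demo_shared[ref].domid = domid;
        demo_shared[ref].frame = frame;
        /* Flags are written last: a non-zero value makes the entry valid. */
        demo_shared[ref].flags = (uint16_t)(DEMO_FLAG_PERMIT_ACCESS |
                                     (readonly ? DEMO_FLAG_READONLY : 0u));
        return ref;
    }
    return -1;                            /* table is a finite resource */
}

/* Clear an entry, refusing while the remote side is marked as using it. */
int demo_end_access(int ref)
{
    assert(ref >= 0 && ref < DEMO_NR_ENTRIES);
    if (demo_shared[ref].flags & (DEMO_FLAG_READING | DEMO_FLAG_WRITING))
        return -1;
    demo_shared[ref].flags = 0;
    return 0;
}
```

 The value returned by demo_grant_access() plays the role of the ref id that
 domA transmits to domB; demo_end_access() fails for as long as the in-use
 flags are set.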

 Active grant entries
 ~~~~~~~~~~~~~~~~~~~~

 Xen maintains a set of private frames per domain, holding the active grant
 entries for safety, and to reference count mappings.
  act->domid : remote domain being granted rights
  act->frame : machine frame being granted
  act->pin   : used to hold reference counts
  act->lock  : spinlock used to serialize access to active entry state

 Map tracking
 ~~~~~~~~~~~~

 Every time a frame is mapped, a map track entry is stored in the metadata of
 the mapping domain. The index of this entry is returned from the map call,
 and is used to unmap the frame. Map track entries are also searched whenever a
 page table entry containing a foreign frame number is overwritten: the first
 matching map track entry is then removed, as if unmap had been invoked.
 These are not used by the transfer mechanism.
  map->domid         : owner of the mapped frame
  map->ref           : grant reference
  map->flags         : ro/rw, mapped for host or device access
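
 To make the relationship between map track entries and the active entry
 reference count concrete, here is a minimal single-threaded model. It is a
 sketch only: the demo_* names, table sizes and flag values are assumptions
 for illustration, and it treats act->pin as a simple counter with no locking.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_GRANTS    4     /* assumed number of grant references */
#define DEMO_NR_MAPTRACK  4     /* assumed number of map track entries */
#define DEMO_MAP_HOST     0x1u  /* assumed flag: mapped for host access */
#define DEMO_MAP_READONLY 0x4u  /* assumed flag: read-only mapping */

struct demo_active   { uint16_t domid; uint32_t frame; unsigned int pin; };
struct demo_maptrack { uint16_t domid; uint32_t ref; uint16_t flags; };

static struct demo_active   demo_active_tab[DEMO_NR_GRANTS];
static struct demo_maptrack demo_maptrack_tab[DEMO_NR_MAPTRACK];

/* Record a mapping; the index of the map track entry is the handle. */
int demo_map(uint16_t owner, uint32_t ref, uint16_t flags)
{
    if (ref >= DEMO_NR_GRANTS || flags == 0)
        return -1;
    for (int h = 0; h < DEMO_NR_MAPTRACK; h++) {
        if (demo_maptrack_tab[h].flags != 0)
            continue;                      /* flags == 0 marks a free entry */
        demo_maptrack_tab[h].domid = owner;
        demo_maptrack_tab[h].ref   = ref;
        demo_maptrack_tab[h].flags = flags;
        demo_active_tab[ref].pin++;        /* one more mapping of this grant */
        return h;
    }
    return -1;                             /* map tracking is finite */
}

/* Undo a mapping using the handle returned by demo_map(). */
int demo_unmap(int handle)
{
    if (handle < 0 || handle >= DEMO_NR_MAPTRACK ||
        demo_maptrack_tab[handle].flags == 0)
        return -1;
    assert(demo_active_tab[demo_maptrack_tab[handle].ref].pin > 0);
    demo_active_tab[demo_maptrack_tab[handle].ref].pin--;
    demo_maptrack_tab[handle].flags = 0;   /* entry becomes free again */
    return 0;
}
```

 In this model a grant may only be safely ended once its pin count has
 dropped back to zero, i.e. once every handle obtained from demo_map() has
 been passed to demo_unmap().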

********************************************************************************
 Locking
 ~~~~~~~
 Xen uses several locks to serialize access to the internal grant table state.

  grant_table->lock          : rwlock used to prevent readers from accessing
                               inconsistent grant table state such as current
                               version, partially initialized active table pages,
                               etc.
  grant_table->maptrack_lock : spinlock used to protect the maptrack limit
  v->maptrack_freelist_lock  : spinlock used to protect the maptrack free list
  active_grant_entry->lock   : spinlock used to serialize modifications to
                               active entries

 The primary lock for the grant table is a read/write spinlock. All
 functions that access members of struct grant_table must acquire a
 read lock around critical sections. Any modification to the members
 of struct grant_table (e.g., nr_status_frames, nr_grant_frames,
 active frames, etc.) must only be made if the write lock is
 held. These elements are read-mostly, and read critical sections can
 be large, which makes a rwlock a good choice.

 The maptrack free list is protected by its own spinlock. The maptrack
 lock may be locked while holding the grant table lock.

 The maptrack_freelist_lock is an innermost lock.  It may be locked
 while holding other locks, but no other locks may be acquired within
 it.

 Active entries are obtained by calling active_entry_acquire(gt, ref).
 This function returns a pointer to the active entry after locking its
 spinlock. The caller must hold the grant table read lock before
 calling active_entry_acquire(). This is because the grant table can
 be dynamically extended via gnttab_grow_table() while a domain is
 running, and the extended table must be fully initialized before its
 entries are accessed. Once all access to the active entry is complete,
 release the lock by calling active_entry_release(act).

 Summary of rules for locking:
  active_entry_acquire() and active_entry_release() can only be
  called when holding the relevant grant table's read lock. I.e.:
    read_lock(&gt->lock);
    act = active_entry_acquire(gt, ref);
    ...
    active_entry_release(act);
    read_unlock(&gt->lock);

 Active entries cannot be acquired while holding the maptrack lock.
 Multiple active entries can be acquired while holding the grant table
 _write_ lock.

 Maptrack entries are protected by the corresponding active entry
 lock.  As an exception, new maptrack entries may be populated without
 holding the lock, provided the flags field is written last.  This
 requires that any user of a maptrack entry first validate that the
 flags field is non-zero.
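
 The "flags written last" exception can be sketched with C11 atomics, which
 stand in here for whatever barriers the hypervisor actually uses. This is an
 illustration under assumptions, not Xen code: the demo_* names and the
 struct layout are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical maptrack entry; flags == 0 means "not yet valid". */
struct demo_maptrack_entry {
    uint16_t         domid;
    uint32_t         ref;
    _Atomic uint16_t flags;
};

/* Populate an entry without holding a lock: payload first, flags last. */
void demo_publish(struct demo_maptrack_entry *e,
                  uint16_t domid, uint32_t ref, uint16_t flags)
{
    assert(flags != 0);               /* zero is reserved for "invalid" */
    e->domid = domid;
    e->ref   = ref;
    /* Release store: the payload writes above become visible first. */
    atomic_store_explicit(&e->flags, flags, memory_order_release);
}

/* Any user must validate that flags is non-zero before trusting the rest. */
int demo_read(struct demo_maptrack_entry *e, uint16_t *domid, uint32_t *ref)
{
    if (atomic_load_explicit(&e->flags, memory_order_acquire) == 0)
        return -1;                    /* entry not populated yet */
    *domid = e->domid;
    *ref   = e->ref;
    return 0;
}
```

 A reader that sees non-zero flags is guaranteed by the release/acquire pair
 to also see the domid and ref written before them; a reader that sees zero
 must treat the entry as absent.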

********************************************************************************

 Granting a foreign domain access to frames
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 domA [frame]--> domB


 domA:  #include <asm-xen/gnttab.h>
        grant_ref_t gref[BATCH_SIZE];

        for ( i = 0; i < BATCH_SIZE; i++ )
            gref[i] = gnttab_grant_foreign_access( domBid, mfn, (readonly ? 1 : 0) );


 .. gref is then somehow transmitted to domB for use.


 Mapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~

 domB:  #include <asm-xen/hypervisor.h>
        unsigned long       mmap_vstart;
        gnttab_op_t         aop[BATCH_SIZE];
        grant_ref_t         mapped_handle[BATCH_SIZE];

        if ( (mmap_vstart = allocate_empty_lowmem_region(BATCH_SIZE)) == 0 )
            BUG();

        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            aop[i].u.map_grant_ref.host_virt_addr =
                                              mmap_vstart + (i * PAGE_SIZE);
            aop[i].u.map_grant_ref.dom      = domAid;
            aop[i].u.map_grant_ref.ref      = gref[i];
            aop[i].u.map_grant_ref.flags    = ( GNTMAP_host_map | GNTMAP_readonly );
        }

        if ( unlikely(HYPERVISOR_grant_table_op(
                        GNTTABOP_map_grant_ref, aop, BATCH_SIZE)))
            BUG();

        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            if ( unlikely(aop[i].u.map_grant_ref.handle < 0) )
            {
                tidyup_all(aop, i);
                goto panic;
            }

            phys_to_machine_mapping[__pa(mmap_vstart + (i * PAGE_SIZE))>>PAGE_SHIFT] =
                FOREIGN_FRAME(aop[i].u.map_grant_ref.dev_bus_addr);

            mapped_handle[i] = aop[i].u.map_grant_ref.handle;
        }



 Unmapping foreign frames
 ~~~~~~~~~~~~~~~~~~~~~~~~

 domB:
        for ( i = 0; i < BATCH_SIZE; i++ )
        {
            aop[i].u.unmap_grant_ref.host_virt_addr = mmap_vstart + (i * PAGE_SIZE);
            aop[i].u.unmap_grant_ref.dev_bus_addr   = 0;
            aop[i].u.unmap_grant_ref.handle         = mapped_handle[i];
        }
        if ( unlikely(HYPERVISOR_grant_table_op(
                        GNTTABOP_unmap_grant_ref, aop, BATCH_SIZE)))
            BUG();


 Ending foreign access
 ~~~~~~~~~~~~~~~~~~~~~

    Note that this only prevents further mappings; it does _not_ revoke access.
    It should _only_ be used once the remote domain has unmapped the frame.
    gnttab_query_foreign_access( gref ) will indicate the state of any mapping.

 domA:
        if ( gnttab_query_foreign_access( gref[i] ) == 0 )
            gnttab_end_foreign_access( gref[i], readonly );

        TODO: readonly yet to be implemented.


********************************************************************************

 Transferring ownership of a frame to another domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 [ XXX: Transfer mechanism is alpha-calibre code, untested, use at own risk XXX ]
 [ XXX: show use of batch operations below, rather than single frame XXX ]
 [ XXX: linux internal interface could/should be wrapped to be tidier XXX ]


 Prepare to accept a frame from a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  domA:
        if ( (p = alloc_page(GFP_HIGHUSER)) == NULL )
        {
            printk("Cannot alloc a frame to surrender\n");
            break;
        }
        pfn = p - mem_map;
        mfn = phys_to_machine_mapping[pfn];

        if ( !PageHighMem(p) )
        {
            v = phys_to_virt(pfn << PAGE_SHIFT);
            scrub_pages(v, 1);
            queue_l1_entry_update(get_ptep((unsigned long)v), 0);
        }

        /* Ensure that ballooned highmem pages don't have cached mappings. */
        kmap_flush_unused();

        /* Flush updates through and flush the TLB. */
        xen_tlb_flush();

        phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;

        if ( HYPERVISOR_dom_mem_op(
            MEMOP_decrease_reservation, &mfn, 1, 0) != 1 )
        {
            printk("MEMOP_decrease_reservation failed\n");
            /* er... ok. free the page then */
            __free_page(p);
            break;
        }

        accepting_pfn = pfn;
        ref = gnttab_grant_foreign_transfer( (domid_t) args.arg[0], pfn );
        printk("Accepting dom %lu frame at ref (%d)\n", args.arg[0], ref);


 Transfer a frame to a foreign domain
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  domB:
        mmu_update_t            update;
        domid_t                 domid;
        grant_ref_t             gref;
        unsigned long           pfn, mfn, *v;
        struct page            *transfer_page = NULL;
        int                     ret;

        /* alloc a page and grant access.
         * alloc page returns a page struct. */
        if ( (transfer_page = alloc_page(GFP_HIGHUSER)) == NULL )
            return -ENOMEM;

        pfn = transfer_page - mem_map;
        mfn = phys_to_machine_mapping[pfn];

        /* need to remove all references to this page */
        if ( !PageHighMem(transfer_page) )
        {
            v = phys_to_virt(pfn << PAGE_SHIFT);
            scrub_pages(v, 1);
            sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
            queue_l1_entry_update(get_ptep((unsigned long)v), 0);
        }
#ifdef CONFIG_XEN_SCRUB_PAGES
        else
        {
            v = kmap(transfer_page);
            scrub_pages(v, 1);
            sprintf((char *)v, "This page (%lx) was transferred.\n", mfn);
            kunmap(transfer_page);
        }
#endif
        /* Delete any cached kmappings */
        kmap_flush_unused();

        /* Flush updates through and flush the TLB */
        xen_tlb_flush();

        /* invalidate in P2M */
        phys_to_machine_mapping[pfn] = INVALID_P2M_ENTRY;

        domid = (domid_t)args.arg[0];
        gref  = (grant_ref_t)args.arg[1];

        /* The low byte of gref travels in update.ptr, the high byte in
         * update.val, alongside the mfn and the destination domid. */
        update.ptr  = MMU_EXTENDED_COMMAND;
        update.ptr |= ((gref & 0x00FF) << 2);
        update.ptr |= mfn << PAGE_SHIFT;

        update.val  = MMUEXT_TRANSFER_PAGE;
        update.val |= (domid << 16);
        update.val |= (gref & 0xFF00);

        ret = HYPERVISOR_mmu_update(&update, 1, NULL);
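
 The bit packing used for update.ptr and update.val above can be checked in
 isolation. The following sketch reproduces the same shifts and masks in
 standalone C and adds a decoder for the grant reference. The demo_* names are
 hypothetical, and the two command constants are placeholders chosen only to
 fit the bit fields; they are not claimed to be the real values of
 MMU_EXTENDED_COMMAND or MMUEXT_TRANSFER_PAGE. A 12-bit page shift and a
 16-bit grant reference are also assumed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT            12     /* assumed: 4KiB pages */
#define DEMO_MMU_EXTENDED_COMMAND  0x3u   /* placeholder value */
#define DEMO_MMUEXT_TRANSFER_PAGE  0x08u  /* placeholder value */

struct demo_update { uint64_t ptr; uint64_t val; };

/* Pack domid, gref and mfn exactly as the transfer example does. */
struct demo_update demo_encode_transfer(uint16_t domid, uint16_t gref,
                                        uint64_t mfn)
{
    struct demo_update u;

    assert((mfn >> (64 - DEMO_PAGE_SHIFT)) == 0);  /* mfn must not overflow */

    u.ptr  = DEMO_MMU_EXTENDED_COMMAND;            /* bits 0-1: command     */
    u.ptr |= (uint64_t)(gref & 0x00FFu) << 2;      /* bits 2-9: gref low    */
    u.ptr |= mfn << DEMO_PAGE_SHIFT;               /* bits 12+: frame       */

    u.val  = DEMO_MMUEXT_TRANSFER_PAGE;            /* bits 0-7: sub-command */
    u.val |= (uint64_t)domid << 16;                /* bits 16+: target dom  */
    u.val |= (uint64_t)(gref & 0xFF00u);           /* bits 8-15: gref high  */
    return u;
}

/* Reassemble the grant reference from its two halves. */
uint16_t demo_decode_gref(struct demo_update u)
{
    return (uint16_t)(((u.ptr >> 2) & 0x00FFu) | (u.val & 0xFF00u));
}
```

 For example, with domid 7, gref 0x1234 and mfn 0xABCDE this yields
 ptr = 0xABCDE0D3 and val = 0x71208 under the placeholder constants, and
 demo_decode_gref() recovers 0x1234.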


 Map a transferred frame
 ~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


 Clear the used transfer reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 TODO:


********************************************************************************

 Using a private reserve of grant references
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Where it is known in advance how many grant references are required, and
failure to allocate them on demand would cause difficulty, a batch can be
allocated and held in a private reserve.

To reserve a private batch:

    /* housekeeping data - treat as opaque: */
    grant_ref_t gref_head, gref_terminal;

    if ( 0 > gnttab_alloc_grant_references( number_to_reserve,
                                            &gref_head, &gref_terminal ))
        return -ENOSPC;


To release a batch back to the shared pool:

    gnttab_free_grant_references( number_reserved, gref_head );


To claim a reserved reference:

    ref = gnttab_claim_grant_reference( &gref_head, gref_terminal );


To release a claimed reference back to the reserve pool:

    gnttab_release_grant_reference( &gref_head, ref );


To use a claimed reference to grant access, use these alternative functions
that take an additional parameter of the grant reference to use:

    gnttab_grant_foreign_access_ref
    gnttab_grant_foreign_transfer_ref
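
One way to picture the head/terminal housekeeping is as a linked free list
from which a private batch is carved off. The toy model below is a sketch
under that assumption and is not the actual implementation: the demo_* names,
the pool size and the end-of-list marker are invented for illustration, and it
assumes every claimed reference is released before the batch is freed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_REFS   8            /* assumed size of the shared pool */
#define DEMO_LIST_END  0xffffffffu  /* assumed end-of-list marker */

typedef uint32_t demo_ref_t;

static demo_ref_t   demo_next[DEMO_NR_REFS];  /* demo_next[r]: ref after r */
static demo_ref_t   demo_free_head;
static unsigned int demo_free_count;

/* Chain all references into the shared free list: 0 -> 1 -> ... -> end. */
void demo_init(void)
{
    for (demo_ref_t r = 0; r < DEMO_NR_REFS; r++)
        demo_next[r] = (r + 1 < DEMO_NR_REFS) ? r + 1 : DEMO_LIST_END;
    demo_free_head  = 0;
    demo_free_count = DEMO_NR_REFS;
}

/* Carve `count` refs off the shared list; terminal is the first ref NOT
 * in the batch and acts purely as a sentinel. */
int demo_alloc_refs(unsigned int count, demo_ref_t *head, demo_ref_t *terminal)
{
    demo_ref_t last;

    if (count == 0 || count > demo_free_count)
        return -1;
    last = demo_free_head;
    for (unsigned int i = 1; i < count; i++)
        last = demo_next[last];
    *head           = demo_free_head;
    *terminal       = demo_next[last];
    demo_free_head  = *terminal;
    demo_free_count -= count;
    return 0;
}

/* Pop one ref from the private reserve, or -1 if it is exhausted. */
int demo_claim_ref(demo_ref_t *head, demo_ref_t terminal)
{
    demo_ref_t r = *head;

    if (r == terminal)
        return -1;
    *head = demo_next[r];
    return (int)r;
}

/* Push a claimed ref back onto the private reserve. */
void demo_release_ref(demo_ref_t *head, demo_ref_t ref)
{
    assert(ref < DEMO_NR_REFS);
    demo_next[ref] = *head;
    *head = ref;
}

/* Return a fully released batch of `count` refs to the shared list. */
void demo_free_refs(unsigned int count, demo_ref_t head)
{
    demo_ref_t last = head;

    assert(count > 0);
    for (unsigned int i = 1; i < count; i++)
        last = demo_next[last];
    demo_next[last]  = demo_free_head;
    demo_free_head   = head;
    demo_free_count += count;
}
```

In this model the terminal value is never dereferenced by the owner of the
reserve; it only marks where the private batch stops, which is why the text
above says to treat the housekeeping data as opaque.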