Xenstore protocol specification
-------------------------------

Xenstore implements a database which maps filename-like pathnames
(also known as `keys') to values.  Clients may read and write values,
watch for changes, and set permissions to allow or deny access.  There
is a rudimentary transaction system.

While xenstore and most tools and APIs are capable of dealing with
arbitrary binary data as values, this should generally be avoided.
Data should be human-readable for ease of management and
debugging; xenstore is not a high-performance facility and should be
used only for small amounts of control plane data.  Therefore xenstore
values should normally be 7-bit ASCII text strings containing bytes
0x20..0x7f only, and should not contain a trailing nul byte.  (The
APIs used for accessing xenstore generally add a nul when reading, for
the caller's convenience.)

A separate specification will detail the keys and values which are
used in the Xen system and what their meanings are.  (Sadly that
specification currently exists only in multiple out-of-date versions.)


Paths are /-separated and start with a /, just like Unix filenames.

We can speak of two paths being <child> and <parent>, which is the
case if they're identical, or if <parent> is /, or if <parent>/ is an
initial substring of <child>.  (This includes <path> being a child of
itself.)
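
As an illustration only, the child/parent relation defined above can
be sketched in Python.  The function name is_child is hypothetical
and not part of any xenstore API.

```python
def is_child(child: str, parent: str) -> bool:
    """Return True if `child` is a child of `parent` as defined above.

    Three cases: identical paths, parent is the root, or `parent`
    followed by a slash is an initial substring of `child`.
    """
    return (
        child == parent
        or parent == "/"
        or child.startswith(parent + "/")
    )
```

Note that "/ab" is not a child of "/a": the comparison is against
<parent>/ rather than <parent>, so sibling names sharing a prefix do
not match.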

If a particular path exists, all of its parents do too.  Every
existing path maps to a possibly empty value, and may also have zero
or more immediate children.  There is thus no particular distinction
between directories and leaf nodes.  However, it is conventional not
to store nonempty values at nodes which also have children.

The permitted character set for paths is ASCII alphanumerics plus
the four punctuation characters -/_@ (hyphen slash underscore atsign).
@ should be avoided except to specify special watches (see below).
Doubled slashes and trailing slashes (except to specify the root) are
forbidden.  The empty path is also forbidden.  Paths longer than 3072
bytes are forbidden; clients specifying relative paths should keep
them to within 2048 bytes.  (See XENSTORE_*_PATH_MAX in xs_wire.h.)
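
The path rules above could be checked on the client side roughly as
follows.  This is a sketch, not the check xenstored performs: the
names is_valid_path, ABS_PATH_MAX and REL_PATH_MAX are made up for
the example, and the 2048-byte figure is treated here as a hard limit
although the text only recommends it.

```python
import re

ABS_PATH_MAX = 3072  # limit for absolute paths quoted in the text
REL_PATH_MAX = 2048  # recommended limit for relative paths

# ASCII alphanumerics plus the four punctuation characters - / _ @
_ALLOWED = re.compile(r"[A-Za-z0-9/_@-]+")


def is_valid_path(path: str) -> bool:
    """Apply the syntactic path rules described above."""
    if not path or _ALLOWED.fullmatch(path) is None:
        return False  # empty path or forbidden character
    if "//" in path:
        return False  # doubled slash
    if path != "/" and path.endswith("/"):
        return False  # trailing slash, except for the root
    limit = ABS_PATH_MAX if path.startswith("/") else REL_PATH_MAX
    # All permitted characters are ASCII, so characters equal bytes.
    return len(path) <= limit
```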


Communication with xenstore is via either sockets, or event channel
and shared memory, as specified in io/xs_wire.h: each message in
either direction is a header formatted as a struct xsd_sockmsg
followed by xsd_sockmsg.len bytes of payload.

The payload syntax varies according to the type field.  Generally
requests each generate a reply with an identical type, req_id and
tx_id.  However, if an error occurs, a reply will be returned with
type ERROR, and only req_id and tx_id copied from the request.

A caller who sends several requests may receive the replies in any
order and must use req_id (and tx_id, if applicable) to match up
replies to requests.  (The current implementation always replies to
requests in the order received but this should not be relied on.)

The payload length (len field of the header) is limited to 4096
(XENSTORE_PAYLOAD_MAX) in both directions.  If a client exceeds the
limit, its xenstored connection will be immediately killed by
xenstored, which is usually catastrophic from the client's point of
view.  Clients (particularly domains, which cannot just reconnect)
should avoid this.
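
The framing described above can be sketched in Python.  The header
layout used here (four unsigned 32-bit fields type, req_id, tx_id and
len, in native byte order) is an assumption that should be checked
against struct xsd_sockmsg in io/xs_wire.h.  The function names are
hypothetical, and the numeric type is left to the caller.

```python
import struct

XENSTORE_PAYLOAD_MAX = 4096

# Assumed layout of struct xsd_sockmsg: type, req_id, tx_id, len,
# each an unsigned 32-bit integer in native byte order.
HEADER = struct.Struct("=IIII")


def pack_message(msg_type: int, req_id: int, tx_id: int,
                 payload: bytes) -> bytes:
    """Build header plus payload, refusing oversized payloads."""
    if len(payload) > XENSTORE_PAYLOAD_MAX:
        # Sending this would get the connection killed by xenstored.
        raise ValueError("payload exceeds XENSTORE_PAYLOAD_MAX")
    return HEADER.pack(msg_type, req_id, tx_id, len(payload)) + payload


def unpack_message(data: bytes):
    """Split one message into (type, req_id, tx_id, payload)."""
    msg_type, req_id, tx_id, length = HEADER.unpack_from(data)
    if length > XENSTORE_PAYLOAD_MAX:
        raise ValueError("peer announced an oversized payload")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, req_id, tx_id, payload
```

Checking the length before sending is the point of the sketch: a
domain that exceeds the limit cannot simply reconnect.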

Existing clients do not always contain defences against overly long
payloads.  Increasing xenstored's limit is therefore difficult; it
would require negotiation with the client, and obviously would make
parts of xenstore inaccessible to some clients.  In any case passing
bulk data through xenstore is not recommended as the performance
properties are poor.


---------- Xenstore protocol details - introduction ----------

The payload syntax and semantics of the requests and replies are
described below.  In the payload syntax specifications we use the
following notations:

 |		A nul (zero) byte.
 <foo>		A string guaranteed not to contain any nul bytes.
 <foo|>		Binary data (which may contain zero or more nul bytes)
 <foo>|*	Zero or more strings each followed by a trailing nul
 <foo>|+	One or more strings each followed by a trailing nul
 ?		Reserved value (may not contain nuls)
 ??		Reserved value (may contain nuls)

Except as otherwise noted, reserved values are believed to be sent as
empty strings by all current clients.  Clients should not send
nonempty strings for reserved values; those parts of the protocol may
be used for extension in the future.


Error replies are as follows:

ERROR						E<something>|
	Where E<something> is the name of an errno value
	listed in io/xs_wire.h.  Note that the string name
	is transmitted, not a numeric value.


Where no reply payload format is specified below, success responses
have the following payload:
						OK|
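
A client might decode the two generic reply payloads above along
these lines.  This is a minimal sketch; the helper names are
hypothetical, and the error name is returned as a string precisely
because no numeric value is transmitted.

```python
def parse_error_reply(payload: bytes) -> str:
    """Return the errno name from an ERROR payload E<something>|."""
    if not payload.endswith(b"\0") or b"\0" in payload[:-1]:
        raise ValueError("expected exactly one nul-terminated string")
    return payload[:-1].decode("ascii")


def is_ok_reply(payload: bytes) -> bool:
    """True if the payload is the generic success payload OK|."""
    return payload == b"OK\0"
```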

Values commonly appearing in payloads include:

    <path>
	Specifies a path in the hierarchical key structure.
	If <path> starts with a / it simply represents that path.

	<path> is allowed not to start with /, in which case the
	caller must be a domain (rather than connected via a socket)
	and the path is taken to be relative to /local/domain/<domid>
	(eg, `x/y' sent by domain 3 would mean `/local/domain/3/x/y').

    <domid>
	Integer domid, represented as a decimal number 0..65535.
	Parsing errors and values out of range generally go
	undetected.  The special DOMID_... values (see xen.h) are
	represented as integers; unless otherwise specified it
	is an error not to specify a real domain id.
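
The two value conventions above can be illustrated as follows.  Both
function names are hypothetical.  parse_domid is deliberately
stricter than the behaviour described in the text, where parsing
errors generally go undetected, and special watch paths beginning
with @ are outside the scope of resolve_path.

```python
def resolve_path(path: str, caller_domid: int) -> str:
    """Turn a <path> into an absolute path.

    A path starting with / is returned unchanged; anything else is
    taken relative to /local/domain/<domid> of the calling domain.
    """
    if path.startswith("/"):
        return path
    return f"/local/domain/{caller_domid}/{path}"


def parse_domid(text: str) -> int:
    """Parse a decimal <domid>, rejecting anything outside 0..65535."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("domid must be a decimal number")
    domid = int(text)
    if domid > 65535:
        raise ValueError("domid out of range")
    return domid
```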



The following are the actual type values, including the request and
reply payloads as applicable:


---------- Database read, write and permissions operations ----------

READ			<path>|			<value|>
WRITE			<path>|<value|>
	Store and read the octet string <value> at <path>.
	WRITE creates any missing parent paths, with empty values.

MKDIR			<path>|
	Ensures that the <path> exists, if necessary by creating
	it and any missing parents with empty values.  If <path>
	or any parent already exists, its value is left unchanged.

RM			<path>|
	Ensures that the <path> does not exist, by deleting
	it and all of its children.  It is not an error if <path> does
	not exist, but it _is_ an error if <path>'s immediate parent
	does not exist either.

DIRECTORY		<path>|			<child-leaf-name>|*
	Gives a list of the immediate children of <path>, as only the
	leafnames.  The resulting children are each named
	<path>/<child-leaf-name>.

DIRECTORY_PART		<path>|<offset>		<gencnt>|<child-leaf-name>|*
	Same as DIRECTORY, but to be used for children lists longer than
	XENSTORE_PAYLOAD_MAX.  Inputs are <path> and the byte offset into
	the list of children to return.  Return values are the generation
	count <gencnt> of the node (to be used to ensure the node hasn't
	changed between two reads: <gencnt> being the same for multiple
	reads guarantees the node hasn't changed) and the list of children
	starting at the specified <offset> of the complete list.

GET_PERMS		<path>|			<perm-as-string>|+
SET_PERMS		<path>|<perm-as-string>|+?
	<perm-as-string> is one of the following:
		w<domid>	write only
		r<domid>	read only
		b<domid>	both read and write
		n<domid>	no access
	See https://wiki.xen.org/wiki/XenBus section
	`Permissions' for details of the permissions system.
	It is possible to set permissions for the special watch paths
	"@introduceDomain" and "@releaseDomain" to enable receiving those
	watches in unprivileged domains.
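
To make the payload notation concrete, here is a sketch of building
a WRITE payload, splitting a DIRECTORY reply and decoding a single
<perm-as-string>.  All function names are hypothetical, and paths
and leaf names are assumed to be ASCII as required above.

```python
def write_payload(path: str, value: bytes) -> bytes:
    """WRITE request payload: <path>|<value|>.

    The value is binary data and gets no trailing nul.
    """
    return path.encode("ascii") + b"\0" + value


def parse_directory_reply(payload: bytes) -> list:
    """DIRECTORY reply payload: <child-leaf-name>|*."""
    if not payload:
        return []  # zero strings: the node has no children
    if not payload.endswith(b"\0"):
        raise ValueError("child list is not nul-terminated")
    return [name.decode("ascii") for name in payload[:-1].split(b"\0")]


PERM_MEANING = {
    "w": "write only",
    "r": "read only",
    "b": "both read and write",
    "n": "no access",
}


def parse_perm(perm: str):
    """Split one <perm-as-string> into (letter, domid)."""
    letter, domid = perm[:1], perm[1:]
    if letter not in PERM_MEANING:
        raise ValueError("unknown permission letter")
    if not (domid.isascii() and domid.isdigit()):
        raise ValueError("domid must be a decimal number")
    return letter, int(domid)
```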

---------- Watches ----------

WATCH			<wpath>|<token>|?
	Adds a watch.

	When a <path> is modified (including path creation, removal,
	contents change or permissions change) this generates an event
	on the changed <path>.  Changes made in transactions cause an
	event only if and when committed.  Each occurring event is
	matched against all the watches currently set up, and each
	matching watch results in a WATCH_EVENT message (see below).

	The event's path matches the watch's <wpath> if it is a child
	of <wpath>.

	<wpath> can be a <path> to watch or @<wspecial>.  In the
	latter case <wspecial> may have any syntax but it matches
	(according to the rules above) only the following special
	events which are invented by xenstored:
	    @introduceDomain	occurs on INTRODUCE
	    @releaseDomain	occurs on any domain crash or
				shutdown, and also on RELEASE
				and domain destruction
	<wspecial> events are sent only to privileged callers or to
	domains explicitly enabled via SET_PERMS.

	When a watch is first set up it is triggered once straight
	away, with <path> equal to <wpath>.  Watches may be triggered
	spuriously.  The tx_id in a WATCH request is ignored.

	Watches are supposed to be restricted by the permissions
	system but in practice the implementation is imperfect.
	Applications should not rely on being sent a notification for
	paths that they cannot read; however, an application may rely
	on being sent a watch when a path which it _is_ able to read
	is deleted even if that leaves only a nonexistent unreadable
	parent.  A notification may be omitted if a node's permissions
	are changed so as to make it unreadable, in which case future
	notifications may be suppressed (and if the node is later made
	readable, some notifications may have been lost).

WATCH_EVENT					<epath>|<token>|
	Unsolicited `reply' generated for matching modification events
	as described above.  req_id and tx_id are both 0.

	<epath> is the event's path, ie the actual path that was
	modified; however if the event was the recursive removal of a
	parent of <wpath>, <epath> is just <wpath> (rather than the
	actual path which was removed).  So <epath> is a child of
	<wpath>, regardless.

	Iff <wpath> for the watch was specified as a relative pathname,
	the <epath> path will also be relative (with the same base,
	obviously).

UNWATCH			<wpath>|<token>|?

RESET_WATCHES		|
	Reset all watches and transactions of the caller.
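
The watch payloads and the matching rule can be sketched as below.
The function names are hypothetical, the reserved field of WATCH is
sent as an empty string, and tokens are assumed to be ASCII for the
purposes of the example.

```python
def watch_payload(wpath: str, token: str) -> bytes:
    """WATCH request payload <wpath>|<token>|? with ? left empty."""
    return wpath.encode("ascii") + b"\0" + token.encode("ascii") + b"\0"


def parse_watch_event(payload: bytes):
    """WATCH_EVENT payload <epath>|<token>| as (epath, token)."""
    parts = payload.split(b"\0")
    # Two nul-terminated strings split into three parts, the last empty.
    if len(parts) != 3 or parts[2] != b"":
        raise ValueError("expected <epath>|<token>|")
    return parts[0].decode("ascii"), parts[1].decode("ascii")


def watch_matches(event_path: str, wpath: str) -> bool:
    """An event matches a watch if its path is a child of <wpath>."""
    return (
        event_path == wpath
        or wpath == "/"
        or event_path.startswith(wpath + "/")
    )
```

Since watches may fire spuriously, and fire once on registration, a
handler would typically re-read the node it cares about rather than
assume that the event implies a change.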

---------- Transactions ----------

TRANSACTION_START	|			<transid>|
	<transid> is an opaque uint32_t allocated by xenstored,
	represented as unsigned decimal.  After this, the transaction
	may be referenced by using <transid> (as 32-bit binary) in the
	tx_id request header field.  When a transaction is started the
	whole db is copied; reads and writes happen on the copy.
	It is not legal to send a non-0 tx_id in TRANSACTION_START.

TRANSACTION_END		T|
TRANSACTION_END		F|
	tx_id must refer to an existing transaction.  After this
	request the tx_id is no longer valid and may be reused by
	xenstore.  If F, the transaction is discarded.  If T,
	it is committed: if there were any other intervening writes
	then our END gets EAGAIN.

	The plan is that in the future only intervening `conflicting'
	writes cause EAGAIN, meaning only writes or other commits
	which changed paths which were read or written in the
	transaction at hand.
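
Because a commit can fail with EAGAIN, callers usually wrap a
transaction in a retry loop.  The sketch below assumes a hypothetical
client object with transaction_start() and transaction_end() methods
that raise XenstoreError carrying the error name; none of these names
come from a real library.  FakeClient exists only to make the example
runnable.

```python
class XenstoreError(Exception):
    """Hypothetical exception carrying the errno name from ERROR."""

    def __init__(self, name: str):
        super().__init__(name)
        self.name = name


def run_transaction(client, body, max_attempts: int = 10):
    """Run body(tx_id) in a transaction, retrying on EAGAIN."""
    for _ in range(max_attempts):
        tx_id = client.transaction_start()
        try:
            result = body(tx_id)
        except Exception:
            client.transaction_end(tx_id, commit=False)  # sends F|
            raise
        try:
            client.transaction_end(tx_id, commit=True)   # sends T|
            return result
        except XenstoreError as err:
            if err.name != "EAGAIN":
                raise
            # tx_id is no longer valid; start again with a new one.
    raise XenstoreError("EAGAIN")


class FakeClient:
    """Stand-in whose first `conflicts` commits fail with EAGAIN."""

    def __init__(self, conflicts: int):
        self.conflicts = conflicts
        self.next_id = 1
        self.commits = 0

    def transaction_start(self) -> int:
        tx_id = self.next_id
        self.next_id += 1
        return tx_id

    def transaction_end(self, tx_id: int, commit: bool) -> None:
        if not commit:
            return
        if self.conflicts > 0:
            self.conflicts -= 1
            raise XenstoreError("EAGAIN")
        self.commits += 1
```

The body is re-executed in full on each attempt, so it should read
and write only through the tx_id it is handed and have no other side
effects.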

---------- Domain management and xenstored communications ----------

INTRODUCE		<domid>|<gfn>|<evtchn>|?
	Notifies xenstored to communicate with this domain.

	INTRODUCE is currently only used by xend (during domain
	startup and various forms of restore and resume), and
	xenstored prevents its use other than by dom0.

	<domid> must be a real domain id (not 0 and not a special
	DOMID_... value).  <gfn> must be a page in that domain
	represented in signed decimal (!).  <evtchn> must be an
	unbound event channel in <domid> (likewise in decimal), on
	which xenstored will call bind_interdomain.
	Violations of these rules may result in undefined behaviour;
	for example passing a high-bit-set 32-bit gfn as an unsigned
	decimal will attempt to use 0x7fffffff instead (!).

RELEASE			<domid>|
	Manually requests that xenstored disconnect from the domain.
	The event channel is unbound at the xenstored end and the page
	unmapped.  If the domain is still running it won't be able to
	communicate with xenstored.  NB that xenstored will in any
	case detect domain destruction and disconnect by itself.
	xenstored prevents the use of RELEASE other than by dom0.

GET_DOMAIN_PATH		<domid>|		<path>|
	Returns the domain's base path, as is used for relative
	paths: ie, /local/domain/<domid> (with <domid>
	normalised).  The answer will be useless unless <domid> is a
	real domain id.

IS_DOMAIN_INTRODUCED	<domid>|		T| or F|
	Returns T if xenstored is in communication with the domain:
	ie, if INTRODUCE for the domain has not yet been followed by
	domain destruction or explicit RELEASE.

SET_TARGET		<domid>|<tdomid>|
	Notifies xenstored that domain <domid> is targeting domain
	<tdomid>.  This grants domain <domid> full access to paths
	owned by <tdomid>.  Domain <domid> also inherits all
	permissions granted to <tdomid> on all other paths.  This
	allows <domid> to behave as if it were dom0 when modifying
	paths related to <tdomid>.

	xenstored prevents the use of SET_TARGET other than by dom0.

---------- Miscellaneous ----------

CONTROL			<command>|[<parameters>|]
	Send a control command <command> with optional parameters
	(<parameters>) to the Xenstore daemon.

	The set of commands and their semantics is implementation
	specific and is likely to change from one Xen version to the
	next.  Out-of-tree users will encounter compatibility issues.

	Current commands are:
	check
		checks xenstored innards
	log|on
		turn xenstore logging on
	log|off
		turn xenstore logging off
	logfile|<file-name>
		log to specified file
	memreport|[<file-name>]
		print memory statistics to logfile (no <file-name>
		specified) or to specific file
	print|<string>
		print <string> to syslog (xenstore runs as daemon) or
		to console (xenstore runs as stubdom)
	help			<supported-commands>
		return list of supported commands for CONTROL

DEBUG
	Deprecated, now named CONTROL