Xenstore protocol specification
-------------------------------

Xenstore implements a database which maps filename-like pathnames
(also known as `keys') to values. Clients may read and write values,
watch for changes, and set permissions to allow or deny access. There
is a rudimentary transaction system.

While xenstore and most tools and APIs are capable of dealing with
arbitrary binary data as values, this should generally be avoided.
Data should generally be human-readable for ease of management and
debugging; xenstore is not a high-performance facility and should be
used only for small amounts of control plane data. Therefore xenstore
values should normally be 7-bit ASCII text strings containing bytes
0x20..0x7f only, and should not contain a trailing nul byte. (The
APIs used for accessing xenstore generally add a nul when reading, for
the caller's convenience.)

A separate specification will detail the keys and values which are
used in the Xen system and what their meanings are. (Sadly that
specification currently exists only in multiple out-of-date versions.)


Paths are /-separated and start with a /, just like Unix filenames.

We can speak of two paths being <child> and <parent>, which is the
case if they're identical, or if <parent> is /, or if <parent>/ is an
initial substring of <child>. (This includes <path> being a child of
itself.)

If a particular path exists, all of its parents do too. Every
existing path maps to a possibly empty value, and may also have zero
or more immediate children. There is thus no particular distinction
between directories and leaf nodes. However, it is conventional not
to store nonempty values at nodes which also have children.

The permitted character set for paths is ASCII alphanumerics plus
the four punctuation characters -/_@ (hyphen slash underscore atsign).
@ should be avoided except to specify special watches (see below).
Doubled slashes and trailing slashes (except to specify the root) are
forbidden. The empty path is also forbidden. Paths longer than 3072
bytes are forbidden; clients specifying relative paths should keep
them to within 2048 bytes. (See XENSTORE_*_PATH_MAX in xs_wire.h.)


Communication with xenstore is via either sockets, or event channel
and shared memory, as specified in io/xs_wire.h: each message in
either direction is a header formatted as a struct xsd_sockmsg
followed by xsd_sockmsg.len bytes of payload.

The payload syntax varies according to the type field. Generally
requests each generate a reply with an identical type, req_id and
tx_id. However, if an error occurs, a reply will be returned with
type ERROR, and only req_id and tx_id copied from the request.

A caller who sends several requests may receive the replies in any
order and must use req_id (and tx_id, if applicable) to match up
replies to requests. (The current implementation always replies to
requests in the order received but this should not be relied on.)

The payload length (len field of the header) is limited to 4096
(XENSTORE_PAYLOAD_MAX) in both directions. If a client exceeds the
limit, its xenstored connection will be immediately killed by
xenstored, which is usually catastrophic from the client's point of
view. Clients (particularly domains, which cannot just reconnect)
should avoid this.

Existing clients do not always contain defences against overly long
payloads. Increasing xenstored's limit is therefore difficult; it
would require negotiation with the client, and obviously would make
parts of xenstore inaccessible to some clients. In any case passing
bulk data through xenstore is not recommended as the performance
properties are poor.


---------- Xenstore protocol details - introduction ----------

The payload syntax and semantics of the requests and replies are
described below.
In the payload syntax specifications we use the
following notations:

 |              A nul (zero) byte.
 <foo>          A string guaranteed not to contain any nul bytes.
 <foo|>         Binary data (which may contain zero or more nul bytes)
 <foo>|*        Zero or more strings each followed by a trailing nul
 <foo>|+        One or more strings each followed by a trailing nul
 ?              Reserved value (may not contain nuls)
 ??             Reserved value (may contain nuls)

Except as otherwise noted, reserved values are believed to be sent as
empty strings by all current clients. Clients should not send
nonempty strings for reserved values; those parts of the protocol may
be used for extension in the future.


Error replies are as follows:

ERROR                                           E<something>|
        Where E<something> is the name of an errno value
        listed in io/xs_wire.h. Note that the string name
        is transmitted, not a numeric value.


Where no reply payload format is specified below, success responses
have the following payload:
                                                OK|

Values commonly included in payloads include:

    <path>
        Specifies a path in the hierarchical key structure.
        If <path> starts with a / it simply represents that path.

        <path> is allowed not to start with /, in which case the
        caller must be a domain (rather than connected via a socket)
        and the path is taken to be relative to /local/domain/<domid>
        (eg, `x/y' sent by domain 3 would mean `/local/domain/3/x/y').

    <domid>
        Integer domid, represented as decimal number 0..65535.
        Parsing errors and values out of range generally go
        undetected. The special DOMID_... values (see xen.h) are
        represented as integers; unless otherwise specified it
        is an error not to specify a real domain id.
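
As an illustration only, the path rules and payload notation described
above could be checked and encoded along the following lines. This is a
sketch, not part of the protocol: all function names are hypothetical,
and the limits are the figures quoted in this document rather than
values read from xs_wire.h.

```python
import re

# Limits as quoted in this document (see XENSTORE_*_PATH_MAX and
# XENSTORE_PAYLOAD_MAX in xs_wire.h for the authoritative values).
ABS_PATH_MAX = 3072
REL_PATH_MAX = 2048
PAYLOAD_MAX = 4096

# ASCII alphanumerics plus the four punctuation characters -/_@
_PATH_CHARS = re.compile(r"[A-Za-z0-9/_@-]+")


def is_valid_path(path):
    """Check a path against the syntax rules stated in the text."""
    if not path or not _PATH_CHARS.fullmatch(path):
        return False                      # empty path or bad character
    if "//" in path:
        return False                      # doubled slash
    if path != "/" and path.endswith("/"):
        return False                      # trailing slash (root excepted)
    limit = ABS_PATH_MAX if path.startswith("/") else REL_PATH_MAX
    return len(path) <= limit


def resolve(path, domid):
    """Expand a relative path as described for callers that are domains."""
    if path.startswith("/"):
        return path
    return "/local/domain/%d/%s" % (domid, path)


def write_payload(path, value):
    """Encode `<path>|<value|>`: nul after the path, none after the value."""
    payload = path.encode("ascii") + b"\x00" + value
    if len(payload) > PAYLOAD_MAX:
        raise ValueError("payload exceeds XENSTORE_PAYLOAD_MAX")
    return payload


def split_strings(payload):
    """Decode a `<foo>|*` payload into a list of strings."""
    if not payload:
        return []
    if not payload.endswith(b"\x00"):
        raise ValueError("last string is not nul-terminated")
    return [s.decode("ascii") for s in payload[:-1].split(b"\x00")]
```

For example, resolve("x/y", 3) yields the absolute path given in the
<path> description, and split_strings() suits replies such as the
DIRECTORY child list.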



The following are the actual type values, including the request and
reply payloads as applicable:


---------- Database read, write and permissions operations ----------

READ                    <path>|                 <value|>
WRITE                   <path>|<value|>
        Store and read the octet string <value> at <path>.
        WRITE creates any missing parent paths, with empty values.

MKDIR                   <path>|
        Ensures that the <path> exists, if necessary by creating
        it and any missing parents with empty values. If <path>
        or any parent already exists, its value is left unchanged.

RM                      <path>|
        Ensures that the <path> does not exist, by deleting
        it and all of its children. It is not an error if <path> does
        not exist, but it _is_ an error if <path>'s immediate parent
        does not exist either.

DIRECTORY               <path>|                 <child-leaf-name>|*
        Gives a list of the immediate children of <path>, as only the
        leafnames. The resulting children are each named
        <path>/<child-leaf-name>.

DIRECTORY_PART          <path>|<offset>         <gencnt>|<child-leaf-name>|*
        Same as DIRECTORY, but to be used for children lists longer than
        XENSTORE_PAYLOAD_MAX. Inputs are <path> and the byte offset into
        the list of children to return. Return values are the generation
        count <gencnt> of the node (to be used to ensure the node hasn't
        changed between two reads: <gencnt> being the same for multiple
        reads guarantees the node hasn't changed) and the list of children
        starting at the specified <offset> of the complete list.

GET_PERMS               <path>|                 <perm-as-string>|+
SET_PERMS               <path>|<perm-as-string>|+?
        <perm-as-string> is one of the following
                w<domid>        write only
                r<domid>        read only
                b<domid>        both read and write
                n<domid>        no access
        See https://wiki.xen.org/wiki/XenBus section
        `Permissions' for details of the permissions system.
        It is possible to set permissions for the special watch paths
        "@introduceDomain" and "@releaseDomain" to enable receiving those
        watches in unprivileged domains.

---------- Watches ----------

WATCH                   <wpath>|<token>|?
        Adds a watch.

        When a <path> is modified (including path creation, removal,
        contents change or permissions change) this generates an event
        on the changed <path>. Changes made in transactions cause an
        event only if and when committed. Each occurring event is
        matched against all the watches currently set up, and each
        matching watch results in a WATCH_EVENT message (see below).

        The event's path matches the watch's <wpath> if it is a child
        of <wpath>.

        <wpath> can be a <path> to watch or @<wspecial>. In the
        latter case <wspecial> may have any syntax but it matches
        (according to the rules above) only the following special
        events which are invented by xenstored:
            @introduceDomain    occurs on INTRODUCE
            @releaseDomain      occurs on any domain crash or
                                shutdown, and also on RELEASE
                                and domain destruction
        <wspecial> events are sent only to privileged callers and to
        domains explicitly enabled via SET_PERMS.

        When a watch is first set up it is triggered once straight
        away, with <path> equal to <wpath>. Watches may be triggered
        spuriously. The tx_id in a WATCH request is ignored.

        Watches are supposed to be restricted by the permissions
        system but in practice the implementation is imperfect.
        Applications should not rely on being sent a notification for
        paths that they cannot read; however, an application may rely
        on being sent a watch event when a path which it _is_ able to
        read is deleted even if that leaves only a nonexistent unreadable
        parent.
        A notification may be omitted if a node's permissions
        are changed so as to make it unreadable, in which case future
        notifications may be suppressed (and if the node is later made
        readable, some notifications may have been lost).

WATCH_EVENT                                     <epath>|<token>|
        Unsolicited `reply' generated for matching modification events
        as described above. req_id and tx_id are both 0.

        <epath> is the event's path, ie the actual path that was
        modified; however if the event was the recursive removal of a
        parent of <wpath>, <epath> is just
        <wpath> (rather than the actual path which was removed). So
        <epath> is a child of <wpath>, regardless.

        Iff <wpath> for the watch was specified as a relative pathname,
        the <epath> path will also be relative (with the same base,
        obviously).

UNWATCH                 <wpath>|<token>|?

RESET_WATCHES           |
        Reset all watches and transactions of the caller.

---------- Transactions ----------

TRANSACTION_START       |                       <transid>|
        <transid> is an opaque uint32_t allocated by xenstored,
        represented as unsigned decimal. After this, the transaction may
        be referenced by using <transid> (as 32-bit binary) in the
        tx_id request header field. When a transaction is started the
        whole db is copied; reads and writes happen on the copy.
        It is not legal to send non-0 tx_id in TRANSACTION_START.

TRANSACTION_END         T|
TRANSACTION_END         F|
        tx_id must refer to an existing transaction. After this
        request the tx_id is no longer valid and may be reused by
        xenstore. If F, the transaction is discarded. If T,
        it is committed: if there were any other intervening writes
        then our END gets EAGAIN.

        The plan is that in the future only intervening `conflicting'
        writes cause EAGAIN, meaning only writes or other commits
        which changed paths which were read or written in the
        transaction at hand.
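
Because a commit can fail with EAGAIN, callers normally wrap the whole
transaction in a retry loop. The following is a minimal sketch of such
a loop, under stated assumptions: the Eagain exception, the
run_transaction helper, the FakeClient class and its method names are
all hypothetical stand-ins, not an existing client API. FakeClient
exists only so that the loop can be exercised without a xenstored.

```python
class Eagain(Exception):
    """Hypothetical stand-in for an ERROR reply whose payload is EAGAIN|."""


def run_transaction(client, body, max_attempts=8):
    """Run body(client, tx_id) in a transaction, retrying on EAGAIN.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        tx_id = client.transaction_start()        # TRANSACTION_START |
        try:
            body(client, tx_id)
        except Exception:
            client.transaction_end(tx_id, False)  # TRANSACTION_END F|
            raise
        try:
            client.transaction_end(tx_id, True)   # TRANSACTION_END T|
            return attempt
        except Eagain:
            # The tx_id is no longer valid; start again from scratch.
            continue
    raise RuntimeError("transaction still conflicting after retries")


class FakeClient:
    """In-memory toy used only to exercise the loop above."""

    def __init__(self, conflicts):
        self.conflicts = conflicts    # how many commits should fail
        self.next_id = 1
        self.committed = []

    def transaction_start(self):
        tx_id = self.next_id
        self.next_id += 1
        return tx_id

    def transaction_end(self, tx_id, commit):
        if commit and self.conflicts > 0:
            self.conflicts -= 1
            raise Eagain()
        if commit:
            self.committed.append(tx_id)
```

The body is re-run in full on every attempt, since each new transaction
works on a fresh copy of the database and anything read in a failed
attempt may be stale.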

---------- Domain management and xenstored communications ----------

INTRODUCE               <domid>|<gfn>|<evtchn>|?
        Notifies xenstored to communicate with this domain.

        INTRODUCE is currently only used by xend (during domain
        startup and various forms of restore and resume), and
        xenstored prevents its use other than by dom0.

        <domid> must be a real domain id (not 0 and not a special
        DOMID_... value). <gfn> must be a page in that domain
        represented in signed decimal (!). <evtchn> must be an
        unbound event channel in <domid> (likewise in
        decimal), on which xenstored will call bind_interdomain.
        Violations of these rules may result in undefined behaviour;
        for example passing a high-bit-set 32-bit gfn as an unsigned
        decimal will attempt to use 0x7fffffff instead (!).

RELEASE                 <domid>|
        Manually requests that xenstored disconnect from the domain.
        The event channel is unbound at the xenstored end and the page
        unmapped. If the domain is still running it won't be able to
        communicate with xenstored. NB that xenstored will in any
        case detect domain destruction and disconnect by itself.
        xenstored prevents the use of RELEASE other than by dom0.

GET_DOMAIN_PATH         <domid>|                <path>|
        Returns the domain's base path, as is used for relative
        paths: ie, /local/domain/<domid> (with <domid>
        normalised). The answer will be useless unless <domid> is a
        real domain id.

IS_DOMAIN_INTRODUCED    <domid>|                T| or F|
        Returns T if xenstored is in communication with the domain:
        ie, if INTRODUCE for the domain has not yet been followed by
        domain destruction or explicit RELEASE.

SET_TARGET              <domid>|<tdomid>|
        Notifies xenstored that domain <domid> is targeting domain
        <tdomid>. This grants domain <domid> full access to paths
        owned by <tdomid>. Domain <domid> also inherits all
        permissions granted to <tdomid> on all other paths.
        This allows <domid> to behave as if it were dom0 when modifying
        paths related to <tdomid>.

        xenstored prevents the use of SET_TARGET other than by dom0.

---------- Miscellaneous ----------

CONTROL                 <command>|[<parameters>|]
        Send a control command <command> with optional parameters
        (<parameters>) to the Xenstore daemon.

        The set of commands and their semantics is implementation
        specific and is likely to change from one Xen version to the
        next. Out-of-tree users will encounter compatibility issues.

        Current commands are:
        check
                checks xenstored innards
        log|on
                turn xenstore logging on
        log|off
                turn xenstore logging off
        logfile|<file-name>
                log to specified file
        memreport|[<file-name>]
                print memory statistics to logfile (no <file-name>
                specified) or to the specified file
        print|<string>
                print <string> to syslog (xenstore runs as daemon) or
                to console (xenstore runs as stubdom)
        help                    <supported-commands>
                return list of supported commands for CONTROL

DEBUG
        Deprecated, now named CONTROL
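
To make the framing concrete, here is a hedged sketch of how a CONTROL
request such as `log|on' might be assembled into a complete message.
It rests on assumptions that this document does not state and that
should be checked against io/xs_wire.h: that struct xsd_sockmsg
consists of four 32-bit unsigned fields (type, req_id, tx_id, len) in
host byte order, and that the numeric type value of CONTROL is 0. The
helper names are hypothetical.

```python
import struct

# ASSUMPTION: struct xsd_sockmsg is four uint32_t fields in host byte
# order: type, req_id, tx_id, len. Confirm against io/xs_wire.h.
HEADER = struct.Struct("=IIII")

PAYLOAD_MAX = 4096   # XENSTORE_PAYLOAD_MAX, as quoted in this document
XS_CONTROL = 0       # ASSUMPTION: numeric type value; see io/xs_wire.h


def control_payload(command, *parameters):
    """Encode <command>|[<parameters>|] as nul-terminated ASCII strings."""
    parts = (command,) + parameters
    return b"".join(p.encode("ascii") + b"\x00" for p in parts)


def frame(msg_type, req_id, tx_id, payload):
    """Prefix a payload with its header, enforcing the payload limit."""
    if len(payload) > PAYLOAD_MAX:
        raise ValueError("payload exceeds XENSTORE_PAYLOAD_MAX")
    return HEADER.pack(msg_type, req_id, tx_id, len(payload)) + payload


def unframe(data):
    """Split one complete message into (type, req_id, tx_id, payload)."""
    msg_type, req_id, tx_id, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated message")
    return msg_type, req_id, tx_id, payload


# Example: the `log|on' command from the list above, with req_id 7.
message = frame(XS_CONTROL, 7, 0, control_payload("log", "on"))
```

Checking the length before sending matters because, as noted earlier, a
client that exceeds the payload limit has its connection killed.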