Xen crash debugger notes
------------------------

Xen has a simple gdb stub for doing post-mortem debugging, i.e. once
you've crashed it, you get to poke around and find out why.  There's
also a special key handler for making it crash, which is handy.

You need to have CRASH_DEBUG=y set when compiling, and you also need
to enable it on the Xen command line, e.g. with gdb=com1.

If you need to have a serial port shared between gdb and the console,
you can use gdb=com1H.  CDB will then set the high bit on every byte
it sends, and only respond to bytes with the high bit set.  Similarly
for com2.  If you do this you will need a demultiplexing program on
the debugging workstation, such as perhaps tools/misc/nsplitd.

The next step depends on your individual setup.  This is how to do it
if you have a simple null modem connection between the test box and
the workstation, and aren't using an H/L split console:

  * Set debug=y in Config.mk
  * Set CRASH_DEBUG=y with `make -C xen menuconfig`
  * Make the changes in the attached patch, and build.
  * Arrange to pass gdb=com1 as a hypervisor command line argument
    (I already have com1=38400,8n1 console=com1,vga sync_console)

  * Boot the system with minicom (or your favourite terminal program)
    connected from your workstation via a null modem cable in the
    usual way.
  * In minicom, give the escape character (^A by default) three times
    to talk to Xen (Xen prints `(XEN) *** Serial input -> Xen...').
  * Press % and observe the messages
       (XEN) '%' pressed -> trapping into debugger
       (XEN) GDB connection activated.
       (XEN) Waiting for GDB to attach...
  * Disconnect from minicom without allowing minicom to send any
    modem control sequences.
  * Start gdb with   gdb /path/to/build/tree/xen/xen-syms   and then
       (gdb) set remotebaud 38400
       (gdb) target remote /dev/ttyS0
       Remote debugging using /dev/ttyS0
       0xff124d61 in idle_loop () at domain.c:78
       78              safe_halt();
       (gdb)

There is code which was once intended to make it possible to resume
after entering the debugger.  However this does not presently work; it
has been nonfunctional for quite some time.

As soon as you reach the debugger, we disable interrupts, the
watchdog, and every other CPU, so the state of the world shouldn't
change too much behind your back.


Reasons why we might fail to reach the debugger:
------------------------------------------------

-- In order to stop the other processors, we need to acquire the SMP
   call lock.  If you happen to have crashed in the middle of that,
   you're screwed.
-- If the page tables are wrong, you're screwed.
-- If the serial port setup is wrong, badness happens.
-- Obviously, the low level processor state can be screwed in any
   number of wonderful ways.
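

Sketch of the high-bit split:
-----------------------------

Going back to the gdb=com1H setup described near the top: the
workstation has to separate debugger bytes (high bit set) from console
bytes (high bit clear), and has to set the high bit on anything it
forwards from gdb.  The following is a minimal illustration of that
idea only.  It is not the code in tools/misc/nsplitd; the function
names are made up for this note, and it assumes the debugger traffic is
7-bit so that clearing the marker bit recovers the original byte.

```python
# Illustrative only: hypothetical helpers, not taken from nsplitd.

HIGH_BIT = 0x80


def demux(data):
    """Split bytes read from the shared serial line.

    Returns (console_bytes, gdb_bytes).  Bytes with the high bit set
    are treated as debugger traffic; everything else is console output.
    """
    console = bytearray()
    gdb = bytearray()
    for b in data:
        if b & HIGH_BIT:
            # Assumption: strip the marker bit before handing to gdb.
            gdb.append(b & 0x7F)
        else:
            console.append(b)
    return bytes(console), bytes(gdb)


def mux_gdb(data):
    """Set the high bit on bytes coming from gdb, so that the stub
    (which only responds to high-bit bytes) will accept them."""
    return bytes(b | HIGH_BIT for b in data)
```

A real demultiplexer would sit between the serial device and two
endpoints (one for the terminal program, one for gdb) and apply these
two transformations to the data flowing in each direction.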