1.. SPDX-License-Identifier: GPL-2.0+ 2.. Copyright (c) 2013 The Chromium OS Authors. 3 4Tracing in U-Boot 5================= 6 7U-Boot supports a simple tracing feature which allows a record of excecution 8to be collected and sent to a host machine for analysis. At present the 9main use for this is to profile boot time. 10 11 12Overview 13-------- 14 15The trace feature uses GCC's instrument-functions feature to trace all 16function entry/exit points. These are then recorded in a memory buffer. 17The memory buffer can be saved to the host over a network link using 18tftpput or by writing to an attached memory device such as MMC. 19 20On the host, the file is first converted with a tool called 'proftool', 21which extracts useful information from it. The resulting trace output 22resembles that emitted by Linux's ftrace feature, so can be visually 23displayed by pytimechart. 24 25 26Quick-start using Sandbox 27------------------------- 28 29Sandbox is a build of U-Boot that can run under Linux so it is a convenient 30way of trying out tracing before you use it on your actual board. To do 31this, follow these steps: 32 33Add the following to include/configs/sandbox.h (if not already there) 34 35.. code-block:: c 36 37 #define CONFIG_TRACE 38 #define CONFIG_CMD_TRACE 39 #define CONFIG_TRACE_BUFFER_SIZE (16 << 20) 40 #define CONFIG_TRACE_EARLY_SIZE (8 << 20) 41 #define CONFIG_TRACE_EARLY 42 #define CONFIG_TRACE_EARLY_ADDR 0x00100000 43 44Build sandbox U-Boot with tracing enabled: 45 46.. code-block:: console 47 48 $ make FTRACE=1 O=sandbox sandbox_config 49 $ make FTRACE=1 O=sandbox 50 51Run sandbox, wait for a bit of trace information to appear, and then capture 52a trace: 53 54.. code-block:: console 55 56 $ ./sandbox/u-boot 57 58 U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) 59 60 DRAM: 128 MiB 61 trace: enabled 62 Using default environment 63 64 In: serial 65 Out: serial 66 Err: serial 67 =>trace stats 68 671,406 function sites 69 69,712 function calls 70 0 untracked function calls 71 73,373 traced function calls 72 16 maximum observed call depth 73 15 call depth limit 74 66,491 calls not traced due to depth 75 =>trace stats 76 671,406 function sites 77 1,279,450 function calls 78 0 untracked function calls 79 950,490 traced function calls (333217 dropped due to overflow) 80 16 maximum observed call depth 81 15 call depth limit 82 1,275,767 calls not traced due to depth 83 =>trace calls 0 e00000 84 Call list dumped to 00000000, size 0xae0a40 85 =>print 86 baudrate=115200 87 profbase=0 88 profoffset=ae0a40 89 profsize=e00000 90 stderr=serial 91 stdin=serial 92 stdout=serial 93 94 Environment size: 117/8188 bytes 95 =>host save host 0 trace 0 ${profoffset} 96 11405888 bytes written in 10 ms (1.1 GiB/s) 97 =>reset 98 99 100Then run proftool to convert the trace information to ftrace format 101 102.. code-block:: console 103 104 $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt 105 106Finally run pytimechart to display it 107 108.. code-block:: console 109 110 $ pytimechart trace.txt 111 112Using this tool you can zoom and pan across the trace, with the function 113calls on the left and little marks representing the start and end of each 114function. 115 116 117CONFIG Options 118-------------- 119 120CONFIG_TRACE 121 Enables the trace feature in U-Boot. 122 123CONFIG_CMD_TRACE 124 Enables the trace command. 125 126CONFIG_TRACE_BUFFER_SIZE 127 Size of trace buffer to allocate for U-Boot. This buffer is 128 used after relocation, as a place to put function tracing 129 information. The address of the buffer is determined by 130 the relocation code. 131 132CONFIG_TRACE_EARLY 133 Define this to start tracing early, before relocation. 134 135CONFIG_TRACE_EARLY_SIZE 136 Size of 'early' trace buffer. Before U-Boot has relocated 137 it doesn't have a proper trace buffer. On many boards 138 you can define an area of memory to use for the trace 139 buffer until the 'real' trace buffer is available after 140 relocation. The contents of this buffer are then copied to 141 the real buffer. 142 143CONFIG_TRACE_EARLY_ADDR 144 Address of early trace buffer 145 146 147Building U-Boot with Tracing Enabled 148------------------------------------ 149 150Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. 151This is kept as a separate option so that it is easy to enable/disable 152instrumenting from the command line instead of having to change board 153config files. 154 155 156Collecting Trace Data 157--------------------- 158 159When you run U-Boot on your board it will collect trace data up to the 160limit of the trace buffer size you have specified. Once that is exhausted 161no more data will be collected. 162 163Collecting trace data has an affect on execution time/performance. You 164will notice this particularly with trvial functions - the overhead of 165recording their execution may even exceed their normal execution time. 166In practice this doesn't matter much so long as you are aware of the 167effect. Once you have done your optimisations, turn off tracing before 168doing end-to-end timing. 169 170The best time to start tracing is right at the beginning of U-Boot. The 171best time to stop tracing is right at the end. In practice it is hard 172to achieve these ideals. 173 174This implementation enables tracing early in board_init_f(). This means 175that it captures most of the board init process, missing only the 176early architecture-specific init. However, it also misses the entire 177SPL stage if there is one. 178 179U-Boot typically ends with a 'bootm' command which loads and runs an 180OS. There is useful trace data in the execution of that bootm 181command. Therefore this implementation provides a way to collect trace 182data after bootm has finished processing, but just before it jumps to 183the OS. In practical terms, U-Boot runs the 'fakegocmd' environment 184variable at this point. This variable should have a short script which 185collects the trace data and writes it somewhere. 186 187Trace data collection relies on a microsecond timer, accesed through 188timer_get_us(). So the first think you should do is make sure that 189this produces sensible results for your board. Suitable sources for 190this timer include high resolution timers, PWMs or profile timers if 191available. Most modern SOCs have a suitable timer for this. Make sure 192that you mark this timer (and anything it calls) with 193__attribute__((no_instrument_function)) so that the trace library can 194use it without causing an infinite loop. 195 196 197Commands 198-------- 199 200The trace command has variable sub-commands: 201 202stats 203 Display tracing statistics 204 205pause 206 Pause tracing 207 208resume 209 Resume tracing 210 211funclist [<addr> <size>] 212 Dump a list of functions into the buffer 213 214calls [<addr> <size>] 215 Dump function call trace into buffer 216 217If the address and size are not given, these are obtained from environment 218variables (see below). In any case the environment variables are updated 219after the command runs. 220 221 222Environment Variables 223--------------------- 224 225The following are used: 226 227profbase 228 Base address of trace output buffer 229 230profoffset 231 Offset of first unwritten byte in trace output buffer 232 233profsize 234 Size of trace output buffer 235 236All of these are set by the 'trace calls' command. 237 238These variables keep track of the amount of data written to the trace 239output buffer by the 'trace' command. The trace commands which write data 240to the output buffer can use these to specify the buffer to write to, and 241update profoffset each time. This allows successive commands to append data 242to the same buffer, for example:: 243 244 => trace funclist 10000 e00000 245 => trace calls 246 247(the latter command appends more data to the buffer). 248 249 250fakegocmd 251 Specifies commands to run just before booting the OS. This 252 is a useful time to write the trace data to the host for 253 processing. 254 255 256Writing Out Trace Data 257---------------------- 258 259Once the trace data is in an output buffer in memory there are various ways 260to transmit it to the host. Notably you can use tftput to send the data 261over a network link:: 262 263 fakegocmd=trace pause; usb start; set autoload n; bootp; 264 trace calls 10000000 1000000; 265 tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls 266 267This starts up USB (to talk to an attached USB Ethernet dongle), writes 268a trace log to address 10000000 and sends it to a host machine using 269TFTP. After this, U-Boot will boot the OS normally, albeit a little 270later. 271 272 273Converting Trace Output Data 274---------------------------- 275 276The trace output data is kept in a binary format which is not documented 277here. To convert it into something useful, you can use proftool. 278 279This tool must be given the U-Boot map file and the trace data received 280from running that U-Boot. It produces a text output file. 281 282Options 283 284-m <map_file> 285 Specify U-Boot map file 286 287-p <trace_file> 288 Specifiy profile/trace file 289 290Commands: 291 292dump-ftrace 293 Write a text dump of the file in Linux ftrace format to stdout 294 295 296Viewing the Trace Data 297---------------------- 298 299You can use pytimechart for this (sudo apt-get pytimechart might work on 300your Debian-style machine, and use your favourite search engine to obtain 301documentation). It expects the file to have a .txt extension. The program 302has terse user interface but is very convenient for viewing U-Boot 303profile information. 304 305 306Workflow Suggestions 307-------------------- 308 309The following suggestions may be helpful if you are trying to reduce boot 310time: 311 3121. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get 313 you are helpful overall snapshot of the boot time. 314 3152. Build U-Boot with tracing and run it. Note the difference in boot time 316 (it is common for tracing to add 10% to the time) 317 3183. Collect the trace information as descibed above. Use this to find where 319 all the time is being spent. 320 3214. Take a look at that code and see if you can optimise it. Perhaps it is 322 possible to speed up the initialisation of a device, or remove an unused 323 feature. 324 3255. Rebuild, run and collect again. Compare your results. 326 3276. Keep going until you run out of steam, or your boot is fast enough. 328 329 330Configuring Trace 331----------------- 332 333There are a few parameters in the code that you may want to consider. 334There is a function call depth limit (set to 15 by default). When the 335stack depth goes above this then no tracing information is recorded. 336The maximum depth reached is recorded and displayed by the 'trace stats' 337command. 338 339 340Future Work 341----------- 342 343Tracing could be a little tidier in some areas, for example providing 344run-time configuration options for trace. 345 346Some other features that might be useful: 347 348- Trace filter to select which functions are recorded 349- Sample-based profiling using a timer interrupt 350- Better control over trace depth 351- Compression of trace information 352 353 354Simon Glass <sjg@chromium.org> 355April 2013 356