1.. SPDX-License-Identifier: GPL-2.0+
2.. Copyright (c) 2013 The Chromium OS Authors.
3
4Tracing in U-Boot
5=================
6
7U-Boot supports a simple tracing feature which allows a record of excecution
8to be collected and sent to a host machine for analysis. At present the
9main use for this is to profile boot time.
10
11
12Overview
13--------
14
15The trace feature uses GCC's instrument-functions feature to trace all
16function entry/exit points. These are then recorded in a memory buffer.
17The memory buffer can be saved to the host over a network link using
18tftpput or by writing to an attached memory device such as MMC.
19
20On the host, the file is first converted with a tool called 'proftool',
21which extracts useful information from it. The resulting trace output
22resembles that emitted by Linux's ftrace feature, so can be visually
23displayed by pytimechart.
24
25
26Quick-start using Sandbox
27-------------------------
28
29Sandbox is a build of U-Boot that can run under Linux so it is a convenient
30way of trying out tracing before you use it on your actual board. To do
31this, follow these steps:
32
33Add the following to include/configs/sandbox.h (if not already there)
34
35.. code-block:: c
36
37    #define CONFIG_TRACE
38    #define CONFIG_CMD_TRACE
39    #define CONFIG_TRACE_BUFFER_SIZE    (16 << 20)
40    #define CONFIG_TRACE_EARLY_SIZE     (8 << 20)
41    #define CONFIG_TRACE_EARLY
42    #define CONFIG_TRACE_EARLY_ADDR     0x00100000
43
44Build sandbox U-Boot with tracing enabled:
45
46.. code-block:: console
47
48    $ make FTRACE=1 O=sandbox sandbox_config
49    $ make FTRACE=1 O=sandbox
50
51Run sandbox, wait for a bit of trace information to appear, and then capture
52a trace:
53
54.. code-block:: console
55
56    $ ./sandbox/u-boot
57
58    U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24)
59
60    DRAM:  128 MiB
61    trace: enabled
62    Using default environment
63
64    In:    serial
65    Out:   serial
66    Err:   serial
67    =>trace stats
68        671,406 function sites
69         69,712 function calls
70              0 untracked function calls
71         73,373 traced function calls
72             16 maximum observed call depth
73             15 call depth limit
74         66,491 calls not traced due to depth
75    =>trace stats
76        671,406 function sites
77      1,279,450 function calls
78              0 untracked function calls
79        950,490 traced function calls (333217 dropped due to overflow)
80             16 maximum observed call depth
81             15 call depth limit
82          1,275,767 calls not traced due to depth
83    =>trace calls 0 e00000
84    Call list dumped to 00000000, size 0xae0a40
85    =>print
86    baudrate=115200
87    profbase=0
88    profoffset=ae0a40
89    profsize=e00000
90    stderr=serial
91    stdin=serial
92    stdout=serial
93
94    Environment size: 117/8188 bytes
95    =>host save host 0 trace 0 ${profoffset}
96    11405888 bytes written in 10 ms (1.1 GiB/s)
97    =>reset
98
99
100Then run proftool to convert the trace information to ftrace format
101
102.. code-block:: console
103
104    $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt
105
106Finally run pytimechart to display it
107
108.. code-block:: console
109
110    $ pytimechart trace.txt
111
112Using this tool you can zoom and pan across the trace, with the function
113calls on the left and little marks representing the start and end of each
114function.
115
116
117CONFIG Options
118--------------
119
120CONFIG_TRACE
121    Enables the trace feature in U-Boot.
122
123CONFIG_CMD_TRACE
124    Enables the trace command.
125
126CONFIG_TRACE_BUFFER_SIZE
127    Size of trace buffer to allocate for U-Boot. This buffer is
128    used after relocation, as a place to put function tracing
129    information. The address of the buffer is determined by
130    the relocation code.
131
132CONFIG_TRACE_EARLY
133    Define this to start tracing early, before relocation.
134
135CONFIG_TRACE_EARLY_SIZE
136    Size of 'early' trace buffer. Before U-Boot has relocated
137    it doesn't have a proper trace buffer. On many boards
138    you can define an area of memory to use for the trace
139    buffer until the 'real' trace buffer is available after
140    relocation. The contents of this buffer are then copied to
141    the real buffer.
142
143CONFIG_TRACE_EARLY_ADDR
144    Address of early trace buffer
145
146
147Building U-Boot with Tracing Enabled
148------------------------------------
149
150Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code.
151This is kept as a separate option so that it is easy to enable/disable
152instrumenting from the command line instead of having to change board
153config files.
154
155
156Collecting Trace Data
157---------------------
158
159When you run U-Boot on your board it will collect trace data up to the
160limit of the trace buffer size you have specified. Once that is exhausted
161no more data will be collected.
162
163Collecting trace data has an affect on execution time/performance. You
164will notice this particularly with trvial functions - the overhead of
165recording their execution may even exceed their normal execution time.
166In practice this doesn't matter much so long as you are aware of the
167effect. Once you have done your optimisations, turn off tracing before
168doing end-to-end timing.
169
170The best time to start tracing is right at the beginning of U-Boot. The
171best time to stop tracing is right at the end. In practice it is hard
172to achieve these ideals.
173
174This implementation enables tracing early in board_init_f(). This means
175that it captures most of the board init process, missing only the
176early architecture-specific init. However, it also misses the entire
177SPL stage if there is one.
178
179U-Boot typically ends with a 'bootm' command which loads and runs an
180OS. There is useful trace data in the execution of that bootm
181command. Therefore this implementation provides a way to collect trace
182data after bootm has finished processing, but just before it jumps to
183the OS. In practical terms, U-Boot runs the 'fakegocmd' environment
184variable at this point. This variable should have a short script which
185collects the trace data and writes it somewhere.
186
187Trace data collection relies on a microsecond timer, accesed through
188timer_get_us(). So the first think you should do is make sure that
189this produces sensible results for your board. Suitable sources for
190this timer include high resolution timers, PWMs or profile timers if
191available. Most modern SOCs have a suitable timer for this. Make sure
192that you mark this timer (and anything it calls) with
193__attribute__((no_instrument_function)) so that the trace library can
194use it without causing an infinite loop.
195
196
197Commands
198--------
199
200The trace command has variable sub-commands:
201
202stats
203    Display tracing statistics
204
205pause
206    Pause tracing
207
208resume
209    Resume tracing
210
211funclist [<addr> <size>]
212    Dump a list of functions into the buffer
213
214calls  [<addr> <size>]
215    Dump function call trace into buffer
216
217If the address and size are not given, these are obtained from environment
218variables (see below). In any case the environment variables are updated
219after the command runs.
220
221
222Environment Variables
223---------------------
224
225The following are used:
226
227profbase
228    Base address of trace output buffer
229
230profoffset
231    Offset of first unwritten byte in trace output buffer
232
233profsize
234    Size of trace output buffer
235
236All of these are set by the 'trace calls' command.
237
238These variables keep track of the amount of data written to the trace
239output buffer by the 'trace' command. The trace commands which write data
240to the output buffer can use these to specify the buffer to write to, and
241update profoffset each time. This allows successive commands to append data
242to the same buffer, for example::
243
244    => trace funclist 10000 e00000
245    => trace calls
246
247(the latter command appends more data to the buffer).
248
249
250fakegocmd
251    Specifies commands to run just before booting the OS. This
252    is a useful time to write the trace data to the host for
253    processing.
254
255
256Writing Out Trace Data
257----------------------
258
259Once the trace data is in an output buffer in memory there are various ways
260to transmit it to the host. Notably you can use tftput to send the data
261over a network link::
262
263    fakegocmd=trace pause; usb start; set autoload n; bootp;
264    trace calls 10000000 1000000;
265    tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls
266
267This starts up USB (to talk to an attached USB Ethernet dongle), writes
268a trace log to address 10000000 and sends it to a host machine using
269TFTP. After this, U-Boot will boot the OS normally, albeit a little
270later.
271
272
273Converting Trace Output Data
274----------------------------
275
276The trace output data is kept in a binary format which is not documented
277here. To convert it into something useful, you can use proftool.
278
279This tool must be given the U-Boot map file and the trace data received
280from running that U-Boot. It produces a text output file.
281
282Options
283
284-m <map_file>
285    Specify U-Boot map file
286
287-p <trace_file>
288    Specifiy profile/trace file
289
290Commands:
291
292dump-ftrace
293    Write a text dump of the file in Linux ftrace format to stdout
294
295
296Viewing the Trace Data
297----------------------
298
299You can use pytimechart for this (sudo apt-get pytimechart might work on
300your Debian-style machine, and use your favourite search engine to obtain
301documentation). It expects the file to have a .txt extension. The program
302has terse user interface but is very convenient for viewing U-Boot
303profile information.
304
305
306Workflow Suggestions
307--------------------
308
309The following suggestions may be helpful if you are trying to reduce boot
310time:
311
3121. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get
313   you are helpful overall snapshot of the boot time.
314
3152. Build U-Boot with tracing and run it. Note the difference in boot time
316   (it is common for tracing to add 10% to the time)
317
3183. Collect the trace information as descibed above. Use this to find where
319   all the time is being spent.
320
3214. Take a look at that code and see if you can optimise it. Perhaps it is
322   possible to speed up the initialisation of a device, or remove an unused
323   feature.
324
3255. Rebuild, run and collect again. Compare your results.
326
3276. Keep going until you run out of steam, or your boot is fast enough.
328
329
330Configuring Trace
331-----------------
332
333There are a few parameters in the code that you may want to consider.
334There is a function call depth limit (set to 15 by default). When the
335stack depth goes above this then no tracing information is recorded.
336The maximum depth reached is recorded and displayed by the 'trace stats'
337command.
338
339
340Future Work
341-----------
342
343Tracing could be a little tidier in some areas, for example providing
344run-time configuration options for trace.
345
346Some other features that might be useful:
347
348- Trace filter to select which functions are recorded
349- Sample-based profiling using a timer interrupt
350- Better control over trace depth
351- Compression of trace information
352
353
354Simon Glass <sjg@chromium.org>
355April 2013
356