As of version 2011.1, there are three tracing mechanisms available in LITMUSRT:
litmus_log
: This trace contains text messages (created with the TRACE()
macro, see litmus.h
) that convey information useful for debugging. There is one global
litmus_log
buffer for the whole system. The litmus_log
facility is essentially a replacement for printk()
, which cannot be invoked from scheduling code without risking deadlock. Debug tracing must be enabled at compile time. Note that debug tracing creates significant overhead because string formatting takes place.
ft_trace
: This trace contains binary-encoded time stamps. It is used for overhead tracing. There is one global ft_trace
buffer for the whole system. The "ft" stands for Feather-Trace. Feather-Trace is designed to create only negligible overhead when event sources are disabled, and to incur only low overhead when recording time stamps.
sched_trace
: This trace contains binary-encoded scheduling event information, e.g., an event can be recorded whenever a task got scheduled, a job was released, a job completed, etc. There is one sched_trace
buffer per processor. sched_trace
is based on Feather-Trace and hence also incurs only negligible overhead when event sources are disabled.
litmus_log
functionality is provided through the kernel's misc
driver interface. The other two traces are exported to user space through standard character device drivers.
On systems with default udev rules, the devices are created in the /dev/litmus/
directory:
litmus_log
trace is exported as /dev/litmus/log
.ft_trace
trace is exported as /dev/litmus/ft_trace0
.sched_trace
traces are exported as /dev/litmus/sched_trace0
, /dev/litmus/sched_trace1
, etc. The litmus_log
buffer can be read by simply opening the file and reading its contents:
cat /dev/litmus/log > my_debug_log
Kill the cat
process to stop recording debug messages.
No post-processing is required since the debug messages are plain text. However, note that messages may appear in an order that differs from the sequence of events at runtime. If order is important (for example when debugging race conditions), then recorded messages can be sorted offline with the help of the sequence number at the start of each recorded message.
Example:
sort -n my_debug_log > my_sorted_debug_log
Hint: One can use netcat(1)
to send the debug messages to another host to avoid filesystem activity.
Feather-Trace allows for much more fine-grained tracing than the simple debug stream realized by litmus_log
and hence requires special-purpose tools to be used. These tools are available as part of the ft_tools
package, which is available on the LITMUSRT download page.
Feather-Trace events can be enabled on a per-event basis. Each event is identified by a unique 32-bit ID. Initially, when the device buffer is not being accessed by any user space programs (i.e., when the device driver is unused), all events are disabled. Events can be enabled (and subsequently disabled again) by writing binary commands to the buffer device file. Once events are enabled they can generate trace records, which are stored in the trace buffer. User space programs can obtain these records by reading from the device file. Reading a trace record removes it from the buffer, i.e., each record can be read exactly once. Records should be consumed shortly after they were created since buffer capacity is limited.
The tool ftcat
, which is part of the ft_tools
package (see above), automates the process of enabling events and retrieving trace records. It takes the name of the ft_trace
device file and the events of interest as arguments:
ftcat <ft device> <event 1> <event 2> <event 3> ...
The tool enables the specified events and copies all recorded events to stdout
.
Events can be specified either by their ID (see include/litmus/trace.h
in the kernel directory for a list of time stamp generating events) or by their symbolic name. The following symbolic names are recognized:
SCHED_START
, SCHED_END
: CXS_START
, CXS_END
: SCHED2_START
, SCHED2_END
: SCHED_START
, SCHED_END
].TICK_START
, TICK_END
: PLUGIN_TICK_START
, PLUGIN_TICK_END
: TICK_START
, TICK_END
], but only measures the time spent by the active scheduling plugin.PLUGIN_SCHED_START
, PLUGIN_SCHED_END
: SCHED_START
, SCHED_END
], but only measures the time spent by the active scheduling plugin. There is no equivalent SCHED2
counterpart because the scheduling plugins do not directly contribute to the SCHED2
overhead.RELEASE_START
, RELEASE_END
: For example, the following command can be used to store the length of context switches and scheduling decisions in the file my_trace
.
$ ftcat ft_trace CXS_START CXS_END SCHED_START SCHED_END SCHED2_START SCHED2_END > my_trace
The tracing can be stopped by interrupting ftcat
with ^C
or by sending SIGTERM
with kill(1)
.
Note that the recorded trace is stored in the byte order of the host.
The binary event stream recorded by ftcat
is of course of limited direct use—the data has yet to be exported for analysis. This can be achieved with the tool ft2csv
, which is also part of the ft_tools
package.
As the name suggests, ft2csv
extracts intervals defined by pairs of time stamps in a recorded trace and displays them as comma-separated values (CSV). It takes the name of an overhead trace and one start event as arguments:
ft2csv <start event> <overhead trace>
Events are specified in the same way as with ftcat
. For example, the following command can be used to print the context-switch lengths that were recorded in the example above to stdout
.
$ ft2csv CXS_START my_trace
2397634444579592, 2397634444583328, 3736
2397634477323130, 2397634477326686, 3556
2397634477346366, 2397634477348986, 2620
2397634611910924, 2397634611914348, 3424
...
For each event, the start time (in clock cycles) is given in the first column, the end time is given in the second column, and the length is given in the third column (again, in cycles).
ft2csv
accepts a few options that affect how events are filtered. By default, events that do not involve real-time tasks are ignored. This can be changed by specifying the -b
option. If one happens to be processing output on a little-endian host that was produced on a big-endian host then the -e
option can come in handy.
Once the recorded overheads have been exported to CSV files they can be easily analyzed with tools such as Python's csv
module, Octave, or Matlab.
Scheduling traces are also recorded using Feather-Trace. However, a different binary format is used. The definition and an explanation of the format can be found in the file include/litmus/sched_trace.h
(in the kernel directory).
Since sched_trace
uses per-processor buffers several ftcat
instances have to be launched. The ft_tools
package includes a wrapper script called st_trace
that automates this setup procedure. Note that one may have to modify st_trace
to change the default location of the sched_trace
device files and the ftcat
binary. st_trace
accepts a tag as the only argument. The tag is used to assign a unique name to the trace files.
The following example illustrates how st_trace
is used with the tag global-exp
.
$ st_trace global-exp
CPU 0: 8204 > st-global-exp-0.bin [0]
CPU 1: 8205 > st-global-exp-1.bin [0]
CPU 2: 8206 > st-global-exp-2.bin [0]
CPU 3: 8207 > st-global-exp-3.bin [0]
Press Enter to end tracing... # enter pressed
Disabling 9 events.
Disabling 9 events.
Disabling 9 events.
Disabling 9 events.
/home/litmus/log0: 1234 bytes read.
/home/litmus/log1: 1234 bytes read.
/home/litmus/log2: 1234 bytes read.
/home/litmus/log3: 1234 bytes read.
Note that st_trace
may have to be modified to change the default sched_trace
device locations and the ftcat
binary.
Hint: The dummy real-time task rtspin
(distributed as part of the liblitmus
package) may be useful when studying/testing the behavior of a scheduler plugin.
Recorded scheduling traces can be analyzed with Unit-Trace.
For any questions please contact the current LITMUSRT maintainer (as indicated on the LITMUSRT homepage).