-*- mode: text -*-
                      GDB Tracepoints for Linux


This directory contains the source for a kernel module and patches for
GDB which, when used together, allow GDB to debug the kernel it's
running under using tracepoints.  Jim Blandy
<jimb@codesourcery.com> first presented the Linux tracepoints work at
FOSDEM 2007.  GDB's tracepoint facilities for embedded systems were
originally developed in 1997 by Michael Snyder and Jim Blandy.

At the moment, the kernel module supports only the IA-32 architecture.
Patches for more architectures are welcome; see "Porting Tracepoints",
below.  The GDB patches should be architecture-independent.  The code
was developed against Linux 2.6.19.1 and the current GDB sources; the
INSTALL file has details.

See INSTALL for instructions on building and installing the code.  See
TODO for suggestions on how to make kernel tracepoints more useful.
See "Using Tracepoints", below, for a walkthrough of the commands.

See http://www.red-bean.com/trac/tracepoints for the latest
information.  That page also has the slides for the FOSDEM talk, and a
link to the mailing list.


Release History

0.1: First release.
0.1.1: Fix gdb.patch to apply to public GDB.
0.1.2: Document prerequisites.  Backpointer to web site.  Mention
       mailing list. 


Using Tracepoints

The "Tracepoints" chapter in the GDB manual explains how tracepoints
work in general.  (You can probably pop it up from the shell with
"info '(gdb)Tracepoints'".)  The manual says tracepoints are currently
only available for remote targets; that's not true any more. :)

You'll want to install debugging information for the kernel, along
with the corresponding sources, for GDB to use and display.  On
Fedora, you'll need to install the kernel-debuginfo and
kernel-debuginfo-common packages.
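
Before going further, it may help to confirm that the debugging
information is where GDB will look for it.  The following is only a
sketch: the path is the Fedora location used in the GDB invocation
later in this file, and the function name find_vmlinux is made up for
illustration; other distributions may install vmlinux elsewhere.

```shell
# Hypothetical helper: print the expected vmlinux path for a kernel
# release (default: the running kernel) and succeed only if that file
# is readable.  The layout is the one Fedora's kernel-debuginfo
# package uses; treat it as an assumption on other distributions.
find_vmlinux() {
    release=${1:-$(uname -r)}
    root=${2:-/usr/lib/debug}
    vmlinux="$root/lib/modules/$release/vmlinux"
    echo "$vmlinux"
    [ -r "$vmlinux" ]
}
```

For example, 'find_vmlinux || echo "debuginfo missing?"' prints the
path and complains if nothing readable is installed there.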

To use tracepoints to debug Linux, you first need to insert the
tracepoints module into the kernel:

    $ su
    Password: 
    # insmod gtp.ko
    # 
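
Once insmod returns, the module's /proc/gdb-tracepoints file (the one
whose permissions are discussed further down) should exist.  A small
sketch of a check follows; the function name gtp_loaded is
hypothetical, and the assumption that the file appears as soon as the
module's init function has run is inferred from this README rather
than from the module source.

```shell
# Hypothetical check that the gtp module is loaded.  Assumes the
# module creates /proc/gdb-tracepoints in its init function.
gtp_loaded() {
    ctl=${1:-/proc/gdb-tracepoints}
    if [ -e "$ctl" ]; then
        return 0
    fi
    echo "$ctl not found; has 'insmod gtp.ko' been run?" >&2
    return 1
}
```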

Then, run GDB as root, select the kernel executable with debugging
information, and then choose the linux-trace target:

    # gdb /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
    GNU gdb 6.6.50.20070223-cvs
    Copyright (C) 2007 Free Software Foundation, Inc.
    ...
    (gdb) target linux-trace
    Debugging with Linux kernel tracepoints.
    (gdb) 

(If you're unconcerned about security, you can always change the
permission bits in the 'gtp' module's init function, in procfs.c, to
allow user and group read and write permission on
/proc/gdb-tracepoints; then running GDB as root is unnecessary.)

For example, to watch calls to vfs_read and vfs_write:

    (gdb) trace vfs_read
    Tracepoint 1 at 0xc0475e7b: file fs/read_write.c, line 256.
    (gdb) actions
    Enter actions for tracepoint 1, one per line.
    End with a line saying just "end".
    > collect $args
    > end
    (gdb) trace vfs_write
    Tracepoint 2 at 0xc0475d22: file fs/read_write.c, line 314.
    (gdb) actions
    Enter actions for tracepoint 2, one per line.
    End with a line saying just "end".
    > collect $args
    > end
    (gdb)

This creates the two tracepoints, and specifies what should be
collected when each one is hit.

    (gdb) tstart

This enables tracing.  Since vfs_read and vfs_write are used very
often (for example, to send X Window System packets), the hit log will
fill up quickly, giving us some hits to look at.

    (gdb) tstop
    (gdb) tfind start
    #0  vfs_read (file=0xf3416500, buf=0x8346020 <Address 0x8346020 out of bounds>, count=1024, 
        pos=0xf1ca0fa4) at fs/read_write.c:256
    256     {
    (gdb)

The "256     {" is the source line at which the tracepoint was hit.

The "Address X out of bounds" message is due to the fact that the
'buf' argument to vfs_read is a __user pointer: it refers to data in
the address space of the process making the system call, not in that
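of the kernel.  The __user annotation is not visible to GDB: in an
ordinary kernel build it is a macro that expands to nothing, so it
never reaches the debugging information, and GDB can't know that it
should treat the pointer specially.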

If you try to access memory that the tracepoint's actions didn't
collect, this is what you see:

    (gdb) print *file
    Cannot access memory at address 0xf3416500
    (gdb) 

Without arguments, the 'tfind' command advances to the next tracepoint
hit; 'tfind -' returns to the previous tracepoint hit:

    (gdb) tfind
    #0  vfs_write (file=0xf3416500, buf=0x82b3316 <Address 0x82b3316 out of bounds>, count=7, 
        pos=0xf1ca0fa4) at fs/read_write.c:314
    314     {
    (gdb) tfind -
    #0  vfs_read (file=0xf3416500, buf=0x8346020 <Address 0x8346020 out of bounds>, count=1024, 
        pos=0xf1ca0fa4) at fs/read_write.c:256
    256     {
    (gdb) 

We can change the vfs_read tracepoint to collect more data:

    (gdb) actions 1
    Enter actions for tracepoint 1, one per line.
    End with a line saying just "end".
    > collect $args
    > collect *file
    > collect file->f_dentry->d_inode->i_sb->s_dev
    > end
    (gdb) tstart
    (gdb) tstop
    (gdb) tfind tracepoint 1
    #0  vfs_read (file=0xf3416500, buf=0x8346020 <Address 0x8346020 out of bounds>, count=1024, 
        pos=0xf1ca0fa4) at fs/read_write.c:256
    256     {
    (gdb) print *file
    $1 = {f_u = {fu_list = {next = 0xf6d63180, prev = 0xf7da4aa8},
          fu_rcuhead = {next = 0xf6d63180, func = 0xf7da4aa8}}, 
          f_dentry = 0xefb39238, ... }
    (gdb) print file->f_dentry->d_inode->i_sb->s_dev
    $2 = 3
    (gdb) 

Selecting a tracepoint hit makes GDB treat the registers and memory
that were collected at that hit as if they were currently available.
This means that any command that works by reading collected memory can
be used, regardless of how it was collected.  For example:

    (gdb) trace sys_symlink
    Tracepoint 1 at 0xc047e935: file fs/namei.c, line 2247.
    (gdb) actions
    Enter actions for tracepoint 1, one per line.
    End with a line saying just "end".
    > collect $args
    > collect $esp[0]@128
    > end
    (gdb)

The syntax $esp[0]@128, on an IA-32 machine, means to collect the 128
bytes pointed to by $esp.  See the GDB manual's description of the '@'
operator, (gdb)Arrays.

We do the same for sys_symlinkat and vfs_symlink:

    (gdb) trace sys_symlinkat
    Tracepoint 2 at 0xc047e882: file fs/namei.c, line 2211.
    (gdb) actions
    Enter actions for tracepoint 2, one per line.
    End with a line saying just "end".
    > collect $args
    > collect $esp[0]@128
    > end
    (gdb) trace vfs_symlink 
    Tracepoint 3 at 0xc047c1c1: file fs/namei.c, line 2189.
    (gdb) actions
    Enter actions for tracepoint 3, one per line.
    End with a line saying just "end".
    > collect $args
    > collect $esp[0]@128
    > end
    (gdb)
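
Typing the same actions for each tracepoint gets tedious.  One way
around it, sketched here, is to generate a GDB command file from the
shell and load it with GDB's 'source' command or '-x' option.  The
function name gtp_commands is made up for illustration, and it is an
assumption that the patched GDB accepts 'actions' blocks from a
command file the same way it does interactively.

```shell
# Hypothetical generator: for each kernel function named on the
# command line, emit a tracepoint that collects the arguments and
# the 128 bytes at the top of the stack (IA-32 only, because of $esp).
gtp_commands() {
    for fn in "$@"; do
        printf 'trace %s\n' "$fn"
        printf 'actions\n'
        printf '  collect $args\n'
        printf '  collect $esp[0]@128\n'
        printf 'end\n'
    done
}
```

It might be used like this:

    $ gtp_commands sys_symlink sys_symlinkat vfs_symlink > symlink.gdb
    (gdb) source symlink.gdb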

Now we start tracing, and make a symlink to get some hits:

    (gdb) tstart
    (gdb) shell ln -s moxie froxie
    (gdb) tstop
    (gdb) tfind start
    #0  sys_symlink (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2247
    2247    {
    (gdb)

Now, since we've collected the memory at the top of the stack, we can
get partial backtraces:

    (gdb) where
    #0  sys_symlink (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2247
    #1  <signal handler called>
    #2  0x00d2c402 in ?? ()
    #3  0xbfc411f8 in ?? ()
    #4  0x0000007b in ?? ()
    #5  0x00000000 in ?? ()
    (gdb) tfind
    #0  sys_symlinkat (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, newdfd=-100, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2211
    2211    {
    (gdb) where
    #0  sys_symlinkat (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, newdfd=-100, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2211
    #1  0xc047e954 in sys_symlink (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2248
    #2  <signal handler called>
    #3  0x00d2c402 in ?? ()
    #4  0xbfc411f8 in ?? ()
    #5  0x0000007b in ?? ()
    #6  0x00000000 in ?? ()
    (gdb) tfind
    #0  vfs_symlink (dir=0xf6d30124, dentry=0xefb251a8, 
        oldname=0xf2a30000 <Address 0xf2a30000 out of bounds>, mode=4095) at fs/namei.c:2189
    2189    {
    (gdb) where
    #0  vfs_symlink (dir=0xf6d30124, dentry=0xefb251a8, 
        oldname=0xf2a30000 <Address 0xf2a30000 out of bounds>, mode=4095) at fs/namei.c:2189
    #1  0xc047e8fc in sys_symlinkat (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, 
        newdfd=-100, newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2234
    #2  0xc047e954 in sys_symlink (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, newname=Cannot access memory at address 0xefba3fc0
    )
        at fs/namei.c:2248
    #3  <signal handler called>
    Cannot access memory at address 0xefba3fe4
    (gdb)

Note how, when GDB runs off the end of the memory we collected, it
stops the backtrace.  Also, it seems that GDB interprets the frame
where we entered the kernel as a signal handler trap, which is nice.

Since all the hit log entries are still there, we can go backwards,
and "uncall" vfs_symlink:

    (gdb) tfind -
    #0  sys_symlinkat (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, newdfd=-100, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2211
    2211    {
    (gdb) where
    #0  sys_symlinkat (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, newdfd=-100, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2211
    #1  0xc047e954 in sys_symlink (oldname=0xbfc429d2 <Address 0xbfc429d2 out of bounds>, 
        newname=0xbfc429d8 <Address 0xbfc429d8 out of bounds>) at fs/namei.c:2248
    #2  <signal handler called>
    #3  0x00d2c402 in ?? ()
    #4  0xbfc411f8 in ?? ()
    #5  0x0000007b in ?? ()
    #6  0x00000000 in ?? ()
    (gdb) 


Porting Tracepoints

In this source tree, the 'arch/$(ARCH)' and 'include/asm-$(ARCH)'
directories contain architecture-specific code needed by the GTP
kernel module for some architecture $(ARCH).  The architecture
dependencies are pretty limited, so it should be straightforward to
port tracepoints to new architectures.

In the top-level directory, the 'gtp-arch.h' header declares functions
that must be defined by each 'arch/$(ARCH)/gtp-arch.c' file, and
documents the macros and types that must be defined by each
'include/asm-$(ARCH)/gtp-arch.h' header.  The comments in the
top-level 'gtp-arch.h' should explain what's needed; if not, please
let us know, so we can clarify them.
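
As a starting point for a port, the two per-architecture files named
above can be stubbed out from the shell.  This is only a sketch: the
function name gtp_port_skel is hypothetical, the stubs are
deliberately empty apart from a comment, and what actually goes in
them is defined by the top-level 'gtp-arch.h', not by this example.

```shell
# Hypothetical scaffold: create stub files for a new architecture,
# using the directory layout described in this section.  Existing
# files are left untouched.
gtp_port_skel() {
    arch=$1
    top=${2:-.}
    if [ -z "$arch" ]; then
        echo "usage: gtp_port_skel ARCH [TOPDIR]" >&2
        return 1
    fi
    mkdir -p "$top/arch/$arch" "$top/include/asm-$arch" || return 1
    for f in "$top/arch/$arch/gtp-arch.c" \
             "$top/include/asm-$arch/gtp-arch.h"; do
        [ -e "$f" ] || echo "/* TODO: see top-level gtp-arch.h */" > "$f"
    done
}
```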