Discussion:
linker debugger interface
Ben Woodard
2011-06-13 21:49:01 UTC
Ben,
We appreciate you guys looking into this. As Dong said, this is an
important debugger issue for us. I have a concern and a question
about the fix.
I think this fixes half of our problem. As per the bug report we're
seeing too many library-related debug events, most of which are
redundant. There were enough of these that one application took ~40
minutes to start under a debugger (down to ~12 minutes after the IBM
patch).
1) Each library load triggered a 'before' and 'after' library load
event, when most debuggers just cared about the 'after'.
2) A call to dlopen would trigger a library load event, even when no
new library was loaded.
The proposal seems to do a good job at fixing 1, but point 2 is also a
big issue. Point 2 is motivated by a case where a specific
application is pre-linked with ~1000 shared libraries. It tries to
dlopen these libraries at runtime (I believe to get their dlopen
handle). These dlopen calls don't load an actual library, but they do
trigger a debugger event that leads to a re-parse of the library list.
The IBM patch fixed this by triggering the debugger breakpoint only
after a library was actually loaded. Would we be able to get the hook
in a similar location?
Let's see if we can add an additional probe point there so that we can
take care of the case where an application dlopen's an already open library.
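To make point 2 concrete, here is a minimal sketch (not the IBM patch and not the proposed glibc change) of how one can observe that a dlopen of an already-loaded library adds nothing to the process. It counts loaded objects with dl_iterate_phdr before and after the call. The helper names are made up for illustration, and the usage note assumes a glibc system where libc is loaded under the soname "libc.so.6".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <link.h>
#include <stddef.h>

/* dl_iterate_phdr callback: bump the counter once per loaded object. */
static int count_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)info;
    (void)size;
    assert(data != NULL);
    ++*(int *)data;
    return 0;                       /* 0 means "keep iterating" */
}

/* Number of objects the dynamic linker currently has mapped. */
static int loaded_object_count(void)
{
    int n = 0;
    dl_iterate_phdr(count_cb, &n);
    return n;
}

/* Hypothetical helper: how many objects did dlopen(name) add?
   Returns 0 when the library was already loaded, -1 if dlopen failed.
   The handle is kept open on purpose, as the application in the
   thread does when it collects handles. */
static int objects_added_by_dlopen(const char *name)
{
    int before = loaded_object_count();
    void *handle = dlopen(name, RTLD_NOW);

    if (handle == NULL)
        return -1;
    return loaded_object_count() - before;
}
```

Calling objects_added_by_dlopen("libc.so.6") from a dynamically linked program should report 0 new objects, which is the case where a debugger event only forces a pointless re-parse of the library list. On older glibc versions the program needs to be linked with -ldl.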
As for the question, I'm curious about the implications of using
SystemTap to implement the debugger hooks. Would we be able to
override the systemtap hooks from the application and point them at
our own probes?
I'm not sure that I understand your question. What are you trying to do?

If you install Fedora 15 and install the systemtap-sdt-devel package and
look at /usr/include/sys/sdt.h you can see the assembly, C, and C++ macros
that implement the user space probe points for systemtap.

To see how they are made available inside of gdb, make sure that you
install the gdb-doc package, then run "info gdb" and, within the info
document, type "/5.1.9" to go to the section directly or search for
"/static probe points".

I kind of think what you are asking is how to define your own probe
points within a library but I'm not sure.
I'm asking because we had another issue where the application had poor
debugger performance when dlopen'ing many new libraries. These were
correct debug events since a library was actually loading, but there
were still too many of them. Under the IBM patch we could change the
application to disable debug when it was in its library loading phase,
then throw a single event when everything was complete. I'm wondering if
we can implement the same thing via systemtap.
Couldn't you do something like insert a SDT_PROBE immediately before and
after all the loading is done and then tell gdb or whatever tool you are
using to disable monitoring of the library loading until after the
second probe? If that is what you are doing, then it seems like you are
simply giving some semantic meaning to a particular set of probe points.
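A rough sketch of that idea, using the STAP_PROBE and STAP_PROBE1 macros from sys/sdt.h. The provider name "myapp", the probe names, and the helper functions are all invented for illustration. Nothing here makes a debugger ignore the intermediate load events by itself; the tool still has to be told to treat these two probes as "stop watching" and "re-scan now". The fallback macros only exist so the file still compiles where systemtap-sdt-devel is not installed.

```c
#include <assert.h>
#include <stddef.h>

/* Use the real SystemTap macros when available, no-ops otherwise. */
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#  endif
#endif
#ifndef STAP_PROBE
#  define STAP_PROBE(provider, name) ((void)0)
#  define STAP_PROBE1(provider, name, arg1) ((void)(arg1))
#endif

/* Loader callback: returns 0 on success, nonzero on failure. */
typedef int (*load_fn)(const char *name);

/* Stand-in for a real dlopen-based loader so the sketch is
   self-contained: "succeeds" for any non-empty name. */
static int load_stub(const char *name)
{
    return (name != NULL && name[0] != '\0') ? 0 : 1;
}

/* Load names[0..count) as one batch, bracketed by two probes.
   A tool could suppress library-list updates after batch_load_begin
   and do a single re-scan at batch_load_end. Returns the number of
   libraries that loaded successfully. */
static size_t load_batch(const char *const *names, size_t count, load_fn load)
{
    size_t loaded = 0;

    assert(load != NULL);
    STAP_PROBE(myapp, batch_load_begin);
    for (size_t i = 0; i < count; ++i) {
        if (load(names[i]) == 0)
            ++loaded;
    }
    STAP_PROBE1(myapp, batch_load_end, loaded);
    return loaded;
}
```

In a real application load_stub would be replaced by a wrapper around dlopen, and the second probe's argument gives the tool the number of libraries it should expect to find when it re-reads the list.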
Finally, the idea of getting system tap hooks into glibc is very cool.
There's a lot of information in there tool writers would love to get at.
My understanding is that they are low enough overhead that resistance to
adding some that expose useful data won't be too excessive. It seems
that the discussion regarding these is spread between
archer-9JcytcrH/bA+***@public.gmane.org and libc-alpha-9JcytcrH/bA+***@public.gmane.org

-ben
-Matt
Hi Ben,
Yes, I remember this. This is a request to apply the fast dll
interface to speed up debugging applications like KULL. I am glad
there was some progress here. I will review the proposal and get back
to you soon. Meantime, I want John Delsignore of Rogue Wave Software
and Matt Legendre of LLNL to review this proposal as well. Both
cc’ed. I can tell you, though, this is a very important problem for
LLNL’s debugging environment.
Best,
Dong
Sent: Thursday, June 09, 2011 8:59 AM
To: Dong Ahn
Cc: Matt Wolfe; Gyllenhaal, John C.; Travis Gummels
Subject: linker debugger interface
Dong,
A while back you sent me some info on how to speed up TV and gdb on
applications that have a huge number of dynamic libraries. The
Currently the dynamic linker has an interface used by gdb and
Totalview (and possibly other tools) that is used to keep track of
the current load state of the shared libraries used by the debugged
program. The dynamic linker has defined a function called
_dl_debug_state() and will call this function under various
conditions. A debugger can then set a breakpoint on this function,
stop at various points during a debugging session, and inspect the
state of the shared libraries at that point. For purposes of this
document, a debug event will refer to the set of operations that take
place during a debugging session, when the debugger sets a break on
this entry point, the breakpoint is hit, and some processing is done
at that point.
(Let me know if you need more of that document to refresh your memory.)
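For readers who have not looked at this interface, the following is a small sketch of the data a debugger reads when that breakpoint is hit, done here from inside the process itself. It assumes an ELF system with glibc's link.h and a dynamically linked executable, so that the DT_DEBUG entry in the dynamic section points at the linker's struct r_debug. The function names are made up for the example.

```c
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* The executable's dynamic section, provided by the static linker. */
extern ElfW(Dyn) _DYNAMIC[];

/* Locate the rendezvous structure the same way a debugger does:
   through the DT_DEBUG entry, which ld.so fills in at startup.
   Returns NULL if there is no such entry. */
static struct r_debug *find_r_debug(void)
{
    for (ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; ++d) {
        if (d->d_tag == DT_DEBUG)
            return (struct r_debug *)(uintptr_t)d->d_un.d_ptr;
    }
    return NULL;
}

/* Walk the link_map list, i.e. the "re-parse of the library list"
   that every debug event triggers. Cost grows with library count. */
static int count_link_map(const struct r_debug *dbg)
{
    int n = 0;

    assert(dbg != NULL);
    for (const struct link_map *m = dbg->r_map; m != NULL; m = m->l_next)
        ++n;
    return n;
}
```

The r_brk field of the same structure holds the address the debugger sets its breakpoint on, and r_state is RT_ADD or RT_DELETE while the list is being changed and RT_CONSISTENT once it is stable again, which is where the paired 'before' and 'after' events come from.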
The tools and libc developers have been working on the problem from a
different direction.
http://sourceware.org/ml/archer/2011-q2/msg00000.html
http://sourceware.org/ml/archer/2011-q2/msg00015.html
From my reading of it, it appears to solve your problem but in a
different way. I wanted to verify that you concur and make sure that
the Rogue Wave/TV people also are OK with the proposed solution (Matt
can you take care of that?). If not, then I feel now is the time when
I need to pass that back to Gary Benson to make sure that he
understands the problems inherent in his approach.
-ben