Expression Parser Plug-In Available
Keith Seitz
2011-03-10 20:46:20 UTC
Hi, all,

In October, we discussed writing a compiler plug-in to play/experiment
with the possibility of re-using the compiler's parsers for gdb.

Well, at long last, I have checked-in an initial version of the plug-in
to do this. I'm sure someone more familiar with the compiler will find
a bunch of problems and/or refinements to be made, but this should be a
start.

You can grab the sources from gcc's git repository. The plug-in is on
the archer-expr-plugin branch. The README (archer-expr-plugin/README)
explains how to build and play with this thing.

Right now, the plug-in is attempting to suppress errors and warnings.
I've attempted to install a minimal diagnostic pretty-printer which does
nothing, but I haven't quite got it right yet; it will/might crash if
the expression produces a fatal error. Not nice, but good enough for
now. I'll try to fix this as time permits and demand dictates.

The plug-in uses debug_tree to output the result of the parse. I'm sure
this is too verbose to be useful, or at least optimal, but again,
probably good enough for playing around.

I should mention: the plug-in could probably be optimized a bit. I had
to resort to some expensive location expansion and strcmp'ing filename
basenames... In the little playing around that I've done, this hasn't
really been as big an issue as I would have thought, though, but I
thought it worth mentioning.
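
As a rough illustration of the kind of matching described above (this
is not the plug-in's actual code, which is C running inside gcc), here
is a minimal Python sketch. It assumes a location has already been
expanded into a (filename, line) pair; the names same_basename and
reached_target are hypothetical.

```python
import os.path


def same_basename(path_a: str, path_b: str) -> bool:
    """Compare two paths by their final component only.

    This mirrors the idea of strcmp'ing basenames: the directory
    part of each path is ignored entirely.
    """
    return os.path.basename(path_a) == os.path.basename(path_b)


def reached_target(loc_file: str, loc_line: int,
                   target_file: str, target_line: int) -> bool:
    """Hypothetical check: is this expanded location in the requested
    file, at or past the requested line?"""
    return same_basename(loc_file, target_file) and loc_line >= target_line
```

With this sketch, "../../src/gdb/breakpoint.c" and "breakpoint.c" compare
equal. That is also the weakness of comparing basenames only: two
different files that share a basename would match as well.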

Here are the two tests I've been using. One is parsing an expression
(just a local variable) in a function near the end of breakpoint.c.
The other is something similar in a firefox source file:

Parse "command" at breakpoint.c:11990:

$ sh ~/compile-breakpoint.o
<var_decl 0x7fbb4d008aa0 command
type <pointer_type 0x7fbb4da115e8
type <record_type 0x7fbb4da0d690 cmd_list_element VOID
align 8 symtab 0 alias set -1 canonical type 0x7fbb4da0d690
pointer_to_this <pointer_type 0x7fbb4da115e8> chain
<type_decl 0x7fbb4da057e8 D.10095>>
unsigned DI
size <integer_cst 0x7fbb4dfb67a8 constant 64>
unit size <integer_cst 0x7fbb4dfb67d0 constant 8>
align 64 symtab 0 alias set -1 canonical type 0x7fbb4da115e8
pointer_to_this <pointer_type 0x7fbb4d824348>>
used unsigned DI file ../../src/gdb/breakpoint.c line 11987 col 28
size <integer_cst 0x7fbb4dfb67a8 64> unit size <integer_cst
0x7fbb4dfb67d0 8>
align 64 context <function_decl 0x7fbb4d294c00 add_catch_command>>
time in evaluating in-line expression: 0.308954 (99%)

real 0m0.329s
user 0m0.306s
sys 0m0.019s

The phrase "in-line expression" means that the expression was parsed at
the location given by the input. [If parsing fails here, the plug-in
will continue until the end of the file and try again. In that case, it
says "at exit".]
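
The two-attempt behaviour described in the bracketed note can be
sketched roughly as follows. This is an illustration in Python under
stated assumptions, not the plug-in's code: ParseError,
parse_with_fallback and make_scope_parser are hypothetical names, and
the "parser" here only checks whether a bare identifier is visible.

```python
class ParseError(Exception):
    """Raised by the stand-in parsers when an expression cannot be parsed."""


def parse_with_fallback(expression, parse_inline, parse_at_exit):
    """Try the in-line parse first; if it fails, retry at end of file.

    Returns a (label, result) pair, where label names the attempt
    that succeeded.
    """
    try:
        return "in-line expression", parse_inline(expression)
    except ParseError:
        return "at exit", parse_at_exit(expression)


def make_scope_parser(visible_names):
    """Hypothetical stand-in for the compiler's parser: it 'parses' a
    bare identifier by checking whether the name is visible."""
    def parse(expression):
        if expression not in visible_names:
            raise ParseError(expression)
        return "<decl " + expression + ">"
    return parse
```

If the second attempt also fails, the error simply propagates to the
caller in this sketch; the original message does not say what the
plug-in does in that case.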


Parse "catMan" in mozilla/content/base/src/nsContentUtils.cpp:6437:

$ sh ~/compile-firefox
<var_decl 0x7fddb1dd3280 catMan
type <record_type 0x7fddb3084150 nsCOMPtr addressable asm_written
needs-constructing type_1 type_4 type_5 type_6 BLK
size <integer_cst 0x7fddbb7827a8 constant 64>
unit size <integer_cst 0x7fddbb7827d0 constant 8>
align 64 symtab -1310826496 alias set -1 canonical type
0x7fddb3084150
fields <field_decl 0x7fddb1dd2720 mRawPtr type <pointer_type
0x7fddb4525e70>
private unsigned nonlocal decl_3 DI file
../../../dist/include/nsCOMPtr.h line 522 col 10 size <integer_cst
0x7fddbb7827a8 64> unit size <integer_cst 0x7fddbb7827d0 8>
align 64 offset_align 128
offset <integer_cst 0x7fddbb782410 constant 0>
bit offset <integer_cst 0x7fddbb782f28 constant 0> context
<record_type 0x7fddb3084150 nsCOMPtr> chain <type_decl 0x7fddb1dcba10
nsCOMPtr>> context <translation_unit_decl 0x7fddbb78e958 D.1>
full-name "class nsCOMPtr<nsICategoryManager>"
needs-constructor needs-destructor X() has-type-conversion
X(constX&) this=(X&) n_parents=0 use_template=1 interface-unknown
pointer_to_this <pointer_type 0x7fddb30842a0> reference_to_this
<reference_type 0x7fddb3088d20> chain <type_decl 0x7fddb1dcb958 nsCOMPtr>>
addressable used tree_1 tree_3 decl_5 BLK file
../../../../mozilla-central/content/base/src/nsContentUtils.cpp line
6433 col 32 size <integer_cst 0x7fddbb7827a8 64> unit size <integer_cst
0x7fddbb7827d0 8>
align 64 context <function_decl 0x7fddb6981200
FindInternalContentViewer>>
time in evaluating in-line expression: 5.395179 (100%)

real 0m12.356s
user 0m5.123s
sys 0m0.297s

As you can see, there is a substantial difference between the time
reported by the shell (12.356s real) and the time reported by the
plug-in (5.395s) in this test case. I haven't investigated this a
whole lot, but according to -ftime-report, the majority of this time is
spent in "parsing", "garbage collection", and "name lookup".

The plug-in measures time from the end of plug-in initialization to the
outputting of the parse tree. I'll investigate further.

Comments/advice/suggestions -- please send them along!

Keith
Keith Seitz
2011-03-10 22:51:32 UTC
Post by Keith Seitz
> The plug-in measures time from the end of plug-in initialization to the
> outputting of the parse tree. I'll investigate further.

Turns out I was simply abusing libiberty. I've committed a series of
fixes to clean this up. It will now properly report an approximate
elapsed time.

Keith
Tom Tromey
2011-03-14 18:29:17 UTC
Keith> In October, we discussed writing a compiler plug-in to play/experiment
Keith> with the possibility of re-using the compiler's parsers for gdb.

Keith> Well, at long last, I have checked-in an initial version of the
Keith> plug-in to do this.

Very cool.

Keith> I should mention: the plug-in could probably be optimized a bit. I
Keith> had to resort to some expensive location expansion and strcmp'ing
Keith> filename basenames... In the little playing around that I've done,
Keith> this hasn't really been as big an issue as I would have thought,
Keith> though, but I thought it worth mentioning.

I wouldn't worry about this. Parsing all the C++ is going to be the
major cost. And if by some miracle the basename stuff shows up, we can
fix it later.

Keith> The phrase "in-line expression" means that the expression was parsed
Keith> at the location given by the input. [If parsing fails here, the
Keith> plug-in will continue until the end of the file and try again. In
Keith> that case, it says "at exit".]

What is that for?

Keith> real 0m12.356s
Keith> user 0m5.123s

Yikes.

Keith> Comments/advice/suggestions -- please send them along!

A couple things...

First, I think we should not worry about the C compiler. GDB's C parser
is not that bad, and anyway is more maintainable than the C++ stuff,
just because C is so much simpler. I think it is fine to hack on the C
plugin if it helps you in some way, but if it gets in the way at all,
just ditch it.


Second, I realized recently that the current approach will not work at
all with convenience functions. This is because convenience functions
are untyped.

I think this could be made to work via some evil tricks, but it seems
complicated and hard to make efficient. (The trick is to do the base
parsing in GDB, using g++ only for name resolution, and do that at
expression-evaluation time. That way you could invoke the convenience
function before name resolution. But, this would at least need
memoization to be efficient and would also defer some syntax errors
until the wrong time...)
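
To make the memoization point concrete, here is a minimal Python
sketch of caching name-resolution results. It is only an illustration
under assumptions: MemoizedResolver and its resolve callback (standing
in for "ask g++ to resolve this name") are hypothetical, and nothing
here addresses when cached results would need to be invalidated.

```python
class MemoizedResolver:
    """Cache name-resolution results keyed by (name, file, line)."""

    def __init__(self, resolve):
        self._resolve = resolve      # hypothetical: asks the compiler
        self._cache = {}
        self.backend_calls = 0       # how often the slow path ran

    def lookup(self, name, filename, line):
        key = (name, filename, line)
        if key not in self._cache:
            # Slow path: only taken the first time a key is seen.
            self.backend_calls += 1
            self._cache[key] = self._resolve(name, filename, line)
        return self._cache[key]
```

Repeated lookups of the same name at the same location then hit the
cache, so the expensive call happens once per distinct key rather than
once per evaluation.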

Tom
