Discussion:
C++ draft
Tom Tromey
2011-06-27 21:08:16 UTC
I've been working on another proposal to move gdb to C++. I'd
appreciate help with it. Here is what I have so far.

Do you find it reasonably convincing? If not, why not? What can be
improved? Are there other good initial targets for conversion? Are
there lurking problems of which I am unaware?

thanks,
Tom

-*- text -*-

At the GCC Summit, I once again brought up the perennial idea of
moving GDB to be implemented in C++. There, we agreed that, as a
follow-on to that discussion, I would raise the topic among the
GDB maintainers, and in particular present my migration plan.

My goal for moving to C++ is to make GDB more robust.

My view is that GDB is already written in a poor cousin of C++.
Nearly every feature that people hate about C++ is already in use in
GDB. This list is not exhaustive, just informational:

* Subclasses. See general_symbol_info. struct value and struct type
would be improved by them.

* Virtual functions. gdbarch, languages, and values all use these.

* Overloaded functions. Anywhere you see a _1 suffix.

* Templates. Both observers and VEC are templates.

* Exceptions. Used ubiquitously.

* RAII. Cleanups, but they are dynamic and more error-prone.

In most cases, GDB's implementation of these features is inferior to
that of the C++ compiler. Its exceptions are slower. Its data
structures have less type-safety. Both cleanups and TRY_CATCH are
error-prone in practice. GDB is needlessly verbose due to using
callback-based container iteration and callback-based exception
handling.
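
To illustrate the virtual-function point above, here is a minimal sketch
that puts a hand-written table of function pointers next to the
equivalent C++ class. All names in it (c_language_ops, language,
name_via_table, and so on) are hypothetical and are not GDB's real
structures.

```cpp
#include <string>

// C style: a hand-written table of function pointers.
// Hypothetical names; not GDB's actual language or gdbarch structures.
struct c_language_ops
{
  const char *(*name) (void);
};

static const char *c_ops_name (void) { return "c"; }
static const c_language_ops c_ops = { c_ops_name };

// C++ style: the compiler builds the table and checks every slot.
struct language
{
  virtual ~language () {}
  virtual std::string name () const = 0;
};

struct c_language : language
{
  std::string name () const { return "c"; }
};

// Dispatch through the hand-written table.  A forgotten slot would be
// a null pointer that is only noticed at run time.
std::string name_via_table ()
{
  return c_ops.name ();
}

// Dispatch through a virtual function.  Forgetting to implement the
// pure virtual function makes c_language impossible to instantiate,
// so the mistake is caught at compile time.
std::string name_via_virtual ()
{
  c_language lang;
  const language &base = lang;
  return base.name ();
}
```

Both functions return the same string; the difference is only in who
maintains the dispatch table.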

I think a gradual move to C++ would enable us to fix these problems.
I believe it would also provide us a way to fix the ongoing reference
counting bugs in the Python layer.


My proposal is:

1. Modify GDB so it can be compiled with -Wc++-compat.
This would be the first patch series. There is already an archer
branch for this.

2. Then, change GDB to compile with a C++ compiler (-Wc++-compat is
not complete). This would be the second patch series.

3. Require C++.

4. Change selected modules to use C++ rather than C.
I don't think a wholesale change makes sense, but some areas would
benefit.

My first target would be to change the exception handling system to
use C++ exceptions. This would enable us to begin using RAII in
some areas, which would help robustness.

My concrete plan here is:

* Use the GCC cleanup-checking plugin I already wrote to detect
cleanup-aware functions.

* Modify these functions, using a script, to add an RAII-using
object to manage the local cleanups. This is important so that
we run cleanups at the correct time during stack unwinding.

* Change throw_exception to use 'throw' and all TRY_CATCH
instances to try...catch.

* Finally, convert functions to static RAII usage when appropriate;
this will be an ongoing transition.
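
The second step of this plan could look roughly like the sketch below:
an object whose destructor runs the function's registered cleanups, so
they still run when an exception unwinds the stack. The class
scoped_cleanups and the cleanup_fn signature are assumptions made for
illustration; GDB's real cleanup API is different.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical cleanup callback type, loosely modelled on the idea of
// a cleanup chain.
typedef void (*cleanup_fn) (void *);

// Runs every registered cleanup when the object leaves scope, whether
// by a normal return or by stack unwinding.
class scoped_cleanups
{
public:
  scoped_cleanups () {}

  ~scoped_cleanups ()
  {
    // Run in reverse order of registration, like a cleanup stack.
    while (!m_entries.empty ())
      {
        entry e = m_entries.back ();
        m_entries.pop_back ();
        e.fn (e.arg);
      }
  }

  void add (cleanup_fn fn, void *arg)
  {
    entry e = { fn, arg };
    m_entries.push_back (e);
  }

private:
  struct entry { cleanup_fn fn; void *arg; };
  std::vector<entry> m_entries;

  // Not copyable: each cleanup must run exactly once.
  scoped_cleanups (const scoped_cleanups &);
  scoped_cleanups &operator= (const scoped_cleanups &);
};

static void append_a (void *arg) { static_cast<std::string *> (arg)->push_back ('a'); }
static void append_b (void *arg) { static_cast<std::string *> (arg)->push_back ('b'); }

// Registers two cleanups, then throws.  Both cleanups run during
// unwinding, before the handler body executes.
std::string demo_unwind ()
{
  std::string log;
  try
    {
      scoped_cleanups cleanups;
      cleanups.add (append_a, &log);
      cleanups.add (append_b, &log);
      throw std::runtime_error ("error");
    }
  catch (const std::runtime_error &)
    {
      log.push_back ('!');
    }
  return log;
}
```

Here demo_unwind returns "ba!": the cleanups ran in reverse order, and
they ran before the catch block.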

I think our second target will be sorting out Python reference
counting, so we can avoid the many problems we have had there.
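
As a rough idea of what RAII could do for reference counting, here is a
sketch built on a toy object. toy_object, toy_incref, toy_decref and
toy_ref are all hypothetical stand-ins; the real Python C API works on
PyObject * with Py_INCREF and Py_DECREF, and this sketch does not use it.

```cpp
// A toy reference-counted object standing in for a Python object.
struct toy_object
{
  int refcount;
};

inline void toy_incref (toy_object *o) { ++o->refcount; }
inline void toy_decref (toy_object *o) { --o->refcount; }

// Owns exactly one reference and releases it when it leaves scope.
class toy_ref
{
public:
  // Takes ownership of a reference the caller already holds.
  explicit toy_ref (toy_object *o) : m_obj (o) {}

  // Copying acquires a new reference.
  toy_ref (const toy_ref &other) : m_obj (other.m_obj)
  {
    if (m_obj)
      toy_incref (m_obj);
  }

  // Copy-and-swap: the by-value parameter releases the old reference.
  toy_ref &operator= (toy_ref other)
  {
    toy_object *tmp = m_obj;
    m_obj = other.m_obj;
    other.m_obj = tmp;
    return *this;
  }

  ~toy_ref ()
  {
    if (m_obj)
      toy_decref (m_obj);
  }

  toy_object *get () const { return m_obj; }

private:
  toy_object *m_obj;
};

// Returns the reference count left after two holders go out of scope.
int demo_refcount ()
{
  toy_object obj = { 1 };      // one reference, owned by this function
  {
    toy_incref (&obj);         // a new reference for 'holder' to own
    toy_ref holder (&obj);
    toy_ref copy (holder);     // the count is now 3
    // An early return or an exception here would still release both.
  }
  return obj.refcount;
}
```

After the inner scope ends the count is back to 1, with no explicit
decref calls in the function body.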
Joel Brobecker
2011-06-27 21:22:46 UTC
Post by Tom Tromey
Do you find it reasonably convincing? If not, why not? What can be
improved? Are there other good initial targets for conversion? Are
there lurking problems of which I am unaware?
I can see the technical merits of moving to C++, and I'm OK
with that (not super enthusiastic, but maybe all I need is some
time taking advantage of C++ over C).

I have two fears:

- Not having access to a good C++ compiler on some of the exotic
platforms out there. Getting a g++ installed by some of us at
AdaCore who know GCC will take me a bit of time.

- The debugging situation; how well can we debug C++? I'm pretty
sure that we've given up on stabs, but we're still doing work
on eliminating stabs from platforms such as AIX for instance.
It's tough to say how far we are, except that we're hoping that
we're getting closer (Tristan is working on and off on that).

I think "FUD" #1 can be dismissed as AdaCore's problem, and we'll
take care of that. I'd like to be reassured that Fear #2 is
unjustified, and that you will have reasonable functionality even
with GCC 4.5 (that's what we use at AdaCore, and we'll probably
stay with that for another 18 months).
--
Joel
Tom Tromey
2011-06-29 19:55:15 UTC
Joel> - Not having access to a good C++ compiler on some of the exotic
Joel> platforms out there. Getting a g++ installed by some of us at
Joel> AdaCore who know GCC will take me a bit of time.
[...]
Joel> I think "FUD" #1 can be dismissed as AdaCore's problem, and we'll
Joel> take care of that.

I definitely wouldn't put it that way. If you ship on hosts where this
is a problem, it is best to confront it in advance.

Could you say what hosts you are worried about?

Joel> - The debugging situation; how well can we debug C++? I'm pretty
Joel> sure that we've given up on stabs, but we're still doing work
Joel> on eliminating stabs from platforms such as AIX for instance.
Joel> It's tough to say how far we are, except that we're hoping that
Joel> we're getting closer (Tristan is working on and off on that).

Joel> I'd like to be reassured that Fear #2 is unjustified, and that you
Joel> will have reasonable functionality even with GCC 4.5 (that's what
Joel> we use at AdaCore, and we'll probably stay with that for another
Joel> 18 months).

Actually, debugging GDB will improve. Hard to believe, but I think it
is true.

The first thing to remember is that it is a slow transition. GDB will
start using some C++ features, but it will take a long time -- honestly,
probably forever -- to transition to fully idiomatic C++.

I think GDB can handle C-like C++ perfectly well. It handles some of
the simpler language additions -- e.g., classes -- quite well.

We have pretty-printing for basically everything in the STL. This means
that printing collections will actually improve. E.g., right now,
printing a hash table is difficult because the slots are just void*.
With STL and pretty-printing you will see a more obvious display.

Another improvement is exception handling. Right now, GDB doesn't work
with longjmp on most Linux distros[*]. But, GCC 4.5 shipped with
_Unwind_DebugHook, so if you have debuginfo for libgcc, then "next" over
a throw in C++ will do the right thing -- take you to where the
exception lands.

[*] We fixed this in Fedora -- but with a local glibc patch and with the
as-yet unmerged SystemTap probe patch for GDB.


There are still a lot of C++ constructs that GDB doesn't handle well.
You can see a reasonably complete list on the roadmap, see the
"Expression Parsing" list:

http://sourceware.org/gdb/wiki/ProjectArcher

Much of this stuff is not very important, though. I doubt a program
like GDB would run into most of these.

Tom
Joel Brobecker
2011-06-29 21:46:56 UTC
Post by Tom Tromey
I definitely wouldn't put it that way. If you ship on hosts where this
is a problem, it is best to confront it in advance.
Could you say what hosts you are worried about?
Right now, I'm mostly worried about AIX, and HP/UX. Our last Tru64
machine died of a sudden death, so I can't help with that port
anymore. I also need to think about Windows x64, but we already
have a C++ compiler on 32bit Windows, so I'm hoping it'll be easy
enough to build a 64bit version.

But really, I am just going to have to convince the guys at AdaCore
that I just need these compilers, and pronto :-).
Post by Tom Tromey
Actually, debugging GDB will improve. Hard to believe, but I think it
is true.
OK, you seem to have thought this through quite a bit, so
I look forward to seeing those improvements...
--
Joel
Matt Rice
2011-06-28 04:33:38 UTC
Post by Tom Tromey
1. Modify GDB so it can be compiled with -Wc++-compat.
  This would be the first patch series.  There is already an archer
  branch for this.
fwiw, -Wc++-compat was complete, but requires merging to head
(about 4 months old); the latest push is about 9 months old.
I don't imagine it'll take me too long to do that merge.
knock on wood.
Post by Tom Tromey
2. Then, change GDB to compile with a C++ compiler (-Wc++-compat is
  not complete).  This would be the second patch series.
I had started on this locally, but did not get very far into it.
the patches here are a little less mechanical than -Wc++-compat.
i'll try and muster up some motivation to work on these 2 things.
Yao Qi
2011-06-28 08:21:22 UTC
I've been working on another proposal to move gdb to C++.  I'd
appreciate help with it.  Here is what I have so far.
Do you find it reasonably convincing?  If not, why not?  What can be
improved?  Are there other good initial targets for conversion?  Are
there lurking problems of which I am unaware?
I don't have special preference over C or C++.
1. Modify GDB so it can be compiled with -Wc++-compat.
  This would be the first patch series.  There is already an archer
  branch for this.
2. Then, change GDB to compile with a C++ compiler (-Wc++-compat is
  not complete).  This would be the second patch series.
3. Require C++.
4. Change selected modules to use C++ rather than C.
  I don't think a wholesale change makes sense, but some areas would
  benefit.
  My first target would be to change the exception handling system to
  use C++ exception.  This would enable us to begin using RAII in
  some areas, which would help robustness.
  * Use the GCC cleanup-checking plugin I already wrote to detect
    cleanup-aware functions.
  * Modify these functions, using a script, to add an RAII-using
    object to manage the local cleanups.  This is important so that
    we run cleanups at the correct time during stack unwinding.
  * Change throw_exception to use 'throw' and all TRY_CATCH
    instances to try...catch.
  * Finally, convert functions to static RAII usage when appropriate;
    this will be an ongoing transition.
  I think our second target will be sorting out Python reference
  counting, so we can avoid the many problems we have had there.
Tom,
IIUC, your concrete plan is about converting GDB to C++
*partially*, instead of rewriting GDB *completely*. Is that correct?
For example, I don't see anything in your plan about converting the
*-tdep.c files to C++. Is that in your plan, or do we plan to leave
them as they are now?

Do we plan to move gdbserver to C++? I think not, because some
bare-metal boards have too little memory to hold a C++ application. So
we would be in a state where both C and C++ coexist in GDB for some
time. I don't think C and C++ coexistence is a problem. Or is your plan
about "making good use of C++ to replace some bad and error-prone stuff
in GDB, and keeping the rest of GDB as it is"? Is that right?

Just want to know clearly what GDB will be after your plan is performed.
--
Yao Qi <qiyaoltc AT gmail DOT com>
http://sites.google.com/site/duewayqi/
Gary Benson
2011-06-28 12:21:35 UTC
Post by Yao Qi
Do we plan to move gdbserver to C++? I think not, because some
bare-metal boards have too little memory to hold a C++ application.
So we would be in a state where both C and C++ coexist in GDB for
some time. I don't think C and C++ coexistence is a problem.
Or is your plan about "making good use of C++ to replace some
bad and error-prone stuff in GDB, and keeping the rest of GDB as
it is"? Is that right?
As I understand it, I don't know that there's any reason C++ has to
use more memory than C. Granted there are things like the STL that
generate vastly more code under the hood than you might expect, but
I don't think anybody is talking about using STL here.

I've spent the past few years working on HotSpot, which is written
using a similar subset of C++ to what Tom is proposing (with the
exception that HotSpot does not use C++ exceptions). My experience
has been that the overheads of C++ are as minimal as they can be. A
data structure in a class with virtual functions, for example, is
larger than that same data structure in a struct by only a single
pointer. In cases where GDB is doing what C++ would anyway--for
example, replacing pointers to functions with C++ virtual
functions--then we're doing essentially the same thing, only with
better readability and error detection.
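
The "single pointer" claim can be checked with a sketch like the one
below. The struct names are hypothetical, and the result is a property
of common ABIs (for example GCC on x86-64), not something the C++
standard guarantees.

```cpp
#include <cstddef>

// A plain struct with two pointer-sized members, chosen so that no
// padding hides or inflates the difference.
struct plain_value
{
  void *contents;
  std::size_t length;
};

// The same members, plus a virtual function.
struct virtual_value
{
  virtual ~virtual_value () {}
  void *contents;
  std::size_t length;
};

// Extra bytes per object that come from having a virtual function.
std::size_t vptr_overhead ()
{
  return sizeof (virtual_value) - sizeof (plain_value);
}
```

On the usual implementations vptr_overhead () equals sizeof (void *),
because each object carries one hidden pointer to its class's table of
virtual functions.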

A concrete example, hello world:

#include <stdio.h>

int
main (int argc, char *argv[])
{
puts ("Hello world");
return 0;
}

Save it as hello.c and compile with gcc, then save it as hello.cc
and compile with g++. The code for the main function is _exactly_
the same:

0000000000400514 <main>:
  400514:  55               push   %rbp
  400515:  48 89 e5         mov    %rsp,%rbp
  400518:  48 83 ec 10      sub    $0x10,%rsp
  40051c:  89 7d fc         mov    %edi,-0x4(%rbp)
  40051f:  48 89 75 f0      mov    %rsi,-0x10(%rbp)
  400523:  bf 38 06 40 00   mov    $0x400638,%edi
  400528:  e8 e3 fe ff ff   callq  400410 <puts@plt>
  40052d:  b8 00 00 00 00   mov    $0x0,%eax
  400532:  c9               leaveq
  400533:  c3               retq
  400534:  90               nop
  400535:  90               nop
  400536:  90               nop
  400537:  90               nop
  400538:  90               nop
  400539:  90               nop
  40053a:  90               nop
  40053b:  90               nop
  40053c:  90               nop
  40053d:  90               nop
  40053e:  90               nop
  40053f:  90               nop

The resulting executable is slightly larger (6562 bytes from 6433).
I'm not sure where this comes from or how it would scale from this
trivial example to a project the size of GDB, but the beauty of
Tom's plan is that the first step is to get the basic C code to
compile with a C++ compiler. Once that's done we can build the
same codebase with the different compilers and see where we're at.

Cheers,
Gary
--
http://gbenson.net/
Matt Rice
2011-06-28 17:08:34 UTC
Post by Gary Benson
The resulting executable is slightly larger (6562 bytes from 6433).
note that these numbers are equivalent to the hello-c++1 from the
attached foo.sh shell script, which brings in a bunch of shared
libraries. I tried some other sources/linking scenarios to get an
idea of the footprint. (below is the output).

of concern is that of the *-tdep.c files:
amd64, i386, ppc, rs6000, and spu, (at least) use TRY_CATCH or throw_*

and that at least arm uses VEC

I'm not sure how far outside of *-tdep.c this stuff would propagate.


here is the output:
File: hello-c
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
File: hello-c++1
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
File: hello-c++2
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
File: hello-c++3
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
File: hello-exceptions
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
File: hello-exceptions+vector
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
File: hello-vector
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
6.3K ./hello-c
6.5K ./hello-c++1
6.3K ./hello-c++2
6.4K ./hello-c++3
106K ./hello-exceptions
166K ./hello-exceptions+vector
156K ./hello-vector
stripped
4.2K ./hello-c
4.3K ./hello-c++1
4.2K ./hello-c++2
4.3K ./hello-c++3
86K ./hello-exceptions
122K ./hello-exceptions+vector
118K ./hello-vector
Jan Kratochvil
2011-06-28 17:35:45 UTC
Post by Gary Benson
I don't think anybody is talking about using STL here.
We should at least use C++ standard library features; that is one of the
purposes of the move to C++. std::string should be used for char * instead
of all the make_cleanup, do_cleanups, and discard_cleanups calls (or C++
exceptions for the same purpose), and vec.h should be replaced by
std::vector, etc.
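
A small sketch of the std::string point, comparing a malloc-based
helper with its std::string equivalent. The function names join_c and
join_cxx are made up for this example and do not exist in GDB.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// C style: the result is heap-allocated, and every caller (and every
// error path in every caller) must remember to free it.
char *join_c (const char *dir, const char *file)
{
  std::size_t len = std::strlen (dir) + 1 + std::strlen (file) + 1;
  char *result = static_cast<char *> (std::malloc (len));
  if (result == NULL)
    return NULL;
  std::strcpy (result, dir);
  std::strcat (result, "/");
  std::strcat (result, file);
  return result;               // caller must call free ()
}

// C++ style: the string owns its storage and releases it in its
// destructor, so no cleanup needs to be registered or discarded.
std::string join_cxx (const std::string &dir, const std::string &file)
{
  return dir + "/" + file;
}
```

With join_cxx there is nothing for the caller to free, so an early
return or an exception between the call and the end of the function
cannot leak the buffer.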


Thanks,
Jan
Tom Tromey
2011-06-29 20:08:13 UTC
Gary> As I understand it, I don't know that there's any reason C++ has to
Gary> use more memory than C. Granted there are things like the STL that
Gary> generate vastly more code under the hood than you might expect, but
Gary> I don't think anybody is talking about using STL here.

Actually, I think we do want to use the STL. At the very least it would
put us on track to get rid of vec.h, our home-grown std::vector. But we
also use or want to use other containers.
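
For a flavour of what std::vector gives over a macro-based vector, here
is a minimal sketch. The element type symbol_entry and the function
sum_addresses are hypothetical and only serve the example.

```cpp
#include <vector>

// Hypothetical element type, just for illustration.
struct symbol_entry
{
  int address;
};

// The element type is part of the container's type, so iteration
// needs no casts from void * and no callback function.
int sum_addresses (const std::vector<symbol_entry> &entries)
{
  int total = 0;
  for (std::vector<symbol_entry>::const_iterator it = entries.begin ();
       it != entries.end (); ++it)
    total += it->address;
  return total;
}

// Builds a small vector and sums it.
int demo_sum ()
{
  std::vector<symbol_entry> entries;
  symbol_entry a = { 1 };
  symbol_entry b = { 2 };
  entries.push_back (a);
  entries.push_back (b);
  return sum_addresses (entries);
}
```

The vector also frees its storage in its destructor, so it needs no
matching cleanup.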

Tom
Tom Tromey
2011-06-29 20:05:35 UTC
Yao> In your concrete plan, IIUC, your plan is about converting GDB to C++
Yao> *partially*, instead of re-write GDB *completely*. Is that
Yao> correct?

Yes. I don't think a complete rewrite is either practical or advisable.
Instead I think an incremental approach is best.

Now, one possible criticism is that such incremental changes often peter
out. And this is definitely a possible problem -- after exceptions and
python reference counting, what do we care enough about to transform? I
mean, it is easy to think of areas that can be C++-ified, but are the
benefits enough to justify the work? Would we be better off just
writing GCC plugins to check our changes? I tend to think the benefits
are worth the cost, but it is hard to know this with any certainty.

Yao> For example, I don't see anything in your plan about converting *-tdep.c
Yao> stuff into C++. Is it in your plan or we plan to leave them as they
Yao> are now?

Leave them.

Yao> Do we plan to move gdbserver to C++? I think no, because [...]

I agree.

Yao> I don't think C and C++ coexistence is a problem, or, your plan is
Yao> about "make good use of C++ to replace some bad and error-prone stuff
Yao> in GDB, and keep the rest of GDB as it is". Is it right?

Yes.

Yao> Just want to know clearly what GDB will be after your plan is performed.

I think we will always have parts in C. At the very least BFD, and if
you push forward on the gdbserver library project, then the shared bits
there as well.

Tom
Yao Qi
2011-06-30 06:06:31 UTC
Post by Tom Tromey
Yao> In your concrete plan, IIUC, your plan is about converting GDB to C++
Yao> *partially*, instead of re-write GDB *completely*.  Is that
Yao> correct?
Yes.  I don't think a complete rewrite is either practical or advisable.
Instead I think an incremental approach is best.
Tom,
Thanks for your clarification.

I agree.
Post by Tom Tromey
Now, one possible criticism is that such incremental changes often peter
out.  And this is definitely a possible problem -- after exceptions and
python reference counting, what do we care enough about to transform?  I
mean, it is easy to think of areas that can be C++-ified, but are the
benefits enough to justify the work?  Would we be better off just
writing GCC plugins to check our changes?  I tend to think the benefits
are worth the cost, but it is hard to know this with any certainty.
I agree that it is hard to say which part should be C++-ified first,
or which can be C++-ified easily. When the first step (Python reference
counting and exceptions) is done, we may have a clearer view of which
part should be C++-ified next.
Post by Tom Tromey
Yao> I don't think C and C++ coexistence is a problem, or, your plan is
Yao> about "make good use of C++ to replace some bad and error-prone stuff
Yao> in GDB, and keep the rest of GDB as it is".  Is it right?
Yes.
That is good to me.
Post by Tom Tromey
Yao> Just want to know clearly what GDB will be after your plan is performed.
I think we will always have parts in C.  At the very least BFD, and if
you push forward on the gdbserver library project, then the shared bits
there as well.
Right, that is why I asked whether we plan to move gdbserver to C++.
This makes sense to me.
--
Yao Qi
Jim Blandy
2011-09-26 21:01:22 UTC
I came across this thread, and thought I might bring up Mozilla's
experience converting our JavaScript engine to C++. (It seems like
this discussion hasn't moved over to the gdb mailing list, so I'm
sending this here.)

C++'s richer type system is wonderful. We try to get strict typing
everywhere we can, and it has helped a lot. Legibility is way up since
I started at Mozilla, in 2008, just before we switched.

Contrary to your plan, we did *not* use C++ exceptions, but stuck with
our primitive "false/NULL return value means error; please clean up
and propagate" system (we carry details about the error in a structure
off to the side). It's actually a lot of work to transition C code to
being robust in the presence of exceptions. Pretty much every place
you call malloc/realloc needs to use some kind of automatic clean-up
facility like std::auto_ptr, unless you're positive nobody will ever
add a call to a function that might throw an exception before that
malloc'd storage is freed, or linked into whatever data structure it's
destined for, or the like. The essential problem is that the old C
code, by explicitly propagating the errors, effectively makes a big,
visible distinction between calls that might throw, and calls that
won't. All the control flow is explicit. When you switch to
exceptions, that distinction goes away: any call could potentially
leave the scope, and every pointer that's live across that call must
be prepared for the possibility.
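
The hazard described above can be shown with a sketch like this one.
Everything in it is hypothetical (tracked_malloc, leaky_parse,
safe_parse, and so on), and it uses std::unique_ptr, which assumes a
C++11 compiler; std::auto_ptr, mentioned above, cannot call free ().

```cpp
#include <cstdlib>
#include <memory>
#include <stdexcept>

static int live_buffers = 0;        // buffers allocated but not yet freed
static void *last_alloc = nullptr;  // remembered so the demo can tidy up

static void *tracked_malloc (std::size_t n)
{
  ++live_buffers;
  last_alloc = std::malloc (n);
  return last_alloc;
}

static void tracked_free (void *p)
{
  if (p != nullptr)
    {
      --live_buffers;
      std::free (p);
    }
}

struct tracked_deleter
{
  void operator() (void *p) const { tracked_free (p); }
};

static void might_throw (bool fail)
{
  if (fail)
    throw std::runtime_error ("parse error");
}

// C-style body: correct with explicit error returns, but it leaks as
// soon as a callee throws, because unwinding skips the free.
void leaky_parse (bool fail)
{
  void *buf = tracked_malloc (64);
  might_throw (fail);
  tracked_free (buf);
}

// The buffer is owned by an object, so it is released on every exit
// path, including unwinding.
void safe_parse (bool fail)
{
  std::unique_ptr<char, tracked_deleter> buf
    (static_cast<char *> (tracked_malloc (64)));
  might_throw (fail);
}

// Calls fn so that it throws, and reports how many buffers leaked.
int leaked_after_throw (void (*fn) (bool))
{
  live_buffers = 0;
  try
    {
      fn (true);
    }
  catch (const std::runtime_error &)
    {
    }
  int leaked = live_buffers;
  if (leaked > 0)
    std::free (last_alloc);   // reclaim it so the demo itself is leak-free
  return leaked;
}
```

leaked_after_throw (leaky_parse) reports one leaked buffer, while
leaked_after_throw (safe_parse) reports none. Nothing in the call to
might_throw shows that it can leave the scope, which is the problem
described above.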

We do use RAII a lot, though; that works great, and is a big win, even
when the code propagates errors explicitly. Constructors and
destructors are extremely helpful for all sorts of things.

One thing that surprised me is that we have a lot of little classes
that are only used for local variables, never allocated on the heap.
We use them to abstract out common patterns of code that occur *within
functions*. Because inlining is trustworthy in G++, the generated code
doesn't change much, but it's much more legible.
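
One example of such a little stack-only class, as a sketch: a helper
that temporarily changes a variable and restores it when the function
exits. The names scoped_restore and current_radix are assumptions made
for this illustration, not code from Mozilla or GDB.

```cpp
// Saves a variable, sets a temporary value, and restores the saved
// value when the object goes out of scope.
template <typename T>
class scoped_restore
{
public:
  scoped_restore (T &var, const T &temporary)
    : m_var (var), m_saved (var)
  {
    m_var = temporary;
  }

  ~scoped_restore () { m_var = m_saved; }

private:
  T &m_var;
  T m_saved;

  // Not copyable: the restore must happen exactly once.
  scoped_restore (const scoped_restore &);
  scoped_restore &operator= (const scoped_restore &);
};

static int current_radix = 10;   // hypothetical global setting

// Sees the temporary value while the helper object is alive.
int radix_inside ()
{
  scoped_restore<int> restore (current_radix, 16);
  return current_radix;
}

// Sees the original value again once radix_inside has returned.
int radix_after ()
{
  radix_inside ();
  return current_radix;
}
```

The save/set/restore pattern is written once in the class instead of
by hand at every exit of every function that needs it.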

Stan Shebs
2011-06-29 18:52:18 UTC
Post by Tom Tromey
I've been working on another proposal to move gdb to C++. I'd
appreciate help with it. Here is what I have so far.
Looks good to me! At some point there should be a task to define the
accepted set of C++ features; I really really want to avoid people
having to spend time debating minutiae of the language, and I don't want
to hear about how target vectors could be recoded using
amazing-but-not-yet-implemented features of C++2x. :-)

Stan
Joel Brobecker
2011-06-29 19:02:47 UTC
Post by Stan Shebs
Looks good to me! At some point there should be a task to define
the accepted set of C++ features; I really really want to avoid
people having to spend time debating minutiae of the language, and I
don't want to hear about how target vectors could be recoded using
amazing-but-not-yet-implemented features of C++2x. :-)
Agreed 100%. (And that connects well with not having to have the
latest and greatest GCC to be able to build or debug GDB.)
--
Joel