Q: multithreaded tracees && clone/exit

Discussion:

Oleg Nesterov

2010-07-16 20:51:47 UTC

Hello.

In case this matters, I used gdb-7.1 for testing.

Q1: if gdbstub reported that the tracee has exited (say, we sent
'$X09#..' to gdb), can gdbstub assume it can forget about this thread?
I mean, can it assume that gdb won't send something like 'D;EXITED_PID'?
Or should it keep the info connected to the exited thread until the
explicit detach request? Or maybe we should keep some info just to
report this exit code again to the next '$?#3f' request?

Looking at gdb sources/behaviour, I think the answer is yes, it can
forget. But I'd like to have the confirmation.
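
For reference, the framing of packets such as '$?#3f' can be sketched
as below. This is only an illustration of the remote protocol's
"$payload#checksum" layout, where the checksum is the sum of the payload
bytes modulo 256, written as two hex digits. rsp_checksum() and
rsp_frame() are hypothetical helper names, not gdb or gdbserver
functions, and the sketch assumes the payload contains no characters
that would need escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Checksum of a remote-protocol payload: sum of its bytes modulo 256. */
static unsigned char rsp_checksum(const char *payload)
{
	unsigned char sum = 0;

	while (*payload)
		sum += (unsigned char)*payload++;
	return sum;
}

/*
 * Frame a payload as "$payload#xx" (hypothetical helper).
 * Assumes the payload needs no escaping.
 * Returns the packet length, or -1 if buf is too small.
 */
static int rsp_frame(char *buf, size_t size, const char *payload)
{
	int n;

	assert(buf != NULL && payload != NULL);
	n = snprintf(buf, size, "$%s#%02x", payload, rsp_checksum(payload));
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With these helpers, framing "?" gives "$?#3f" (the '?' character is
0x3f), which matches the request quoted above.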

And. I'd like to let you know that gdb is buggy ;) But it is not very
easy to explain the bug because I don't know the terminology. Let's
consider this particular example:

(gdb) target extended-remote whatever
(gdb) attach PID
...
(gdb) c

Now gdb sleeps in sys_read/sys_recvfrom.

The user presses ^C, gdb sends the interrupt byte (0x03) and waits for
a reply. Suppose that gdbstub doesn't reply immediately.

The user presses ^C again and acks the "Give up (and stop debugging it)?"
question.

gdb does remote_close()->discard_all_inferiors()->...->exit_inferior_1().
Surprisingly, exit_inferior_1() does not remove this thread from
inferior_list but clears inf->pid.

This means that the subsequent find_inferior_pid() fails and returns NULL,
and gdb segfaults if the user does, say,

(gdb) detach

after that.
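
The failure mode can be sketched in isolation as below. This is not
gdb's code: struct inf, find_by_pid(), mark_exited() and detach_pid()
are hypothetical stand-ins that only mimic the behaviour described
above (the entry stays on the list, its pid is cleared, and a later
lookup by the old pid returns NULL). The last function shows the kind
of NULL check a caller would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inferior list entry; not gdb's type. */
struct inf {
	int pid;
	struct inf *next;
};

/* Look up by pid; an entry whose pid was cleared never matches. */
static struct inf *find_by_pid(struct inf *list, int pid)
{
	for (; list != NULL; list = list->next)
		if (list->pid == pid)
			return list;
	return NULL;
}

/* Mimics the described exit path: stay on the list, clear the pid. */
static void mark_exited(struct inf *inf)
{
	assert(inf != NULL);
	inf->pid = 0;
}

/*
 * A caller that checks the lookup result instead of dereferencing it.
 * Returns 0 on success, -1 if no inferior with this pid is found.
 */
static int detach_pid(struct inf *list, int pid)
{
	struct inf *inf = find_by_pid(list, pid);

	if (inf == NULL)
		return -1;
	mark_exited(inf);
	return 0;
}
```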

I noticed this bug when I found another problem: gdb+gdbserver doesn't
work correctly if the main thread exits. But let's forget about this
problem for now.

The main question is, I do not understand how gdbstub should handle the
multithreaded targets.

Trivial testcase:

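#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

/* Sub-thread: blocks until input arrives on stdin, then exits. */
void *tfunc(void *arg)
{
	getchar();
	return NULL;
}

int main(int argc, const char *argv[])
{
	pthread_t thr;

	printf("PID=%d\n", getpid());

	pthread_create(&thr, NULL, tfunc, NULL);

	/* The main thread never exits on its own. */
	for (;;)
		pause();

	return 0;
}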

Gdbserver:

gdbserver --multi :2000

gdb:

(gdb) file test1
(gdb) target extended-remote :2000
(gdb) attach 16927
Attached to process 16927
...
0x00000033af60e57d in pause () from /lib64/libpthread.so.0
(gdb)

OK. gdbserver ptraces both threads. But afaics gdb doesn't know this
program is multithreaded, and strace shows that gdb doesn't send
a qfThreadInfo request.

Q2: Shouldn't gdbstub let debugger know about sub-threads somehow?

Let's continue:

(gdb) c
Continuing.

gdbserver resumes both threads. Press enter, the sub-thread exits.

And nothing happens! gdbserver sends nothing to gdb, it just reaps
the tracee silently:

rt_sigsuspend([]) = ? ERESTARTNOHAND (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
rt_sigreturn(0x11) = -1 EINTR (Interrupted system call)
wait4(-1, 0x7fff2c719fbc, WNOHANG, NULL) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|__WCLONE, NULL) = 16928
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
wait4(-1, 0x7fff2c719fbc, WNOHANG, NULL) = 0
wait4(-1, 0x7fff2c719fbc, WNOHANG|__WCLONE, NULL) = -1 ECHILD (No child processes)
rt_sigsuspend([]

Q3: is it correct? shouldn't we inform the debugger?

So. Afaics, gdb can only find the new thread if the user does
"info threads", or if this thread reports somehow about itself
(say, it gets a signal and gdbserver sends "$T..." with its tid).

Also. gdb can't know the sub-thread has exited unless the user
does "info threads" again, or something like "$TpPID.TGID" gets
"E01" in reply.

Correct?
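
To make the thread-alive check concrete, here is a minimal sketch of
building the 'T' query payload and reading its reply. It assumes the
multiprocess thread-id syntax "p<pid>.<tid>" with hexadecimal ids, and
that "OK" means the thread is alive while an "E" reply with two hex
digits (such as "E01") means it is gone. thread_alive_query() and
thread_alive_reply() are hypothetical names, and the payload is shown
without the "$...#xx" framing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the "Tp<pid>.<tid>" payload with hexadecimal ids
 * (hypothetical helper). Returns the payload length, or -1 if
 * buf is too small.
 */
static int thread_alive_query(char *buf, size_t size, int pid, int tid)
{
	int n;

	assert(pid > 0 && tid > 0);
	n = snprintf(buf, size, "Tp%x.%x",
		     (unsigned int)pid, (unsigned int)tid);
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Interpret the reply: 1 for "OK" (alive), 0 for an "Exx" error
 * reply (treated as dead), -1 for anything else.
 */
static int thread_alive_reply(const char *reply)
{
	if (strcmp(reply, "OK") == 0)
		return 1;
	if (reply[0] == 'E' && strlen(reply) == 3)
		return 0;
	return -1;
}
```

For the testcase above (process 16927, sub-thread 16928) the payload
would be "Tp421f.4220".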

Q4: is this what we want to implement?

I am asking because I thought that gdb+gdbserver should
try to work the same way as it works without gdbserver, and
thus it should see clone/exit.

However, gdbserver sends nothing to gdb if the tracee does
pthread_create() or pthread_exit().

Oleg.

Roland McGrath

2010-07-16 21:39:50 UTC