gdb

单步执行和跟踪函数调用

gdb 用于调试程序,找出 bug. 下面我们来试着用 gdb 找出下面程序中的 bug:

#include <stdio.h>

int add_range(int low, int high)
{
    int i, sum;
    for (i = low; i <= high; i++)
        sum = sum + i;
    return sum;
}

int main(void)
{
    int result[100];
    result[0] = add_range(1, 10);
    result[1] = add_range(1, 100);
    printf("result[0]=%d\nresult[1]=%d\n", result[0], result[1]);
    return 0;
}

add_range函数从low加到high,在main函数中首先从1加到10,把结果保存下来,然后从1加到100,再把结果保存下来,最后打印的两个结果是:

result[0]=-391038065
result[1]=-391033015

结果显然不对,正确结果应该是 55 和 5050.

为了调试,编译时要加上 -g 选项,生成的可执行文件才能用gdb进行源码级调试:

$ gcc main.c -o main -g

-g 选项的作用是在可执行文件中加入源代码的信息,比如可执行文件中第几条机器指令对应源代码的第几行,但并不是把整个源文件嵌入到可执行文件中,所以在调试时必须保证gdb能找到源文件。

然后我们输入 gdb main(如果没有安装,可以用 sudo apt install gdb 安装),进入 gdb 命令行环境:

pi@raspbian:~/code/learn c$ gdb main
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main...done.
(gdb) 

(gdb) 是命令提示符,表示等待输入。输入 help 查看常用命令:

(gdb) help
List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

然后用 list 命令从第一行开始列出 10 行源码:

(gdb) list
1       #include <stdio.h>
2
3       int add_range(int low, int high)
4       {
5               int i, sum;
6               for (i = low; i <= high; i++)
7                       sum = sum + i;
8               return sum;
9       }
10

再次输入 list 会继续显示后面 10 行:

(gdb) list
11      int main(void)
12      {
13              int result[100];
14              result[0] = add_range(1, 10);
15              result[1] = add_range(1, 100);
16              printf("result[0]=%d\nresult[1]=%d\n", result[0], result[1]);
17              return 0;
18      }

也可以什么都不输直接敲回车,gdb提供了一个很方便的功能,在提示符下直接敲回车表示重复上一条命令。

gdb的很多常用命令有简写形式,例如list命令可以写成l,要列一个函数的源代码也可以用函数名做参数:

(gdb) l add_range
1       #include <stdio.h>
2
3       int add_range(int low, int high)
4       {
5               int i, sum;
6               for (i = low; i <= high; i++)
7                       sum = sum + i;
8               return sum;
9       }
10

退出gdb的环境:

(gdb) quit

下面正式开始调试。首先用start命令开始执行程序:

(gdb) start
Temporary breakpoint 1 at 0x7bc: file main.c, line 14.
Starting program: /home/pi/code/learn c/main 

Temporary breakpoint 1, main () at main.c:14
14              result[0] = add_range(1, 10);

gdb停在main函数中变量定义之后的第一条语句处等待我们发命令,gdb列出的这条语句是即将执行的下一条语句。我们可以用next命令(简写为n)控制这些语句一条一条地执行:

(gdb) n
15              result[1] = add_range(1, 100);
(gdb) (直接回车)
16              printf("result[0]=%d\nresult[1]=%d\n", result[0], result[1]);
(gdb) (直接回车)
result[0]=-3105
result[1]=1945
17              return 0;

用n命令依次执行两行赋值语句和一行打印语句,在执行打印语句时结果立刻打出来了,然后停在return语句之前等待我们发命令。虽然我们完全控制了程序的执行,但仍然看不出哪里错了,因为错误不在 main 函数中而在 add_range 函数中,现在用start命令重新来过,这次用 step 命令(简写为 s)钻进 add_range 函数中去跟踪执行:

(gdb) start
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Temporary breakpoint 2 at 0xaaaaaaaaa7bc: file main.c, line 14.
Starting program: /home/pi/code/learn c/main 

Temporary breakpoint 2, main () at main.c:14
14              result[0] = add_range(1, 10);
(gdb) s
add_range (low=1, high=10) at main.c:6
6               for (i = low; i <= high; i++)

我们可以用 backtracebt 来查看函数调用:

(gdb) bt
#0  add_range (low=1, high=10) at main.c:6
#1  0x0000aaaaaaaaa7c8 in main () at main.c:14

可以看到,low=1,high=10. main 函数的栈帧编号为 1,add_range 的栈帧编号为0。现在可以用 info 命令(简写为 i)查看 add_range 函数局部变量的值:

(gdb) i locals
i = 65535
sum = -3144

可以看出,此时的 sum 的值都不为 0,说明我们忘记了初始化。所以我们需要将 sum 赋为 0. 问题解决。

我们对修改后的代码重新 debug:

(gdb) start
Temporary breakpoint 1 at 0x7c0: file main.c, line 14.
Starting program: /home/pi/code/learn c/main 

Temporary breakpoint 1, main () at main.c:14
14              result[0] = add_range(1, 10);
(gdb) s
add_range (low=1, high=10) at main.c:5
5               int i, sum=0;
(gdb) 
6               for (i = low; i <= high; i++)
(gdb) i locals
i = 65535
sum = 0
(gdb) finish
Run till exit from #0  add_range (low=1, high=10) at main.c:6
main () at main.c:14
14              result[0] = add_range(1, 10);
Value returned is $1 = 55

s 进入函数后,用 i locals 查看局部变量,然后用 finish 命令让程序一直运行到从当前函数返回为止,返回的结果为 55,正确。

总结一下常用的 gdb 命令:

命令 描述
backtrace(或bt) 查看各级函数调用及参数
finish 连续运行到当前函数返回为止,然后停下来等待命令
frame(或f) 帧编号 选择栈帧
info(或i) locals 查看当前栈帧局部变量的值
list(或l) 列出源代码,接着上次的位置往下列,每次列10行
list 行号 列出从第几行开始的源代码
list 函数名 列出某个函数的源代码
next(或n) 执行下一行语句
print(或p) 打印表达式的值,通过表达式可以修改变量的值或者调用函数
quit(或q) 退出gdb调试环境
set var 修改变量的值
start 开始执行程序,停在main函数第一行语句前面等待命令
step(或s) 执行下一行语句,如果有函数调用则进入到函数中

断点

下面我们用这段程序来说明断点的设置:

#include <stdio.h>

int main(void)
{
	int sum = 0, i = 0;
	char input[5];

	while (1) {
		scanf("%s", input);
		for (i = 0; input[i] != '\0'; i++)
			sum = sum*10 + input[i] - '0';
		printf("input=%d\n", sum);
	}
	return 0;
}

这个程序从键盘读取一串数字,然后再打印出来。我们编译运行一下:

123
input=123
456
input=123456

可以看出,当我们输入第二个数字时,并没有正常打印出来。我们再次进入 gdb,然后运行程序:

$ gdb main
(gdb) start
Temporary breakpoint 1 at 0x7bc: file main.c, line 5.
Starting program: /home/pi/code/learn c/main 

Temporary breakpoint 1, main () at main.c:5
5               int sum = 0, i = 0;

我们用 display 命令使得每次停下来的时候都显示当前sum的值:

(gdb) display sum
1: sum = 0
(gdb) n
9                       scanf("%s", input);
1: sum = 0
(gdb) 
123
10                      for (i = 0; input[i] != '\0'; i++)
1: sum = 0

undisplay命令可以取消跟踪显示,变量sum的编号是1,可以用undisplay 1命令取消它的跟踪显示。

如果不想一步一步走这个循环,可以用break命令(简写为b)在第9行设一个断点(Breakpoint):

(gdb) l
5               int sum = 0, i = 0;
6               char input[5];
7
8               while (1) {
9                       scanf("%s", input);
10                      for (i = 0; input[i] != '\0'; i++)
11                              sum = sum*10 + input[i] - '0';
12                      printf("input=%d\n", sum);
13              }
14              return 0;
(gdb) b 9
Breakpoint 2 at 0xaaaaaaaaa7c4: file main.c, line 9.

break命令的参数也可以是函数名,表示在某个函数开头设断点。现在用continue命令(简写为c)连续运行而非单步运行,程序到达断点会自动停下来,这样就可以停在下一次循环的开头:

(gdb) c
Continuing.
input=123

Breakpoint 2, main () at main.c:9
9                       scanf("%s", input);
1: sum = 123

然后输入新的字符串准备转换:

(gdb) n
456
10                      for (i = 0; input[i] != '\0'; i++)
1: sum = 123

这时我们可以发现,输入新的字符串前,sum 没有清零。

一次调试可以设置多个断点,用info命令可以查看已经设置的断点:

(gdb) b 12
Breakpoint 3 at 0xaaaaaaaaa830: file main.c, line 12.
(gdb) i breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000aaaaaaaaa7c4 in main at main.c:9
        breakpoint already hit 1 time
3       breakpoint     keep y   0x0000aaaaaaaaa830 in main at main.c:12

每个断点都有一个编号,可以用编号指定删除某个断点:

(gdb) delete breakpoints 3
(gdb) i breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000aaaaaaaaa7c4 in main at main.c:9
        breakpoint already hit 1 time

有时候一个断点暂时不用可以禁用掉而不必删除,这样以后想用的时候可以直接启用,而不必重新从代码里找应该在哪一行设断点:

(gdb) disable breakpoints 2
(gdb) i breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep n   0x0000aaaaaaaaa7c4 in main at main.c:9
        breakpoint already hit 1 time

gdb的断点功能非常灵活,还可以设置断点在满足某个条件时才激活,例如我们仍然在循环开头设置断点,但是仅当sum不等于0时才中断,然后用run命令(简写为r)重新从程序开头连续运行:

(gdb) break 9 if sum != 0
Note: breakpoint 2 (disabled) also set at pc 0xaaaaaaaaa7c4.
Breakpoint 4 at 0xaaaaaaaaa7c4: file main.c, line 9.
(gdb) i breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep n   0x0000aaaaaaaaa7c4 in main at main.c:9
        breakpoint already hit 1 time
4       breakpoint     keep y   0x0000aaaaaaaaa7c4 in main at main.c:9
        stop only if sum != 0
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/pi/code/learn c/main 
123
input=123

Breakpoint 4, main () at main.c:9
9                       scanf("%s", input);
1: sum = 123

结果是第一次执行scanf之前没有中断,第二次却中断了。总结一下本节用到的gdb命令:

命令描述
break(或b) 行号在某一行设置断点
break 函数名在某个函数开头设置断点
break ... if ...设置条件断点
continue(或c)从当前位置开始连续运行程序
delete breakpoints 断点号删除断点
display 变量名跟踪查看某个变量,每次停下来都显示它的值
disable breakpoints 断点号禁用断点
enable 断点号启用断点
info(或i) breakpoints查看当前设置了哪些断点
run(或r)从头开始连续运行程序
undisplay 跟踪显示号取消跟踪显示