Tuesday 2 June 2015

Memory layout of C program

In practical terms, when we run a C program, its executable image is loaded into the computer's RAM in an organized manner.
The memory layout is organized as follows:






1>Text or Code Segment :-
The text segment contains the machine code of the compiled program. It is usually sharable, so that only a single copy needs to be in memory for frequently executed programs such as text editors, the C compiler, and the shells. It is also often read-only, which prevents a program from accidentally modifying its own instructions.

2>Initialized Data Segment :-
The initialized data segment stores global and static variables that are explicitly initialized in the source code (typically to a non-zero value), including constants and variables that other files refer to with the extern keyword. The segment as a whole is not read-only, since the values of ordinary variables can be altered at run time.
This segment can be further divided into an initialized read-only area and an initialized read-write area.

#include <stdio.h>

char c[]="rishabh tripathi";     /*  global variable stored in Initialized Data 
                                     Segment in read-write area*/
const char s[]="HackerEarth";    /* global variable stored in Initialized Data 
                                    Segment in read-only area*/

int main()
{
    static int i=11;          /* static variable stored in Initialized Data 
                                 Segment*/
    return 0;
}


3>Uninitialized Data Segment (bss) :-
Data in this segment is initialized to arithmetic 0 before the program starts executing. Uninitialized data starts at the end of the data segment and contains all global variables and static variables that are initialized to 0 or do not have explicit initialization in source code.
 
#include <stdio.h>

char c;               /* Uninitialized variable stored in bss*/

int main()
{
    static int i;     /* Uninitialized static variable stored in bss */
    return 0;
}
 
 
4>Heap :-
The heap is the segment where dynamic memory allocation usually takes place. When more memory is requested with functions such as malloc and calloc, the heap grows upward. The heap area is shared by all shared libraries and dynamically loaded modules in a process.

#include <stdio.h>
#include <stdlib.h>   /* malloc, free */

int main()
{
    char *p=(char*)malloc(sizeof(char));    /* memory allocated in the heap segment */
    free(p);                                /* release it when it is no longer needed */
    return 0;
}
 
5>Stack :-
The stack segment stores local variables and is used for passing arguments to functions, along with the return address of the instruction to be executed after the function call is over. Local variables are scoped to the block in which they are defined; they are created when control enters that block. Every function call, including each recursive call, adds a new frame to the stack.
The stack and heap are traditionally located at opposite ends of the process's virtual address space.
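
One rough way to see the five areas described above is to print the address of one object from each of them. The sketch below is only an illustration: the names show_layout, initialized_global and uninitialized_global are made up for this example, and the exact addresses (and even their relative order) depend on the platform, the compiler and address space randomization.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;    /* initialized data segment */
int uninitialized_global;       /* bss, zero-filled before main runs */

/* Prints one representative address per segment.
 * Returns 0 on success, -1 if the heap allocation fails. */
int show_layout(void)
{
    static int static_local = 7;                    /* initialized data segment */
    int  stack_local = 0;                           /* stack */
    int *heap_block  = malloc(sizeof *heap_block);  /* heap */

    if (heap_block == NULL)
        return -1;

    /* Converting a function pointer to void * is not strict ISO C,
     * but it is accepted on common POSIX platforms. */
    printf("text  : %p\n", (void *)show_layout);
    printf("data  : %p (global) %p (static local)\n",
           (void *)&initialized_global, (void *)&static_local);
    printf("bss   : %p\n", (void *)&uninitialized_global);
    printf("heap  : %p\n", (void *)heap_block);
    printf("stack : %p\n", (void *)&stack_local);

    free(heap_block);
    return 0;
}
```

Calling show_layout() from main prints five lines. On many systems the text, data and bss addresses are close together and low, while the stack address is far away and high, but this is an observation to check on your own machine rather than a guarantee.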

Monday 1 June 2015

"extern" keyword in C

Understanding of "extern" keyword in C 

So let me start with saying that extern keyword applies to C variables (data objects) and C functions. Basically extern keyword extends the visibility of the C variables and C functions. Though (almost) everyone knows the meaning of declaration and definition of a variable/function yet for the sake of completeness of this post, I would like to clarify them.  

Declaration of a variable/function:- Declaration of a variable/function simply declares that the variable/function exists somewhere in the program, but no memory is allocated for it. When a variable is declared, the program knows the data type of that variable. In the case of a function declaration, the program knows the arguments to that function, their data types, the order of the arguments and the return type of the function. So that’s all about declaration.

Definition of a variable/function:-Coming to the definition, when we define a variable/function, apart from the role of declaration, it also allocates memory for that variable/function. Therefore, we can think of definition as a super set of declaration. (or declaration as a subset of definition). From this explanation, it should be obvious that a variable/function can be declared any number of times but it can be defined only once.

Remember the basic principle that you can’t have two locations of the same variable/function.
Now coming back to our main objective: understanding the "extern" keyword in C. I’ve explained the role of declaration/definition because it’s mandatory to understand them to understand the “extern” keyword.

Use of extern with C functions:- By default, the declaration and definition of a C function have “extern” prepended with them. It means even though we don’t use extern with the declaration/definition of C functions, it is present there.

For example, when we write.
int foo(int arg1, char arg2);

There’s an extern present in the beginning which is hidden and the compiler treats it as below.
extern int foo(int arg1, char arg2);

Same is the case with the definition of a C function (Definition of a C function means writing the body of the function). Therefore whenever we define a C function, an extern is present there in the beginning of the function definition. Since the declaration can be done any number of times and definition can be done only once, we can notice that declaration of a function can be added in several C/H files or in a single C/H file several times. But we notice the actual definition of the function only once (i.e. in one file only). And as the extern extends the visibility to the whole program, the functions can be used (called) anywhere in any of the files of the whole program provided the declaration of the function is known. (By knowing the declaration of the function, C compiler knows that the definition of the function exists and it goes ahead to compile the program). So that’s all about extern with C functions.


Use of extern with C variables:- I feel that this is more interesting and informative than the previous case, where extern is present by default with C functions. So let me ask the question: how would you declare a C variable without defining it? Many of you will find it trivial, but it’s an important question for understanding extern with C variables. The answer goes as follows.

extern int var;
 
Here, an integer type variable called var has been declared (remember no definition i.e. no memory allocation for var so far). And we can do this declaration as many times as needed. (remember that declaration can be done any number of times)

Now how would you define a variable? I agree that it is the most trivial question in programming, and the answer is as follows.
int var;
 
Here, an integer type variable called var has been declared as well as defined. (remember that definition is the super set of declaration). Here the memory for var is also allocated. Now here comes the surprise, when we declared/defined a C function, we saw that an extern was present by default. While defining a function, we can prepend it with extern without any issues. But it is not the case with C variables. If we put the presence of extern in variable as default then the memory for them will not be allocated ever, they will be declared only. Therefore, we put extern explicitly for C variables when we want to declare them without defining them.
Also, as the extern extends the visibility to the whole program, by externing a variable we can use the variables anywhere in the program provided we know the declaration of them and the variable is defined somewhere.

Now let us try to understand extern with examples.

Example 1:
int var;
int main(void)
{
 var = 10; return 0;
}
Analysis: This program is compiled successfully. Here var is defined (and declared implicitly) globally.

Example 2:
extern int var;
int main(void)
{ return 0; }
Analysis: This program is compiled successfully. Here var is declared only. Notice var is never used so no problems.

Example 3:
extern int var;
int main(void)
{ var = 10; return 0; }

Analysis: This program fails to build. The compiler accepts it, but the linker reports an undefined reference, because var is declared but not defined anywhere. Essentially, var isn’t allocated any memory, and the program is trying to change the value to 10 of a variable that doesn’t exist at all.

Example 4:
#include "somefile.h"
extern int var;
int main(void)
{
  var = 10; return 0;
}
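Analysis: Supposing that somefile.h contains the definition of var, this program will be compiled successfully. (In larger projects the definition normally lives in exactly one .c file, because a header that defines var causes duplicate definitions when it is included from several files.)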

Example 5:
extern int var = 0;
int main(void)
{
   var = 10; return 0;
}
Analysis: Do you think this program will work? Well, here comes another surprise from the C standard. It says that if a variable is declared with extern and an initializer is also provided with that declaration, then memory for that variable is allocated, i.e. the variable is considered defined. Therefore, as per the C standard, this program will compile successfully and work, although some compilers, such as GCC, emit a warning about an initialized extern declaration.

So that was a preliminary look at “extern” keyword in C.

In short, we can say:-
  1. Declaration can be done any number of times but definition only once.
  2. The “extern” keyword is used to extend the visibility of variables/functions.
  3. Since functions are visible throughout the program by default, the use of extern is not needed in a function declaration/definition. Its use is redundant.
  4. When extern is used with a variable, it’s only declared not defined.
  5. As an exception, when an extern variable is declared with initialization, it is taken as definition of the variable as well.
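
The points above can be tried in a single source file. The sketch below is only an illustration: the names shared_count and bump are hypothetical, and in a real project the extern declaration would normally sit in a header while the definition lives in one .c file.

```c
extern int shared_count;   /* declaration only: no storage is reserved */
extern int shared_count;   /* a declaration may be repeated any number of times */

int bump(void);            /* function declaration; extern is implied */

/* Uses shared_count through the declarations above, before the
 * compiler has seen its definition. */
int bump(void)
{
    return ++shared_count;
}

int shared_count = 40;     /* the single definition: storage is reserved here */
```

Calling bump() twice returns 41 and then 42. If the last line is deleted, the file still compiles, but linking fails with an undefined reference to shared_count, which is the situation in Example 3.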

Debugging with "gdb"

Gdb is a debugger for C and C++ programs. It allows you to do things like run the program up to a certain point then stop and print out the values of certain variables at that point, or step through the program one line at a time and print out the values of each variable after executing each line. It uses a command line interface.

Most commonly used GDB commands:-

help :- Gdb provides online documentation. Just typing help will give you a list of topics. Then you can type help topic to get information about that topic (or it will give you more specific terms that you can ask for help about). Or you can just type help command and get information about any other command.

file:- file executable specifies which program you want to debug.

run :- run will start the program running under gdb. The program that starts will be the one that you previously selected with the file command, or on the unix command line when you started gdb. You can give command line arguments to your program on the gdb command line the same way you would on the unix command line, except that you say run instead of the program name:

run 2048 24 4
You can even do input/output redirection: run > outfile.txt.

break :- A ``breakpoint'' is a spot in your program where you would like to temporarily stop execution in order to check the values of variables, or to try to find out where the program is crashing, etc. To set a breakpoint you use the break command.

break function sets the breakpoint at the beginning of function. If your code is in multiple files, you might need to specify filename:function.
break linenumber or break filename:linenumber sets the breakpoint to the given line number in the source file. Execution will stop before that line has been executed.

delete:- delete will delete all breakpoints that you have set. delete number will delete the breakpoint numbered number. You can find out what number each breakpoint is by doing info breakpoints. (The command info can also be used to find out a lot of other stuff. Do help info for more information.)

clear:- clear function will delete the breakpoint set at that function. Similarly for linenumber, filename:function, and filename:linenumber.

continue :- continue will set the program running again, after you have stopped it at a breakpoint.

step :- step will go ahead and execute the current source line, and then stop execution again before the next source line.

next :- next will continue until the next source line in the current function (actually, the current innermost stack frame, to be precise). This is similar to step, except that if the line about to be executed is a function call, then that function call will be completely executed before execution stops again, whereas with step execution will stop at the first line of the function that is called.

until :- until is like next, except that if you are at the end of a loop, until will continue execution until the loop is exited, whereas next will just take you back up to the beginning of the loop. This is convenient if you want to see what happens after the loop, but don't want to step through every iteration.

list:- list linenumber will print out some lines from the source code around linenumber. If you give it the argument function it will print out lines from the beginning of that function. Just list without any arguments will print out the lines just after the lines that you printed out with the previous list command.

print :- print expression will print out the value of the expression, which could be just a variable name. To print out the first 25 (for example) values in an array called list, do

print list[0]@25
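
To try the array form of print, a small function like the following can help. It is a sketch with a hypothetical name (fill_squares); the caller supplies the array, which is assumed here to be called list, as in the text.

```c
#include <stddef.h>

/* Stores i*i in list[i] for every index below len. */
void fill_squares(int *list, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        list[i] = (int)(i * i);
}
```

With int list[25]; filled by fill_squares(list, 25) and the program stopped after the call, print list[0]@25 shows all 25 elements, and print list[24] shows 576.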


Let's take an example and debug it.
The program below counts how many numbers in a given range divide a given number.

Example:
If the parameters passed to the function checkDivisible are 100 (num), 1 (startRange) and 14 (endRange), it returns count = 5, because 100 is divisible by 1, 2, 4, 5 and 10 in the range 1 to 14.

//countDivisible.c

#include<stdio.h>

int checkDivisible(int num,int startRange,int endRange)
{
        int count=0,i=0;
        for (i = startRange ; i <=endRange; i++)
        {
                if (num%i == 0)
                        count++;
        }
        return count;
}

int main()
{

        int number = 1000,count=0,startRange=0,endRange=10;
        count = checkDivisible(number,startRange,endRange);
        printf("Total no count which divide number[%d] in range:[%d] to [%d] are:%d",
               number,startRange,endRange,count);

        return 0;
}
 

Recompile the program for debugging and start debugging.
 
$cc -g -o countDivisible  countDivisible.c
-g is used to compile it in debugging mode.
 
 Now start GDB:-
$gdb countDivisible
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db 
library "/lib/libthread_db.so.1".
(gdb)
 
  
Set breakpoints and run the program (a step-by-step explanation is given in the coming sections):
 
(gdb) break main
Breakpoint 1 at 0x80483cc: file countDivisible.c, line 17.
(gdb) run
Starting program: /iiidb/work/ram/hackerEarth/countDivisible

Breakpoint 1, main () at countDivisible.c:17
17              int number = 1000,count=0,startRange=0,endRange=10;
(gdb) n
18              count = checkDivisible(number,startRange,endRange);
(gdb) s
checkDivisible (num=1000, startRange=0, endRange=10) at countDivisible.c:5
5               int count=0,i=0;
(gdb) p startRange
$1 = 0
(gdb) p endRange
$2 = 10
(gdb) p num
$3 = 1000
(gdb) list
1       #include<stdio.h>
2
3       int checkDivisible(int num,int startRange,int endRange)
4       {
5               int count=0,i=0;
6               for (i = startRange ; i <=endRange; i++)
7               {
8                       if (num%i == 0)
9                               count++;
10              }
(gdb) n
6               for (i = startRange ; i <=endRange; i++)
(gdb) n
8                       if (num%i == 0)
(gdb) n

Program received signal SIGFPE, Arithmetic exception.
0x08048399 in checkDivisible (num=1000, startRange=0, endRange=10) at 
countDivisible.c:8
8                       if (num%i == 0)
(gdb) bt
#0  0x08048399 in checkDivisible (num=1000, startRange=0, endRange=10) at 
countDivisible.c:8
#1  0x080483f6 in main () at countDivisible.c:18
(gdb) cont
Continuing.

Program terminated with signal SIGFPE, Arithmetic exception.
The program no longer exists.
(gdb) quit
$
 
gdb online help:-
gdb has extensive online help
 
(gdb) help
List of classes of commands:
aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands
Type help followed by a class name for a list of commands in that class.
Type help all for the list of all commands.
Type help followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb)
 
 
Setting breakpoints:-
You can stop the program at any point by setting breakpoints. These cause the program to stop and return control to the debugger. You’ll be able to inspect variables and then allow the program to continue.
A number of commands are used for setting breakpoints. These are listed by gdb with help breakpoint

(gdb)help breakpoint
Making program stop at certain points.
List of commands:
awatch -- Set a watchpoint for an expression
break -- Set breakpoint at specified line or function
catch -- Set catchpoints to catch events
clear -- Clear breakpoint at specified line or function
commands -- Set commands to be executed when a breakpoint is hit
condition -- Specify breakpoint number N to break only if COND is true
delete -- Delete some breakpoints or auto-display expressions
delete breakpoints -- Delete some breakpoints or auto-display expressions
delete checkpoint -- Delete a fork/checkpoint (experimental)
delete mem -- Delete memory region
delete tracepoints -- Delete specified tracepoints
disable -- Disable some breakpoints
disable breakpoints -- Disable some breakpoints
disable display -- Disable some expressions to be displayed when program stops
disable mem -- Disable memory region
disable tracepoints -- Disable specified tracepoints
enable -- Enable some breakpoints
enable delete -- Enable breakpoints and delete when hit
enable display -- Enable some expressions to be displayed when program stops
enable mem -- Enable memory region
enable once -- Enable breakpoints for one hit
enable tracepoints -- Enable specified tracepoints
hbreak -- Set a hardware assisted breakpoint
ignore -- Set ignore-count of breakpoint number N to COUNT
rbreak -- Set a breakpoint for all functions matching REGEXP
rwatch -- Set a read watchpoint for an expression
tbreak -- Set a temporary breakpoint
tcatch -- Set temporary catchpoints to catch events
thbreak -- Set a temporary hardware assisted breakpoint
watch -- Set a watchpoint for an expression
Type help followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".

 
Command name abbreviations are allowed if unambiguous.
You can set a breakpoint at any function in the source file. In our example we have set a breakpoint at the main function, which is why the program stops at main when we run it.

(gdb) b main
Breakpoint 1 at 0x80483cc: file countDivisible.c, line 17.
(gdb) r
Starting program: /iiidb/work/ram/hackerEarth/countDivisible

Breakpoint 1, main () at countDivisible.c:17
17              int number = 1000,count=0,startRange=0,endRange=10;
(gdb)

You can also set a breakpoint at any line in the source code. In our example, we can set a breakpoint at main as countDivisible.c:14 (<source file name>:<line number>).

(gdb) break <fileName>:<line#>
(gdb) break countDivisible.c:14
 
Both breakpoints stop the program at the start of main.

Running a program:-
You can execute the program with the run command. Any arguments that you give to the run command are passed to the program as its arguments.
In our example:

(gdb) run
Starting program: /iiidb/work/ram/hackerEarth/countDivisible

The program fails as before. When the program faults, gdb shows the reason and the location. You can now investigate the underlying cause of the problem.

8                       if (num%i == 0)
(gdb) n

Program received signal SIGFPE, Arithmetic exception.
0x08048399 in checkDivisible (num=1000, startRange=0, endRange=10) at 
countDivisible.c:8 //shows fault reason: the exact line number of source file 
8                       if (num%i == 0)

Here an arithmetic exception occurred (signal SIGFPE), not a segmentation fault: on the first pass through the loop i is 0, so num%i divides by zero. gdb gives us the fault information (exact line number and source file).
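
Once gdb has shown that the fault happens at num%i with a range that starts at 0, one possible fix is to skip zero before dividing. The following is a sketch of that idea, not the original author's fix; the name checkDivisibleSafe is made up for this illustration.

```c
/* Counts the values i in [startRange, endRange] that divide num exactly.
 * Zero is skipped because num % 0 is undefined and raises SIGFPE on
 * many platforms. */
int checkDivisibleSafe(int num, int startRange, int endRange)
{
    int count = 0, i;

    for (i = startRange; i <= endRange; i++)
    {
        if (i == 0)
            continue;           /* never divide by zero */
        if (num % i == 0)
            count++;
    }
    return count;
}
```

With this guard, checkDivisibleSafe(1000, 0, 10) returns 6 (for 1, 2, 4, 5, 8 and 10) instead of crashing, and checkDivisibleSafe(100, 1, 14) still returns 5.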

Stack Trace:-
You can see the chain of function calls that led to the current position by using the backtrace command:
In our example:

(gdb) backtrace
#0  0x08048399 in checkDivisible (num=1000, startRange=0, endRange=10) at 
countDivisible.c:8
#1  0x080483f6 in main () at countDivisible.c:18
(gdb)

It shows the chain of function calls, innermost first.
1) Frame #0 is checkDivisible() at countDivisible.c:8, where the fault occurred.
2) Frame #1 is main() at countDivisible.c:18, the line that called checkDivisible().
This is a very simple program, and the trace is short because you haven’t called many functions from within other functions. A backtrace can be very useful when debugging functions that are called from many different places.
The backtrace command may be abbreviated bt, and, for compatibility with other debuggers, the where command has the same function.

Go to next line:-
You can go to the next line by using the 'n' (next) command.

Examining variables:-
The print command shows the content of a variable.
 
(gdb) print var
$1 = 10
The set variable command is used to set the value of a variable.
 
(gdb) set variable var = 50
(gdb) print var
$2 = 50
Here the value of var has been changed from 10 to 50.

Go inside the function definition:-
While debugging, if control reaches a function call and you want to go inside the function definition, you can do so using the 's' (step) command. This is how we went inside the function in our program.

(gdb) n
18              count = checkDivisible(number,startRange,endRange);
(gdb) s
checkDivisible (num=1000, startRange=0, endRange=10) at countDivisible.c:5
5               int count=0,i=0;
(gdb)
 
Empty command:-
All versions support an “empty command”; hitting Enter executes the last command again. This is especially useful when stepping through a program line by line with the step or next commands.

Listing a program:-
You can view the source code of the program from within gdb by using the list command. This prints out a portion of the code around the current position. Subsequent uses of list will print out more. You can also give list a function name as an argument and it will show the code at that position, or a pair of line numbers and it will list the code between those line numbers.

(gdb) list
1       #include<stdio.h>
2
3       int checkDivisible(int num,int startRange,int endRange)
4       {
5               int count=0,i=0;
6               for (i = startRange ; i <=endRange; i++)
7               {
8                       if (num%i == 0)
9                               count++;
10              }
(gdb)

Continue:-
To resume execution, use the command cont or c. The program leaves the breakpoint and runs until the next breakpoint, a fault, or the end of the program. In our example:-
(gdb) cont
Continuing.

Program terminated with signal SIGFPE, Arithmetic exception.
The program no longer exists.
(gdb)
 
Exit from gdb:-
To exit from gdb, use quit command.
(gdb) quit
 
Set print elements 0:-
Set a limit on how many elements of an array GDB will print. If GDB is printing a large array, it stops printing after it has printed the number of elements set by the set print elements command. This limit also applies to the display of strings. When GDB starts, this limit is set to 200. Setting number-of-elements to zero means that the printing is unlimited.
(gdb) set print elements 0

Debugging a C program with command line argument:-

Let's take an example: a program that adds two integers which are passed to it on the command line.

#include<stdio.h>
#include<stdlib.h>   /* exit, atoi */

int main(int argc, char * argv[]) 
{
   int i, sum = 0;

   if (argc != 3) {
      printf("You have missed numbers to pass in command line.");
      exit(1);
   }

   for (i = 1; i < argc; i++)
   {
      printf("Argument number %d passed is:%d\n",i,atoi(argv[i]));
    }

    printf("The sum is : ");

   for (i = 1; i < argc; i++)
      sum = sum + atoi(argv[i]);

   printf("%d", sum);

   return 0;
}

Here argc is the number of arguments passed to the program on the command line (including the program name),
and argv[] is the string array containing the passed arguments.

compile this program in debug mode:-

$cc -g -o commandLine commandLine.c
-g is used to compile it in debugging mode.

Debug this program with gdb:-

$gdb commandLine 
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) b main
Breakpoint 1 at 0x80483ec: file
commandLine.c, line 6.

 /* Now Run this program and pass the required command line arguments with run */ 

(gdb) run 10 20

Starting program: /iiidb/work/ram/python/commandLine 10 20

Breakpoint 1, main (argc=3, argv=0xbff669a4) at
commandLine.c:6
6          int i, sum = 0;
(gdb) n
8          if (argc != 3) {

/* Here you can print the no of arguments passed to program */ 

(gdb) p argc
$1 = 3
(gdb) n
14         for (i = 1; i < argc; i++)
(gdb) n
17            printf("Argument number %d passed is:%d\n",i,atoi(argv[i]));
(gdb) n
Argument number 1 passed is:10
14         for (i = 1; i < argc; i++)

/* Here you can see the passed arguments in argv[i] */

(gdb) p atoi(argv[1])
$2 = 10
(gdb) p atoi(argv[2])

$3 = 20
(gdb) n
17            printf("Argument number %d passed is:%d\n",i,atoi(argv[i]));
(gdb) n
Argument number 2 passed is:20
14         for (i = 1; i < argc; i++)
(gdb) n
23          printf("The sum is : ");
(gdb) n
25         for (i = 1; i < argc; i++)
(gdb) n
26            sum = sum + atoi(argv[i]);
(gdb) n
25         for (i = 1; i < argc; i++)
(gdb) n
26            sum = sum + atoi(argv[i]);
(gdb) n
25         for (i = 1; i < argc; i++)
(gdb) n
28         printf("%d", sum);
(gdb) n
30      }
(gdb) p sum
$4 = 30
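
The session above steps through both loops inside main. If the summing logic is moved into its own function, it becomes easy to set a breakpoint on just that part and to check it separately. A minimal sketch, assuming a hypothetical helper name sum_args and the same atoi-based conversion as the program above:

```c
#include <stdlib.h>   /* atoi */

/* Adds argv[1] .. argv[argc-1], converting each with atoi.
 * atoi returns 0 for text that is not a number, so bad input
 * silently counts as 0 here. */
int sum_args(int argc, char *argv[])
{
    int i, sum = 0;

    for (i = 1; i < argc; i++)
        sum += atoi(argv[i]);
    return sum;
}
```

If main calls sum_args(argc, argv), then break sum_args followed by run 10 20 stops at the start of the helper, and finish shows the returned value, which is 30 for these arguments.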


Debugging programs with multiple processes:-

GDB provides support for debugging programs that create additional processes using the fork or vfork function. By default, when a program forks, GDB will continue to debug the parent process and the child process will run unimpeded.
If you want to follow the child process instead of the parent process, use the command set follow-fork-mode mode.
Here mode can be:-
parent: In this mode GDB will continue debugging the parent process if a fork() or vfork() is called. This is the default mode.
child: In this mode GDB will switch to the child process if a fork() or vfork() is called.
Syntax:-
set follow-fork-mode parent
set follow-fork-mode child
show follow-fork-mode
 
Default mode:-
The default value for the follow-fork-mode setting is 'parent'.
Remarks:-
If you have set detach-on-fork to off, GDB will keep both the parent and the child process under its control. Use the info inferiors command to show details and the inferior command to switch between them.
Example:-

//In this example we will debug the following C++ program:
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

void func(int pid, int ret)
{
    printf("My PID is %d, fork() returned %d\n", pid, ret);

    if (ret)
        printf("We are in the parent process\n");
    else
        printf("We are in the child process\n");
}

int main()
{
    int r = fork();
    func(getpid(), r);
    return 0;
}

First we will debug the program with the default setting for follow-fork-mode:
 
(gdb) break main
Breakpoint 1 at 0x804848f: file forktest.cpp, line 17.
(gdb) run
Starting program: /home/testuser/forktest

Breakpoint 1, main () at forktest.cpp:17
17 int r = fork();
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) break func
Breakpoint 2 at 0x804844a: file forktest.cpp, line 7.
(gdb) continue
Continuing.

Breakpoint 2, func (pid=7975, ret=7980) at forktest.cpp:7
7 printf("My PID is %d, fork() returned %d\n", pid, ret);
(gdb)
My PID is 7980, fork() returned 0
We are in the child process
(gdb) continue
Continuing.
My PID is 7975, fork() returned 7980
We are in the parent process
[Inferior 1 (process 7975) exited normally]
 
As GDB was configured to continue debugging the parent process, the child process produced the 'We are in the child process' text while GDB was stopped at a breakpoint in the parent process. When we issued the continue command, the parent process printed its message and exited.
Now we will see what happens when GDB is configured to switch to the child process:

(gdb) break main
Breakpoint 1 at 0x804848f: file forktest.cpp, line 17.
(gdb) run
Starting program: /home/testuser/forktest

Breakpoint 1, main () at forktest.cpp:17
17 int r = fork();
(gdb) set follow-fork-mode child
(gdb) break func
Breakpoint 2 at 0x804844a: file forktest.cpp, line 7.
(gdb) continue
Continuing.
My PID is 8025, fork() returned 8029
We are in the parent process[New process 8029]
[Switching to process 8029]

Breakpoint 2, func (pid=8029, ret=0) at forktest.cpp:7
7 printf("My PID is %d, fork() returned %d\n", pid, ret);
(gdb) continue
Continuing.
My PID is 8029, fork() returned 0
We are in the child process
[Inferior 2 (process 8029) exited normally]
  
GDB has now switched to the child process, leaving the parent process to run in the background. The value of ret in the breakpoint message was 0, indicating that we are in the child process.
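
In forktest the parent and the child can finish in either order. When experimenting with follow-fork-mode it can be convenient for the parent to wait for the child, so that the two are easier to tell apart. The sketch below is one way to do that with waitpid; the function name fork_and_wait and the exit status 7 are arbitrary choices for illustration and do not come from the example above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child prints a line and exits with status 7;
 * the parent waits for it and returns that status (-1 on error). */
int fork_and_wait(void)
{
    int status;
    pid_t pid;

    fflush(stdout);                 /* avoid duplicating buffered output */
    pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0) {                 /* child process */
        printf("child %d running\n", (int)getpid());
        fflush(stdout);
        _exit(7);                   /* leave without running parent cleanup */
    }

    /* parent process: block until the child has exited */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With follow-fork-mode set to child, a breakpoint on the printf line should be hit in the child; with the default parent mode, gdb stays in the process that calls waitpid.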

Debugging a multi-threaded application:-

------------------------------------------------------

Notification on Thread Creation

When GDB detects that a new thread is created, it displays a message specifying the thread's identification on the current system. This identification, known as the systag, varies from platform to platform. Here is an example of this notification:
Starting program: /home/user/threads
[Thread debugging using libthread_db enabled]
[New Thread -151132480 (LWP 4445)]
[New Thread -151135312 (LWP 4446)]
Keep in mind that the systag is the operating system's identification for a thread, not GDB's. GDB assigns each thread a unique number that identifies it for debugging purposes.

Getting a List of All Threads in the Application

GDB provides the generic info command to get a wide variety of information about the program being debugged. It is no surprise that a subcommand of info would be info threads. This command prints a list of threads running in the system:
(gdb) info threads
2 Thread -151135312 (LWP 4448)  0x00905f80 in vfprintf ()   from /lib/tls/libc.so.6
* 1 Thread -151132480 (LWP 4447)  main () at threads.c:27
The info threads command displays a table that lists three properties of the threads in the system:

The thread number attached to the thread by GDB.
The systag value
The current stack frame for that thread.

The currently active thread is denoted by GDB with the * symbol. The thread number is used in all other commands in GDB.
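
The threads.c program behind the output above is not shown in this post. To experiment with info threads, a small pthread program along the following lines can be used. It is only a sketch: the names worker, run_workers, counter and MAX_WORKERS are hypothetical, and it assumes a POSIX threads implementation.

```c
#include <pthread.h>

#define MAX_WORKERS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;                /* shared by all worker threads */

/* Each worker adds 1000 to the shared counter, one step at a time. */
static void *worker(void *arg)
{
    int i;

    (void)arg;                      /* unused */
    for (i = 0; i < 1000; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Starts n worker threads (at most MAX_WORKERS), waits for them,
 * and returns the final counter value, or -1 on error. */
long run_workers(int n)
{
    pthread_t tid[MAX_WORKERS];
    int i, started = 0;

    if (n < 1 || n > MAX_WORKERS)
        return -1;

    counter = 0;
    for (i = 0; i < n; i++) {
        if (pthread_create(&tid[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return started == n ? counter : -1;
}
```

Build it with cc -g -pthread and call run_workers(2) from main. A breakpoint on worker followed by info threads should then list the main thread together with the workers that have been started so far.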

Setting Thread-Specific Breakpoints

GDB allows users who are debugging multithreaded applications to choose whether to set a breakpoint on all threads or on a particular thread. Much like the info command, this capability is enabled via an extended parameter that's specified in the break command. The general form of this instruction is:

break linespec thread threadnum

where linespec is the standard gdb syntax for specifying a breakpoint, and threadnum is the thread number obtained from the info threads command. If the thread threadnum arguments are omitted, the breakpoint applies to all threads in your program. Thread-specific breakpoints can be combined with conditional breakpoints:

(gdb) break buffer.c:33 thread 7 if level > watermark

Note that stopping on a breakpoint stops all threads in your program. Generally speaking, this is a desirable effect: it allows a developer to examine the entire state of an application and to switch the current thread.

Developers should keep certain behaviors in mind, however, when using breakpoints from within GDB. The first issue is related to how system calls behave when they are interrupted by the debugger. Consider a program with two threads. The first thread is in the middle of a system call when the second thread reaches a breakpoint. When the breakpoint is triggered, the system call may return early, because GDB uses signals to manage breakpoints and a signal may cause a system call to return prematurely. For example, suppose thread 1 is executing the call sleep(30). When the breakpoint in thread 2 is hit, the sleep call can return, regardless of how long the thread has actually slept. To avoid unexpected behavior due to system calls returning prematurely, it is advisable to check the return values of all system calls and handle this case.

In this example, sleep() returns the number of seconds left to sleep. This call can be placed inside of a loop to guarantee that the sleep has occurred for the amount of time specified. This is shown in Listing Eight.

Listing Eight: Proper Error Handling of System Calls.


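/* sleep() is declared in <unistd.h> and returns an unsigned int. */
unsigned int sleep_duration = 30;
do
{
   /* If a signal interrupts sleep(), it returns the seconds still left. */
   sleep_duration = sleep(sleep_duration);
} while (sleep_duration > 0);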
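
The same idea carries over to other calls that can be cut short by a signal. As a rough sketch, here is a variant built on the POSIX nanosleep() call, which reports an interruption by returning -1 with errno set to EINTR and stores the unslept time in its second argument. The helper name sleep_fully is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for the whole interval in *req, restarting
   nanosleep() whenever a signal (for example one used by the debugger)
   interrupts it. Returns 0 on success, -1 on any error other than EINTR. */
int sleep_fully(const struct timespec *req)
{
    struct timespec remaining = *req;

    /* On EINTR, nanosleep() writes the time still left into its second
       argument, so the next iteration sleeps only for the remainder. */
    while (nanosleep(&remaining, &remaining) == -1) {
        if (errno != EINTR)
            return -1;   /* a real error, such as an invalid interval */
    }
    return 0;
}
```

Checking errno explicitly keeps genuine failures (such as an invalid interval) from being silently retried forever.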
 


The second point to keep in mind is that GDB does not single step all threads in lockstep. Therefore, when single-stepping a line of code in one thread, you may end up executing a lot of code in other threads prior to returning to the thread that you are debugging. If you have breakpoints in other threads, you may suddenly jump to those code sections. On some OSs, GDB supports a scheduler locking mode via the set scheduler-locking command. This allows a developer to specify that the current thread is the only thread that should be allowed to run.

Switching Between Threads

In GDB, the thread command may be used to switch between threads. It takes a single parameter, the thread number returned by the info threads command. Here is an example of the thread command:
(gdb) thread 2
[Switching to thread 2 (Thread -151135312 (LWP 4549))]#0  PrintThreads (num=0xf6fddbb0) at threads.c:39
39      {
(gdb) info threads
* 2 Thread -151135312 (LWP 4549)  PrintThreads (num=0xf6fddbb0) at threads.c:39
  1 Thread -151132480 (LWP 4548)  main () at threads.c:27
(gdb)
In this example, the thread command makes thread number 2 the active thread.

Applying a Command to a Group of Threads

The thread command supports a single subcommand apply that can be used to apply a command to one or more threads in the application. The thread numbers can be supplied individually, or the special keyword all may be used to apply the command to all threads in the process, as illustrated in the following example:

(gdb) thread apply all bt
Thread 2 (Thread -151135312 (LWP 4549)):
#0  PrintThreads (num=0xf6fddbb0) at threads.c:39
#1  0x00b001d5 in start_thread () from /lib/tls/libpthread.so.0
#2  0x009912da in clone () from /lib/tls/libc.so.6
Thread 1 (Thread -151132480 (LWP 4548)):
#0  main () at threads.c:27
39      {
(gdb)
The GDB backtrace (bt) command is applied to all threads in the system. In this scenario, this command is functionally equivalent to: thread apply 2 1 bt.

Key Points

This article described a number of general purpose debugging techniques for multithreaded applications. To sum up:
  • Proper software engineering principles should be followed when writing and developing robust multithreaded applications.
  • When trying to isolate a bug in a multithreaded application, it is useful to have a log of the sequence of events that led up to the failure. A trace buffer is a simple mechanism that allows programmers to store this event information.
  • Bracket events that are logged in the trace buffer with "before" and "after" messages to determine the order in which the events occurred.
  • Running the application in the debugger may alter its timing, masking potential race conditions.
  • Tracepoints can be a useful way to log or record the sequence of events as they occur.
  • For advanced debugging, consider using the Intel software tools, specifically, the Intel Debugger, the Intel Thread Checker, and the Intel Thread Profiler.
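
To make the trace buffer idea from the key points a little more concrete, here is a very small sketch of one possible shape for it: a fixed-size ring of messages indexed by an atomic sequence counter. All names (trace_log, trace_get, TRACE_SLOTS, TRACE_MSG_LEN) are assumptions made for this example. It is deliberately simplified: slots are not protected against a reader racing with a writer, so it is meant to be inspected after the failure, for example from the debugger.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define TRACE_SLOTS   64   /* hypothetical capacity of the ring */
#define TRACE_MSG_LEN 48   /* hypothetical maximum message size */

struct trace_event {
    unsigned long seq;         /* global order in which the event was logged */
    char msg[TRACE_MSG_LEN];
};

static struct trace_event trace_buf[TRACE_SLOTS];
static atomic_ulong trace_next;   /* next sequence number to hand out */

/* Record one event and return its sequence number. Once the ring is
   full, the oldest entry is overwritten. */
unsigned long trace_log(const char *msg)
{
    unsigned long seq = atomic_fetch_add(&trace_next, 1);
    struct trace_event *ev = &trace_buf[seq % TRACE_SLOTS];

    ev->seq = seq;
    strncpy(ev->msg, msg, TRACE_MSG_LEN - 1);
    ev->msg[TRACE_MSG_LEN - 1] = '\0';
    return seq;
}

/* Return the message logged under `seq`, or NULL if it was never logged
   or has already been overwritten. */
const char *trace_get(unsigned long seq)
{
    const struct trace_event *ev = &trace_buf[seq % TRACE_SLOTS];

    if (seq >= atomic_load(&trace_next) || ev->seq != seq)
        return NULL;
    return ev->msg;
}
```

Bracketing an operation then amounts to calling trace_log("before lock") just ahead of it and trace_log("after lock") just after it; the sequence numbers give the order in which the threads reached those points.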