wiki:kernel_debugging_UML

The goal of this how-to is to explain how the kernel and kernel modules can be debugged just as if they were user programs. To do so, the kernel must be compiled in User Mode (UML). More information is available at http://user-mode-linux.sourceforge.net/debugging.html

We divide the work in seven PARTS:

  1. Patching host linux (already done in our SVN repository!)
  2. Compile guest linux in UM
  3. Use the debugger (i.e. gdb or its graphical front end ddd)
  4. Example to use breakpoint in the kernel
  5. Example to debug TFC as part of the kernel
  6. Example to debug tfc modules
  7. Debug Netkit lab, attach to process, etc.


1 - Patching host linux for easier debugging

Note: I think debugging works even without patching the host kernel. The SKAS3 patch makes the VM run faster and makes debugging easier, so it is still a good idea.

Note: You only need this if you start from a vanilla linux kernel (one from kernel.org). In our repository, TFC linux 2.6.20.7 is already skas3-patched starting from SVN revision r271. If you use that, you can skip this part and go to step 2.

Suppose you have a vanilla linux 2.6.20.7 (newer versions should work as well). It must be patched to enable skas3. More information about skas3 is at http://user-mode-linux.sourceforge.net/skas.html
In discreet/tfcproject/trunk, create the skas3 and build_skas3 directories and enter the first one:

mkdir skas3
mkdir build_skas3
cd skas3

Get the patch skas-2.6.20-v8-2.patch.bz2 from http://www.user-mode-linux.org/~blaisorblade/
More information is at http://www.user-mode-linux.org/~blaisorblade/howtoapply.html
Download it and uncompress it in the same directory (bunzip2 deletes the compressed file):

wget http://www.user-mode-linux.org/~blaisorblade/patches/skas3-2.6/skas-2.6.20-v8.2/skas-2.6.20-v8-2.patch.bz2
bunzip2 skas-2.6.20-v8-2.patch.bz2

Copy the whole content of linux-2.6.20.7-TFC into skas3/ and apply the patch:

cd ..
cp -r linux-2.6.20.7-TFC/. skas3/
cd skas3
patch -p1 < skas-2.6.20-v8-2.patch
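Before modifying the tree, it may be worth checking that the patch applies cleanly. The following is only a sketch: it assumes a patch program that supports the --dry-run option (GNU patch does), and apply_patch_checked is a hypothetical helper name, not something shipped with the kernel or with Netkit. The patch path must be absolute because the helper changes directory.

```shell
# Hypothetical helper: apply a -p1 patch only if a dry run succeeds.
# $1 = source tree directory, $2 = absolute path to the uncompressed patch.
# The body runs in a subshell, so the caller's directory is not changed.
apply_patch_checked() (
    src_dir=$1
    patch_file=$2
    cd "$src_dir" || exit 1
    # --dry-run reports what would happen without touching any file.
    if ! patch -p1 --dry-run < "$patch_file" > /dev/null; then
        echo "patch does not apply cleanly, tree left untouched" >&2
        exit 1
    fi
    patch -p1 < "$patch_file"
)
```

It would be called with the skas3 directory as first argument and the full path of the uncompressed patch file as second argument.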

If everything goes fine, you should see this:

patching file arch/i386/Kconfig 
patching file arch/i386/kernel/ldt.c 
patching file arch/i386/kernel/ptrace.c 
patching file arch/i386/kernel/sys_i386.c 
patching file arch/um/include/skas_ptrace.h 
patching file include/asm-i386/desc.h  
patching file include/asm-i386/mmu_context.h 
patching file include/asm-i386/ptrace.h 
patching file include/linux/mm.h 
patching file include/linux/proc_mm.h 
patching file localversion-skas 
patching file mm/Makefile
patching file mm/mmap.c 
patching file mm/mprotect.c 
patching file mm/proc_mm.c 

Now configure the kernel (ARCH=um is not needed here: this is a host kernel!),
enable /proc/mm under the "Processor type and features" menu if needed,
save the new configuration and build it.

gedit Makefile

and change its first lines to:
KBUILD_OUTPUT:=../build_skas3
#ARCH:=um
#INSTALL_MOD_PATH:=$(NETKIT_HOME)/kernel/modules

Check whether /proc/mm is already enabled (CONFIG_PROC_MM=y):

make xconfig

Under "Processor type and features", check that proc mm is selected.
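The same check can also be done without the graphical tool by looking at the generated .config file. This is a minimal sketch: config_enabled is a hypothetical helper name, and it assumes that the .config file sits in the KBUILD_OUTPUT directory (../build_skas3 in this setup).

```shell
# Hypothetical helper: succeed if a kernel option is set to "y" in a .config file.
# $1 = path to the .config file, $2 = option name (e.g. CONFIG_PROC_MM).
config_enabled() {
    grep -q "^$2=y\$" "$1"
}
```

For example: config_enabled ../build_skas3/.config CONFIG_PROC_MM && echo "proc mm is enabled"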
Now compile and install the host kernel:

make
make modules_install
make install

Reboot the system into the new kernel.

2 - Compile guest linux in UM
This topic is covered very well by another wiki page: follow NetkitTutorial.
Here is just a short index of the things to do.

First install netkit, i.e. unpack its 3 archives (netkit, kernel and filesystem) with tar (xvfz for .tar.gz files, xvfj for .tar.bz2 files).
Note that you have to choose the newest filesystem if you want networking to work fully. At the time of writing this is netkit-filesystem-F3.0a.tar.bz2

wget http://www.netkit.org/download/netkit-filesystem/netkit-filesystem-F3.0a.tar.bz2

You have to compile the new kernel with ARCH=um. The easiest way is to add these lines to the kernel Makefile:

ARCH:=um
INSTALL_MOD_PATH:=$(NETKIT_HOME)/kernel/modules
KBUILD_OUTPUT:=../build_um

Now you can run make and make modules_install.
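As an alternative to editing the Makefile, the kernel build system also accepts these settings on the make command line, where O= is the command-line form of KBUILD_OUTPUT. The sketch below only prints the commands so they can be reviewed first; uml_build_commands is a hypothetical helper name, and the result should be equivalent to the Makefile changes above, not necessarily identical in every detail.

```shell
# Print (not run) the build commands with the variables passed on the
# command line instead of being edited into the Makefile.
# $1 = output directory, $2 = module install path.
uml_build_commands() {
    printf 'make ARCH=um O=%s\n' "$1"
    printf 'make ARCH=um O=%s INSTALL_MOD_PATH=%s modules_install\n' "$1" "$2"
}
```

For example: uml_build_commands ../build_um "$NETKIT_HOME/kernel/modules"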

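Finally, instruct netkit to use the kernel you've compiled (called linux). The easiest way is to change the symlink pointing to the kernel. Note that a relative link target is resolved from the directory containing the link, so adjust ../build_um/linux to where your build directory really is:

cp $NETKIT_HOME/kernel/netkit-kernel $NETKIT_HOME/kernel/netkit-kernel.orig
ln -sf ../build_um/linux $NETKIT_HOME/kernel/netkit-kernel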
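The backup and relink step can be wrapped in a small function that always links to an absolute path and never overwrites an existing backup. This is a sketch under assumptions: switch_kernel is a hypothetical helper name, and the file names netkit-kernel and netkit-kernel.orig are the ones used above.

```shell
# Hypothetical helper: point netkit-kernel at a freshly built UML binary.
# $1 = Netkit kernel directory, $2 = path to the new "linux" binary.
# The body runs in a subshell, so no variable leaks into the caller.
switch_kernel() (
    kernel_dir=$1
    new_kernel=$2
    # Resolve to an absolute path so the link does not depend on the
    # directory it is created from.
    new_abs=$(cd "$(dirname "$new_kernel")" && pwd)/$(basename "$new_kernel")
    [ -e "$new_abs" ] || { echo "no such kernel: $new_abs" >&2; exit 1; }
    # Keep the first backup only; -P copies a symlink as a symlink.
    if [ ! -e "$kernel_dir/netkit-kernel.orig" ]; then
        cp -P "$kernel_dir/netkit-kernel" "$kernel_dir/netkit-kernel.orig" || exit 1
    fi
    ln -sf "$new_abs" "$kernel_dir/netkit-kernel"
)
```

For example: switch_kernel "$NETKIT_HOME/kernel" ../build_um/linux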

3 - Use the debugger (i.e. gdb)
Start the UM kernel under the gdb debugger. Do not pass any kernel parameters here; they are given later to the run command.
After a few lines of banner you get the (gdb) prompt.

gdb $NETKIT_HOME/kernel/netkit-kernel 
GNU gdb Red Hat Linux (6.5-15.fc6rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
(gdb)


SIGSEGV and SIGUSR1 are used internally by UML.
To keep your debugging session from being continually interrupted by them, tell gdb to pass them to the program without stopping or printing:

(gdb) handle SIGSEGV pass nostop noprint
Signal        Stop      Print   Pass to program Description
SIGSEGV       No        No      Yes             Segmentation fault
(gdb) handle SIGUSR1 pass nostop noprint 
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb)  
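To avoid retyping the two handle commands in every session, they can be stored in a gdb command file and loaded with gdb's -x option. A minimal sketch follows; write_uml_gdb_commands and the file name uml.gdb are arbitrary choices made for this example.

```shell
# Write a gdb command file that silences the signals UML uses internally.
# $1 = output file (the name is an arbitrary choice).
write_uml_gdb_commands() {
    cat > "$1" <<'EOF'
handle SIGSEGV pass nostop noprint
handle SIGUSR1 pass nostop noprint
EOF
}
```

For example: write_uml_gdb_commands uml.gdb, then gdb -x uml.gdb $NETKIT_HOME/kernel/netkit-kernel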

Start the program with the r (run) command.
All the parameters for the UM kernel are given at this point:

(gdb) r ubd0=/home/cesco/netkit-fs-gdb,/home/cesco/netkit2/fs/netkit-fs modules=/home/cesco/netkit2/kernel/modules hosthome=/home/cesco root=98:1 mem=100M

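where
r is the start command,
ubd0 is the path of the filesystem copy (the copy-on-write file), followed by the path of the real filesystem (no space between them, just a comma!),
modules is the modules path,
hosthome is the host directory that gets mounted on /hosthome in the guest,
mem is the amount of memory the VM will use.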
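Since the argument string is long and easy to mistype, it can be assembled from its parts and pasted after the r command. This is only a sketch: uml_args is a hypothetical helper name, and root=98:1 is copied unchanged from the command line above.

```shell
# Build the UML argument string from its parts.
# $1 = filesystem copy (COW file), $2 = real filesystem, $3 = modules path,
# $4 = host home directory, $5 = memory size (e.g. 100M).
uml_args() {
    printf 'ubd0=%s,%s modules=%s hosthome=%s root=98:1 mem=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}
```

For example: uml_args /home/cesco/netkit-fs-gdb /home/cesco/netkit2/fs/netkit-fs /home/cesco/netkit2/kernel/modules /home/cesco 100M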

Starting program: /home/cesco/netkit2/kernel/netkit-kernel ubd0=/home/cesco/netkit2/fs/netkit-fs-model,/home/cesco/netkit2/fs/netkit-fs  modules=/home/cesco/netkit2/kernel/modules hosthome=/home/cesco root=98:1 mem=100M
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...OK
Checking advanced syscall emulation patch for ptrace...OK
Checking for tmpfs mount on /dev/shm...OK
Checking PROT_EXEC mmap in /dev/shm/...OK
Checking for the skas3 patch in the host:
  - /proc/mm...found
  - PTRACE_FAULTINFO...found
  - PTRACE_LDT...found
UML running in SKAS3 mode
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...OK
Checking advanced syscall emulation patch for ptrace...OK
Linux version 2.6.20.7-TFC-r263-um (root@localhost.localdomain) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #2 Mon Apr 23 10:46:01 CEST 2007
Built 1 zonelists.  Total pages: 25400
Kernel command line: ubd0=/home/cesco/netkit2/fs/netkit-fs-model,/home/cesco/netkit2/fs/netkit-fs modules=/home/cesco/netkit2/kernel/modules hosthome=/home/cesco root=98:1 mem=100M

[...]

All modules loaded.
Checking all file systems...
fsck 1.37 (21-Mar-2005)
Setting kernel variables ...
... done.
Mounting local filesystems...
Cleaning /tmp /var/run /var/lock.
Loading IPsec SA/SP database from /etc/ipsec-tools.conf: NET: Registered protocol family 15
done.
Setting up networking...done.
Initializing random number generator...done.
INIT: Entering runlevel: 2
--- Starting Netkit phase 1 startup script

Mounting host home (/home/cesco) on /hosthome...

Modifying /etc/hostname ...
Modifying /etc/hosts ...

--- Netkit phase 1 init script terminated
Starting kernel log daemon: klogd.
Starting static multicast router daemon: INIT: MRT_INIT failed; Errno(92): Protocol not available
Configuring network interfaces...done.
Starting system log daemon: syslogd.
--- Starting Netkit phase 2 startup script

Virtual host pc1 ready.

--- Netkit phase 2 init script terminated

pc1 login: root (automatic login)
Linux pc1 2.6.20.7-TFC-r263-um #2 Mon Apr 23 10:46:01 CEST 2007 i686 GNU/Linux
Welcome to Netkit

pc1:~# 

A new window named pc1 will open.
You are now running a UM linux under the gdb debugger, and you can debug linux like a normal program.

It is a good idea to use DDD, a graphical front end for gdb (gdb is the default debugger of DDD).
Install it, then type ddd to start it:

ddd

Then open the program you want to debug ($NETKIT_HOME/kernel/netkit-kernel) and select Run, passing the same initialization parameters as before. As before, you get the gdb prompt.

4 - Example to use breakpoint in the kernel

We set a breakpoint on the function main, then start the program; after a while the kernel stops at the breakpoint.
The main function is at line 161 of main.c. From there we can step through the code (next) or let it continue (cont).

(gdb) b main
Breakpoint 1 at 0x806563b: file /home/cesco/discreet/tfcproject/trunk/linux-2.6.20.7-TFC/arch/um/os-Linux/main.c, line 161.
(gdb) run ubd0=/home/cesco/netkit-fs-gdb,/home/cesco/netkit2/fs/netkit-fs modules=/home/cesco/netkit2/kernel/modules hosthome=/home/cesco root=98:1 mem=100M eth0=tuntap,tap0

Breakpoint 1, main (argc=7, argv=0xbfd8b554, envp=0xbfd8b574) at /home/cesco/discreet/tfcproject/trunk/linux-2.6.20.7-TFC/arch/um/os-Linux/main.c:161
(gdb) next
(gdb) cont
Checking that ptrace can change system call numbers...OK
Checking syscall emulation patch for ptrace...OK
[..]



5 - Example to debug TFC as part of the kernel
A good way to debug TFC is to build the tfc module into the kernel and leave tfc_hook_in and tfc_hook_out as modules.
If you look at the end of the source code of tfc.c (lines 932 and 933) you will see the two macros module_init(init); and module_exit(fini);
They are meant for modules. When TFC is built into the kernel, module_init() just registers init to be called at boot and module_exit() has no effect.

At the time of writing (release r299) one function causes an error during "make modules_install".
To avoid this problem, comment out the line containing "ip_finish_output(skb)" (line 817 in my copy of tfc.c).
After that you can compile the kernel (see step 2 of this wiki).

As explained above, we can now put a breakpoint in a TFC function (for example the SA_Logic function).
When you start the UM linux machine, ddd will stop when that function is called.
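In gdb terms, such a breakpoint is set with the break command followed by the function name. If several TFC functions are of interest, the commands can be generated into a file and loaded with gdb -x, or with the source command from the gdb console of ddd. A small sketch follows; gdb_breakpoints and the file name tfc.gdb are hypothetical names chosen for this example.

```shell
# Print one gdb "break" command per function name given as argument.
gdb_breakpoints() {
    for fn in "$@"; do
        printf 'break %s\n' "$fn"
    done
}
```

For example: gdb_breakpoints SA_Logic > tfc.gdb, then gdb -x tfc.gdb $NETKIT_HOME/kernel/netkit-kernel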


6 - Example to debug tfc modules

NOT WORKING - PROBLEMS in umlgdb script!!
Gdb supports debugging code that is dynamically loaded into the process.
This support is what is needed to debug kernel modules under UML.
For more information: http://user-mode-linux.sourceforge.net/debugging.html

First download the umlgdb expect script written by Chandan Kudige, which basically automates the process for you.
You will find it by following the link above. To run it you need the expect program. Once it is installed, check that it starts:

[root@dhcp177 Desktop]# expect
expect1.1> exit
[root@dhcp177 Desktop]# 

Before running umlgdb, you need to modify it by adding the module paths. Every module you want to debug needs to be in that list. Here is an example with my module paths:

###
# Module paths:
# You can add paths for modules that are not in the gdb load-path.
# This is basically a list with alternating module name and module path.
# 
# When a module is loaded, umlgdb tries to load the symbols from the
# path given here. If the module is not listed then no symbols are loaded.
###

set MODULE_PATHS {
"tfc" "/home/cesco/netkit2/kernel/modules/lib/modules/2.6.20.7-TFC-r263-um/build/net/ipv4/tfc.ko"
"tfc_hook_in" "/home/cesco/netkit2/kernel/modules/lib/modules/2.6.20.7-TFC-r263-um/build/net/ipv4/tfc_hook_in.ko"
"tfc_hook_out" "/home/cesco/netkit2/kernel/modules/lib/modules/2.6.20.7-TFC-r263-um/build/net/ipv4/tfc_hook_out.ko"
}


#######################################################################
# Script starts here.
#######################################################################

Some other information needs to be added to the umlgdb file, for example the kernel path. Note that the umlgdb file has to be executable!

##
# Kernel path relative to current directory.
##
set ARG "/home/cesco/netkit2/kernel/netkit-kernel2"

Now start expect and umlgdb:

[root@dhcp177 Desktop]# expect
expect1.1> ./umlgdb

            ******** GDB pid is 6984 ********
Start UML as: /home/cesco/netkit2/kernel/netkit-kernel2 <kernel switches> debug gdb-pid=6984

GNU gdb Red Hat Linux (6.5-15.fc6rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) 

Now use the run command with the already known parameters (see top of this wiki).

(gdb) r ubd0=/home/cesco/netkit-fs-gdb,/home/cesco/netkit2/fs/netkit-fs modules=/home/cesco/netkit2/kernel/modules hosthome=/home/cesco root=98:1 mem=100M

NOT WORKING!!!!!!
After the pc1 console is ready, just load the modules with the insmod command and debug them.
NOT WORKING!!!!!!
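A possible manual fallback, not tested on this setup, is gdb's own add-symbol-file command, which loads the symbols of an object file at a given address. The sketch below only prints that command. Assumptions: module_symbol_command is a hypothetical helper name, and the load address of the module's .text section can be read inside the guest after insmod (for example from /sys/module/<name>/sections/.text, if the guest kernel provides it).

```shell
# Print the gdb command that loads a module's symbols at a given address.
# $1 = path to the .ko file on the host,
# $2 = address of the module's .text section as reported inside the guest
#      (assumption: read from /sys/module/<name>/sections/.text after insmod).
module_symbol_command() {
    printf 'add-symbol-file %s %s\n' "$1" "$2"
}
```

The printed line is meant to be pasted at the (gdb) prompt once the module is loaded; the address differs on every load, so it has to be read again each time.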

7 - Debug Netkit lab, attach to process, etc.
It is really easy to debug a running copy of the kernel using the "Attach to process" feature of ddd (gdb does it as well if you don't like graphical interfaces).

Start your Netkit laboratory as usual.

Start ddd with:

ddd $NETKIT_HOME/kernel/netkit-kernel

Place your breakpoints.

Use "File menu/Attach to process..." in ddd. You will see five processes belonging to each Virtual Machine. Select the first process of the Virtual Machine you are interested in, and debug as usual.
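With plain gdb the same attach can be done by giving the process id after the program name (gdb program pid). The sketch below picks a candidate pid from a list. Assumptions: first_pid is a hypothetical helper name, and the "first" process of a virtual machine is taken to be the one with the lowest pid, which should be checked against the list that ddd shows.

```shell
# Read PIDs (one per line) on stdin and print the numerically lowest one.
# Assumption: the "first" process of a virtual machine is the one with the
# lowest PID; check this against the list ddd shows.
first_pid() {
    sort -n | head -n 1
}
```

For example, pgrep -f netkit-kernel | first_pid prints one pid when a single virtual machine is running; that pid is then passed to gdb $NETKIT_HOME/kernel/netkit-kernel <pid>.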

To make debugging easier, you need some extra tricks:

Recompile the netkit kernel with TFC as part of the kernel (not a module). In other words: Y instead of M in the kernel configuration.

Turn off optimization while compiling the kernel. Otherwise "next" in ddd will sometimes behave strangely, because optimized code no longer follows the source line by line.

Last modified on Dec 3, 2007, 11:33:23 AM