ZNHOO Whatever you are, be a good one!

GDB and Core Dump

This is a tutorial on dynamically analyzing and exploiting objects with GDB (the GNU Debugger). Unless stated otherwise, the object type is Core Dump. The terms object and executable are used interchangeably in this post.

Please play around with the code in the Appendix section.

  1. Limitations
  2. Extensions
    1. gef
    2. openresty-gdb-utils
    3. nginx.gdb
  3. Configuration
  4. Objects with Debug Info
  5. Core Dump
    1. Verify Core Dump
  6. Quickstart
    1. Launch a Process
    2. Attach to a Process
    3. Load a Core Dump
  7. GDB Commands
    1. Informational Commands
  8. More Tools
    1. hexdump
    2. strings
    3. readelf
    4. objdump
  9. Memory
  10. Nginx
  11. Appendix

Limitations

GDB does not currently support Apple silicon Macs (e.g. M1) as a native host. To work around this, we can run GDB inside a Docker container.

The command gdb --configuration shows how GDB was built.

(kong-dev) kong@kong-ee:/kong-ee$ gdb --configuration
This GDB was configured as follows:
   configure --host=aarch64-linux-gnu --target=aarch64-linux-gnu
             --with-auto-load-dir=$debugdir:$datadir/auto-load
             --with-auto-load-safe-path=$debugdir:$datadir/auto-load
             --with-expat
             --with-gdb-datadir=/usr/share/gdb (relocatable)
             --with-jit-reader-dir=/usr/lib/gdb (relocatable)
             --without-libunwind-ia64
             --with-lzma
             --with-babeltrace
             --without-intel-pt
             --with-xxhash
             --with-python=/usr (relocatable)
             --with-python-libdir=/usr/lib (relocatable)
             --with-debuginfod
             --with-curses
             --without-guile
             --without-amd-dbgapi
             --enable-source-highlight
             --enable-threading
             --enable-tui
             --with-system-readline
             --with-separate-debug-dir=/usr/lib/debug (relocatable)
             --with-system-gdbinit=/etc/gdb/gdbinit
             --with-system-gdbinit-dir=/etc/gdb/gdbinit.d

("Relocatable" means the directory can be moved with the GDB installation
tree, and GDB will still find it.)

Extensions

gef

gef (GDB Enhanced Features) is a Python plugin for vanilla GDB, supporting both x86 (32/64-bit) and Arm (AArch32/AArch64). It is actively maintained, ships as a single Python script file, and requires no dependencies except Python. However, some gef commands depend on a few other tools such as file, readelf, nm, and ps.

There are alternatives such as pwndbg and peda. pwndbg is as actively maintained as gef, but its installation script introduces customizations and spreads files everywhere. The script might conflict with my local setup (e.g. bashrc), and I do not like this style. peda, on the other hand, is almost deprecated; its last commit is from 4 years ago.

In this post, we elect gef!

Before using gef, ensure the following tools are available in your system.

file
readelf
nm
ps
python3

Additionally, ensure the locale is set to UTF-8, otherwise gef reports "UnicodeEncodeError". See https://github.com/hugsy/gef/issues/195.
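
The prerequisites above can be checked before loading gef. The following is a minimal Python sketch, not part of gef itself: the function names missing_tools and is_utf8_locale are hypothetical, and the locale test is only a heuristic based on the LC_ALL, LC_CTYPE and LANG environment variables.

```python
import os
import shutil

# Tools listed in this post as gef prerequisites.
GEF_TOOLS = ["file", "readelf", "nm", "ps", "python3"]


def missing_tools(tools=GEF_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def is_utf8_locale(env):
    """Heuristic: the first non-empty of LC_ALL, LC_CTYPE, LANG decides."""
    for key in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(key)
        if value:
            # Accept both spellings, e.g. en_US.UTF-8 and C.utf8.
            return value.lower().replace("-", "").endswith("utf8")
    return False


if __name__ == "__main__":
    print("missing tools:", missing_tools() or "none")
    print("UTF-8 locale:", is_utf8_locale(os.environ))
```

The which parameter exists only so that the lookup can be replaced in tests; in normal use the default shutil.which is enough.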

We can download gef from the official repo.

ubuntu@ip-172-31-9-194:~/workspace$ git clone https://github.com/hugsy/gef

ubuntu@ip-172-31-9-194:~/workspace$ cd gef/
ubuntu@ip-172-31-9-194:~/workspace/gef$ ls
LICENSE  README.md  docs  gef.py  mkdocs.yml  scripts  tests

The only file required is the gef.py script. We can load gef.py on the fly.

ubuntu@ip-172-31-9-194:~/misc$ gdb -q
(gdb) source ~/workspace/gef/gef.py
GEF for linux ready, type `gef' to start, `gef config' to configure
93 commands loaded and 5 functions added for GDB 12.1 in 0.00ms using Python engine 3.10
gef➤  

Alternatively, load gef automatically on GDB startup.

ubuntu@ip-172-31-9-194:~/misc$ echo 'source /path/to/gef.py' >> ~/.config/gdb/gdbinit

ubuntu@ip-172-31-9-194:~/misc$ gdb -q
GEF for linux ready, type `gef' to start, `gef config' to configure
93 commands loaded and 5 functions added for GDB 12.1 in 0.00ms using Python engine 3.10
gef➤

At any moment, we can use the gef context command to show registers, stack, disassembled code, source code, frames, etc.

gef➤  context

assets/gef-context.png

From the figure above, the stack is growing from high address to low address.

In addition to standard GDB commands, gef offers enhanced commands. Use the gef command to list all enhanced commands.

openresty-gdb-utils

openresty-gdb-utils is another GDB extension to debug OpenResty programs, including Nginx, ngx_lua, LuaJIT, etc.

Regarding installation, configuration and commands, please refer to the official page.

nginx.gdb

nginx.gdb is a personal GDB script with a few functionalities.

Configuration

  1. ~/.config/gdb/gdbearlyinit which is checked before any other GDB configuration file.

    # similar to option '-q'
    set startup-quietly on
    
  2. ~/.config/gdb/gdbinit which is the primary configuration file.

    # gef
    source /home/ubuntu/workspace/gef/gef.py
       
    # openresty-gdb-utils
    directory /home/ubuntu/workspace/openresty-gdb-utils
       
    py import sys
    py sys.path.append("/home/ubuntu/workspace/openresty-gdb-utils")
       
    source luajit20.gdb
    source ngx-lua.gdb
    source luajit21.py
    source ngx-raw-req.py
    set python print-stack full
    

Objects with Debug Info

When compiling an executable, we can add the options -g and -Og to gcc to include debug information.

ubuntu@ip-172-31-9-194:~/misc$ cat test.c
#include <stdio.h>

int main()
{
        printf("%d\n", 100/0);
}

# '-g' includes debug info
# '-Og' optimizes for the debugging experience
ubuntu@ip-172-31-9-194:~/misc$ gcc -Wall -Wextra -Og -g -std=c11 -o test.out test.c

ubuntu@ip-172-31-9-194:~/misc$ file test.out
test.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=06b7264bd3f05cfc7ea928d4cc9b257a4c83c8cd, for GNU/Linux 3.2.0, with debug_info, not stripped

The -Og option is designed for debugging: it enables only optimizations that do not interfere with debugging, so the program may be slower than one built with -O2. Do not use it for production.

Without debug information, gdb cannot map addresses back to source lines or show local variables and arguments, so output such as a backtrace is far less informative.

Core Dump

This post focuses on the Core Dump object. It is the memory dump of a process, taken while it runs or at the moment it crashes.

There are multiple ways to get a Core Dump. For example, we can use the gcore command to dump the memory of a running process. Alternatively, we can configure the system to automatically generate a Core Dump when a program crashes. Please read "biji" for details.

Please also read the tutorial on how to generate Core Dump within K8s environment.

Verify Core Dump

After receiving a Core Dump file, we must verify that it is legitimate.

ubuntu@ip-172-31-9-194:~/misc$ file core_new.2436
core_new.2436: ELF 64-bit LSB core file, x86-64, version 1 (SYSV), SVR4-style, from 'nginx: worker prnginx: worker process', real uid: 1000, effective uid: 1000, real gid: 1000, effective gid: 1000, execfn: '/usr/local/openresty/nginx/sbin/nginx', platform: 'x86_64'
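
What file reports here comes from the ELF header at the start of the file. The following minimal Python sketch reads the same fields; the function names elf_summary and is_core_file are hypothetical. It relies on the ELF layout where bytes 4 and 5 of the identification hold the class and byte order, and the two-byte e_type field at offset 16 is 4 (ET_CORE) for core files.

```python
import struct

ET_CORE = 4  # e_type value for core files in the ELF specification


def elf_summary(header: bytes):
    """Decode class, byte order and core flag from the start of a file.

    Returns None when the bytes do not look like an ELF header.
    """
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return None
    ei_class, ei_data = header[4], header[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        return None
    endian = "<" if ei_data == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "little_endian": ei_data == 1,
        "is_core": e_type == ET_CORE,
    }


def is_core_file(path):
    """Return True when the file at path has an ELF header of type ET_CORE."""
    with open(path, "rb") as f:
        info = elf_summary(f.read(64))
    return bool(info and info["is_core"])
```

This only confirms the file type. It does not prove that the dump is complete or that it matches a particular binary.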

Additionally, we can compare the logical and physical disk size to see whether it is a sparse file.

ubuntu@ip-172-31-9-194:~/misc$ ll -hs core_new.2436
747M -rw-r--r-- 1 ubuntu ubuntu 3.0G Jul 31 03:44 core_new.2436

ubuntu@ip-172-31-9-194:~/misc$ du -hsc -- core_new.2436
747M    core_new.2436
747M    total
ubuntu@ip-172-31-9-194:~/misc$ du -hsc --apparent-size -- core_new.2436
3.0G    core_new.2436
3.0G    total

~ $ stat core_new.2436
  File: core_new.2436
  Size: 3114838104      Blocks: 1528400    IO Block: 4096   regular file
ubuntu@ip-172-31-9-194:~/misc$ bc <<< '(1528400 * 512)/1024/1024'
746

Why is a Core Dump file a sparse file? A Core Dump is a dump of the virtual memory space, but a process usually does not use the entire virtual space. Ranges are left unmapped for several reasons.

  1. Reserved but unmapped space.
  2. Gaps between different segments (e.g. stack, heap) are never mapped.
  3. Memory fragmentation.
  4. Modern 64-bit systems have enormous theoretical address spaces. A process would not use all of them.

GDB handles sparse Core Dumps transparently. A Core Dump is often sparse, but this is not guaranteed.
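
The ls, du and stat comparison above can be reproduced in a few lines of Python. This is a minimal sketch under the assumption that st_blocks counts 512-byte units, as it does on Linux; the function names sparse_report and sparse_report_for are hypothetical.

```python
import os


def sparse_report(size_bytes, blocks, block_unit=512):
    """Compare the apparent size with the space actually allocated on disk."""
    allocated = blocks * block_unit
    return {
        "apparent_mib": size_bytes // 2**20,
        "allocated_mib": allocated // 2**20,
        "is_sparse": allocated < size_bytes,
    }


def sparse_report_for(path):
    """Run the comparison on a real file (st_blocks is not available on Windows)."""
    st = os.stat(path)
    return sparse_report(st.st_size, st.st_blocks)
```

Feeding in the Size and Blocks values from the stat output above, sparse_report(3114838104, 1528400) gives 746 MiB allocated, which matches the bc result.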

Quickstart

The gdb command is an interactive shell for inspecting the execution state of a process at a given point, for example the moment it dereferences an uninitialized or NULL pointer (address 0x0).

With the option --nx, gdb skips its initialization files (e.g. gdbinit).

~ $ gdb --nx
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)

In this section, we describe multiple ways to use GDB.

Launch a Process

With the option --args, everything after the program path is passed to the program as its arguments. Without --args, gdb would interpret a second argument as a Core Dump or a process ID.

~ $ gdb --args /path/to/program [arglist]

# -or- interactively
~ $ gdb
(gdb) file /path/to/program
(gdb) run [arglist]

Attach to a Process

With the option -p, you can omit the binary. We can generate a Core Dump file on the fly with gcore.

# with binary
~ $ gdb /path/to/program <pid>

# no binary; must have '-p'
~ $ gdb -p pid

# generate core dump
(gdb) help gcore
generate-core-file, gcore
Save a core file with the current state of the debugged process.
Usage: generate-core-file [FILENAME]
Argument is optional filename.  Default filename is 'core.PROCESS_ID'.
(gdb) gcore
warning: Memory read failed for corefile section, 4096 bytes at 0xffffffffff600000.
Saved corefile core.1231557

(gdb) shell ls -lhs core.1231557
9.9M -rw-rw-r-- 1 ubuntu ubuntu 9.9M Aug 15 14:15 core.1231557

Load a Core Dump

With the option -c, you can omit the binary, although gdb may then be unable to resolve symbols.

# binary available
~ $ gdb /path/to/program /path/to/core.pid

# no binary; must have '-c'
~ $ gdb -c /path/to/core.pid

# -or- interactively
(gdb) file /path/to/program
(gdb) core-file /path/to/core.pid
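
The three ways of starting gdb in this section differ only in their command line. The following minimal Python sketch builds that command line; gdb_argv is a hypothetical helper, and it uses only the -q, --args, -p and -c options shown in this post.

```python
def gdb_argv(program=None, args=(), pid=None, core=None):
    """Build a gdb command line for launching, attaching, or loading a core."""
    if pid is not None and core is not None:
        raise ValueError("choose either pid or core, not both")
    argv = ["gdb", "-q"]
    if pid is not None:
        # Attach to a process: the binary is optional with -p.
        argv += ["-p", str(pid)]
        if program:
            argv.append(program)
    elif core is not None:
        # Load a Core Dump: the binary is optional with -c.
        argv += ["-c", core]
        if program:
            argv.append(program)
    else:
        # Launch a process: --args passes the rest to the program.
        if program is None:
            raise ValueError("a program is required to launch a process")
        argv += ["--args", program, *args]
    return argv
```

The resulting list can be handed to subprocess.run to start an interactive session, for example subprocess.run(gdb_argv("/path/to/program", core="/path/to/core.pid")).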

GDB Commands

At any moment, we can type the help command to get help. For the complete manual, please refer to the official documentation.

(gdb) help

(gdb) help internals

(gdb) help disassemble

The following is a list of commonly used gdb commands.

  1. Keyboard ENTER key repeats the last command.
  2. run starts or restarts the program. The argument list can be provided immediately, like run -a -x y.

    A program either runs to completion or runs into issues. While the program is running, we can interrupt it with the shortcut Ctrl-C, which returns control to the gdb prompt. The gdb command signal SIGINT, by contrast, resumes a stopped program and delivers SIGINT to it.

  3. backtrace shows the call stack. With the option -full, it also prints local variables.

    (gdb) backtrace
    #0  0x0000ffffa7a7bd74 in __GI_epoll_pwait (epfd=32, events=0xaaab04fa8760, maxevents=512, timeout=492, set=0x0) at ../sysdeps/unix/sysv/linux/epoll_pwait.c:40
    #1  0x0000aaaadbfe4ac0 in ngx_epoll_process_events (cycle=0xaaab04fa10e0, timer=492, flags=1) at src/event/modules/ngx_epoll_module.c:800
    #2  0x0000aaaadbfcf65c in ngx_process_events_and_timers (cycle=0xaaab04fa10e0) at src/event/ngx_event.c:258
    #3  0x0000aaaadbfe10b4 in ngx_worker_process_cycle (cycle=0xaaab04fa10e0, data=0x0) at src/os/unix/ngx_process_cycle.c:793
    #4  0x0000aaaadbfdccb8 in ngx_spawn_process (cycle=0xaaab04fa10e0, proc=0xaaaadbfe0fc8 <ngx_worker_process_cycle>, data=0x0, name=0xaaaadc2ba400 "worker process", respawn=-3) at src/os/unix/ngx_process.c:199
    #5  0x0000aaaadbfdfa98 in ngx_start_worker_processes (cycle=0xaaab04fa10e0, n=1, type=-3) at src/os/unix/ngx_process_cycle.c:382
    #6  0x0000aaaadbfdefc4 in ngx_master_process_cycle (cycle=0xaaab04fa10e0) at src/os/unix/ngx_process_cycle.c:135
    #7  0x0000aaaadbf91b2c in main (argc=5, argv=0xffffd729e868) at src/core/nginx.c:387
    

    gef has a more powerful context command to show more information.

  4. frame selects a stack frame. A stack frame is an important concept in GDB: it represents one function call and occupies one entry in the call stack.

    Taking the output from step 3 as an example, #0 is the innermost frame (the most recent call), while #7 is the outermost frame (the main function). Frame #0 is the current frame by default, and many GDB commands operate relative to the current frame.

    A frame contains the function name, arguments, local variables, source code line, etc. In particular, it records the address of the instruction being executed in that function, that is, an address in the mapped code, which GDB exposes as the register $pc for the current frame. Note that this is a code address, not a stack address.

    frame N           # Select frame number N
    up                # Move to the caller frame (up the call stack)
    down              # Move to the callee frame (down the call stack)
    
    info frame       # Show detailed info about current frame
    info locals      # Show local variables in current frame
    info args        # Show function arguments of current frame
       
    print x          # Print value of variable x in current frame
    set var x=value  # Set value of variable x in current frame
    

    gdb-frame.png

  5. disassemble shows the assembly code. By default, it disassembles the function surrounding $pc of the current frame. With the modifier /s, it also shows the source code.

    (gdb) disassemble /s 0x0000aaaadbfe4ac0
    Dump of assembler code for function ngx_epoll_process_events:
    src/event/modules/ngx_epoll_module.c:
    785     {
       0x0000aaaadbfe4a38 <+0>:     stp     x29, x30, [sp, #-128]!
       0x0000aaaadbfe4a3c <+4>:     mov     x29, sp
       0x0000aaaadbfe4a40 <+8>:     str     x0, [sp, #40]
       0x0000aaaadbfe4a44 <+12>:    str     x1, [sp, #32]
       0x0000aaaadbfe4a48 <+16>:    str     x2, [sp, #24]
       
    786         int                events;
    787         uint32_t           revents;
    788         ngx_int_t          instance, i;
    789         ngx_uint_t         level;
    790         ngx_err_t          err;
    791         ngx_event_t       *rev, *wev;
    792         ngx_queue_t       *queue;
    793         ngx_connection_t  *c;
    794
    795         /* NGX_TIMER_INFINITE == INFTIM */
    796
    797         ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
       0x0000aaaadbfe4a4c <+20>:    ldr     x0, [sp, #40]
       0x0000aaaadbfe4a50 <+24>:    ldr     x0, [x0, #16]
       0x0000aaaadbfe4a54 <+28>:    ldr     x0, [x0]
       0x0000aaaadbfe4a58 <+32>:    bl      0xaaaadc195650 <ngx_http_lua_kong_get_dynamic_log_level>
    --Type <RET> for more, q to quit, c to continue without paging--
       0x0000aaaadbfe4a5c <+36>:    and     x0, x0, #0x80
       0x0000aaaadbfe4a60 <+40>:    cmp     x0, #0x0
       0x0000aaaadbfe4a64 <+44>:    b.eq    0xaaaadbfe4a88 <ngx_epoll_process_events+80>  // b.none
       0x0000aaaadbfe4a68 <+48>:    ldr     x0, [sp, #40]
       0x0000aaaadbfe4a6c <+52>:    ldr     x1, [x0, #16]
       0x0000aaaadbfe4a70 <+56>:    ldr     x4, [sp, #32]
       0x0000aaaadbfe4a74 <+60>:    adrp    x0, 0xaaaadc2ba000
       0x0000aaaadbfe4a78 <+64>:    add     x3, x0, #0xac0
       0x0000aaaadbfe4a7c <+68>:    mov     w2, #0x0                        // #0
       0x0000aaaadbfe4a80 <+72>:    mov     x0, #0x8                        // #8
       0x0000aaaadbfe4a84 <+76>:    bl      0xaaaadbf946fc <ngx_log_error_core>
       
    798                        "epoll timer: %M", timer);
    799
    800         events = epoll_wait(ep, event_list, (int) nevents, timer);
       0x0000aaaadbfe4a88 <+80>:    adrp    x0, 0xaaaadc661000 <week+16>
       0x0000aaaadbfe4a8c <+84>:    add     x0, x0, #0xb90
       0x0000aaaadbfe4a90 <+88>:    ldr     w4, [x0]
       0x0000aaaadbfe4a94 <+92>:    adrp    x0, 0xaaaadc6a0000 <ngx_processes+47632>
       0x0000aaaadbfe4a98 <+96>:    add     x0, x0, #0xa48
       0x0000aaaadbfe4a9c <+100>:   ldr     x1, [x0]
       0x0000aaaadbfe4aa0 <+104>:   adrp    x0, 0xaaaadc6a0000 <ngx_processes+47632>
       0x0000aaaadbfe4aa4 <+108>:   add     x0, x0, #0xa50
       0x0000aaaadbfe4aa8 <+112>:   ldr     x0, [x0]
       0x0000aaaadbfe4aac <+116>:   mov     w2, w0
    --Type <RET> for more, q to quit, c to continue without paging--q
    Quit
    

    objdump has similar capabilities.

  6. list shows only source code. With the argument ., it shows code surrounding the current frame.

    gef➤  frame 1
    #1  0x0000aaaaad984ac0 in ngx_epoll_process_events (cycle=0xaaaab6b09130, timer=0x134, flags=0x1) at src/event/modules/ngx_epoll_module.c:800
    800         events = epoll_wait(ep, event_list, (int) nevents, timer);
    gef➤  list .
    795         /* NGX_TIMER_INFINITE == INFTIM */
    796
    797         ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
    798                        "epoll timer: %M", timer);
    799
    800         events = epoll_wait(ep, event_list, (int) nevents, timer);
    801
    802         err = (events == -1) ? ngx_errno : 0;
    803
    804         if (flags & NGX_UPDATE_TIME || ngx_event_timer_alarm) {
    

    To show the pathnames of the source files, use info sources.

    (gdb) info sources
    /home/kong/.cache/bazel/_bazel_kong/ee741459d483c56c5059256f49a7a414/execroot/_main/bazel-out/aarch64-fastbuild/bin/build/kong-dev/openresty/nginx/sbin/nginx:
    (Full debug information has not yet been read for this file.)
       
    /home/kong/.cache/bazel/_bazel_kong/ee741459d483c56c5059256f49a7a414/execroot/_main/bazel-out/aarch64-fastbuild/bin/external/openresty/openresty.build_tmpdir/build/nginx-1.25.3/src/core/nginx.c,
    /home/kong/.cache/bazel/_bazel_kong/ee741459d483c56c5059256f49a7a414/execroot/_main/bazel-out/aarch64-fastbuild/bin/external/openresty/openresty.build_tmpdir/build/nginx-1.25.3/src/os/unix/ngx_files.h,
    /home/kong/.cache/bazel/_bazel_kong/ee741459d483c56c5059256f49a7a414/execroot/_main/bazel-out/aarch64-fastbuild/bin/external/openresty/openresty.build_tmpdir/build/nginx-1.25.3/src/core/ngx_log.h,
    
  7. break sets a breakpoint, which pauses the program when it reaches the specified location.

    Use list or disassemble to identify the code line you are interested in.

    (gdb) break ngx_epoll_module.c:800
    Breakpoint 2 at 0xaaaadbfe4a88: file src/event/modules/ngx_epoll_module.c, line 800.
    (gdb) info breakpoints
    Num     Type           Disp Enb Address            What
    2       breakpoint     keep y   0x0000aaaadbfe4a88 in ngx_epoll_process_events at src/event/modules/ngx_epoll_module.c:800
    (gdb) delete breakpoints 2
    (gdb) info breakpoints
    No breakpoints, watchpoints, tracepoints, or catchpoints.
    

    You can define a conditional breakpoint like break 15 if var > 10.

  8. watch monitors a variable and pauses the program when the variable is modified.

    (gdb) info variables -t ngx_cycle_t
    All defined variables with type matching regular expression "ngx_cycle_t" :
       
    File ../echo-nginx-module-0.63/src/ngx_http_echo_filter.c:
    28:     static volatile ngx_cycle_t *ngx_http_echo_prev_cycle;
       
    File ../headers-more-nginx-module-0.37/src/ngx_http_headers_more_filter_module.c:
    111:    static volatile ngx_cycle_t *ngx_http_headers_more_prev_cycle;
       
    File ../ngx_lua-0.10.26/src/ngx_http_lua_module.c:
    71:     static volatile ngx_cycle_t *ngx_http_lua_prev_cycle;
       
    File src/core/ngx_cycle.c:
    21:     volatile ngx_cycle_t *ngx_cycle;
       
    File src/os/unix/ngx_process_cycle.c:
    73:     static ngx_cycle_t ngx_exit_cycle;
       
    (gdb) watch ngx_cycle
    Hardware watchpoint 5: ngx_cycle
       
    (gdb) info watchpoints
    Num     Type           Disp Enb Address            What
    5       hw watchpoint  keep y                      ngx_cycle
    
  9. continue resumes the program until it crashes, hits the next breakpoint, or exits.
    1. next executes the next source statement, stepping over function calls.
    2. step executes the next source statement, stepping into function calls.

      stepi executes a single machine instruction instead of a source statement.

    3. finish runs until the current function returns and then pauses.
  10. print shows values of variables.

    (gdb) print ngx_cycle
    $12 = (volatile ngx_cycle_t *) 0xaaab04fa10e0
        
    (gdb) print * $12
    $13 = {conf_ctx = 0xaaab04fa2580, pool = 0xaaab04fa10a0, log = 0xaaab04fa10f8, new_log = {log_level = 6, file = 0xaaab04fa15f0, connection = 0, disk_full_time = 0, handler = 0x0, data = 0x0, writer = 0x0, wdata = 0x0, action = 0x0, next = 0x0}, log_use_stderr = 0,
      files = 0x0, free_connections = 0xffff705cbb78, free_connection_n = 15840, modules = 0xaaab04fa2cb0, modules_n = 89, modules_used = 1, reusable_connections_queue = {prev = 0xaaab04fa1180, next = 0xaaab04fa1180}, reusable_connections_n = 0, connections_reuse_time = 0,
      listening = {elts = 0xaaab055dbf30, nelts = 16, size = 296, nalloc = 20, pool = 0xaaab04fa10a0, old_elts = 0xaaab055d45d0}, paths = {elts = 0xaaab04fa1530, nelts = 5, size = 8, nalloc = 10, pool = 0xaaab04fa10a0, old_elts = 0x0}, config_dump = {elts = 0xaaab04ffdd40,
        nelts = 7, size = 24, nalloc = 8, pool = 0xaaab04fa10a0, old_elts = 0xaaab04ffde30}, config_dump_rbtree = {root = 0xaaab04fc3850, sentinel = 0xaaab04fa1248, insert = 0xaaaadbfa0258 <ngx_str_rbtree_insert_value>}, config_dump_sentinel = {key = 0, left = 0x0,
        right = 0x0, parent = 0x0, color = 0 '\000', data = 0 '\000'}, open_files = {last = 0xaaab04fa1278, part = {elts = 0xaaab04fa15f0, nelts = 7, next = 0x0}, size = 40, nalloc = 20, pool = 0xaaab04fa10a0}, shared_memory = {last = 0xaaab051b0240, part = {
          elts = 0xaaab04fa1940, nelts = 1, next = 0xaaab04fc4a60}, size = 88, nalloc = 1, pool = 0xaaab04fa10a0}, connection_n = 16384, files_n = 0, connections = 0xffff7057f010, read_events = 0xffff703fe010, write_events = 0xffff7027d010, old_cycle = 0x0, conf_file = {
        len = 58, data = 0xaaab04fa1480 "/kong-ee/bazel-bin/build/kong-dev/kong/servroot/nginx.conf"}, conf_param = {len = 0, data = 0xaaab04fa14f0 ""}, conf_prefix = {len = 48, data = 0xaaab04fa13a0 "/kong-ee/bazel-bin/build/kong-dev/kong/servroot/"}, prefix = {len = 48,
        data = 0xaaab04fa13e0 "/kong-ee/bazel-bin/build/kong-dev/kong/servroot/"}, error_log = {len = 14, data = 0xaaab04fa1440 "logs/error.log"}, lock_file = {len = 64, data = 0xaaab055b4330 "/kong-ee/bazel-bin/build/kong-dev/kong/servroot/logs/nginx.lock.accept"},
      hostname = {len = 7, data = 0xaaab04fa2c70 "kong-ee"}, intercept_error_log_handler = 0x0, intercept_error_log_data = 0x0, entered_logger = 0}
    
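  11. x examines the memory at a given address. The format letter after the slash selects the interpretation, e.g. x/a prints an address and x/s prints a NUL-terminated string.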

    0x0000ffffe2a53890│+0x0000: 0x0000ffffe2a538d0  →  0x0000ffffe2a53950  →  0x0000ffffe2a539a0  →  0x0000ffffe2a539d0  →  0x0000ffffe2a53a40  →  0x0000ffffe2a53a80  →  0x0000ffffe2a53b90         ← $x29, $sp
    0x0000ffffe2a53898│+0x0008: 0x0000aaaaad984ac0  →  <ngx_epoll_process_events+0088> str w0,  [sp,  #64]
    0x0000ffffe2a538a0│+0x0010: 0x0000ffffe2a54028  →  0x0000ffffe2a54dcf  →  "nginx: worker process"
    0x0000ffffe2a538a8│+0x0018: 0x0000000000000005
    0x0000ffffe2a538b0│+0x0020: 0x0000aaaaadffde88  →  0x0000aaaaad931400  →  <__do_global_dtors_aux+0000> stp x29,  x30,  [sp,  #-32]!
    0x0000ffffe2a538b8│+0x0028: 0x0000aaaaad931574  →  <main+0000> sub sp,  sp,  #0x320
     
    gef➤  x/a 0x0000ffffe2a538a0
    0xffffe2a538a0: 0xffffe2a54028
    gef➤  x/a 0xffffe2a54028
    0xffffe2a54028: 0xffffe2a54dcf
     
    gef➤  x/s 0xffffe2a54dcf
    0xffffe2a54dcf: "nginx: worker process"
     
    gef➤  print (char *) 0x0000ffffe2a54dcf
    $5 = 0xffffe2a54dcf "nginx: worker process"
    
  12. info is very useful. Please see the Informational Commands section.
  13. record combined with the reverse-* commands (e.g. reverse-step) can execute the program backwards. Therefore, there is no need to re-run the program to inspect previous execution contexts.
  14. shell invokes a shell command, like shell ls -l.

gdb-common-commands.png

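Similar to Bash, gdb can keep a command history. After set history save on, commands are written to the file .gdb_history in the current directory by default, or to the file named by the GDBHISTFILE environment variable.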
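
To make the x/a and x/s steps from item 11 concrete, here is a minimal Python sketch of how a virtual address can be resolved against a Core Dump: find the PT_LOAD program header that covers the address, then read at the matching file offset. It assumes a little-endian ELF64 file with fewer than 65535 program headers. All function names are hypothetical, and build_demo_image only creates a tiny synthetic image for illustration, not a real Core Dump.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_segments(image: bytes):
    """Return (vaddr, file_offset, file_size) for each PT_LOAD header."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", image, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 54)
    segments = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", image, base)
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_offset, p_filesz))
    return segments


def read_memory(image, segments, addr, size):
    """Translate a virtual address to a file offset and read size bytes."""
    for vaddr, offset, filesz in segments:
        if vaddr <= addr and addr + size <= vaddr + filesz:
            start = offset + (addr - vaddr)
            return image[start:start + size]
    raise LookupError(f"address {addr:#x} is not backed by the file")


def examine_address(image, segments, addr):
    """Rough equivalent of x/a: read one 8-byte little-endian pointer."""
    return struct.unpack("<Q", read_memory(image, segments, addr, 8))[0]


def examine_string(image, segments, addr, limit=256):
    """Rough equivalent of x/s: read bytes up to the terminating NUL."""
    out = bytearray()
    while len(out) < limit:
        byte = read_memory(image, segments, addr + len(out), 1)
        if byte == b"\x00":
            break
        out += byte
    return out.decode("latin-1")


def build_demo_image():
    """Synthetic ELF64 image: one PT_LOAD segment mapped at 0x1000."""
    payload = struct.pack("<Q", 0x1008) + b"nginx: worker process\x00"
    ehdr = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
            + struct.pack("<HHIQQQIHHHHHH",
                          4, 62, 1, 0, 64, 0, 0, 64, 56, 1, 0, 0, 0))
    phdr = struct.pack("<IIQQQQQQ", PT_LOAD, 4, 120, 0x1000, 0,
                       len(payload), len(payload), 1)
    return ehdr + phdr + payload
```

With the synthetic image, examine_address at 0x1000 yields the pointer 0x1008, and examine_string at 0x1008 yields the string, which mirrors the x/a then x/s chain shown in item 11. GDB does considerably more than this sketch, for example reading pages from the binary and shared libraries when they are not stored in the Core Dump.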

Informational Commands

  1. Process info

    (gdb)  info inferiors
      Num  Description       Connection           Executable
    * 1    process 2436      1 (core)             /usr/local/openresty/nginx/sbin/nginx
    
  2. Architecture info

    gef➤  arch get
    Arch: Architecture(X86, 64, LITTLE_ENDIAN)
    Reason: The architecture has been detected via the ELF headers
       
    gef➤  arch list
    Available architectures:
     Architecture(ARM, ARM, LITTLE_ENDIAN)
      ARM
     Architecture(ARM64, None, LITTLE_ENDIAN)
      ARM64
      AARCH64
     Architecture(MIPS, MIPS64, LITTLE_ENDIAN)
      MIPS64
     Architecture(MIPS, MIPS32, LITTLE_ENDIAN)
      MIPS
     Architecture(PPC, PPC64, LITTLE_ENDIAN)
      PowerPC64
      PPC64
     Architecture(PPC, PPC32, LITTLE_ENDIAN)
      PowerPC
      PPC
     Architecture(RISCV, RISCV, LITTLE_ENDIAN)
      RISCV
     Architecture(SPARC, V9, LITTLE_ENDIAN)
      SPARC64
     Architecture(SPARC, None, LITTLE_ENDIAN)
      SPARC
     Architecture(X86, 64, LITTLE_ENDIAN)
      X86_64
      i386:x86-64
     Architecture(X86, 32, LITTLE_ENDIAN)
      X86
    
  3. Registers info

    gef➤  registers $r12
    $r12   : 0x00007fffffffdd38  →  0x00007fffffffe07e  →  "/home/ubuntu/misc/test-gdb.out"
       
    (gdb) info registers $r12
    r12            0x7fffffffdd38      0x7fffffffdd38
    (gdb) x/a 0x7fffffffdd38
    0x7fffffffdd38: 0x7fffffffe07e
    (gdb) x/s 0x7fffffffe07e
    0x7fffffffe07e: "/home/ubuntu/misc/test-gdb.out"
    
  4. ELF info

    gef➤  elf-info
    Magic                 : 7f 45 4c 46
    Class                 : 0x2 - ELF_64_BITS
    Endianness            : 0x1 - LITTLE_ENDIAN
       
    ...
       
    ───────────────────────────────────────────────────────────────────────────────────────────────── Program Header ──────────────────────────────────────────────────────────────────────────────────────
      [ #] Type           Offset   Virtaddr   Physaddr  FileSiz   MemSiz Flags    Align
      [ 0] PT_PHDR          0x40       0x40       0x40    0x2d8    0x2d8 PF_R       0x8
      [ 1] PT_INTERP       0x318      0x318      0x318     0x1c     0x1c PF_R       0x1
       
    ...
       
    -──────────────────────────────────────────────────────────────────────────────────────────────── Section Header ──────────────────────────────────────────────────────────────────────────────────────
      [ #] Name                            Type    Address   Offset     Size   EntSiz Flags Link Info    Align
      [ 0]                                 UNKN        0x0      0x0      0x0      0x0 UNKNOWN_FLAG  0x0  0x0      0x0
      [ 1] .interp                 SHT_PROGBITS      0x318    0x318     0x1c      0x0 ALLOC  0x0  0x0      0x1
      [ 2] .note.gnu.property          SHT_NOTE      0x338    0x338     0x30      0x0 ALLOC  0x0  0x0      0x8
      [ 3] .note.gnu.build-id          SHT_NOTE      0x368    0x368     0x24      0x0 ALLOC  0x0  0x0      0x4
      [ 4] .note.ABI-tag               SHT_NOTE      0x38c    0x38c     0x20      0x0 ALLOC  0x0  0x0      0x4
      [ 5] .gnu.hash               SHT_GNU_HASH      0x3b0    0x3b0   0x33a0      0x0 ALLOC  0x6  0x0      0x8
         
    ...
    

More Tools

Apart from GDB, we have a bunch of other tools to inspect objects.

To make it simple, we can even use vim to inspect the file.

hexdump

hexdump can dump file contents in various formats (e.g. hexadecimal), including customized format with the option --format.

The following example dumps the contents in the canonical hex+ASCII format (option -C).

~ $ hexdump -C core.pid | less 

000140b0  f0 b8 1d 00 00 00 00 00  0a 00 00 00 00 00 00 00  |................|
000140c0  00 5f 49 54 4d 5f 64 65  72 65 67 69 73 74 65 72  |._ITM_deregister|
000140d0  54 4d 43 6c 6f 6e 65 54  61 62 6c 65 00 5f 5f 67  |TMCloneTable.__g|
000140e0  6d 6f 6e 5f 73 74 61 72  74 5f 5f 00 5f 49 54 4d  |mon_start__._ITM|
000140f0  5f 72 65 67 69 73 74 65  72 54 4d 43 6c 6f 6e 65  |_registerTMClone|
00014100  54 61 62 6c 65 00 63 72  79 70 74 5f 72 00 6c 75  |Table.crypt_r.lu|
00014110  61 5f 70 75 73 68 66 73  74 72 69 6e 67 00 6c 75  |a_pushfstring.lu|
00014120  61 5f 73 65 74 65 78 64  61 74 61 00 6c 75 61 5f  |a_setexdata.lua_|
00014130  67 65 74 66 69 65 6c 64  00 6c 75 61 4c 5f 70 75  |getfield.luaL_pu|
00014140  73 68 72 65 73 75 6c 74  00 6c 75 61 5f 78 6d 6f  |shresult.lua_xmo|
00014150  76 65 00 6c 75 61 5f 72  65 73 75 6d 65 00 6c 75  |ve.lua_resume.lu|
00014160  61 4c 5f 61 64 64 6c 73  74 72 69 6e 67 00 6c 75  |aL_addlstring.lu|
00014170  61 4c 5f 6f 70 74 6e 75  6d 62 65 72 00 6c 75 61  |aL_optnumber.lua|
00014180  5f 6e 65 77 75 73 65 72  64 61 74 61 00 6c 75 61  |_newuserdata.lua|
00014190  5f 73 65 74 66 69 65 6c  64 00 6c 75 61 5f 73 65  |_setfield.lua_se|
000141a0  74 74 6f 70 00 6c 75 61  5f 70 75 73 68 73 74 72  |ttop.lua_pushstr|
000141b0  69 6e 67 00 6c 75 61 6f  70 65 6e 5f 66 66 69 00  |ing.luaopen_ffi.|
000141c0  6c 75 61 5f 67 65 74 69  6e 66 6f 00 6c 75 61 5f  |lua_getinfo.lua_|
000141d0  69 73 6e 75 6d 62 65 72  00 6c 75 61 4c 5f 63 68  |isnumber.luaL_ch|
000141e0  65 63 6b 6c 73 74 72 69  6e 67 00 6c 75 61 5f 6e  |ecklstring.lua_n|
000141f0  65 78 74 00 6c 75 61 5f  74 6f 74 68 72 65 61 64  |ext.lua_tothread|
00014200  00 6c 75 61 5f 67 63 00  6c 75 61 5f 63 72 65 61  |.lua_gc.lua_crea|

However, all the non-printable characters are shown as dots in the ASCII column.

~ $ for ((i=0; i<=255; ++i)); do printf "\x$(printf %x $i)"; done | hexdump -C
00000000  00 01 02 03 04 05 06 07  08 09 0a 0b 0c 0d 0e 0f  |................|
00000010  10 11 12 13 14 15 16 17  18 19 1a 1b 1c 1d 1e 1f  |................|
00000020  20 21 22 23 24 25 26 27  28 29 2a 2b 2c 2d 2e 2f  | !"#$%&'()*+,-./|
00000030  30 31 32 33 34 35 36 37  38 39 3a 3b 3c 3d 3e 3f  |0123456789:;<=>?|
00000040  40 41 42 43 44 45 46 47  48 49 4a 4b 4c 4d 4e 4f  |@ABCDEFGHIJKLMNO|
00000050  50 51 52 53 54 55 56 57  58 59 5a 5b 5c 5d 5e 5f  |PQRSTUVWXYZ[\]^_|
00000060  60 61 62 63 64 65 66 67  68 69 6a 6b 6c 6d 6e 6f  |`abcdefghijklmno|
00000070  70 71 72 73 74 75 76 77  78 79 7a 7b 7c 7d 7e 7f  |pqrstuvwxyz{|}~.|
00000080  80 81 82 83 84 85 86 87  88 89 8a 8b 8c 8d 8e 8f  |................|
00000090  90 91 92 93 94 95 96 97  98 99 9a 9b 9c 9d 9e 9f  |................|
000000a0  a0 a1 a2 a3 a4 a5 a6 a7  a8 a9 aa ab ac ad ae af  |................|
000000b0  b0 b1 b2 b3 b4 b5 b6 b7  b8 b9 ba bb bc bd be bf  |................|
000000c0  c0 c1 c2 c3 c4 c5 c6 c7  c8 c9 ca cb cc cd ce cf  |................|
000000d0  d0 d1 d2 d3 d4 d5 d6 d7  d8 d9 da db dc dd de df  |................|
000000e0  e0 e1 e2 e3 e4 e5 e6 e7  e8 e9 ea eb ec ed ee ef  |................|
000000f0  f0 f1 f2 f3 f4 f5 f6 f7  f8 f9 fa fb fc fd fe ff  |................|
00000100
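
The layout of the -C output can be reproduced with a short Python sketch, which makes the dot substitution explicit: only bytes 0x20 to 0x7e are printed as characters. The function name hexdump_c is hypothetical, and unlike the real hexdump this sketch does not collapse repeated lines into a single *.

```python
def hexdump_c(data: bytes, base=0):
    """Format bytes like `hexdump -C`, one list entry per output line."""
    if not data:
        return []
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        left = " ".join(f"{b:02x}" for b in chunk[:8])
        right = " ".join(f"{b:02x}" for b in chunk[8:])
        # Pad short final rows so the ASCII column stays aligned.
        hex_part = f"{left}  {right}".ljust(48)
        # Non-printable bytes are shown as dots in the ASCII column.
        text = "".join(chr(b) if 0x20 <= b <= 0x7e else "." for b in chunk)
        lines.append(f"{base + off:08x}  {hex_part}  |{text}|")
    # The last line is the total length as an offset.
    lines.append(f"{base + len(data):08x}")
    return lines
```

Calling hexdump_c(bytes(range(256))) produces the same rows as the shell loop above.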

The following example shows that the object contains a lot of zeros.

~ $ hexdump -C 0x577af3685000-0x577af8547000.bin | less
00003b90  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00003ba0  00 00 00 00 00 00 00 00  91 00 00 00 00 00 00 00  |................|
00003bb0  30 8c 68 f3 7a 57 00 00  30 8c 68 f3 7a 57 00 00  |0.h.zW..0.h.zW..|
00003bc0  a0 8c 68 f3 7a 57 00 00  06 00 00 00 00 00 00 00  |..h.zW..........|
00003bd0  31 37 32 2e 31 36 2e 30  2e 31 30 3a 35 33 00 00  |172.16.0.10:53..|
00003be0  00 00 00 00 00 00 00 00  98 8a 68 f3 7a 57 00 00  |..........h.zW..|
00003bf0  40 8c 68 f3 7a 57 00 00  00 00 00 00 00 00 00 00  |@.h.zW..........|
00003c00  40 72 9b 6b 0b 73 00 00  40 8b 68 f3 7a 57 00 00  |@r.k.s..@.h.zW..|
00003c10  31 37 32 2e 31 36 2e 30  2e 31 30 00 00 00 00 00  |172.16.0.10.....|
00003c20  02 00 00 35 ac 10 00 0a  00 00 00 00 00 00 00 00  |...5............|
00003c30  00 00 00 00 00 00 00 00  61 00 00 00 00 00 00 00  |........a.......|
00003c40  0e 00 00 00 00 00 00 00  d0 8b 68 f3 7a 57 00 00  |..........h.zW..|
00003c50  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00003c60  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00003c70  88 8b 68 f3 7a 57 00 00  10 00 00 00 00 00 00 00  |..h.zW..........|
00003c80  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00003c90  00 00 00 00 00 00 00 00  91 00 00 00 00 00 00 00  |................|
00003ca0  20 8d 68 f3 7a 57 00 00  20 8d 68 f3 7a 57 00 00  | .h.zW.. .h.zW..|
00003cb0  90 8d 68 f3 7a 57 00 00  06 00 00 00 00 00 00 00  |..h.zW..........|
00003cc0  20 8c 68 f3 7a 57 00 00  10 00 00 00 00 00 00 00  | .h.zW..........|
00003cd0  0e 00 00 00 00 00 00 00  e0 8c 68 f3 7a 57 00 00  |..........h.zW..|
00003ce0  31 37 32 2e 31 36 2e 30  2e 31 30 3a 35 33 00 00  |172.16.0.10:53..|
00003cf0  00 00 00 00 00 00 00 00  e8 8b 68 f3 7a 57 00 00  |..........h.zW..|
00003d00  30 8d 68 f3 7a 57 00 00  00 00 00 00 00 00 00 00  |0.h.zW..........|
00003d10  f8 31 98 69 0b 73 00 00  f8 8b 68 f3 7a 57 00 00  |.1.i.s....h.zW..|
00003d20  00 00 00 00 00 00 00 00  61 00 00 00 00 00 00 00  |........a.......|

strings

Although hexdump shows ASCII characters, extracting the ASCII column is not easy. GNU strings, on the other hand, prints only the printable character sequences, which helps identify large strings in the Core Dump.

The -t option prefixes each string with its offset in the file, using the given radix (d for decimal, x for hexadecimal, o for octal). This is useful for showing the distribution and density of strings.

~ $ strings -t d /path/to/core.pid > core.pid.density
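To turn the offsets into a rough density view, the output can be grouped into fixed-size buckets. The following is only a sketch: the function name string_density and the 1 MiB default bucket size are assumptions made for illustration, not part of any tool.

```shell
# string_density FILE [BUCKET_BYTES]
# Count how many printable strings start in each BUCKET_BYTES-sized slice
# of FILE. Output lines are "bucket_start_offset count", sorted by offset.
string_density() {
    file=$1
    bucket=${2:-1048576}   # assumed default: 1 MiB buckets
    strings -t d "$file" |
        awk -v size="$bucket" '
            # $1 is the decimal offset printed by "strings -t d"
            { count[int($1 / size)]++ }
            # %.0f avoids integer overflow in some awk builds on multi-GB files
            END { for (b in count) printf "%.0f %d\n", b * size, count[b] }
        ' | sort -n
}

# Example call (the path is a placeholder):
# string_density /path/to/core.pid
```

Buckets with an unusually high count are good candidates for a closer look with hexdump.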

~ $ strings [-n 4] /path/to/core.pid > core.pid.ascii

~ $ sort -o core.pid.ascii.sorted core.pid.ascii

~ $ uniq -c core.pid.ascii.sorted > core.pid.ascii.sorted.uniq

~ $ sort -nrk1,1 -o core.pid.ascii.sorted.uniq.sorted core.pid.ascii.sorted.uniq

However, non-printable characters are not included in the output.
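The extract, sort, count and rank steps above can also be chained into a single pipeline without intermediate files. This is a minimal sketch; the function name top_strings and its default values are assumptions chosen for illustration.

```shell
# top_strings FILE [MIN_LEN] [COUNT]
# Print the COUNT most frequent printable strings of at least MIN_LEN
# characters in FILE, most frequent first, each prefixed with its count.
top_strings() {
    file=$1
    min_len=${2:-4}    # same as the default of strings
    count=${3:-20}     # assumed default: show the top 20
    strings -n "$min_len" "$file" | sort | uniq -c | sort -nrk1,1 | head -n "$count"
}

# Example call (the path is a placeholder):
# top_strings /path/to/core.pid 8 50
```

A string that repeats thousands of times in a growing Core Dump is often a hint about which code path keeps allocating.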

readelf

readelf is architecture independent, but unlike objdump it cannot show source code.

In addition, gef can print ELF information.

gef➤  elf-info

For more readelf examples, please see "biji".

objdump

Unlike readelf, objdump can interleave source code with the disassembly when the object has debug info. Here is an example.

~ $ objdump --source --source-comment=txt test-gdb.out

...

0000000000001149 <f>:
txt#include <stdio.h>
txt
txtint f(int a, int b)
txt{
    1149:       f3 0f 1e fa             endbr64
txt     int sum;
txt     sum = a + b;
    114d:       8d 04 37                lea    (%rdi,%rsi,1),%eax
txt     return sum;
txt}
    1150:       c3                      ret

...

The GDB commands disassemble and list can also show assembly code and source code.

Memory

We can make use of top to show the general memory utilization.

~ $ top -o +RES -p $(pgrep -d',' nginx)

Tasks:   2 total,   0 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.9 us,  1.8 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7841.2 total,   1885.2 free,   4823.7 used,   1392.9 buff/cache
MiB Swap:   1024.0 total,    105.4 free,    918.6 used.   3017.5 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
37937 kong      20   0 1393704 414628  11904 S   0.0   5.2  14:36.72 nginx
37936 kong      20   0 1023736 108848    896 S   0.0   1.4   0:00.00 nginx

Memory mappings describe how the virtual address space of a process is laid out and tell us how the memory is used. Both GDB and gef can show them. Of the three commands below, maintenance info sections shows only the sections of the ELF object. Its output corresponds to just the first few lines of vmmap and info proc mappings, which also include dynamically allocated memory (e.g. heap and stack).

(gdb) maintenance info sections

gef➤  vmmap

(gdb) info proc mappings

Without GDB, we can show the memory mappings of a running process with pmap.

~ $ pmap -X $(pgrep nginx) > all-$(date -Iseconds).pmap

Core dumps captured at different timestamps reveal newly allocated chunks. Moreover, we can use hexdump, strings and even vim to see what is inside the new chunks. But how do we identify the newly allocated chunks?

The following are two excerpts from two Core Dumps. We will use this data as an example to show the process.

# Core Dump 1
[14]      0x577ae51df000->0x577ae521a000 at 0x00002a40: load1 ALLOC LOAD READONLY HAS_CONTENTS
[15]      0x577ae54fc000->0x577ae54ff000 at 0x0003da40: load2 ALLOC LOAD READONLY HAS_CONTENTS
[16]      0x577ae54ff000->0x577ae551e000 at 0x00040a40: load3 ALLOC LOAD HAS_CONTENTS
[17]      0x577ae551e000->0x577ae5550000 at 0x0005fa40: load4 ALLOC LOAD HAS_CONTENTS
[18]      0x577ae63cd000->0x577ae66a0000 at 0x00091a40: load5 ALLOC LOAD HAS_CONTENTS
[19]      0x577ae66a0000->0x577af3685000 at 0x00364a40: load6 ALLOC LOAD HAS_CONTENTS
[20]      0x730b65ac0000->0x730b671c1000 at 0x0d349a40: load7 ALLOC LOAD HAS_CONTENTS

# Core Dump 2
[14]      0x577ae51df000->0x577ae521a000 at 0x00002a78: load1 ALLOC LOAD READONLY HAS_CONTENTS
[15]      0x577ae54fc000->0x577ae54ff000 at 0x0003da78: load2 ALLOC LOAD READONLY HAS_CONTENTS
[16]      0x577ae54ff000->0x577ae551e000 at 0x00040a78: load3 ALLOC LOAD HAS_CONTENTS
[17]      0x577ae551e000->0x577ae5550000 at 0x0005fa78: load4 ALLOC LOAD HAS_CONTENTS
[18]      0x577ae63cd000->0x577ae66a0000 at 0x00091a78: load5 ALLOC LOAD HAS_CONTENTS
[19]      0x577ae66a0000->0x577af8547000 at 0x00364a78: load6 ALLOC LOAD HAS_CONTENTS
[20]      0x730b64d1d000->0x730b671c1000 at 0x1220ba78: load7 ALLOC LOAD HAS_CONTENTS

Mapping [19] (load6) ends at a higher address in Core Dump 2 than in Core Dump 1, while its start address is unchanged. We can calculate the increase in bytes.

~ $ bc <<< 'obase=16; ibase = 16; 0X577AF8547000 - 0X577AF3685000'
4EC2000

~ $ bc <<< 'obase=10; ibase = 16; 0X577AF8547000 - 0X577AF3685000'
82583552
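The same comparison can be scripted so that every changed mapping is reported at once. The sketch below assumes each listing was saved to a text file in the format of the excerpts above and that section names are unique within a listing; the function names section_ranges and section_growth are made up for this example.

```shell
# section_ranges FILE
# Reduce a saved "maintenance info sections" listing to "name start end"
# lines, sorted by section name.
section_ranges() {
    sed -nE 's/^ *\[[0-9]+\] +(0x[0-9a-f]+)->(0x[0-9a-f]+) at 0x[0-9a-f]+: ([^ ]+).*/\3 \1 \2/p' "$1" |
        LC_ALL=C sort -k1,1
}

# section_growth OLD_LISTING NEW_LISTING
# Print "name growth_in_bytes" for every section whose size differs.
section_growth() {
    a=$(mktemp)
    b=$(mktemp)
    section_ranges "$1" > "$a"
    section_ranges "$2" > "$b"
    # join pairs the two listings by section name:
    # name start1 end1 start2 end2
    LC_ALL=C join "$a" "$b" | while read -r name s1 e1 s2 e2; do
        delta=$(( ($e2 - $s2) - ($e1 - $s1) ))
        if [ "$delta" -ne 0 ]; then
            echo "$name $delta"
        fi
    done
    rm -f "$a" "$b"
}

# Example call (file names are placeholders):
# section_growth core1.sections core2.sections
```

For the two excerpts above this reports load6 with 82583552 bytes and load7 with 14299136 bytes. load7 grew downward: its end address is unchanged but its start address is lower in Core Dump 2. A listing can be saved non-interactively with something like gdb -batch -ex 'maintenance info sections' /path/to/executable /path/to/core.pid > core1.sections, where both paths are placeholders.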

Now let us extract the newly allocated memory from Core Dump 2 and save it to the local disk.

(gdb) dump memory ~/misc/0x577af3685000-0x577af8547000.bin 0x577af3685000 0x577af8547000
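Before analyzing the file, it is worth checking that it has the expected size. As far as I can tell, dump memory treats the end address as exclusive, so the file should hold exactly end minus start bytes. The helper below is a small sketch; the name check_dump_size is an assumption for illustration.

```shell
# check_dump_size FILE START END
# Compare the size of FILE with the size of the address range [START, END).
check_dump_size() {
    expected=$(( $3 - $2 ))
    # The arithmetic expansion strips padding that some wc builds print.
    actual=$(( $(wc -c < "$1") ))
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual bytes"
    else
        echo "mismatch: file has $actual bytes, range spans $expected bytes"
        return 1
    fi
}

# Example call (the path is a placeholder):
# check_dump_size ~/misc/0x577af3685000-0x577af8547000.bin 0x577af3685000 0x577af8547000
```

A mismatch usually means that the dump was interrupted or that the wrong address range was given.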

With the extracted memory range, we can turn to the tools described in More Tools.

Please check "whatever" and "kong-dev" for more details. I strongly recommend reading Memory Leak (and Growth) Flame Graphs.

Nginx

(gdb) p ngx_cycle->pool
$1 = (ngx_pool_t *) 0x577ae63f3ff0
(gdb) p * ngx_cycle->pool
$2 = {d = {last = 0x577ae63f7ff0 "", end = 0x577ae63f7ff0 "", next = 0x577ae63fd020, failed = 6}, max = 4095, current = 0x577ae643f5f0, chain = 0x0, large = 0x577ae64c52d8, cleanup = 0x577ae64c5980, log = 0x577ae63f4058}
(gdb)
src/ngx_http_lua_socket_udp.c
 -> ngx_http_lua_socket_udp_setpeername
   ->     r = ngx_http_lua_get_req(L);
      ->     host.data = ngx_palloc(r->pool, len + 1);
(gdb) p * (ngx_http_upstream_resolved_t *) 0x577af3685060
$4 = {host = {len = 3471771947605571377, data = 0x33353a30312e <error: Cannot access memory at address 0x33353a30312e>}, port = 0, no_port = 96185581326184, naddrs = 96185581326496, addrs = 0x0, sockaddr = 0x0, socklen = 97, name = {len = 14,
    data = 0x577af3685060 "172.16.0.10:53"}, ctx = 0x0}

Appendix

Please refer to How to look at the stack with gdb.

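#include <stdio.h>
#include <stdlib.h>

int main() {
    char stack_string[10] = "stack";   /* lives on the stack */
    int x = 10;                        /* stack neighbour of stack_string */
    char *heap_string;

    heap_string = malloc(50);          /* lives on the heap */

    /* gets() does no bounds checking and was removed in C11. It is
       presumably used here so that over-long input overflows the
       buffers, which can then be inspected in GDB. */
    printf("Enter a string for the stack: ");
    gets(stack_string);
    printf("Enter a string for the heap: ");
    gets(heap_string);
    printf("Stack string is: %s\n", stack_string);
    printf("Heap string is: %s\n", heap_string);
    printf("x is: %d\n", x);
}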