Debugging

Using the GNU Debugger

If you think you have found a problem in OPS or Nanos, the first thing to do is verify where the bug lies. Generally, if you turn on the -d flag and study the bottom of the output, you can tell whether the fault is in user or kernel mode:
    1 general protection fault in user mode, rip 0x13783afe7
In this example we clearly see a general protection fault (GPF) in user mode.
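As a minimal sketch, this check can be automated with a small shell helper. The function name `fault_mode` is hypothetical. It assumes a user fault contains the phrase "in user mode", as in the sample line above; the "in kernel mode" pattern is a guess at the kernel-side wording, so adjust it to whatever your Nanos version prints.

```shell
# Classify a fault line taken from the bottom of the debug output.
# Assumption: user faults contain "in user mode" (as in the sample above);
# the "in kernel mode" pattern is a guess and may need adjusting.
fault_mode() {
    case "$1" in
        *"in user mode"*)   echo "user" ;;
        *"in kernel mode"*) echo "kernel" ;;
        *)                  echo "unknown" ;;
    esac
}

# Prints "user" for the sample line.
fault_mode "general protection fault in user mode, rip 0x13783afe7"
```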
At the moment, we do not have interactive debugging support in ops, but we plan on integrating it soon. For now, if you wish to have symbol access within your user program, the following conditions must be met:
  • Ensure your program is statically linked.
  • Ensure you have debugging symbols to begin with:
In C, you can do both by compiling with the -static and -g flags, as in the example below.
Example
For this example we will examine a segfault (that we purposely injected):
    #include <stdio.h>
    #include <stdlib.h>

    void mybad() {
      int x = 1;
      char *stuff = "asdf";

      printf("about to die\n");
      *(int*)0 = 0;
    }

    int main(void) {
      mybad();
      printf("should not get here\n");

      return 0;
    }
We compile with debugging symbols and link statically:
    cc main.c -static -g -o main
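Before going further, it can help to confirm that the binary meets both conditions. The following is a rough sketch that assumes a Linux host with the standard `file` and `readelf` tools installed; the function name `check_debug_ready` is hypothetical.

```shell
# Check that a binary is statically linked and carries DWARF debug info.
# Assumes an ELF binary and the `file` and `readelf` utilities.
check_debug_ready() {
    bin="$1"
    # `file` reports "statically linked" for static executables.
    if ! file "$bin" 2>/dev/null | grep -q 'statically linked'; then
        echo "$bin: not statically linked (or not found)"
        return 1
    fi
    # A .debug_info section means debugging symbols are present.
    if ! readelf -S "$bin" 2>/dev/null | grep -q '\.debug_info'; then
        echo "$bin: no debugging symbols"
        return 1
    fi
    echo "$bin: ok"
}

# Usage: check_debug_ready main
```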
First (since we are missing interactive debug support in ops) you need to modify ops to manually add the noaslr flag in lepton/image.go:
    m.AddDebugFlag("noaslr", 't')
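This is important because otherwise address space layout randomization (ASLR) places the .text section and other parts of your program at random locations, so the addresses in your symbol file would not match the running program.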
Next, we'll run without accel support:
    ops run --accel=false main
Then we let it crash.
Now let's manually start qemu with gdb support. Not all of these options are necessary, but make sure the '-drive file=' option points to where your disk image is.
The '-s' flag starts qemu's gdb server on TCP port 1234, and '-S' keeps the CPU paused at startup until gdb tells it to continue:
    qemu-system-x86_64 \
        -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
        -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1 \
        -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2 \
        -device virtio-scsi-pci,bus=pci.2,addr=0x0,id=scsi0 \
        -device scsi-hd,bus=scsi0.0,drive=hd0 -no-reboot -cpu max -machine q35 \
        -device isa-debug-exit -m 2G \
        -drive file=/home/eyberg/.ops/images/main.img,format=raw,if=none,id=hd0 \
        -device virtio-net,bus=pci.3,addr=0x0,netdev=n0 \
        -netdev user,id=n0,hostfwd=tcp::8080-:8080,hostfwd=tcp::9090-:9090,hostfwd=udp::5309-:5309 \
        -display none -serial stdio -s -S
qemu will now pause, waiting for gdb to attach to it.
In another window, we'll launch gdb pointing it at whatever kernel you are using:
    gdb ~/.ops/0.1.27/kernel.img
We'll connect to qemu by specifying the remote:
    (gdb) target remote localhost:1234
    Remote debugging using localhost:1234
    0x000000000000fff0 in ?? ()
Great - now we are connected.
Now we load the symbols for our program:
    (gdb) symbol-file main
    Load new symbol table from "main"? (y or n) y
    Reading symbols from main...
You can see the source now:
    (gdb) list
    1       #include <stdio.h>
    2       #include <stdlib.h>
    3
    4       void mybad() {
    5         int x = 1;
    6         char *stuff = "asdf";
    7
    8         printf("about to die\n");
    9         *(int*)0 = 0;
Let's set a breakpoint for 'mybad':
    (gdb) b mybad
    Breakpoint 1 at 0x401d35: file main.c, line 4.
Now let's continue:
    (gdb) c
    Continuing.

    Breakpoint 1, mybad () at main.c:4
    4       void mybad() {
You should see in your other window (where you started qemu) that the program starts:
    assigned: 10.0.2.15
Now if we single step through the program we can print out various locals:
    4       void mybad() {
    (gdb) step
    5         int x = 1;
    (gdb) step
    6         char *stuff = "asdf";
    (gdb) step
    8         printf("about to die\n");
    (gdb) p x
    $1 = 1
    (gdb) p stuff
    $2 = 0x495004 "asdf"
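If you repeat this session often, the gdb commands shown above can be collected in a command file. This is only a sketch: the file name `debug.gdb` is an arbitrary choice, and it assumes qemu is already waiting on port 1234 and that the `main` binary is in the current directory.

```shell
# Write the gdb commands used in the session above to a command file.
cat > debug.gdb <<'EOF'
target remote localhost:1234
symbol-file main
break mybad
continue
EOF

# Then start gdb with your kernel image and the command file, for example:
#   gdb -x debug.gdb ~/.ops/0.1.27/kernel.img
```

When commands are read from a file, gdb does not stop to ask for confirmation before loading the new symbol table, so the session runs straight through to the breakpoint.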
This should get you further down the path when you find various bugs in your program. Hope this helps.
If your program isn't even starting there might be an issue with OPS.
However, if you open an issue in Nanos please provide the following:
Your OPS profile:
    ➜ ~ ops profile
    Ops version: 0.1.9
    Nanos version: 0.1.25
    Qemu version: 4.1.90
    Arch: darwin
Nightly vs Master?
Does this work on the nightly build? Running '-n' will run ops with whatever was in the master branch last night.
Reproducible steps

GUI Debugger in VSCode

This part of the debugging guide shows how to use the GNU Debugger (GDB) in combination with VSCode to get better visualization of the debugging process. It requires that the Native Debug extension is installed.

Prerequisite

Launch the application in debug mode with ops:
    $ ops run -d main
    booting ~/.ops/images/main.img ...
    You have disabled hardware acceleration

    Waiting for gdb connection. Connect to qemu through "(gdb) target remote localhost:1234"
    See further instructions in https://nanovms.gitbook.io/ops/debugging

Attach the Debugger

  1. Click the Run icon on the left sidebar (alternatively use Ctrl+Shift+D) and then create a launch.json file.
  2. Select GDB as the environment. This will create an autogenerated launch.json file.
  3. Replace the contents of the launch.json file with the following:
        {
          // Use IntelliSense to learn about possible attributes.
          // Hover to view descriptions of existing attributes.
          // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
          "version": "0.2.0",
          "configurations": [
            {
              "name": "Debug",
              "type": "gdb",
              "request": "attach",
              "executable": "${workspaceFolder}/main",
              "target": "localhost:1234",
              "remote": true,
              "cwd": "${workspaceRoot}",
              "valuesFormatting": "parseText"
            }
          ]
        }
  4. Set a breakpoint in the source file (main.c), then start the debugging session from the Run view on the left sidebar (alternatively use Ctrl+Shift+D) by clicking the > Debug icon.
It is now possible to use the debugging palette to debug the application code.

Dump

Ops provides a tool that allows you to inspect image crash logs and image manifests. The dump tool binary is inside the ops version directory (~/.ops/<ops_version>/dump). Make sure you use the dump tool from the same ops version you used to build the image you want to analyze.
If the application crashes, the unikernel writes the error stack to a log file before exiting. You can view the log content by running the command dump -l <image_path>.
    $ dump -l ~/.ops/main.img
    detected filesystem at 0x11600
    klog offset: 0x10600
    klog size: 4096 bytes
    boot id: 0
    exit code: 1

    en1: assigned 10.0.2.15
    2021/03/22 11:35:35 Failed
The image manifest has details about the image like:
  • files and their paths in the filesystem;
  • environment variable values, including nanos base image version used;
  • program and arguments to run on initialization;
  • static IP, gateway and netmask to configure the network interface;
  • etc.
You can look into your image manifest by using the command dump <image_path>.
    $ dump ~/.ops/main.img
    detected filesystem at 0xc11600
    Label:
    UUID: ad63ee2d-58d1-336f-7484-9fc81f3bc835
    metadata (
      program:/main
      arguments:(0:main)
      environment:(PWD:/ NANOS_VERSION:0.1.32 USER:root)
      children:(
        .:
        proc:(children:(
          .:
          ..:
          self:(mtime:6942441066663300181 atime:6942441066663300181 children:(
            .:
            ..:
            exe:(mtime:6942441066669261459 atime:6942441066669261459 linktarget:/main)
          ))
          sys:(children:(
            .:
            ..:
            kernel:(children:(
              .:
              ..:
              hostname:(extents:(0:(allocated:1 length:1 offset:5668)) filelength:7)
            ))
          ))
        ))
        etc:(children:(
          ssl:(children:(
            .:
            ..:
            certs:(children:(
              ca-certificates.crt:(extents:(0:(allocated:406 length:406 offset:1078)) filelength:207436)
              .:
              ..:
            ))
          ))
          passwd:(extents:(0:(allocated:1 length:1 offset:1485)) filelength:33)
          .:
          ..:
          resolv.conf:(extents:(0:(allocated:1 length:1 offset:1484)) filelength:18)
        ))
        ..:
        lib:(children:(
          .:
          ..:
          x86_64-linux-gnu:(children:(
            libnss_dns.so.2:(extents:(0:(allocated:53 length:53 offset:1025)) filelength:26936)
            .:
            ..:
          ))
        ))
        main:(extents:(0:(allocated:4182 length:4182 offset:1486)) filelength:2140732)
      )
      booted:)
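Since the manifest records the Nanos base image version, one way to pull it out is a small text filter. This is a sketch only: the function name `nanos_version` is hypothetical, and it assumes the `NANOS_VERSION:<version>` form shown in the environment line above.

```shell
# Print the NANOS_VERSION value found in manifest text read from stdin.
nanos_version() {
    grep -o 'NANOS_VERSION:[0-9.]*' | head -n 1 | cut -d: -f2
}

# Sample line copied from the manifest output above; prints "0.1.32".
echo 'environment:(PWD:/ NANOS_VERSION:0.1.32 USER:root)' | nanos_version
```

Assuming dump writes the manifest to standard output, the same filter could be applied directly with `dump ~/.ops/main.img | nanos_version`.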

FUSE

Nanos has a FUSE driver that lets you mount a TFS image on the host filesystem. This helps with rapid development, debugging, and building other interesting tools.
To try it out on Linux:
    sudo apt-get install libfuse-dev
To try it out on Mac:
    brew install macfuse
Then you simply create a mount point and mount your desired image:
    mkdir -p /tmp/mnt
    ~/.ops/0.1.34/tfs-fuse /tmp/mnt ~/.ops/images/myimg.img
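Once the image is mounted, you can browse it like any other directory and unmount it when you are done. The helper below is a sketch using the standard unmount commands for FUSE filesystems on each platform; the function name `unmount_image` is hypothetical and the default mount point is the /tmp/mnt directory from the example above.

```shell
# List the image contents through the mount point, then unmount it.
# Assumption: the image was mounted at /tmp/mnt as shown above.
unmount_image() {
    mnt="${1:-/tmp/mnt}"
    ls -la "$mnt"
    if [ "$(uname)" = "Darwin" ]; then
        umount "$mnt"          # macOS
    else
        fusermount -u "$mnt"   # Linux (FUSE userspace tools)
    fi
}

# Usage: unmount_image /tmp/mnt
```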

Debug Logging

You can use the 'net console' logging feature if you'd like to log without printing via serial/vga. Simply specify it via:
    {
      "ManifestPassthrough": {
        "consoles": ["+net"]
      }
    }
Then you can capture it via netcat:
    nc -l -u 4444
By default, Nanos ships the log output to IP 10.0.2.2 (assuming user-mode networking) on port 4444.
You can adjust these via the following config, though:
    netconsole_port

Syscall Execution Time

OPS/Nanos has the ability to trace the number of syscall executions and timing information much like strace -c.
Here is a short C example:
    #include <stdio.h>

    int main() {
      printf("hello!\n");
    }
The output:
    $ ops run --syscall-summary main
    booting /home/eyberg/.ops/images/main.img ...
    en1: assigned 10.0.2.15
    hello!

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     48.53    0.000644         215         3           brk
     25.16    0.000334         334         1           read
     16.05    0.000213         213         1           write
      3.31    0.000044           6         7           mmap
      1.73    0.000023           6         4           pread64
      1.58    0.000021           4         5         4 openat
      0.97    0.000013          13         1           close
      0.90    0.000012           3         4           mprotect
      0.67    0.000009           9         1           uname
      0.37    0.000005           5         1         1 access
      0.22    0.000003           2         2           fstat
      0.22    0.000003           1         3         3 stat
      0.22    0.000003           2         2         1 arch_prctl
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.001327                    35         9 total
    exit status 1
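To read the table, it helps to see how the derived columns relate to the raw ones. As a rough sketch (the function name `summary_row` is hypothetical), "% time" is the syscall's share of the total seconds and "usecs/call" is its seconds divided by its call count:

```shell
# Recompute "% time" and "usecs/call" for one row of the summary.
# Arguments: seconds spent in the syscall, number of calls, total seconds.
summary_row() {
    awk -v s="$1" -v c="$2" -v t="$3" \
        'BEGIN { printf "%.2f %.0f\n", 100 * s / t, s * 1000000 / c }'
}

# brk row from the output above: prints "48.53 215".
summary_row 0.000644 3 0.001327
```

Because the seconds column is already rounded to microseconds, recomputing other rows this way can differ from the printed percentage in the last digit.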