Hello,

I am trying to make systemtap work with gentoo-kernel (or ideally all
dist kernels), and I got a few steps closer with the kernel-build.eclass
modification I sent this week [0]. However, there is still one issue:
the build-id of the running kernel does not match that of the
installed vmlinux file:

# stap mba_sc.stp
WARNING: Build-id mismatch [man warning::buildid]:
"/usr/src/linux-5.17.13-gentoo-dist/vmlinux" pid 0 address
0xffffffff8a7b572c, expected c43e775aad5e11755bf5cf1329d2240b519e7518
actual 3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed. [man error::pass5]

I also noticed that when kernel-build.eclass installs the vmlinux file,
a vmlinux.debug is also created (I presume by portage) using objcopy
--only-keep-debug --compress-debug-sections.

So now I have these relevant files on the system:

- /usr/src/linux-5.17.13-gentoo-dist/vmlinux
- /usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug
  (symlink to the first file)
- /usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug and
- /boot/vmlinuz-5.17.13-gentoo-dist
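Since the rest of this mail is about comparing the build IDs of files
like these, a small helper along the following lines could do the
comparison in one go. It is only a sketch: the function names
build_id_from_file_output and list_build_ids are made up for
illustration, and it assumes that file(1) reports the ID in the
"BuildID[sha1]=<hex>" form.

```sh
# Read `file` output on stdin and print the hex build ID, if there is one.
# Assumption: the ID appears as "BuildID[<hash>]=<lowercase hex>".
build_id_from_file_output() {
    sed -n 's/.*BuildID\[[^]]*]=\([0-9a-f]*\).*/\1/p'
}

# Print "<build-id>  <path>" for every path given as an argument.
# Relies on file(1): -L follows symlinks, -b omits the file name.
list_build_ids() {
    for f in "$@"; do
        printf '%s  %s\n' "$(file -L -b "$f" | build_id_from_file_output)" "$f"
    done
}
```

For example, list_build_ids could be given the vmlinux and vmlinux.debug
paths above; files that share a build ID then show the same first column.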

When I check the build ids (using readelf -n or just "file") of the
first three files I get:

/usr/src/linux-5.17.13-gentoo-dist/vmlinux:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, not stripped

/usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
not stripped

/usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
not stripped

which looks great except:

1) the first file does not say it is "with debug_info",

2) there is no reason to keep the original vmlinux in place since there
   is a smaller file that works as a substitute, but I'm not sure what
   a clean way to not install it would be, and most importantly

3) the running kernel has a different build id.

The last point is the main issue here. I tried to find out how to
check the build id of the running kernel, but haven't found a way
to do it through a kernel API, so instead I checked
/boot/vmlinuz-5.17.13-gentoo-dist like this:

~/dev/linux/scripts/extract-vmlinux /boot/vmlinuz-5.17.13-gentoo-dist >vmlinux.extracted

and for good measure also tried what objcopy does to it:

objcopy --only-keep-debug vmlinux.extracted vmlinux.extracted.debug
objcopy --only-keep-debug --compress-debug-sections vmlinux.extracted vmlinux.extracted.compressed

Now when I check, the build id differs from that of the first three
files, but it is unchanged by objcopy and is the same as the one
systemtap reports for the running kernel:

vmlinux.extracted:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped

vmlinux.extracted.compressed:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped

vmlinux.extracted.debug:
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
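A possible cross-check that does not go through /boot: the kernel
should expose its ELF notes, including the GNU build-id note, in
/sys/kernel/notes. The sketch below walks such a blob with od(1). It is
an illustration, not a verified tool: the function name kernel_build_id
is made up, it assumes the notes are stored in the machine's native
byte order, and it is plain POSIX sh, so its variables are global.

```sh
# Print the GNU build ID found in a raw ELF notes blob.
# Default input is /sys/kernel/notes (assumed to hold the running
# kernel's notes); returns 1 if no build-id note is found.
kernel_build_id() {
    notes=${1:-/sys/kernel/notes}
    off=0
    while :; do
        # Note header: three native-endian 32-bit words
        # (name size, descriptor size, note type).
        set -- $(od -An -v -tu4 -j "$off" -N 12 "$notes" 2>/dev/null)
        [ "$#" -eq 3 ] || return 1
        namesz=$1 descsz=$2 type=$3
        # Name and descriptor are each padded to a multiple of 4 bytes.
        name_off=$((off + 12))
        desc_off=$((name_off + (namesz + 3) / 4 * 4))
        name=$(od -An -v -tx1 -j "$name_off" -N "$namesz" "$notes" | tr -d ' \n')
        # Type 3 is NT_GNU_BUILD_ID; 474e5500 is "GNU\0" in hex.
        if [ "$type" -eq 3 ] && [ "$name" = 474e5500 ]; then
            od -An -v -tx1 -j "$desc_off" -N "$descsz" "$notes" | tr -d ' \n'
            echo
            return 0
        fi
        off=$((desc_off + (descsz + 3) / 4 * 4))
    done
}
```

Called without arguments it reads /sys/kernel/notes; the result could
then be compared with the "actual" ID in the stap warning.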

At this point I got stuck, not knowing when and how the build-id
changes or where to extract the debug symbols from. I would also like
to clean up the change I made. So I came here with my question(s) and
rather lengthy explanations. Does anyone know what would be the best
way to deal with this? Or even where to continue looking? I would
really like to make systemtap "just work" on Gentoo with the
distribution kernels, but I have already spent a lot of time on it, so
I figured I would rather ask here, since I'm not that proficient with
the intricacies of the build system.

Thanks a lot for any pointers and have a great day,
Martin

[0] https://github.com/gentoo/gentoo/pull/25789