OK, I finally managed to work around it, and even though it is not as
nice as I would have hoped, it works.  So I sent it as a follow-up PR:

https://github.com/gentoo/gentoo/pull/26065

Martin

On Thu, Jun 16, 2022 at 05:24:26PM +0200, Martin Kletzander wrote:
>I finally figured out what is happening, but I am not sure what would be
>the best way to work around it.
>
>The problem is that with FEATURES=splitdebug the vmlinux binary is being
>processed by estrip, which uses debugedit and specifically asks it to
>recompute the build id.  However, the bzImage is created from the
>vmlinux *before* that, and thus preserves the old build-id.
>
>One option would be to create the vmlinux.debug file manually, but I am
>afraid it would duplicate a lot of the code from estrip, unless it can
>somehow be used cleanly by the ebuild.  The advantage of this would be
>that there is no need for the huge vmlinux file after that and we can
>just keep the vmlinux.debug around instead.
>
>I'll end with a couple of closing questions if I may:
>
>- Does anyone have an idea for a clean way to do this?
>
>- Is it preferable to use GitHub PRs or this ML for such eclass changes?
>
>- What exactly is the reason for portage using the `-i`/`--build-id`
>  option of debugedit?
>
>Thanks and have a nice day,
>Martin
>
>On Fri, Jun 10, 2022 at 02:22:00PM +0200, Martin Kletzander wrote:
>>Hello,
>>
>>I am trying to make systemtap work with gentoo-kernel (or ideally all
>>dist kernels) and I got a few steps closer with the kernel-build.eclass
>>modification I sent this week [0].  However, there is still one issue:
>>the build-id of the running kernel does not match that of the
>>installed vmlinux file:
>>
>># stap mba_sc.stp
>>WARNING: Build-id mismatch [man warning::buildid]:
>>"/usr/src/linux-5.17.13-gentoo-dist/vmlinux" pid 0 address
>>0xffffffff8a7b572c, expected c43e775aad5e11755bf5cf1329d2240b519e7518
>>actual 3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c
>>WARNING: /usr/bin/staprun exited with status: 1
>>Pass 5: run failed.  [man error::pass5]
>>
>>I also noticed that when kernel-build.eclass installs the vmlinux file,
>>a vmlinux.debug is also created (by portage, I presume) using objcopy
>>--only-keep-debug --compress-debug-sections.
>>
>>So now I am in a situation where I have these relevant files on the
>>system:
>>
>>- /usr/src/linux-5.17.13-gentoo-dist/vmlinux
>>- /usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug
>>  (symlink to the first file)
>>- /usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug and
>>- /boot/vmlinuz-5.17.13-gentoo-dist
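
A side note on the .build-id entry in the list above: the path follows the usual layout where the first two hex digits of the build id become a directory and the rest becomes the file name.  A minimal Python sketch of that mapping, assuming the default /usr/lib/debug root (the function name is only illustrative, not something portage or systemtap provides):

```python
from pathlib import PurePosixPath


def build_id_debug_path(build_id: str, debug_root: str = "/usr/lib/debug") -> PurePosixPath:
    """Map a hex build id to its conventional .build-id link path."""
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError(f"not a usable hex build id: {build_id!r}")
    # The first byte (two hex digits) names the directory, the rest the file.
    return PurePosixPath(debug_root) / ".build-id" / build_id[:2] / (build_id[2:] + ".debug")


# Build id of the installed vmlinux, taken from the list above.
print(build_id_debug_path("c43e775aad5e11755bf5cf1329d2240b519e7518"))
```

Feeding it the "actual" id from the stap warning instead gives a path under .build-id/3a/, which is not among the installed files listed above.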
>>
>>When I check the build ids (using readelf -n or just "file") of the
>>first three files I get:
>>
>>/usr/src/linux-5.17.13-gentoo-dist/vmlinux:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, not stripped
>>
>>/usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
>>not stripped
>>
>>/usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
>>not stripped
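
For anyone who wants to repeat this comparison without eyeballing the hashes, here is a rough Python sketch that pulls the id out of "file"-style output and compares it with the "actual" id from the stap warning.  The helper name and the regular expression are my own assumptions based only on the output format quoted above:

```python
import re
from typing import Optional

# Matches e.g. "BuildID[sha1]=c43e77..." as printed by file(1) above.
BUILD_ID_RE = re.compile(r"BuildID\[\w+\]=([0-9a-f]+)")


def build_id_from_file_output(text: str) -> Optional[str]:
    """Return the hex build id found in file(1) output, or None."""
    match = BUILD_ID_RE.search(text)
    return match.group(1) if match else None


# Output for the installed vmlinux, copied from the message above.
installed = (
    "ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, "
    "BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, not stripped"
)
# The "actual" id that stap reported for the running kernel.
running = "3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c"

if build_id_from_file_output(installed) != running:
    print("build-id mismatch between installed vmlinux and running kernel")
```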
>>
>>which looks great except:
>>
>>1) the first file does not say it is "with debug_info",
>>
>>2) there is no reason to keep the original vmlinux in place since there
>>   is a smaller file that works as a substitute, but I'm not sure what's
>>   a clean way to not install it, and most importantly
>>
>>3) the fact that the running kernel has a different build id.
>>
>>The last point is the main issue here.  I was trying to find out how to
>>check the build id of the running kernel, but have not found a way to
>>do it with a kernel API, so instead I checked
>>/boot/vmlinuz-5.17.13-gentoo-dist like this:
>>
>>~/dev/linux/scripts/extract-vmlinux /boot/vmlinuz-5.17.13-gentoo-dist >vmlinux.extracted
>>
>>and for good measure also tried what objcopy does to it:
>>
>>objcopy --only-keep-debug vmlinux.extracted vmlinux.extracted.debug
>>objcopy --only-keep-debug --compress-debug-sections vmlinux.extracted vmlinux.extracted.compressed
>>
>>Now when I check, the build id is different from that of the first
>>three files, but unchanged by objcopy and the same as the one systemtap
>>reports for the running kernel:
>>
>>vmlinux.extracted:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
>>
>>vmlinux.extracted.compressed:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
>>
>>vmlinux.extracted.debug:
>>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
>>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
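
On reading the running kernel's build id without extracting the image: as far as I know, /sys/kernel/notes exposes the raw ELF notes of the running kernel, and the GNU build-id note is normally among them.  The Python sketch below parses that note format; it assumes native byte order and 4-byte padding of the name and descriptor fields, and I have only reasoned about it rather than checked it on every architecture, so treat it as a starting point:

```python
import struct
from typing import Iterator, Optional, Tuple

NT_GNU_BUILD_ID = 3  # note type the GNU toolchain uses for build ids


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed note padding)."""
    return (n + 3) & ~3


def iter_notes(blob: bytes) -> Iterator[Tuple[bytes, int, bytes]]:
    """Yield (name, type, descriptor) for each ELF note in blob."""
    header = struct.Struct("=III")  # namesz, descsz, type in native byte order
    offset = 0
    while offset + header.size <= len(blob):
        namesz, descsz, ntype = header.unpack_from(blob, offset)
        offset += header.size
        name = blob[offset:offset + namesz].rstrip(b"\0")
        offset += _align4(namesz)
        desc = blob[offset:offset + descsz]
        offset += _align4(descsz)
        yield name, ntype, desc


def gnu_build_id(blob: bytes) -> Optional[str]:
    """Return the GNU build id in blob as a hex string, or None."""
    for name, ntype, desc in iter_notes(blob):
        if name == b"GNU" and ntype == NT_GNU_BUILD_ID:
            return desc.hex()
    return None


if __name__ == "__main__":
    try:
        with open("/sys/kernel/notes", "rb") as notes:
            print(gnu_build_id(notes.read()))
    except OSError as exc:
        print(f"could not read /sys/kernel/notes: {exc}")
```

If the assumption about /sys/kernel/notes holds, the printed value should be the same id that stap reports as "actual".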
>>
>>At this point I got stuck, not knowing when and how the build-id
>>changes and where to extract the debug symbols from.  I would also like
>>to clean up the change I made.  So I came here with my question(s) and
>>rather lengthy explanations.  Does anyone know what would be the best
>>way to deal with this?  Or even where to continue looking?  I would
>>really like to make systemtap "just work" on Gentoo with the
>>distribution kernels, but I have already spent a lot of time on it, so
>>I figured I would rather ask here since I'm not that proficient with
>>the intricacies of the build system parts.
>>
>>Thanks a lot for any pointers and have a great day,
>>Martin
>>
>>[0] https://github.com/gentoo/gentoo/pull/25789