Gentoo Archives: gentoo-dev

From: Martin Kletzander <nert.pinx@×××××.com>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Re: Systemtap with dist-kernel
Date: Fri, 24 Jun 2022 10:43:27
Message-Id: YrWVSC2pjGWbdmfO@wheatley
In Reply to: [gentoo-dev] Re: Systemtap with dist-kernel by Martin Kletzander
1 OK, I finally managed to work around it, and even though it is not as
2 nice as I would've hoped for, it works. So I sent it as a follow-up PR:
3
4 https://github.com/gentoo/gentoo/pull/26065
5
6 Martin
7
8 On Thu, Jun 16, 2022 at 05:24:26PM +0200, Martin Kletzander wrote:
9 >I finally figured out what is happening, but I am not sure what would be
10 >the best way to work around it.
11 >
12 >The problem is that with FEATURES=splitdebug the vmlinux binary is being
13 >processed by estrip, which uses debugedit and specifically asks it to
14 >recompute the build id. However, the bzImage is created from the
15 >vmlinux *before* that, and thus preserves the old build-id.
16 >
17 >One option would be to create the vmlinux.debug file manually, but I am
18 >afraid it would duplicate a lot of the code from estrip, unless it can
19 >somehow be used cleanly by the ebuild. The advantage of this would be
20 >that there is no need for the huge vmlinux file after that and we can
21 >just keep the vmlinux.debug around instead.
22 >
23 >I'll end with a couple of closing questions if I may:
24 >
25 >- Does anyone have an idea for a clean way to do this?
26 >
27 >- Is it preferable to use GitHub PRs or this ML for such eclass changes?
28 >
29 >- What exactly is the reason for portage using the `-i`/`--build-id`
30 > option of debugedit?
31 >
32 >Thanks and have a nice day,
33 >Martin
34 >
35 >On Fri, Jun 10, 2022 at 02:22:00PM +0200, Martin Kletzander wrote:
36 >>Hello,
37 >>
38 >>I am trying to make systemtap work with gentoo-kernel (or ideally all
39 >>dist kernels) and I got a few steps closer with the kernel-build.eclass
40 >>modification I sent this week [0]. However, there is still one issue, and
41 >>that is the fact that the build-id of the kernel does not match the
42 >>installed vmlinux file:
43 >>
44 >># stap mba_sc.stp
45 >>WARNING: Build-id mismatch [man warning::buildid]:
46 >>"/usr/src/linux-5.17.13-gentoo-dist/vmlinux" pid 0 address
47 >>0xffffffff8a7b572c, expected c43e775aad5e11755bf5cf1329d2240b519e7518
48 >>actual 3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c
49 >>WARNING: /usr/bin/staprun exited with status: 1
50 >>Pass 5: run failed. [man error::pass5]
51 >>
52 >>I also noticed that when kernel-build.eclass installs the vmlinux file
53 >>it also (I presume portage) creates vmlinux.debug using objcopy
54 >>--only-keep-debug --compress-debug-sections.
55 >>
56 >>So now I am in a situation where I have these relevant files on the
57 >>system:
58 >>
59 >>- /usr/src/linux-5.17.13-gentoo-dist/vmlinux
60 >>- /usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug
61 >> (symlink to the first file)
62 >>- /usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug and
63 >>- /boot/vmlinuz-5.17.13-gentoo-dist
64 >>
65 >>
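The `.build-id` path in the list above follows a simple layout: the first two hex digits of the build id become a directory and the rest becomes the file name. A minimal sketch of that mapping, assuming the `/usr/lib/debug` prefix seen in this thread (the function name `build_id_debug_path` is made up for illustration):

```shell
#!/bin/sh
# Sketch: map a build id to its expected .build-id debug path.
# Assumes the /usr/lib/debug prefix used on the system described above.
build_id_debug_path() {
    id=$1
    prefix=$(printf '%s' "$id" | cut -c1-2)   # first two hex digits
    rest=$(printf '%s' "$id" | cut -c3-)      # remaining hex digits
    printf '/usr/lib/debug/.build-id/%s/%s.debug\n' "$prefix" "$rest"
}

build_id_debug_path c43e775aad5e11755bf5cf1329d2240b519e7518
```

Feeding it the build id systemtap reports for the running kernel shows which path a matching debug file would be expected under.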
66 >>When I check the build ids (using readelf -n or just "file") of the
67 >>first three files I get:
68 >>
69 >>/usr/src/linux-5.17.13-gentoo-dist/vmlinux:
70 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
71 >>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, not stripped
72 >>
73 >>/usr/lib/debug/.build-id/c4/3e775aad5e11755bf5cf1329d2240b519e7518.debug:
74 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
75 >>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
76 >>not stripped
77 >>
78 >>/usr/lib/debug/usr/src/linux-5.17.13-gentoo-dist/vmlinux.debug:
79 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
80 >>BuildID[sha1]=c43e775aad5e11755bf5cf1329d2240b519e7518, with debug_info,
81 >>not stripped
82 >>
83 >>which looks great except:
84 >>
85 >>1) the first file does not say it is "with debug_info",
86 >>
87 >>2) there is no reason to keep the original vmlinux in place since there
88 >> is a smaller file that works as a substitute, but I'm not sure what's
89 >> a clean way to not install it, and most importantly
90 >>
91 >>3) the fact that the running kernel has a different build id.
92 >>
93 >>The last point is the main issue here. I was trying to find how to
94 >>check the build id of the running kernel, but haven't found any way
95 >>to do it with a kernel API, so instead I checked the
96 >>/boot/vmlinuz-5.17.13-gentoo-dist like this:
97 >>
98 >>~/dev/linux/scripts/extract-vmlinux /boot/vmlinuz-5.17.13-gentoo-dist >vmlinux.extracted
99 >>
100 >>and for good measure also tried what objcopy does to it:
101 >>
102 >>objcopy --only-keep-debug vmlinux.extracted vmlinux.extracted.debug
103 >>objcopy --only-keep-debug --compress-debug-sections vmlinux.extracted vmlinux.extracted.compressed
104 >>
105 >>Now when I check, the build id is different from that of the first three
106 >>files, but unchanged by objcopy and the same as what systemtap reports
107 >>for the running kernel:
108 >>
109 >>vmlinux.extracted:
110 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
111 >>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
112 >>
113 >>vmlinux.extracted.compressed:
114 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
115 >>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
116 >>
117 >>vmlinux.extracted.debug:
118 >>ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked,
119 >>BuildID[sha1]=3a757e0a2b0d777762cd4aaf9cac0c40bc8c398c, stripped
120 >>
121 >>
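The comparison above was done by eye on `file` output. A small sketch of automating it, assuming the `BuildID[sha1]=...` field format shown in this thread (the helper names `extract_build_id` and `same_build_id` are hypothetical):

```shell
#!/bin/sh
# Sketch: pull the build id out of `file`-style output read from stdin.
# Assumes the "BuildID[sha1]=<hex>" field format quoted in this thread.
extract_build_id() {
    sed -n 's/.*BuildID\[sha1]=\([0-9a-f]*\).*/\1/p'
}

# Succeeds only when both ids are non-empty and identical.
same_build_id() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example usage (paths taken from the thread, not executed here):
#   installed=$(file -b /usr/src/linux-5.17.13-gentoo-dist/vmlinux | extract_build_id)
#   booted=$(file -b vmlinux.extracted | extract_build_id)
#   same_build_id "$installed" "$booted" || echo "build-id mismatch"
```

With the values quoted above, the installed vmlinux and the image extracted from /boot would be reported as a mismatch, which is what systemtap complains about.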
122 >>At this point I got stuck, not knowing when and how the build-id
123 >>changes and where to extract the debug symbols from. I would also like
124 >>to clean up the change I did. So I came here with my question(s) and
125 >>rather lengthy explanations. Does anyone know what would be the best
126 >>way to deal with this? Or even where to continue looking? I would
127 >>really like to make systemtap "just work" on Gentoo with the
128 >>distribution kernels, but I already spent a lot of time on it, so I
129 >>figured I'd rather ask here since I'm not that proficient with the
130 >>intricacies of the build system parts.
131 >>
132 >>Thanks a lot for any pointers and have a great day,
133 >>Martin
134 >>
135 >>[0] https://github.com/gentoo/gentoo/pull/25789
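
On the question of reading the running kernel's build id without extracting the image from /boot: the kernel exposes its ELF notes as a raw blob in /sys/kernel/notes, and the GNU build-id note (name "GNU", type 3) should be among them. The following is only a sketch of parsing that blob in plain sh. It assumes little-endian note headers, as on the x86-64 system in this thread, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the GNU build-id from a raw ELF notes blob such as
# /sys/kernel/notes. Assumes little-endian headers (true on x86-64).

# hexslice HEX BYTE_OFFSET BYTE_LEN: cut BYTE_LEN bytes out of a hex string.
hexslice() {
    [ "$3" -gt 0 ] || return 0
    _s=$(( $2 * 2 + 1 ))
    printf '%s' "$1" | cut -c"$_s"-"$(( _s + $3 * 2 - 1 ))"
}

# le32 HEX BYTE_OFFSET: decode a 32-bit little-endian integer.
le32() {
    _h=$(hexslice "$1" "$2" 4)
    _b0=$(printf '%s' "$_h" | cut -c1-2)
    _b1=$(printf '%s' "$_h" | cut -c3-4)
    _b2=$(printf '%s' "$_h" | cut -c5-6)
    _b3=$(printf '%s' "$_h" | cut -c7-8)
    echo $(( 0x$_b3$_b2$_b1$_b0 ))
}

# notes_build_id FILE: walk the notes and print the first GNU build-id.
notes_build_id() {
    hex=$(od -An -v -tx1 "$1" | tr -d ' \n')
    total=$(( ${#hex} / 2 ))
    off=0
    while [ $(( off + 12 )) -le "$total" ]; do
        namesz=$(le32 "$hex" "$off")
        descsz=$(le32 "$hex" $(( off + 4 )))
        type=$(le32 "$hex" $(( off + 8 )))
        name_off=$(( off + 12 ))
        # Name and descriptor are each padded to a 4-byte boundary.
        desc_off=$(( name_off + (namesz + 3) / 4 * 4 ))
        name=$(hexslice "$hex" "$name_off" "$namesz")
        # 474e5500 is "GNU\0"; note type 3 is NT_GNU_BUILD_ID.
        if [ "$type" -eq 3 ] && [ "$name" = 474e5500 ]; then
            hexslice "$hex" "$desc_off" "$descsz"
            return 0
        fi
        off=$(( desc_off + (descsz + 3) / 4 * 4 ))
    done
    return 1
}

# Example usage on a live system (not executed here):
#   notes_build_id /sys/kernel/notes
```

If this works as intended, the printed value should be the same id systemtap reports as "actual" for the running kernel, and it can be compared against `readelf -n` output for the installed vmlinux.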