On Wed, Sep 22, 2021, 14:54 Ed W <lists@××××××××××.com> wrote:

> On 22/09/2021 20:26, Michael Jones wrote:
>
> On Wed, Sep 22, 2021 at 1:20 PM Ed W <lists@××××××××××.com> wrote:
>
>> Hi all, traffic seems to have dropped off here significantly, but here
>> goes.
>>
>> I am building a bunch of armv7a images on an AMD Ryzen9 machine (amd64).
>> So to keep things simple I have just been doing the whole thing using
>> qemu up until now, by which I mean I have an arm stage 3 somewhere, I
>> chroot into it and then, using userspace qemu binaries, I just run my
>> whole script to generate the target build from inside that chroot. This
>> works, but it's at least a 5x slowdown from native.
>>
>> To optimise this I have tried:
>>
>> - turning on the various compiler options for python (claimed to give a
>>   30% improvement) + LTO/PGO. I don't notice any difference in the
>>   chroot - presume that the emulation overhead is the dominant effect
>>
>> - tried compiling qemu with -O3 and LTO (claimed to be supported since
>>   6.0). Doesn't give any noticeable difference in performance of emerge
>>
>> - Added a statically compiled amd64 /bin/bash to the chroot - now this
>>   does give a noticeable boost to compile and emerge speeds. (random
>>   benchmark went from 26s to 22s)
>>
>> So motivated by the last item I want to try and see how many native exes
>> I can push into the chroot (since I'm running under usermode qemu! why
>> not!). The obvious one is the compiler.
>>
>> Now, I have a cross compiler built, but a) that's not static, so I would
>> need to find a way to get native libc into the chroot, and b) I'm not
>> clear how I would call it inside the chroot. Could I just move a symlink
>> to the other compiler into the path? How does it find things like
>> libgcc*.so etc?
>>
>> Or perhaps this is easier than this? Can I just use some incantation, in
>> the same way that the crosscompiler must be working, to build myself a
>> straight gcc inside the chroot which is native arch and statically
>> compiled? e.g. assuming I can build gcc static, is it enough to just do
>> this from outside the chroot and overwrite the native:
>>
>> ROOT=$PWD emerge -1v --nodeps gcc
>>
>> It seems to me that this should work at least for the gcc binaries, etc.
>> However, I'm completely ignorant of whether I want things like the
>> linker plugin in native arch or target arch. What about the libgcc*.so
>> files? (They don't actually exist in my cross compiler directories, but
>> they are linked in as dependencies in some binaries in target and exist
>> in the native compiler dir.)
>>
>> Hacker news had someone do this recently and I believe meego used to do
>> something similar, so really just trying to work out the details for
>> this on gentoo. Any thoughts?
>>
>> Thanks
>>
>> Ed W
>>
>
> It's not clear to me if you're building gentoo images, or just building
> some application.
>
> If you're building gentoo images, you might consider this project
> https://github.com/GenPi64 , we'd love to work with you on the mixed arch
> situation, since we suffer the same problem.
>
> These are whole gentoo images. :-)
>
> So it's nothing special, but something like I drop into the arm chroot,
> then there is a whole pile of something like:
>
> ROOT=/mnt/new_image emerge $stuff
>
> And at the end of all of that you have a shiny image to boot from (on an
> imx based SOM as it happens).
>
> Nice thing about this approach is that I need to build the same system for
> i386, amd64 and 32bit arm, and basically it means only running the same
> build script in each individual chroot, so it's quite nice not needing to
> fixup stuff for each platform.
>
> There are arm64bit boxes you can rent from AWS and similar, but we see a
> few build oddities on this which still need fixing and at least as near as
> I can see they are still quite a bit slower than using an intel processor
> in native mode.
>
> I'm just about to (re) try using distcc, which basically achieves the
> required end goal, so that I can measure performance. So something like run
> up a side by side chroot using crossdev, then fire up distcc in there and
> talk to it from your arm chroot. This gives less speedup than you would
> like because it needs quite a lot of work on the arm qemu side and
> serialising stuff, etc. Also linking etc is still on the arm side.
>
> I think the replacing of the bash binary with a native static binary is
> giving a decent speedup. I'm about to try swapping in pypy to see how that
> behaves.
>
> However, there is no doubt that getting the native cross compiler into the
> chroot is the solution, but there are more than a few challenges here, such
> as how to get it statically compiled and how to insert some or all of it
> into the arm chroot.
>
> See here for inspiration and I guess also the meego stuff from history:
>
> https://news.ycombinator.com/item?id=28376447
>
> Thanks for any tips!
>
> Ed W
>

The genpi64 project does use distcc for building images when configured.

Like I said, I think there'd be a big benefit to collaborating, but the
image builder is usable as is for your purpose, if I understand it
correctly. It's just missing the native binaries to speed things up.
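
As a rough aid for working out which binaries in a target chroot are already native and which still go through qemu, something like the following might help. It is only a minimal sketch, not part of the genpi64 image builder: the function name `elf_machine` is made up for illustration, and it assumes little-endian ELF files (true for amd64, i386 and armv7a). It reads the `e_machine` field from the ELF header using plain `od`, so it should run the same on the host and inside the chroot.

```shell
#!/bin/sh
# Sketch: report the CPU architecture an ELF file was built for by reading
# the e_machine field (2 bytes at offset 18 of the ELF header).
# Assumption: little-endian ELF, which holds for amd64, i386 and armv7a.
# elf_machine is a hypothetical helper name, not an existing tool.
elf_machine() {
    f=$1
    # The first four bytes of every ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$f" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 0
    fi
    # Read the two e_machine bytes as unsigned decimals: low byte, high byte.
    set -- $(od -An -tu1 -j18 -N2 "$f")
    if [ $# -ne 2 ]; then
        # Header is truncated, so treat the file as not a usable ELF.
        echo "not-elf"
        return 0
    fi
    code=$(( $1 + $2 * 256 ))
    case $code in
        3)   echo "i386" ;;
        40)  echo "arm" ;;
        62)  echo "x86_64" ;;
        183) echo "aarch64" ;;
        *)   echo "unknown($code)" ;;
    esac
}
```

Calling `elf_machine /mnt/new_image/bin/bash` (the path is just an example) would print `x86_64` for a swapped-in native binary and `arm` for one that still runs under emulation.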