On 22/09/2021 20:26, Michael Jones wrote:
>
> On Wed, Sep 22, 2021 at 1:20 PM Ed W <lists@××××××××××.com <mailto:lists@××××××××××.com>> wrote:
>
>     Hi all, traffic seems to have dropped off here significantly, but here goes.
>
>     I am building a bunch of armv7a images on an AMD Ryzen 9 machine (amd64). To keep things
>     simple I have been doing the whole thing under qemu up to now: I have an arm stage 3
>     somewhere, I chroot into it, and using user-mode qemu binaries I run my whole build script
>     from inside that chroot. This works, but it's at least a 5x slowdown compared to native.
>
>     To optimise this I have tried:
>
>     - Turning on the various compiler options for python (claimed to give a 30% improvement),
>       plus LTO/PGO. I don't notice any difference in the chroot - presumably the emulation
>       overhead is the dominant effect.
>
>     - Compiling qemu with -O3 and LTO (claimed to be supported since 6.0). This doesn't give
>       any noticeable difference in emerge performance.
>
>     - Adding a statically compiled amd64 /bin/bash to the chroot - this does give a noticeable
>       boost to compile and emerge speeds (a random benchmark went from 26s to 22s).
>
>     Motivated by the last item, I want to see how many native executables I can push into the
>     chroot (since I'm running under user-mode qemu, why not!). The obvious one is the compiler.
>
>     Now, I have a cross compiler built, but a) it's not static, so I would need to find a way
>     to get the native libc into the chroot, and b) I'm not clear how I would call it inside the
>     chroot - could I just point a symlink in the path at the other compiler? How does it find
>     things like libgcc*.so?
>
>     Or perhaps it's easier than that? Could I use the same incantation the cross compiler must
>     be using to build myself a plain gcc inside the chroot which is native arch and statically
>     compiled? E.g., assuming I can build gcc statically, is it enough to do this from outside
>     the chroot and overwrite the native one:
>
>     ROOT=$PWD emerge -1v --nodeps gcc
>
>     It seems to me that this should work, at least for the gcc binaries. However, I'm
>     completely ignorant of whether I want things like the linker plugin in the native arch or
>     the target arch. What about the libgcc*.so files? (They don't actually exist in my cross
>     compiler directories, but they are linked in as dependencies in some binaries in the
>     target, and they exist in the native compiler dir.)
>
>     Hacker News had someone do this recently, and I believe MeeGo used to do something
>     similar, so I'm really just trying to work out the details for doing this on Gentoo. Any
>     thoughts?
>
>     Thanks
>
>     Ed W
>
>
> It's not clear to me if you're building Gentoo images, or just building some application.
>
> If you're building Gentoo images, you might consider this project https://github.com/GenPi64
> <https://github.com/GenPi64> - we'd love to work with you on the mixed-arch situation, since
> we suffer the same problem.
These are whole Gentoo images. :-)

So it's nothing special: I drop into the arm chroot, and then there is a whole pile of
something like:

ROOT=/mnt/new_image emerge $stuff

At the end of all of that you have a shiny image to boot from (on an imx-based SOM, as it
happens).
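For anyone following along, the overall recipe is roughly the sketch below (all paths and the
package list are illustrative placeholders, not my actual build script, and it assumes a
statically built qemu-arm registered with binfmt_misc):

```shell
# Build an armv7a image from an amd64 host via a qemu user-mode chroot.
# Paths and packages below are placeholders.

ARM_STAGE3=/srv/arm-stage3     # unpacked armv7a stage 3
IMAGE_ROOT=/mnt/new_image      # rootfs of the image being assembled

# A static qemu-arm must be visible inside the chroot for binfmt_misc
cp /usr/bin/qemu-arm "$ARM_STAGE3/usr/bin/"

# The usual chroot plumbing
mount --rbind /dev  "$ARM_STAGE3/dev"
mount --rbind /proc "$ARM_STAGE3/proc"
mount --rbind /sys  "$ARM_STAGE3/sys"
mkdir -p "$ARM_STAGE3/mnt/new_image"
mount --bind "$IMAGE_ROOT" "$ARM_STAGE3/mnt/new_image"

# Inside, every arm ELF is transparently run through qemu-arm, and
# ROOT= makes portage install into the image instead of the chroot
chroot "$ARM_STAGE3" /bin/bash -c '
    ROOT=/mnt/new_image emerge -1v sys-apps/baselayout app-shells/bash  # ...etc.
'
```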
|
The nice thing about this approach is that I need to build the same system for i386, amd64 and
32-bit arm, and it basically means just running the same build script in each individual chroot,
so it's quite nice not needing to fix up stuff for each platform.

There are arm 64-bit boxes you can rent from AWS and similar, but we see a few build oddities on
those which still need fixing, and as near as I can see they are still quite a bit slower than an
Intel processor running in native mode.

I'm just about to (re)try using distcc, which basically achieves the required end goal, so that I
can measure performance. Something like: run up a side-by-side chroot using crossdev, then fire
up distcc in there and talk to it from your arm chroot. This gives less speedup than you would
like, because quite a lot of the work (preprocessing, serialising files, and the linking itself)
still happens on the emulated arm side.
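In case the distcc arrangement isn't clear, what I mean is roughly this (the toolchain tuple,
port and job count are made-up values):

```shell
# Cross-distcc sketch: a native amd64 environment with a crossdev
# toolchain runs distccd; the emulated arm chroot sends compiles to it.

# --- native side: build the cross toolchain and start the daemon ---
crossdev -t armv7a-unknown-linux-gnueabihf
distccd --daemon --allow 127.0.0.1 --port 3632

# --- inside the emulated arm chroot ---
export DISTC C_HOSTS="127.0.0.1:3632"
cat >> /etc/portage/make.conf <<'EOF'
FEATURES="distcc"
MAKEOPTS="-j16"
EOF
# Compiles are shipped out natively, but preprocessing, portage itself
# and the final link all still run under qemu - hence the limited win.
```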
|
I think replacing the bash binary with a native static binary is giving a decent speedup. I'm
about to try swapping in pypy to see how that behaves.
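For reference, the bash swap itself is nothing cleverer than this (the source path is a
placeholder; the one hard requirement is that the replacement is statically linked, so it needs
no amd64 libc inside the chroot):

```shell
ARM_STAGE3=/srv/arm-stage3      # placeholder chroot path

# Sanity-check that the candidate really is a static amd64 build;
# expect output along the lines of:
#   ELF 64-bit LSB executable, x86-64, ... statically linked
file /path/to/static-bash

cp "$ARM_STAGE3/bin/bash" "$ARM_STAGE3/bin/bash.arm"   # keep the original
cp /path/to/static-bash   "$ARM_STAGE3/bin/bash"

# The kernel now execs /bin/bash natively (it's an amd64 ELF); any arm
# binaries that bash spawns still go through qemu-arm as before.
```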
|
However, there is no doubt that getting the native cross compiler into the chroot is the real
solution; the main challenges are how to get it statically compiled and how to insert some or
all of it into the arm chroot.

See here for inspiration, and I guess also the MeeGo stuff from history:

https://news.ycombinator.com/item?id=28376447

Thanks for any tips!

Ed W