Gentoo Archives: gentoo-dev

From: "C Bergström" <cbergstrom@×××××××××.com>
To: "Anthony G. Basile" <gentoo-dev@l.g.o>
Subject: Re: [gentoo-dev] New project: LLVM
Date: Fri, 19 Aug 2016 18:21:05
Message-Id: CAOnawYoU74cgS6U+qmW5pgQ1AhTV4MmYzPW9A7CuPoZ0wZ35xA@mail.gmail.com
In Reply to: Re: [gentoo-dev] New project: LLVM by james
1 Sorry to be the party crasher, but...
2
3 I'd love to have optimizations for everything out there, but it takes
4 a lot of work to fine tune for something specific.
5
6 Right now I see a few variants of ARMv8
7 ------------
8 ARM reference stuff - A57 cores and the newer bits.. The scheduling
9 and stuff seems more-or-less similar enough that one tuning could
10 probably work for the vast majority of these parts.
11
12 Cavium ThunderX - It's ground up and quite different from the ARM
13 reference stuff under the hood
14
15 APM - Mustang, again ground up and different. I don't have enough
16 hands on to know how different from reference.
17
18 Broadcom - Coming Soon(tm) - Again no hands on or any data, but
19 certainly very interesting..
20
21 ... now add in every variant of ground up implementation and you have
22 50 shades of gray..
23 -------------
24 Soo.. depending on your target hardware, you may be better off with
25 gcc if the end goal is general all-around performance. (It does a
26 quite respectable job of being generic) I realize a lot of people have
27 strong feelings for or against it. I leave that to the reader to
28 decide..
29
30 Back to my own glass house.. It will take a few years, but I am trying
31 to make it easier (internally) to expose in some clear way all the
32 pieces which compose a fine tuning per-processor. If this was "just"
33 scheduling models it would be really easy, but it's not.. Those
34 latencies and other magic bits decide things like.. "should I unroll
35 this loop or do something else" and then you venture into the land of
36 accelerators where a custom regalloc may be what you really need and
37 *nothing* off the shelf fits to meet your goals.. (projects like that
38 can take 9 months and in the end only give a general 1-5% median
39 performance gain..)
40 --------------
41
42
43 On Sat, Aug 20, 2016 at 2:02 AM, james <garftd@×××××××.net> wrote:
44 > On 08/19/2016 11:15 AM, C Bergström wrote:
45 >>
46 >> On Fri, Aug 19, 2016 at 11:01 PM, Luca Barbato <lu_zero@g.o> wrote:
47 >>>
48 >>> BTW is pathscale ready to be used as system compiler as well?
49 >>
50 >>
51 >> I wish, but no. We have known issues when building grub2, glibc and
52 >> the Linux kernel at the very least. Someone* did report a long time
53 >> ago that, with their unofficial port, they were able to build and boot
54 >> the NetBSD kernel.
55 >> (*A community dev we trusted with our sources and was helping us with
56 >> portability across platforms)
57 >>
58 >> The stuff with grub2 may potentially be fixed in the "near" future...
59 >> the others are more tricky. In general if clang can do it, we have a
60 >> strong chance as well.
61 >>
62 >> As a philosophy - "we" aren't really trying to be the best generic
63 >> compiler in the world. We aim more at optimizing as much as possible for
64 >> known targets. So if by system you mean a compiler that would produce an
65 >> "OS" which only runs on a single class of hardware, then yeah it could
66 >> work at some point in the future. Specifically, on x86 we default to
67 >> host CPU optimizations. So on newer Intel hardware it's easy to get a
68 >> binary that won't run on AMD or older 64-bit Intel.
69 >>
70 >> More recently on ARMv8 - we turn on processor specific tuning. So
71 >> while it may "run", the difference between APM's Mustang and Cavium
72 >> ThunderX is pretty big, and binaries built for A but run on
73 >> B would certainly take a hit.. (this is just the tip of the iceberg)
74 >>
75 >> For general scalar OS code it isn't likely to matter... the real
76 >> impact being like 1-10% difference (being very general.. it could be
77 >> less or more in the real world..)
78 >>
79 >> For HPC codes or anything where you get loops or computationally
80 >> complex code - the gloves are off and I could see big differences... (again
81 >> being general and maybe a bit dramatic for fun)
82 >
83 >
84 >
85 > OK (actually fantastic!). Looking at the pathscale site pages and github,
86 > perhaps a cheap arm embedded board where llvm is the centerpiece of
87 > compiling a minimal system to entice gentoo-llvm testers would be possible
88 > in the near future? I have a 96boards HiKey arm64v8 that I could dedicate
89 > to gentoo+armv8-llvm testing, if that'd help. [1]
90 >
91 > Perhaps a baseline bootstrap iso (or such) version targeted at
92 > llvm-centric testers on x86-64 or armv8? Skip grub2 and use grub-legacy or
93 > lilo or (?), since there seem to be issues with llvm-grub2.
94 >
95 >
96 > [1] http://dev.gentoo.org/~tgall/
97 >
98 >
99 > No matter how you slice it, from someone who is focused on building
100 > minimized and embedded (bare metal) systems that are customized and
101 > coalesced into a heterogeneous gentoo cluster for HPC, this is wonderful
102 > news. Finally a vendor in the cluster space, with some vision and
103 > common sense, imho. Heterogeneous and open HPC is where it's at, imho. If
104 > there is a forum where the community and pathscale folks discuss issues,
105 > please point it out, as I could not find one for deeper reading....
106 >
107 >
108 > hth,
109 > James
110 >
