Gentoo Archives: gentoo-soc

From: Benda Xu <heroxbd@g.o>
To: gentoo-soc <gentoo-soc@l.g.o>
Subject: Re: [gentoo-soc] Week 4 Report for Refining ROCm Packages in Gentoo
Date: Tue, 12 Jul 2022 10:23:08
Message-Id: 8735f6635n.fsf@gentoo.org
In Reply to: [gentoo-soc] Week 4 Report for Refining ROCm Packages in Gentoo by wuyy
1 wuyy <xgreenlandforwyy@×××××.com> writes:
2
3 > Bug fixes cover rocBLAS and rocFFT. For rocBLAS, I backported a patch to
4 > sci-libs/rocBLAS-5.0.2-r1 and dev-util/Tensile-r1, to pass `-j N` from
5 > ${MAKEOPTS} to TensileCreateLibrary when building rocBLAS, which fixed [1]. As
6 > for rocFFT, I corrected its BDEPEND [2], added the missing sys-libs/libomp for
7 > omp.h [3], and let it depend on dev-util/rocm-cmake-5.0.2-r1, which does not
8 > install files to unexpected paths [4]. However, now that gcc-12.1.0 has landed,
9 > bugs about clang expanding the __noinline__ macro in
10 > g++-v12/bits/shared_ptr_base.h have emerged [5,6]. Details can be seen in [5],
11 > and I'm working on resolving this (see PR [7]).
12
13 Good. Nice progress!
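
A minimal sketch of the `-j N` idea from the quoted paragraph, assuming the job count is parsed out of a MAKEOPTS-style string. The function name jobs_from_makeopts is hypothetical and is not taken from the backported patch; in a real ebuild, makeopts_jobs from multiprocessing.eclass serves this purpose.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the patch): print the job count found in a
# MAKEOPTS-style string, or 1 if no -j/--jobs value is present.
jobs_from_makeopts() {
    local jobs=1 word expect_value=0
    # Word splitting of $1 is intentional: each option is inspected in turn.
    for word in $1; do
        if (( expect_value )); then
            expect_value=0
            if [[ ${word} =~ ^[0-9]+$ ]]; then
                jobs=${word}      # value of a preceding "-j" or "--jobs"
                continue
            fi
        fi
        case ${word} in
            -j|--jobs)     expect_value=1 ;;          # value is the next word
            -j[0-9]*)      jobs=${word#-j} ;;         # e.g. -j8
            --jobs=[0-9]*) jobs=${word#--jobs=} ;;    # e.g. --jobs=8
        esac
    done
    echo "${jobs}"
}

# The result would then be handed to the build tool, e.g. "-j $(jobs_from_makeopts "${MAKEOPTS}")".
jobs_from_makeopts "-j8 -l9"
```

As in make, the last -j option in the string wins.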
14
15 > For rocm.eclass, I finished the draft for three major functions: USE_EXPAND,
16 > src_configure and src_test. I also wrote the get_amdgpu_flags function used by
17 > src_configure. My latest work on rocm.eclass is located at
18 > https://github.com/littlewu2508/gentoo/blob/rocm-5.1.3/eclass/rocm.eclass. Below
19 > are its status and the questions I'd like to raise:
20 >
21 > 1. Default architectures. Now that I have implemented the AMDGPU_TARGETS
22 > USE_EXPAND, I need to specify the default value of each USE flag. The
23 > straightforward way is to enable all targets by default, but that can be
24 > **extremely** slow and disk-hungry when compiling ROCm libraries such as
25 > rocBLAS or rocFFT (expect several hours of compilation if the CPU is not
26 > powerful enough). Currently I define a variable OFFICIAL_AMDGPU_TARGETS, taken
27 > from the ROCm installation documents [8]. Although the range of supported
28 > hardware is much larger, and different components have their own support
29 > matrices, AMD promises to fully support these enterprise cards. Enterprise
30 > users can just emerge ROCm packages without setting any specific USE flag and
31 > have an out-of-the-box experience on Gentoo. Users with consumer-grade cards
32 > can read the wiki page (covered later in my GSoC project) for instructions on
33 > setting the correct USE flags.
34
35 Fine.
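
One way the default-on behaviour described above could be expressed is by generating the IUSE string so that only the official targets carry the "+" prefix. This is a sketch under assumptions: the function gen_amdgpu_iuse and the variable ALL_AMDGPU_TARGETS are hypothetical, and both target lists are placeholders rather than the values in the draft eclass.

```shell
#!/usr/bin/env bash
# Placeholder lists for illustration only; the draft rocm.eclass may use
# different names and different contents.
ALL_AMDGPU_TARGETS=(gfx803 gfx900 gfx906 gfx908)
OFFICIAL_AMDGPU_TARGETS=(gfx906 gfx908)

# Hypothetical helper: print an IUSE string in which every target gets an
# amdgpu_targets_* flag and the official ones are enabled by default ("+").
gen_amdgpu_iuse() {
    local target official flag out=()
    for target in "${ALL_AMDGPU_TARGETS[@]}"; do
        flag="amdgpu_targets_${target}"
        for official in "${OFFICIAL_AMDGPU_TARGETS[@]}"; do
            [[ ${target} == "${official}" ]] && flag="+${flag}"
        done
        out+=("${flag}")
    done
    echo "${out[*]}"
}

gen_amdgpu_iuse
```

With this layout, users of consumer cards only need to set AMDGPU_TARGETS in make.conf, while enterprise users get a working default.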
36
37 > 2. Whether to set -DSKIP_RPATH=true in mycmakeargs. Previously this was set to
38 > avoid embedding an rpath when USE=benchmark is enabled for ROCm packages like
39 > sci-libs/roc-* and sci-libs/hip-*. The test and benchmark executables are
40 > called "clients" (taking rocBLAS as an example, clients are programs that call
41 > its functions and link against librocblas.so). To run tests and benchmarks
42 > before the libraries are installed to the system, an rpath is set on these
43 > executables. Gentoo has no src_benchmark phase, so the benchmark binaries are
44 > simply installed and the user can run them afterwards (I actually use them in
45 > my research to tune algorithms). Installed benchmark binaries should therefore
46 > carry no rpath, which is achieved by setting -DSKIP_RPATH=true. However, the
47 > test programs then cannot execute because their rpath is removed as well, so I
48 > have to set LD_LIBRARY_PATH in src_test manually.
49
50 Go for it. This is exactly what LD_LIBRARY_PATH is designed for: testing
51 libraries installed in non-standard locations.
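
A minimal sketch of the LD_LIBRARY_PATH approach, assuming the freshly built library sits in some build directory. The helper name run_with_libdir and the example paths are hypothetical; the point is only that the variable is set for the test command alone and not for the rest of the environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with LIBDIR prepended to LD_LIBRARY_PATH.
# The assignment is scoped to that one command, so the caller's environment
# is left unchanged.
run_with_libdir() {
    local libdir=$1
    shift
    LD_LIBRARY_PATH="${libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" "$@"
}

# Example: show what a test binary would see (paths are made up).
run_with_libdir /tmp/build/library/src sh -c 'echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"'
```

In src_test this would wrap the test client invocation, so the dynamic linker finds the just-built library even though the binary has no rpath.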
52
53 > [...]
54 >
55 > 3. Detect AMDGPU in src_test. This blocks https://bugs.gentoo.org/817440, and I
56 > also raised questions in the bug report. Tinderbox cannot run tests on ROCm
57 > packages like rocBLAS, because no AMDGPU is available. I have implemented a
58 > detection mechanism, with one question left: if no GPU is available, should the
59 > test fail or exit normally?
60
61 Just follow the CI runner's advice at https://bugs.gentoo.org/817440#c4
62
63 ,----
64 | Agostino Sarubbo gentoo-dev 2021-10-11 07:16:42 UTC
65 |
66 | (In reply to Benda Xu from comment #0)
67 | > Tinderbox does not have the hardware for GPGPU. The ROCm GPGPU ebuilds
68 | > unconditionally fail.
69 |
70 | Is there a way for the ebuild to die if the hw does not meet the requisites?
71 `----
72
73 > Personally, I think the best solution is to detect AMDGPU during the
74 > pretend or setup phase,
75
76 Just die otherwise.
77
78 > turn off the test USE flag if no GPU is available or the compiled
79 > architecture does not match the detected GPU. But is it possible to change
80 > USE flags inside ebuild phase functions?
81
82 No, don't modify USE flags at ebuild runtime.
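
The "detect, and die otherwise" advice can be sketched as a check on the compute device node. This assumes that a readable and writable /dev/kfd (the amdkfd device node) is a sufficient signal. The function names are hypothetical and this is not the detection code from the draft eclass. The path is a parameter so the logic can be tried on a machine without a GPU.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the device node exists and the current
# user may read and write it. Defaults to /dev/kfd, the amdkfd node.
amdgpu_device_usable() {
    local node=${1:-/dev/kfd}
    [[ -r ${node} && -w ${node} ]]
}

# Hypothetical wrapper: report and fail when no usable device is found.
# Inside an ebuild phase the "return 1" would be a call to die instead.
require_amdgpu() {
    if ! amdgpu_device_usable "$@"; then
        echo "no usable AMDGPU device at ${1:-/dev/kfd}" >&2
        return 1
    fi
}

if require_amdgpu 2>/dev/null; then
    echo "GPU available, tests can run"
else
    echo "GPU missing, tests must not be attempted"
fi
```

Failing early keeps the USE flags untouched, in line with the reply above, and gives the CI runner a clear error instead of an unconditional test failure.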
83
84 > Despite these issues I managed to get a working version of rocm.eclass and used
85 > it on rocBLAS. The USE_EXPAND works successfully, and src_test can properly
86 > detect hardware and execute in both sandboxed vanilla Gentoo and non-sandboxed
87 > Gentoo Prefix. There are still things to work on in rocm.eclass:
88 >
89 > 1. ROCM_USEDEP, similar to PYTHON_USEDEP. For example, if hipBLAS uses
90 > architectures gfx906 and gfx1030, then its dependency, rocBLAS, must also
91 > contain gfx906 and gfx1030. 2. SRC_URI. 3. A way to automatically add
92 > PORTAGE_USERNAME to the render group, to access amdgpu and run src_test. I don't
93 > have any clue on this yet; maybe a meta package in acct-group can do this?
94
95 I have no idea.
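
For the ROCM_USEDEP item, a sketch of how such a string might be assembled, by analogy with PYTHON_USEDEP. The function name gen_rocm_usedep is hypothetical and the eventual eclass may build the value differently. The "flag(-)?" form is the standard USE-dependency syntax meaning "if the flag is enabled here, require it in the dependency too, treating it as disabled where it does not exist".

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a list of GPU targets into a comma-separated
# USE-dependency string suitable for use inside [...] after a package atom.
gen_rocm_usedep() {
    local target deps=()
    for target in "$@"; do
        deps+=("amdgpu_targets_${target}(-)?")
    done
    local IFS=,
    echo "${deps[*]}"
}

# Example from the text: hipBLAS built for gfx906 and gfx1030 requires the
# same targets in rocBLAS.
ROCM_USEDEP=$(gen_rocm_usedep gfx906 gfx1030)
echo "sci-libs/rocBLAS[${ROCM_USEDEP}]"
```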
96
97 > In the coming week I'll finish rocm.eclass as planned and send it out for early
98 > review. Meanwhile I'll continue fixing bugs [5,6,9,10], answering questions
99 > about enabling rocm in packages [11,12], and preparing to land ROCm-5.1.3. One of
100 > my friends is also plugging a Radeon VII into their arm64 server, and if all
101 > goes well I can try ROCm on arm64 (per the kernel documentation, the GPGPU driver,
102 > amdkfd, supports amd64, arm64 and ppc64) and add the ~arm64 KEYWORD in the future.
103
104 Cheers,
105 Benda