Gentoo Archives: gentoo-soc

From: wuyy <xgreenlandforwyy@×××××.com>
To: gentoo-soc <gentoo-soc@l.g.o>
Subject: [gentoo-soc] Week 4 Report for Refining ROCm Packages in Gentoo
Date: Tue, 12 Jul 2022 07:04:25
Message-Id: Ys0dCzAC4COm4hsI@HEPwuyy
Hello all,

Sorry for the late report. I was focused on a server hardware upgrade at the lab
I work at and forgot to send this report yesterday. I also apologize for not
taking the blog seriously: I hadn't written any interesting posts there, and I
thought of it as just another place to archive weekly reports unless I added new
material. Yury kindly reminded me last week that there were no posts from me
yet, so I uploaded the weekly reports and two figures of Blender rendering using
HIP Cycles. I'll make use of this platform and spend more time collecting
material for posts in the coming days.

The fourth week of working on packaging ROCm went quite smoothly. There were
some bug fixes, and also major improvements to rocm.eclass.

Bug fixes cover rocBLAS and rocFFT. For rocBLAS, I backported a patch to
sci-libs/rocBLAS-5.0.2-r1 and dev-util/Tensile-r1, to pass `-j N` from
${MAKEOPTS} to TensileCreateLibrary when building rocBLAS, which fixed [1]. As
for rocFFT, I corrected its BDEPEND [2], added the missing sys-libs/omp for
omp.h [3], and made it depend on dev-util/rocm-cmake-5.0.2-r1, which does not
install files to unexpected paths [4]. However, now that gcc-12.1.0 has landed,
bugs have emerged where clang expands the __noinline__ macro in
g++-v12/bits/shared_ptr_base.h [5,6]. Details are in [5], and I'm working on
resolving this (see PR [7]).
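To illustrate the `-j N` forwarding mentioned for rocBLAS, here is a minimal
shell sketch of pulling the job count out of a MAKEOPTS-style string. The
function name get_jobs is hypothetical and is not taken from the backported
patch; inside an ebuild, makeopts_jobs from multiprocessing.eclass would
normally serve this purpose.

```shell
#!/bin/sh
# Hypothetical helper (not from the actual patch): print the job count found
# in a MAKEOPTS-style string. Understands "-j8", "-j 8", "--jobs=8" and
# "--jobs 8"; prints 1 when no job count is given.
get_jobs() (
	# The body runs in a subshell, so nothing below leaks into the caller.
	set -f          # disable globbing while the string is word-split
	njobs=1
	set -- ${1:-}   # split the option string into positional parameters
	while [ $# -gt 0 ]; do
		case $1 in
			-j[0-9]*) njobs=${1#-j} ;;
			--jobs=[0-9]*) njobs=${1#--jobs=} ;;
			-j|--jobs)
				# The count may be the next word.
				case ${2:-} in
					[0-9]*) njobs=$2; shift ;;
				esac ;;
		esac
		shift
	done
	echo "${njobs}"
)

echo "jobs: $(get_jobs "${MAKEOPTS:-}")"
```

The printed number is what would be handed on to TensileCreateLibrary as its
`-j` value.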

For rocm.eclass, I finished the draft of three major parts: USE_EXPAND,
src_configure and src_test. I also wrote the get_amdgpu_flags function used by
src_configure. My latest work on rocm.eclass is located at
https://github.com/littlewu2508/gentoo/blob/rocm-5.1.3/eclass/rocm.eclass. Below
are its status and some questions I'd like to share:

1. Default architectures. Now that I have implemented the AMDGPU_TARGETS
USE_EXPAND, I need to specify the default value of each USE flag. The
straightforward way is to enable all targets by default, but that can be
**extremely** slow and disk-hungry when compiling ROCm libraries such as rocBLAS
or rocFFT (expect several hours of compilation if the CPU is not powerful
enough). Currently I define a variable, OFFICIAL_AMDGPU_TARGETS, taken from the
ROCm installation documents [8], whose targets serve as the defaults. Although
the range of supported hardware is much larger, and different components have
their own support matrices, AMD promises to fully support these enterprise
cards. Enterprise users can just emerge ROCm packages without setting any
specific USE flag and get an out-of-the-box experience on Gentoo. Users with
consumer-grade cards can read the wiki page (covered later in my GSoC project)
for instructions on setting the correct USE flag.

2. Whether to set -DSKIP_RPATH=true in mycmakeargs. Previously this was set to
avoid including rpath with USE=benchmark when building ROCm packages like
sci-libs/roc-* and sci-libs/hip-*. The test and benchmark executables are called
"clients" (taking rocBLAS as an example, clients are programs that use its
functions and link against librocblas.so). In order to run tests and benchmarks
before installing the libraries to the system, rpath is set on these
executables. But Gentoo does not have a src_benchmark phase, so the benchmark
binaries are simply installed, and users can run them afterwards (I actually use
them in my research to tune algorithms). So there should be no rpath in the
benchmark binaries, and this is achieved by setting -DSKIP_RPATH=true. However,
with this set the test programs cannot execute, because their rpath is
eliminated too, so I have to specify LD_LIBRARY_PATH in src_test manually.
Another solution is not to skip rpath but to run chrpath on the affected
binaries, which means maintainers have to write a dedicated src_install and
remember to apply chrpath to every new executable when bumping versions. The
third solution is to patch CMakeLists.txt to include rpath only in test
programs, but this method also introduces more maintenance work. What's your
opinion?

3. Detecting AMDGPU in src_test. This blocks https://bugs.gentoo.org/817440, and
I have also raised questions in the bug report. Tinderbox cannot run tests on
ROCm packages like rocBLAS, because there is no AMDGPU available. I implemented
a detection mechanism, with one problem left: if no GPU is available, should the
test fail or exit normally? Personally, I think the best solution is to detect
the AMDGPU during the pretend or setup phase and turn off the test USE flag if
no GPU is available or the compiled architecture does not match the detected
GPU. But is changing USE flags inside ebuild phase functions possible?
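For questions 2 and 3, the following is a rough shell sketch, not the code in
rocm.eclass. It assumes that a usable GPU shows up as a readable and writable
character device at /dev/kfd (the node exposed by amdkfd), and that the freshly
built libraries live in a single directory. The names have_amdgpu, run_client
and BUILD_LIBDIR are made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: a readable and writable character device counts as a
# usable GPU node. /dev/kfd is an assumed default, overridable via $1.
have_amdgpu() {
	[ -c "${1:-/dev/kfd}" ] && [ -r "${1:-/dev/kfd}" ] && [ -w "${1:-/dev/kfd}" ]
}

# Hypothetical wrapper: run a command with the build-tree library directory
# prepended to LD_LIBRARY_PATH, standing in for the rpath that
# -DSKIP_RPATH=true removes. Runs in a subshell so the export stays local.
run_client() (
	libdir=$1
	shift
	LD_LIBRARY_PATH="${libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
	export LD_LIBRARY_PATH
	"$@"
)

# BUILD_LIBDIR is a placeholder for wherever the just-built libraries are.
BUILD_LIBDIR=${BUILD_LIBDIR:-/tmp/build/library/src}

if have_amdgpu; then
	# A real src_test would start the test client here; this only shows the
	# environment the client would see.
	run_client "${BUILD_LIBDIR}" sh -c 'echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"'
else
	# Open question from the report: fail here, or only warn and skip?
	echo "no usable GPU node found, skipping tests"
fi
```

Whether the else branch should fail or merely warn is exactly the policy
question raised in item 3; the sketch skips only to stay runnable on machines
without a GPU.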

Despite these issues I have a working version of rocm.eclass and have used it on
rocBLAS. USE_EXPAND works successfully, and src_test can properly detect
hardware and execute in both sandboxed vanilla Gentoo and non-sandboxed Gentoo
Prefix. There are still things to work on in rocm.eclass:

1. ROCM_USEDEP, similar to PYTHON_USEDEP. For example, if hipBLAS is built for
architectures gfx906 and gfx1030, then its dependency, rocBLAS, must also be
built for gfx906 and gfx1030.
2. SRC_URI.
3. A way to automatically add PORTAGE_USERNAME to the render group, to access
the amdgpu device and perform src_test. I don't have any clue about this yet;
maybe a meta package in acct-group can do this?
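As an illustration of item 1, and of the default-target question raised
earlier, the sketch below builds two strings from a target list: an IUSE-style
list in which only "official" targets are enabled by default, and a
ROCM_USEDEP-style string modelled on the `flag(-)?` form that PYTHON_USEDEP
uses. The function names and both target lists are placeholders, not the
contents of rocm.eclass or of [8].

```shell
#!/bin/sh
# Placeholder target lists, for illustration only.
ALL_TARGETS="gfx803 gfx906 gfx1030"
OFFICIAL_TARGETS="gfx906 gfx1030"

# Print IUSE entries; targets listed in $1 get a "+" (enabled by default).
# Usage: gen_iuse "<official targets>" target...
gen_iuse() (
	official=" $1 "
	shift
	out=
	for t in "$@"; do
		case ${official} in
			*" ${t} "*) flag="+amdgpu_targets_${t}" ;;
			*) flag="amdgpu_targets_${t}" ;;
		esac
		out="${out:+${out} }${flag}"
	done
	echo "${out}"
)

# Print a USE dependency string that forwards each given target to a
# dependency, joined with commas.
gen_usedep() (
	out=
	for t in "$@"; do
		out="${out:+${out},}amdgpu_targets_${t}(-)?"
	done
	echo "${out}"
)

# Word splitting of ALL_TARGETS is intentional here.
printf 'IUSE="%s"\n' "$(gen_iuse "${OFFICIAL_TARGETS}" ${ALL_TARGETS})"
printf 'ROCM_USEDEP="%s"\n' "$(gen_usedep gfx906 gfx1030)"
```

A dependent ebuild could then write something like
sci-libs/rocBLAS[${ROCM_USEDEP}] in its RDEPEND, in the same way PYTHON_USEDEP
is used.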

In the coming week I'll finish rocm.eclass as planned and send it out for early
review. Meanwhile I'll continue fixing bugs [5,6,9,10], answering questions
about enabling ROCm in packages [11,12], and preparing to land ROCm-5.1.3. One
of my friends is also installing a Radeon VII in their arm64 server, and if
everything goes well I can try ROCm on arm64 (according to the kernel
documentation, the GPGPU driver, amdkfd, supports amd64, arm64 and ppc64) and
add the ~arm64 keyword in the future.

[1] https://bugs.gentoo.org/852236
[2] https://bugs.gentoo.org/836248
[3] https://bugs.gentoo.org/850937
[4] https://bugs.gentoo.org/836274
[5] https://bugs.gentoo.org/857126
[6] https://bugs.gentoo.org/857660
[7] https://github.com/gentoo/gentoo/pull/26311
[8] https://docs.amd.com/bundle/ROCm-Getting-Started-Guide-v5.1.3/page/Overview_of_ROCm_Installation.html
[9] https://bugs.gentoo.org/842366
[10] https://bugs.gentoo.org/836275
[11] https://github.com/gentoo/gentoo/pull/25836
[12] https://github.com/gentoo/gentoo/pull/25837

Cheers,
--
Yiyang Wu
