Gentoo Archives: gentoo-dev

From: Ulrich Mueller <ulm@g.o>
To: Yiyang Wu <xgreenlandforwyy@×××××.com>
Cc: gentoo-dev@l.g.o, Benda Xu <heroxbd@g.o>
Subject: Re: [gentoo-dev] [PATCH v2 1/2] rocm.eclass: new eclass
Date: Mon, 15 Aug 2022 22:42:03
Message-Id: u8rnpruxg@gentoo.org
In Reply to: [gentoo-dev] [PATCH v2 1/2] rocm.eclass: new eclass by Yiyang Wu
1 >>>>> On Mon, 15 Aug 2022, Yiyang Wu wrote:
2
3 > diff --git a/eclass/rocm.eclass b/eclass/rocm.eclass
4 > new file mode 100644
5 > index 000000000000..8ca2c051ddce
6 > --- /dev/null
7 > +++ b/eclass/rocm.eclass
8 > @@ -0,0 +1,275 @@
9 > +# Copyright 2022 Gentoo Authors
10 > +# Distributed under the terms of the GNU General Public License v2
11 > +
12 > +# @ECLASS: rocm.eclass
13 > +# @MAINTAINER:
14 > +# Gentoo Science Project <sci@g.o>
15 > +# @AUTHOR:
16 > +# Yiyang Wu <xgreenlandforwyy@×××××.com>
17 > +# @SUPPORTED_EAPIS: 7 8
18 > +# @BLURB: Common functions and variables for ROCm packages written in HIP
19 > +# @DESCRIPTION:
20 > +# ROCm packages such as sci-libs/<roc|hip>* can utilize functions in this eclass.
21 > +# Currently, it handles the AMDGPU_TARGETS variable via USE_EXPAND, so user can
22 > +# use USE flag to control which GPU architecture to compile, and ensure coherence
23 > +# among dependencies. It also specify CXX=hipcc, to let hipcc compile. Another
24 > +# important function is src_test, which checks whether a valid KFD device exists
25 > +# for testing, and then execute the test program.
26
27 Please wrap lines at below 80 character positions. (This applies both to
28 documentation and to code.)
29
30 > +#
31 > +# Most ROCm packages use cmake as build system, so this eclass does not export
32 > +# phase functions which overwrites the phase functions in cmake.eclass. Ebuild
33 > +# should explicitly call rocm_src_* in src_configure and src_test.
34 > +#
35 > +# @EXAMPLE:
36 > +# # Example for ROCm packages in https://github.com/ROCmSoftwarePlatform
37 > +# inherit cmake rocm
38 > +# SRC_URI="https://github.com/ROCmSoftwarePlatform/${PN}/archive/rocm-${PV}.tar.gz -> ${P}.tar.gz"
39 > +# SLOT="0/$(ver_cut 1-2)"
40 > +# IUSE="test"
41 > +# REQUIRED_USE="${ROCM_REQUIRED_USE}"
42 > +# RESTRICT="!test? ( test )"
43 > +#
44 > +# RDEPEND="
45 > +# dev-util/hip
46 > +# sci-libs/rocBLAS:${SLOT}[${ROCM_USEDEP}]
47 > +# "
48 > +#
49 > +# S=${WORKDIR}/${PN}-rocm-${PV}
50 > +#
51 > +# src_configure() {
52 > +# local mycmakeargs=(
53 > +# -DBUILD_CLIENTS_TESTS=$(usex test ON OFF)
54 > +# )
55 > +# rocm_src_configure
56 > +# }
57 > +#
58 > +# src_test() {
59 > +# rocm_src_test
60 > +# }
61 > +#
62 > +# # Example for packages depend on ROCm libraries -- a package depend on
63 > +# # rocBLAS, and use comma seperated ${HCC_AMDGPU_TARGET} to determine GPU
64 > +# # architecture to compile. Requires ROCm version >5.
65 > +# ROCM_VERSION=5
66 > +# inherit rocm
67 > +# IUSE="rocm"
68 > +# REQUIRED_USE="rocm? ( ${ROCM_REQUIRED_USE} )"
69 > +# DEPEND="rocm? ( >=dev-util/hip-${ROCM_VERSION}
70 > +# >=sci-libs/rocBLAS-${ROCM_VERSION}[${ROCM_USEDEP}] )"
71 > +# ....
72 > +# src_configure() {
73 > +# if use rocm; then
74 > +# local AMDGPU_FLAGS=$(get_amdgpu_flags)
75 > +# export HCC_AMDGPU_TARGET=${AMDGPU_FLAGS//;/,}
76 > +# fi
77 > +# default
78 > +# }
79
80 @EXAMPLE is read as a paragraph, i.e. lines will be wrapped and this is
81 how it will look when rendered as eclass manpage:
82
83 # Example for ROCm packages in https://github.com/ROCmSoftwarePlatform
84 inherit cmake rocm SRC_URI="https://github.com/ROCmSoftwarePlat‐
85 form/${PN}/archive/rocm-${PV}.tar.gz -> ${P}.tar.gz" SLOT="0/$(ver_cut
86 1-2)" IUSE="test" REQUIRED_USE="${ROCM_REQUIRED_USE}" RESTRICT="!test?
87 ( test )"
88
89 [...]
90
91 So, you may want to include the whole example in a pair of @CODE tokens.
92
93 > +
94 > +if [[ ! ${_ROCM_ECLASS} ]]; then
95 > +
96 > +case "${EAPI:-0}" in
97
98 This should be just ${EAPI} without a fallback to 0. Also, the
99 quotation marks are not necessary.
100
101 > + 7|8)
102 > + ;;
103 > + *)
104 > + die "Unsupported EAPI=${EAPI} for ${ECLASS}"
105
106 Whereas here it should be ${EAPI:-0}.
107
108 > + ;;
109 > +esac
110
111 Or even better, follow standard usage as it can be found in
112 multilib-minimal or toolchain-funcs:
113
114 case ${EAPI} in
115 7|8) ;;
116 *) die "${ECLASS}: EAPI ${EAPI:-0} not supported" ;;
117 esac
118
119 > +
120 > +inherit edo
121 > +
122 > +
123
124 No double empty lines please. (Pkgcheck should have complained about
125 this.)
126
127 > +# @ECLASS_VARIABLE: ROCM_VERSION
128 > +# @DEFAULT_UNSET
129 > +# @PRE_INHERIT
130 > +# @DESCRIPTION:
131 > +# The ROCm version of current package. Default is ${PV}, but for other packages
132 > +# that depend on ROCm libraries, this can be set to match the version of
133 > +# required ROCm libraries.
134 > +
135 > +# @ECLASS_VARIABLE: ALL_AMDGPU_TARGETS
136 > +# @INTERNAL
137 > +# @DESCRIPTION:
138 > +# The list of USE flags corresponding to all AMDGPU targets in this ROCm
139 > +# version. The value depends on ${PV}. Architectures and devices map:
140 > +# https://www.coelacanth-dream.com/posts/2019/12/30/did-rid-product-matome-p2
141 > +
142 > +# @ECLASS_VARIABLE: OFFICIAL_AMDGPU_TARGETS
143 > +# @INTERNAL
144 > +# @DESCRIPTION:
145 > +# The list of USE flags corresponding to all officlially supported AMDGPU
146
147 "officially"
148
149 > +# targets in this ROCm version, documented at
150 > +# https://docs.amd.com/bundle/ROCm-Installation-Guide-v${PV}/page/Prerequisite_Actions.html.
151 > +# USE flag of these architectures will be default on. Depends on ${PV}.
152 > +
153 > +# @ECLASS_VARIABLE: ROCM_REQUIRED_USE
154 > +# @OUTPUT_VARIABLE
155 > +# @DESCRIPTION:
156 > +# Requires at least one AMDGPU target to be compiled.
157 > +# Example use for ROCm libraries:
158 > +# REQUIRED_USE="${ROCM_REQUIRED_USE}"
159 > +# Example use for packages that depend on ROCm libraries
160 > +# IUSE="rocm"
161 > +# REQUIRED_USE="rocm? ( ${ROCM_REQUIRED_USE} )"
162 > +
163 > +# @ECLASS_VARIABLE: ROCM_USEDEP
164 > +# @OUTPUT_VARIABLE
165 > +# @DESCRIPTION:
166 > +# This is an eclass-generated USE-dependency string which can be used to
167 > +# depend on another ROCm package being built for the same AMDGPU architecture.
168 > +#
169 > +# The generated USE-flag list is compatible with packages using rocm.eclass.
170 > +#
171 > +# Example use:
172 > +# @CODE
173 > +# DEPEND="sci-libs/rocBLAS[${ROCM_USEDEP}]"
174 > +# @CODE
175 > +
176 > +# @FUNCTION: _rocm_set_globals
177 > +# @DESCRIPTION:
178 > +# Set global variables used by the eclass: ALL_AMDGPU_TARGETS,
179 > +# OFFICIAL_AMDGPU_TARGETS, ROCM_REQUIRED_USE, and ROCM_USEDEP
180 > +_rocm_set_globals() {
181 > + case ${ROCM_VERSION:-${PV}} in
182 > + 4*)
183 > + ALL_AMDGPU_TARGETS=(
184 > + gfx803 gfx900 gfx906 gfx908
185 > + gfx1010 gfx1011 gfx1012 gfx1030
186 > + )
187 > + OFFICIAL_AMDGPU_TARGETS=(
188 > + gfx906 gfx908
189 > + )
190 > + ;;
191 > + 5*)
192 > + ALL_AMDGPU_TARGETS=(
193 > + gfx803 gfx900 gfx906 gfx908 gfx90a
194 > + gfx1010 gfx1011 gfx1012 gfx1030 gfx1031
195 > + )
196 > + OFFICIAL_AMDGPU_TARGETS=(
197 > + gfx906 gfx908 gfx90a gfx1030
198 > + )
199 > + ;;
200 > + *)
201 > + die "Unknown ROCm major version! Please update rocm.eclass before bumping to new ebuilds"
202 > + ;;
203 > + esac
204 > +
205 > + ROCM_REQUIRED_USE+=" || ("
206 > + for gpu_target in ${ALL_AMDGPU_TARGETS[@]}; do
207
208 Add a pair of double quotes here.
209
210 > + if [[ " ${OFFICIAL_AMDGPU_TARGETS[*]} " =~ " ${gpu_target} " ]]; then
211
212 So OFFICIAL_AMDGPU_TARGETS is an array, but then it's flattened to a
213 string for the test? Either use "has" for the test, or don't use an
214 array in the first place.
215
216 > + IUSE+=" ${gpu_target/#/+amdgpu_targets_}"
217
218 Why "+="? I don't see any previous assignment of IUSE.
219
220 > + else
221 > + IUSE+=" ${gpu_target/#/amdgpu_targets_}"
222
223 Ditto.
224
225 > + fi
226 > + ROCM_REQUIRED_USE+=" ${gpu_target/#/amdgpu_targets_}"
227 > + done
228 > + ROCM_REQUIRED_USE+=" ) "
229 > +
230 > + local flags=( "${ALL_AMDGPU_TARGETS[@]/#/amdgpu_targets_}" )
231 > + local optflags=${flags[@]/%/(-)?}
232 > + ROCM_USEDEP=${optflags// /,}
233 > +}
234 > +_rocm_set_globals
235 > +unset -f _rocm_set_globals
236 > +
237 > +
238 > +# @FUNCTION: get_amdgpu_flags
239 > +# @USAGE: get_amdgpu_flags
240 > +# @DESCRIPTION:
241 > +# Convert specified use flag of amdgpu_targets to compilation flags.
242 > +# Append default target feature to GPU arch. See
243 > +# https://llvm.org/docs/AMDGPUUsage.html#target-features
244 > +get_amdgpu_flags() {
245 > + local AMDGPU_TARGET_FLAGS
246 > + for gpu_target in ${ALL_AMDGPU_TARGETS[@]}; do
247
248 Double quotes please.
249
250 > + local target_feature=
251 > + if use amdgpu_targets_${gpu_target}; then
252 > + case ${gpu_target} in
253 > + gfx906|gfx908)
254 > + target_feature=:xnack-
255 > + ;;
256 > + gfx90a)
257 > + target_feature=:xnack+
258 > + ;;
259 > + *)
260 > + ;;
261 > + esac
262 > + AMDGPU_TARGET_FLAGS+="${gpu_target}${target_feature};"
263 > + fi
264 > + done
265 > + echo ${AMDGPU_TARGET_FLAGS}
266 > +}
267 > +
268 > +# @FUNCTION: check_rw_permission
269 > +# @USAGE: check_rw_permission <file>
270 > +# @DESCRIPTION:
271 > +# check read and write permissions on specific files.
272 > +# allow using wildcard, for example check_rw_permission /dev/dri/render*
273 > +check_rw_permission() {
274 > + [[ -r "$1" ]] && [[ -w "$1" ]] || die \
275
276 Quotation marks aren't necessary here.
277
278 > + "${PORTAGE_USERNAME} do not have read or write permissions on $1! \n Make sure ${PORTAGE_USERNAME} is in render group and check the permissions."
279
280 Please don't use internal Portage variables.
281
282 > +}
283 > +
284 > +# == phase functions ==
285 > +
286 > +# @FUNCTION: rocm_src_configure
287 > +# @DESCRIPTION:
288 > +# configure rocm packages, and setting common cmake arguments
289 > +rocm_src_configure() {
290 > + # allow acces to hardware
291 > + addpredict /dev/kfd
292 > + addpredict /dev/dri/
293 > +
294 > + mycmakeargs+=(
295 > + -DCMAKE_INSTALL_PREFIX="${EPREFIX}/usr"
296 > + -DAMDGPU_TARGETS="$(get_amdgpu_flags)"
297 > + -DCMAKE_SKIP_RPATH=TRUE
298 > + )
299 > +
300 > + CXX="hipcc" cmake_src_configure
301 > +}
302 > +
303 > +# @FUNCTION: rocm_src_test
304 > +# @DESCRIPTION:
305 > +# Test whether valid GPU device is present. If so, find how to, and execute test.
306 > +# ROCm packages can have to test mechanism:
307 > +# 1. cmake_src_test. Set MAKEOPTS="-j1" to make sure only one test on GPU at a time;
308 > +# 2. one single gtest binary called "${PN,,}"-test;
309 > +# 3. Some package like rocFFT have alternative test like rocfft-selftest;
310 > +# 4. Custome testing binaries like dev-libs/rccl. Use ${ROCM_TESTS} to specify.
311 > +rocm_src_test() {
312 > + addwrite /dev/kfd
313 > + addwrite /dev/dri/
314 > +
315 > + # check permissions on /dev/kfd and /dev/dri/render*
316 > + check_rw_permission /dev/kfd
317 > + check_rw_permission /dev/dri/render*
318 > +
319 > + : ${LD_LIBRARY_PATH:="${BUILD_DIR}/clients:${BUILD_DIR}/src:${BUILD_DIR}/library:${BUILD_DIR}/library/src:${BUILD_DIR}/library/src/device"}
320 > + export LD_LIBRARY_PATH
321 > + if grep -q 'build test:' "${BUILD_DIR}"/build.ninja; then
322 > + MAKEOPTS="-j1" cmake_src_test
323 > + elif [[ -d "${BUILD_DIR}"/clients/staging ]]; then
324
325 Quotation marks aren't necessary here.
326
327 > + cd "${BUILD_DIR}/clients/staging" || die "Test directory not found!"
328 > + for test_program in "${PN,,}-"*test; do
329 > + if [[ -x ${test_program} ]]; then
330 > + edob ./${test_program}
331 > + else
332 > + die "The test program ${test_program} does not exist or cannot be excuted!"
333 > + fi
334 > + done
335 > + elif [[ ! -z "${ROCM_TESTS}" ]]; then
336
337 Again, no quotation marks.
338
339 Also, why the double negation? IMHO "-n" would be better readable.
340
341 > + for test_program in "${ROCM_TESTS}"; do
342
343 What is this supposed to do? The loop will be executed exactly once,
344 whatever ROCM_TESTS contains.
345
346 > + cd "${BUILD_DIR}" || die
347 > + if [[ -x ${test_program} ]]; then
348 > + edob ./${test_program}
349 > + else
350 > + die "The test program ${test_program} does not exist or cannot be excuted!"
351 > + fi
352 > + done
353 > + else
354 > + die "There is no cmake tests, no \${ROCM_TESTS} executable provided, nor ${BUILD_DIR}/clients/staging where test program might be located."
355 > + fi
356 > +}
357 > +
358 > +_ROCM_ECLASS=1
359 > +fi

Attachments

File name MIME type
signature.asc application/pgp-signature

Replies

Subject Author
Re: [gentoo-dev] [PATCH v2 1/2] rocm.eclass: new eclass Ulrich Mueller <ulm@g.o>
Re: [gentoo-dev] [PATCH v2 1/2] rocm.eclass: new eclass wuyy <xgreenlandforwyy@×××××.com>