From: Mo Zhou <lumin@××××××.org>
To: gentoo-dev@l.g.o
Cc: "Michał Górny" <mgorny@g.o>
Subject: Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)
Date: Mon, 17 Jun 2019 13:33:15
Message-Id: b5d5723284ff8e0f0e3dfb9e652a92fd@debian.org
In Reply to: Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed) by "Michał Górny"
Hi Michał,

Sorry for the late reply. I just ran into a severe hardware failure.

On 2019-06-13 07:49, Michał Górny wrote:
>>
>> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
>> are based on exactly the same source tarball, and maintaining 4 ebuild
>> files for a single tarball is not a good choice IMHO. Those old ebuild
>> files seem to leverage the flexibility of the upstream build system
>> because it enables one to, for example, skip the reference blas build
>> and use an existing optimized BLAS implementation and hence introduce
>> flexibility. That flexibility is hard to maintain and is not necessary
>> anymore with the new runtime switching mechanism.
>>
>> That's why I propose to merge the 4 ebuilds into a single one:
>> sci-libs/lapack. We don't need to add the "reference" postfix
>> because no upstream will take the name "lapack". When talking
>> about "lapack" it's always the reference implementation.
>
> What's the real gain here, and how does it compare to loss of
> flexibility of being able to build only what the package in question
> needs?

First let's see what these 4 components are:
  1. blas: written in Fortran, provides fundamental linear algebra
     routines. libblas.so can work alone.
  2. cblas: a thin C wrapper around the Fortran blas. That means
     libcblas.so calls libblas.so for the real calculation.
  3. lapack: written in Fortran, frequently calls BLAS for
     implementing higher level linear algebra routines.
     liblapack.so needs libblas.so (Fortran).
  4. lapacke: a thin C wrapper around the Fortran lapack.
     liblapacke.so needs liblapack.so.
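
To make the dependency chain above concrete, here is a minimal
sketch that models the four reference libraries as a small graph
and computes what each one pulls in. The dictionary and the helper
name `closure` are only illustrative, not part of any ebuild or
tool.

```python
# Illustrative model of the dependencies listed above: each reference
# library maps to the libraries it needs directly.
DEPENDS_ON = {
    "libblas.so": [],
    "libcblas.so": ["libblas.so"],
    "liblapack.so": ["libblas.so"],
    "liblapacke.so": ["liblapack.so"],
}


def closure(lib, graph=DEPENDS_ON):
    """Return every library that `lib` needs, directly or indirectly."""
    seen = []

    def visit(name):
        for dep in graph[name]:
            if dep not in seen:
                seen.append(dep)
                visit(dep)

    visit(lib)
    return seen


if __name__ == "__main__":
    for lib in DEPENDS_ON:
        print(lib, "needs", closure(lib))
```

Every chain ends in libblas.so, which is why the choice of BLAS
provider affects all four components.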

The real gain from merging 4 ebuilds into 1 ebuild:
  1. Easier to maintain: updating 4 ebuilds on every single
     version bump is much harder than updating only 1.
     This will also make it easier to provide and maintain
     the virtual-* features in the long run.
  2. It avoids confusing or even potentially problematic
     setups, e.g.: a user happened to compile OpenBLAS as
     the libblas provider, and BLIS as the libcblas provider:

       appA -> libblas (OpenBLAS)
       appB -> libcblas (BLIS)
       appC -> liblapacke (Ref) -> liblapack (Ref) -> libblas (OpenBLAS)
            -> libcblas.so (BLIS)

     The user will be confused about which BLAS is really
     doing the calculation. Plus, mixing threading models
     may sometimes cause poor performance
     (e.g. openmp + pthread) or even silent corruption
     (e.g. GNU openmp + Intel openmp).

     Merging cblas into blas, and lapacke into lapack,
     will make it harder to get things wrong.
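
The mixed setup above can be checked mechanically. The following is
a rough sketch, assuming a hand-written table of which provider
serves each interface library (the tables and the function name
`blas_backends` are hypothetical); it reports which implementations
end up doing BLAS-level work for each application.

```python
# Hypothetical tables mirroring the example above.
APP_LINKS = {
    "appA": ["libblas"],
    "appB": ["libcblas"],
    "appC": ["liblapacke", "libcblas"],
}
# What each interface library calls into (as drawn in the example).
INTERFACE_DEPS = {
    "libblas": [],
    "libcblas": [],
    "liblapack": ["libblas"],
    "liblapacke": ["liblapack"],
}
# Libraries whose provider actually performs BLAS computation.
BLAS_LEVEL = {"libblas", "libcblas"}


def blas_backends(app, provider):
    """Return the set of implementations doing BLAS work for `app`."""
    found = set()
    stack = list(APP_LINKS[app])
    while stack:
        lib = stack.pop()
        if lib in BLAS_LEVEL:
            found.add(provider[lib])
        stack.extend(INTERFACE_DEPS[lib])
    return found


# Separate providers for libblas and libcblas, as in the example.
SPLIT = {"libblas": "OpenBLAS", "libcblas": "BLIS"}
# After merging, libcblas always follows the libblas selection.
MERGED = {"libblas": "OpenBLAS", "libcblas": "OpenBLAS"}

if __name__ == "__main__":
    for app in APP_LINKS:
        split = blas_backends(app, SPLIT)
        merged = blas_backends(app, MERGED)
        print(app, "split:", sorted(split), "merged:", sorted(merged))
```

With the split table appC ends up with two different BLAS backends
in one process; with the merged table it can only ever have one.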

IMHO the flexibility mentioned above is not really necessary. Any
scientific computing user who needs performance and dislikes
the virtual-* solution could directly link their programs
against MKL or OpenBLAS without thinking about the reference
blas, because both MKL and OpenBLAS provide the full set
of blas, cblas, lapack and lapacke APIs and ABIs via a single
shared object. Plus, that flexibility could be replaced by the
proposed runtime switching solution: by changing
the blas (cblas) selection, liblapack.so can be dynamically
linked against different optimized implementations.
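
As an illustration only: one common way to get this kind of runtime
switching is to repoint a symlink that carries the generic library
name. The sketch below does that inside a temporary directory. The
mechanism, the file names and the helper names are assumptions made
for the example; this mail does not describe how the proposed
solution implements the switch.

```python
import os
import tempfile


def select_provider(libdir, soname, provider_file):
    """Atomically point libdir/soname at provider_file (a symlink swap)."""
    tmp = os.path.join(libdir, soname + ".tmp")
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(provider_file, tmp)
    # rename(2) replaces the old symlink in a single step.
    os.replace(tmp, os.path.join(libdir, soname))


def current_provider(libdir, soname):
    """Return the file that libdir/soname currently points to."""
    return os.readlink(os.path.join(libdir, soname))


def demo():
    """Switch a fake libblas.so.3 between two placeholder providers."""
    history = []
    with tempfile.TemporaryDirectory() as libdir:
        # Empty placeholder files standing in for real shared objects.
        for name in ("libopenblas.so.0", "libblis.so.3"):
            open(os.path.join(libdir, name), "w").close()
        for name in ("libopenblas.so.0", "libblis.so.3"):
            select_provider(libdir, "libblas.so.3", name)
            history.append(current_provider(libdir, "libblas.so.3"))
    return history


if __name__ == "__main__":
    print(demo())
```

Anything that loads libblas.so.3 by name picks up whichever
provider is selected at the time, without being rebuilt.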

Discarding this flexibility will only affect users who
insist on linking an unoptimized lapack against a specific
blas implementation. And one may also run into trouble
with such flexibility, e.g.:

  libcblas (Reference) -> libblas.so (reference)
  liblapack (Reference) -> libopenblas.so

  appC -> (liblapacke, libcblas)
   --> liblapacke -> liblapack -> libopenblas
   --> libcblas (reference)

libopenblas exports a superset of the symbols in libcblas,
so the same symbols are defined twice in one process, which
invites confusion and ambiguous symbol resolution
at run time.
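
A simplified model of what goes wrong in that example: with default
ELF binding, a symbol is resolved to the first loaded library that
defines it, so the library search order, not the link line of
liblapack, decides which BLAS runs. The export sets below are a
small illustrative subset and the helper names are made up for this
sketch; symbol versioning and similar features are ignored.

```python
# Illustrative subset of exported symbols per library.
EXPORTS = {
    "liblapacke (reference)": {"LAPACKE_dgesv"},
    "libcblas (reference)": {"cblas_dgemm"},
    "liblapack (reference)": {"dgesv_"},
    "libblas (reference)": {"dgemm_"},
    "libopenblas": {"cblas_dgemm", "dgemm_", "dgesv_", "LAPACKE_dgesv"},
}

# Breadth-first search order for appC in the example above:
# direct dependencies first, then their dependencies.
LOAD_ORDER = [
    "liblapacke (reference)",
    "libcblas (reference)",
    "liblapack (reference)",
    "libblas (reference)",
    "libopenblas",
]


def resolve(symbol, load_order=LOAD_ORDER):
    """Return the first library in load order that defines `symbol`."""
    for lib in load_order:
        if symbol in EXPORTS[lib]:
            return lib
    return None


def duplicated(load_order=LOAD_ORDER):
    """Map each symbol defined more than once to its defining libraries."""
    owners = {}
    for lib in load_order:
        for symbol in EXPORTS[lib]:
            owners.setdefault(symbol, []).append(lib)
    return {s: libs for s, libs in owners.items() if len(libs) > 1}


if __name__ == "__main__":
    for symbol in sorted(duplicated()):
        print(symbol, "->", resolve(symbol), "of", duplicated()[symbol])
```

In this model liblapack's call to dgemm_ lands in the reference
libblas even though liblapack was linked against libopenblas, and a
different load order would silently give a different answer.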

With the proposed (redesigned) solution, these potentially
bad cases could be avoided because the solution tries to keep
the backend consistent. Some people found the BLAS/LAPACK
flexibility enough of a headache that they created FlexiBLAS.

In short, the (4->1) change can reduce the maintenance cost
for (blas,cblas,lapack,lapacke) and make the virtual-* feature
easier to implement and maintain in the long run. Additionally,
the flexibility mentioned before is not really necessary when
the virtual-* feature is fully implemented.

Best,
Mo.