Gentoo Archives: gentoo-soc

From: Donnie Berkholz <dberkholz@g.o>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] Benchmarking suite - Report 7
Date: Fri, 15 Jul 2011 18:48:39
Message-Id: 20110715184828.GI2828@comet.mayo.edu
In Reply to: Re: [gentoo-soc] Benchmarking suite - Report 7 by Andrea Arteaga
1 On 18:39 Fri 15 Jul , Andrea Arteaga wrote:
2 > > How about FFTs rather than DFTs?
3 >
4 > "A fast Fourier transform (FFT) is an efficient algorithm to compute
5 > the discrete Fourier transform (DFT) and its inverse" [0]
6 > The tests compute the DFT by using FFTW, which of course implements
7 > the FFT algorithms.
8 > Or did you mean something else?
9
10 Ah, I'm used to DFT being used to specify the exact computation, and
11 FFT for the fast "cheating" version. Just my misunderstanding.
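
To make the terminology concrete: the DFT is the transform itself, and an FFT is any algorithm that computes the same values in fewer operations. The sketch below is only an illustration in pure Python and is not how FFTW or the benchmarking suite works internally; the function names `naive_dft` and `fft` are made up for this example. It evaluates the DFT definition directly in O(n^2) time and then with a textbook recursive radix-2 Cooley-Tukey FFT, which assumes the input length is a power of two.

```python
import cmath


def naive_dft(x):
    """Evaluate the DFT definition directly: O(n^2) operations."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT: O(n log n) operations.

    Assumption of this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8]
    exact = naive_dft(signal)
    fast = fft(signal)
    # Both routines compute the same transform, up to rounding error.
    print(max(abs(a - b) for a, b in zip(exact, fast)))
```

Both functions return the same numbers up to floating-point rounding, so the FFT is not an approximation of the DFT, only a faster way to obtain it.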
12
13 > > Is there any intent that it be able to automatically select the best
14 > > parameters in addition to just graphing the results? From looking at
15 > > some of those graphs, it seems like if you applied some smoothing, you
16 > > could pick the peaks out fairly easily.
17 >
18 > It would actually be quite easy to find the best implementation for a
19 > specific operation and matrix size. But...
20 > Have a look at my BLAS results [1]: you can see that eigen is very
21 > slow for some operations like axpy, while it is very fast with matrix
22 > multiplications or system solution; the gotoblas and openblas
23 > implementations are usually the slowest with small matrices, but are
24 > really fast with big matrices; atlas has erratic performance (see
25 > matrix-matrix multiply), making it sometimes quite fast, sometimes
26 > very slow.
27 >
28 > In short: the user would have to supply quite a few parameters for the
29 > best-suited implementation to be chosen automatically. Big
30 > matrices or tiny ones? Just Level 2 and 3 operations or Level 1, too?
31 > Consistency or peak performance? Constructing such an automatic "filter"
32 > would require a lot of work, and it would probably be difficult to make
33 > it as good as the human eye... I could maybe create a kind of scoring
34 > system and give the user a hint about the best-performing
35 > implementation, but taking a 2-minute look at the graphical results
36 > would always be the best thing to do. The only drawback of a manual
37 > decision is that only a few (6-7) implementations can be shown at the
38 > same time in order to keep the plots readable.
39
40 You're right, micro-benchmarks and reality don't always meet. It would
41 be pretty useful if a user could specify their case of interest, perhaps
42 in the form of a script.
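
As a rough idea of what such a user-supplied script could look like, here is a minimal sketch. Everything in it is hypothetical: the function `rank_implementations`, the data layout, the implementation names `impl_a` and `impl_b`, and the numbers are invented for illustration and are not results or interfaces from the benchmarking suite. The user describes the case of interest as weights over (operation, size) pairs, and each implementation is scored by its weighted performance relative to the best implementation for each case.

```python
def rank_implementations(results, weights):
    """Rank implementations for a user-specified workload.

    results: {implementation: {(operation, size): performance}}
             where higher performance is better (e.g. MFlops).
    weights: {(operation, size): weight} describing the case of interest.
    Returns a list of (score, implementation), best first; a score of 1.0
    means the implementation is the fastest in every weighted case.
    """
    total = sum(weights.values())
    scores = {impl: 0.0 for impl in results}
    for case, weight in weights.items():
        # Normalise by the best result so different operations are comparable.
        best = max(measured[case] for measured in results.values())
        for impl, measured in results.items():
            scores[impl] += weight * measured[case] / best
    return sorted(((s / total, impl) for impl, s in scores.items()),
                  reverse=True)


# Invented numbers, only to show the shape of the input.
example_results = {
    "impl_a": {("axpy", 100): 200.0, ("gemm", 1000): 9000.0},
    "impl_b": {("axpy", 100): 800.0, ("gemm", 1000): 4500.0},
}

# A user who only cares about large matrix-matrix products ...
only_big_gemm = {("gemm", 1000): 1.0}
# ... versus one who weights small Level 1 work equally.
mixed_workload = {("axpy", 100): 1.0, ("gemm", 1000): 1.0}

if __name__ == "__main__":
    print(rank_implementations(example_results, only_big_gemm))
    print(rank_implementations(example_results, mixed_workload))
```

With these made-up inputs the two workloads pick different winners, which is the point made above: the answer depends on the case the user cares about, so a score like this is at most a hint alongside the plots.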
43
44 --
45 Thanks,
46 Donnie
47
48 Donnie Berkholz
49 Admin, Summer of Code
50 Gentoo Linux and X.Org
51 Blog: http://dberkholz.com