On 18:39 Fri 15 Jul , Andrea Arteaga wrote:
> > How about FFTs rather than DFTs?
> 
> "A fast Fourier transform (FFT) is an efficient algorithm to compute
> the discrete Fourier transform (DFT) and its inverse" [0]
> The tests compute the DFT by using FFTW, which of course implements
> the FFT algorithms.
> Or did you mean something else?

Ah, I'm used to DFT being used to specify the exact computation, and
FFT for the fast "cheating" version. Just my misunderstanding.

> > Is there any intent that it be able to automatically select the best
> > parameters in addition to just graphing the results? From looking at
> > some of those graphs, it seems like if you applied some smoothing, you
> > could pick the peaks out fairly easily.
> 
> It would actually be quite easy to find the best implementation for a
> specific operation and matrix size. But...
> Have a look at my BLAS results [1]: you can notice that eigen is very
> slow for some operations like axpy, while it is very fast with matrix
> multiplications or system solution; the gotoblas and openblas
> implementations are usually the slowest with small matrices, but are
> really fast with big matrices; atlas has swinging performance (see
> matrix-matrix multiply), making it sometimes quite fast, sometimes
> very slow.
> 
> In short: there are quite a few parameters that the user would have to
> provide in order to decide automatically on the best-suited
> implementation. Big matrices or tiny ones? Just Level 2 and 3
> operations or Level 1, too? Consistency or peak performance?
> Constructing such an automatic "filter" would require a lot of work,
> and it would probably be difficult to make it as good as the human eye
> is... I could maybe create some kind of scoring system and give the
> user a hint of the best-performing implementation, but having a
> two-minute look at the graphical results would always be the best
> thing to do. The only drawback of a manual decision is that only a few
> (6-7) implementations can be shown at the same time in order to keep
> the plots readable.

You're right, micro-benchmarks and reality don't always meet. It would
be pretty useful if a user could specify their case of interest, perhaps
in the form of a script.
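
For what it's worth, a rough sketch of what such a selection script
might look like (everything here is hypothetical: the data is made up,
a plain moving average stands in for the smoothing, and the score is
just the mean smoothed performance over the matrix sizes the user says
they care about):

```python
# Hypothetical sketch: pick the "best" implementation for a user's case
# from benchmark curves (performance vs. matrix size).

def smooth(values, window=3):
    """Simple moving average to damp benchmark noise."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window // 2)
        hi = min(len(values), i + window // 2 + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def score(sizes, perf, size_range):
    """Mean smoothed performance over the sizes the user cares about."""
    smoothed = smooth(perf)
    picked = [p for s, p in zip(sizes, smoothed)
              if size_range[0] <= s <= size_range[1]]
    return sum(picked) / len(picked) if picked else 0.0

def best_implementation(results, size_range):
    """results: {name: (sizes, performance)}; returns the top scorer."""
    return max(results, key=lambda name: score(*results[name], size_range))

# Fabricated example data: performance at matrix sizes 10..1000.
sizes = [10, 100, 500, 1000]
results = {
    "eigen":    (sizes, [900, 1200, 1100, 1000]),
    "openblas": (sizes, [300, 1000, 2500, 3000]),
}
print(best_implementation(results, (500, 1000)))  # a big-matrix user
```

A real version would of course need real measurements and a way to
weight consistency against peak performance, but the idea is the same.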

--
Thanks,
Donnie

Donnie Berkholz
Admin, Summer of Code
Gentoo Linux and X.Org
Blog: http://dberkholz.com