A very short log of what has been done over the last few days. I am
working on the PBLAS and ScaLAPACK benchmarks, which is a very
challenging topic, because such applications are very difficult to
debug.

* I changed some parts of the BTL framework to adapt it to the
distributed-memory benchmarks. This required writing two new
perfanalyzers: one for the root process and one for the other (node)
processes. The nodes do not perform any measurement, while the root
process broadcasts the needed information, measures the time and
manages the output (both std{out,err} and the resulting file).

* I added a BLACS library that provides a useful interface for
scattering and gathering matrices and vectors. I also added a PBLAS
library that inherits from the BLACS one and will support the most
common operations (at the moment just the parallel matrix-vector
multiplication).

* I added an action for the parallel matrix-vector multiplication
which makes use of the two interfaces described above.
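
To make the scatter / compute / gather idea above concrete, here is a
minimal single-process sketch in Python. It is only an illustration
and not the BTL code: it makes no MPI, BLACS or PBLAS calls, all
function names are hypothetical, and it assumes a simplified 1D
block-cyclic distribution of rows with the vector replicated on every
process (ScaLAPACK itself distributes data over a 2D process grid).

```python
# Single-process simulation: no MPI, BLACS or PBLAS calls are made.
# All names here are hypothetical and not part of the BTL code.
# Assumption: 1D block-cyclic row layout, vector x known to everyone.


def owner(global_row, block_size, nprocs):
    """Process that owns a global row in a 1D block-cyclic layout."""
    return (global_row // block_size) % nprocs


def scatter_rows(matrix, block_size, nprocs):
    """Split the rows among the processes, keeping their global indices."""
    local = [[] for _ in range(nprocs)]
    for i, row in enumerate(matrix):
        local[owner(i, block_size, nprocs)].append((i, row))
    return local


def local_matvec(local_rows, x):
    """Each process multiplies only the rows it owns by the full vector."""
    return [(i, sum(a * b for a, b in zip(row, x))) for i, row in local_rows]


def gather(pieces, n):
    """Reassemble the global result from the per-process pieces."""
    y = [0] * n
    for piece in pieces:
        for i, value in piece:
            y[i] = value
    return y


def parallel_matvec(matrix, x, block_size=2, nprocs=3):
    """Scatter the rows, compute locally, gather the result."""
    distributed = scatter_rows(matrix, block_size, nprocs)
    pieces = [local_matvec(rows, x) for rows in distributed]
    return gather(pieces, len(matrix))


if __name__ == "__main__":
    A = [[i + j for j in range(4)] for i in range(7)]
    x = [1, 2, 3, 4]
    print(parallel_matvec(A, x))
```

In a real distributed run each entry of the per-process list would
live on a different process, and only the root would time the
computation and write the output.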

The matrix-vector multiplication is a case study for now. If
everything goes fine (and so far it seems so), more actions will be
provided for both PBLAS and ScaLAPACK, which share the same concepts.
I also plan to have a working (but incomplete) Python module for
PBLAS by tomorrow.

Milestones for the next week:
* Have working PBLAS and ScaLAPACK modules
* Run some benchmarks using these modules and publish the results
* Start the implementation of the advanced FFTW benchmarks, as
previously described

Best regards
Andrea Arteaga