Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] piledriver/trinity cpu/apu hardware tester needed, bug 445053
Date: Sat, 01 Dec 2012 00:04:05
Message-Id: pan.2012.11.30.23.20.25@cox.net
1 Bug #445053 deals with the new USE=fma flag in sci-libs/fftw-3.3.3
2 (~amd64). This flag enables upstream's new-for-that-version fma
3 instruction set optimizations, but the problem is that there are two
4 different fma instruction sets, fma3 and fma4. The Wikipedia article
5 explains the difference, history, etc., in some detail.
6
7 Bug URL: https://bugs.gentoo.org/show_bug.cgi?id=445053
8
9 fma on wikipedia: http://en.wikipedia.org/wiki/FMA_instruction_set
10
11
12 When I went to do my update, I saw the new USE flag. Having an AMD
13 bdver1 (bulldozer) with fma4, and seeing that the USE flag is just fma
14 (no number appended), I was confused, started looking into things, and
15 then filed that bug.
16
17 I've now actually tested USE=fma on my bdver1 (fma4) hardware with both
18 the ebuild's "small" tests and a manually run "make bigtest" in all three
19 subdirs (single/double/long-double) created as part of the build process.
20 All tests passed, so USE=fma seems to work reliably on fma4 hardware.
21 What we do NOT yet know for sure is whether it works reliably on fma3
22 hardware, so we now need someone with fma3 hardware to check there as
23 well.
24
25 According to the Wikipedia article, Intel will support fma3 in hardware
26 to be released in 2013, so AFAIK there's no released Intel hardware with
27 fma support at all yet. Still, anyone with a current (definitely this
28 year's) Intel CPU/APU is welcome to check /proc/cpuinfo and see, and
29 run the tests if they have it.
30
31 The newest AMD hardware, however, should already have fma support, but
32 it could be fma3 or fma4 depending on the CPU.
33
34 Bulldozer (-march=bdver1 in gcc) chips, released in late 2011, should
35 have fma4 listed in /proc/cpuinfo, as I do here. That's what I tested
36 with USE=fma here, with all tests I ran passing.
37
38 The new piledriver CPUs and trinity APUs, however (I believe
39 -march=bdver2, but am not positive on that), are supposed to support
40 fma3. I'd guess /proc/cpuinfo should report either fma3 or simply fma
41 for them. That's what still needs testing.
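
A minimal sketch of the /proc/cpuinfo check, in Python. It assumes a
Linux-style "flags" line, and the function name fma_flags is hypothetical.
As far as I know, Linux lists fma3 support as plain "fma" and fma4 support
as "fma4", but the sketch also accepts a literal "fma3" in case it is
reported that way.

```python
import os

# Flag names to look for. "fma3" is included only as a guess; "fma" and
# "fma4" are the names I'd expect on Linux.
FMA_NAMES = ("fma", "fma3", "fma4")


def fma_flags(cpuinfo_text):
    """Return the set of fma-related flags on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return {flag for flag in value.split() if flag in FMA_NAMES}
    return set()


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as fh:
        found = fma_flags(fh.read())
    print(" ".join(sorted(found)) if found else "no fma flags listed")
```

Only the first "flags" line is read, on the assumption that all cores in
the machine report the same feature set.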
42
43 So, anyone with that hardware, could you at least set USE=fma and run
44 ebuild ... test on sci-libs/fftw-3.3.3, then report the results in the
45 bug? Based on my results, the whole build and test (the ebuild runs make
46 smalltest for all three subdirs) should only run perhaps five minutes or
47 so (it was about three here, including the configure and build, tho my
48 PORTAGE_TMPDIR is on tmpfs, so it might take a bit longer for those with
49 it on a spinning hard drive).
50
51 Ideally, once the ebuild test passes, you'd also manually cd into the
52 work dir, source the environment file to get the portage build
53 environment, and run emake bigtest in all three subdirs (the ebuild uses
54 a loop thru the subdirs to run smalltest, you can do the same for bigtest,
55 or cd into each and run the tests manually). That will take rather
56 longer: perhaps an hour or so for the single subdir, maybe two hours
57 for the double subdir, and the same or longer for long-double.
58 However, the tests don't make very efficient use of the CPU, so if you
59 have a quad-core or better, likely with piledriver anyway, you could
60 probably run the tests for all three subdirs in parallel and still have
61 CPU left to run other things.
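
Running the three subdirs in parallel could be sketched as below, in
Python. The helper name run_parallel and the directory names in the usage
comment are assumptions, so substitute the real subdir paths from the work
dir. Start it from a shell where the portage environment has already been
sourced if you want the same build settings the ebuild uses.

```python
import subprocess


def run_parallel(dirs, cmd):
    """Start cmd in every directory at once, then wait for all of them.

    Returns a dict mapping each directory to its exit code (0 = passed).
    """
    # Launch all processes first so they run concurrently.
    procs = {d: subprocess.Popen(cmd, cwd=d) for d in dirs}
    # Then collect the exit codes as each one finishes.
    return {d: p.wait() for d, p in procs.items()}


# Hypothetical usage (directory names are placeholders):
#   results = run_parallel(["single", "double", "long-double"],
#                          ["make", "bigtest"])
#   failed = [d for d, code in results.items() if code != 0]
```

Output from the three runs will interleave on the terminal, so redirecting
each run to its own log file may be worth adding before relying on this.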
62
63 If it passes (e)make smalltest (in the ebuild test phase) and the manual
64 (e)make bigtest, for all three subdirs, with USE=fma, on an fma3 hardware
65 system, it should be safe to change the USE flag description to say it
66 can be used for either fma3 or fma4 hardware. If not, then since it does
67 seem to work on my fma4 hardware, perhaps the flag should be changed to
68 fma4.
69
70 So any help testing fma3 hardware would definitely be appreciated. Please
71 report results on the bug. Anyone with fma4 hardware can double-check my
72 results as well, but it does seem to work here.
73
74 Thanks. =:^)
75
76 --
77 Duncan - List replies preferred. No HTML msgs.
78 "Every nonfree program has a lord, a master --
79 and if you use the program, he is your master." Richard Stallman