I'm catching up a bit here, since I wasn't subscribed to the mailing
list with this email address when I found this thread.
If anything sounds odd, read through to the end.
I'm trying to top reply, so I'm leaving my 'backstory' till the end.

> Rich Freeman <rich0 <at> gentoo.org> writes:
>
> > James <wireless <at> tampabay.rr.com> writes:
> >
> > I bet our friends at RackSpace will provide all the virtual HorsePower
> > you need, should google not provide the hundreds/thousands or cores for
> > you to run on.
> >
> My guess is that the hardware to run all this on is the simplest part
> of the problem. If somebody writes something good and we build the
> processes to support it, then I'm sure we can find donors to provide
> the CPU cycles. ChromeOS could probably steal whatever we put
> together so there is an obvious synergy there, and I'm sure
> Amazon/Rackspace/etc are always looking for use cases to show off.
>

I agree about 80% with that, well put. The disagreeing 20% is pretty much
all about physical hardware; this is where, to quote a Red Hat employee I know,
'the work gets interesting'. For the bulk of the work we can easily use virtual
machines, scavenge bargain-basement EC2 spot instance hours, and we have lots
of other options. You're right that this will be easy: the x86/amd64 arch testing
won't be hard to find a home for. It's all the other arch work that won't be easy.
I'm currently in the process of obtaining AMD Opteron A1100 dev kit boards,
and let's just say I'm not expecting our software to 'boot first time'.
Red Hat kindly keep the Beaker project (https://beaker-project.org) moving
forward, which will be how I deal with these AMD dev kits. It only really helps
when you have hardware you can put aside to be part of a test pool, but
it is one of the few tools available to easily boot hardware, splat an OS
onto it, connect and perform automated tests on it, and get as much info out
as possible even if there are kernel issues and it doesn't boot properly.
Once things progress I'd be amenable to letting my dev boards do ebuild
test runs when I'm not using them to port our software stack.

So while I don't have the 'idle hardware budget' of AWS, Google or Rackspace,
I am building a cloud platform with a customised version of CoreOS,
which is based on ChromeOS, which is based on Gentoo.
The further the company progresses, the more drift I see between
our 'OS' and 'CoreOS'. I could not imagine tackling this if the entire thing
wasn't built on the foundation of a Gentoo-based OS. So consider me
an ardent supporter of actually getting Gentoo automatic testing.

> Rich Freeman <rich0 <at> gentoo.org> writes:
>
> From past feedback from Diego and such the biggest issue with any kind
> of tinderbox is dealing with the output. As has been pointed out
> there are folks running Repoman regularly already, and there have been
> past tinderbox efforts. The real issue is to improve the signal/noise
> ratio. You'll only get so far with that using code - there has to be
> process change to support it.
>
> If we were to do something like this long-term I'd envision that this
> would run on every commit or something like that, and the commit
> doesn't go into the rsync tree until it passes. If the tree goes red
> then people stop doing commits until it is fixed, and somebody gets
> their wrist slapped. That is my sense of how most mature
> organizations do CI. The tinderbox is really just doing verification
> - stuff should already be tested BEFORE it goes in the tree. There
> also shouldn't be any false positives. There would need to be a
> mechanism to flag ebuilds with known issues so that the tinderbox can
> ignore them, and of course we can monitor that to ensure it isn't
> abused.
>
> Basically doing this sort of thing right requires a big change in
> mindset. You can't just throw stuff at the tree and wait for the bug
> reports to come in. You can't just make dealing with the tinderbox
> the problem of the poor guy running the tinderbox. The tinderbox
> basically has to become the boss and everybody has to make part of
> their job keeping the boss happy.

1st - Mindset change, definitely required. Can't agree more.
2nd - CI on this kind of thing is a multi-headed hydra. The
processes we wind up with will be quite similar philosophically but not
necessarily similar in implementation, staging or any other area.
For starters, most CI pipelines aren't testing an entire distro as complex
as Gentoo ;-)
3rd - Looking at this linearly is less than ideal. If an update breaks
5 downstream packages, the entire tree shouldn't go red, and the only
person who should stop is the maintainer and/or committer who submitted
the broken package. It should be more like automated QA 'gates' than a
pass/fail build pipeline.
4th - Signal to noise is crucial! I'm going to be actively doing
something here because I have a build process that builds ebuilds
using portage, and when a build can take 30 minutes and then an hour to test
the final machine image, I need to fail the build early and flat out
prevent any broken but installable package versions from ever reaching the
machine image testing stage.
5th - Ideally I think this will need multiple tools working together.
However, as far as building something that can start doing the
job, I feel like buildbot (http://buildbot.net) is probably the best
place to start. It's more of a CI 'framework' than a CI 'tool', so it won't
put too many 'not designed to do that' roadblocks in front of us. It's just my
first recommendation; I'm sure the matter will need much more discussion.
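
To make the 3rd and 4th points a bit more concrete, here is a rough Python
sketch of what a per-commit 'gate' could look like. Everything in it is made
up for illustration: the REVERSE_DEPS map, the package names, the run_gate
function and the stage names are hypothetical and are not part of portage,
buildbot or any existing Gentoo tooling. The build and image-test steps are
passed in as plain callables, so nothing here runs a real build.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical reverse-dependency map: package -> packages that depend on it.
REVERSE_DEPS: Dict[str, List[str]] = {
    "dev-libs/foo": ["app-misc/bar", "app-misc/baz"],
}


@dataclass
class GateResult:
    commit: str
    passed: bool
    broken: List[str] = field(default_factory=list)      # packages that failed to build
    stages_run: List[str] = field(default_factory=list)  # which stages actually ran


def run_gate(
    commit: str,
    package: str,
    build_ok: Callable[[str], bool],
    image_test: Callable[[], bool],
) -> GateResult:
    """Gate one commit: cheap build checks first, expensive image test last."""
    result = GateResult(commit=commit, passed=True)

    # Stage 1 (cheap): build the changed package and everything that depends on it.
    result.stages_run.append("build")
    for pkg in [package] + REVERSE_DEPS.get(package, []):
        if not build_ok(pkg):
            result.broken.append(pkg)

    if result.broken:
        # Fail early: only this commit is blocked, and the image test never starts.
        result.passed = False
        return result

    # Stage 2 (expensive): only reached when every build above succeeded.
    result.stages_run.append("image-test")
    result.passed = image_test()
    return result


if __name__ == "__main__":
    # Pretend the update to dev-libs/foo breaks one downstream package.
    outcome = run_gate(
        "abc123",
        "dev-libs/foo",
        build_ok=lambda pkg: pkg != "app-misc/bar",
        image_test=lambda: True,
    )
    print(outcome)
```

A failing gate blocks only the commit it ran for and names the downstream
packages it broke, rather than turning the whole tree red, and the hour-long
image test is skipped for that commit.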

> James <wireless <at> tampabay.rr.com> writes:
>
> Or some other kind of hook? Fishing was very difficult till the hook
> was refined. WE need a hook, imho. Easy Gentoo and a project to assist
> in testing code; you've got a killer idea there rich, and I'm glad to
> be on your team!
>

In my mind, Gentoo has the kind of hook you speak of. We just haven't
refined it or done a good job of showing it off to the world.
The ebuild format is one of the most powerful packaging standards.
It isn't perfect but it's an extremely good start. It lets us go anywhere...
About a year ago I surprised a long-time OS X user / Linux software developer
twice in the same conversation, after they told me how they used to
use MacPorts before switching to Homebrew. First by mentioning that
NetBSD's pkgsrc works on OS X, then a second time when I told them
about the Gentoo Prefix project. They wanted 'freshness' in the software,
and felt only Homebrew had it, but that was only because neither the pkgsrc
nor the Gentoo Prefix guys go out of their way to promote themselves as a
tool for that job the way Homebrew and MacPorts do.

An automatic test battery for Gentoo could possibly catapult Prefix
and similar sub-projects forward to much greater adoption, and help
Gentoo reap secondary benefits as a result.

Now that all the replying is done, here's the big question: where do we
organise the bigger-picture things? This mailing list? IRC? A new mailing list?
I want to get involved in this because I'm going to be building some
parts of this already. Why wouldn't I want to give back and help make
Gentoo better?