Gentoo Archives: gentoo-dev

From: Alec Warner <antarus@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Automatic testing on Gentoo
Date: Wed, 11 May 2011 17:40:03
Message-Id: BANLkTim6Gh5pHe3Zg0y4OvfuGAUmkhZ8Ww@mail.gmail.com
In Reply to: Re: [gentoo-dev] Automatic testing on Gentoo by Jack Morgan
On Wed, May 11, 2011 at 6:12 AM, Jack Morgan <jack@×××××××.com> wrote:
>
>
> On 05/10/2011 01:13 PM, Jorge Manuel B. S. Vicetto wrote:
>> Hi.
>>
>> Another issue that was raised in the discussion with the arch teams,
>> even though it predates the arch teams resources thread as we've talked
>> about it at FOSDEM 2011 and even before, is getting more automatic
>> testing done on Gentoo.
>>
>> I'm bcc'ing a few teams on this thread as it involves them and hopefully
>> might interest them as well.
>>
>> Both the Release Engineering and QA teams would like to have more automatic
>> testing to find breakages and to help track "when" things break and, more
>> importantly, *why* they break.
>>
>> To avoid misunderstandings, we already have testing and even automated
>> testing being done on Gentoo. The "first line" of testing is done by
>> developers using repoman and/or the PM's QA tools. We also have
>> individual developers and the QA team hopefully checking commits and
>> everyone testing their packages.
>>
>> Furthermore, the current weekly automatic stage building has helped
>> identify some issues with the tree. The tinderbox work done by Patrick
>> and Diego, as well as others, has also helped find broken packages
>> and/or identify packages affected by major changes before they hit
>> the tree. The use of repoman, pcheck and/or paludis quality assurance
>> tools, past and present, to generate reports about tree issues,
>> like Michael's (mr_bones) emails, has also helped identify and
>> address issues.
>>
>> Recently, we got a new site to check the results of some tests,
>> http://qa-reports.gentoo.org/, with the possibility of adding more
>> scripts to provide / run even more tests.
>>
>> So, why "more testing"? For starters, more *automatic* testing. Then
>> more testing, as reports from testing can help greatly in identifying
>> when things break and why they break. As someone who looks over the
>> automatic stage building for amd64 and x86, and who has to talk to
>> teams / developers when things break, having more, more in-depth and
>> regular automatic testing would help my (releng) job. The work for the
>> live-dvd would also be easier if the builds were "automated" and the job
>> wasn't "restarted" every N months. Furthermore, creating a framework for
>> developers to be able to schedule testing for proposed changes, in
>> particular for substantial changes, might (should?) help improve the
>> user's experience.
>>
>> I hope you agree with "more testing" by now, but what testing? It's good
>> to test something, but "what" do we want to test and "how" do we want to
>> test?
>>
>>
>> I think we should try to have at least the following categories of tests:
>>
>>  * Portage (overlays?) QA tests
>>       tests with the existing QA tools to check the consistency of
>> dependencies and the quality of ebuilds / eclasses.

These are almost two separate things. I assume your intent was "let's
automate pcheck & co. runs of gentoo-x86, and if we get that working we
can add overlays from layman", which sounds fine to me ;)
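
A first cut could be as simple as a cron job that loops over the repos
we care about. A rough sketch in Python (the repo paths and the exact
pcheck invocation are assumptions, not a tested config):

    #!/usr/bin/env python
    # Rough sketch of a scheduled QA sweep; paths and pcheck flags
    # are illustrative assumptions, not a working setup.
    import subprocess

    REPOS = {
        "gentoo-x86": "/var/gentoo/repos/gentoo-x86",
        # overlays from layman would be added here once this works,
        # e.g. "sunrise": "/var/lib/layman/sunrise",
    }

    for name, path in sorted(REPOS.items()):
        log = "/var/log/qa-reports/%s.txt" % name
        with open(log, "w") as out:
            # pcheck (pkgcore) walks the tree and emits QA warnings.
            subprocess.call(["pcheck", "--repo", path], stdout=out)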

>>
>>  * (on demand?) package (stable / unstable) revdep rebuild (tinderbox)
>>       framework to schedule testing of proposed changes and check their impact

I'd be curious what the load is here. We are adopting an on-demand
testing infrastructure at work. Right now we have a continuous build,
but it is time-delta based rather than event-based, so it groups changes
together, which makes it hard to find what broke things. At work we
only submit a few changes a day though, so we need a very small
infrastructure to test each change. Gentoo has way more commits (at
least one every few minutes on average, and then there are huge
commits like KDE stabilization...)
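
When a batched build does break, the usual fallback is bisecting the
batch. The idea in a hypothetical Python sketch, where build_at() is a
stand-in for "sync to this commit and run the build":

    # Find the first bad commit in a batch, assuming the tree built
    # fine before commits[0] and fails at commits[-1]. build_at() is
    # a hypothetical hook returning True if the build succeeds.
    def first_bad(commits, build_at):
        lo, hi = 0, len(commits) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if build_at(commits[mid]):
                lo = mid + 1    # still good here: breakage is later
            else:
                hi = mid        # already broken: at mid or earlier
        return commits[lo]

Per-commit (event-based) testing avoids those log(n) rebuilds entirely,
at the cost of a much bigger build farm.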

What I'd recommend here is essentially some kind of control field in
the commit itself (commitmsg?) that controls exactly what tests are
done for that commit.
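
Something like a "Test:" trailer the scheduler greps out of the commit
message, say. A minimal Python sketch; the field name and the suite
names are invented for illustration:

    import re

    # Hypothetical "Test:" trailer in a commit message, e.g.
    #
    #   app-foo/bar: fix libpng deps
    #
    #   Test: revdep livedvd
    #
    # Nothing enforces this; it's just a convention the test
    # scheduler would look for.
    TRAILER = re.compile(r"^Test:\s*(.+)$", re.MULTILINE)

    def requested_tests(commitmsg):
        m = TRAILER.search(commitmsg)
        if m is None:
            return ["default"]   # no field: run the cheap default suite
        return m.group(1).split()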

>>
>>  * Weekly (?) stable / unstable stage / ISO arch builds
>>       the automatic stage building, including new specs for the testing tree
>> as we currently only test the stable tree.

I'm curious: if you constantly build unstable, do you plan on fixing
it? My understanding of Gentoo is that in ~arch something is always
slightly broken and that's OK. I worry that ~arch builds may just end
up being noise because they don't build properly, due to the high
velocity of changes.

>>
>>  * (schedule?) specific tailored stage4 builds
>>       testing of specific tailored "real world" images (web server, intranet
>> server, generic desktop, GNOME desktop, KDE desktop, etc).

Again, it would be interesting to have some kind of control field in my
commits so that when KDE goes stable I could trigger a build of the
'KDE stage4' or whatnot.
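
With the hypothetical "Test:" trailer from above, that stabilization
commit could carry something like (the suite name is also invented):

    kde-base/kdelibs: x86 stable wrt bug #XXXXXX

    Test: stage4-kde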

If we ever finish this gentoo-stats project, it would be interesting to
see what users are actually using as well. Do users use the defaults?
Are the stage4s we are testing actually relevant?

>>
>>  * Bi-Weekly (?) stable / unstable AMD64/X86 LiveDVD builds
>>       automatic creation of the live-DVD to test a very broad set of packages
>>
>>  * automated testing of built stage / CD / LiveDVD (KVM guest?) (CLI /
>> GUI / log parsing ?)
>>       framework to test the built stages / install media and ensure it works
>> as expected

I think testing that the LiveDVD we just built boots is a decent test
(and probably not too difficult to write). Testing that "everything on
the DVD works" is likely more of a challenge, and I'm not sure it buys
us anything. Do we often find that we release LiveDVDs with broken
software?
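
The boot check really can be small: boot the ISO headless under KVM and
watch the serial console for a login prompt. A Python sketch, assuming
the image puts a console on ttyS0 (the qemu binary name, memory size
and prompt string are all guesses):

    import subprocess

    # Rough sketch: boot the freshly built ISO under KVM with the
    # serial console on stdio and wait for something that looks
    # like a login prompt. Assumes the image sets console=ttyS0.
    def boots(iso):
        qemu = subprocess.Popen(
            ["qemu-kvm", "-m", "1024", "-cdrom", iso, "-boot", "d",
             "-nographic"],
            stdout=subprocess.PIPE)
        while True:
            line = qemu.stdout.readline()
            if not line:         # qemu exited without ever prompting
                return False
            if "login:" in line:
                qemu.kill()      # good enough: we reached a login
                return True

A real version would want a hard timeout around the whole thing, but
that's the entire idea.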

>>
>>
>> I don't have a framework for conducting some of these tests, including
>> the stage/iso validation, but some of them can use the existing tools
>> like the stage building and the tree QA tests.
>>
>> Do you have any suggestions about the automatic testing? Do you know of
>> other tests or tools that we can and should use to improve QA on Gentoo?
>
> You might take a look at autotest from kernel.org. It's a Python-based
> framework for automating testing. It's geared towards kernel testing,
> but could be modified for your needs.

Autotest would likely require a branch and a fair bit of work to be
used for OS qualification. We use it for OS qualification at work
(Goobuntu@Google).

While I hesitate to say "roll your own", if you can get something
working in 1-2 months I can certainly see it being easier to maintain
than autotest... there really is no killer feature that autotest
has. The reporting / graphing is pretty bad, it uses ssh for
everything and basically keeps long-running connections open (might be
fine if you are using kvm, but not over the WAN), the API is terrible
and requires all kinds of horribleness to use... I could go on ;)

>
>
>
>
> --
> Jack Morgan
> Pub 4096R/761D8E0A 2010-09-13 Jack Morgan <jack@×××××××.com>
> Fingerprint = DD42 EA48 D701 D520 C2CD 55BE BF53 C69B 761D 8E0A
>
>