On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@g.o> wrote:

>
>
> On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@g.o> wrote:
>
>> Hi,
>>
>> Gentoo still has a major problem with unmaintained packages.
>> I'm not talking about pure 'maintainer-needed' here but packages that
>> have apparent maintainers and stay under the radar for long, harming
>> users in the process. I'd like to query potential solutions as to how we
>> could improve this and look for new maintainers sooner.
>>
>>
>> The current state
>> =================
>> The definition of an unmaintained package here is a bit blurry. For our
>> needs, let's say that an unmaintained package is a package that is not
>> getting the attention of any of the maintainers, whose bugs are not looked
>> at, that does not receive version bumps or simply fails to build for
>> a long time.
>>
>> This is especially the case with 'revived herds', i.e. projects that
>> were formed from old herds. Their main characteristic is that they
>> 'maintain' a large number of loosely-related packages, and their
>> developers take care of only a small subset of them. Sadly, we still
>> have people who cherish that model, and instead of taking packages they
>> care about themselves, they shove them into one of 'their' herds.
>>
>> So far we're rarely catching such cases directly. Sometimes it happens
>> when another developer tries to use the package and notices the problem,
>> then finds that it's been reported a long time ago and never received
>> any attention.
>>
>> Sometimes, after retiring a developer we notice that he had 'maintained'
>> packages that were broken for years and never received any attention.
>> There are even real cases of developers taking over broken packages just
>> to prevent them from being lastrited but without ever fixing them.
>>
>> Then, some of the packages are noticed as a result of major API update
>> trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker.
>> Those API changes provoke build failures, and while investigating them
>> we discover that some of the software hasn't seen any upstream attention
>> since 2000 (!), not to mention maintainers that could actually patch
>> the issues.
>>
>>
>> Version bump-based inactivity?
>> ==============================
>> One of the options would be to treat neglected version bumps as a sign
>> of inactivity. With euscan and/or repology, we are at least able to
>> partially monitor and report new versions of software (I think someone
>> used to do that but I don't see those reports anymore). While this
>> still requires some manual processing (esp. given that repology results
>> are sometimes mistaken), it would be a step forward.
>>
>> The counterargument to doing this is that not all version bumps are
>> meaningful to Gentoo. We'd have to at least be able to filter out
>> development releases if maintainers are not packaging them. Sometimes we
>> also skip releases if they don't introduce anything meaningful to Gentoo
>> users. Finally, some developers reject new versions of software for
>> various reasons.
>>
>
> I've also considered just using time.
>
> Many *packages* have not been touched in N time. While some software
> doesn't get updates often, even routine maintenance should require edits on
> a fairly regular basis.
>
>
>>
>>
>> Bugzilla-based inactivity?
>> ==========================
>> I've noticed something interesting in Fedora lately. They have a policy
>> that if a package build failure is reported (note: they are reporting
>> them automatically) and the maintainer does not update it from the 'NEW'
>> state, it is automatically orphaned after 8 weeks. Effectively,
>> if the maintainer does not take care (or at least pretends to)
>> of the package, it is orphaned automatically.
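For what it's worth, the Fedora-style check described above is simple to express. The following is only a rough sketch: the function name, the plain status string, and the way the report date is passed in are my own assumptions, not anything Fedora or our bugzilla actually exposes in this form.

```python
from datetime import datetime, timedelta

# The 8-week window is the figure quoted above; everything else is hypothetical.
ORPHAN_AFTER = timedelta(weeks=8)


def should_orphan(status: str, reported: datetime, now: datetime) -> bool:
    """Return True if a build-failure bug is still in NEW once 8 weeks have passed."""
    return status == "NEW" and now - reported >= ORPHAN_AFTER


if __name__ == "__main__":
    today = datetime(2019, 3, 23)
    # Reported on Jan 1 and never moved out of NEW: past the 8-week window.
    print(should_orphan("NEW", datetime(2019, 1, 1), today))
    # Same age, but the maintainer changed the state at some point.
    print(should_orphan("ASSIGNED", datetime(2019, 1, 1), today))
```

Note that the check only looks at the bug state, which is exactly why it can be satisfied by busywork.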
>>
>> I suppose we might be able to look for a similar policy in Gentoo.
>> However, there are two obvious counterarguments. Firstly, this would
>> create 'busywork' that people would be required to do in order to
>> prevent their packages from being orphaned. Secondly, a fair number of
>> developers would just do this 'busywork' to every new bug just to avoid
>> the problem, rendering the measure ineffective.
>>
>
> Avoid letting the perfect be the enemy of the good here. Any metric can be
> gamed by developers, but it turns out we must choose some metric to drive
> the organization. I'm fairly sure not *all* developers will automate this
> busywork, because *some* of us want to see the number of unmaintained
> packages reduced, resulting in a net win.
>
>
>>
>>
>> What can we actually do?
>> ========================
>> Do you have any specific ideas how we could actually improve
>> the situation? I'm particularly looking for things we could do at least
>> semi-automatically, without having to spend tremendous effort looking
>> through thousands of unhandled bugs manually.
>>
>
> So I'd recommend avoiding a specific implementation, which means don't
> trigger off of a specific signal.
>
> Signals:
> 1) euscan first, because it's the most accurate and plausibly already
> implemented.
> 2) Date-based scanning; it's trivial to implement.
>
> So now for each package, we have 2 straightforward signals: when was it
> last touched, and how many versions behind is it?
>
> Rules:
> A package is unmaintained if it:
> - Has not been touched in 5 years
> - Is behind 3 versions AND hasn't been touched in 2 years
> - Is behind 5 versions AND hasn't been touched in 1 year
>
> As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
> additional rules.
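The three rules above could be written down as a small table, which would also make adding rules for new signals cheap. This is a minimal sketch only: the `PackageSignals` type and its field names are hypothetical, and I am reading "behind 3 versions" as "at least 3 versions behind" and "not touched in N years" as "N years or more".

```python
from dataclasses import dataclass


@dataclass
class PackageSignals:
    years_untouched: float  # time since the package was last touched (hypothetical field)
    versions_behind: int    # e.g. as reported by euscan (hypothetical field)


# (minimum versions behind, minimum years untouched), taken from the rules above.
RULES = [
    (0, 5),
    (3, 2),
    (5, 1),
]


def is_unmaintained(sig: PackageSignals) -> bool:
    """Return True if any single rule matches both of its thresholds."""
    return any(
        sig.versions_behind >= versions and sig.years_untouched >= years
        for versions, years in RULES
    )


if __name__ == "__main__":
    print(is_unmaintained(PackageSignals(years_untouched=2.5, versions_behind=3)))
    print(is_unmaintained(PackageSignals(years_untouched=0.5, versions_behind=10)))
```

A package that is many versions behind but was touched recently is deliberately not flagged, since both signals have to agree.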
>
> We could generate a QA report per package on the qa reports page.
> If there is an API for requesting the QA report, we could cross-link from
> p.g.o.
>
> -A
>
>
>
>> --
>> Best regards,
>> Michał Górny
>>
>>
As a side observation, I'd like to exempt a package from being flagged as
unmaintained if there's nothing wrong with it. If upstream is idle and the
package is in a quiet state simply because there's no work that needs doing,
then the package should be left alone. I think packages should be flagged in
progressive phases.

Phase 1 could determine if the package warrants attention, and my proposed
metric for this is whether there are outstanding bugs on the bugzilla. For
this purpose an outstanding bug is anything regarding the package, including
revbumps, stablereqs, as well as actual defect/qa/buildfail related bugs.
In essence, this uses the bugzilla as a central point of data collection and
a radar for trouble.

Phase 2 could take up any phase 1 candidates to actually audit for a lack
of maintainership, i.e., "maintainer wanted" or "maintainer needed"
status could escalate the package in question to phase 2, as could a
timestamp check on the latest activity for the package. If the package is
in "phase 1" status due to an outstanding bug, and either lacks a maintainer
altogether or fails a dormancy test, then the package is promoted to "phase
2".

Phase 3 could be where we take remedial action. If the package has a
maintainer this would be a good point to contact them. Perhaps a more
comprehensive audit of the package's lack of maintainership could happen
here as well. A package that has entered "phase 2" has already been
established as having outstanding bugs AND failed whatever automated sort of
audit is done to check for being unmaintained.

Phase 4 is the package being officially marked as unmaintained, and at this
point it could probably be put on treecleaner's radar or however else we
wish to handle unmaintained packages. If the package has a maintainer that
failed to respond during phase 3, this could well raise a concern of its
own about that maintainer's own performance.
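To make the progression concrete, here is a rough sketch of how the four phases might be evaluated for one package. All names are hypothetical, the flags would have to be fed from the bugzilla and the repository by some other tool, and two details are my own assumptions rather than part of the proposal: a package with no maintainer at all goes straight to phase 4 because there is nobody to contact, and "failed to respond" is reduced to a single `timed_out` flag.

```python
from dataclasses import dataclass


@dataclass
class Package:
    open_bugs: int           # any outstanding bug: revbump, stablereq, defect, ...
    has_maintainer: bool
    dormant: bool            # failed the timestamp/dormancy check
    contacted: bool = False  # a phase 3 contact attempt has been made
    timed_out: bool = False  # the maintainer did not answer that attempt


def current_phase(pkg: Package) -> int:
    """Return 0 (leave alone) or the phase 1-4 the package currently sits in."""
    if pkg.open_bugs == 0:
        return 0  # nothing wrong with it, however quiet it is
    if pkg.has_maintainer and not pkg.dormant:
        return 1  # warrants attention, but somebody is active
    if not pkg.has_maintainer:
        return 4  # assumption: nobody to contact, so mark it directly
    if not pkg.contacted:
        return 2  # failed the audit, remedial action not started yet
    if not pkg.timed_out:
        return 3  # maintainer contacted, waiting for a reply
    return 4      # maintainer failed to respond during phase 3


if __name__ == "__main__":
    quiet = Package(open_bugs=0, has_maintainer=True, dormant=True)
    stale = Package(open_bugs=3, has_maintainer=True, dormant=True)
    print(current_phase(quiet), current_phase(stale))
```

A maintainer who does respond would presumably start touching the package again, which clears the dormancy flag and drops the package back to phase 1.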