Gentoo Archives: gentoo-project

From: Raymond Jennings <shentino@×××××.com>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] How to improve detection of unmaintained packages?
Date: Sat, 23 Mar 2019 17:05:46
Message-Id: CAGDaZ_oPGZtcOzhUBZ9jdfJk6vqOac3VKeTBUM0+r1OMY4HYgA@mail.gmail.com
In Reply to: Re: [gentoo-project] How to improve detection of unmaintained packages? by Alec Warner
1 On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@g.o> wrote:
2
3 >
4 >
5 > On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@g.o> wrote:
6 >
7 >> Hi,
8 >>
9 >> Gentoo is still having a major problem of unmaintained packages.
10 >> I'm not talking about pure 'maintainer-needed' here but packages that
11 >> have apparent maintainers and stay under the radar for long, harming
12 >> users in the process. I'd like to query potential solutions as how we
13 >> could improve this and look for new maintainers sooner.
14 >>
15 >>
16 >> The current state
17 >> =================
18 >> The definition of an unmaintained package here is a bit blurry. For our
19 >> needs, let's say that an unmaintained package is a package that is not
20 >> getting attention of any of the maintainers, whose bugs are not looked
21 >> at, that does not receive version bumps or simply fails to build for
22 >> a long time.
23 >>
24 >> This is especially the case with 'revived herds', i.e. projects that
25 >> were formed from old herds. Their main characteristic is that they
26 >> 'maintain' a large number of loosely-related packages, and their
27 >> developers take care of only a small subset of them. Sadly, we still
28 >> have people who cherish that model, and instead of taking packages they
29 >> care about themselves, they shove it into one of 'their' herds.
30 >>
31 >> So far we're rarely catching such cases directly. Sometimes it happens
32 >> when another developer tries to use the package and notices the problem,
33 >> then finds that it's been reported a long time ago and never received
34 >> any attention.
35 >>
36 >> Sometimes, after retiring a developer we notice that he had 'maintained'
37 >> packages that were broken for years and never received any attention.
38 >> There are even real cases of developers taking over broken packages just
39 >> to prevent them from being lastrited but without ever fixing them.
40 >>
41 >> Then, some of the packages are noticed as result of major API update
42 >> trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker.
43 >> Those API changes provoke build failures, and while investigating them
44 >> we discover that some of the software hasn't seen any upstream attention
45 >> since 2000 (!), not to mention maintainers that could actually patch
46 >> the issues.
47 >>
48 >>
49 >> Version bump-based inactivity?
50 >> ==============================
51 >> One of the options would be to monitor inactivity as negligence to bump
52 >> packages. With euscan and/or repology, we are at least able to
53 >> partially monitor and report new versions of software (I think someone
54 >> used to do that but I don't see those reports anymore). While this
55 >> still requires some manual processing (esp. given that repology results
56 >> are sometimes mistaken), it would be a step forward.
57 >>
58 >> The counterargument to doing this is that not all version bumps are
59 >> meaningful to Gentoo. We'd have to at least be able to filter out
60 >> development releases if maintainers are not doing them. Sometimes we
61 >> also skip releases if they don't introduce anything meaningful to Gentoo
62 >> users. Finally, some developers reject new versions of software for
63 >> various reasons.
64 >>
65 >
66 > I've also considered just using time.
67 >
68 > Many *packages* have not been touched in N time. While some software
69 > doesn't get updates often, even routine maintenance should require edits on
70 > a fairly regular basis.
71 >
72 >
73 >>
74 >>
75 >> Bugzilla-based inactivity?
76 >> ==========================
77 >> I've noticed something interesting in Fedora lately. They have a policy
78 >> that if a package build failure is reported (note: they are reporting
79 >> them automatically) and the maintainer does not update it from the 'NEW'
80 >> state, it is automatically orphaned after 8 weeks. Effectively,
81 >> if the maintainer does not take care (or at least pretends to)
82 >> of the package, it is orphaned automatically.
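A minimal sketch of the Fedora-style check described above, assuming a bug record exposes a status string and a report date. The function name, the parameter names, and the non-'NEW' status used in the example are illustrative assumptions, not part of any Bugzilla API.

```python
from datetime import date, timedelta

# Eight weeks, as in the Fedora policy quoted above.
ORPHAN_AFTER = timedelta(weeks=8)


def should_orphan(bug_status: str, reported_on: date, today: date) -> bool:
    """Return True if a build-failure bug was left in 'NEW' for 8 weeks or more.

    Any status other than 'NEW' counts as the maintainer having reacted,
    which is exactly the loophole the 'busywork' objection points at.
    """
    return bug_status == "NEW" and (today - reported_on) >= ORPHAN_AFTER


if __name__ == "__main__":
    today = date(2019, 3, 23)
    print(should_orphan("NEW", date(2019, 1, 1), today))
    print(should_orphan("ASSIGNED", date(2019, 1, 1), today))
```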
83 >>
84 >> I suppose we might be able to look for a similar policy in Gentoo.
85 >> However, there are two obvious counterarguments. Firstly, this would
86 >> create 'busywork' that people would be required to do in order to
87 >> prevent their packages from being orphaned. Secondly, a fair number of
88 >> developers would just do this 'busywork' to every new bug just to avoid
89 >> the problem, rendering the measure ineffective.
90 >>
91 >
92 > Avoid letting the perfect be the enemy of the good here. Any metric can be
93 > gamed by developers; but it turns out we must choose some metric to drive
94 > the organization. I'm fairly sure not *all* developers will automate this
95 > busywork; because *some* of us want to see the number of unmaintained
96 > packages reduced; resulting in a net-win.
97 >
98 >
99 >>
100 >>
101 >> What can we actually do?
102 >> ========================
103 >> Do you have any specific ideas how we could actually improve
104 >> the situation? I'm particularly looking for things we could do at least
105 >> semi-automatically, without having to spend tremendous effort looking
106 >> through thousands of unhandled bugs manually.
107 >>
108 >
109 > So I'd recommend avoiding a specific implementation; which means don't
110 > trigger off of a specific signal.
111 >
112 > Signals:
113 > 1) euscan first, because it's the most accurate and plausibly already
114 > implemented.
115 > 2) Date-based scanning; it's trivial to implement.
116 >
117 > So now for each package, we have 2 straightforward signals. When was it
118 > last touched, how many versions behind?
119 >
120 > Rules:
121 > A package is unmaintained if it:
122 > - Has not been touched in 5 years
123 > - Is behind 3 versions AND hasn't been touched in 2 years
124 > - Is behind 5 versions AND hasn't been touched in 1 year
125 >
126 > As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
127 > additional rules.
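The three rules above can be sketched as a small table-driven check. This is only an illustration: the `PackageSignals` type and its fields are hypothetical, and it assumes the two signals (last-touched date and versions behind) have already been collected, for example from the repository history and from euscan.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PackageSignals:
    name: str
    last_touched: date     # assumed: date of the last change to the package
    versions_behind: int   # assumed: count reported by a scanner such as euscan


# (minimum versions behind, minimum years untouched), one entry per rule above.
RULES = [
    (0, 5),  # not touched in 5 years
    (3, 2),  # behind 3 versions AND not touched in 2 years
    (5, 1),  # behind 5 versions AND not touched in 1 year
]


def is_unmaintained(pkg: PackageSignals, today: date) -> bool:
    """Return True if any rule matches; new signals would add new rule entries."""
    years_idle = (today - pkg.last_touched).days / 365.25
    return any(
        pkg.versions_behind >= behind and years_idle >= years
        for behind, years in RULES
    )


if __name__ == "__main__":
    today = date(2019, 3, 23)
    stale = PackageSignals("app-misc/example", date(2016, 6, 1), 3)
    print(stale.name, is_unmaintained(stale, today))
```

Keeping the thresholds in a list rather than in nested conditionals makes it easy to tune them or to add rules once more signals exist.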
128 >
129 > We could generate a QA report per package on the qa reports page.
130 > If there is an API for requesting the QA report, we could cross-link from
131 > p.g.o.
132 >
133 > -A
134 >
135 >
136 >
137 >> --
138 >> Best regards,
139 >> Michał Górny
140 >>
141 >>
142 As a side observation I'd like to exempt a package from being flagged as
143 unmaintained if there's nothing wrong with it. If upstream is idle and the
144 package is in a quiet state simply because no work needs doing, then
145 the package should be left alone. I think packages should be flagged in
146 progressive phases.
147
148 Phase 1 could determine if the package warrants attention, and my proposed
149 metric for this is whether there are outstanding bugs on Bugzilla. For this
150 purpose an outstanding bug is anything regarding the package, including
151 revbumps, stablereqs, as well as actual defect/qa/buildfail related bugs.
152 In essence, this uses Bugzilla as a central point of data collection and a
153 radar for trouble.
154
155 Phase 2 could take up any phase 1 candidates and audit them for a lack
156 of maintainership. A "maintainer wanted" or "maintainer needed" status
157 could escalate the package in question to phase 2, as could a
158 timestamp check on the latest activity for the package. If the package is in
159 "phase 1" status due to an outstanding bug, and either lacks a maintainer
160 altogether or fails a dormancy test, then the package is promoted to "phase
161 2".
162
163 Phase 3 could be where we take remedial action. If the package has a
164 maintainer this would be a good point to contact them. It could also include
165 a more comprehensive audit of the package's lack of maintainership, and so on.
166 A package that has entered "phase 2" has already been established as having
167 outstanding bugs AND failed whatever automated sort of audit is done to
168 check for being unmaintained.
169
170 Phase 4 is the package being officially marked as unmaintained, and at this
171 point it could probably be put on treecleaner's radar or however else we
172 wish to handle unmaintained packages. If the package has a maintainer that
173 failed to respond during phase 3, this could well raise a concern of its
174 own about that maintainer's own performance.
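
The automatable part of the phases above (phases 1 and 2) might look like the following sketch. Everything named here is an assumption for illustration: the `PackageStatus` fields, the `Phase` names, and in particular the one-year dormancy limit, which the proposal leaves open. Phases 3 and 4 involve contacting people and making a decision, so they are deliberately left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum


class Phase(IntEnum):
    QUIET = 0            # no outstanding bugs: leave the package alone
    NEEDS_ATTENTION = 1  # phase 1: outstanding bugs exist
    AUDIT_CANDIDATE = 2  # phase 2: bugs plus no maintainer or dormancy


# Assumed threshold for the dormancy test; not specified in the proposal.
DORMANCY_LIMIT = timedelta(days=365)


@dataclass
class PackageStatus:
    name: str
    open_bugs: int        # any open bug: bumps, stablereqs, defects, build failures
    has_maintainer: bool  # False for maintainer-needed packages
    last_activity: date   # date of the latest activity on the package


def automated_phase(pkg: PackageStatus, today: date,
                    dormancy_limit: timedelta = DORMANCY_LIMIT) -> Phase:
    """Classify a package into phase 0, 1 or 2 from the collected signals."""
    if pkg.open_bugs == 0:
        # Idle upstream with nothing wrong is not a problem by itself.
        return Phase.QUIET
    dormant = (today - pkg.last_activity) > dormancy_limit
    if not pkg.has_maintainer or dormant:
        return Phase.AUDIT_CANDIDATE
    return Phase.NEEDS_ATTENTION


if __name__ == "__main__":
    today = date(2019, 3, 23)
    pkg = PackageStatus("app-misc/example", 2, True, date(2017, 1, 1))
    print(pkg.name, automated_phase(pkg, today).name)
```

Checking for open bugs first is what exempts quiet but healthy packages: an old last-activity date alone never moves a package past phase 0.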
