Gentoo Archives: gentoo-project

From: Raymond Jennings <shentino@×××××.com>
To: gentoo-project@l.g.o
Subject: Re: [gentoo-project] How to improve detection of unmaintained packages?
Date: Sat, 23 Mar 2019 17:54:03
Message-Id: CAGDaZ_qR19s8V8ra0huV+SjdHqoAC8nLADMBMaz+Uyw=5H34eA@mail.gmail.com
In Reply to: Re: [gentoo-project] How to improve detection of unmaintained packages? by "Michał Górny"
1 On Sat, Mar 23, 2019 at 10:38 AM Michał Górny <mgorny@g.o> wrote:
2
3 > On Sat, 2019-03-23 at 10:05 -0700, Raymond Jennings wrote:
4 > > On Sat, Mar 23, 2019 at 7:18 AM Alec Warner <antarus@g.o> wrote:
5 > >
6 > > >
7 > > > On Sat, Mar 23, 2019 at 3:32 AM Michał Górny <mgorny@g.o>
8 > wrote:
9 > > >
10 > > > > Hi,
11 > > > >
12 > > > > Gentoo is still having a major problem of unmaintained packages.
13 > > > > I'm not talking about pure 'maintainer-needed' here but packages that
14 > > > > have apparent maintainers and stay under the radar for long, harming
15 > > > > users in the process. I'd like to query potential solutions as how
16 > we
17 > > > > could improve this and look for new maintainers sooner.
18 > > > >
19 > > > >
20 > > > > The current state
21 > > > > =================
22 > > > > The definition of an unmaintained package here is a bit blurry. For
23 > our
24 > > > > needs, let's say that an unmaintained package is a package that is
25 > not
26 > > > > getting attention of any of the maintainers, whose bugs are not
27 > looked
28 > > > > at, that does not receive version bumps or simply fails to build for
29 > > > > a long time.
30 > > > >
31 > > > > This is especially the case with 'revived herds', i.e. projects that
32 > > > > were formed from old herds. Their main characteristic is that they
33 > > > > 'maintain' a large number of loosely-related packages, and their
34 > > > > developers take care of only a small subset of them. Sadly, we still
35 > > > > have people who cherish that model, and instead of taking packages
36 > they
37 > > > > care about themselves, they shove it into one of 'their' herds.
38 > > > >
39 > > > > So far we're rarely catching such cases directly. Sometimes it
40 > happens
41 > > > > when another developer tries to use the package and notices the
42 > problem,
43 > > > > then finds that it's been reported a long time ago and never received
44 > > > > any attention.
45 > > > >
46 > > > > Sometimes, after retiring a developer we notice that he had
47 > 'maintained'
48 > > > > packages that were broken for years and never received any attention.
49 > > > > There are even real cases of developers taking over broken packages
50 > just
51 > > > > to prevent them from being lastrited but without ever fixing them.
52 > > > >
53 > > > > Then, some of the packages are noticed as result of major API update
54 > > > > trackers, such as the openssl-1.1+ tracker or ncurses[tinfo] tracker.
55 > > > > Those API changes provoke build failures, and while investigating
56 > them
57 > > > > we discover that some of the software hasn't seen any upstream
58 > attention
59 > > > > since 2000 (!), not to mention maintainers that could actually patch
60 > > > > the issues.
61 > > > >
62 > > > >
63 > > > > Version bump-based inactivity?
64 > > > > ==============================
65 > > > > One of the options would be to monitor inactivity as negligence to
66 > bump
67 > > > > packages. With euscan and/or repology, we are at least able to
68 > > > > partially monitor and report new versions of software (I think
69 > someone
70 > > > > used to do that but I don't see those reports anymore). While this
71 > > > > still requires some manual processing (esp. given that repology
72 > results
73 > > > > are sometimes mistaken), it would be a step forward.
74 > > > >
75 > > > > The counterarguments for doing this is that not all version bumps are
76 > > > > meaningful to Gentoo. We'd have to at least be able to filter out
77 > > > > development releases if maintainers are not doing them. Sometimes we
78 > > > > also skip releases if they don't introduce anything meaningful to
79 > Gentoo
80 > > > > users. Finally, some developers reject new versions of software for
81 > > > > various reasons.
82 > > > >
83 > > >
84 > > > I've also considered to just use time.
85 > > >
86 > > > Many *packages* have not been touched in N time. While some software
87 > > > doesn't get updates often, even routine maintenance should require
88 > edits on
89 > > > a fairly regular basis.
90 > > >
91 > > >
92 > > > >
93 > > > > Bugzilla-based inactivity?
94 > > > > ==========================
95 > > > > I've noticed something interesting in Fedora lately. They have a
96 > policy
97 > > > > that if a package build failure is reported (note: they are reporting
98 > > > > them automatically) and the maintainer does not update it from the
99 > 'NEW'
100 > > > > state, it is automatically orphaned after 8 weeks. Effectively,
101 > > > > if the maintainer does not take care (or at least pretends to)
102 > > > > of the package, it is orphaned automatically.
103 > > > >
104 > > > > I suppose we might be able to look for a similar policy in Gentoo.
105 > > > > However, there are two obvious counterarguments. Firstly, this would
106 > > > > create 'busywork' that people would be required to do in order to
107 > > > > prevent from orphaning their packages. Secondly, a fair number of
108 > > > > developers would just do this 'busywork' to every new bug just to
109 > avoid
110 > > > > the problem, rendering the measure ineffective.
111 > > > >
112 > > >
113 > > > Avoid letting the perfect be the enemy of the good here. Any metric
114 > can be
115 > > > gamed by developers; but it turns out we must choose some metric to
116 > drive
117 > > > the organization. I'm fairly sure not *all* developers will automate
118 > this
119 > > > busywork; because *some* of us want to see the number of unmaintained
120 > > > packages reduced; resulting in a net-win.
121 > > >
122 > > >
123 > > > >
124 > > > > What can we actually do?
125 > > > > ========================
126 > > > > Do you have any specific ideas how we could actually improve
127 > > > > the situation? I'm particularly looking for things we could do at
128 > least
129 > > > > semi-automatically, without having to spend tremendous effort looking
130 > > > > through thousands of unhandled bugs manually.
131 > > > >
132 > > >
133 > > > So I'd recommend avoiding a specific implementation; which means don't
134 > > > trigger off of a specific signal.
135 > > >
136 > > > Signals:
137 > > > 1) euscan first; because its most accurate and plausible already
138 > > > implemented.
139 > > > 2) Date-based scanning; its trivial to implement.
140 > > >
141 > > > So now for each package, we have 2 straightforward signals. When was it
142 > > > last touched, how many versions behind?
143 > > >
144 > > > Rules:
145 > > > A package is unmaintained if it:
146 > > > - Has not been touched in 5 years
147 > > > - Is behind 3 versions AND hasn't been touched in 2 years
148 > > > - Is behind 5 versions AND hasn't been touched in 1 years
149 > > >
150 > > > As we add more signals (e.g. doesn't build, or unfixed bugs) we can add
151 > > > additional rules.
152 > > >
153 > > > We could generate a QA report per package on the qa reports page.
154 > > > If there is an API for request the QA report, we could cross-link from
155 > > > p.g.o.
156 > > >
157 > > > -A
158 > > >
159 > > >
160 > > >
161 > > > > --
162 > > > > Best regards,
163 > > > > Michał Górny
164 > > > >
165 > > > >
166 > > As a side observation I'd like to exempt a package from being flagged as
167 > > unmaintained if there's nothing wrong with it. If upstream is idle and
168 > the
169 > > package in a quiet state simply because there's no work needing done,
170 > then
171 > > the package should be left alone.
172 >
173 > This is the attitude that means that few months later a single person is
174 > overburdened with a few dozens unmaintained packages all suddenly
175 > falling apart. Just like ncurses[tinfo]. Or openssl-1.1.
176 >
177
178 I wanted to point out that a package shouldn't be flagged as unmaintained
179 in the first place unless something about it actually needs maintaining.
180 Packages with no known problems should be weeded out as candidates under
181 the principle of "if it isn't broke, don't fix it", since there's nothing
182 wrong with such a package remaining at the status quo.
183
184 As it is, the phase 4 I proposed is meant to catch broken packages that
185 either a) have no maintainer at all, or b) have a maintainer who is
186 completely incommunicado, and not just busy.
187
188 To clarify context though, could you give an example, however
189 hypothetical, of "all suddenly falling apart"? Perhaps you mean a
190 package that is a widespread dependency, and its revdeps all break at the
191 same time due to some sort of API change or the like? Is this what you
192 meant by ncurses and openssl-1.1?
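
To make the disagreement concrete, here is a rough sketch of Alec's three
rules quoted above, with the exemption I'm arguing for as an optional switch.
Everything named here (PackageSignals, open_problems, require_problem) is
hypothetical and not an existing Gentoo tool or API. I'm also assuming that
"behind N versions" means "at least N" and that a year is 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta

YEAR = timedelta(days=365)  # assumption: a "year" is 365 days


@dataclass
class PackageSignals:
    """Hypothetical per-package signals; field names are illustrative only."""
    name: str
    last_touched: date      # date of the last commit touching the package
    versions_behind: int    # e.g. as a version scanner might report
    open_problems: int = 0  # assumed count of known build failures / open bugs


def is_stale(pkg: PackageSignals, today: date) -> bool:
    """The three quoted rules, read as 'at least N versions / N years'."""
    idle = today - pkg.last_touched
    return (
        idle >= 5 * YEAR
        or (pkg.versions_behind >= 3 and idle >= 2 * YEAR)
        or (pkg.versions_behind >= 5 and idle >= 1 * YEAR)
    )


def is_flagged(pkg: PackageSignals, today: date,
               require_problem: bool = False) -> bool:
    """With require_problem=True, quiet packages with no known problems
    are never flagged, however long they have been idle."""
    if require_problem and pkg.open_problems == 0:
        return False
    return is_stale(pkg, today)
```

With require_problem left off, a package that has been idle for six years and
has no known problems is flagged. With it on, the same package is left alone,
which is the behaviour I'm asking for. Whether that exemption is wise is
exactly the point under discussion.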
193
194 >
195 > --
196 > Best regards,
197 > Michał Górny
198 >
199 >