Hello, everyone.

Back in 2016, we killed the technical representation of herds. Some
of them were disbanded completely, others merged with existing projects
or converted into new projects. This solved some of the problems with
maintainer declarations but it didn't solve the most important problem
herds posed. Sadly, it seems that the spirit of herds survived along
with those problems.

Herds served as a method of grouping packages by a common topic,
somewhat similar to categories (but usually broader). In their
mature state, herds either had their own specific maintainers or were
directly connected to projects (which in turn provided maintainers for
the herds). Today, we still have many herds that are disguised either
as complete projects or as semi-projects (i.e. project entries without
an explicit lead, policies or anything else).


What's wrong with herds?
------------------------
The main problem with herds is that they represent an artificial
relation between packages. The only thing the packages have in common
is the topic, and there is no real reason why a group of people would
maintain all packages regarding the same topic. In fact, it is absurd
-- say, why would a single person maintain 10+ competing cron
implementations? Surely, there is some common knowledge related to
running cron, and it is entirely possible that a single person would
use a few different cron implementations on different systems. But that
doesn't justify creating an artificial project to maintain all cron
implementations.

Mapping this to reality, projects usually represent a few developers,
each of them interested in a specific subset of packages maintained by
the project. In some cases, this is explicitly noted as project member
roles; in others, it is not stated clearly anywhere. In both cases,
there is usually some group of packages that are assigned to
the specific project but not maintained by any of the project members.

Less structured projects often have problems tracking member activity.
More than once, a project effectively died when all its members became
inactive, yet its continued existence hid the fact that the relevant
packages were unmaintained and sometimes discouraged more timid
developers from fixing bugs.


What kind of projects make sense?
---------------------------------
If we are to fight herd-like projects, I think it is important to
consider what kind of projects make sense, and which ones cause
herd-like trouble.

The two projects maintaining the largest number of packages in Gentoo
are the Perl project and the Python project, respectively. Strictly
speaking, both could be considered herd-like -- after all, they maintain
a lot of packages belonging to the same category. To some degree, this
is true. However, I believe those make sense because:

a. They maintain a central group of packages, eclasses, policies etc.
related to writing ebuilds using the specific programming language,
and help other developers with it. The existence of such a project is
really useful.

b. The packages maintained by them have many common properties and
frequently come from common sources (CPAN, PyPI), which makes it
possible for a large number of developers to actually maintain all
of them.

I know the Python project better, so I'll add something. It does not
accept all Python packages (although some developers insist on adding us
to them without asking), and especially not random programs written in
the Python language. It specifically focuses on Python module packages,
i.e. resources generally useful to Python programmers. This is what
makes it different from a common herd project.

The third biggest project in Gentoo is -- in my opinion -- a perfect
example of a problematic herd-project. The games project maintains
a total of 877 packages, and sad to say, many are in really bad shape.
Even if we presumed all developers were active, this gives us 175
packages per person, and I seriously doubt one person can actively
maintain that many programs. Add to that the fact that many of them are
proprietary and fetch-restricted, so only the people possessing a copy
can maintain them, and you see how blurry the package mapping is.

Let's look at the next projects on the list. Proxy-maint is very
specific as it proxies contributors; however, it is technically valid
since all project members can (and should) actively proxy for any
maintainers we have. Though I have to admit the number of maintained
packages simply overburdens us.

Haskell, Java and Ruby are other examples of projects focused on
programming languages. The KDE and GNOME projects generally make sense
since packages maintained by those projects have many common features,
and the core set has a common upstream and sometimes synced releases.
It is reasonable to assume members of those projects will maintain all,
or at least the majority, of those packages.

The next project is Sound -- and in my experience, it involves a lot of
poorly maintained or unmaintained packages. Again, the problem is that
the packages maintained by the project have little in common -- why
would any single person maintain a dozen audio players, converters,
libraries, etc.? Having multiple people in the project may increase
the chance that they happen to cover a larger set of competing
packages, but that's really more incidental than expected.

This is basically how I'd summarize the difference between a valid
project and a herd-project. A valid project maintains packages that
have many common properties, where it really makes sense for
an arbitrarily chosen project member to take care of an arbitrarily
chosen package maintained by the project. A herd-project maintains
packages that have only a common topic, which usually means that
an arbitrarily chosen project member maintains only a small subset of
all packages maintained by the project.

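To make the distinction a bit more concrete, here is a minimal Python
sketch of the 'arbitrarily chosen member' test. Everything in it is
hypothetical: the package names, the member names and the two helper
functions are made up for illustration, and the data about who actually
looks after what would have to be collected by hand.

```python
def member_coverage(project_packages, looked_after):
    """Fraction of the project's packages each member actually looks after."""
    total = len(project_packages)
    if total == 0:
        return {member: 0.0 for member in looked_after}
    return {
        member: len(packages & project_packages) / total
        for member, packages in looked_after.items()
    }


def orphaned(project_packages, looked_after):
    """Packages assigned to the project that no member looks after."""
    covered = set()
    for packages in looked_after.values():
        covered |= packages
    return project_packages - covered


# Hypothetical herd-like project: four loosely related packages, two members.
PROJECT_PACKAGES = {"player-a", "player-b", "converter-c", "lib-d"}
LOOKED_AFTER = {
    "alice": {"player-a"},
    "bob": {"lib-d"},
}

if __name__ == "__main__":
    print(member_coverage(PROJECT_PACKAGES, LOOKED_AFTER))
    print(sorted(orphaned(PROJECT_PACKAGES, LOOKED_AFTER)))
```

In a valid project, I would expect the coverage values to be close to
1.0 and the orphaned set to be close to empty; in a herd-project, it is
the other way around.
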
Looking further through the list, projects that seem to make sense
include ROS, Emacs, maybe base-system, SELinux, ML, X11 (after all, it
maintains core Xorg and nobody sets them as 'backup' maintainers for
random X11 programs), PHP, vim...

Projects that are herd-like include science (possibly with all its
flavors), netmon, video, desktop-misc (this is the very example of
'random programs'), graphics...


What do I propose?
------------------
I'd like to propose either disbanding herd-like projects entirely, or
transforming them into more proper projects. This applies not only to
those that are clearly dysfunctional but also to those that
incidentally happen to work (e.g. because they maintain only a few
packages, or because they represent a single developer with wide
interests).

More specifically, I'd like each of the affected projects to choose
between:

a. disbanding the project entirely and finding individual maintainers
for all packages,

b. reducing the packages maintained by the project to a well-defined
'core set' whose maintenance by a group of developers makes sense,
and finding individual maintainers for the remaining packages,

c. splitting one or more smaller projects with well-defined scope from
the project, and doing a. or b. for the remaining packages.

Let's take a few examples. For a start, the cron project. Previously,
it maintained a number of different cron implementations (most having
their individual maintainers by now), a cronbase package and
cron.eclass. In this context, option a. means disbanding the project
entirely. Some packages already have maintainers, others go
maintainer-needed.

Option b. would most likely involve leaving the cron project as a small
entity to provide policies for consistent cron handling, and to maintain
the cronbase package and cron.eclass. The different cron
implementations would go to individual maintainers anyway.

A similar example can be made for the PAM project that maintained
pambase, Linux-PAM, pam.eclass and some PAM modules. Here a. means
giving all packages away, and b. means leaving a minimal project that
maintains policies, pambase, Linux-PAM and the eclass. The individual
modules (except maybe for very common ones, if there were any) would
find individual maintainers.

A good example for the c. option is the recently revived VoIP project.
Again, this is an example of a herd-project that tries to maintain
an arbitrary set of loosely related packages. To some, it might make
sense, especially since there are only a few VoIP packages left in
Gentoo. Nevertheless, there is no reason why a single project member
would maintain multiple competing VoIP stacks.

Here, the c. option would mean creating project(s) for specific stacks
of interest. For example, if there was specific project-level interest
in maintaining Asterisk packages, an Asterisk project would make more
sense than a generic 'VoIP' one.


Why, again?
-----------
As I said before, the main problem with herds is that they introduce
an artificial and non-transparent relation between packages and package
maintainers.

Firstly, they usually tend to include packages that none of the project
members is actually interested in maintaining. This also includes
packages added by other developers (let's shove it in here, it matches
their job description!) or packages left over from other developers
(where the project was the backup maintainer). This means having a lot
of packages that seem to have a maintainer but actually don't.

Secondly, they frequently lack proper structure and handling of
departing members. Therefore, whenever a member maintaining a specific
set of packages leaves, it is possible that the number of
not-really-maintained packages increases.

Thirdly, they tend to degenerate and become defunct (much more often
than projects that make sense). Then, the number of
not-really-maintained packages ends up being really high.

My goal here is to make sure that we have clear and correct information
about package maintainers. Most notably, if a package has no active
maintainer, we really need to have 'up for grabs' issued and the package
marked as maintainer-needed, rather than hidden behind some project
whose members may not even be aware of the fact that they're its
maintainers.


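One way to get a first overview would be to list the packages whose
metadata.xml names only projects as maintainers, since those are the
candidates for 'seems to have a maintainer but actually doesn't'. The
sketch below is only an illustration under a few assumptions: it relies
on the type="person" / type="project" attribute of <maintainer>
elements, it treats a file without any <maintainer> element as
maintainer-needed, and the package names and e-mail addresses in
the sample data are invented.

```python
import xml.etree.ElementTree as ET

# Invented metadata.xml contents; real files would be read from the repository.
SAMPLE_METADATA = {
    "app-misc/project-only": """<pkgmetadata>
  <maintainer type="project">
    <email>some-project@example.org</email>
  </maintainer>
</pkgmetadata>""",
    "app-misc/with-person": """<pkgmetadata>
  <maintainer type="person">
    <email>developer@example.org</email>
  </maintainer>
  <maintainer type="project">
    <email>some-project@example.org</email>
  </maintainer>
</pkgmetadata>""",
    "app-misc/nobody": """<pkgmetadata>
</pkgmetadata>""",
}


def classify(metadata_xml):
    """Classify a package by the kinds of maintainers its metadata lists."""
    root = ET.fromstring(metadata_xml)
    types = [m.get("type") for m in root.findall("maintainer")]
    if not types:
        return "maintainer-needed"
    if all(t == "project" for t in types):
        # Only projects are listed: worth checking whether any member
        # of those projects really looks after the package.
        return "project-only"
    return "has-person"


if __name__ == "__main__":
    for package, xml_text in sorted(SAMPLE_METADATA.items()):
        print(package, classify(xml_text))
```

A 'project-only' result does not by itself mean the package is
unmaintained; it only marks the places where the maintainer information
cannot be verified without asking the project.
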
What do you think?

-- 
Best regards,
Michał Górny