Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Revisiting version-related tree policies
Date: Thu, 03 Nov 2016 16:12:31
Message-Id: 20161103171122.3df2b201.mgorny@gentoo.org
Hi, everyone.

As part of our work on version operators, we've noticed some issues
with our version policies. ulm has done some additional research on
the topic and now I'd like to open a discussion on our rules.


== PMS rules ==
PMS specifies only minimal syntax for versions, that is, the allowed
types and order of components. It does not define any range, length or
count limits. In other words, your versions can be infinitely long,
with infinitely many components and, thanks to 'negative' suffixes such
as _alpha.._rc, also with infinite precision. Revisions can grow up to
infinity as well.

Fun fact: for any two existing versions A < B (without considering
revisions!), you can always create a new version X such that A < X < B.
For example, if A = 1.4, B = 1.4_p1, X can be 1.4_p1_pre. For A = 1.4,
B = 1.4_p1_pre, X = 1.4_p1_pre_pre and so on.
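
To make the ordering concrete, here is a minimal Python sketch of the
suffix comparison. It is not Portage's implementation: suffix_key and
SUFFIX_RANK are names made up for illustration, and the key only orders
versions that share the same numeric part. The trailing (0, 0) entry
stands for "no further suffix", so an extra _p sorts above it and every
other extra suffix sorts below it.

```python
import re

# Rank of each suffix relative to "no suffix" (0); only _p sorts higher.
SUFFIX_RANK = {"alpha": -4, "beta": -3, "pre": -2, "rc": -1, "p": 1}


def suffix_key(version):
    """Sort key for versions that share the same numeric part (assumption)."""
    found = re.findall(r"_(alpha|beta|pre|rc|p)(\d*)", version)
    # A suffix without a number counts as 0; (0, 0) marks the end of the list.
    return tuple((SUFFIX_RANK[n], int(d or 0)) for n, d in found) + ((0, 0),)


a, b = "1.4", "1.4_p1"
chain = []
for _ in range(3):
    x = b + "_pre"  # squeeze a new version in just below b
    assert suffix_key(a) < suffix_key(x) < suffix_key(b)
    chain.append(x)
    b = x

print(chain)  # ['1.4_p1_pre', '1.4_p1_pre_pre', '1.4_p1_pre_pre_pre']
```

The loop can run for as long as you like; every iteration yields one
more valid version strictly between A and the previous X.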


== Current Gentoo policy ==
ulm has found a tiny note in the devmanual [1] stating:

| No integer part of the version may be longer than 18 digits.

The rationale is supposedly to be able to practically hold each
component in a 64-bit integer type.

[1]:https://devmanual.gentoo.org/ebuild-writing/file-format/index.html#file-naming-rules
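
A quick arithmetic check of that rationale, as a sketch that assumes
"64-bit integer" means a signed one (the devmanual note does not say).
fits_all is a made-up helper name.

```python
INT64_MAX = 2**63 - 1  # 9223372036854775807, which is itself 19 digits long


def fits_all(digits, limit=INT64_MAX):
    """True if every non-negative integer with `digits` digits is <= limit."""
    return 10**digits - 1 <= limit


print(fits_all(18))  # True
print(fits_all(19))  # False
```

So 18 is the largest digit count for which every possible value fits;
only some 19-digit values would.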


== Practical implications ==
Aside from purely technical matters, I think the free-form versioning
has two major drawbacks:

1. Some of the more creative versions are confusing to everyone (e.g.
when you are trying to figure out what particular components mean)
and really hard to type correctly,

2. Getting a safe lower or upper bound for <, <=, >=, > deps is
sometimes hard to impossible. For example, >=foo-1.4_alpha wouldn't
catch 1.4_alpha_rc, which is valid. And <=foo-1.4-r9999 wouldn't catch
foo-1.4-r10000.
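
The two bound failures in item 2 can be reproduced with a simplified
comparison sketch. Assumptions: version components have no leading
zeros, there is no letter suffix, and the package name is already
stripped; key and matches are illustrative names, not a package manager
API.

```python
import operator
import re

RANK = {"alpha": -4, "beta": -3, "pre": -2, "rc": -1, "p": 1}
OPS = {"<": operator.lt, "<=": operator.le, ">=": operator.ge, ">": operator.gt}
PVR_RE = re.compile(
    r"(\d+(?:\.\d+)*)((?:_(?:alpha|beta|pre|rc|p)\d*)*)(?:-r(\d+))?"
)


def key(pvr):
    """Sort key for strings like 1.4_alpha_rc-r3 (simplified, see above)."""
    m = PVR_RE.fullmatch(pvr)
    if m is None:
        raise ValueError(f"unsupported version: {pvr!r}")
    nums = tuple(int(n) for n in m[1].split("."))
    found = re.findall(r"_(alpha|beta|pre|rc|p)(\d*)", m[2])
    # (0, 0) marks the end of the suffix list; a missing revision is r0.
    suffixes = tuple((RANK[n], int(d or 0)) for n, d in found) + ((0, 0),)
    return nums, suffixes, int(m[3] or 0)


def matches(op, bound, candidate):
    """Would a dependency `op bound` accept `candidate`?"""
    return OPS[op](key(candidate), key(bound))


print(matches(">=", "1.4_alpha", "1.4_alpha_rc"))  # False
print(matches("<=", "1.4-r9999", "1.4-r10000"))    # False
print(matches("<=", "1.4-r9999", "1.4-r301"))      # True
```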


== Currently used versions in ::gentoo ==
[Note: after noting this all down I've noticed my results don't include
masked packages]

=== Version lengths (not counting revisions) ===
The longest version used is 23 characters long. The longest are:

23 app-emacs/limit               1.14.10_pre200811252332
23 app-i18n/man-pages-ru         3.71.2209.1992.20140911
23 app-i18n/man-pages-ru         3.81.2230.2080.20160117
23 net-ftp/pybootd               1.5.0_pre20110524131526
22 sys-auth/google-authenticator 1.01_pre20160307231538

Key|Ct    (Pct)    Histogram
  5|14597 (39.72%) ******************************************************
  6| 6496 (17.68%) ************************
  3| 3954 (10.76%) ***************
  7| 3396  (9.24%) *************
  4| 2914  (7.93%) ***********
 10| 2288  (6.23%) *********
  8| 1155  (3.14%) *****
  9|  478  (1.30%) **
  2|  271  (0.74%) *
  1|  239  (0.65%) *

I don't see any problem here.


=== Revision values ===
The highest revisions are:

2014120900 www-servers/xsp                  2014.12
      9999 app-crypt/keylookup              2.2
       500 kde-misc/openofficeorg-thumbnail 1.0.0
       301 dev-libs/libappindicator         12.10.0
       301 dev-libs/libindicator            12.10.1

Key|Ct    (Pct)    Histogram
  0|26311 (71.60%) *****************************************************
  1| 6133 (16.69%) *************
  2| 1792  (4.88%) ****
  3|  841  (2.29%) **
  4|  526  (1.43%) **
  5|  380  (1.03%) *
  6|  319  (0.87%) *
 10|  275  (0.75%) *
  7|   40  (0.11%) *
  8|   26  (0.07%) *
  9|   23  (0.06%) *
100|   19  (0.05%) *
 11|   10  (0.03%) *
200|    7  (0.02%) *
300|    6  (0.02%) *

As expected, the most common are revisions increasing monotonically.
However, multiples of 100 are also popular.

The revision number of 9999 is suspicious, and 2014120900 is clearly
pathological -- and probably should be replaced by _pre or _p.

It should be noted that e.g. ::progress overlay is known to use
revisions 10000+ to override Gentoo ebuilds.


=== Numeric version component lengths ===
The longest numeric version component is 14 characters long.
The longest are:

14 20141110122616 dev-vcs/pwclient             20141110122616
14 20140414130214 dev-ruby/arel                5.0.1.20140414130214
14 20121105131501 dev-vcs/pwclient             20121105131501
12 201607172312   sys-apps/gradm               3.1.201607172312
12 201607021514   app-crypt/gentoo-keys        201607021514
12 201606062304   mail-filter/opensmtpd-extras 5.9.2.201606062304
12 201603152148   sys-apps/gradm               3.1.201603152148
12 201507191652   sys-apps/gradm               3.1.201507191652
12 201506180105   app-misc/xmind               3.5.3.201506180105
12 201505061057   net-libs/libasr              1.0.1.201505061057
12 201411201906   app-misc/xmind               3.5.1.201411201906
12 201401221918   app-misc/xmind               3.4.1.201401221918
10 2016020301     dev-perl/Regexp-Common       2016020301.0.0
10 2013031301     dev-perl/Regexp-Common       2013031301.0.0
10 2009041301     dev-perl/Geography-Countries 2009041301.0.0

Key|Ct    (Pct)    Histogram
  1|81628 (82.51%) ******************************************************
  2|11700 (11.83%) ********
  8| 2772  (2.80%) **
  3| 2063  (2.09%) **
  4|  665  (0.67%) *
  5|   38  (0.04%) *
  6|   28  (0.03%) *
  7|   20  (0.02%) *
 12|    9  (0.01%) *
 14|    3  (0.00%) *
 10|    3  (0.00%) *

All longer values seem to be reserved for timestamps with various
precisions.
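
For anyone wanting to repeat the measurement on an overlay, a sketch of
how the per-component lengths can be taken. This is not the script used
for the numbers above; numeric_component_lengths is a made-up name, and
the input is assumed to be a bare version without a revision.

```python
import re


def numeric_component_lengths(version):
    """Digit count of each dot-separated numeric component.

    Only the leading numeric part is inspected; suffixes such as
    _pre20161004153257 are ignored here.
    """
    main = re.match(r"\d+(?:\.\d+)*", version).group(0)
    return [len(part) for part in main.split(".")]


print(numeric_component_lengths("5.0.1.20140414130214"))   # [1, 1, 1, 14]
print(numeric_component_lengths("2016020301.0.0"))         # [10, 1, 1]
print(numeric_component_lengths("5.0_pre20161004153257"))  # [1, 1]
```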


=== Version suffix lengths ===
The longest version suffix is 17 characters long (14 digits). The
longest are:

17 pre20161004153257 net-irc/kvirc                 5.0_pre20161004153257
17 pre20160801092805 net-irc/kvirc                 5.0_pre20160801092805
17 pre20160307231538 sys-auth/google-authenticator 1.01_pre20160307231538
17 pre20110524131526 net-ftp/pybootd               1.5.0_pre20110524131526
15 pre200811252332   app-emacs/limit               1.14.10_pre200811252332
15 p20160215155418   sys-apps/net-tools            1.60_p20160215155418
...
13 p200709030413     app-emacs/mu-cite             8.1_p200709030413
13 alpha20110303     dev-python/pivy               0.5_alpha20110303
12 beta20150411      app-vim/rust-vim              1_beta20150411
12 beta20150411      app-shells/rust-zshcomp       1_beta20150411
12 beta20150411      app-emacs/rust-mode           1_beta20150411
11 pre20161029       media-sound/tomahawk          0.9.0_pre20161029

Key|Ct  (Pct)    Histogram
  9|299 (24.67%) ********************************************************
 11|208 (17.16%) ***************************************
  2|197 (16.25%) *************************************
  3|168 (13.86%) *******************************
  5|144 (11.88%) ***************************
  4| 96  (7.92%) ******************
  6| 38  (3.14%) *******
  7| 33  (2.72%) *******
  8|  9  (0.74%) **
 13|  7  (0.58%) **
 17|  4  (0.33%) *
 15|  4  (0.33%) *
 12|  3  (0.25%) *
 10|  2  (0.17%) *

The situation is similar to numeric components. The longest components
are various kinds of timestamps, increased by appropriate keyword
lengths. (note: actually, all those should be +1 since I didn't count
the '_').


=== Version suffix counts ===
There are no more than 2 suffixes used in versions simultaneously.
The packages using two suffixes are:

_beta_p dev-java/protobuf-java     3.0.0_beta3_p1
_beta_p dev-libs/protobuf          3.0.0_beta3_p1
_beta_p dev-python/protobuf-python 3.0.0_beta3_p1
_beta_p dev-util/xxdiff            4.0_beta1_p20110426
_beta_p net-analyzer/fping         2.4_beta2_p161
_beta_p net-misc/fatrat            1.2.0_beta2_p20150803
_beta_p net-misc/freerdp           1.1.0_beta1_p20130710
_beta_p x11-plugins/wmtime         1.0_beta2_p9
_p_p    net-analyzer/tcptrace      6.6.7_p4_p1
_p_p    x11-libs/xosd              2.2.14_p2_p1
_p_p    x11-misc/xkbset            0.5_p5_p1
_pre_p  app-arch/unp               2.0_pre7_p1
_pre_p  x11-misc/imwheel           1.0.0_pre13_p20100827
_rc_p   dev-libs/hidapi            0.8.0_rc1_p20140719
_rc_p   dev-lua/busted             2.0_rc11_p0
_rc_p   dev-lua/busted             2.0_rc12_p1
_rc_p   media-libs/openglide       0.09_rc9_p20160913

Key    |Ct (Pct)    Histogram
_beta_p|8 (44.44%) *****************************************************
  _rc_p|4 (22.22%) ***************************
 _pre_p|3 (16.67%) ********************
   _p_p|3 (16.67%) ********************

It seems that double suffixes are used to either indicate snapshots of
corresponding upstream versions, or to cover non-Gentoo-friendly
versions such as '6.6.7-4.1'.

Note: there is also a pathological case of bash-4.3_p39_pre0 that is
non-keyworded. In this case, it seems that _pre0 doesn't correspond to
any upstream version.
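
A sketch of how the suffix patterns in the first column can be derived
from a version string. Again, this is not the script behind the table;
suffix_pattern is an illustrative name, and the sample list is a small
selection of versions quoted in this mail.

```python
import re
from collections import Counter

SUFFIX_RE = re.compile(r"_(alpha|beta|pre|rc|p)\d*")


def suffix_pattern(version):
    """Reduce a version to its suffix names, e.g. 3.0.0_beta3_p1 -> _beta_p."""
    return "".join("_" + name for name in SUFFIX_RE.findall(version))


samples = [
    "3.0.0_beta3_p1",
    "6.6.7_p4_p1",
    "2.0_pre7_p1",
    "0.8.0_rc1_p20140719",
    "4.3_p39_pre0",          # the bash case from the note above
    "1.60_p20160215155418",  # single suffix, filtered out below
]

# Keep only versions that carry two or more suffixes.
doubles = Counter(
    p for p in map(suffix_pattern, samples) if p.count("_") >= 2
)
print(sorted(doubles))  # ['_beta_p', '_p_p', '_p_pre', '_pre_p', '_rc_p']
```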


== Policy changes? ==
I think that the following new policies could make sense:

1. Revision number must be no higher than 9999:
  1a. to make <=X-r9999 reliable,
  1b. to prevent pathological uses of revision as date.

2. I think we could use a policy to make >=X_alpha reliable. However, I
have no clue how to word it without making it weird and artificially
restricting valid version numbers.
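
If item 1 were adopted, a QA check for it could be as small as the
sketch below. MAX_REVISION, revision_of and violates_cap are
hypothetical names, not an existing repoman or pkgcheck interface; the
inputs are version-plus-revision strings taken from the revision table
earlier in this mail.

```python
import re

MAX_REVISION = 9999  # the cap proposed in item 1


def revision_of(pvr):
    """Return the numeric revision of e.g. 2.2-r9999; no -rN means r0."""
    m = re.search(r"-r(\d+)$", pvr)
    return int(m.group(1)) if m else 0


def violates_cap(pvr, limit=MAX_REVISION):
    """True if the revision is above the proposed limit."""
    return revision_of(pvr) > limit


for pvr in ("2014.12-r2014120900", "2.2-r9999", "1.0.0-r500", "12.10.0"):
    print(pvr, violates_cap(pvr))
# Only 2014.12-r2014120900 is reported as a violation.
```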

What do you think?

--
Best regards,
Michał Górny
<http://dev.gentoo.org/~mgorny/>
