Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Cc: "Michał Górny" <mgorny@g.o>
Subject: [gentoo-dev] [PATCH v3 2/3] glep-0074: Specify supported hash algorithms
Date: Sun, 18 Sep 2022 18:32:28
Message-Id: 20220918183139.1534979-3-mgorny@gentoo.org
In Reply to: [gentoo-dev] [PATCH v3 0/3] glep-0074: Explicitly specify hashes and compressed Manifest formats by "Michał Górny"
1 Replace the informational hash name section with a formal specification
2 of allowed hash algorithms. The original reasoning for leaving them
3 implementation-defined was poor. After all, not a single new hash
4 has been added since the initial version of the GLEP. At the same time,
5 ensuring consistent support for at least a minimal set of hash
6 algorithms is crucial to interoperability. Given that the effort needed
7 to update the GLEP is relatively small, it is better to require all
8 algorithms to be formally listed than to have to track all
9 implementations for new hashes and hope for consistency.
10
11 Signed-off-by: Michał Górny <mgorny@g.o>
12 ---
13 glep-0074.rst | 177 ++++++++++++++++++++++++++++++++++++--------------
14 1 file changed, 127 insertions(+), 50 deletions(-)
15
16 diff --git a/glep-0074.rst b/glep-0074.rst
17 index 8cec618..5a63f70 100644
18 --- a/glep-0074.rst
19 +++ b/glep-0074.rst
20 @@ -6,7 +6,7 @@ Author: Michał Górny <mgorny@g.o>,
21 Ulrich Müller <ulm@g.o>
22 Type: Standards Track
23 Status: Final
24 -Version: 1.2
25 +Version: 1.3
26 Created: 2017-10-21
27 Last-Modified: 2022-09-11
28 Post-History: 2017-10-26, 2017-11-16, 2018-02-08, 2022-09-08, 2022-09-11
29 @@ -26,6 +26,9 @@ efficient and provide means of backwards compatibility.
30 Changes
31 =======
32
33 +v1.3
34 + Formally specified the current set of hash algorithms supported.
35 +
36 v1.2
37 Specified the newline convention used for Manifests.
38
39 @@ -364,27 +367,60 @@ up to and including the *original* directory. Note that those
40 sub-Manifests can use different filenames than ``Manifest``.
41
42
43 -Checksum algorithms (informational)
44 ------------------------------------
45 -
46 -This section is informational only. Specifying the exact set
47 -of supported algorithms is outside the scope of this specification.
48 -
49 -The algorithm names reserved at the time of writing are:
50 -
51 -- ``MD5`` [#MD5]_,
52 -- ``RMD160`` -- RIPEMD-160 [#RIPEMD160]_,
53 -- ``SHA1`` [#SHS]_,
54 -- ``SHA256`` and ``SHA512`` -- SHA-2 family of hashes [#SHS]_,
55 -- ``WHIRLPOOL`` [#WHIRLPOOL]_,
56 -- ``BLAKE2B`` and ``BLAKE2S`` -- BLAKE2 family of hashes [#BLAKE2]_,
57 -- ``SHA3_256`` and ``SHA3_512`` -- SHA-3 family of hashes [#SHA3]_,
58 -- ``STREEBOG256`` and ``STREEBOG512`` -- Streebog family of hashes
59 - [#STREEBOG]_.
60 -
61 -The method of introducing new hashes is defined by GLEP 59 [#GLEP59]_.
62 -It is recommended that any new hashes are named after the Python
63 -``hashlib`` module algorithm names, transformed into uppercase.
64 +Checksum algorithms
65 +-------------------
66 +
67 +.. table:: Table 1. Defined hash algorithms
68 + :widths: auto
69 +
70 + +-----------------+-----------------------+------+------+-------------+
71 + | Name | Specification | Bits | Enc. | Notes |
72 + +=================+=======================+======+======+=============+
73 + | ``BLAKE2B`` | | 512 | Hex | Recommended |
74 + +-----------------+ RFC 7693 [#RFC7693]_ +------+------+-------------+
75 + | ``BLAKE2S`` | | 256 | Hex | |
76 + +-----------------+-----------------------+------+------+-------------+
77 + | ``MD5`` | RFC 1321 [#RFC1321]_ | 128 | Hex | Deprecated |
78 + +-----------------+-----------------------+------+------+-------------+
79 + | ``RMD160`` | RIPEMD-160 [#RMD160]_ | 160 | Hex | |
80 + +-----------------+-----------------------+------+------+-------------+
81 + | ``SHA1`` | | 160 | Hex | Deprecated |
82 + +-----------------+ +------+------+-------------+
83 + | ``SHA256`` | FIPS 180-4 [#SHS]_ | 256 | Hex | |
84 + +-----------------+ +------+------+-------------+
85 + | ``SHA512`` | | 512 | Hex | Recommended |
86 + +-----------------+-----------------------+------+------+-------------+
87 + | ``SHA3_256`` | | 256 | Hex | |
88 + +-----------------+ FIPS 202 [#SHA3]_ +------+------+-------------+
89 + | ``SHA3_512`` | | 512 | Hex | |
90 + +-----------------+-----------------------+------+------+-------------+
91 + | ``STREEBOG256`` | | 256 | Hex | |
92 + +-----------------+ RFC 6986 [#RFC6986]_ +------+------+-------------+
93 + | ``STREEBOG512`` | | 512 | Hex | |
94 + +-----------------+-----------------------+------+------+-------------+
95 + | ``WHIRLPOOL`` | Whirlpool [#BARRETO]_ | 512 | Hex | |
96 + +-----------------+-----------------------+------+------+-------------+
97 +
98 +Any new hashes must be added to this specification prior to being used
99 +in Manifest files. Adding a new hash is considered
100 +a backwards-compatible change to the GLEP. It is recommended that new
101 +hashes are named after the Python ``hashlib`` module algorithm names,
102 +transformed into uppercase, with dashes replaced by underscores.
103 +
104 +An implementation can implement an arbitrary subset of the listed
105 +hashes. For best interoperability, it should implement at least
106 +the recommended hashes. If deprecated hashes are implemented, it is
107 +preferable to disallow their use by default.
108 +
109 +If an entry specifies multiple hashes using different algorithms,
110 +an implementation may choose to verify an arbitrary subset of them.
111 +However, should any tested hash yield a mismatch, the verification must
112 +fail.
113 +
114 +If a particular hash is either unsupported or unknown,
115 +the implementation can either ignore it or report a failure. However,
116 +at least one algorithm specified for a particular entry must be
117 +supported for the verification to succeed.
118
119
120 Manifest compression
121 @@ -498,6 +534,43 @@ for a package directory would have the following content::
122 DIST sphinxtrain-1.0.8.tar.gz 8925803 SHA256 548e.. SHA512 465d..
123
124
125 +Security considerations (informational)
126 +---------------------------------------
127 +
128 +The Manifest files are text files that are transmitted as part of larger
129 +file sets in order to provide integrity and authenticity verification
130 +for other files. They are primarily intended to be processed locally
131 +to verify transferred files. They are commonly used along with the rsync
132 +protocol and inside tar archives.
133 +
134 +The format does not provide support for executable content,
135 +nor the ability to issue network requests. Its security is primarily
136 +considered in the context of opening and reading local files for the
137 +purpose of computing hashes.
138 +
139 +Depending on the delivery method, it may be possible to include special
140 +files and symbolic links in the verified file set. Attempting to read
141 +special files (e.g. named pipes or devices like ``/dev/urandom``) could
142 +cause the tools to hang or enter an infinite loop. The specification
143 +explicitly requires implementations to verify the file type and reject
144 +processing non-regular files.
145 +
146 +The use of symbolic links permits computing checksums for arbitrary
147 +paths, including files with potentially sensitive content and files
148 +on special filesystems such as the ``/proc`` filesystem. Reading these
149 +files should not pose an immediate risk, nor should displaying checksum
150 +mismatches to the local user. However, there is a risk of exposing
151 +sensitive information if the user reports checksum failures.
152 +Implementations can take steps to reduce the risk, e.g. by minimizing
153 +the amount of information reported on checksum mismatches and warning
154 +about symbolic links.
155 +
156 +
157 +
158 +Portability considerations (informational)
159 +------------------------------------------
160 +
161 +
162 Rationale
163 =========
164
165 @@ -913,23 +986,25 @@ tool working with this Manifest format.
166 Hash algorithms
167 ---------------
168
169 -While maintaining a consistent supported hash set is important
170 -for interoperability, it is not a good fit for the generic layout
171 -of this GLEP. Furthermore, it would require updating the GLEP
172 -in the future every time the used algorithms change.
173 +Originally, this GLEP did not formally specify the complete set of hash
174 +algorithms. Instead, it only listed (informationally) the names already
175 +used at the time of writing. Since enforcing consistent use of algorithm
176 +names is important for interoperability, this was changed in version
177 +1.3.
178
179 -Instead, the specification focuses on listing the currently used
180 -algorithm names for interoperability, and sets a recommendation
181 -for consistent naming of algorithms in the future. The Python
182 -``hashlib`` module is used as a reference since it is used
183 -as the provider of hash functions for most of the Python software,
184 -including Portage and PkgCore.
185 +Since the effort needed to update the GLEP is small compared to the time
186 +needed for a new hash algorithm to be well-deployed, the GLEP needs
187 +to be updated prior to adding a new hash method.
188
189 -The basic rules for changing hash algorithms are defined in GLEP 59
190 -[#GLEP59]_. The implementations can focus only on those algorithms
191 -that are actually used or planned on being used. It may be feasible
192 -to devise a new GLEP that specifies the currently used hashes (or update
193 -GLEP 59 accordingly).
194 +The recommended naming is based on the Python ``hashlib`` module,
195 +as most of the Gentoo tooling is written in Python. The names
196 +are transformed to match the historical naming used for hash functions
197 +in Manifests.
198 +
199 +Implementations are allowed to support and use only a subset of hashes
200 +listed in Manifest files. This could be used both to avoid the overhead
201 +of computing multiple hashes on non-performant systems, and to handle
202 +transition to new hashes gracefully.
203
204
205 Manifest compression
206 @@ -1072,27 +1147,29 @@ References
207 .. [#FILE-NAMING-RULES] Ebuild File Format -- Gentoo Development Guide
208 (https://devmanual.gentoo.org/ebuild-writing/file-format/#file-naming-rules)
209
210 -.. [#MD5] RFC1321: The MD5 Message-Digest Algorithm
211 - (https://www.ietf.org/rfc/rfc1321.txt)
212 +.. [#RFC7693] RFC 7693: The BLAKE2 Cryptographic Hash and Message Authentication
213 + Code (MAC)
214 + (https://www.rfc-editor.org/rfc/rfc7693)
215 +
216 +.. [#RFC1321] RFC 1321: The MD5 Message-Digest Algorithm
217 + (https://www.rfc-editor.org/rfc/rfc1321)
218
219 -.. [#RIPEMD160] The hash function RIPEMD-160
220 +.. [#RMD160] The hash function RIPEMD-160
221 (https://homes.esat.kuleuven.be/~bosselae/ripemd160.html)
222
223 .. [#SHS] FIPS PUB 180-4: Secure Hash Standard (SHS)
224 - (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf)
225 -
226 -.. [#WHIRLPOOL] The WHIRLPOOL Hash Function
227 - (http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
228 -
229 -.. [#BLAKE2] BLAKE2 -- fast secure hashing
230 - (https://blake2.net/)
231 + (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf)
232
233 .. [#SHA3] FIPS PUB 202: SHA-3 Standard: Permutation-Based Hash
234 and Extendable-Output Functions
235 - (http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
236 + (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)
237 +
238 +.. [#RFC6986] RFC 6986: GOST R 34.11-2012: Hash Function
239 + (https://www.rfc-editor.org/rfc/rfc6986)
240
241 -.. [#STREEBOG] GOST R 34.11-2012: Streebog Hash Function
242 - (https://www.streebog.net/)
243 +.. [#BARRETO] Paulo S. L. M. Barreto, The WHIRLPOOL Hash Function
244 + (archived at 2017-11-29)
245 + (https://web.archive.org/web/20171129084214/http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html)
246
247 .. [#C08] Cappos, J et al. (2008). "Attacks on Package Managers"
248 (https://www2.cs.arizona.edu/stork/packagemanagersecurity/attacks-on-package-managers.html)
249 --
250 2.37.3
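
Note on the verification rules added under "Checksum algorithms": they can be
read as a small decision procedure. The following is only a minimal sketch of
one possible reading, not code from Portage or any other implementation. The
names SUPPORTED, verify_entry and strict_unknown are hypothetical, the choice
of supported algorithms is arbitrary, and the sketch assumes lowercase hex
digests as produced by Python's hashlib.

```python
import hashlib

# Hypothetical subset of Table 1 supported by this sketch. The GLEP allows an
# implementation to support any subset of the listed hashes.
SUPPORTED = {
    "BLAKE2B": hashlib.blake2b,  # hashlib default digest size is 512 bits
    "SHA512": hashlib.sha512,
    "SHA256": hashlib.sha256,
}


def verify_entry(data, expected, strict_unknown=False):
    """Check `data` (bytes) against {Manifest hash name: hex digest}.

    Returns True only if no tested hash mismatches and at least one of
    the listed algorithms was actually supported and checked.
    """
    checked = 0
    for name, hexdigest in expected.items():
        constructor = SUPPORTED.get(name)
        if constructor is None:
            # Unsupported or unknown hash: either report a failure or
            # ignore it. Both behaviours are permitted by the text.
            if strict_unknown:
                return False
            continue
        if constructor(data).hexdigest() != hexdigest:
            # Any tested hash that mismatches must fail the verification.
            return False
        checked += 1
    # At least one algorithm must have been supported for success.
    return checked > 0
```

With this sketch, an entry carrying only WHIRLPOOL would fail verification
(no supported algorithm), while an entry carrying WHIRLPOOL and a matching
SHA512 would pass unless strict_unknown is set.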