Gentoo Archives: gentoo-dev

From: Gordon Pettey <petteyg359@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [pre-GLEP] Split distfile mirror directory structure
Date: Sat, 27 Jan 2018 16:43:10
Message-Id: CAHY5Mecb7PPGjG+f0YmEjSmwuE+Wh9FsNg6BMWQUMnXRz7zw9g@mail.gmail.com
In Reply to: Re: [gentoo-dev] [pre-GLEP] Split distfile mirror directory structure by "Michał Górny"
Why not use a hash of the file name instead of a hash of its contents?
That seems much simpler, and it would not shrink the hash output space,
so the split should stay just as balanced.
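
A minimal sketch of the file-name idea, in Python. Everything in it is an
assumption for illustration: the helper names distfile_subdir and
distfile_path, the choice of BLAKE2b, and the two-hex-digit prefix are
hypothetical and are not taken from the GLEP draft.

```python
import hashlib
from pathlib import PurePosixPath


def distfile_subdir(filename: str, prefix_len: int = 2) -> str:
    """Return a subdirectory name derived from a hash of the file *name*.

    Assumption: BLAKE2b and a 2-hex-digit prefix (256 buckets) are
    arbitrary illustrative choices, not something the proposal fixes.
    """
    digest = hashlib.blake2b(filename.encode("utf-8")).hexdigest()
    return digest[:prefix_len]


def distfile_path(distdir: str, filename: str) -> PurePosixPath:
    """Build <distdir>/<hash prefix>/<filename> without reading the file."""
    return PurePosixPath(distdir) / distfile_subdir(filename) / filename


if __name__ == "__main__":
    # The target path is known before a single byte has been downloaded.
    print(distfile_path("/var/cache/distfiles", "foo.tar.gz"))
```

Since the path depends only on the name, it can be computed before the
download starts, and a user-provided file with a wrong content hash would
still land in the subdirectory where the package manager looks for it.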

On Sat, Jan 27, 2018 at 5:41 AM, Michał Górny <mgorny@g.o> wrote:
> On Sat, 27.01.2018 at 11:36 +0000, Roy Bamford wrote:
>> On 2018.01.27 08:30, Michał Górny wrote:
>> > On Fri, 26.01.2018 at 20:48 -0500, Michael Orlitzky wrote:
>> > > On 01/26/2018 06:24 PM, Michał Górny wrote:
>> > > >
>> > > > The alternate option of using file hash has the advantage of having
>> > > > a more balanced split. Furthermore, since hashes are stored
>> > > > in Manifests, using them is zero-cost. However, this solution has two
>> > > > significant disadvantages:
>> > > >
>> > > > 1. The hash values are unknown for newly-downloaded distfiles, so
>> > > > ``repoman`` (or an equivalent tool) would have to use a temporary
>> > > > directory before locating the file in the appropriate subdirectory.
>> > > >
>> > > > 2. User-provided distfiles (e.g. for fetch-restricted packages) with
>> > > > hash mismatches would be placed in the wrong subdirectory,
>> > > > potentially causing confusing errors.
>> > > >
>> > >
>> > > The filename proposal sounds fine, so this is only academic, but: are
>> > > these two points really disadvantages?
>> > >
>> > > What are we worried about in using a temporary directory? Copying across
>> > > filesystem boundaries? Except in rare cases, $DISTDIR itself will be
>> > > usable as a temporary location (on the same filesystem), won't it?
>> >
>> > Why add the extra complexity when there's no need for it? Note that
>> > there's also the problem of resuming transfers, so in the end we're
>> > talking about a permanent temporary directory where we keep unfinished
>> > transfers.
>> >
>> > > For the second point, portage is going to tell me where to put the file,
>> > > isn't it? Then no matter what garbage I download, won't portage look for
>> > > it in the right place, because where-to-put-it is determined using the
>> > > same manifest hash that determines where-to-find-it?
>> >
>> > No, it won't. Why would it? You're going to call something like:
>> >
>> > edistadd foo.tar.gz bar.tar.gz
>> >
>> > ...and it will place the files in the right subdirectories.
>> >
>> > --
>> > Best regards,
>> > Michał Górny
>> >
>>
>> Michał,
>>
>> How does this work for fetch-restricted files and for finding other
>> files that are no longer on the mirrors?
>>
>> It's no longer just a matter of downloading the file and moving it to
>> $DISTFILES, or is it? Whatever it is, users will need to do it unless
>> package managers accept files in $DISTFILES when they are not found in
>> the main structure.
>
> I've just answered that, and it's in the GLEP also. There will be
> a helper tool to make this easy. Furthermore, I think we may even make
> Portage keep accepting both locations indefinitely.
>
> As for finding files in your distdir, there's no reason why plain:
>
> find -name 'foo.tar.gz'
>
> wouldn't work.
>
> --
> Best regards,
> Michał Górny
>

Replies