Gentoo Archives: gentoo-portage-dev

From: Zac Medico <zmedico@g.o>
To: "Michał Górny" <mgorny@g.o>, gentoo-portage-dev@l.g.o
Cc: Zac Medico <zmedico@g.o>
Subject: Re: [gentoo-portage-dev] [PATCH v2] movefile: support in-kernel file copying on Linux (bug 607868)
Date: Sat, 04 Mar 2017 02:36:52
Message-Id: 11fd4204-1727-b899-231c-dcce5ea6bc4a@gentoo.org
In Reply to: Re: [gentoo-portage-dev] [PATCH v2] movefile: support in-kernel file copying on Linux (bug 607868) by "Michał Górny"
1 On 03/03/2017 12:45 PM, Michał Górny wrote:
2 > W dniu 02.03.2017, czw o godzinie 17∶09 -0800, użytkownik Zac Medico
3 >> +def copyfile(src, dst):
4 >> + """
5 >> + Copy the contents (no metadata) of the file named src to a file
6 >> + named dst.
7 >> +
8 >> + If possible, copying is done within the kernel, and uses
9 >> + "copy acceleration" techniques (such as reflinks). This also
10 >> + supports sparse files.
11 >> +
12 >> + @param src: path of source file
13 >> + @type src: str
14 >> + @param dst: path of destination file
15 >> + @type dst: str
16 >> + """
17 >> + global _copyfile
18 >> +
19 >> + if _copyfile is None:
20 >> + if _file_copy is None:
21 >> + _copyfile = shutil.copyfile
22 >> + else:
23 >> + _copyfile = _optimized_copyfile
24 >
25 > I dare say the caching logic takes more time than the actual logic.
26 > However, this could be made even simpler -- you could just check
27 > the condition in global scope and set _copyfile value there.
28
29 In v3 it's like this:
30
31 if _file_copy is None:
32 copyfile = shutil.copyfile
33 else:
34 copyfile = _optimized_copyfile
35
36 >> +static off_t
37 >> +do_lseek(int fd_out, int fd_in, loff_t *off_out) {
38 >
39 > Maybe name it do_lseek_data() to make the goal clearer? Some function
40 > docs in comments would also be helpful.
41
42 Renamed to do_lseek_data v3.
43
44 >> +
45 >> + if (!(error && errno == EINTR && PyErr_CheckSignals() == 0))
46 >> + eintr_retry = 0;
47 >
48 > To be honest, I can't do it but it'd be really a good idea if someone
49 > could analyze all the code paths where the code above could give 'error
50 > = 1' and ensure that:
51 >
52 > a. there is no case when we error=1 and leave errno unmodified/unset
53 > (i.e. can accidentally leave earlier EINTR there),
54
55 In v3, I set error = errno immediately after every function call that
56 fails. That makes it really easy to see exactly which function calls the
57 errno value is picked up from.
58
59 > b. we can correctly recover from any EINTR, i.e. if EINTR causes some
60 > function to be interrupted in the middle, our next iteration starts
61 > safely.
62
63 The state is mostly tracked by the offset_out variable, which makes it
64 hard to screw up. We just have to watch that variable carefully, and
65 make sure we don't corrupt its state. I added in a couple of extra lseek
66 calls to ensure that the fd_in file offset is always in sync with the
67 offset_out variable when we resume from EINTR.
68
69 >> + }
70 >> +
71 >> + free(buf);
72 >
73 > if (buf != NULL) ?
74
75 The free(3) documentation says we're allowed to pass in a null pointer,
76 but I've added a buf != NULL check for readability.
77 --
78 Thanks,
79 Zac