Gentoo Archives: gentoo-portage-dev

From: Brian Dolbec <dolsen@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
Date: Wed, 01 Nov 2017 16:31:37
Message-Id: 20171101093134.1d7fb3c9@professor-x
In Reply to: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126) by Zac Medico
1 On Tue, 24 Oct 2017 11:41:37 -0700
2 Zac Medico <zmedico@g.o> wrote:
3
4 > The sendfile *offset parameter refers to the input file offest, so
5 > it cannot be used in the same way as the copy_file_range *off_out
6 > parameter. Therefore, add sf_wrapper function which implements the
7 > *off_out behavior for sendfile.
8 >
9 > Also update cfr_wrapper so that it does not rely on the fd_in file
10 > offset, and remove corresponding fd_in lseek calls which are no
11 > longer needed.
12 >
13 > The file offset of fd_in is now completely unused, except in the
14 > plain read/write loop, where lseek is called prior to entering
15 > the loop.
16 >
17 > Bug: https://bugs.gentoo.org/635126
18 > ---
19 > [PATCH v5] eliminates all reliance on the file offset of fd_in,
20 > except in the plain read/write loop, where lseek is called prior
21 > to entering the loop.
22 >
23 > src/portage_util_file_copy_reflink_linux.c | 88
24 > ++++++++++++++++++++---------- 1 file changed, 59 insertions(+), 29
25 > deletions(-)
26 >
27 > diff --git a/src/portage_util_file_copy_reflink_linux.c
28 > b/src/portage_util_file_copy_reflink_linux.c index
29 > 4be9e0568..4c3303181 100644 ---
30 > a/src/portage_util_file_copy_reflink_linux.c +++
31 > b/src/portage_util_file_copy_reflink_linux.c @@ -56,12 +56,18 @@
32 > initreflink_linux(void)
33 > /**
34 > * cfr_wrapper - A copy_file_range syscall wrapper function, having a
35 > - * function signature that is compatible with sendfile.
36 > + * function signature that is compatible with sf_wrapper.
37 > * @fd_out: output file descriptor
38 > * @fd_in: input file descriptor
39 > - * @off_out: offset of the output file
40 > + * @off_out: must point to a buffer that specifies the starting
41 > offset
42 > + * where bytes will be copied to fd_out, and this buffer is adjusted
43 > by
44 > + * the number of bytes copied.
45 > * @len: number of bytes to copy between the file descriptors
46 > *
47 > + * Bytes are copied from fd_in starting from *off_out, and the file
48 > + * offset of fd_in is not changed. Effects on the file offset of
49 > + * fd_out are undefined.
50 > + *
51 > * Return: Number of bytes written to out_fd on success, -1 on
52 > failure
53 > * (errno is set appropriately).
54 > */
55 > @@ -69,7 +75,8 @@ static ssize_t
56 > cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
57 > {
58 > #ifdef __NR_copy_file_range
59 > - return syscall(__NR_copy_file_range, fd_in, NULL, fd_out,
60 > + off_t off_in = *off_out;
61 > + return syscall(__NR_copy_file_range, fd_in, &off_in, fd_out,
62 > off_out, len, 0);
63 > #else
64 > /* This is how it fails at runtime when the syscall is not
65 > supported. */ @@ -79,18 +86,50 @@ cfr_wrapper(int fd_out, int fd_in,
66 > off_t *off_out, size_t len) }
67 >
68 > /**
69 > + * sf_wrapper - A sendfile wrapper function, having a function
70 > signature
71 > + * that is compatible with cfr_wrapper.
72 > + * @fd_out: output file descriptor
73 > + * @fd_in: input file descriptor
74 > + * @off_out: must point to a buffer that specifies the starting
75 > offset
76 > + * where bytes will be copied to fd_out, and this buffer is adjusted
77 > by
78 > + * the number of bytes copied.
79 > + * @len: number of bytes to copy between the file descriptors
80 > + *
81 > + * Bytes are copied from fd_in starting from *off_out, and the file
82 > + * offset of fd_in is not changed. Effects on the file offset of
83 > + * fd_out are undefined.
84 > + *
85 > + * Return: Number of bytes written to out_fd on success, -1 on
86 > failure
87 > + * (errno is set appropriately).
88 > + */
89 > +static ssize_t
90 > +sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
91 > +{
92 > + ssize_t ret;
93 > + off_t off_in = *off_out;
94 > + /* The sendfile docs do not specify behavior of the output file
95 > + * offset, therefore it must be adjusted with lseek.
96 > + */
97 > + if (lseek(fd_out, *off_out, SEEK_SET) < 0)
98 > + return -1;
99 > + ret = sendfile(fd_out, fd_in, &off_in, len);
100 > + if (ret > 0)
101 > + *off_out += ret;
102 > + return ret;
103 > +}
104 > +
105 > +
106 > +/**
107 > * do_lseek_data - Adjust file offsets to the next location
108 > containing
109 > * data, creating sparse empty blocks in the output file as needed.
110 > * @fd_in: input file descriptor
111 > * @fd_out: output file descriptor
112 > * @off_out: offset of the output file
113 > *
114 > - * Use lseek SEEK_DATA to adjust the fd_in file offset to the next
115 > - * location containing data, and adjust the fd_in file offset and
116 > - * off_out to the same location (creating sparse empty blocks as
117 > - * needed). On success, both fd_in and fd_out file offsets are
118 > - * guaranteed to be exactly equal to the value that off_out points
119 > to.
120 > - *
121 > + * Use lseek SEEK_DATA to adjust off_out to the next location from
122 > fd_in
123 > + * containing data (creates sparse empty blocks when appropriate).
124 > Effects
125 > + * on file offsets are undefined.
126 > + *
127 > * Return: On success, the number of bytes to copy before the next
128 > hole,
129 > * and -1 on failure (errno is set appropriately). Returns 0 when
130 > fd_in
131 > * reaches EOF.
132 > @@ -250,7 +289,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject
133 > *args)
134 > * syscall is not available (less than Linux
135 > 4.5). */
136 > error = 0;
137 > - copyfunc = sendfile;
138 > + copyfunc = sf_wrapper;
139 > copyfunc_ret = copyfunc(fd_out,
140 > fd_in,
141 > &offset_out,
142 > @@ -284,27 +323,18 @@ _reflink_linux_file_copy(PyObject *self,
143 > PyObject *args) } else {
144 > stat_in_acquired = 1;
145 >
146 > - /* For the sendfile call, the fd_in file offset must
147 > be
148 > - * exactly equal to offset_out. Use lseek to ensure
149 > - * correct state, in case an EINTR retry caused it to
150 > - * get out of sync somewhow.
151 > - */
152 > - if (lseek(fd_in, offset_out, SEEK_SET) < 0) {
153 > - error = errno;
154 > - } else {
155 > - while (offset_out < stat_in.st_size) {
156 > - copyfunc_ret = sendfile(fd_out,
157 > - fd_in,
158 > - &offset_out,
159 > - stat_in.st_size -
160 > offset_out);
161 > + while (offset_out < stat_in.st_size) {
162 > + copyfunc_ret = sf_wrapper(fd_out,
163 > + fd_in,
164 > + &offset_out,
165 > + stat_in.st_size -
166 > offset_out);
167 > - if (copyfunc_ret < 0) {
168 > - error = errno;
169 > - if (errno == EINVAL && !offset_out) {
170 > - sendfile_works = 0;
171 > - }
172 > - break;
173 > + if (copyfunc_ret < 0) {
174 > + error = errno;
175 > + if (errno == EINVAL && !offset_out) {
176 > + sendfile_works = 0;
177 > }
178 > + break;
179 > }
180 > }
181 > }
182
183
184 Looks good, has been checked by chutzpah
185
186 --
187 Brian Dolbec <dolsen>

Replies