Gentoo Archives: gentoo-portage-dev

From: Brian Harring <ferringb@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
Date: Sat, 29 Oct 2005 16:05:47
Message-Id: 20051029160410.GO24883@nightcrawler
In Reply to: Re: [gentoo-portage-dev] [5/7] portage_exec cleanups by Jason Stubbs
1 On Sun, Oct 30, 2005 at 12:25:23AM +0900, Jason Stubbs wrote:
2 > On Saturday 29 October 2005 19:56, Jason Stubbs wrote:
3 > > All file descriptors opened by spawn() are now closed in the correct
4 > > places. The code that closes all unnecessary descriptors via brute force is
5 > > no longer required.
6 > >
7 > > I found that a file descriptor was being left open and passed around.
8 > > However, it was definately not created in spawn(). It seems to me though
9 > > that code outside of spawn should take care of closing (or telling spawn to
10 > > close) any unnecessary FDs.
11 >
12 >
13 > On Saturday 29 October 2005 23:22, Brian Harring wrote:
14 > > With the fd_unused route, there is no way clean way to have all fd's
15 > > closed. You don't know what fd's are open when you spawn (nor is
16 > > there any way to know, beyond testing each potential fd, or overriding
17 > > the file builtin); going the fd_unused route means fd's the process
18 > > shouldn't have access to, will have access to.
19 >
20 > This is a difference of viewpoint, I think. Having unknown fds left open by
21 > the calling code seems unclean to me. Iterating through 100s or 1000s of fds
22 > and attempting to close them just in case the caller was sloppy seems equally
23 > unclean.
24
25 Arbitrary example as to why the brute force is required
26
27 class user_source:
28 def __init__(self):
29 self.fd = open("/etc/passwd", "r")
30 def __iter__(self):
31 return self
32 def next(self):
33 l = self.fd.next()
34 return l.split(":",1)[0]
35
36 from portage_exec import spawn_bash
37 def process_users(users):
38 for x in users:
39 spawn_bash("echo %s" % x, fd_pipes={1:1, 2:2})
40
41 if __name__ == "__main__":
42 process_users(user_source())
43
44 I'd have to modify the process_users func to handle passing in a
45 potentially unneeded list of fd's to close- if users were a list, I'm
46 cluttering up process_user's signature specializing for once potential
47 source, since the process_users function is designed to take an
48 iterable object (any iterable object).
49
50 Example above doesn't cover the rather nasty cases where there is N
51 func calls involved between where the fd is opened, and the actual
52 spawn call. The worst scenario is where there is >1 thread running-
53 you're stuck using globals there, which is pretty ugly.
54
55 It's pretty much best practice to close what you don't explicitly need
56 when handling total control over to another process, good daemonizing code
57 does the same thing due to the reasons I raised above.
58
59
60 > By the way, this method does not allow having a pipe open to a child process
61 > other than stdin, stdout or stderr.
62
63 It does; the portage_exec that is in stable is the modified version I
64 created for ebd, which uses this exact functionality.
65
66 line #153 of trunk/pym/ebuild.py, example call-
67
68 self.pid = spawn_func(self.ebd+" daemonize", fd_pipes={0:0, 1:1, 2:2,
69 3:cread, 4:dwrite}, returnpid=True,env=env, *args, **spawn_opts)[0]
70
71 Diffing between stable and trunk portage_exec, no changes affect fd
72 handling (not surprising, the code is nasty without your reworking),
73 so please clarify.
74
75 Addendum though... where the implicit fd closing breaks down is in
76 usage of spawn_func, which is a trunk func only, used for implementing
77 parallel-fetch, due to issues with insuring that the fetching process
78 isn't screwing with the normal sequential buildplan execution (nice
79 way of saying it was something of a hack). Plus side, seperate
80 process, it's not bound by the gil, so it sidestepped that trickery.
81
82
83 > > The original code is ugly, and knowing a bit more this time around I'd
84 > > write it a bit differently, that said the protections/behaviour
85 > > originally in it should be left- it's more robust, and is expected
86 > > now.
87 >
88 > I disagree here. Presently, other than the execve call, every try/except block
89 > in portage_exec.py has bugs that are hidden by the "protections".
90
91 Original in this case was referencing the fd_pipes additions, not the
92 true original spawn func :)
93
94 Most of that code could be converted over to OSError catches anyways,
95 although doing so requires testing the piss out of it to make sure no
96 regressions are introduced (shouldn't be possible, but don't care to
97 screw up the subsystem).
98
99 Looking through the 2.4 subprocess module (which came along after the
100 mods in question), adding an option controlling the fd cleansing would
101 make sense, as long as the default option is to close the fds.
102
103 That's subjective, but the examples I gave above show (imo at least)
104 why this is needed; it's anal and *will* expose bugs rather then
105 letting them lie, but that's better from where I sit.
106 ~harring

Replies

Subject Author
Re: [gentoo-portage-dev] [5/7] portage_exec cleanups Jason Stubbs <jstubbs@g.o>