Gentoo Archives: gentoo-dev

From:	Richard Yao <ryao@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] Re: [gentoo-kernel] Proper distribution integration of kernel *-sources, patches and configuration.
Date:	Tue, 02 Jul 2013 01:44:55
Message-Id:	`51D23086.4040300@gentoo.org`
In Reply to:	Re: [gentoo-dev] Re: [gentoo-kernel] Proper distribution integration of kernel *-sources, patches and configuration. by Richard Yao

1	On 07/01/2013 09:36 PM, Richard Yao wrote:
2	> On 07/01/2013 03:23 PM, Greg KH wrote:
3	>> On Mon, Jul 01, 2013 at 08:45:16PM +0200, Tom Wijsman wrote:
4	>>>>> Q: What about my stable server? I really don't want to run this
5	>>>>> stuff!
6	>>>>>
7	>>>>> A: These options would depend on !CONFIG_VANILLA or
8	>>>>> CONFIG_EXPERIMENTAL
9	>>>>
10	>>>> What is CONFIG_VANILLA? I don't see that in the upstream kernel tree
11	>>>> at all.
12	>>>>
13	>>>> CONFIG_EXPERIMENTAL is now gone from upstream, so you are going to
14	>>>> have a problem with this.
15	>>>
16	>>> Earlier I mentioned "2) These feature should depend on a non-vanilla /
17	>>> experimental option." which is an option we would introduce under the
18	>>> Gentoo distribution menu section.
19	>>
20	>> Distro-specific config options, great :(
21	>>
22	>>>>> which would be disabled by default, therefore if you keep this
23	>>>>> option the way it is on your stable server; it won't affect you.
24	>>>>
25	>>>> Not always true. Look at aufs as an example. It patches the core
26	>>>> kernel code in ways that are _not_ accepted upstream yet. Now you all
27	>>>> are running that modified code, even if you don't want aufs.
28	>>>
29	>>> Earlier I mentioned "3) The patch should not affect the build by
30	>>> default."; if it does, we have to adjust it to not do that, this is
31	>>> something that can be easily scripted. It's just a matter of embedding
32	>>> each + block in the diff with a config check and updating the counts.
33	>>
34	>> Look at aufs as a specific example of why you can't do that, otherwise,
35	>> don't you think that the aufs developer(s) wouldn't have done so?
36	>
37	> I am accquainted with the developer of a stackable filesystem developer.
38
39	I should probably proofread multiple times before I send emails. Anyway,
40	that should have been:
41
42	> I am acquainted with the developer of a stackable filesystem.
43
44	> According to what he has told me in person offline, the developers on
45	> the LKML cannot decide on how a stackable filesystem should be
46	> implemented. I was told three different variations on the design that
47	> some people liked and others didn't, which ultimately kept the upstream
48	> kernel from adopting anything. I specifically recall two variations,
49	> which were doing it as part of the VFS and doing it as part of ext4. If
50	> you want to criticize stackable filesystems, would you lay out a
51	> groundwork for getting one implemented upon which people will agree?
52	>
53	>> The goal of "don't touch any other kernel code" is a very good one, but
54	>> not always true for these huge out-of-tree kernel patches. Usually that
55	>> is the main reason why these patches aren't merged upstream, because
56	>> those changes are not acceptable.
57	>
58	> I was under the impression that there were several reasons for patches
59	> not being merged upstream:
60	>
61	> 1. Lack of a Signed-off-by line
62	> 2. Code drop that no one will maintain
63	> 3. Subsystem maintainers saying no simply because they do not like
64	> <insert non-technical reason here>.
65	> 4. Risk of patent trolls
66	> 5. Actual technical reasons
67	>
68	>> So be very careful here, you are messing with things that are rejected
69	>> by upstream.
70	>>
71	>> greg k-h
72	>>
73	>
74	> Only some of the patches were rejected. Others were never submitted. The
75	> PaX/GrSecurity developers prefer their code to stay out-of-tree. As one
76	> of the people hacking on ZFSOnLinux, I prefer that the code be
77	> out-of-tree. That is because fixes for other filesystems are either held
78	> back by a lack of system kernel updates or held hostage by regressions
79	> in newer kernels on certain hardware.
80	>
81	> With that said, being in Linus' tree does not make code fall under some
82	> golden standard for quality. There are many significant issues in code
83	> committed to the kernel, some of which have been problems for
84	> years. Just to name a few:
85	>
86	> 1. Doing `rm -r /dir` on a directory tree containing millions of inodes
87	> (e.g. ccache) on an ext4 filesystem mounted with discard with the CFQ IO
88	> elevator will cause a system to hang for hours on pre-SATA 3.1 hardware.
89	> This is because TRIM is a non-queued command and is being interleaved
90	> with writes for "fairness". Incidentally, using noop turns a multiple
91	> hour hang into a laggy experience of a few minutes.
92	>
93	> 2. aio_sync() is unimplemented, which means that there is no sane way
94	> for userland software like QEMU and TGT to be both fast and guarantee
95	> data integrity. A single crash and your guest is corrupted. It would
96	> have been better had AIO never been implemented.
97	>
98	> 3. dm-crypt will reorder write requests across flushes. That is because
99	> upon seeing a write, it sends it to a work queue to be processed
100	> asynchronously and upon seeing a flush, it immediately processes it. A
101	> single kernel panic or sudden power loss can damage filesystems stored
102	> on it.
103	>
104	> 4. Under low memory conditions with hundreds of concurrent threads (e.g.
105	> package builds), every thread will enter direct reclaim and there will
106	> be a remarkable drop in system throughput, assuming that the system does
107	> not lock up. There is a fairly substantial amount of time wasted after
108	> one thread finishes direct reclaim in other threads because they will
109	> still be performing direct reclaim afterward.
110	>
111	> 5. The Linux 3.7 nouveau rewrite broke kexec support. The graphics
112	> hardware will not reinitialize properly.
113	>
114	> 6. A throttle mechanism introduced for memory cgroups can cause the
115	> system to deadlock whenever it is holding a lock needed for swap and
116	> enters direct reclaim with a significant number of dirty pages.
117	>
118	> 7. Code has been accepted on multiple occasions that does not compile
119	> and the build failures persist for weeks if not months after Linus' tag.
120	> I sent a patch to fix one failure. It was rejected because I had fixed
121	> code to compile with -Werror, people thought that -Werror should be
122	> removed (and therefore was no reason to fix the warnings) and we went 2
123	> months until someone wrote a patch that people liked to fix it. For a
124	> current example of accepted code failing to build, look here:
125	>
126	> https://bugzilla.kernel.org/show_bug.cgi?id=38052
127	>
128	> Note that I have not checked Linus' tree to see if that bug is still
129	> current, but the bug itself appears to be open as of this writing.
130	>
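
The write reordering described in item 3 of the list above can be pictured with a toy model. This is only a sketch of the behaviour as the message describes it (writes deferred to a work queue, flushes handled immediately), not dm-crypt's actual code; the function names and the request strings are made up for illustration.

```python
from collections import deque


def submit_flush_immediately(requests):
    """Toy model: writes are deferred to a work queue, flushes bypass it."""
    workqueue = deque()
    device_order = []
    for req in requests:
        if req == "FLUSH":
            device_order.append(req)  # handled right away, ahead of queued writes
        else:
            workqueue.append(req)     # deferred for asynchronous processing
    while workqueue:                  # the worker drains the queue later
        device_order.append(workqueue.popleft())
    return device_order


def submit_flush_after_drain(requests):
    """Same model, but a flush first waits for every write queued before it."""
    workqueue = deque()
    device_order = []
    for req in requests:
        if req == "FLUSH":
            while workqueue:
                device_order.append(workqueue.popleft())
            device_order.append(req)
        else:
            workqueue.append(req)
    while workqueue:
        device_order.append(workqueue.popleft())
    return device_order


requests = ["write journal", "FLUSH", "write commit"]
print(submit_flush_immediately(requests))
# ['FLUSH', 'write journal', 'write commit']
print(submit_flush_after_drain(requests))
# ['write journal', 'FLUSH', 'write commit']
```

In the first ordering the flush reaches the device before the journal write does, so it no longer guarantees that the earlier write is durable before the commit is issued. A crash at that point is the failure the message warns about.
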
131	> There are plenty more technical issues, but these are just my pet
132	> peeves. If you want more examples, you could look at the patches people
133	> send you each day and ask yourself how many are things that could have
134	> been caught had people been more careful during review. For instance,
135	> look at the barrier patches that were done around Linux 2.6.30. What
136	> prevented those from being caught by review years earlier?
137	>
138	> Being outside Linus' tree is not synonymous with being bad and being bad
139	> is not synonymous with being rejected. It is perfectly reasonable to
140	> think that there are examples of good code outside Linus' tree.
141	> Furthermore, should the kernel kernel choose to engage that out-of-tree
142
143	That should have been:
144
145	> Furthermore, should the kernel team choose to engage that out-of-tree
146
147	> code, my expectation is that its quality will improve as they do testing
148	> and write patches.
149	>
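
The scripted approach Tom Wijsman describes earlier in the thread (embedding each + block of a patch in a config check and updating the counts) might look roughly like the sketch below. It rests on several assumptions: the input is a unified diff for a single C source file, CONFIG_GENTOO_EXPERIMENTAL is a hypothetical symbol name (the thread does not name one), and guard_added_blocks is an illustrative helper, not an existing Gentoo tool.

```python
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(.*)$")


def guard_added_blocks(diff_lines, symbol="CONFIG_GENTOO_EXPERIMENTAL"):
    """Wrap each run of '+' lines in #ifdef/#endif and fix the hunk headers.

    Assumes a unified diff for ONE C file; `symbol` is a hypothetical name.
    """
    out = []
    shift = 0  # guard lines added by earlier hunks move later hunks down
    i = 0
    while i < len(diff_lines):
        m = HUNK_RE.match(diff_lines[i])
        if not m:
            out.append(diff_lines[i])  # '---' / '+++' file headers pass through
            i += 1
            continue
        # An omitted count in a hunk header means 1.
        old_start, old_count, new_start, new_count = (
            int(g) if g is not None else 1 for g in m.group(1, 2, 3, 4)
        )
        tail = m.group(5)
        i += 1
        body, added, in_plus = [], 0, False
        while i < len(diff_lines) and not HUNK_RE.match(diff_lines[i]):
            line = diff_lines[i]
            is_plus = line.startswith("+")
            if is_plus and not in_plus:
                body.append(f"+#ifdef {symbol}")  # open a guard before the run
                added += 1
            if in_plus and not is_plus:
                body.append("+#endif")            # close the guard after the run
                added += 1
            in_plus = is_plus
            body.append(line)
            i += 1
        if in_plus:
            body.append("+#endif")
            added += 1
        out.append(
            f"@@ -{old_start},{old_count} +{new_start + shift},{new_count + added} @@{tail}"
        )
        out.extend(body)
        shift += added
    return out


patch = [
    "--- a/fs/example.c",  # hypothetical path
    "+++ b/fs/example.c",
    "@@ -10,2 +10,3 @@",
    " int a;",
    "+int b;",
    " int c;",
]
print("\n".join(guard_added_blocks(patch)))
```

The sketch guards only added lines. Lines that a patch removes or rewrites still change unconditionally, and #ifdef is meaningless in Makefiles or Kconfig files, so this cannot cover every patch.
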

Attachments

File name	MIME type
signature.asc	application/pgp-signature
