On 07/01/2013 09:36 PM, Richard Yao wrote:
> On 07/01/2013 03:23 PM, Greg KH wrote:
>> On Mon, Jul 01, 2013 at 08:45:16PM +0200, Tom Wijsman wrote:
>>>>> Q: What about my stable server? I really don't want to run this
>>>>> stuff!
>>>>>
>>>>> A: These options would depend on !CONFIG_VANILLA or
>>>>> CONFIG_EXPERIMENTAL
>>>>
>>>> What is CONFIG_VANILLA? I don't see that in the upstream kernel tree
>>>> at all.
>>>>
>>>> CONFIG_EXPERIMENTAL is now gone from upstream, so you are going to
>>>> have a problem with this.
>>>
>>> Earlier I mentioned "2) These feature should depend on a non-vanilla /
>>> experimental option." which is an option we would introduce under the
>>> Gentoo distribution menu section.
>>
>> Distro-specific config options, great :(
>>
>>>>> which would be disabled by default, therefore if you keep this
>>>>> option the way it is on your stable server; it won't affect you.
>>>>
>>>> Not always true. Look at aufs as an example. It patches the core
>>>> kernel code in ways that are _not_ accepted upstream yet. Now you all
>>>> are running that modified code, even if you don't want aufs.
>>>
>>> Earlier I mentioned "3) The patch should not affect the build by
>>> default."; if it does, we have to adjust it to not do that, this is
>>> something that can be easily scripted. It's just a matter of embedding
>>> each + block in the diff with a config check and updating the counts.
>>
>> Look at aufs as a specific example of why you can't do that, otherwise,
>> don't you think that the aufs developer(s) wouldn't have done so?
>
> I am accquainted with the developer of a stackable filesystem developer.

I should probably proofread multiple times before I send emails. Anyway,
that should have been:

> I am acquainted with the developer of a stackable filesystem.

> According to what he has told me in person offline, the developers on
> the LKML cannot decide on how a stackable filesystem should be
> implemented. I was told three different variations on the design that
> some people liked and others didn't, which ultimately kept the upstream
> kernel from adopting anything. I specifically recall two variations,
> which were doing it as part of the VFS and doing it as part of ext4. If
> you want to criticize stackable filesystems, would you lay out a
> groundwork for getting one implemented upon which people will agree?
>
>> The goal of "don't touch any other kernel code" is a very good one, but
>> not always true for these huge out-of-tree kernel patches. Usually that
>> is the main reason why these patches aren't merged upstream, because
>> those changes are not acceptable.
>
> I was under the impression that there were several reasons for patches
> not being merged upstream:
>
> 1. Lack of sign-off
> 2. Code drop that no one will maintain
> 3. Subsystem maintainers saying no simply because they do not like
> <insert non-technical reason here>.
> 4. Risk of patent trolls
> 5. Actual technical reasons
>
>> So be very careful here, you are messing with things that are rejected
>> by upstream.
>>
>> greg k-h
>>
>
> Only some of the patches were rejected. Others were never submitted. The
> PaX/GrSecurity developers prefer their code to stay out-of-tree. As one
> of the people hacking on ZFSOnLinux, I prefer that the code be
> out-of-tree. That is because fixes for other filesystems are either held
> back by a lack of system kernel updates or held hostage by regressions
> in newer kernels on certain hardware.
>
> With that said, being in Linus' tree does not make code fall under some
> golden standard for quality. There are many significant issues in code
> committed to Linus' kernel, some of which have been problems for
> years. Just to name a few:
>
> 1. Doing `rm -r /dir` on a directory tree containing millions of inodes
> (e.g. ccache) on an ext4 filesystem mounted with discard with the CFQ IO
> elevator will cause a system to hang for hours on pre-SATA 3.1 hardware.
> This is because TRIM is a non-queued command and is being interleaved
> with writes for "fairness". Incidentally, using noop turns a multiple
> hour hang into a laggy experience of a few minutes.
>
> 2. aio_sync() is unimplemented, which means that there is no sane way
> for userland software like QEMU and TGT to be both fast and guarantee
> data integrity. A single crash and your guest is corrupted. It would
> have been better had AIO never been implemented.
>
> 3. dm-crypt will reorder write requests across flushes. That is because
> upon seeing a write, it sends it to a work queue to be processed
> asynchronously and upon seeing a flush, it immediately processes it. A
> single kernel panic or sudden power loss can damage filesystems stored
> on it.
>
> 4. Under low memory conditions with hundreds of concurrent threads (e.g.
> package builds), every thread will enter direct reclaim and there will
> be a remarkable drop in system throughput, assuming that the system does
> not lock up. There is a fairly substantial amount of time wasted after
> one thread finishes direct reclaim in other threads because they will
> still be performing direct reclaim afterward.
>
> 5. The Linux 3.7 nouveau rewrite broke kexec support. The graphics
> hardware will not reinitialize properly.
>
> 6. A throttle mechanism introduced for memory cgroups can cause the
> system to deadlock whenever it is holding a lock needed for swap and
> enters direct reclaim with a significant number of dirty pages.
>
> 7. Code has been accepted on multiple occasions that does not compile
> and the build failures persist for weeks if not months after Linus' tag.
> I sent a patch to fix one failure. It was rejected because I had fixed
> code to compile with -Werror, people thought that -Werror should be
> removed (and therefore there was no reason to fix the warnings) and we
> went 2 months until someone wrote a patch that people liked to fix it.
> For a current example of accepted code failing to build, look here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=38052
>
> Note that I have not checked Linus' tree to see if that bug is still
> current, but the bug itself appears to be open as of this writing.
>
> There are plenty more technical issues, but these are just my pet
> peeves. If you want more examples, you could look at the patches people
> send you each day and ask yourself how many are things that could have
> been caught had people been more careful during review. For instance,
> look at the barrier patches that were done around Linux 2.6.30. What
> prevented those from being caught by review years earlier?
>
> Being outside Linus' tree is not synonymous with being bad and being bad
> is not synonymous with being rejected. It is perfectly reasonable to
> think that there are examples of good code outside Linus' tree.
> Furthermore, should the kernel kernel choose to engage that out-of-tree

That should have been:

> Furthermore, should the kernel team choose to engage that out-of-tree

> code, my expectation is that its quality will improve as they do testing
> and write patches.
>