Daiajo Tibdixious posted on Tue, 30 Apr 2013 18:30:36 +1000 as excerpted:

> During the startup on 3.7.10 /run fails to mount with this error:
> mount: wrong fs type, bad option, bad superblock on tmpfs, missing
> codepage or helper program, or other error
>
> Googling shows many people getting this error, and its something to do
> with openrc and moving from /var/run to /run.
>
> What I can't understand is why I can boot 3.7.9 without any problems,
> while 3.7.10 bombs. I have DEVTMPFS enabled in the kernel.
> Obviously there is something else wrong with my 3.7.10 kernel but I just
> can't figure what.
>
> Is /run supposed to be a physical directory? I thought it was supposed
> to be in ram. I got
> lrwxrwxrwx 1 root root 4 Dec 5 14:13 /var/run -> /run
> drwxr-xr-x 11 root root 360 Apr 30 18:17 /run
>
> Before the last upgrade of openrc (to 0.11.8) my 3.7.10 kernel was
> working fine.

Well, 3.7 is several-months-old history here, as I'm running
mainline-linus-git and just rebooted to 3.9.0 yesterday, and I'm running
live-git openrc-9999 as well, but 0.11.8 is the only release in-tree, so
I guess it can't be too outdated, tho it does date from early December
(07, Pearl Harbor day). But...

Looking at kernel.org, the first thing I note is that 3.7.10 is EOL for
3.7, so you should be thinking about updating anyway... But here's a
definitely-non-kernel-coder look at its changelog...

There's one tmpfs commit listed for 3.7.10, and no block-layer or similar
commits that look like they might trigger it, so a first guess is that
it's that tmpfs commit (the mentioned mpol option is NUMA related):

commit 95558dce307f5ac203cdd15192b8d9f028c0b6c4
Author: Greg Thelen <gthelen@××××××.com>
Date: Fri Feb 22 16:36:01 2013 -0800

    tmpfs: fix use-after-free of mempolicy object

    commit 5f00110f7273f9ff04ac69a5f85bb535a4fd0987 upstream.

    The tmpfs remount logic preserves filesystem mempolicy if the mpol=M
    option is not specified in the remount request. A new policy can be
    specified if mpol=M is given.

    Before this patch remounting an mpol bound tmpfs without specifying
    mpol= mount option in the remount request would set the filesystem's
    mempolicy object to a freed mempolicy object.

    [snip the reproducer and panic, you can look it up if curious]

    Non-debug kernels will not crash immediately because referencing the
    dangling mpol will not cause a fault. Instead the filesystem will
    reference a freed mempolicy object, which will cause unpredictable
    behavior.

    The problem boils down to a dropped mpol reference below if
    shmem_parse_options() does not allocate a new mpol:

        config = *sbinfo
        shmem_parse_options(data, &config, true)
        mpol_put(sbinfo->mpol)
        sbinfo->mpol = config.mpol /* BUG: saves unreferenced mpol */

    This patch avoids the crash by not releasing the mempolicy if
    shmem_parse_options() doesn't create a new mpol.

    How far back does this issue go? I see it in both 2.6.36 and 3.3. I
    did not look back further.

    Signed-off-by: Greg Thelen <gthelen@××××××.com>
    Acked-by: Hugh Dickins <hughd@××××××.com>
    Signed-off-by: Andrew Morton <akpm@××××××××××××××××.org>
    Signed-off-by: Linus Torvalds <torvalds@××××××××××××××××.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@×××××××××××××××.org>

FWIW, that "commit upstream" appears as v3.9-rc1~99^2~8 according to git
name-rev. ~N means the Nth-generation ancestor (following first parents)
and ^P means parent number P of a merge commit, so that's 99 commits back
from 3.9-rc1, then over to the second parent of that merge, then 8
commits further back. Or in plainer language, the first (mainline) tagged
version it appeared in was 3.9-rc1, with the commit obviously appearing
in-tree before that but after 3.8.0.

Which is to say it's about 8 weeks old in mainline, appearing in the
pre-3.9-rc1 merge window.
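
If you want to repeat that lookup yourself, something like the following
should do it. It's only a sketch: tag_relative_name is a helper name made
up for illustration, and it assumes you run it inside a git clone that
contains both the commit and the release tags. git name-rev and its
--tags and --name-only options are stock git.

```shell
# Sketch only: run inside a git clone that has the commit and the tags.

# Print a tag-relative name for a commit, for example v3.9-rc1~99^2~8.
# (tag_relative_name is a hypothetical helper, not part of git.)
tag_relative_name() {
    git name-rev --tags --name-only "$1"
}

# Example, using the upstream commit ID quoted in the changelog above:
#   tag_relative_name 5f00110f7273f9ff04ac69a5f85bb535a4fd0987
```

Read the result left to right: start at the tag, then apply each ~N
(go back N first-parent generations) and ^P (step to parent P) in turn.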

So... given that I'm running a git kernel anyway, my reaction here would
be to try reverting that patch and seeing if that fixed it. If not, I'd
bisect: given that we know 3.7.9 was fine and 3.7.10 wasn't, that it's a
100% reproducer for you, and that there's a very limited number of
commits between the two, a git bisect would be child's play. (Well, for a
child knowing git anyway...)
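
In case it helps, here's roughly what that would look like. Treat it as a
sketch under assumptions: start_bisect and mark_result are helper names I
made up, and it assumes the current directory is a clone of the stable
kernel tree carrying the v3.7.9 and v3.7.10 tags. The git bisect
subcommands themselves (start, good, bad, reset) are standard.

```shell
# Sketch only: assumes a kernel git clone with tags v3.7.9 and v3.7.10.
# To test the single-commit theory first, on a v3.7.10 checkout:
#   git revert 95558dce307f5ac203cdd15192b8d9f028c0b6c4

# Begin a bisect between a known-bad and a known-good revision.
# (start_bisect is a hypothetical helper.)
start_bisect() {
    bad=$1
    good=$2
    git bisect start "$bad" "$good"
}

# Record the verdict for the revision git just checked out, after
# building and booting it: "good" if /run mounted, "bad" if it did not.
# (mark_result is a hypothetical helper.)
mark_result() {
    case $1 in
        good|bad) git bisect "$1" ;;
        *) echo "usage: mark_result good|bad" >&2; return 2 ;;
    esac
}

# Typical session; every step needs a kernel build and a reboot:
#   start_bisect v3.7.10 v3.7.9
#   ... build, install, reboot, check whether /run mounts ...
#   mark_result bad        # or: mark_result good
#   ... repeat until git names the first bad commit ...
#   git bisect reset
```

Since each test needs a reboot, this can't be handed to git bisect run;
you mark each step by hand.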

Meanwhile, are you running a NUMA system? My first amd64 system was a
dual socket Opteron, so NUMA, tho my current system isn't. Do you mount
/run using custom mount-options or just take the default openrc options?
Here, I'm using the following (from fstab):

run /run tmpfs size=2m,nodev,nosuid,noexec,noauto,nr_inodes=4k 0 0

Oh, do you build tmpfs as a module or build it into the kernel
(monolithic)? If it's a module, maybe the kernel either can't find the
tmpfs module for some reason, or is confused somehow about what to load?
If so, could you try building in tmpfs and see if that changes things?
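
One quick way to see whether the kernel you're booted into knows about
tmpfs at all is to look in /proc/filesystems, which lists the filesystem
types currently registered. A minimal sketch follows; fs_registered is a
made-up helper name, and its optional second argument exists only so the
function can be pointed at a sample file instead of the real one.

```shell
# Succeed if the named filesystem type is registered with the kernel.
# Lines in /proc/filesystems look like "nodev<TAB>tmpfs" or "<TAB>ext4",
# so the type is always the last field.
# (fs_registered and its second argument are hypothetical conveniences.)
fs_registered() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

if fs_registered tmpfs; then
    echo "tmpfs is available in the running kernel"
else
    echo "tmpfs is NOT registered (not built in, or not loaded)"
fi
```

Run it under both 3.7.9 and 3.7.10 and compare the answers.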

As to your question about what /run is supposed to be... what did you
/expect/ ls to show? It's a directory in the filesystem, so ls has to
show a directory, regardless of whether it's memory or local-disk or
network or ... that backs the filesystem.

There does have to be a physical /run directory on / to serve as the
mountpoint, so it must exist there before /run is mounted.

After mount, since tmpfs is still a filesystem, only in memory, it'll
still /appear/ as a directory to ls. To see what it actually is, you can
use df /run, which tells you what filesystem it's on, followed by
grep <filesystem> /proc/mounts, to give you what the kernel thinks about
that filesystem and its mount options. Here:

$ ls -dl /run
drwxrwxrwt 7 root root 360 Apr 29 16:05 /run/

$ df /run
Filesystem Size Used Avail Use% Mounted on
run 2.0M 724K 1.3M 36% /run

$ grep run /proc/mounts
run /run tmpfs rw,nosuid,nodev,noexec,relatime,size=2048k,nr_inodes=4096 0 0

(FWIW, there's another unrelated run entry that shows in my grep as well,
the bind-mount for my chrooted named server. I didn't post that.)

As you can see, ls shows /run as a normal dir... as it should, since it's
a filesystem; that it happens to be a filesystem in memory doesn't matter
to ls. But df shows it as its own filesystem, and a grep of /proc/mounts
tells me the filesystem type (tmpfs, so it's in memory because that's
where tmpfs creates its filesystem) as well as the options the kernel
used to mount that filesystem.
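
If you'd rather have the answer in one step, the same information can be
pulled straight out of /proc/mounts. This is a sketch, not anything
openrc ships: mount_fstype is a made-up helper name, and its optional
second argument (an alternate mounts file) is there only for trying it
against sample data.

```shell
# Print the filesystem type mounted at a mountpoint, or fail if nothing
# is mounted there. Fields in /proc/mounts are: device, mountpoint,
# type, options, dump, pass.
# (mount_fstype and its second argument are hypothetical conveniences.)
mount_fstype() {
    awk -v mp="$1" '
        $2 == mp { fstype = $3 }    # a later mount shadows an earlier one
        END { if (fstype == "") exit 1; print fstype }
    ' "${2:-/proc/mounts}"
}

mount_fstype /run || echo "/run is not a separate mount"
```

If it prints tmpfs, /run is in memory; if it reports no separate mount,
you're looking at the bare mountpoint directory on /.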

Meanwhile, another way to tackle the problem, since you know the openrc
version where it shows up as well, would be to bisect openrc instead of
the kernel. That'd tell you exactly what openrc commit was the problem.
If necessary you could do both bisects, getting an even better picture of
what was triggering it, from both the kernel and openrc sides. Of course
that'd be easier if you knew which version of openrc you were running
previously, hopefully 0.11.7.x, as that'd give you the fewest commits to
bisect down.


-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman