Dieter Ries <Clip2@×××.de> posted 20060527083416.76720@×××.net, excerpted
below, on Sat, 27 May 2006 10:34:16 +0200:

> i have a very severe and urgent problem:
>
> Then the emerge stopped with some error i dont remember, and after that,
> everything got somehow slow. because i didnt want the dvd to be ruined i
> waited till it was burned and the 30G were downloade. during my waiting i
> tried $top or $ps -A to see the running processes, but i got
> "Speicherzugriffsfehler", which is AFAIK the same as segmentation fault.
>
> i got it for everything i tried, and when i tried something from the KDE
> menu, nothing happened.

> when the data was downloaded and the dvd burned, i tried to shutdown from
> KDE, no success. the i tried to shutdown from the console, no success
> either. in the end i had to use the reset button.
>
> then, just after "freeing unused kernel memory" there are many errors, all
> looking quite the same[.] init hung at that state, nothing worked.
>
> so i got my livecd, botted from it, then i ran fsck for all the
> partitions, without any errors or anything, everything seemed fine.
>
> i then mounted my system and home partition and proc and typed chroot
> /mnt/gentoo /bin/bash, which was followed by, guess it: segmentation
> fault.
>
> the data on all my partitions from sda5 to 10 is still there and i can
> mount them all, but i cant chroot and i cant boot.
>
> so no gentoo anymore[,] i am now using knoppix 4.0 to write for help.
>
> is there any chance to get my gentoo back to life without completely
> install it again? and why does the system break when doing some things
> simultaneously?
>
> can this be a hardware issue?

OUCH!

It /could/ be a hardware issue, but since you can boot from the LiveCD and
the fscks all come out fine, it doesn't appear to be.

I think the problem is much more likely a glibc update gone bad.
Virtually /everything/ on a system links to glibc, so when it goes bad,
you end up, as they say, "up a creek without a paddle!"

I've actually had it happen once, when a portage bug was triggered by an
obscure series of events that happened to all come together in a glibc
update. I was able to recover, however, as the problem in that case was a
bunch of missing symlinks, and I happened to have mc open at the time and
just didn't close it. I restored symlinks by hand: try to run something,
get the error, fix that symlink, try again. I had to go through that
cycle several times, until I got enough of them rebuilt to at least run
bzip2, so I could untar the appropriate glibc tbz2 binpkg (thanks to
FEATURES=buildpkg, that's one of the times it saved my butt!) and restore
the remaining symlinks with a mass copy from there.

So anyway, yeah, I know the feeling!

Assuming the problem is indeed glibc...

If you have been using FEATURES=buildpkg, recovery shouldn't be too
difficult. Simply boot the LiveCD, mount the hard drive root, mount the
/usr and /var partitions (if you have them separate) at the correct places
on top of that root, and untar the last correctly working glibc package
over the hard drive root. Don't chroot into it until after the untar, or
you'd be running on the broken glibc again; just untar the package from
the LiveCD environment onto the mounted root.

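From the LiveCD shell, the untar step described above might look roughly
like this. It is only a sketch under stated assumptions: the /mnt/gentoo
mount point, the device names and the package path are placeholders to
replace with your own, and restore_glibc and DRY_RUN are made-up helper
names, not part of portage.

```shell
#!/bin/sh
# Sketch only: restore_glibc and DRY_RUN are hypothetical helpers.
# Run this from the LiveCD environment, NOT from inside a chroot.

# restore_glibc PKG ROOT
#   Untar the binary package PKG over the mounted hard drive root ROOT.
#   With DRY_RUN set to a non-empty value, only print the tar command.
restore_glibc() {
    pkg=$1
    root=$2
    if [ ! -f "$pkg" ]; then
        echo "restore_glibc: no such package: $pkg" >&2
        return 1
    fi
    if [ ! -d "$root" ]; then
        echo "restore_glibc: root is not a directory: $root" >&2
        return 1
    fi
    # -x extract, -j bzip2, -p preserve permissions, -C unpack under ROOT
    ${DRY_RUN:+echo} tar -xjpf "$pkg" -C "$root"
}

# Example session (device names, path and version are placeholders):
#   mount /dev/sdaX /mnt/gentoo          # hard drive root
#   mount /dev/sdaY /mnt/gentoo/usr      # only if /usr is separate
#   DRY_RUN=1 restore_glibc /path/to/glibc-<version>.tbz2 /mnt/gentoo
#   restore_glibc /path/to/glibc-<version>.tbz2 /mnt/gentoo
```

Doing a DRY_RUN=1 pass first shows exactly which command would run and
where it would unpack, which is worth a look before overwriting anything
on a system that is already in trouble.
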
Note that you'll probably want to save copies of any of the following
files in /etc that you've modified, as the untarring will overwrite them:
host.conf, init.d/nscd, nscd.conf, nsswitch.conf, rpc. You can restore
them afterward.

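Saving and restoring those files could be scripted along these lines. This
is a minimal sketch: the function names and the backup directory are
assumptions made up for illustration, and only the file list comes from
the note above.

```shell
#!/bin/sh
# Sketch only: function names and backup location are hypothetical.

# The /etc files named above, relative to /etc.
GLIBC_ETC_FILES="host.conf init.d/nscd nscd.conf nsswitch.conf rpc"

# save_glibc_etc ROOT BACKUP
#   Copy each listed file that exists under ROOT/etc into BACKUP,
#   keeping the relative path (so init.d/nscd stays under init.d/).
save_glibc_etc() {
    root=$1
    backup=$2
    for f in $GLIBC_ETC_FILES; do
        if [ -f "$root/etc/$f" ]; then
            mkdir -p "$backup/$(dirname "$f")"
            cp -p "$root/etc/$f" "$backup/$f"
        fi
    done
}

# restore_glibc_etc ROOT BACKUP
#   Copy the saved files back over ROOT/etc after the untar.
restore_glibc_etc() {
    root=$1
    backup=$2
    for f in $GLIBC_ETC_FILES; do
        if [ -f "$backup/$f" ]; then
            cp -p "$backup/$f" "$root/etc/$f"
        fi
    done
}

# Example (paths are placeholders):
#   save_glibc_etc /mnt/gentoo /mnt/gentoo/root/etc-before-glibc
#   ... untar the glibc package ...
#   restore_glibc_etc /mnt/gentoo /mnt/gentoo/root/etc-before-glibc
```
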
If you haven't been using FEATURES=buildpkg, the process is a bit more
complicated, but still nothing to panic over. You'll have to use quickpkg
on the CD to build a copy of the glibc package on the CD, then untar it
over the mounted hard drive root as above (saving backups of the /etc
files as above too).

After this and recovery of the backed up /etc files, if the problem was
indeed glibc, you should again have a working system. Since you bypassed
portage by untarring the glibc directly, however, the version of glibc
that portage thinks is installed will probably be wrong. Thus, you'll
want to remerge a known working version using portage. Again, that won't
be a big deal if you've been using FEATURES=buildpkg, since you can just
emerge -K the version you untarred. If not, you'll need to recompile a
new version, which of course will take a while. You may wish to wait until
after tonight's gaming thing, if you won't have time to recompile it
before then.

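To see which glibc version portage has on record, you can look at its
installed-package database under /var/db/pkg. The sketch below assumes the
usual layout (one directory per installed category/package-version); the
recorded_glibc function is a made-up helper, and the version in the
emerge line is a placeholder.

```shell
#!/bin/sh
# Sketch only: recorded_glibc is a hypothetical helper.

# recorded_glibc [ROOT]
#   Print the glibc package-version names that portage's database
#   under ROOT (default /) lists as installed, one per line.
recorded_glibc() {
    root=${1:-/}
    for d in "$root"/var/db/pkg/sys-libs/glibc-*; do
        # An unmatched glob stays literal, so check it is a real directory.
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
}

# Example, once the system boots again:
#   recorded_glibc
#   emerge -K =sys-libs/glibc-<version>   # -K: use binary packages only
```

If the name printed differs from the package you untarred, that is the
mismatch the remerge is meant to clear up.
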
After you have your system back up and running, consider a couple of
things that might make life easier next time.

Obviously, I'm going to recommend adding buildpkg to your FEATURES if you
haven't got it there already. It really /can/ help. To jumpstart the
binary package store, consider using quickpkg to package up all your
vital packages: gcc, glibc, portage, python, binutils, etc., at a minimum.
If you want to get everything packaged right away, use emerge --pretend
--emptytree world to get a list, and package all those up using quickpkg.
(You can automate the process if you wish, using tools such as cut to get
the appropriate fields out of the emerge --pretend output, then feeding
that to a file for further editing if desired, and then into quickpkg as
the list of packages it needs to package. I did it this way when I
jumpstarted my binpkg cache.) Alternatively, you can just add the
buildpkg feature and emerge --emptytree world, but that will of course
take a while.

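The list-building step can be sketched as a small filter. This version
uses sed rather than cut, and it assumes the pretend output has lines of
the form "[ebuild ...] category/package-version ...". The pretend_to_atoms
name and the pkglist file are made up for illustration, and it is worth
checking the quickpkg man page for the argument forms your version
accepts.

```shell
#!/bin/sh
# Sketch only: pretend_to_atoms and pkglist are hypothetical names.

# pretend_to_atoms
#   Read `emerge --pretend` output on stdin and print one
#   "=category/package-version" atom per "[ebuild ...]" line.
#   Header text, blank lines and "[blocks ...]" lines are dropped.
pretend_to_atoms() {
    sed -n 's/^\[ebuild[^]]*] *\([^ ]*\).*/=\1/p'
}

# Example:
#   emerge --pretend --emptytree world | pretend_to_atoms > pkglist
#   ${EDITOR:-vi} pkglist        # trim the list if desired
#   quickpkg $(cat pkglist)
#
# For future merges, buildpkg goes into FEATURES in /etc/make.conf:
#   FEATURES="buildpkg"
```
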
Second suggestion, and something I'm again doing here: consider creating a
second copy of your root partition, with /var and /usr as well if you have
them separate. Then, periodically, when you know you have a stable
running system, erase the copy and recopy everything over from your known
stable running system. The idea here is that if your system goes haywire
for whatever reason, you can simply boot the backup root partition, which
will have a complete working system on it as of the time you did the
backup. Thus, no worries about this happening again, as you can just boot
the backup system (provided you keep the snapshot fairly close to your
working system so you aren't trying to use something terribly outdated).

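A minimal sketch of that erase-and-recopy step, assuming GNU cp and find
and a backup partition already mounted somewhere. The refresh_mirror name
and the /mnt/rootmirror mount point are made up for illustration, and cp
is just one possible tool for the job. Note that the function deletes
everything under the destination, so double-check the arguments.

```shell
#!/bin/sh
# Sketch only: refresh_mirror is a hypothetical helper. It ERASES DST.

# refresh_mirror SRC DST
#   Erase everything under DST, then copy SRC's filesystem into it.
refresh_mirror() {
    src=$1
    dst=$2
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "refresh_mirror: both arguments must be directories" >&2
        return 1
    fi
    # Refuse obviously dangerous targets.
    if [ "$dst" = "/" ] || [ "$dst" = "$src" ]; then
        echo "refresh_mirror: refusing to erase $dst" >&2
        return 1
    fi
    # Erase the old copy (everything directly under DST, dotfiles too).
    find "$dst" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
    # Recopy: -a keeps owners, modes, times and symlinks; -x stays on
    # SRC's filesystem, so /proc, the mirror itself and other mounts
    # are not descended into.
    cp -ax "$src"/. "$dst"/
}

# Example (device name and mount point are placeholders):
#   mount /dev/sdaZ /mnt/rootmirror
#   refresh_mirror / /mnt/rootmirror
#   umount /mnt/rootmirror
```

After a refresh, the copy's /etc/fstab still names the original root
device, so it needs adjusting, along with a bootloader entry, before the
mirror can be booted on its own.
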
I actually do this with most of my system. The root partition has /usr
and /var on it as well, so the portage database (stored in /var/db) is
current with what's on that partition, and I keep a copy of that
partition, which I refer to as my rootmirror. Likewise, I keep a copy of
/home, a copy of my media partition, a copy of my packages (the result of
FEATURES=buildpkg) partition, etc. I don't worry about a copy of /var/log
(which is on a separate partition from /var), or about the portage tree
(which I can simply resync if it's lost), or /tmp (since the stuff in
there by definition need not survive a reboot). I make sure I keep the
backup copies updated to the point where if I lose everything on the
working copy, I am comfortable resuming from the backup copy, knowing that
I can redo anything changed between them in a reasonable time, should it
come to that.

If you had been doing this, then you wouldn't be sweating it now, as you'd
just have booted your backup copy and resumed from there. Thus, consider
setting up your system that way once you are back up and running, so you
aren't left in that sort of situation ever again. (Of course, if your
hard drive dies, that's another matter. Here, I use a 4-disk RAID-6 to
address that problem -- I can lose any two of the four hard drives
without losing anything vital. It's software RAID, so if the board goes,
I can buy another board, install the drives and CPUs, rebuild my kernel
for the new board using an emergency CD, and be up and running once again.
That is, however, about the only case where I'd have to use the emergency
CD, as in the other cases, I should still be able to boot to the backup
root snapshot and recover from there.)

Good luck! I hope it /is/ just glibc, as that's scary to recover from
when the problem occurs, but not the end of the world. If it's not glibc,
things get rather more complex, but all evidence so far says that's what
it is.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list