Gentoo Archives: gentoo-amd64

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-amd64@l.g.o
Subject:	[gentoo-amd64] Re: urgent: Segfaults after synchronously emerging\|downloading 30GB\|burnng a DVD iso image
Date:	Sat, 27 May 2006 13:40:18
Message-Id:	`e59knt$89f$1@sea.gmane.org`
In Reply to:	[gentoo-amd64] urgent: Segfaults after synchronously emerging\|downloading 30GB\|burnng a DVD iso image by Dieter Ries

1	Dieter Ries <Clip2@×××.de> posted 20060527083416.76720@×××.net, excerpted
2	below, on Sat, 27 May 2006 10:34:16 +0200:
3
4	> i have a very severe and urgent problem:
5	>
6	> Then the emerge stopped with some error i dont remember, and after that,
7	> everything got somehow slow. because i didnt want the dvd to be ruined i
8	> waited till it was burned and the 30G were downloaded. during my waiting i
9	> tried $top or $ps -A to see the running processes, but i got
10	> "Speicherzugriffsfehler", which is AFAIK the same as segmentation fault.
11	>
12	> i got it for everything i tried, and when i tried something from the KDE
13	> menu, nothing happened.
14
15	> when the data was downloaded and the dvd burned, i tried to shutdown from
16	> KDE, no success. then i tried to shutdown from the console, no success
17	> either. in the end i had to use the reset button.
18	>
19	> then, just after "freeing unused kernel memory" there are many errors, all
20	> looking quite the same[.] init hung at that state, nothing worked.
21	>
22	> so i got my livecd, booted from it, then i ran fsck for all the
23	> partitions, without any errors or anything, everything seemed fine.
24	>
25	> i then mounted my system and home partition and proc and typed chroot
26	> /mnt/gentoo /bin/bash, which was followed by, guess it: segmentation
27	> fault.
28	>
29	> the data on all my partitions from sda5 to 10 is still there and i can
30	> mount them all, but i cant chroot and i cant boot.
31	>
32	> so no gentoo anymore[,] i am now using knoppix 4.0 to write for help.
33	>
34	> is there any chance to get my gentoo back to life without completely
35	> install it again? and why does the system break when doing some things
36	> simultaneously?
37	>
38	> can this be a hardware issue?
39
40	OUCH!
41
42	It /could/ be a hardware issue, but as you can boot from LiveCD and the
43	fscks all come out fine, it wouldn't appear to be.
44
45	I think the problem is much more likely a glibc update gone bad.
46	Virtually /everything/ on a system links to glibc, so when it goes bad,
47	you end up as they say "Up a creek without a paddle!"
48
49	I've actually had it happen once, when a portage bug was triggered by an
50	obscure series of events that happened to all come together in a glibc
51	update. I was able to recover, however, as the problem in that case was a
52	bunch of missing symlinks, and I happened to have mc open at the time. I
53	simply kept it open and restored symlinks by hand: try to run something,
54	read the error, fix that symlink, and try again. That got me enough of a
55	working system to finish the recovery by opening up a binpkged version of
56	glibc (thanks to FEATURES=buildpkg, that's one of the times it saved my
57	butt!) and restoring the remaining symlinks with a mass copy from there.
58	(I had to repeat the manual error-and-relink cycle several times, until I
59	had rebuilt enough symlinks to at least run bzip2 so I could untar the
60	appropriate glibc tbz2 binpkg.)
61
62	So anyway, yeah, I know the feeling!
63
64	Assuming the problem is indeed glibc:
65
66	If you have been using FEATURES=buildpkg, recovery shouldn't be too
67	difficult. Simply boot the LiveCD, mount the hard drive root and /usr and
68	/var partitions if you have them, and untar the last correctly working
69	glibc package over the hard drive root. Don't chroot to it until after
70	the untar, so you don't kill functionality, just untar the package to the
71	mounted hard drive root with any other partitions it might write to
72	mounted to the correct place on top of that root.
73
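A minimal sketch of the untar step, meant to be run from the LiveCD shell
outside any chroot. The function name and the paths in the usage note are
assumptions for illustration only; substitute wherever your root is mounted
and wherever your binpkgs actually live. A tbz2 binpkg is a bzip2 tarball
with portage metadata appended, so tar can normally unpack it directly,
though bzip2 may warn about trailing data.

```shell
#!/bin/sh
# Sketch: unpack a glibc binary package over a mounted (not chrooted) root.
# $1 = path to the .tbz2 package, $2 = mount point of the hard drive root.
# Both the function name and its arguments are hypothetical.
restore_glibc_pkg() {
    pkg=$1
    root=$2
    [ -f "$pkg" ] || { echo "no such package: $pkg" >&2; return 1; }
    [ -d "$root" ] || { echo "no such root: $root" >&2; return 1; }
    # -p preserves permissions; -C unpacks relative to the mounted root,
    # so /usr and /var must already be mounted beneath it if separate.
    tar -xjpf "$pkg" -C "$root"
}
```

Usage would look something like
restore_glibc_pkg /mnt/gentoo/<your PKGDIR>/glibc-<version>.tbz2 /mnt/gentoo
with the real package path and version filled in.
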
74	Note that you'll probably want to save copies of any of the following
75	files in /etc that you've modified, as the untarring will overwrite them.
76	You can restore them afterward. host.conf, init.d/nscd, nscd.conf,
77	nsswitch.conf, rpc.
78
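Saving and restoring those /etc files could be sketched as follows. The
file list comes from the paragraph above; the function names, the
/mnt/gentoo mount point, and the backup directory are assumptions. Run the
backup before the untar and the restore after it.

```shell
#!/bin/sh
# /etc files that the glibc package overwrites, per the list above.
GLIBC_ETC_FILES="host.conf init.d/nscd nscd.conf nsswitch.conf rpc"

# Copy each listed file that exists from $1/etc into backup directory $2.
# (Hypothetical helper; $1 is the mounted root, e.g. /mnt/gentoo.)
backup_glibc_etc() {
    root=$1
    dest=$2
    for f in $GLIBC_ETC_FILES; do
        if [ -f "$root/etc/$f" ]; then
            mkdir -p "$dest/$(dirname "$f")"
            cp -p "$root/etc/$f" "$dest/$f"
        fi
    done
}

# Copy the saved files from backup directory $2 back into $1/etc.
restore_glibc_etc() {
    root=$1
    src=$2
    for f in $GLIBC_ETC_FILES; do
        if [ -f "$src/$f" ]; then
            mkdir -p "$root/etc/$(dirname "$f")"
            cp -p "$src/$f" "$root/etc/$f"
        fi
    done
}
```

Files you never modified can simply be left as the package installs them;
the sketch backs up whichever of the five exist and leaves that choice to you.
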
79	If you haven't been using FEATURES=buildpkg, the process is a bit more
80	complicated, but still nothing to panic over. You'll have to use the
81	quickpkg feature on the CD to build a copy of the glibc package on the CD,
82	then untar it over the mounted hard drive root as above (saving backups of
83	the /etc files as above too).
84
85	After this and recovery of the backed up /etc files, if the problem was
86	indeed glibc, you should again have a working system. Since you bypassed
87	portage by untarring the glibc directly, however, the version of glibc
88	that portage thinks is installed will probably be wrong. Thus, you'll
89	want to remerge a known working version using portage. Again, that won't
90	be a big deal if you've been using FEATURES=buildpkg, since you can just
91	emerge -K the version you untarred. If not, you'll need to recompile a
92	new version, which of course will take a while. You may wish to wait until
93	after tonight's gaming thing, if you won't have time to recompile it before
94	then.
95
96	After you have your system back up and running, consider a couple things
97	that might make life easier next time.
98
99	Obviously, I'm going to recommend adding buildpkg to your features if you
100	haven't got it there already. It really /can/ help. To jumpstart the
101	binary package store then, consider using quickpkg to package up all your
102	vital packages (gcc, glibc, portage, python, binutils, etc.) at a minimum.
103	If you want to get everything packaged right away, use emerge --pretend
104	--emptytree to get a list, and package all those up using quickpkg. (You
105	can automate the process if you wish using tools such as cut to get the
106	appropriate fields out of the emerge --pretend output, then feed that
107	to a file for further editing if desired, and then into quickpkg as the
108	list of packages it needs to package. I did it this way when I
109	jumpstarted my binpkg cache.) Alternatively, you can just add the
110	buildpkg feature and emerge --emptytree world, but that will of course
111	take a while.
112
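The automation mentioned above might look roughly like this. It is a
sketch, not the exact pipeline I used: it uses sed where I mentioned cut,
and it assumes each package line of the emerge --pretend output has the
form "[ebuild ...] category/name-version ...". The function name and the
pkglist file name are made up for the example.

```shell
#!/bin/sh
# Sketch: read `emerge --pretend --emptytree world` output on stdin and
# print one "=category/name-version" atom per package line.
# Header and blank lines do not match the pattern and are dropped.
pretend_to_atoms() {
    sed -n 's/^\[ebuild[^]]*\] \([^ ]*\).*/=\1/p'
}

# Intended use (left commented out, since it needs a live Gentoo system):
#   emerge --pretend --emptytree world | pretend_to_atoms > pkglist
#   ... edit pkglist by hand if desired ...
#   quickpkg $(cat pkglist)
```

Going through a file rather than piping straight into quickpkg keeps the
chance to review the list before anything gets packaged.
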
113	Second suggestion, and again something I do here: consider creating a
114	second copy of your root partition, with /var and /usr as well if you have
115	them separate. Then, periodically, when you know you have a stable
116	running system, erase the copy and recopy everything over from your known
117	stable running system. The idea here is that if your system goes haywire
118	for whatever reason, you can simply boot the backup root partition, which
119	will have a complete working system on it as of the time you did the
120	backup. Thus, no worries about this happening again, as you can just boot
121	the backup system (provided you keep the snapshot fairly close to your
122	working system so you aren't trying to use something terribly outdated).
123
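The erase-and-recopy step could be sketched like this, assuming GNU find
and cp and a backup partition that is already mounted. The function name
and the /mnt/rootmirror mount point in the usage note are hypothetical,
and this is only one way to do the copy.

```shell
#!/bin/sh
# Sketch: erase the backup copy and recopy the working root onto it.
# $1 = source root (e.g. /), $2 = mount point of the backup partition.
refresh_rootmirror() {
    src=$1
    dst=$2
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "usage: refresh_rootmirror SRC DST" >&2
        return 1
    fi
    # Refuse obviously dangerous targets before deleting anything.
    case $dst in
        /|"$src") echo "refusing to erase $dst" >&2; return 1 ;;
    esac
    # Erase the old snapshot without crossing into other filesystems.
    find "$dst" -mindepth 1 -xdev -delete
    # -a keeps ownership, permissions, and symlinks; -x stays on the source
    # filesystem, so /proc and separately mounted partitions are skipped.
    cp -ax "$src/." "$dst/"
}
```

Usage would be something like refresh_rootmirror / /mnt/rootmirror, run
only when the working system is known to be stable. The backup also needs
its own bootloader entry and an fstab pointing at the backup partition
before it can actually be booted.
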
124	I actually do this with most of my system. The root partition has /usr
125	and /var on it as well, so the portage database (stored in /var/db) is
126	current with what's on that partition, and I keep a copy of that
127	partition, which I refer to as my rootmirror. Likewise, I keep a copy of
128	/home, a copy of my media partition, a copy of my packages (the result of
129	FEATURES=buildpkg) partition, etc. I don't worry about a copy of /var/log
130	(which is on a separate partition from /var), or about the portage tree
131	(which I can simply resync if it's lost), or /tmp (since the stuff in
132	there by definition need not survive a reboot). I make sure I keep the
133	backup copies updated to the point where if I lose everything on the
134	working copy, I am comfortable resuming from the backup copy, knowing that
135	I can redo anything changed between them in a reasonable time, should it
136	come to that.
137
138	If you had been doing this, then you wouldn't be sweating it now, as you'd
139	just have booted your backup copy and resumed from there. Thus, consider
140	setting up your system that way once you are back up and running, so you
141	aren't left in that sort of situation ever again. (Of course, if your
142	hard drive dies, that's another matter. Here, I use a 4-disk RAID-6 to
143	address that problem -- I can lose any two of the four hard drives
144	without losing anything vital. It's software RAID, so if the board goes,
145	I can buy another board, install the drives and CPUs, rebuild my kernel
146	for the new board using an emergency CD, and be up and running once again.
147	That is, however, about the only case where I'd have to use the emergency
148	CD, as in the other cases, I should still be able to boot to the backup
149	root snapshot and recover from there.)
150
151	Good luck! I hope it /is/ just glibc, as that's scary to recover from
152	when the problem occurs, but not the end of the world. If it's not glibc,
153	things get rather more complex, but all evidence so far says that's what
154	it is.
155
156	--
157	Duncan - List replies preferred. No HTML msgs.
158	"Every nonfree program has a lord, a master --
159	and if you use the program, he is your master." Richard Stallman
160
161	--
162	gentoo-amd64@g.o mailing list
