Gentoo Archives: gentoo-amd64

From: Dieter Ries <clip2@×××.de>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Re: urgent: Segfaults after synchronously emerging|downloading 30GB|burnng a DVD iso image
Date: Sat, 27 May 2006 22:32:41
Message-Id: 200605280030.06641.clip2@gmx.de
In Reply to: [gentoo-amd64] Re: urgent: Segfaults after synchronously emerging|downloading 30GB|burnng a DVD iso image by Duncan <1i5t5.duncan@cox.net>
1 THANK YOU DUNCAN!
2
3 YOU FIXED IT.
4 or better, i fixed it, but i would have been lost without your help. the
5 quickpkg thing did the trick.
6
7 first i was in doubt, cause quickpkg is not on the livecd fs, but then i
8 started thinking, and then i untarred stage and portage and made the package.
9
10 first i untarred the package over /etc, but then i did it over root and it
11 worked.
12
13 once again THANK YOU, you saved me a LOT of work!
14
15 cu all
16
17 Dieter
18
19
20 >
21 > OUCH!
22 >
23 > It /could/ be a hardware issue, but as you can boot from LiveCD and the
24 > fscks all come out fine, it wouldn't appear to be.
25 >
26 > I think the problem is much more likely a glibc update gone bad.
27 > Virtually /everything/ on a system links to glibc, so when it goes bad,
28 > you end up as they say "Up a creek without a paddle!"
29 >
30 > I've actually had it happen once, when a portage bug was triggered by an
31 > obscure series of events that happened to all come together in a glibc
32 > update. I was able to recover, however, as the problem in that case was a
33 > bunch of missing symlinks, and I happened to have mc open at the time and
34 > just didn't close it, but restored enough symlinks by hand based on
35 > trying to run something and getting the error and fixing that symlink
36 > and trying again, using mc to get enough of a working system to finish
37 > recovery by opening up a binpkged version (thanks to FEATURES=buildpkg,
38 > that's one of the times it saved my butt!) of glibc and restoring the
39 > symlinks with a mass copy from there. (I had to repeat the manual
40 > error-then-rebuild-symlink cycle several times, until I had enough of them
41 > rebuilt to at least run bzip2 so I could untar the appropriate glibc tbz2 binpkg.)
42 >
43 > So anyway, yeah, I know the feeling!
44 >
45 > Assuming the problem is indeed glibc:
46 >
47 > If you have been using FEATURES=buildpkg, recovery shouldn't be too
48 > difficult. Simply boot the LiveCD, mount the hard drive root and /usr and
49 > /var partitions if you have them, and untar the last correctly working
50 > glibc package over the hard drive root. Don't chroot into it until after
51 > the untar, so the LiveCD tools keep working; just untar the package onto the
52 > mounted hard drive root, with any other partitions it might write to
53 > mounted in the correct place on top of that root.
54 >
55 > Note that you'll probably want to save copies of any of these files in
56 > /etc that you've modified (host.conf, init.d/nscd, nscd.conf,
57 > nsswitch.conf, rpc), as the untarring will overwrite them. You can
58 > restore them afterward.
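
A rough sketch of the backup, untar and restore steps described above, run from the LiveCD. The function names and the example paths in the usage comment are assumptions for illustration, not anything from the original advice. It also assumes the binpkg is a bzip2 tarball that tar can read directly; tar or bzip2 may warn about the trailing portage metadata.

```shell
#!/bin/sh
# Sketch only: the function names and all paths below are hypothetical.

# The /etc files named in the message, relative to /etc.
GLIBC_ETC_FILES="host.conf init.d/nscd nscd.conf nsswitch.conf rpc"

# Copy whichever of those files exist under <root>/etc into <dest>.
backup_etc_files() {
    root=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in $GLIBC_ETC_FILES; do
        if [ -f "$root/etc/$f" ]; then
            mkdir -p "$dest/$(dirname "$f")" || return 1
            cp -p "$root/etc/$f" "$dest/$f" || return 1
        fi
    done
}

# Unpack a glibc binpkg over the mounted hard-drive root (no chroot).
restore_glibc() {
    root=$1
    pkg=$2
    tar -xjpf "$pkg" -C "$root"
}

# Put the saved /etc files back after the untar.
restore_etc_files() {
    root=$1
    dest=$2
    cp -pR "$dest/." "$root/etc/"
}

# Example (hypothetical mount point and package path):
#   backup_etc_files  /mnt/gentoo /tmp/etc-backup
#   restore_glibc     /mnt/gentoo /mnt/gentoo/path/to/glibc-binpkg.tbz2
#   restore_etc_files /mnt/gentoo /tmp/etc-backup
```

The first argument has to be the mounted root itself, not its /etc directory, since the package contains paths relative to /.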
59 >
60 > If you haven't been using FEATURES=buildpkg, the process is a bit more
61 > complicated, but still nothing to panic over. You'll have to use the
62 > quickpkg feature on the CD to build a copy of the glibc package on the CD,
63 > then untar it over the mounted hard drive root as above (saving backups of
64 > the /etc files as above too).
65 >
66 > After this and recovery of the backed up /etc files, if the problem was
67 > indeed glibc, you should again have a working system. Since you bypassed
68 > portage by untarring the glibc directly, however, the version of glibc
69 > that portage thinks is installed will probably be wrong. Thus, you'll
70 > want to remerge a known working version using portage. Again, that won't
71 > be a big deal if you've been using FEATURES=buildpkg, since you can just
72 > emerge -K the version you untarred. If not, you'll need to recompile a
73 > new version, which of course will take a while. You may wish to wait until
74 > after tonight's gaming thing, if you won't have time to recompile it before
75 > then.
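
As a small illustration of the "emerge -K the version you untarred" step: the helper below only prints the command, so it can be checked before running it for real. The helper name is made up, and adding --oneshot (so glibc is not recorded in the world file) is an assumption on top of the original advice. -K is the short form of --usepkgonly.

```shell
#!/bin/sh
# Sketch only: remerge_glibc_cmd is a hypothetical helper name.

# Print the emerge command that reinstalls one exact glibc version from
# an existing binary package, so portage's database matches the files
# that were untarred by hand.
remerge_glibc_cmd() {
    version=$1
    if [ -z "$version" ]; then
        echo "usage: remerge_glibc_cmd <version>" >&2
        return 1
    fi
    printf 'emerge --oneshot --usepkgonly =sys-libs/glibc-%s\n' "$version"
}

# Example: print the command for the version you untarred, then run it
# inside the repaired system:
#   remerge_glibc_cmd <version>
```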
76 >
77 > After you have your system back up and running, consider a couple things
78 > that might make life easier next time.
79 >
80 > Obviously, I'm going to recommend adding buildpkg to your features if you
81 > haven't got it there already. It really /can/ help. To jumpstart the
82 > binary package store then, consider using quickpkg to package up all your
83 > vital packages, gcc, glibc, portage, python, binutils, etc, at a minimum.
84 > If you want to get everything packaged right away, use emerge --pretend
85 > --emptytree to get a list, and package all those up using quickpkg. (You
86 > can automate the process if you wish using tools such as cut to get the
87 > appropriate fields out of the emerge --pretend output, then feed that
88 > to a file for further editing if desired, and then into quickpkg as the
89 > list of packages it needs to package. I did it this way when I
90 > jumpstarted my binpkg cache.) Alternatively, you can just add the
91 > buildpkg feature and emerge --emptytree world, but that will of course
92 > take a while.
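
One way the "emerge --pretend output into quickpkg" automation could look, using sed rather than cut. This is a sketch under assumptions: the filter name is invented, and it expects lines of the form "[ebuild ...] category/package-version ..."; other portage versions may format the output differently, so check the list before feeding it to quickpkg.

```shell
#!/bin/sh
# Sketch only: pretend_to_atoms is a hypothetical name, and the expected
# "[ebuild ...] cat/pkg-version" line format is an assumption.

# Read `emerge --pretend` output on stdin and print one "=cat/pkg-version"
# atom per package line. Other lines (headers, blockers, totals) are dropped.
pretend_to_atoms() {
    sed -n 's/^\[ebuild[^]]*] *\([^ ]*\).*/=\1/p'
}

# Example:
#   emerge --pretend --emptytree world | pretend_to_atoms > pkglist
#   (edit pkglist by hand if desired)
#   quickpkg $(cat pkglist)
```

Writing the list to a file first keeps the manual editing step mentioned above.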
93 >
94 > Second suggestion, and something I again do here: consider creating a
95 > second copy of your root partition, with /var and /usr as well if you have
96 > them separate. Then, periodically, when you know you have a stable
97 > running system, erase the copy and recopy everything over from your known
98 > stable running system. The idea here is that if your system goes haywire
99 > for whatever reason, you can simply boot the backup root partition, which
100 > will have a complete working system on it as of the time you did the
101 > backup. Thus, no worries about this happening again, as you can just boot
102 > the backup system (provided you keep the snapshot fairly close to your
103 > working system so you aren't trying to use something terribly outdated).
104 >
105 > I actually do this with most of my system. The root partition has /usr
106 > and /var on it as well, so the portage database (stored in /var/db) is
107 > current with what's on that partition, and I keep a copy of that
108 > partition, which I refer to as my rootmirror. Likewise, I keep a copy of
109 > /home, a copy of my media partition, a copy of my packages (the result of
110 > FEATURES=buildpkg) partition, etc. I don't worry about a copy of /var/log
111 > (which is on a separate partition from /var), or about the portage tree
112 > (which I can simply resync if it's lost), or /tmp (since the stuff in
113 > there by definition need not survive a reboot). I make sure I keep the
114 > backup copies updated to the point where if I lose everything on the
115 > working copy, I am comfortable resuming from the backup copy, knowing that
116 > I can redo anything changed between them in a reasonable time, should it
117 > come to that.
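
A minimal sketch of the "erase the copy and recopy" snapshot step, assuming the backup partition is already formatted and mounted. The function name and the mount point in the usage comment are hypothetical, and this is not necessarily how the rootmirror described above is maintained. It relies on find's -mindepth and -delete and on cp -ax as found in GNU findutils and coreutils.

```shell
#!/bin/sh
# Sketch only: refresh_mirror and the example mount point are hypothetical.

# Erase everything under <dest>, then copy <src> into it, staying on
# a single filesystem for both the delete and the copy.
refresh_mirror() {
    src=$1
    dest=$2
    # Refuse obviously dangerous targets.
    case $dest in
        ''|/) echo "refusing to erase '$dest'" >&2; return 1 ;;
    esac
    [ -d "$src" ] && [ -d "$dest" ] || {
        echo "source and destination must be existing directories" >&2
        return 1
    }
    # Remove the old snapshot, dotfiles included, without crossing mounts.
    find "$dest" -mindepth 1 -xdev -delete || return 1
    # -a preserves ownership, permissions, timestamps and symlinks;
    # -x keeps cp from descending into other mounted filesystems.
    cp -ax "$src/." "$dest/"
}

# Example (hypothetical mount point for the backup root partition):
#   refresh_mirror / /mnt/rootmirror
```

Because of -x, separately mounted /usr or /var would each need their own call, and the backup's fstab and boot loader entry still have to point at the copy before it can be booted.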
118 >
119 > If you had been doing this, then you wouldn't be sweating it now, as you'd
120 > just have booted your backup copy and resumed from there. Thus, consider
121 > setting up your system that way once you are back up and running, so you
122 > aren't left in that sort of situation ever again. (Of course, if your
123 > hard drive dies, that's another matter. Here, I use a 4-disk RAID-6 to
124 > address that problem -- I can lose any two of the four hard drives
125 > without losing anything vital. It's software RAID, so if the board goes,
126 > I can buy another board, install the drives and CPUs, rebuild my kernel
127 > for the new board using an emergency CD, and be up and running once again.
128 > That is, however, about the only case where I'd have to use the emergency
129 > CD, as in the other cases, I should still be able to boot to the backup
130 > root snapshot and recover from there.)
131 >
132 > Good luck! I hope it /is/ just glibc, as that's scary to recover from
133 > when the problem occurs, but not the end of the world. If it's not glibc,
134 > things get rather more complex, but all evidence so far says that's what
135 > it is.
136 >
137 > --
138 > Duncan - List replies preferred. No HTML msgs.
139 > "Every nonfree program has a lord, a master --
140 > and if you use the program, he is your master." Richard Stallman
141
142 --
143 Frank Castle is dead!
144 Call me 'The PUNISHER'!
