Anders Thøgersen <anderslt@×××××.com> posted
20060520223006.GA9058@××××××.mydomain, excerpted below, on Sun, 21 May
2006 00:30:06 +0200:
|
> On 04:52 Fri 12 May 2006, Duncan wrote:
>> Anders posted as summarized on 12 May 2006:
>>
>> > [Repeatable segfault doing emerge sync at 51%.  Portage-2.0.54]
>>
>> [That's almost certainly a portage cache corruption issue.  Try emerge
>> --metadata.  That should just update the cache without doing the sync
>> part first.  If that fails, delete the cache and run emerge --metadata
>> again, to rebuild it.]
>
> Sorry for the late reply,...
|
Don't worry too much about the timeliness.  The problem's yours, not
mine, so it runs on your schedule.  From the other side, that's one reason
I prefer newsgroups or mailing lists to private help -- if one person
doesn't get a timely reply in, someone else likely will.  (The other big
reason is that no single person always guesses the problem right or has
the experience to fix it, and a list/newsgroup gives more folks a chance
to look at it than private mail would.)
|
> I backed up /var/cache/edb as you suggested and began emerge --metadata,
> ...  First segfault occurred at 31%.  Feeling bold I restarted the
> command and this time it went all the way to the magic 51% where it
> segfaulted as before.  From here every emerge --metadata results in a
> segfault at 51% :-/
>
> If I understand you correctly the problem of this segfault is due to a
> specific file in the portage tree.  To correct this problem must I then
> locate this file?
|
Well, locating it would help, but it may be that it isn't necessary, as
there are other ways to tackle the problem.
|
A couple things to keep in mind: (1) Portage /can/ operate without that
cache -- it's just /very/ slow.  Thus, if the problem turns out to be with
the portage version you are running, you should still be able to merge a
different version.  (2) We now know the problem reproduces from a cleared
cache.
|
At this point, with the problem reproducing from a cleared cache, the next
thing I'd want to establish is that it's not a filesystem problem.
Delete the cache again.  If you have /var or /var/cache on its own mount,
umount it and do a full fsck on it.  (Depending on whether you have
/var/log on the same mount, and on the services you are running, you may
have to switch to single user mode, or at least shut down your syslog and
perhaps other services, in order to umount /var.)  Remount and start up
your services again, or simply reboot, and try the emerge --metadata
again.  If the problem isn't yet gone, delete the cache again and
continue...
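
For what it's worth, here's a rough Python sketch of that sequence as a
dry run.  It only builds and prints the commands; nothing is executed.
The /var mount point, the /dev/sdXN device name and /var/cache/edb/dep as
the cache location are assumptions for illustration, and fsck -f assumes
an ext2/ext3 filesystem, so substitute whatever matches your layout.

```python
import shlex


def fsck_plan(mount_point, device, cache_dir="/var/cache/edb/dep"):
    """Return the shell commands for one pass of the check, in order.

    cache_dir is an assumed location for the portage cache; adjust it
    if your install keeps the cache somewhere else.
    """
    steps = [
        ["rm", "-rf", cache_dir],    # delete the (possibly corrupt) cache
        ["umount", mount_point],     # may need syslog etc. stopped first
        ["fsck", "-f", device],      # -f forces a full check (e2fsck)
        ["mount", mount_point],      # remount via the fstab entry
        ["emerge", "--metadata"],    # rebuild the cache and see if it dies
    ]
    return [" ".join(shlex.quote(arg) for arg in step) for step in steps]


if __name__ == "__main__":
    # Dry run only: print the plan so it can be reviewed before use.
    for line in fsck_plan("/var", "/dev/sdXN"):
        print(line)
```

Printing the plan first and then typing the commands by hand is
deliberate; an unattended umount of /var is not something I'd script
blindly.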
|
The next item on the checklist is the filesystem containing the portage
tree itself.  The tree can be redownloaded, so in general, it's safe to
delete.  If you run FEATURES=buildpkg, as I've often recommended on this
list (different topic, but something to look at once you get up and
running again, if you haven't already), and your $PKGDIR is in the portage
tree as it is by default (/usr/portage/packages, IIRC), you'll want to
copy or move that elsewhere first.  Depending on your internet speed and
whether you are charged per byte downloaded, you may wish to do the same
thing with $DISTDIR (/usr/portage/distfiles by default), which contains
all the source tarballs portage has downloaded.  Then delete the portage
tree, and if it's on a non-root filesystem, unmount and fsck it as well.
See below for refetching, as there's an easier way than emerge --sync when
you are fetching the entire thing.
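
As a rough illustration of the ordering, here's a small Python sketch
that works out which directories need moving out of the tree before it's
deleted.  Again it's a dry run that only returns command strings.  The
/var/tmp/keep destination is a made-up name, and the paths are the
defaults mentioned above; check your own make.conf for the real values.

```python
import posixpath
import shlex


def is_beneath(path, parent):
    """True if path lies strictly below parent in the directory tree."""
    path = posixpath.normpath(path)
    parent = posixpath.normpath(parent)
    return path.startswith(parent.rstrip("/") + "/")


def tree_removal_plan(portdir, keep_dirs, safe_dir):
    """Move anything worth keeping out of portdir, then delete portdir."""
    portdir = posixpath.normpath(portdir)
    commands = []
    for keep in keep_dirs:
        keep = posixpath.normpath(keep)
        # Directories already outside the tree survive the delete as-is.
        if is_beneath(keep, portdir):
            target = posixpath.join(safe_dir, posixpath.basename(keep))
            commands.append(["mv", keep, target])
    commands.append(["rm", "-rf", portdir])
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


if __name__ == "__main__":
    # Dry run with the default PKGDIR and DISTDIR locations.
    for line in tree_removal_plan(
        "/usr/portage",
        ["/usr/portage/packages", "/usr/portage/distfiles"],
        "/var/tmp/keep",  # hypothetical destination
    ):
        print(line)
```

If the tree is on its own mount, the rm will empty it but can't remove the
mount point itself, which is fine for this purpose.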
|
If either or both of the above are on your root filesystem, after the
deletes, reboot to your rescue solution (the liveCD or alternate boot
volume or whatever) and do the fsck from there.  The deletes aren't
absolutely necessary, but are worthwhile since the data is
redownloadable/rebuildable anyway, and if the problem /is/ a filesystem
error, it's easier to just renew the data than to try rebuilding the
files from incomplete data in lost+found.  Additionally, if there happen
to be other errors on the filesystem, and thus other files end up in
lost+found, it's easier to find the files you really /do/ need to recover
there if there's less noise from files that are more easily refetched or
recached.
|
Once you know it's not a problem with a bad filesystem, the next step
is getting a new copy of the portage tree.  Since we deleted the tree we
had, emerge --sync isn't the most efficient option, tho it would normally
do the job.  Rather, and this kills two birds with one stone as it's the
next thing to try as well, use emerge-webrsync.  This fetches a verified
snapshot tarball of the tree, taken daily, so it's not quite as up to date
as a live sync would be (it could be up to 24 hours old), but it's more
efficient than emerge --sync if you aren't starting with a mostly
up-to-date tree needing only a few changes.  Doing it this way, we test
another sync method and ensure that we get a complete copy of the tree as
well, bypassing rsync and any possibly broken files that had been causing
problems in your local copy of the tree.
|
emerge-webrsync performs an emerge --metadata after completing the tree
sync, so if it goes fine, you should be back in business.  Try another
emerge --sync and see.
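
If you'd rather script those two steps, something along these lines
should do as a minimal Python sketch.  It runs the commands in order and
stops at the first one that fails.  The run parameter is a hypothetical
hook of my own so the logic can be tried without touching portage; by
default it's just subprocess.run.

```python
import subprocess

# The two steps from the paragraph above, in order.
RECOVERY = [["emerge-webrsync"], ["emerge", "--sync"]]


def run_until_failure(commands, run=subprocess.run):
    """Run each command in turn, stopping at the first non-zero exit.

    Returns a list of (command, returncode) pairs for what actually ran.
    A process killed by a signal, such as a segfault, shows up with a
    negative returncode, so it counts as a failure here too.
    """
    results = []
    for cmd in commands:
        proc = run(cmd)
        results.append((cmd, proc.returncode))
        if proc.returncode != 0:
            break  # no point syncing on top of a failed step
    return results
```

Calling run_until_failure(RECOVERY) as root does the real thing; the last
pair in the returned list tells you where it stopped.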
|
If you are still having problems at /that/ point, having verified that
it's not a filesystem issue and having tried a completely new copy of the
tree fetched with emerge-webrsync, /then/ things start getting
interesting.  There are still some things that can be tried, but better to
wait until we know they are needed before getting worried.  The output of
emerge-webrsync, or of the next sync where the problem recurs, would be
interesting as well, so post it.  Also, at that point, it may be useful to
file a portage bug and get the opinion of the real experts.  Hopefully,
tho, that won't be necessary, as a clean filesystem and a fresh copy of
the tree will have eliminated the issue.
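
To make the output easy to post, one option is to capture it to a file as
it's produced.  Here's a minimal Python sketch under that assumption; the
capture name and the log path are my own inventions, not anything portage
provides.

```python
import subprocess


def capture(cmd, log_path):
    """Run cmd, saving its combined stdout/stderr and exit status to a file."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave errors with normal output
        text=True,
    )
    with open(log_path, "w") as log:
        log.write("$ " + " ".join(cmd) + "\n")
        log.write(proc.stdout)
        # A negative status means the process was killed by that signal
        # number, e.g. -11 for a segfault.
        log.write("[exit status %d]\n" % proc.returncode)
    return proc.returncode
```

Something like capture(["emerge", "--sync"], "sync.log") would then leave
a file that can be pasted into a reply or attached to a bug.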
|

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

-- 
gentoo-amd64@g.o mailing list