On Sunday, 9 June 2019 18:55:34 BST Grant Taylor wrote:
> On 6/9/19 2:56 AM, Mick wrote:
> > This sounds as if it may be related to a move from an older gcc to
> > a newer version.
>
> I'm not sure it's related to a gcc version:
>
> # gcc-config -l
> [1] x86_64-pc-linux-gnu-6.4.0 *
> [2] x86_64-pc-linux-gnu-8.3.0
>
> I think that gcc 8.3 might have been selected and I reverted to 6.4
> thinking that it might have been part of the problem. I have since done
> an emerge -DuNeq @world with gcc 6.4 and the problem persists.
>
> > Checking my understanding:
> >
> > 1. The old modules, compiled with the old gcc and toolchain worked
> > fine.
>
> Correct.
>
> > 2. The new modules, compiled with the new gcc but old libtool,
> > binutils and glibc worked (usually you update these or @system,
> > before you update the whole world).
>
> Correct.
>
> > 3. The new modules, compiled with the new gcc and toolchain rebuilt
> > the second time do not work (this would use libtools, binutils, glibc,
> > now compiled with the new gcc).
>
> Correct.
>
> > 4. All of the above happens with the old kernel, which was not rebuilt
> > with the new toolchain.
>
> Correct.
>
> > 5. New kernel(s) compiled thereafter will not boot.
>
> Correct.
>
> > You have not mentioned if you upgraded gcc.
>
> I think that the first emerge -DuNeq @world did pull in a new gcc. But
> I have since selected gcc 6.4 as part of diagnostics. (See above.)
>
> > The error you get about modules failing to load sounds like a
> > path/symlink error, or a linux headers error, or a change of arch.
>
> I don't think it's a symlink error. (I've configured things to not
> automatically update the sym-link.)
>
> # ls -la /usr/src/linux
> lrwxrwxrwx 1 root root 22 Sep 8 2018 /usr/src/linux -> linux-4.9.76-gentoo-r1/
> # uname -a
> Linux REDACTED 4.9.76-gentoo-r1 #1 SMP Thu Nov 15 22:23:44 MST 2018 x86_64 Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz GenuineIntel GNU/Linux
>
> As you can see, the machine has enough CPU that I can let it do the
> following to make sure that things are consistent. (At least I think
> that's what's happening.)
>
> emerge -DuNeq @world && emerge --depclean && revdep-rebuild
>
> That's my SOP. If that fails I usually try a --resume to see if the
> problem repeats, and if it's at the same place. If that fails for some
> reason, I'll fall back to a @system. Usually the failure is caused by
> something that I've done, disk space, ZFS version issues, etc.
>
> > Since both vbox and zfs modules fail to boot I would not think this
> > is a zfs isolated problem.
>
> Agreed.
>
> > Have you tried forcing the loading of these modules?
> >
> > modprobe --force --verbose <module_name>
>
> No, not yet. I've never had any success forcing the kernel to load modules.
>
> > What errors do you get with the new non-booting kernels?
>
> # modprobe --force --verbose vboxdrv
> insmod /lib/modules/4.9.76-gentoo-r1/misc/vboxdrv.ko
> modprobe: ERROR: could not insert 'vboxdrv': Exec format error
>
> dmesg reports the following for each attempt to (force) load the module.
>
> module: vboxdrv: Unknown rela relocation: 4
>
> Mick, I get the impression that you've got the correct understanding of
> my current situation. I'm interested to learn what you think should be done.

If you haven't done it already, perhaps have a look in
/lib/modules/4.9.76-gentoo-r1/misc/ to check that the VBox modules are
present and owned by root:root with 0644 access rights.
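
For what it's worth, here is a rough sketch of how that check could be scripted. It is only an illustration: check_module_perms is a helper name I made up, and it assumes GNU stat (for the -c option), which is what a Gentoo box would normally have. The optional second argument exists only so the function can be tried on files you own.

```shell
# Report .ko files in DIR whose owner:group or mode differ from the expected
# values. check_module_perms is a hypothetical helper; stat -c needs GNU stat.
check_module_perms() {
    dir=$1
    want_owner=${2:-root:root}
    status=0
    for f in "$dir"/*.ko; do
        # An unmatched glob stays literal, so a missing file means "no modules".
        if [ ! -e "$f" ]; then
            echo "no .ko files found in $dir"
            return 1
        fi
        meta=$(stat -c '%U:%G %a' "$f")
        if [ "$meta" != "$want_owner 644" ]; then
            echo "unexpected: $f is $meta"
            status=1
        fi
    done
    return $status
}

# Example: check_module_perms /lib/modules/4.9.76-gentoo-r1/misc
```

It prints nothing and returns 0 when every module matches, so any output at all is worth a closer look.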

Since you have not cross compiled any of these modules, altered your make.conf
CFLAGS, or messed with the linux-headers, I can't see what else might have
gone sideways. :-/

I'm not saying this is what you should do, but unless someone more learned
than myself chimes in with better advice, this is how I would go about it:

1. Make a backup of your system in case you can't get back into it and need
to restore from it.
2. Sync portage and upgrade gcc to the latest stable version. Switch to it.
3. Rebuild libtool, binutils, glibc.
4. Rebuild @system.
5. Copy your present kernel config to the latest stable kernel which also
deals with the MDS Intel vulnerability; change the symlink; run 'make
oldconfig' on the latest kernel; build it and install it. Don't forget to
emerge @module-rebuild.
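
Steps 2 to 5 could be sketched roughly as below. Treat it as an outline rather than a recipe: it only prints the commands unless DRY_RUN=0 is set, the run and upgrade_plan helpers are names I made up, and NEW_KERNEL and GCC_PROFILE are placeholders you would have to fill in yourself (check gcc-config -l and /usr/src for the real values). It also assumes the new kernel sources are already installed under /usr/src.

```shell
#!/bin/sh
# Dry-run sketch of steps 2-5; it only prints commands unless DRY_RUN=0.
# NEW_KERNEL and GCC_PROFILE are placeholders, not values from this thread.
NEW_KERNEL=${NEW_KERNEL:-linux-X.Y.Z-gentoo}
GCC_PROFILE=${GCC_PROFILE:-2}

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

upgrade_plan() {
    # Step 1, the backup, is left to whatever tool you already trust.
    # Step 2: sync portage, upgrade gcc, switch to it, refresh the environment.
    run emerge --sync &&
    run emerge --update --oneshot gcc &&
    run gcc-config "$GCC_PROFILE" &&
    run . /etc/profile &&
    # Step 3: rebuild the rest of the toolchain with the new compiler.
    run emerge --oneshot libtool binutils glibc &&
    # Step 4: rebuild @system.
    run emerge --emptytree @system &&
    # Step 5: carry the config over, move the symlink, build and install.
    run cp /usr/src/linux/.config "/usr/src/$NEW_KERNEL/.config" &&
    run ln -sfn "$NEW_KERNEL" /usr/src/linux &&
    run make -C /usr/src/linux oldconfig &&
    run make -C /usr/src/linux &&
    run make -C /usr/src/linux modules_install install &&
    run emerge @module-rebuild
}

upgrade_plan
```

Chaining everything with && means a real run stops at the first failure, which leaves you with one error to troubleshoot at a time.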

If the newly built kernel won't boot, troubleshoot the error messages at boot
and perhaps keyword and try a later kernel.

The reason I would go about it this way is that ultimately you will need to
both upgrade gcc and move on to a later version kernel. I appreciate right
now may not be the right time for you, but at some point, when convenient,
you'll have to make time to deal with these errors and work through them to a
solution.

PS. It may also be worth posting in the forums and asking in IRC, as there
will be more users who may have come across your problem.

--
Regards,
Mick