Hi,

On Thu, 24 Aug 2017 18:27:22 -0300 Francisco Ares wrote:
> Hi, All.
>
> This is a rather special case, so I don't expect much, but who knows?
>
> I've built a Gentoo x86-64 system for an embedded application.
> |
> After a lot of updates, which I am unable to track, it stopped working
> as it used to.
>
> There is the development system, fully loaded with packages used for
> development, and the production system, which doesn't need all of those.
> |
> There is a line in /etc/inittab on both systems that auto-logs in the
> production system user and starts the programs we need (from its
> ".bash_profile" and ".xinitrc"):
>
> c6:2345:respawn:/sbin/agetty -a production-user 38400 tty6 linux
> |
> The development system starts a WindowMaker session, and the production
> system starts a program that controls the rest of the hardware of this
> embedded system, with an X11 graphical interface. That program runs
> normally when simulated on the development system.
> |
> The development system runs smoothly. The production system, built by
> removing the files of unwanted packages and creating a squashfs image
> of the stripped-down root partition, behaves strangely at boot:
> |
> It shows the initialization messages as expected, but when the auto-login
> and the start of the controller program should take place, it completely
> stalls until I plug in a USB keyboard and press, a few times, some of the
> key combinations that switch to a text console and back to X11
> (Ctrl-Alt-F1 and Ctrl-Alt-F6); only then do things resume as expected.
> |
> As you might suspect, there is no keyboard for the production system ;-) .
>
> As a matter of fact, I don't know where the stall takes place, because
> when I try to switch to a text console to see the logs, it switches back
> to X11 and starts our program. By the way, the logs just show that the
> events occurred at later times than expected.
> |
> Although the squashfs is read-only, some main directories are set up so
> that unionfs overlays a read-write tmpfs directory on top of the
> read-only directory and mounts the result on the main directory. This
> provides a way of creating temporary files and has been working for a
> few years now.
> |
> For instance, in "/etc/fstab":
>
> tmpfs /.etc.rw tmpfs defaults,mode=755 0 0
> union /etc unionfs default_permissions,allow_other,use_ino,nonempty,suid,cow,dirs=/.etc.rw=rw:/.etc.ro=ro 0 0
> |
> There is a "/.etc.ro" with a copy of all the files present in the regular
> "/etc", a "/.etc.rw" directory to be mounted as tmpfs, and the original
> "/etc" directory, which needs to be there at boot, even before all of
> this is mounted.
>
> Does anyone have a clue?
|
Try to dissect your problem. Start by removing squashfs and all the
tmpfs/unionfs manipulations. Create the same image, but on a "normal"
writable file system, and see how it goes. It may be an fs-related
bug, or maybe you removed too many files and some "undesired" packages
are actually mandatory.
|
If you have some form of snapshots of your changes, you can try to
bisect them in a git bisect way.
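
To illustrate what I mean by bisecting, here is a minimal sketch in
Python. It assumes your snapshots can be ordered oldest to newest, that
the oldest is known good and the newest known bad, and that the problem
stays once it appears. The names first_bad and is_good are hypothetical;
is_good stands for whatever manual or scripted check you use (for
example, "build the image from this snapshot and see if it boots without
stalling").

```python
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def first_bad(snapshots: Sequence[S], is_good: Callable[[S], bool]) -> S:
    """Return the earliest snapshot for which is_good() is False.

    Assumptions (hypothetical setup, not taken from the original report):
    snapshots are ordered oldest to newest, snapshots[0] is known good,
    snapshots[-1] is known bad, and every snapshot after the first bad
    one is bad as well.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least one good and one bad snapshot")

    lo, hi = 0, len(snapshots) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(snapshots[mid]):
            lo = mid  # problem was introduced after mid
        else:
            hi = mid  # problem was introduced at or before mid
    return snapshots[hi]
```

Each round halves the range, so even a few dozen snapshots need only a
handful of test boots.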
|
Another approach is to run the X server (or any other app suspected of
being the troublemaker) under strace (or attach strace to a running
process) and see what is going on. You will get a lot of low-level
information and extensive filtering will be required; strace is
capable of that, but you will need to dig into its documentation.
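
As one possible way to do part of that filtering, here is a small Python
sketch that looks for long gaps between consecutive lines of a trace.
It assumes the trace was recorded with timestamps, e.g. with
"strace -f -tt -o FILE ...", so that each line starts with an optional
PID followed by HH:MM:SS.microseconds. The function name find_stalls and
the one-second threshold are my own choices for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Optional PID prefix (written by "strace -f"), then an HH:MM:SS.usec
# timestamp as produced by "strace -tt", then the rest of the line.
_LINE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\d{2}):(\d{2}):(\d{2})\.(\d+)\s+(.*)$"
)


def find_stalls(
    lines: Iterable[str], threshold: float = 1.0
) -> List[Tuple[float, str, str]]:
    """Return (gap_seconds, line_before_gap, line_after_gap) tuples.

    A gap is the time between the timestamps of two consecutive trace
    lines. Lines without a recognizable timestamp are skipped.
    """
    stalls = []
    prev_time = None
    prev_text = None
    for raw in lines:
        m = _LINE.match(raw.strip())
        if not m:
            continue
        hours, minutes, seconds, frac, text = m.groups()
        now = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)
               + float("0." + frac))
        if prev_time is not None:
            gap = now - prev_time
            if gap < 0:
                gap += 86400.0  # the trace crossed midnight
            if gap >= threshold:
                stalls.append((gap, prev_text, text))
        prev_time, prev_text = now, text
    return stalls
```

Feed it the lines of the trace file; the call printed just before a long
gap is usually the one that was blocked, so it is a good place to start
looking. With several processes in one trace the lines are interleaved,
so treat the result as a hint rather than proof.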
|
Best regards,
Andrew Savchenko