1 |
What has been done this week: |
2 |
- rewritten stage-logging algorithm. New one is quicker and more reliable. |
3 |
- changed type of internal communication sockets from SOCK_STREAM to |
4 |
SOCK_SEQPACKET. This gives no performance improvement but code became |
5 |
simpler. |
6 |
- move from select to epoll for socket events notifications in python |
7 |
server part. |
8 |
- fully rewritten ld_preload hooking library, here is a commit: |
9 |
http://git.overlays.gentoo.org/gitweb/?p=proj/autodep.git;a=commitdiff;h=d8d0c8f39f95af3647f73b5781199899e9b9529d |
10 |
. |
11 |
- test of both approaches has been unified. |
12 |
|
13 |
--- |
14 |
I've spent a lot of time debugging very interesting issue and I want |
15 |
to describe it here. |
16 |
|
17 |
One of approaches I use for logging filesystem events is ld_preload |
18 |
approach. I set LD_PRELOAD environment variable to tell OS to load my |
19 |
library before starting an application. My library has own functions |
20 |
like open, read and write and they are launched before glibc's. This |
21 |
functions tells main python process about an event through unix domain |
22 |
sockets. Name of socket file is stored in LOG_SOCKET environment |
23 |
variable. When proccess trying to launch another process LD_PRELOAD |
24 |
and LOG_SOCKET are inherited so my library is loaded before this |
25 |
another process too. |
26 |
|
27 |
But I've met a problem: in "prerm" stage of building connects to |
28 |
socket always were refused! |
29 |
|
30 |
First I think it is a bug in python server part. I moved it back from |
31 |
epoll to select method for socket events notifications. Bug not |
32 |
disappeared. Then I double-checked the code. All seems to be good, but |
33 |
bug remains. Then I reverted few commits and this not helps. |
34 |
|
35 |
Then I begin to examine the logging library. I think it is some |
36 |
threading issue. I forced library code execute in one thread. This |
37 |
gives no effect. Then I wrote small unix-socket client. Connections of |
38 |
prerm stage was refused but in same time my client was connecting |
39 |
successfully. In netstat socket was "listening". |
40 |
|
41 |
Then I began to log all environment variables and find then someone |
42 |
changes my LOG_SOCKET variable during "prerm" stage. I renamed it to |
43 |
LOG_SOCKET_<LONG_STRING> in all sources hoping this is a |
44 |
name-collision. But variable continues changing its value. The most |
45 |
interesting fact: value was changing to another valid socket name, for |
46 |
example from "/tmp/tmpFSDhf/socket_for_logging" to |
47 |
"/tmp/tmpHDvJ5/socket_for_logging". But nobody listened |
48 |
"/tmp/tmpHDvJ5/socket_for_logging". I started to believe in the higher |
49 |
powers in this moment because last socket file was created five days |
50 |
ago and notebook has been rebooted several times these days. |
51 |
|
52 |
But the explanation was simple: portage system remembers all |
53 |
environment variables while installing a package and restores them |
54 |
while removing this package!! |
55 |
|
56 |
Best, |
57 |
|
58 |
Alexander Bersenev |