-=====================================================================-
- Project IDFetch - Weekly report #8-9 ("Once upon a time...") -
-=====================================================================-

Once upon a time there were no computers, and nobody knew what a gnome[1]
or a dEmon[2] looks like. Today even kids know this, but I still bumped
into a problem: I cannot see a dEmon. It all started when I was
trying to play the "Roshambo game"[3] with the segget dEmon.

First, I tried to fork[4] the curly[5] daemon twice and it kept
punching me in the nose, so I thought my TTL[6] would rapidly decrease.
I understood that it's not such an easy thing to win a fight against
someone you cannot see. And when the daemon obtained Python[7-8] support
and started to spawn[9] zombies[10], I got even more problems.
Conscience was telling me that I must play by the rules, but consciousness
was sure that the daemon doesn't always abide by the protocol[11]. I tried
to follow the thread[12-13], but the dEmon was running like a ghost[13],
so I almost got lost in the thicket of logs[14] and trees[15] :-(

Anjuta[16] came to my rescue and helped me improve my tools, so I could
see what the daemon does. Unfortunately, curses[17-18] usually don't work
on dEmons, and I really needed pure magic to win this game. So I
learned "Mutex"[19], "Rainbow Colors"[20] and some other tricks.

In the meantime I found myself knowing more and more about the dEmon,
but this was not enough, and I had to prepare good arguments if I was
going to talk to the dEmon. Here they are:

1. For segget daemon:
Command line arguments:
--no-daemon
--conf-dir=specify_conf_dir_here
Arguments are optional. If no arguments are provided, segget will run in daemon
mode and read its configuration files from the /etc/seggetd dir.

2. For request tool:
--pkglist-file
E.g.:
$request --pkglist-file=/home/user/mypkg.list

3. For tuiclient:
--wait-distfile=distfile_name
tuiclient checks the distfile status and returns when the distfile is downloaded or not in the queue.

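To make the daemon defaults above concrete, here is a minimal Python sketch of how the two segget options could be parsed. It is only an illustration, not segget's actual parser: the helper name parse_segget_args and the returned dictionary are assumptions made for this example; only the option names and the /etc/seggetd default come from the list above.

```python
import argparse

DEFAULT_CONF_DIR = "/etc/seggetd"  # default directory stated in this report


def parse_segget_args(argv):
    """Hypothetical helper mirroring the two documented segget options."""
    parser = argparse.ArgumentParser(prog="segget")
    # --no-daemon keeps the process in the foreground.
    parser.add_argument("--no-daemon", action="store_true")
    # --conf-dir overrides the directory with configuration files.
    parser.add_argument("--conf-dir", default=DEFAULT_CONF_DIR)
    args = parser.parse_args(argv)
    return {"daemon_mode": not args.no_daemon, "conf_dir": args.conf_dir}


print(parse_segget_args([]))
print(parse_segget_args(["--no-daemon", "--conf-dir=/tmp/segget-conf"]))
```

With no arguments the sketch reports daemon mode and the default conf dir, which is the behaviour described for segget above.
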
By the way, here are the features added to the segget daemon and tuiclient
during this period:

1. DAEMON
=========
1.1. Options:
-------------
Add daemon mode to segget.
Add /etc/init.d/seggetd script to start|stop|restart|status the segget daemon.
Check all checksums that are set; checksums are optional.
Consider a distfile failed if one of its segments has failed.
Fixed: if only local mirrors were available and all of them failed to download
a distfile, the distfile still had DWAITING status, because attempt_limit wasn't reached.

Add CoralCDN support as an option to network#.conf files (section [mode]).

Add options FOLLOW_LOCATION and MAX_REDIRS to network#.conf files.

SYNOPSIS: FOLLOW_LOCATION = 0 | 1
When set to 1, this parameter tells segget to follow any Location: header that
the server sends as part of an HTTP header. This means that segget will re-send
the same request to the new location and follow new Location: headers
until no more such headers are returned. MAX_REDIRS can be used to limit the
number of redirects segget will follow.
Default:
follow_location=1

MAX_REDIRS
The set number is the redirection limit. If that many redirections have
been followed, the next redirect will cause an error. This option only makes
sense if FOLLOW_LOCATION is used at the same time.
Setting the limit to 0 makes segget refuse any redirect.
Minimum value: 0
Maximum value: 100
Default:
max_redirs=5

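The interplay of the two options can be sketched in a few lines of Python. This is a toy simulation and not segget code: the function follow_redirects and the dictionary that stands in for the servers' Location: headers are made up for illustration. It only shows the counting rule described above (the redirect after the limit is an error, and a limit of 0 refuses every redirect).

```python
def follow_redirects(start_url, redirects, follow_location=1, max_redirs=5):
    """Simulate redirect handling.

    `redirects` is a hypothetical stand-in for the network: it maps a URL
    to the target of the Location: header that URL would answer with.
    """
    url = start_url
    followed = 0
    while url in redirects:
        if not follow_location:
            return url  # redirects are not followed, stay on the first answer
        if followed >= max_redirs:
            raise RuntimeError("too many redirects (limit %d)" % max_redirs)
        url = redirects[url]
        followed += 1
    return url


hops = {"http://a/f": "http://b/f", "http://b/f": "http://c/f"}
print(follow_redirects("http://a/f", hops))
```

With the default limit of 5 the two hops are followed and the final URL is returned; with max_redirs=1 the second hop raises the error.
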
Add BIND_LOCAL_PORT and BIND_LOCAL_PORT_RANGE options to network#.conf files.

BIND_LOCAL_PORT
This sets the local port number of the socket used for the connection. This option
can be used in combination with BIND_INTERFACE, and it is recommended to
set BIND_LOCAL_PORT_RANGE as well when this is set. Set to 0 to disable
binding. Valid port numbers are 1 - 65535.
Minimum value: 0 (no binding)
Maximum value: 65535
Default:
bind_local_port=0

BIND_LOCAL_PORT_RANGE
If BIND_LOCAL_PORT=0 this option is ignored.
This is the number of attempts segget should make to find a
working local port number. It starts with the given BIND_LOCAL_PORT and adds
one to the number for each retry. Setting this to 1 or below makes segget
try only the exact port number. Port numbers by nature are scarce
resources that will be busy at times, so setting this value too
low might cause unnecessary connection setup failures.
Minimum value: 1
Maximum value: 65535
Default:
bind_local_port_range=20

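As a rough illustration of the retry rule above, the following Python sketch lists the local ports that would be tried for a given pair of settings. The function name candidate_local_ports is hypothetical, and stopping at port 65535 is an assumption of this sketch; the report does not say what segget does when the range runs past the last valid port.

```python
def candidate_local_ports(bind_local_port=0, bind_local_port_range=20):
    """Return the local ports to try, in order (illustrative only)."""
    if bind_local_port == 0:
        return []  # 0 disables binding, so the range is ignored
    attempts = max(1, bind_local_port_range)  # 1 or below: exact port only
    # Assumption: never go past the highest valid port number.
    last = min(65535, bind_local_port + attempts - 1)
    return list(range(bind_local_port, last + 1))


print(candidate_local_ports(50000, 3))
```

For bind_local_port=50000 and bind_local_port_range=3 the candidates are 50000, 50001 and 50002.
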
Add option PROXY_TYPE to network#.conf files.

SYNOPSIS: PROXY_TYPE = 0 | 1 | 2 | 3 | 4 | 5
0 - HTTP
1 - HTTP_1_0
2 - SOCKS4
3 - SOCKS4a
4 - SOCKS5
5 - SOCKS5_HOSTNAME
Specify the type of the proxy.
Default:
proxy_type=0

1.2. Proxy-fetcher
------------------
Implement checks for both queues (proxy_fetcher and request_server).

There are 2 queues: the proxy_fetcher queue and the request_server queue.

Note: segget processes the request_server queue first and, if no segment was
chosen, switches to the proxy_fetcher queue.

Before adding a distfile to either queue, it's necessary to
check both queues, since the distfile may already be in one of them.

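The two rules above (check both queues before adding, serve request_server first) can be sketched as follows. This is a simplified Python model, not the daemon's code: it queues whole distfile names instead of segments, and the names enqueue_distfile and choose_next are invented for this example.

```python
from collections import deque

# Simplified stand-ins for the two queues; segget itself works with segments.
request_server_queue = deque()
proxy_fetcher_queue = deque()


def enqueue_distfile(name, queue):
    """Add `name` to `queue` unless it already waits in either queue."""
    if name in request_server_queue or name in proxy_fetcher_queue:
        return False  # duplicate: the distfile is already scheduled
    queue.append(name)
    return True


def choose_next():
    """Serve the request_server queue first, then fall back to proxy_fetcher."""
    for queue in (request_server_queue, proxy_fetcher_queue):
        if queue:
            return queue.popleft()
    return None
```

In this model, a distfile first added through the proxy_fetcher queue is refused when the request_server queue asks for it again, and choose_next() drains request_server entries before touching proxy_fetcher ones.
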
1.3. Python scripting
---------------------
Add [scripting_and_scheduling] section to segget.conf file.
[scripting_and_scheduling]
Segget provides Python scripting functionality to support scheduling.
Each time segget tries to start a new connection for a certain network, it calls
a Python script (client.py) to accept or reject this connection and,
if necessary, adjust its settings.

PYTHON_PATH
Define the path to python.
Default:
python_path=/usr/bin/python

SCRIPTS_DIR
Define the path to the dir with Python scripts. Before establishing a connection for
a particular segment via network#, segget checks SCRIPTS_DIR.
If SCRIPTS_DIR contains a net#.py file, segget will launch the schedule() function
from this file to apply settings for the connection and accept or reject this
segment for the moment. The net#.py file is a Python script file
with a user-written schedule() function.
It's necessary to import functions before using get("variable"),
set("variable",value), accept_segment() and reject_segment() in schedule().
The get() function can obtain values for the following variables:
connection.num, connection.url, connection.max_speed_limit,
network.num, network.mode, network.active_connections_count,
distfile.name, distfile.size, distfile.dld_segments_count,
distfile.segments_count, distfile.active_connections_count,
segment.num, segment.try_num, segment.size, segment.range
The set() function can change connection.max_speed_limit; see the example:
-----------------EXAMPLE STARTS-----------------
from functions import *
import time

def schedule():
    localtime = time.localtime(time.time())
    hour = localtime[3]
    # disable downloading distfiles that have size more than 5 000 000 bytes
    # from 8-00 to 22-00.
    if hour >= 8 and hour < 22 and get("distfile.size") > 5000000:
        print("reject because distfile is too big")
        reject_segment()
        # stop here so the rejected segment is not accepted below
        return
    # set speed limit 50 000 cps for distfiles larger than 1 000 000 bytes
    if get("distfile.size") > 1000000:
        print("limit connection speed")
        # the variable name is passed as a string, as with get()
        set("connection.max_speed_limit", 50000)
    accept_segment()
-----------------EXAMPLE ENDS-----------------
In the example above, localtime is the following tuple:
Index  Attribute  Values
0      tm_year    e.g.: 2008
1      tm_mon     1 to 12
2      tm_mday    1 to 31
3      tm_hour    0 to 23
4      tm_min     0 to 59
5      tm_sec     0 to 61 (60 or 61 are leap-seconds)
6      tm_wday    0 to 6 (0 is Monday)
7      tm_yday    1 to 366 (Julian day)
8      tm_isdst   -1, 0, 1 (-1 means the library determines DST)
Therefore localtime[3] provides the hour.
A segment will be accepted by default if it was neither accepted nor rejected
during the schedule() function.
segget saves the resulting stdout and stderr logs in the log folder
separately for each network. Hence, if there's an error in the net3.py file, the Python
error message will be saved to net3_script_stderr.log. Output of print will
be saved in net3_script_stdout.log.
Default:
scripts_dir=./scripts

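Since a net#.py script only runs inside segget, it can help to dry-run the scheduling policy first. The sketch below does that in plain Python with hand-written stand-ins for get(), set(), accept_segment() and reject_segment(). These stand-ins, the state dictionary and the hour parameter are assumptions made for testing only; the real functions come from segget's functions module and may behave differently.

```python
# Hypothetical stand-ins for the names that "from functions import *" provides.
state = {"distfile.size": 7000000, "connection.max_speed_limit": 0}
decision = []


def get(name):
    return state[name]


def set(name, value):  # shadows the builtin on purpose, like the script API
    state[name] = value


def accept_segment():
    decision.append("accept")


def reject_segment():
    decision.append("reject")


def schedule(hour):
    # Policy from the example's comments; the hour is passed in so that
    # the dry run does not depend on the current time.
    if 8 <= hour < 22 and get("distfile.size") > 5000000:
        reject_segment()
        return
    if get("distfile.size") > 1000000:
        set("connection.max_speed_limit", 50000)
    accept_segment()


schedule(hour=12)  # daytime, big distfile
print(decision[-1])
schedule(hour=23)  # night, big distfile
print(decision[-1], state["connection.max_speed_limit"])
```

In this dry run the 7 000 000 byte distfile is rejected at noon and accepted at 23:00 with the speed limit set to 50000.
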
SCRIPT_SOCKET_PATH
Segget uses AF_UNIX domain sockets for communication with Python.
Specify the path for the socket on your filesystem.
Default:
script_socket_path=/tmp/segget_script_socket

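For readers who have not used AF_UNIX sockets, here is a small self-contained Python sketch of two endpoints talking over a socket file. It only demonstrates the socket type named above. The temporary path and both messages are invented for this example; the actual wire format between segget and its scripts is not described in this report.

```python
import os
import socket
import tempfile

# Hypothetical socket path in a fresh temporary directory.
sock_path = os.path.join(tempfile.mkdtemp(), "demo_script_socket")

# "Daemon" side: bind the socket file and wait for a connection.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

# "Script" side: connect to the same path.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
conn, _ = server.accept()

client.sendall(b"get distfile.size")  # made-up request, not segget's protocol
request = conn.recv(1024)
conn.sendall(b"7000000")              # made-up reply
reply = client.recv(1024)

for s in (conn, client, server):
    s.close()
os.unlink(sock_path)
print(request, reply)
```

The socket file lives on the filesystem, which is why script_socket_path has to point to a location the daemon can create and the script can reach.
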
1.4. Logs
---------
Add "none" as an option for log files.

Add explanations for CURL error codes to logs.

Add options GENERAL_LOG_TIME_FORMAT, ERROR_LOG_TIME_FORMAT and DEBUG_LOG_TIME_FORMAT to segget.conf file.

GENERAL_LOG_TIME_FORMAT
Set the time format for the general log as a string containing any combination of
regular characters and special format specifiers. These format specifiers (the
same set as in C strftime()) are replaced with the corresponding values of the
time being logged. They all begin with a percent (%) sign, and are:
%a   Abbreviated weekday name                    [For example: Thu]
%A   Full weekday name                           [For example: Thursday]
%b   Abbreviated month name                      [For example: Aug]
%B   Full month name                             [For example: August]
%c   Date and time representation                [For example: Thu Aug 23 14:55:02 2001]
%d   Day of the month (01-31)                    [For example: 23]
%H   Hour in 24h format (00-23)                  [For example: 14]
%I   Hour in 12h format (01-12)                  [For example: 02]
%j   Day of the year (001-366)                   [For example: 235]
%m   Month as a decimal number (01-12)           [For example: 08]
%M   Minute (00-59)                              [For example: 55]
%p   AM or PM designation                        [For example: PM]
%S   Second (00-61)                              [For example: 02]
%U   Week number with the first Sunday
     as the first day of week one (00-53)        [For example: 33]
%w   Weekday as a decimal number with
     Sunday as 0 (0-6)                           [For example: 4]
%W   Week number with the first Monday as
     the first day of week one (00-53)           [For example: 34]
%x   Date representation                         [For example: 08/23/01]
%X   Time representation                         [For example: 14:55:02]
%y   Year, last two digits (00-99)               [For example: 01]
%Y   Year                                        [For example: 2001]
%Z   Timezone name or abbreviation               [For example: CDT]
%%   A % sign                                    [For example: %]

For instance: general_log_time_format=Time: %m/%d %X

Default:
general_log_time_format=%m/%d %X

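A quick way to preview a format string before putting it into segget.conf is Python's time.strftime(), which understands the same specifiers. The sketch below formats the moment used in the table's examples (Thu Aug 23 14:55:02 2001). It is only a preview aid; how segget itself renders %X and other locale-dependent specifiers depends on the daemon's locale.

```python
import time

# 2001-08-23 14:55:02, a Thursday (tm_wday=3), day 235 of the year.
moment = (2001, 8, 23, 14, 55, 2, 3, 235, -1)

default_stamp = time.strftime("%m/%d %X", moment)        # the default format
custom_stamp = time.strftime("Time: %m/%d %X", moment)   # the "For instance" format
print(default_stamp)
print(custom_stamp)
```

Regular characters such as "Time: " are copied to the output unchanged, and only the %-specifiers are substituted.
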
ERROR_LOG_TIME_FORMAT
Set the time format for the error log as a string containing any combination of
regular characters and special format specifiers. See GENERAL_LOG_TIME_FORMAT
for details on format specifiers.
Default:
error_log_time_format=%m/%d %X

DEBUG_LOG_TIME_FORMAT
Set the time format for the debug log as a string containing any combination of
regular characters and special format specifiers. See GENERAL_LOG_TIME_FORMAT
for details on format specifiers.
Default:
debug_log_time_format=%m/%d %X

2. REQUEST TOOL
===============
Add request tool.

The request tool reads a list of distfiles from the ./pkg.list file and asks
the seggetd daemon to download the distfiles from the list.

3. TUICLIENT
============
Add network_type for each connection to tui.
Add ETA, AVG speed and active/total connections to tui.
Add segment counters to stats and tui.
Add connection num to totals.
Add log and error_log windows to tuiclient.
Add distfiles window to tuiclient that shows the progress of distfile downloads,
including their status: added/waiting/downloading/downloaded/failed/rejected by script etc.

[1] Gnome http://www.gnome.org/
[2] dEmon http://www.clker.com/cliparts/5/1/b/d/11954315391526924611beastie_freebsd_daemon_r_02.svg.med.png
[3] Roshambo game http://www.erikandanna.com/Humor/FlashStuff/SouthPark/roshamboN.swf
[4] fork http://en.wikipedia.org/wiki/Fork_%28software_development%29
[5] curl http://curl.haxx.se/
[6] TTL http://en.wikipedia.org/wiki/Time_to_live
[7] Python http://loyalkng.com/wp-content/uploads/2010/03/adam-apple-bizarro-cartoon-comic-tampon-chandelier-pc-mac-snake-eve.jpg
[8] Python http://www.python.org/
[9] spawn http://en.wikipedia.org/wiki/Spawn_(computing)
[10] zombies http://en.wikipedia.org/wiki/Zombie_process
[11] protocol http://en.wikipedia.org/wiki/Communications_protocol
[12] thread http://en.wikipedia.org/wiki/Thread_(computer_science)
[13] ghost http://www.youtube.com/watch?v=9WrEDyIzdjY from the 3rd minute
[14] logs http://www.nawwal.org/~mrgoff/photojournal/2004/winspr/pictures/03-20nurselog.jpg
[15] pstrees http://en.wikipedia.org/wiki/Pstree
[16] Anjuta http://www.anjuta.org/
[17] curses http://en.wikipedia.org/wiki/Curse
[18] Ncurses http://en.wikipedia.org/wiki/Ncurses
[19] Mutex http://en.wikipedia.org/wiki/Mutual_exclusion
[20] Rainbow Colors http://idfetch.isgreat.org/_content2/tuiclient_rainbow_colors.jpg see "DISTFILES" window.

Best regards,
Kostyantyn aka simka