Project IDFetch - Weekly report #1
==================================

The purpose of the project is to optimize the software installation process
by making the distfile fetcher more intelligent and using the network
connection more effectively. The idea is not to rewrite the whole Portage
system, but only the part that sits directly at the network bottleneck:
the distfile fetcher.

For more information on the project, please see the website:
http://soc.dev.gentoo.org/~simka/
or
http://idfetch.isgreat.org

Git repository for the idfetch project:
git://git.overlays.gentoo.org/proj/idfetch
http://git.overlays.gentoo.org/gitroot/proj/idfetch

Git repository for changes to Portage:
git://git.overlays.gentoo.org/proj/portage-idfetch
http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


You can share your ideas on idfetch by joining the IRC channel
#gentoo-idfetch on freenode or by sending me an email.

====================
The progress report:
====================

1) I started by joining the mainstream and getting pretty nervous about
whether I can manage this (for some people seemingly easy) project.
After importing the chocolate and coffee modules, I tried to switch to
more productive things ;)

2) The first thing to do was to export some data from the current Portage
system: basename, mirror URLs, size and checksums. I ended up with some
changes to the fetch.py file that gave me the following structure:

# list of pkgs to be installed
Tidfetch_pkg_list : list of Tidfetch_pkg;

Tidfetch_pkg : dict
    ['pkg_name'] : string;
    ['distfile_list'] : list of Tidfetch_distfile;

Tidfetch_distfile : dict
    ['name'] : string;
    ['url_list'] : list of string;
    ['size'] : int;
    ['RMD160'] : string;
    ['SHA1'] : string;
    ['SHA256'] : string;

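As an illustration only, the layout above could be built in Python roughly
like this. The helper names make_distfile and make_pkg are made up for this
sketch and are not taken from fetch.py; the sample values are the ones from
the util-macros entry of the export file.

```python
# Hypothetical helpers: make_distfile and make_pkg are illustrative names,
# not functions from fetch.py.

def make_distfile(name, url_list, size, checksums):
    """Return one Tidfetch_distfile-style dict."""
    entry = {"name": name, "url_list": list(url_list), "size": int(size)}
    # Copy only the three digests listed in the layout.
    for algo in ("RMD160", "SHA1", "SHA256"):
        entry[algo] = checksums[algo]
    return entry


def make_pkg(pkg_name, distfiles):
    """Return one Tidfetch_pkg-style dict."""
    return {"pkg_name": pkg_name, "distfile_list": list(distfiles)}


distfile = make_distfile(
    name="util-macros-1.6.1.tar.bz2",
    url_list=[
        "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
    ],
    size=62130,
    checksums={
        "RMD160": "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952",
        "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
        "SHA256": "5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645",
    },
)

# Tidfetch_pkg_list: a plain list of package dicts.
pkg_list = [make_pkg("util-macros-1.6.1", [distfile])]
```

Only dicts, lists, strings and ints are used, so the structure can be
serialized without any custom classes.
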
3) I started using the pickle module to exchange data between fetch.py and
twrapper (a threaded wrapper for simultaneous downloads).

4) Following advice from Robin H. Johnson (my mentor), I replaced the pickle
module with json [1]. So now pkg_list.list (the export file) looks like this:

[
    {
        "distfile_list": [
            {
                "RMD160": "10a19a10d0388bc084a7c1d3da845068d7169054",
                "SHA1": "2a7198e8178b2e7dba87cb5794da515200b568f5",
                "SHA256": "0eb6f356119f2e49b2563210852e17f57f9dcc5755f350a69a46a0d641a0c401",
                "name": "ghostscript-fonts-std-8.11.tar.gz",
                "size": 3752871,
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    ...........more mirrors - skipped ......
                ]
            }
        ],
        "pkg_name": "gnu-gs-fonts-std-8.11"
    },
    {
        "distfile_list": [
            {
                "RMD160": "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952",
                "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "SHA256": "5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645",
                "name": "util-macros-1.6.1.tar.bz2",
                "size": 62130,
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    ...........more mirrors - skipped ......
                ]
            }
        ],
        "pkg_name": "util-macros-1.6.1"
    }
]

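A minimal sketch of the JSON exchange, assuming Python's standard json
module [1]. The function names save_pkg_list and load_pkg_list and the
indent/sort_keys options are assumptions made for illustration; the actual
code in fetch.py and twrapper may differ.

```python
import json
import os
import tempfile


def save_pkg_list(pkg_list, path):
    """Write the package list as human-readable JSON (exporting side)."""
    with open(path, "w") as f:
        # indent and sort_keys are assumed options, chosen for readability.
        json.dump(pkg_list, f, indent=4, sort_keys=True)


def load_pkg_list(path):
    """Read the package list back (downloading side)."""
    with open(path) as f:
        return json.load(f)


pkg_list = [
    {
        "pkg_name": "util-macros-1.6.1",
        "distfile_list": [
            {
                "name": "util-macros-1.6.1.tar.bz2",
                "size": 62130,
                "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                ],
            }
        ],
    }
]

# Round trip through a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pkg_list.list")
    save_pkg_list(pkg_list, path)
    restored = load_pkg_list(path)
```

One practical difference from pickle is that the export file is plain text,
so it can be inspected by eye and read by tools not written in Python.
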
5) Development of a simple threaded twrapper started. Twrapper reads the
data from the pkg_list.list file and downloads the listed distfiles
simultaneously (the number of parallel downloads is set by
MAX_ACTIVE_DOWNLOADS in idfetch_settings.py).

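The idea of capping simultaneous downloads can be sketched as follows. This
is not twrapper's code: it swaps in a thread pool from the standard
concurrent.futures module, the value of MAX_ACTIVE_DOWNLOADS is an assumed
example, and fetch_distfile is a hypothetical stand-in that does no network
access.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed example value; the real setting lives in idfetch_settings.py.
MAX_ACTIVE_DOWNLOADS = 3


def fetch_distfile(distfile):
    """Hypothetical stand-in for a download; returns the requested name."""
    # A real fetcher would try the URLs in distfile["url_list"] here.
    return distfile["name"]


def fetch_all(pkg_list, max_active=MAX_ACTIVE_DOWNLOADS):
    """Fetch every distfile, running at most max_active fetches at once."""
    distfiles = [d for pkg in pkg_list for d in pkg["distfile_list"]]
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        # map() returns results in input order even though work is concurrent.
        return list(pool.map(fetch_distfile, distfiles))


pkg_list = [
    {"pkg_name": "gnu-gs-fonts-std-8.11",
     "distfile_list": [{"name": "ghostscript-fonts-std-8.11.tar.gz",
                        "url_list": []}]},
    {"pkg_name": "util-macros-1.6.1",
     "distfile_list": [{"name": "util-macros-1.6.1.tar.bz2",
                        "url_list": []}]},
]

fetched = fetch_all(pkg_list)
```

With a pool, the limit is enforced by the number of worker threads, so no
separate counter of active downloads is needed.
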
5.1) Before downloading a file, it checks whether the file already exists,
is complete and has correct checksums. If so, it skips this distfile.
Otherwise it downloads the file and verifies its checksums.

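A minimal sketch of such a pre-download check, assuming the standard hashlib
module. The names HASHLIB_NAMES, file_digest and is_complete are
hypothetical, and a digest that this Python build does not provide (for
example ripemd160 on some OpenSSL builds) is skipped here rather than
treated as a failure, which is a choice made for this sketch only.

```python
import hashlib
import os
import tempfile

# Keys used in pkg_list.list mapped to hashlib algorithm names.
HASHLIB_NAMES = {"RMD160": "ripemd160", "SHA1": "sha1", "SHA256": "sha256"}


def file_digest(path, algo, chunk_size=65536):
    """Return the hex digest of a file, or None if algo is unavailable."""
    try:
        h = hashlib.new(algo)
    except ValueError:
        return None  # algorithm not provided by this Python/OpenSSL build
    with open(path, "rb") as f:
        # Read in chunks so large distfiles are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_complete(path, distfile):
    """True if the file exists, has the expected size and matching digests."""
    if not os.path.isfile(path) or os.path.getsize(path) != distfile["size"]:
        return False
    for key, algo in HASHLIB_NAMES.items():
        if key not in distfile:
            continue
        actual = file_digest(path, algo)
        if actual is not None and actual != distfile[key]:
            return False
    return True


# Usage with a small throwaway file instead of a real distfile.
data = b"example distfile contents"
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.tar.gz")
    with open(path, "wb") as f:
        f.write(data)
    expected = {
        "size": len(data),
        "SHA1": hashlib.sha1(data).hexdigest(),
        "SHA256": hashlib.sha256(data).hexdigest(),
    }
    ok = is_complete(path, expected)                        # file matches
    bad = is_complete(path, dict(expected, SHA1="0" * 40))  # wrong digest
    missing = is_complete(os.path.join(tmp_dir, "absent"), expected)
```

Checking the size first is cheap and rejects partial downloads before any
hashing is done.
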
6) Interface development: it's possible to choose between curses
(USE_CURSES_FLAG=1) and simple log-like output (USE_CURSES_FLAG=0).
I'll probably implement a tput-based interface as well, because log-like
output is hard to follow and curses doesn't like buggy code :(

To see examples of the output, please follow these links:
http://soc.dev.gentoo.org/~simka/curses.jpg
http://soc.dev.gentoo.org/~simka/log-like.jpg

7) Robin H. Johnson suggested that the Portage changes might be better kept
in a separate repo that tracks the main Portage repo. Therefore, a
repository for the portage-idfetch project was created:
git://git.overlays.gentoo.org/proj/portage-idfetch
http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


[1] http://docs.python.org/library/json.html


--
Best regards,
Kostyantyn Ovechko