Gentoo Archives: gentoo-soc

From: mona <fastinetserver2@×××××.com>
To: gentoo-soc <gentoo-soc@l.g.o>
Subject: [gentoo-soc] A few questions + Draft: Project IDFetch - Weekly report #1
Date: Wed, 02 Jun 2010 21:12:15
Message-Id: 1275513122.12400.7.camel@monapc
Project IDFetch - Weekly report #1
==================================

The purpose of the project is to optimize the software installation
process by making the distfile fetcher more intelligent and by using the
network connection more effectively. The idea of the project is not to
rewrite the whole Portage system, but only the part that sits right at
the bottleneck of the network connection: the distfile fetcher.

For more information on the project, please see the website:
http://soc.dev.gentoo.org/~simka/
or
http://idfetch.isgreat.org

Git repository for the idfetch project:
git://git.overlays.gentoo.org/proj/idfetch
http://git.overlays.gentoo.org/gitroot/proj/idfetch

Git repository for the changes to Portage:
git://git.overlays.gentoo.org/proj/portage-idfetch
http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


You can share your ideas on idfetch by joining the IRC channel
#gentoo-idfetch on freenode or just by sending me an email.

====================
The progress report:
====================

1) I started by joining the mainstream and getting pretty nervous at
the thought of whether I can manage this (for some people seemingly
easy) project. After importing the chocolate and coffee modules I tried
to switch to more productive things ;)

2) The first thing to do was to export some data from the current
Portage system: basename, mirror URLs, size, checksums. I ended up with
some changes to the fetch.py file that gave me the following structure:

# list of pkgs to be installed
Tidfetch_pkg_list : list of Tidfetch_pkg;

Tidfetch_pkg : dict
    ['pkg_name'] : string;
    ['distfile_list'] : list of Tidfetch_distfile;

Tidfetch_distfile : dict
    ['name'] : string;
    ['url_list'] : list of string;
    ['size'] : int;
    ['RMD160'] : string;
    ['SHA1'] : string;
    ['SHA256'] : string;
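
For illustration, the structure above maps directly onto plain Python
dicts and lists. This is only a sketch, not the actual code in fetch.py:
make_distfile and make_pkg are hypothetical helper names, and the sample
values are the util-macros ones from this report.

```python
def make_distfile(name, url_list, size, rmd160, sha1, sha256):
    """Build one Tidfetch_distfile record (hypothetical helper)."""
    return {
        "name": name,
        "url_list": list(url_list),  # one URL per mirror
        "size": int(size),           # size in bytes
        "RMD160": rmd160,            # checksums are hex strings
        "SHA1": sha1,
        "SHA256": sha256,
    }


def make_pkg(pkg_name, distfile_list):
    """Build one Tidfetch_pkg record (hypothetical helper)."""
    return {"pkg_name": pkg_name, "distfile_list": list(distfile_list)}


# Tidfetch_pkg_list: the list of packages to be installed.
pkg_list = [
    make_pkg(
        "util-macros-1.6.1",
        [
            make_distfile(
                "util-macros-1.6.1.tar.bz2",
                [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                ],
                62130,
                "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952",
                "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645",
            )
        ],
    )
]
```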

3) I started to use the pickle module to exchange data between fetch.py
and twrapper (a threaded wrapper for simultaneous downloads).

4) Following advice from Robin H. Johnson (my mentor), I replaced the
pickle module with json [1]. So now pkg_list.list (the export file)
looks like this:

[
    {
        "distfile_list": [
            {
                "RMD160": "10a19a10d0388bc084a7c1d3da845068d7169054",
                "SHA1": "2a7198e8178b2e7dba87cb5794da515200b568f5",
                "SHA256": "0eb6f356119f2e49b2563210852e17f57f9dcc5755f350a69a46a0d641a0c401",
                "name": "ghostscript-fonts-std-8.11.tar.gz",
                "size": 3752871,
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                    ...........more mirrors - skipped ......
                ]
            }
        ],
        "pkg_name": "gnu-gs-fonts-std-8.11"
    },
    {
        "distfile_list": [
            {
                "RMD160": "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952",
                "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "SHA256": "5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645",
                "name": "util-macros-1.6.1.tar.bz2",
                "size": 62130,
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                    ...........more mirrors - skipped ......
                ]
            }
        ],
        "pkg_name": "util-macros-1.6.1"
    }
]
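
A file in this format can be written and read back with the standard
json module roughly as sketched below. The function names
export_pkg_list and import_pkg_list are made up for this example, and
indent=4 with sort_keys=True is an assumption that happens to give the
same key order as the listing above; the real fetch.py may call json
differently.

```python
import json


def export_pkg_list(pkg_list, path="pkg_list.list"):
    """Write the package list as JSON (hypothetical helper, exporter side)."""
    with open(path, "w") as f:
        # indent and sort_keys only affect the layout of the file,
        # not the data that is read back.
        json.dump(pkg_list, f, indent=4, sort_keys=True)


def import_pkg_list(path="pkg_list.list"):
    """Read the package list back (hypothetical helper, twrapper side)."""
    with open(path) as f:
        return json.load(f)
```

Unlike pickle, the resulting file is plain text, so it can be inspected
by eye and read by tools that are not written in Python.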

5) Development of a simple threaded twrapper has started. Twrapper reads
the data from the pkg_list.list file and downloads the distfiles from
the list simultaneously (how many at a time is set by
MAX_ACTIVE_DOWNLOADS in idfetch_settings.py).
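
The idea of capping the number of simultaneous downloads can be sketched
as follows. This is not the twrapper code: it uses
concurrent.futures.ThreadPoolExecutor from the standard library as a
stand-in, download_all and iter_distfiles are hypothetical names, the
value 3 is made up, and the actual transfer is left to a fetch_one
callable supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed value for this sketch; the real setting lives in idfetch_settings.py.
MAX_ACTIVE_DOWNLOADS = 3


def iter_distfiles(pkg_list):
    """Yield every distfile record of every package in the list."""
    for pkg in pkg_list:
        for distfile in pkg["distfile_list"]:
            yield distfile


def download_all(pkg_list, fetch_one, max_active=MAX_ACTIVE_DOWNLOADS):
    """Call fetch_one(distfile) for each distfile, at most max_active at once.

    Returns a dict mapping distfile name to whatever fetch_one returned.
    """
    distfiles = list(iter_distfiles(pkg_list))
    # The pool never runs more than max_active worker threads, so at most
    # that many downloads are in progress at any moment.
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        results = list(pool.map(fetch_one, distfiles))
    return {d["name"]: r for d, r in zip(distfiles, results)}
```

A call would look like download_all(pkg_list, my_fetch), where my_fetch
is any function that takes one distfile dict and downloads it.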

5.1) Before downloading a file, it checks whether the file already
exists, is complete, and has correct checksums. If so, it skips this
distfile. Otherwise it downloads the file and then checks its checksums.
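
The check described in 5.1 could look roughly like the sketch below,
based on hashlib from the standard library. The names file_digest,
distfile_is_ok and needs_download are hypothetical, and two choices are
assumptions of this example rather than twrapper behaviour: "complete"
is taken to mean that the size on disk equals the recorded size, and a
hash that is missing from the record or not supported by the local
hashlib build (ripemd160 is not always available) is skipped.

```python
import hashlib
import os

# Keys used in pkg_list.list mapped to hashlib algorithm names.
HASH_NAMES = {"RMD160": "ripemd160", "SHA1": "sha1", "SHA256": "sha256"}


def file_digest(path, algorithm, chunk_size=64 * 1024):
    """Return the hex digest of a file, or None if the algorithm is unavailable."""
    try:
        h = hashlib.new(algorithm)
    except ValueError:
        return None  # this hashlib build does not provide the algorithm
    with open(path, "rb") as f:
        # Read in chunks so that large distfiles are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def distfile_is_ok(distfile, distdir):
    """True if the file exists, has the recorded size and matching checksums."""
    path = os.path.join(distdir, distfile["name"])
    if not os.path.isfile(path):
        return False
    if os.path.getsize(path) != distfile["size"]:
        return False  # incomplete (or otherwise wrong) file
    for key, algorithm in HASH_NAMES.items():
        if key not in distfile:
            continue  # assumption: hashes absent from the record are skipped
        digest = file_digest(path, algorithm)
        if digest is not None and digest != distfile[key].lower():
            return False
    return True


def needs_download(distfile, distdir):
    """True if the distfile has to be fetched (again)."""
    return not distfile_is_ok(distfile, distdir)
```

The same distfile_is_ok call can be reused after a download finishes to
verify the freshly fetched file.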

6) Interface development: it's possible to choose between curses
(USE_CURSES_FLAG=1) and simple log-like output (USE_CURSES_FLAG=0).
I'll probably implement a tput-based interface, because log-like output
is hard to follow and curses doesn't like buggy code :(

To see examples of the output, please follow these links:
http://soc.dev.gentoo.org/~simka/curses.jpg
http://soc.dev.gentoo.org/~simka/log-like.jpg

7) Robin H. Johnson suggested that the Portage changes might be better
kept in a separate repo that tracks the main Portage repo. Therefore, a
repository for the portage-idfetch project was created:
git://git.overlays.gentoo.org/proj/portage-idfetch
http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


[1] http://docs.python.org/library/json.html