Gentoo Archives: gentoo-soc

From: Kostyantyn Ovechko <fastinetserver2@×××××.com>
To: gentoo-soc <gentoo-soc@l.g.o>
Subject: [gentoo-soc] Project IDFetch - Weekly report #2
Date: Wed, 23 Jun 2010 18:13:44
Message-Id: 1277316828.9059.5.camel@monapc
- Project IDFetch - Weekly report #2 -
==================================

twrapper.py is a good thing to play with: it takes the list of
files from pkg.list and starts downloading them simultaneously.
However, it relies on the wget tool for this, and communication
between twrapper.py and wget has some serious limitations,
especially when it comes to controlling simultaneous downloads
(speed, segments, etc.). So the next step for the idfetch project
was to write its own tool for segmented file fetching.
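
As a rough illustration of the wrapper approach described above (this
is not the actual twrapper.py code), handing each file to its own wget
process might look like the sketch below. The helper names, the plain
list of URLs and the destination directory are assumptions made for
this example; --continue, -P and --limit-rate are standard wget options.

```python
import subprocess


def build_wget_command(url, dest_dir, rate_limit=None):
    """Build the argument list for one wget process (hypothetical helper)."""
    cmd = ["wget", "--continue", "-P", dest_dir]
    if rate_limit is not None:
        # The rate cap is passed once, when the process is started.
        cmd.append("--limit-rate=" + rate_limit)
    cmd.append(url)
    return cmd


def fetch_all(urls, dest_dir, rate_limit=None):
    """Start one wget per URL, then wait for all of them to finish."""
    procs = [subprocess.Popen(build_wget_command(u, dest_dir, rate_limit))
             for u in urls]
    # After start-up the wrapper mostly just waits for the exit codes.
    return [p.wait() for p in procs]
```

In a setup like this the wrapper only sees exit codes once the
processes are running, which hints at why fine-grained control over
speed and segments is hard to get this way.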


1. [Replacing stuff]
Started by replacing the chocolate module with dates [1] (I still
use coffee though), python with cpp, and wget with libcurl [2].

2. [JSON struggles]. Going cpp makes my love for python grow.

2.1 Using python

2.1.1 Saving json data with python

To save data to the json-formatted pkg.list, fetch.py uses the
following lines:

    import json

    idfetch_pkg_list_file.write(json.dumps(idfetch_pkg_list, sort_keys=1, indent=4))

2.1.2 Loading json data with python

To load data from the json-formatted pkg.list, twrapper.py uses
these 2 lines:

    import json

    idfetch_pkg_list=json.loads(idfetch_pkg_list_file.read())
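
Put together, the two snippets above amount to a round trip like the
sketch below. The content of idfetch_pkg_list is invented for
illustration (the real pkg.list layout is not shown in this report),
and the temporary file location and the loaded_pkg_list name are
assumptions as well.

```python
import json
import os
import tempfile

# Hypothetical package list; the real structure used by fetch.py may differ.
idfetch_pkg_list = [
    {"name": "foo-1.0.tar.gz", "urls": ["http://example.org/foo-1.0.tar.gz"]},
]

# Assumed location: a pkg.list file inside a fresh temporary directory.
pkg_list_path = os.path.join(tempfile.mkdtemp(), "pkg.list")

# Saving: the same json.dumps() call as in the fetch.py snippet.
with open(pkg_list_path, "w") as idfetch_pkg_list_file:
    idfetch_pkg_list_file.write(
        json.dumps(idfetch_pkg_list, sort_keys=True, indent=4))

# Loading: the same json.loads() call as in the twrapper.py snippet.
with open(pkg_list_path) as idfetch_pkg_list_file:
    loaded_pkg_list = json.loads(idfetch_pkg_list_file.read())
```

After this, loaded_pkg_list compares equal to the original
idfetch_pkg_list.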

2.2 Using cpp

2.2.1 Loading json data with cpp
Reading data in cpp resulted in:
a) # emerge json-c
b) g++ -o task task.c -ljson (where task.c has 146 lines of
   code = 5037 bytes);
c++) minus one day spent on these 146 lines of code.

3. [Discovering libcurl].
This is the first time I have used libcurl in a project of mine
(thanks to Robin H. Johnson). Libcurl turned out to be a pretty
interesting library with lots of features to discover and put to
use in idfetch. I only hope not to bump into some grueling bug or
limitation.

4. [Downloading with libcurl].

The experiments from step 3 resulted in segget.cpp. Segget downloads
files by splitting them into segments. The number of parallel
connections segget uses can be set. After fetching the segments,
segget combines them to reproduce the original file (no checksum
check is implemented yet).
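
The segment bookkeeping described above can be sketched in Python as
follows. This is not segget's code: the function names, the even split
and the in-memory stand-in for a download are assumptions made for
illustration. In a real fetcher each (start, end) pair would become an
HTTP range request, for example through libcurl's CURLOPT_RANGE option.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(file_size, segment_count):
    """Return inclusive (start, end) byte ranges covering the whole file."""
    base, extra = divmod(file_size, segment_count)
    ranges = []
    start = 0
    for i in range(segment_count):
        # The first `extra` segments get one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more segments requested than bytes available
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_segment(source, byte_range):
    """Stand-in for a ranged download: slices an in-memory bytes object."""
    start, end = byte_range
    return source[start:end + 1]


def fetch_file(source, segment_count, connections):
    """Fetch all segments with a bounded number of workers, then join them."""
    ranges = split_into_segments(len(source), segment_count)
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # map() preserves input order, so the parts join back correctly.
        parts = list(pool.map(lambda r: fetch_segment(source, r), ranges))
    return b"".join(parts)
```

Here the connections argument plays the role of the parallel
connection limit, and joining the parts in range order mirrors the
final combine step.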

[1] http://en.wikipedia.org/wiki/Phoenix_dactylifera
[2] http://curl.haxx.se/libcurl/

Best regards,
Kostyantyn Ovechko AKA simka