Hi everyone,

== Brief summary of this project ==

The aim of this project is to create scripts that automate the process
of overlay creation/maintenance for R packages from repositories such
as CRAN and Bioconductor.

Longer:
To create an ebuild for a single package, one needs to extract the
package, copy data from its DESCRIPTION file into the ebuild and look
up dependencies, which is time-consuming. Although trivial for a small
number of packages, this is practically impossible to do by hand for
repositories like CRAN (> 3500 packages), especially because it also
requires tracking changes (new / updated / removed packages). The
solution is to automate that process, and this is what this project is
about.

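To give an idea of the manual step being automated, here is a minimal
sketch of reading a package's DESCRIPTION file ("Field: value" lines,
with indented continuation lines) into a dictionary. This is not
roverlay's actual code; the function names and the sample package are
made up for illustration.

```python
def parse_description(text):
    """Parse the text of an R DESCRIPTION file into a dict of fields."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        if line[0] in " \t":
            # Indented line: continuation of the previous field's value.
            if current is not None:
                fields[current] += " " + line.strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                continue  # not a "Field: value" line, skip it
            current = name.strip()
            fields[current] = value.strip()
    return fields


def split_dependencies(value):
    """Split a comma-separated Depends/Imports value into single entries."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


# Hypothetical DESCRIPTION content, for illustration only.
SAMPLE = """\
Package: examplepkg
Version: 1.2-3
Title: An Example Package
Depends: R (>= 2.10),
    methods, lattice (>= 0.20)
License: GPL-2
"""

info = parse_description(SAMPLE)
print(info["Package"], info["Version"])
print(split_dependencies(info["Depends"]))
```

Each entry returned by split_dependencies() still has to be looked up
and translated into something an ebuild can use, which is the
time-consuming part when done by hand.
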
The project's git repository is located at
http://git.overlays.gentoo.org/gitweb/?p=proj/R_overlay.git

== Current state of this project and future directions ==

The project "Automatically generated overlay of R packages" has now
reached the end of GSoC 2012's coding period. The result, briefly
described below, is roverlay, a script and a set of modules written in
Python. I tried to keep the code as extensible as possible so that
future extensions, like other ways to get R packages (git, svn, ...),
are easy to add.

It has two user-accessible main parts:
* overlay creation, which accepts R packages as input and creates a
  Portage overlay for them
* repository management, which downloads R packages from remotes and
  uses them as input for overlay creation

The minimal requirement for downloading packages is that a remote
offers HTTP access to its packages. The preferred way is rsync, which
is used for CRAN and Bioconductor. HTTP support was added later to
include repos like R-Forge and Omegahat.

Overlay creation can work incrementally, so existing ebuilds don't
have to be recreated. It involves several tasks:
* reading R package metadata and fixing errors like misspelled data
  fields along the way, e.g. 'Depents' is read as 'Depends'. Package
  reading is configurable.
* ebuild creation, which tries to create an ebuild for an R package
  using its metadata
  -> dependency resolution that creates correct DEPEND/RDEPEND ebuild
     variables. It is implemented as a dictionary lookup extended by
     version-relative lookups.
* overlay writing
  -> per-package metadata.xml/Manifest creation

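The field fixing and the dependency lookup can be sketched roughly as
follows. This is only an illustration and not roverlay's actual code:
the field list, the dependency dictionary, the sci-R category, the
fuzzy matching via difflib and the '-' to '.' version conversion are
all assumptions made for this example, and it shows just one possible
reading of "version-relative lookups".

```python
import difflib
import re

# Assumed list of expected DESCRIPTION field names (illustrative subset).
KNOWN_FIELDS = ["Package", "Version", "Title", "Depends", "Imports",
                "Suggests", "License"]

# Hypothetical dictionary: R dependency name -> Gentoo package.
DEPENDENCY_MAP = {
    "R": "dev-lang/R",
    "methods": "dev-lang/R",     # bundled with base R
    "lattice": "sci-R/lattice",  # category name is an assumption
}

# R version operators and the atom prefixes used for them here.
OPERATORS = {">=": ">=", "<=": "<=", ">": ">", "<": "<", "==": "="}

# Matches "name" or "name (op version)", e.g. "R (>= 2.10)".
DEP_PATTERN = re.compile(
    r"^\s*([\w.+-]+)\s*"
    r"(?:\(\s*(>=|<=|==|>|<)\s*([0-9][\w.-]*)\s*\))?\s*$"
)


def normalize_field(name, cutoff=0.8):
    """Map a possibly misspelled field name to the closest known one."""
    matches = difflib.get_close_matches(name, KNOWN_FIELDS, n=1,
                                        cutoff=cutoff)
    return matches[0] if matches else name


def resolve(dep_str):
    """Return an atom for one R dependency string, or None if unknown."""
    match = DEP_PATTERN.match(dep_str)
    if match is None:
        return None
    name, op, version = match.groups()
    atom = DEPENDENCY_MAP.get(name)
    if atom is None:
        return None
    if op is None:
        return atom
    # Assumption: R versions like "0.20-15" are written as "0.20.15".
    return "{}{}-{}".format(OPERATORS[op], atom, version.replace("-", "."))


def build_rdepend(dep_strings):
    """Resolve all dependencies; return (atoms, unresolved strings)."""
    resolved, unresolved = [], []
    for dep in dep_strings:
        atom = resolve(dep)
        if atom is None:
            unresolved.append(dep)
        elif atom not in resolved:
            resolved.append(atom)
    return resolved, unresolved


print(normalize_field("Depents"))
print(build_rdepend(["R (>= 2.10)", "methods",
                     "lattice (>= 0.20-15)", "nosuchpkg"]))
```

In this sketch an entry that cannot be resolved is reported back to the
caller instead of being dropped silently, so the caller can decide to
skip ebuild creation for that package.
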
Currently, the ebuild creation success rate is slightly higher than
95%. About 900 out of 32000 creations fail for various reasons: OS
type not supported, dependency unresolvable, R package format not
supported (.Z-compressed tarballs, ...).

Extensive documentation is available at [0] and covers usage,
configuration, installation, what to expect and how roverlay works.

All in all, I accomplished most of the objectives of my proposal. Some
were dropped, like getting packages via svn, and some were added, like
getting packages via HTTP and the version-relative dependency
resolution. What's really missing is integration into Gentoo's
infrastructure so that end users can add the resulting overlay using
Layman. This will hopefully happen in the near future. Going forward,
I'll focus on adding features based on real-world/production usage
needs.

Finally, I'd like to thank Denis (Calchan), my mentor, for his guidance
over the last few months. I don't tend to ask many questions, but
whenever I had one, he was able to answer it ;) Overall, taking part
in GSoC for Gentoo has been a good experience.

[0] http://git.overlays.gentoo.org/gitweb/?p=proj/R_overlay.git;a=blob_plain;f=doc/html/usage.html;hb=HEAD

--
Regards,
André E.