Gentoo Archives: gentoo-dev

From: Kent Fredric <kentfredric@×××××.com>
To: gentoo-dev@l.g.o
Cc: perl@g.o
Subject: Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd
Date: Fri, 20 Apr 2012 08:27:41
Message-Id: CAATnKFCvAvv=8HH=46FE7DRVdeBuf154ywSaTTEL-3cdyMgcEQ@mail.gmail.com
In Reply to: Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd by Corentin Chary
1 On 20 April 2012 19:46, Corentin Chary <corentin.chary@×××××.com> wrote:
2 > On Fri, Apr 20, 2012 at 9:37 AM, Kent Fredric <kentfredric@×××××.com> wrote:
3 >> On 20 April 2012 03:31, Corentin Chary <corentin.chary@×××××.com> wrote:
4 >>> Add rubygems, github, gitorious, pecl, pear, bitbucket.
5 >>> All of them are handled by my remoteids.py script.
6 >>>
7 >>> ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
8 >>> ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py
9 >>>
10 >>> --- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.000000000 +0100
11 >>> +++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
12 >>> @@ -61,7 +61,7 @@
13 >>>     <!ELEMENT bugs-to (#PCDATA)>
14 >>>     <!-- specify a type of package identification tracker -->
15 >>>     <!ELEMENT remote-id (#PCDATA)>
16 >>> -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
17 >>> +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>
18 >>>
19 >>>   <!-- category/package information for cross-linking in descriptions
20 >>>     and useflag descriptions -->
21 >>>
22 >>> --
23 >>> Corentin Chary
24 >>> http://xf.iksaif.net/
25 >>
26 >>
27 >> I suggested last week on #gentoo-perl that it might be nice to have
28 >> 'cpan' and 'cpan-module'  ( or something like that ) to disambiguate 2
29 >> queryable terms. ( where 'cpan'  => 'the package name on cpan' )
30 >>
31 >> For some purposes, its most convenient to use the distribution name,
32 >> and for other purposes, (ie: cpan clients) its more convenient to use
33 >> a Module name, and its not easy to translate between the two, as
34 >> Module names sometimes switch between packages  they're shipped in.
35 >>
36 >> For instance, a while ago, the BioPerl module was shipped in a
37 >> distribution 'bioperl' , which has only recently been changed to
38 >> BioPerl
39 >>
40 >>
41 >> http://api.metacpan.org/release/_search?q=distribution:bioperl&fields=archive,author,date,download_url
42 >>
43 >> http://api.metacpan.org/release/_search?q=distribution:BioPerl&fields=archive,author,date,download_url
44 >>
45 >> vs
46 >>
47 >>
48 >> http://api.metacpan.org/module/_search?q=module.name:Bio\:\:Perl&fields=distribution,author,release
49 >
50 > Looks sane since the goal of remote-id is being able to identify the
51 > package upstream.
52 > Do you think you could patch remotesid.py to generate tags for cpan /
53 > cpan-modules ? Or at least give me a pseudo-algo that does the trick.
54 > Thanks :)
55 >
56 > --
57 > Corentin Chary
58 > http://xf.iksaif.net
59 >
60
61
62 That is sadly not straightforward. Extracting the package name can
63 be straightforward if you have the URL, because the package name is
64 literally the same as the archive name in SRC_URI, sans version
65 information.
66
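To illustrate the easy case, here is a minimal sketch of stripping the version from an archive name. The helper name, the regex, and the example URL are my own assumptions for illustration; none of this is taken from remoteids.py, and it only handles archives named like 'Dist-Name-VERSION.tar.gz'.

```python
import re

# Hypothetical pattern: "<dist>-<version>.<ext>", where the version starts
# with a digit (optionally prefixed by "v"). Not taken from remoteids.py.
_ARCHIVE_RE = re.compile(
    r"^(?P<dist>.+?)-v?(?P<version>\d[\w.]*)\.(?:tar\.gz|tar\.bz2|tgz|zip)$"
)


def dist_name_from_src_uri(src_uri):
    """Return the distribution name for a CPAN-style archive URL, or None."""
    archive = src_uri.rsplit("/", 1)[-1]  # keep only the file name
    match = _ARCHIVE_RE.match(archive)
    return match.group("dist") if match else None


if __name__ == "__main__":
    # Example URL is made up for illustration.
    print(dist_name_from_src_uri("mirror://cpan/authors/id/X/XX/XXX/Dist-Zilla-4.300014.tar.gz"))
```

This only helps when SRC_URI is actually present in the ebuild text.
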
67 However, if you look at many Perl ebuilds, you'll notice many lack
68 this field and we've got other things in place, so the current parsing
69 technique you use to detect uses of SRC_URI won't work there ( I could
70 be wrong, I don't fully grok your Python code )
71
72 Moreover, determining the value of 'cpan-module' may be
73 impossible without access to the tar.gz itself, or querying the
74 MetaCPAN API.
75
76 Usually, upstream are sensible and have package names which closely
77 correspond with the module names, ie: "Dist::Zilla" is shipped in
78 'Dist-Zilla-$VERSION.tar.gz', but there are many packages which don't
79 do this, such as this notable example:
80 https://metacpan.org/release/Scalar-List-Utils , which has no modules
81 corresponding to the package name, and no way to divine the/a 'main'
82 module from the package itself. This is exacerbated by packages
83 changing names, package joins ( 2 packages becoming 1 via releasing
84 modules together ), and package splits ( 1 package splits into 2 sets
85 of modules ).
86
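As a rough sketch of why guessing fails: the obvious heuristic is to turn dashes into '::', which works for Dist-Zilla but not for Scalar-List-Utils. The function names are hypothetical, and the list of shipped modules is hand-written here for illustration (List::Util and Scalar::Util are the two best-known modules in that distribution); in practice it would have to come from the tarball or from MetaCPAN.

```python
def guess_main_module(dist_name):
    """Naive guess: 'Dist-Zilla' -> 'Dist::Zilla'. A heuristic, not a rule."""
    return dist_name.replace("-", "::")


def guess_is_shipped(dist_name, shipped_modules):
    """Check the naive guess against the modules the distribution ships."""
    return guess_main_module(dist_name) in shipped_modules


if __name__ == "__main__":
    # Hand-written, illustrative module lists (not fetched from anywhere).
    print(guess_is_shipped("Dist-Zilla", {"Dist::Zilla"}))
    print(guess_is_shipped("Scalar-List-Utils", {"List::Util", "Scalar::Util"}))
```
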
87 Essentially, using a cpan-module as an identifier is somewhat
88 "forwards only" , and even then, what it will resolve to is governed
89 by time.
90
91 This is fine for CPAN clients, which do the resolution hot, using the
92 whole of CPAN as their data. If a user asks for "Foo::Bar", their cpan
93 client will ask a cpan server ( or regularly (hourly) updated list )
94 as to what package that module can be found in ( and this only returns
95 the most recent package, so name changes and so forth are invisible to
96 the user ).
97
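For the lookup direction, a minimal sketch that only builds the query URLs in the same shape as the MetaCPAN links quoted earlier in this thread. The function names are hypothetical, the endpoint and field names are copied from those links, and I make no assumption here about the response format, so nothing is fetched or parsed.

```python
API_BASE = "http://api.metacpan.org"  # endpoint as quoted in this thread


def module_search_url(module_name, fields=("distribution", "author", "release")):
    """Build a module -> distribution query URL (module name to package)."""
    # Escape ':' as '\:' the same way the quoted example URL does.
    escaped = module_name.replace(":", "\\:")
    return "{}/module/_search?q=module.name:{}&fields={}".format(
        API_BASE, escaped, ",".join(fields)
    )


def release_search_url(dist_name, fields=("archive", "author", "date", "download_url")):
    """Build a query URL listing the releases of one distribution name."""
    return "{}/release/_search?q=distribution:{}&fields={}".format(
        API_BASE, dist_name, ",".join(fields)
    )


if __name__ == "__main__":
    print(module_search_url("Bio::Perl"))
    print(release_search_url("bioperl"))
```

Whatever the module query returns is only true at the time of asking, which is the "governed by time" problem above.
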
98 And being helpful to CPAN clients is one of the reasons we want this
99 value as a specifiable option in the first place. For us, it's easier
100 to track the package name, and then when that has to change we can
101 manually resolve the issue.
102
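A sketch of what the two tags might look like when generated, assuming the proposed 'cpan-module' type were accepted into metadata.dtd (it is only a suggestion at this point) and that the tags sit under <upstream>. The function name is hypothetical and List::Util is picked by hand as the module for Scalar-List-Utils.

```python
import xml.etree.ElementTree as ET


def remote_id_xml(dist_name, module_name=None):
    """Render an <upstream> fragment with 'cpan' and optional 'cpan-module' ids."""
    upstream = ET.Element("upstream")
    ET.SubElement(upstream, "remote-id", {"type": "cpan"}).text = dist_name
    if module_name is not None:
        # 'cpan-module' is the proposed type, not part of the current DTD.
        ET.SubElement(upstream, "remote-id", {"type": "cpan-module"}).text = module_name
    return ET.tostring(upstream, encoding="unicode")


if __name__ == "__main__":
    print(remote_id_xml("Scalar-List-Utils", "List::Util"))
```

The 'cpan' value is the one we would maintain by hand; the 'cpan-module' value is optional and mostly for the benefit of CPAN clients.
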
103 --
104 Kent
105
106 perl -e  "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
107 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"
108
109 http://kent-fredric.fox.geek.nz
