Gentoo Archives: gentoo-dev

From: Corentin Chary <corentin.chary@×××××.com>
To: gentoo-dev@l.g.o
Cc: perl@g.o
Subject: Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd
Date: Fri, 20 Apr 2012 11:22:42
Message-Id: CAHR064iQVmJ4-tJKfqHBo7y4m6S1aCqY4XB--cyF_P2+NhCT2Q@mail.gmail.com
In Reply to: Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd by Kent Fredric
1 On Fri, Apr 20, 2012 at 10:26 AM, Kent Fredric <kentfredric@×××××.com> wrote:
2 > On 20 April 2012 19:46, Corentin Chary <corentin.chary@×××××.com> wrote:
3 >> On Fri, Apr 20, 2012 at 9:37 AM, Kent Fredric <kentfredric@×××××.com> wrote:
4 >>> On 20 April 2012 03:31, Corentin Chary <corentin.chary@×××××.com> wrote:
5 >>>> Add rubygems, github, gitorious, pecl, pear, bitbucket.
6 >>>> All of them are handled by my remoteids.py script.
7 >>>>
8 >>>> ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
9 >>>> ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py
10 >>>>
11 >>>> --- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.000000000 +0100
12 >>>> +++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
13 >>>> @@ -61,7 +61,7 @@
14 >>>>     <!ELEMENT bugs-to (#PCDATA)>
15 >>>>     <!-- specify a type of package identification tracker -->
16 >>>>     <!ELEMENT remote-id (#PCDATA)>
17 >>>> -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
18 >>>> +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>
19 >>>>
20 >>>>   <!-- category/package information for cross-linking in descriptions
21 >>>>     and useflag descriptions -->
22 >>>>
23 >>>> --
24 >>>> Corentin Chary
25 >>>> http://xf.iksaif.net/
26 >>>
27 >>>
28 >>> I suggested last week on #gentoo-perl that it might be nice to have
29 >>> 'cpan' and 'cpan-module'  ( or something like that ) to disambiguate 2
30 >>> queryable terms. ( where 'cpan'  => 'the package name on cpan' )
31 >>>
32 >>> For some purposes, it's most convenient to use the distribution name,
33 >>> and for other purposes (e.g. CPAN clients) it's more convenient to use
34 >>> a module name, and it's not easy to translate between the two, as
35 >>> module names sometimes switch between the packages they're shipped in.
36 >>>
37 >>> For instance, a while ago, the BioPerl module was shipped in a
38 >>> distribution 'bioperl', which has only recently been renamed to
39 >>> 'BioPerl'.
40 >>>
41 >>>
42 >>> http://api.metacpan.org/release/_search?q=distribution:bioperl&fields=archive,author,date,download_url
43 >>>
44 >>> http://api.metacpan.org/release/_search?q=distribution:BioPerl&fields=archive,author,date,download_url
45 >>>
46 >>> vs
47 >>>
48 >>>
49 >>> http://api.metacpan.org/module/_search?q=module.name:Bio\:\:Perl&fields=distribution,author,release
50 >>
51 >> Looks sane, since the goal of remote-id is to be able to identify the
52 >> package upstream.
53 >> Do you think you could patch remoteids.py to generate tags for cpan /
54 >> cpan-module? Or at least give me a pseudo-algorithm that does the trick.
55 >> Thanks :)
56 >>
57 >> --
58 >> Corentin Chary
59 >> http://xf.iksaif.net
60 >>
61 >
62 >
63 > That is sadly not straightforward.  Extracting the package name can
64 > be straightforward if you have the URL, because the package name is
65 > literally the same as the archive name in SRC_URI, sans version
66 > information.
67 >
68 > However, if you look at many perl ebuilds, you'll notice many lack
69 > this field and we've got other things in place, so the current parsing
70 > technique you use to detect uses of SRC_URI won't work there ( I could
71 > be wrong, I don't fully grok your python code ).
72
73 Currently it uses SRC_URI and HOMEPAGE, but honestly it wouldn't be
74 hard to use any other environment variable and to do some checks against
75 a web service.
76 Anyway, for tricky cases it can still be done by hand.
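
To make the SRC_URI case concrete, here is a minimal Python sketch of
"archive name sans version". It is not code from remoteids.py: the function
name, the two regular expressions and the example URI are assumptions made
up for illustration, and it only covers archives named <name>-<version>.<ext>.

```python
import posixpath
import re
from urllib.parse import urlparse

# Assumption: only these archive suffixes are considered.
_ARCHIVE_EXT = re.compile(r"\.(tar\.(gz|bz2|xz)|tgz|zip)$")
# Assumption: the version is the trailing "-<digit>..." (optionally "-v<digit>...").
_VERSION = re.compile(r"-v?\d[\w.]*$")


def cpan_dist_from_uri(uri):
    """Return the distribution name guessed from an archive URI, or None."""
    filename = posixpath.basename(urlparse(uri).path)
    stem, stripped = _ARCHIVE_EXT.subn("", filename)
    if not stripped:
        return None  # not a recognised archive name
    name, stripped = _VERSION.subn("", stem)
    return name if stripped else None  # None when no version suffix was found


# Hypothetical URI, for illustration only.
print(cpan_dist_from_uri(
    "mirror://cpan/authors/id/A/AU/AUTHOR/Dist-Zilla-1.0.tar.gz"))
```

Ebuilds that do not set SRC_URI themselves would return nothing useful here,
so those would still need another source or a fix by hand.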
77
78 > Moreover, determining the value of 'cpan-module' may be
79 > impossible without access to the tar.gz itself, or querying the
80 > MetaCPAN API.
81 >
82 > Usually, upstream are sensible and have package names which closely
83 > correspond with the module names, e.g. "Dist::Zilla" is shipped in
84 > 'Dist-Zilla-$VERSION.tar.gz', but there are many packages which don't
85 > do this, such as this notable example:
86 > https://metacpan.org/release/Scalar-List-Utils , which has no modules
87 > corresponding to the package name, and no way to divine the/a 'main'
88 > module from the package itself ( and this is exacerbated by packages
89 > changing names, or package joins ( 2 packages becoming 1 via releasing
90 > modules together ), and package splits ( 1 package splits into 2 sets
91 > of modules ) ).
92 >
93 > Essentially, using a cpan-module as an identifier is somewhat
94 > "forwards only" , and even then, what it will resolve to is governed
95 > by time.
96 >
97 > This is fine for CPAN clients, which do the resolution hot, using the
98 > whole of CPAN as their data: if a user asks for "Foo::Bar", their CPAN
99 > client will ask a CPAN server ( or a regularly (hourly) updated list )
100 > which package that module can be found in ( and this only returns
101 > the most recent package, so name changes and so forth are invisible to
102 > the user ).
103 >
104 > And being helpful to CPAN clients is one of the reasons we want this
105 > value as a specifiable option in the first place. For us, it's easier
106 > to track the package name, and then when that has to change we can
107 > manually resolve the issue.
108 >
109 > --
110 > Kent
111 >
112 > perl -e  "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
113 > 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"
114 >
115 > http://kent-fredric.fox.geek.nz
116 >
117
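
For the "checks on a webservice" route, the module-to-distribution lookup
could start from the query URLs quoted earlier in this thread. A rough
Python sketch follows; the function names are hypothetical, the base URL
and field lists are copied from the quoted examples, and fetching or
parsing the response is deliberately left out because the response layout
is not shown here.

```python
from urllib.parse import urlencode

# Base URL as it appears in the URLs quoted in this thread.
METACPAN_API = "http://api.metacpan.org"


def module_query_url(module_name,
                     fields=("distribution", "author", "release")):
    """Build the module search URL for a Perl module name like 'Bio::Perl'."""
    # Each colon is backslash-escaped, as in the quoted example
    # (module.name:Bio\:\:Perl); urlencode then percent-encodes the result.
    escaped = module_name.replace(":", r"\:")
    query = urlencode({"q": "module.name:" + escaped,
                       "fields": ",".join(fields)})
    return METACPAN_API + "/module/_search?" + query


def release_query_url(distribution,
                      fields=("archive", "author", "date", "download_url")):
    """Build the release search URL for a distribution name like 'BioPerl'."""
    query = urlencode({"q": "distribution:" + distribution,
                       "fields": ",".join(fields)})
    return METACPAN_API + "/release/_search?" + query


print(module_query_url("Bio::Perl"))
print(release_query_url("BioPerl"))
```

Since the answer depends on when the query is made, anything derived this
way would only be a hint for the 'cpan-module' value, not a stable key.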
118
119
120 --
121 Corentin Chary
122 http://xf.iksaif.net
