On 01/07/2017 05:32 AM, Michael Orlitzky wrote:
> On 01/07/2017 06:08 AM, Wim Muskee wrote:
>>
>> URISCHEME_RE = re.compile(r'^[a-z\-]+://')
>>
>> ...
>>
>> URISCHEME_RE.match(ebuild.metadata.get("HOMEPAGE")) is None:
>>
>
> The PMS allows some weird stuff in HOMEPAGE:
>
> https://dev.gentoo.org/~ulm/pms/head/pms.html#x1-760008
>
> Specifically,
>
> In addition, SRC_URI, HOMEPAGE, RESTRICT, PROPERTIES, LICENSE and
> REQUIRED_USE use dependency-style specifications to specify their
> values.
>
> That means that something like,
>
> HOMEPAGE="branding? ( https://www.mozilla.org/ )
> !branding? ( https://www.gentoo.org/ )"
>
> would be valid. It's a little crazy, but there it is.
>
> If you can figure out a way to parse a dependency spec (this has to
> exist somewhere in repoman/portage), then you can run your check against
> the URLs at the leaf nodes. At that point, it should be relatively easy
> to update the regex to match the RFC =)
>
> https://tools.ietf.org/html/rfc3986#section-3.1

This will return a flat list:

portage.dep.use_reduce(ebuild.metadata["HOMEPAGE"], matchall=True,
    flat=True)
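
A minimal sketch of running the scheme check over that flat list. It assumes the "://" requirement from the original regex is kept, and the helper name bad_homepage_uris is made up for illustration. The scheme pattern follows RFC 3986 section 3.1, and the sample list stands in for the use_reduce() result so the snippet runs without portage:

```python
import re

# Scheme grammar from RFC 3986 section 3.1:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# Requiring "://" after the scheme is carried over from the original
# regex; the RFC itself only requires ":" after the scheme.
URISCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*://')


def bad_homepage_uris(uris):
    """Return the entries that do not start with a valid URI scheme.

    Hypothetical helper, not part of repoman or portage.
    """
    return [uri for uri in uris if URISCHEME_RE.match(uri) is None]


# In repoman the list would come from the use_reduce() call above:
#   uris = portage.dep.use_reduce(ebuild.metadata["HOMEPAGE"],
#                                 matchall=True, flat=True)
# A hand-written list is used here instead.
uris = [
    "https://www.mozilla.org/",
    "https://www.gentoo.org/",
    "www.example.org",
]

print(bad_homepage_uris(uris))  # ['www.example.org']
```

Because matchall=True keeps both branches of a conditional, every URL in HOMEPAGE gets checked regardless of USE flags.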
--
Thanks,
Zac