Gentoo Archives: gentoo-dev

From: Ashley Dixon <ash@××××××××××.uk>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] euses(1) Reimplementation
Date: Fri, 10 Jul 2020 04:42:03
Message-Id: 20200710044059.2xqmyff2hxvwh7w6@ad-gentoo-main
In Reply to: Re: [gentoo-dev] euses(1) Reimplementation by Fabian Groffen
Hi Fabian, cheers for your response.

On Thu, Jul 09, 2020 at 08:39:30AM +0200, Fabian Groffen wrote:
> Sounds like you've put some work into this. You could compare against
> `quse -D <flag>` (from portage-utils) as well to get another point of
> measure.

quse is about half as fast as my tool, but that's understandable, as it
works primarily from ebuild scripts rather than USE-flag descriptors. The
two tools yield exactly the same results, provided that `-s` is passed to
ash-euses (its default behaviour is to include flag descriptions in the search;
`-s` instructs it to only display matches which appear as a flag).

The disadvantage of my tool is that it knows nothing about the packages
themselves, so it cannot offer command-line options such as "only display
results related to installed packages".

> I don't know what you did measure euses against though, it seems fairly
> fast to me (env PORTDIR=`q -e PORTDIR` euses -v libressl), is there a
> specific case you're focussing on?

It is very fast, but it could be faster. I ran it through callgrind and
kcachegrind and found that it spends over 56% of its execution time in strncpy
calls; the string construction is extremely inefficient. My reimplementation
also aims to consist of cleaner, more maintainable code (for example, the
original tool declares 23 nondescriptly named local variables at the top of
main(), and more throughout the function). Regardless, the obvious main
advantage is that it fully complies with the repos.conf syntax while still
working on legacy PORTDIR systems.

As an aside, my version also uses the strcasestr(3) function to
perform the case-insensitive search. Unfortunately, this forces _GNU_SOURCE to
be defined before `string.h` is included. However, it is much faster than
running tolower(3) on every character of the query and buffer, as the
canonicalisation (in this case, converting the needle and haystack to
lower-case) is done as part of the standard string-searching function call
(`two_way_{long,short}_needle`) [1]. As discussed in my previous e-mail, I'm
working on reimplementing this with the Two-Way algorithm (and shift tables for
small needles) to avoid the non-standard dependency, although it might take a
few days.
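
A minimal sketch of the strcasestr(3) approach described above, assuming a
glibc system. The helper name `contains_ci` is hypothetical and is not taken
from ash-euses; it only illustrates why _GNU_SOURCE has to be defined before
`string.h` is included.

```c
/* strcasestr(3) is a non-standard extension: glibc only declares it in
 * <string.h> when _GNU_SOURCE is defined before the first include. */
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not from ash-euses): reports whether `needle`
 * occurs anywhere in `haystack`, ignoring case. Neither string is
 * copied or modified; the case folding happens inside strcasestr. */
bool contains_ci(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    return strcasestr(haystack, needle) != NULL;
}
```

A caller would pass each flag name or description line as the haystack and the
query as the needle, for example `contains_ci(description, "libressl")`.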

Ashley.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=string/str-two-way.h;h=de247fbc98b83a6e1653288e4161751710d026ce;hb=HEAD#l35

--

Ashley Dixon
suugaku.co.uk

2A9A 4117
DA96 D18A
8A7B B0D2
A30E BF25
F290 A8AA
