Hi Fabian, cheers for your response.

On Thu, Jul 09, 2020 at 08:39:30AM +0200, Fabian Groffen wrote:
> Sounds like you've put some work into this. You could compare against
> `quse -D <flag>` (from portage-utils) as well to get another point of
> measure.

quse is about half as fast as my tool; however, that's understandable, as it
works primarily from ebuild scripts rather than USE-flag descriptors. The two
tools yield exactly the same results, provided that `-s` is passed to ash-euses
(by default it also searches flag descriptions; `-s` instructs it to display
only matches that appear in a flag name).

The disadvantage of my tool is that it knows nothing about the packages
themselves, so it cannot offer command-line options such as "only display
results related to installed packages".

> I don't know what you did measure euses against though, it seems fairly
> fast to me (env PORTDIR=`q -e PORTDIR` euses -v libressl), is there a
> specific case you're focussing on?

It is very fast; however, it could be faster. I ran it through callgrind and
kcachegrind and found that it spends over 56% of its execution time in strncpy
calls; the string construction is extremely inefficient. My reimplementation
also aims to be cleaner and more maintainable (for example, the original tool
declares 23 nondescriptly named local variables at the top of main(), and more
throughout the function). Regardless, the main advantage is that it fully
supports the repos.conf syntax while still working on legacy PORTDIR systems.
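
As a rough illustration of why strncpy can dominate a profile (this is a
generic sketch, not code taken from euses or ash-euses): strncpy always writes
exactly n bytes, zero-padding the destination when the source is shorter, so
copying short strings into large fixed-size buffers does far more work than
necessary. Measuring the lengths once and using memcpy avoids the padding. The
helper name `join_path` and the example path are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write "dir/file" into out, which holds cap bytes.
 * Returns the length of the resulting string, or 0 if it would not fit.
 * Only the bytes that are actually needed are written; nothing is padded.
 */
size_t join_path(char *out, size_t cap, const char *dir, const char *file)
{
    size_t dlen = strlen(dir);
    size_t flen = strlen(file);

    /* dir + '/' + file + terminating NUL must fit in the buffer. */
    if (dlen + 1 + flen + 1 > cap)
        return 0;

    memcpy(out, dir, dlen);
    out[dlen] = '/';
    memcpy(out + dlen + 1, file, flen + 1); /* flen + 1 copies the NUL too */

    return dlen + 1 + flen;
}
```

For example, `join_path(buf, sizeof buf, "/var/db/repos/gentoo/profiles",
"use.desc")` builds the full path in one pass, whereas a chain of
`strncpy(buf, ..., sizeof buf)` calls would rewrite the whole buffer each time.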

As an aside, my version also uses the strcasestr(3) function to perform the
case-insensitive search. Unfortunately, this requires _GNU_SOURCE to be defined
before `string.h` is included. However, it is much faster than running
tolower(3) on every character of the query and buffer, as the canonicalisation
(in this case, converting the needle and haystack to lower-case) is done as
part of the standard string-searching function call
(`two_way_{long,short}_needle`) [1]. As discussed in my previous e-mail, I'm
working on reimplementing this with the Two-Way algorithm (and shift tables for
small needles) to avoid the non-standard dependency, although it might take a
few days.
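
For comparison, a portable case-insensitive search needs nothing beyond ISO C.
The sketch below is the naive quadratic scan that lower-cases characters as it
compares them; it is neither glibc's strcasestr nor the Two-Way
reimplementation mentioned above, and the name `ci_strstr` is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return a pointer to the first case-insensitive
 * occurrence of needle in haystack, or NULL if there is none.
 * Naive O(n * m) scan; no GNU extensions required.
 */
const char *ci_strstr(const char *haystack, const char *needle)
{
    /* An empty needle matches at the start, mirroring strstr(3). */
    if (*needle == '\0')
        return haystack;

    for (; *haystack != '\0'; haystack++) {
        const char *h = haystack;
        const char *n = needle;

        /* Casts to unsigned char keep tolower() well-defined for all bytes. */
        while (*n != '\0' &&
               tolower((unsigned char)*h) == tolower((unsigned char)*n)) {
            h++;
            n++;
        }

        /* Reaching the end of the needle means every character matched. */
        if (*n == '\0')
            return haystack;
    }

    return NULL;
}
```

Calling `ci_strstr("Enable LibreSSL support", "libressl")` returns a pointer to
the `LibreSSL` part of the description. On glibc, strcasestr(3) with the same
arguments should find the same match for ASCII input, and is the faster option
on long buffers.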

Ashley.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=string/str-two-way.h;h=de247fbc98b83a6e1653288e4161751710d026ce;hb=HEAD#l35

--

Ashley Dixon
suugaku.co.uk

2A9A 4117
DA96 D18A
8A7B B0D2
A30E BF25
F290 A8AA