Gentoo Archives: gentoo-portage-dev

From: Zac Medico <zmedico@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] [PATCH] emerge: add --search-fuzzy and --search-fuzzy-cutoff options (bug 65566)
Date: Fri, 08 Apr 2016 06:22:04
Message-Id: 57074E05.4030202@gentoo.org
In Reply to: Re: [gentoo-portage-dev] [PATCH] emerge: add --search-fuzzy and --search-fuzzy-cutoff options (bug 65566) by Alexander Berntsen
1 On 04/04/2016 01:39 AM, Alexander Berntsen wrote:
2 > This is a great idea!
3
4 Yeah, we should have done this sooner. The search index makes our search
5 function so much nicer, so that gave me some incentive to continue
6 improving it.
7
8 >
9 >
10 > On 04/04/16 07:03, Zac Medico wrote:
11 >> +.BR "\-\-search\-fuzzy [ y | n ]"
12 >> +Enable or disable fuzzy search for search actions.
13 > This is likely a good place to briefly explain what a "fuzzy search"
14 > is.
15
16 Okay, will do.
17
18 > Also, I'm not sold on "search-fuzzy" as opposed to "fuzzy-search". Is
19 > there a particular reasoning for it? Since we don't seem to have a
20 > standardised "verbs mean this, nouns mean this" anyway, I would use
21 > the latter phrase.
22
23 Okay, that will work for me.
24
25 > You also need to document your note on regexes.
26
27 Will do.
28
29 > Lastly, you also need to document that a fuzzy search is slower than a
30 > regular search.
31
32 Will do.
33
34 >> +.TP
35 >> +.BR "\-\-search\-fuzzy\-cutoff CUTOFF"
36 >> +Set similarity ratio cutoff (a floating-point number between 0 and 1).
37 >> +Results with similarity ratios lower than the cutoff are discarded.
38 >> +This option has no effect unless the \fB\-\-search\-fuzzy\fR option
39 >> +is enabled.
40 > This explanation is a bit heavy to read. And I think that using 0 to 1
41 > isn't very nice. And calling the number "floating point" instead of
42 > decimal isn't very useful nor nice. How about making it a percentage,
43 > and describing it simply as a similarity percentage -- "package names
44 > must be at least N% similar to the search term to appear in search
45 > results". The option could then be called --search-fuzzy-similarity,
46 > or (in keeping with the previous suggestion)
47 > --fuzzy-search-similarity, or -- wait for it -- something similar. ;)
48
49 Okay, that will work for me.
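To make the "at least N% similar" wording concrete, here is a minimal sketch of a percentage-based filter. It assumes the similarity ratio comes from Python's difflib.SequenceMatcher; the helper names similarity_percent and fuzzy_filter are hypothetical and not taken from the patch, and the 80 default simply mirrors the 0.8 cutoff in the patch.

```python
import difflib


def similarity_percent(term, name):
    """Return how similar two strings are, as a percentage from 0 to 100.

    Assumption: difflib.SequenceMatcher stands in for whatever ratio
    the real search code uses.
    """
    ratio = difflib.SequenceMatcher(None, term.lower(), name.lower()).ratio()
    return 100.0 * ratio


def fuzzy_filter(term, names, min_percent=80.0):
    """Keep only the names that are at least min_percent similar to term."""
    return [n for n in names if similarity_percent(term, n) >= min_percent]


if __name__ == "__main__":
    # A misspelled search term still finds the intended package name.
    print(fuzzy_filter("portge", ["portage", "gcc", "porthole"]))
```

Expressed this way, the option describes which results are shown rather than which ones are discarded, while the comparison underneath is the same one as for a 0 to 1 cutoff.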
50
51 > Of course if you agree with this, you'll have to reverse the code to
52 > represent which results to show, rather than which ones to not show.
53
54 Reverse? You want it to measure dissimilarity? Not sure what you mean.
55
56 > You should also document here what happens if there's a mistake in the
57 > input.
58 >
59 >> + "--search-fuzzy-cutoff": {
60 >> + "help": "Set similarity ratio cutoff (a floating-point number between 0 and 1)",
61 >> + "action": "store"
62 >> + },
63 > See comments above regarding how to explain what this actually does.
64
65 Yeah, the N% similar thing.
66
67 >> + if myoptions.search_fuzzy_cutoff:
68 >> + try:
69 >> + fuzzy_cutoff = float(myoptions.search_fuzzy_cutoff)
70 >> + except ValueError:
71 >> + fuzzy_cutoff = 0.0
72 > Is this a reasonable fallback? I guess so... but you need to mention
73 > it in the manpage, as mentioned.
74
75 It's not supposed to be a fallback, but rather a failure path. It
76 triggers an error message and unsuccessful exit.
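As a rough sketch of that failure path, assuming an argparse-style parser: a value that cannot be parsed is mapped to 0.0 only so that it lands in the same error branch as an explicit non-positive value. The function name parse_fuzzy_cutoff and the standalone parser are hypothetical, and the "silent" handling from the patch is left out.

```python
import argparse


def parse_fuzzy_cutoff(parser, raw_value):
    """Return the cutoff as a float, or exit through parser.error()."""
    try:
        cutoff = float(raw_value)
    except ValueError:
        # Not a fallback: 0.0 is a sentinel that triggers the error below.
        cutoff = 0.0

    if cutoff <= 0.0:
        # parser.error() prints the message to stderr and exits with status 2.
        parser.error(
            "Invalid --search-fuzzy-cutoff parameter: '%s'\n" % (raw_value,)
        )
    return cutoff


# Hypothetical stand-in for the real emerge option parser.
parser = argparse.ArgumentParser(prog="emerge-sketch")
parser.add_argument("--search-fuzzy-cutoff", action="store")

if __name__ == "__main__":
    opts = parser.parse_args(["--search-fuzzy-cutoff", "0.8"])
    print(parse_fuzzy_cutoff(parser, opts.search_fuzzy_cutoff))
```

Both "abc" and "0" end in the same error and exit, so there is no silent fallback to document.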
77
78 >> +
79 >> + if fuzzy_cutoff <= 0.0:
80 >> + fuzzy_cutoff = None
81 >> + if not silent:
82 >> + parser.error("Invalid --search-fuzzy-cutoff parameter: '%s'\n" % \
83 >> + (myoptions.search_fuzzy_cutoff,))
84 >> +
85 >> + myoptions.search_fuzzy_cutoff = fuzzy_cutoff
86 >> +
87 > I also don't understand why the first one is just 0.0, but this one
88 > is an error. Why aren't both either errors and revert to 0.8 cut-off
89 > (or 80% similarity) or 0.0/100?
90
91 I just want it to fail if the input is invalid.
92
93 > And this needs to go in the manpage too.
94 >
95 >> + self.fuzzy_cutoff = 0.8 if fuzzy_cutoff is None else fuzzy_cutoff
96 > See above.
97 >
98 >> + fuzzy = False
99 > Here's an interesting discussion: maybe this should be True? After
100 > all, it's True in any modern search engine. What do you think?
101
102 Yeah, I agree.
103
104 >> + # Fuzzy search does not support regular expressions, therefore
105 >> + # it is disabled for regular expression searches.
106 > Manpage.
107
108 Will do.
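For the manpage note, the interaction between the two modes could be sketched like this. It assumes the caller already knows whether the search term is a regular expression; build_matcher and its is_regex flag are hypothetical names, and difflib.SequenceMatcher is again only a stand-in for the real similarity measure.

```python
import difflib
import re


def build_matcher(term, fuzzy=True, is_regex=False, cutoff=0.8):
    """Return a function that reports whether a package name matches term."""
    if is_regex:
        # Fuzzy matching has no meaning for a pattern, so it is skipped.
        pattern = re.compile(term, re.IGNORECASE)
        return lambda name: pattern.search(name) is not None

    lowered = term.lower()
    if fuzzy:
        return lambda name: (
            difflib.SequenceMatcher(None, lowered, name.lower()).ratio() >= cutoff
        )

    # Plain case-insensitive substring search.
    return lambda name: lowered in name.lower()


if __name__ == "__main__":
    names = ["portage", "report", "gcc"]
    print([n for n in names if build_matcher("^port", is_regex=True)(n)])
    print([n for n in names if build_matcher("portge")(n)])
```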
109 --
110 Thanks,
111 Zac
