On Tue, 22 Nov 2005 18:51:44 +0100
Sven Vermeulen <swift@g.o> wrote:
>> A good start could be to do that the quick and ugly way, thanks
>> to Google (with some "site:www.gentoo.org/some/thing/" and other
>> black magic in the query terms).
> [...]

> - Google bases its search functionality on cached pages.

Bah, yes, in theory it's not 100% perfect, but in practice I
find it satisfying.

> - We would depend on Google a bit

Yes, but other engines may offer similar functionality, in which
case it would just be a matter of changing the form's parameter
names and posting it elsewhere. But I don't know much about other
public search engines, so I have no idea what kind of queries
they allow.

> Now Google might be a reliable web site/service, I'd rather have
> the search functionality of our web site implemented on the
> Gentoo infrastructure.

Sure, if that's doable in terms of workload and time to implement,
then it could be the best method.

My only concern would be the choice of that engine: I mean,
I would still prefer Google over an internal engine which doesn't
allow mixing exact strings and keywords in queries, or which
drops non-alpha chars, etc. I'm suffering enough with the forum's
one already :)

> - Restricting pages to /doc (documentation), /main (Gentoo
> information), /news (News items+GWN), /proj (project stuff)

Not a problem with Google, that's the "/some/thing/" part of
the fake query cited above. I've put some real examples in the
proof-of-concept form I posted about in an earlier message
elsewhere in this thread:
http://tdegreni.free.fr/gentoosearch/
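
As a minimal sketch, a form like that could assemble its restricted
query roughly as follows. The helper name buildSiteQuery and the
/doc/ path are only illustrative assumptions (this is not the code
behind the URL above); "q" is Google's usual query parameter.

```javascript
// Public search endpoint; "q" carries the query string.
const GOOGLE_SEARCH_URL = "https://www.google.com/search";

/**
 * Build a search URL restricted to one sub-tree of a site.
 * sitePath: host plus path prefix, e.g. "www.gentoo.org/doc/"
 * terms:    free keywords; exact phrases are passed already quoted
 */
function buildSiteQuery(sitePath, terms) {
  // Prefix the user's terms with the site: restriction.
  const query = `site:${sitePath} ${terms}`.trim();
  // URLSearchParams takes care of percent-encoding the query.
  const params = new URLSearchParams({ q: query });
  return `${GOOGLE_SEARCH_URL}?${params.toString()}`;
}

// Mixing an exact string and a keyword, restricted to /doc/.
console.log(buildSiteQuery("www.gentoo.org/doc/", '"emerge --sync" portage'));
```

Switching to /main/, /news/ or /proj/ is then just a matter of
passing a different sitePath.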

> - Restricting languages (en, fr, ... and any combination)

Same as above for searching in a single language, by adding some
"/fr/" to the base URL (it's also possible with the lr=lang_fr
parameter, although that's less reliable). But for arbitrary
combinations, yes, that's probably a limitation (or a really ugly
query...).
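
To give an idea of what the "ugly query" for a combination might
look like, here is a sketch. It assumes that Google accepts OR
between several site: operators and that the localised pages live
under <base>/<lang>/; the function name languageRestriction is
hypothetical.

```javascript
/**
 * Build the site: restriction for one or more language sub-trees.
 * basePath: e.g. "www.gentoo.org/doc/" (assumed layout: <base>/<lang>/)
 * langs:    language codes, e.g. ["en", "fr"]
 */
function languageRestriction(basePath, langs) {
  if (langs.length === 0) {
    throw new Error("at least one language is required");
  }
  const sites = langs.map((lang) => `site:${basePath}${lang}/`);
  // A single language needs no OR; several are grouped (assumption:
  // the engine honours OR between site: operators).
  return sites.length === 1 ? sites[0] : `(${sites.join(" OR ")})`;
}

// Query text to put in front of the user's own search terms.
console.log(languageRestriction("www.gentoo.org/doc/", ["fr"]));
console.log(languageRestriction("www.gentoo.org/doc/", ["en", "fr"]));
```

The query grows by one site: term per language, which is why this
only stays reasonable for a couple of languages.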

What I had in mind for i18n of the above JS code was to:
- always offer at least a search on the English pages
- if the user has set a non-English preferred language in his
  browser, also add some localised choices to the dropdown list.
(I'm not sure how to detect the user's preferred language from
JavaScript, though.)
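
A possible sketch of those two rules: current browsers expose the
user's preferences as navigator.languages (an ordered list) and
navigator.language (a single tag). The AVAILABLE_LANGS list and the
function names are assumptions made for the example, not the actual
set of translations.

```javascript
// Hypothetical list of languages for which localised pages exist.
const AVAILABLE_LANGS = ["en", "fr", "de", "es"];

// Read the preferred language tags from the browser, if there is one.
function browserLanguages() {
  if (typeof navigator === "undefined") return [];
  if (navigator.languages && navigator.languages.length > 0) {
    return Array.from(navigator.languages);
  }
  return navigator.language ? [navigator.language] : [];
}

/**
 * Decide which languages to offer in the dropdown list.
 * preferred: language tags such as ["fr-FR", "fr", "en-US"]
 */
function searchLanguages(preferred, available = AVAILABLE_LANGS) {
  const result = ["en"]; // always offer the English pages
  for (const tag of preferred) {
    // Keep only the primary subtag: "fr-FR" becomes "fr".
    const primary = tag.toLowerCase().split("-")[0];
    if (available.includes(primary) && !result.includes(primary)) {
      result.push(primary);
    }
  }
  return result;
}

console.log(searchLanguages(browserLanguages()));
```

Keeping the detection separate from the selection means the second
function can be checked without a browser.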

> - Have the search points assigned so that hits are calculated
> with certain weights:
> * title's get most of the points, unless many titles are
> selected
> * abstract's get the second most points, yada yada
> * content get third most points

Here again, I think Google is good enough for the needs, especially
if you target the search at some "/doc/en/" or similar sub-part of
the website, which doesn't leave that many pages anyway. I mean, I
often do that kind of search on the docs or the dev handbook with
a conquery plugin, and I don't remember ever seeing the page I
was looking for outside the top 5 results. But yes, at least
in theory, a tweaked local engine could be even better.
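
For comparison, the kind of weighting quoted above could be sketched
like this for a local engine. The weights 3/2/1, the page fields and
all function names are assumptions chosen for illustration, and the
"unless many titles are selected" rule is not modelled.

```javascript
// Hypothetical weights: title > abstract > content.
const WEIGHTS = { title: 3, abstract: 2, content: 1 };

// Count non-overlapping, case-insensitive occurrences of term in text.
function countOccurrences(text, term) {
  const haystack = text.toLowerCase();
  const needle = term.toLowerCase();
  if (needle === "") return 0;
  let count = 0;
  let pos = haystack.indexOf(needle);
  while (pos !== -1) {
    count += 1;
    pos = haystack.indexOf(needle, pos + needle.length);
  }
  return count;
}

// Sum the weighted hits of every term over the three fields of a page.
function scorePage(page, terms) {
  let score = 0;
  for (const term of terms) {
    for (const field of Object.keys(WEIGHTS)) {
      score += WEIGHTS[field] * countOccurrences(page[field] || "", term);
    }
  }
  return score;
}

// Drop pages without any hit and sort the rest by descending score.
function rankPages(pages, terms) {
  return pages
    .map((page) => ({ page, score: scorePage(page, terms) }))
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score);
}
```

A page matching in its title thus outranks one that only mentions
the term once in its body.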


Hmm... re-reading the above message, I realize I may sound like
some kind of Google zealot: so just to make it clear, I'm not, and I
would be pleased to see anything better implemented. It's really
just that I think it could do a rather good job, and that using it is
easy enough to work as a really short-term solution.

--
TGL.
--
gentoo-dev@g.o mailing list