Gentoo Archives: gentoo-dev

From:	"Robin H. Johnson" <robbat2@g.o>
To:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] packages.gentoo.org lives!
Date:	Thu, 29 Nov 2007 18:35:53
Message-Id:	`20071129183319.GV14557@curie-int.orbis-terrarum.net`
In Reply to:	Re: [gentoo-dev] packages.gentoo.org lives! by Mike Frysinger

1	On Thu, Nov 29, 2007 at 10:20:11AM -0500, Mike Frysinger wrote:
2	> On Tuesday 13 November 2007, Robin H. Johnson wrote:
3	> > If you had bookmarks to the old style of URL, please consult the FAQ for
4	> > the new form. We are NOT rewriting these URLs:
5	> > '/packages/?category=media-sound;name=mp3unicode'
6	> > (The new form is '/package/media-sound/mp3unicode').
7	> why ? you've just broken every site out there that links to us in the common
8	> form you've quoted here. there's no reason you cant add three lines of code
9	> to check if the "category" GET variable exists and if so, redirect
10	> accordingly.
11	Because:
12	- The ';' that the old site used as an argument separator is not a valid
13	query argument separator, and there are URLs out there that have added
14	further arguments using it, complicating parsing.
15	- See also RFC1738: 'Within the <path> and <searchpart> components, "/",
16	";", "?" are reserved.'
17	- The old site allowed a LOT of variations, all leading to the same
18	content, but some of which broke badly.
19	/?category=foo&name=bar
20	/?category=foo;name=bar
21	/?name=bar&category=foo
22	/?name=bar;category=foo;this=wasbroken
23	/packages/?(one of the above query strings)
24	(several more prefixes, all of which gave you the same page)
25	- Having a single valid URL for a given resource greatly improves cache
26	hit rates (and we do use caching heavily on the new site, 60% hit rate
27	at the moment, see further down as well).
28	- The old parsing and variable usage code was the source of multiple
29	bugs as well as the security issue that shuttered the site.
30	- I _want_ old sites to change to using the new form, which I do
31	advertise as being permanent resource URLs (as well as being much
32	easier to construct: take any "[CAT/]PN[-PF]" and slap it onto the
33	base URL, and you are done).
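
For illustration only, the mapping from the old query-string forms listed above to the new path form could be sketched as below. This is not the code running on packages.gentoo.org; the function name `legacy_to_canonical` and the `SAFE` character whitelist are assumptions made for this sketch. It accepts both '&' and ';' as separators, ignores parameter order and unknown extra arguments, and refuses anything it cannot map cleanly.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed whitelist for category and package names (hypothetical; the
# real site may accept a different character set).
SAFE = re.compile(r"[A-Za-z0-9+_.-]+")

def legacy_to_canonical(url):
    """Return the new-style path for an old-style URL, or None."""
    query = urlsplit(url).query
    # Normalise ';' to '&' so both legacy separators parse identically.
    params = parse_qs(query.replace(";", "&"))
    category = params.get("category", [""])[0]
    name = params.get("name", [""])[0]
    # Reject missing or suspicious values instead of guessing.
    if not SAFE.fullmatch(category) or not SAFE.fullmatch(name):
        return None
    return "/package/{}/{}".format(category, name)

print(legacy_to_canonical("/packages/?category=media-sound;name=mp3unicode"))
```

Because every accepted variation collapses to one path, a redirect built on such a function would still leave a single cacheable URL per package.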
34
35	That said, if somebody wants to point me to something decent so that
36	Squid can rewrite the URLs WITH the query parameters (the built-in squid
37	stuff seems to ignore them) and hit the cache, and that can add a big
38	warning at the top of the page, I'd be happy to use it for a transition
39	period, just like the RSS URLs (which are redirected until January 2008,
40	but only because they are automated, and not browsed by humans).
41
42	On the subject of Squid, it would be extremely useful if it could ignore
43	some headers and respect others in figuring out if the page is already
44	in the cache, without stripping the headers from the request (it is
45	doable with Apache's mod_cache), so that two requests with only a
46	slightly different User-Agent between them hit the same cache entry,
47	while different Accept* headers are respected, and don't hit the same
48	cache entry.
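
The cache behaviour asked for above can be sketched as a cache-key function. This is not how Squid or mod_cache compute their keys; `cache_key` and the rule "only headers whose names start with Accept matter" are assumptions chosen to mirror the paragraph. The request headers themselves are left untouched; they are simply not all fed into the key.

```python
import hashlib

def cache_key(url, headers):
    """Build a cache key from the URL plus the Accept* request headers."""
    # Header names are case-insensitive, so compare in lower case.
    relevant = sorted(
        (name.lower(), value.strip())
        for name, value in headers.items()
        if name.lower().startswith("accept")
    )
    # User-Agent and every other header are ignored on purpose.
    material = url + "\n" + "\n".join(
        "{}:{}".format(name, value) for name, value in relevant
    )
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

a = cache_key("/package/media-sound/mp3unicode",
              {"User-Agent": "BrowserA/1.0", "Accept-Encoding": "gzip"})
b = cache_key("/package/media-sound/mp3unicode",
              {"User-Agent": "BrowserB/2.0", "Accept-Encoding": "gzip"})
print(a == b)
```

Two requests that differ only in User-Agent share a key here, while a different Accept-Encoding or Accept-Language value yields a separate entry.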
49
50	--
51	Robin Hugh Johnson
52	Gentoo Linux Developer & Infra Guy
53	E-Mail : robbat2@g.o
54	GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85

Replies

Subject	Author
Re: [gentoo-dev] packages.gentoo.org lives!	Thilo Bangert <bangert@g.o>
Re: [gentoo-dev] packages.gentoo.org lives!	"Jan Kundrát" <jkt@g.o>
