Gentoo Archives: gentoo-dev

From: Jason Stubbs <jstubbs@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] MySQL 4.0 => 4.1 upgrade
Date: Mon, 12 Sep 2005 01:21:30
Message-Id: 200509121019.18512.jstubbs@gentoo.org
In Reply to: Re: [gentoo-dev] MySQL 4.0 => 4.1 upgrade by Francesco R
On Monday 12 September 2005 02:51, Francesco R wrote:
> Jason Stubbs wrote:
> >On Monday 12 September 2005 00:05, Francesco R wrote:
> >>http://dev.gentoo.org/~vivo/doc/mysql-update.html
> >
> >With step 2, you should probably mention the issues that can arise with
> >non-ASCII data in char fields. The character set really needs to be
> > specified in the dump. After the upgrade to 4.1, the default charset of
> > the server should be set to something compatible and then the charset
> > of the data should be specified to mysql when re-importing the backup.
>
> --default-character-set=charset
> should be that of the my.cnf config file; mysqldump doesn't permit an
> atomic setting of this variable.
> The only option for this kind of user is to atomically dump the tables
> and then concat the results.
>
> Importing into mysql-4.1 is OK, provided your default character set is
> utf8.
>
> Russian, Asian, or any other person who has experience with this, please
> speak now to correct what is affirmed here.

I had a 4.0 database with strings mostly stored in SJIS that I upgraded to
4.1 a while back. 4.1 then uses the "connection character set" to do
on-the-fly translation of db encoding to connection encoding. This also
happens when importing data, so if you haven't got the character set of the
data correct, it'll get corrupted on the way in.
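
To make the corruption concrete, here is a rough sketch in Python. It does
not talk to MySQL at all; it only models the "decode with the declared
charset, re-encode for storage" step using Python's codecs. The function
name import_as and the utf-8 storage charset are assumptions made for
illustration.

```python
# Illustration only: models the connection-charset conversion with Python
# codecs. import_as is a hypothetical helper, not a MySQL API.

def import_as(raw: bytes, declared_charset: str,
              column_charset: str = "utf-8") -> bytes:
    """Decode dump bytes using the declared charset, re-encode for storage."""
    return raw.decode(declared_charset).encode(column_charset)

# Three kanji written as escapes so the snippet is pure ASCII.
original = "\u65e5\u672c\u8a9e"
dump_bytes = original.encode("shift_jis")  # what a dump of SJIS data holds

good = import_as(dump_bytes, "shift_jis")  # charset declared correctly
bad = import_as(dump_bytes, "latin-1")     # charset declared wrongly

print(good.decode("utf-8") == original)
print(bad.decode("utf-8") == original)
```

The second import does not fail; it just stores different characters, which
is why the problem tends to be noticed only after the fact.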

Related to the automatic conversion, some fields in the DB contain raw URLs
with un-URL-ified parameters that can be in any character set. These fields
had to be set to BINARY for them to be usable.
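
A small Python sketch of why a single text charset does not work for such
fields. The URLs and the parameter value are made up, and checking validity
with Python's codecs is only a stand-in for what the server does (MySQL may
substitute or truncate characters rather than raise an error).

```python
# Hypothetical data: the same query parameter submitted in two encodings.
param = "\u691c\u7d22"  # a two-kanji word, written as escapes
urls = [
    b"http://example.com/?q=" + param.encode("shift_jis"),
    b"http://example.com/?q=" + param.encode("euc_jp"),
]

def valid_in(raw: bytes, charset: str) -> bool:
    """Return True if raw is a valid byte sequence in charset."""
    try:
        raw.decode(charset)
        return True
    except UnicodeDecodeError:
        return False

# Neither candidate charset accepts both URLs.
for charset in ("utf-8", "euc_jp"):
    print(charset, [valid_in(u, charset) for u in urls])

# Treating the field as raw bytes (the idea behind BINARY) keeps both
# values exactly as they arrived.
stored = [bytes(u) for u in urls]
print(stored == urls)
```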

Another gotcha related to this is that PHP's mysql support defaults to
latin-1 encoding (at least on my current installation) and has no setting
for it. The only solution there was to execute "SET NAMES ujis" on every
connection.
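
The "on every connection" part is easiest to get right by wrapping the
connect call. Below is a sketch of that idea in Python rather than PHP. The
names connect_with_charset, RecordingConnection and RecordingCursor are
invented for this example; the only assumption about the real driver is
that it gives you a connection with cursor(), and a cursor with execute()
and close().

```python
from typing import Any, Callable

def connect_with_charset(connect: Callable[[], Any],
                         charset: str = "ujis") -> Any:
    """Open a connection via `connect` and immediately issue SET NAMES."""
    # The charset is interpolated into SQL, so accept only simple names.
    if not charset.replace("_", "").isalnum():
        raise ValueError(f"suspicious charset name: {charset!r}")
    conn = connect()
    cur = conn.cursor()
    try:
        cur.execute(f"SET NAMES {charset}")
    finally:
        cur.close()
    return conn

class RecordingCursor:
    """Stand-in cursor that records statements instead of running them."""
    def __init__(self, conn: "RecordingConnection") -> None:
        self._conn = conn
    def execute(self, sql: str) -> None:
        self._conn.statements.append(sql)
    def close(self) -> None:
        pass

class RecordingConnection:
    """Stand-in for a real database connection, for demonstration only."""
    def __init__(self) -> None:
        self.statements: list = []
    def cursor(self) -> RecordingCursor:
        return RecordingCursor(self)

conn = connect_with_charset(RecordingConnection, "ujis")
print(conn.statements)
```

With a real driver you would pass in a function that opens the actual
connection; the stand-in classes are only there so the snippet runs on its
own.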

--
Jason Stubbs