On Monday 12 September 2005 02:51, Francesco R wrote:
> Jason Stubbs wrote:
> >On Monday 12 September 2005 00:05, Francesco R wrote:
> >>http://dev.gentoo.org/~vivo/doc/mysql-update.html
> >
> >With step 2, you should probably mention the issues that can arise with
> >non-ASCII data in char fields. The character set really needs to be
> > specified in the dump. After the upgrade to 4.1, the default charset of
> > the server should be set to something compatible and then the charset
> > of the data should be specified to mysql when re-importing the backup.
>
> --default-character-set=charset
> should be that of the my.cnf config file; mysqldump doesn't permit a
> per-table setting of this variable.
> The only option for users who need that is to dump the tables one at a
> time and then concatenate the results.
>
> Importing into mysql-4.1 is OK, provided your default character set is
> utf8.
>
> Russian, Asian, or anyone else who has experience with this, please
> speak up now to correct what is stated here.

I had a 4.0 database with strings mostly stored in SJIS that I upgraded to
4.1 a while back. 4.1 uses the "connection character set" to do on-the-fly
translation from the db encoding to the connection encoding. This also
happens when importing data, so if you haven't got the character set of the
data correct, it'll get corrupted on the way in.
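
To make the failure mode concrete, here is a minimal sketch that imitates
the conversion step using only Python's codecs. It does not talk to MySQL;
the helper `import_as` and the sample string are made up for illustration,
and Python's "latin-1" codec only approximates what the server calls latin1.

```python
# Simulation only: Python codecs stand in for the server's charset conversion.
original = "日本語"                        # sample text (assumption)
dump_bytes = original.encode("shift_jis")  # bytes as they sit in the dump


def import_as(raw: bytes, declared: str, server: str = "utf-8") -> bytes:
    """Decode raw using the declared charset, then re-encode for the server.

    Hypothetical helper: it mirrors "interpret the data as <declared>,
    store it as <server>", which is the conversion described above.
    """
    return raw.decode(declared).encode(server)


# Correct declaration: the text survives the conversion.
good = import_as(dump_bytes, "shift_jis")

# Wrong declaration: every byte is treated as a latin-1 character, so the
# stored result is valid utf-8 but no longer the original text.
bad = import_as(dump_bytes, "latin-1")

print(good.decode("utf-8"))
print(bad.decode("utf-8"))
```

The second conversion raises no error, which is why this kind of
corruption tends to go unnoticed until the data is read back.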

Related to the automatic conversion: some fields in the DB contain raw URLs
with unescaped (non-URL-encoded) parameters that can be in any character
set. These fields had to be set to BINARY, so that no conversion is applied
to them, for them to be usable.

Another gotcha related to this is that PHP's mysql support defaults to
latin-1 encoding (at least on my current installation) and has no setting
for it. The only solution there was to execute "SET NAMES ujis" on every
connection.
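
The same workaround can be sketched for any client library that lacks a
charset setting. The example below is in Python rather than PHP and is only
an illustration: `connect_with_charset` is a hypothetical helper, and
`connect` stands for whatever DB-API 2.0 style connect function your driver
provides.

```python
def connect_with_charset(connect, charset="ujis", **kwargs):
    """Open a connection and issue SET NAMES before handing it back.

    `connect` is assumed to be a DB-API 2.0 style callable returning an
    object with a cursor() method; it is not tied to a specific driver.
    """
    # The charset name is interpolated into SQL, so allow plain names only.
    if not charset.replace("_", "").isalnum():
        raise ValueError("unexpected character set name: %r" % charset)

    conn = connect(**kwargs)
    cur = conn.cursor()
    try:
        # Tells the server which encoding this client sends and expects.
        cur.execute("SET NAMES " + charset)
    finally:
        cur.close()
    return conn
```

Routing every connection through one helper like this keeps the statement
from being forgotten on some code path.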

--
Jason Stubbs