Difference between revisions of "Converting your MySQL database to UTF8"


Note: You are currently viewing documentation for Moodle 2.0. Up-to-date documentation for the latest stable version is available here: Converting your MySQL database to UTF8.


Revision as of 06:27, 19 January 2011

This document looks at how to convert your MySQL database from the latin1 charset to UTF8. Moodle now requires that your database uses UTF8 and will not upgrade if it does not. The steps below guide you through converting your database so that upgrades work again.

For more information about UTF8, see the Unicode documentation.

Why?

You may see the following error when upgrading your Moodle.

It is required that you store all your data in Unicode format (UTF-8). New installations must be performed into databases that have their default character set as Unicode. If you are upgrading, you should perform the UTF-8 migration process (see the Admin page).


Moodle has required UTF8 since Moodle 1.8 in order to provide better multilingual support. However, the UTF8 check during install and upgrade was only implemented in Moodle 2.0. You may therefore find you are unable to upgrade, either because your database was not set up correctly when you first installed Moodle or because you have been running Moodle since before 1.8 and have not yet converted your database.

Converting an empty database

If you have created your database schema and receive the error during your initial installation, your Moodle database will still be empty. In that case you can resolve the issue by running the following query against your database, replacing mydatabasename with the name of your database:

alter database mydatabasename charset=utf8;
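To confirm what the database default actually is after the change, something along the following lines may help. This is only a sketch: charset_sql is a hypothetical helper name, mydatabasename is a placeholder, and the collation utf8_general_ci is borrowed from the conversion steps later on this page. The helper only prints SQL, so nothing touches the server until you pipe its output to the mysql client.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that sets a database's default charset
# to utf8 and then reports the resulting default charset and collation.
charset_sql() {
    db=$1
    printf 'ALTER DATABASE `%s` CHARACTER SET utf8 COLLATE utf8_general_ci;\n' "$db"
    printf "SELECT default_character_set_name, default_collation_name FROM information_schema.SCHEMATA WHERE schema_name = '%s';\n" "$db"
}

# Print the statements for review (mydatabasename is a placeholder).
charset_sql mydatabasename
```

Once the printed statements look right, they could be applied with, for example, charset_sql mydatabasename | mysql -uusername -ppassword.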

Converting a database containing tables

If you have previously installed Moodle and are now getting the error, the following process will allow you to convert your database.

Linux & Mac

mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql
cp dump.sql dump-fixed.sql
vim dump-fixed.sql
:%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
:wq
mysql -uusername -ppassword < dump-fixed.sql

or alternatively using sed:

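# Arguments: $1 = database username, $2 = password, $3 = database name
# Dump the database, writing the data out as utf8.
mysqldump -u"$1" -p"$2" -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B "$3" > dump.sql
# Rewrite the database and table default charsets, keeping dump.sql as a backup.
sed -e 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/' -e 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/' dump.sql > dump-fixed.sql
# Restore the edited dump over the existing database.
mysql -u"$1" -p"$2" < dump-fixed.sql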

Explained

The following steps will guide you through creating a database dump, editing the dump so that the correct charset and collation are used, and then restoring the new database.

To start please open a new terminal and move to a temp directory.

mysqldump -uusername -ppassword -c -e --default-character-set=utf8 --single-transaction --skip-set-charset --add-drop-database -B dbname > dump.sql

The first step is to dump the database using mysqldump. We do, however, need to set several arguments in order to clean up the charsets and produce a dump that will not cause problems if you move this database to a different database server or have to restore it on a reverted system.

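username
The username used to access your database.
password
The password for the above user.
-c
Complete inserts (INSERT statements that include column names) for better compatibility.
-e
Extended inserts (multiple rows per INSERT statement) for better performance.
--default-character-set=utf8
Sets utf8 as the default character set for the dump.
--single-transaction
Dumps all tables within a single transaction, giving a consistent snapshot of transactional tables such as InnoDB without locking them.
--skip-set-charset
Omits the SET NAMES statement from the dump. It is not wanted or needed as we are changing the charset anyway.
--add-drop-database
Adds a DROP DATABASE statement before each CREATE DATABASE statement. Required so we can restore over the top of our existing database.
-B
Treats dbname as a database name so that the dump contains CREATE DATABASE and USE statements. The CREATE DATABASE statement is one of those we will edit.
dbname
The name of the database to convert.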

When you run this command, a database dump will be written to dump.sql.

cp dump.sql dump-fixed.sql

The next step is to copy dump.sql to dump-fixed.sql.

We will make the desired changes in dump-fixed.sql and keep dump.sql unchanged as a backup, just in case.

vim dump-fixed.sql
:%s/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/
:%s/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/
:wq

Now we need to edit the dump and correct the charsets that have been used. I have chosen to do this with Vim; however, you can use any editor or program that supports search and replace. (I chose Vim only because every Linux user is, or should be, familiar with it.)

First we open the file using VIM, and then run the three commands.

The first command replaces DEFAULT CHARACTER SET latin1 with DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci on every line where it appears. This fixes the database's default charset and collation.

The second command replaces DEFAULT CHARSET=latin1 with DEFAULT CHARSET=utf8 on every line where it appears. This converts all tables from latin1 to UTF8.

The third command simply saves it and exits.
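To see what the two substitutions do without touching a real dump, they can be tried on a small stand-in file. The sketch below is illustrative only: the two sample statements are assumptions modelled on typical mysqldump output, the database and table names are made up, and sed is used in place of Vim so the example can run unattended.

```shell
#!/bin/sh
# Build a tiny stand-in for dump.sql (illustrative content only).
workdir=$(mktemp -d)
cat > "$workdir/dump.sql" <<'EOF'
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `moodle` /*!40100 DEFAULT CHARACTER SET latin1 */;
CREATE TABLE `mdl_user` (`id` int) ENGINE=InnoDB DEFAULT CHARSET=latin1;
EOF

# Apply the same two substitutions as the Vim commands above.
sed -e 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/' \
    -e 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/' \
    "$workdir/dump.sql" > "$workdir/dump-fixed.sql"

# Show what changed; diff exits non-zero when the files differ, hence || true.
diff "$workdir/dump.sql" "$workdir/dump-fixed.sql" || true
```

Running the same diff on your real dump.sql and dump-fixed.sql is a quick way to review exactly which statements were rewritten before restoring.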

mysql -uusername -ppassword < dump-fixed.sql

Now that we've made the required changes, we simply need to restore the edited dump over the top of the existing database. We can do this by running the above command.
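Because the restore drops and recreates the database, it may be worth checking the edited dump first. The following is a minimal sketch, not part of the original procedure: check_dump is a hypothetical helper that succeeds only when the file no longer declares latin1 as a default charset, and the two sample files exist purely to demonstrate it.

```shell
#!/bin/sh
# Hypothetical pre-restore check: succeed only if the given dump file
# no longer declares latin1 as a default charset.
check_dump() {
    if grep -Eq 'DEFAULT (CHARACTER SET |CHARSET=)latin1' "$1"; then
        echo "latin1 defaults remain in $1" >&2
        return 1
    fi
    echo "$1 looks converted"
}

# Demonstrate on two small sample files (illustrative content only).
tmp=$(mktemp -d)
echo 'CREATE TABLE t (id int) DEFAULT CHARSET=latin1;' > "$tmp/bad.sql"
echo 'CREATE TABLE t (id int) DEFAULT CHARSET=utf8;' > "$tmp/good.sql"
check_dump "$tmp/good.sql"
check_dump "$tmp/bad.sql" || echo "restore skipped"
```

In practice the check could gate the restore, for example check_dump dump-fixed.sql && mysql -uusername -ppassword < dump-fixed.sql.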

Windows

Could someone who is familiar with Windows please convert the above into something that will work on Windows? There are likely several additional arguments for mysqldump you will need to specify including setting the file for output using -r to avoid newline issues.

More information