Note: You are currently viewing documentation for Moodle 2.0. Up-to-date documentation for the latest stable version is available here: UTF-8 PostgreSQL.

UTF-8 PostgreSQL

From MoodleDocs
Revision as of 23:10, 2 April 2006 by Patrick Li (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

UTF-8 migration > Database conversion > PostgreSQL notes

This page is under construction!!

PostgreSQL UTF8 Migration How-to

This is a guide on how to migrate your moodle installation on PostgeSQL database to UTF8.

All the data in the existing database must be translated to UTF8 valid characters. As PostgreSQL does not allow you to change the database encoding of an existing database, you need to create a new database with encoding UNICODE (PostgreSQL 7.x) or UTF8 (PostgreSQL 8.x). If you are already using a database with UNICODE/UTF8 encoding, the migration will process the data and save back to your existing database.

This process requires a *nix operating system, like any linux distribution or MacOSX. It will not work on windows version of PostgreSQL as windows does not have the command line tools "grep" and "iconv".

Migration Procedure

  1. Create a new database with encoding UNICODE or UTF8 if you are not using one;
  2. Go to the admin panel and click on the migration link.
  3. The script will check for the settings those required to commence the migration.
  4. If your current database is not in UNICODE/UTF8, the script will require you to provide the connection parameters of the UTF8 database.
    NOTE: You might notice that there is an extra option called "PostgreSQL Cluster". This is intended to pass the "--cluster" parameter to the PostgreSQL command line tools while copying the database. If you do not have more than one PostgreSQL server installed on the system or you have no idea about it, just leave it empty.
  5. Start the process following the instructions.
    NOTE: If you happened to get any errors in the copying data process you might want to do that manually. To do it manually please follow the instructions below this section.
  6. After the process finishes, modify your config.php to point the database to the new UTF8 database.

Manually Copy your current database to the new UNICODE database

  1. Make a plain text dump of your current db.
    Run command:
    PGCLIENTENCODING='UNICODE' PGDATABASE={dbname} pg_dump -Fp -O -x -U {dbuser} > tempfile1
    Replace {dbname} and {dbuser} with your own values, these are for your old non-unicode database. Supply --cluster, -h, -p to pg_dump if necessary.
  2. Strip out "COMMENT ON SCHEMA" statement which will cause error while importing with a non-root user.
    Run command:
    grep -v "COMMENT ON SCHEMA" < tempfile1 > tempfile2
  3. Strip out any non-unicode characters with iconv
    Run command:
    iconv -f UTF-8 -t UTF-8 -c < tempfile2 > tempfile3
  4. Import the data to your new unicode database
    Run command:
    PGDATABASE={dbname} psql -q -U {dbuser} -v ON_ERROR_STOP=1 < tempfile3
    Replace {dbname} and {dbuser} with your own values, these are for your new unicode database. Supply --cluster, -h, -p to psql if necessary. The ON_ERROR_STOP=1 parameter will make psql stop immediately upon any errors.

Now you have your unicode database with all your old data. Run the migration script again to finish the migration. The script is clever enought not to copy the data again if you already done so.