UTF-8 DB
Warning: This page is no longer in use. The information contained on the page should NOT be seen as relevant or reliable. |
UTF-8 migration > Database conversion
This page is under construction!!
Only some preliminary ideas have been defined.
Differences between MySQL and PostgreSQL
Required 4.1! Detect how dabase/tables/columns have been created (charset...) How the set_names affect everything (from PHP to MySQL)
A big XML file will map all the required information to make the upgrade for global tables (like course, user etc)
Both modules and blocks will have their own XML file. (like those in /db *.sql) e.g.
<TABLE name="lesson_answers"> <FIELDS> <FIELD name="answer" method="PLAIN_SQL_UPDATE" type"text" length="0"> <SQL_DETECT_COURSE> SELECT l.course FROM {$CFG->prefix}lesson l, {$CFG->prefix}lesson_answers la WHERE l.id = la.lessonid AND la.id = RECORDID </SQL_DETECT_COURSE> </FIELD> <FIELD name="response" method="PLAIN_SQL_UPDATE" type"text" length="0"> <SQL_DETECT_COURSE> SELECT l.course FROM {$CFG->prefix}lesson l, {$CFG->prefix}lesson_answers la WHERE l.id = la.lessonid AND la.id = RECORDID </SQL_DETECT_COURSE> </FIELD> </FIELDS> </TABLE>
The XML file will specify how to locate the course and the user every field belongs to, because translation will be done from course->lang or user->lang or site->lang to utf-8.
Those XML files will be read by one processor. This processor will pre-check that there is XML info available for each table in the DB. The processor will annotate the exact point it is, allowing to resume on crash!
It will be a two-stages process. 1st to convert all the texts, then to convert tables and column encodings (a possible alternative is to duplicate the entire DB with empty tables and then, move all contents to the new one) ?
It's highly possible that we need to increase DB requirements to MySQL 4.1 (great for the future!). Other users should stick at 1.5.
Collation support inside MySQL is great so we could specify different collations per lang and, dynamicaly sort query results differently.
As postgresql is quite different, an alternate DB migration must be performed for it.
Indexes could break under MySQL (because utf strings are 3x in size). Don't forget it!
create it able to run under different methods: - Under PHP application mode. - Processing items in chunks, saving the latest performed table and record to be able of resume in the exact point of break. - Able to be skipped for sites in only one encoding (making manual iconv to the database dump and importing again.
two stages, the conversion and the altering schema.
Check that the XML contains information for each $CFG->prefix table.
Try to convert table and column encoding following the XML too (reading metadata?)
Indexes to long. It's going to be a problem, sure!!
Modify all the DB creation scripts inside Moodle to specify encoding!!
All the fields in a table at the same time, record by record.
Deactivate the cron (creating a new $CFG->stopcron variable) and check that the site in under single-user mode (maintenance mode enabled).
List of tables and fields that need to be converted
(?) = in progress, not sure yet (x) = quite sure =P
Please let me know any table that I don't need to worry about.
adodb_logsql (?)
assignment (x) - name, description - course->lang = select c.lang from course c left join assignment a on c.id = a.course where a.id = a.id - user->lang = -
assignment_submissions (x) - comment, data1, data2 - course->lang = select c.lang from course c left join assignement a on c.id = a.course left join assignment_submissions as ass on a.id = ass.assignment where ass.id = ass.id - user->lang = select u.lang from user u left join assignment_submissions as ass on u.id = ass.userid where ass.id=ass.id
backup_config (?)
backup_courses (x)
backup_files (x)
backup_ids (?)
backup_log (?)
block (x)
block_instance (?) -configdata (all empty?)
block_pinned (?) -configdata (?)
block_rss_client (?) -title, preferredtitle, description(?)
cache_filters (?)
cache_text (?)
chat (x) -name, intro
chat_messages (x) -message
chat_users (x)
choice (x) -name, text
choice_answers (x)
choice_options (x) -text
config (x) -value see hokudai page 2 in db, chars embedded in tags
config_plugins (?)
course (x) password(raw string, see koniv database), fullname, shortname, idnumber, summary, teacher, teachers, student, students
course_allowed_modules (x)
course_categories (x) -name, description
course_display (x)
course_meta (x)
course_modules (x)
course_request (?)
course_sections (x) -summary
data (x) -name, intro
data_comments (x)
data_content (x) -conent(composite)
data_fields (x) -name, description
data_ratings (x)
data_records (x)
dialogue (x) -name, intro
dialogue_conversations (x) -subject
dialogue_entries (x) -text;
downloads (?)
enrol_authorize (?)
enrol_paypal (?)
event (x) -name(composite!), description
exercise (x) -name, md5(password?)
exercise_assessments (x) -generalcomment, teachercomment
exercise_elements (x) -description
exercise_grades (x) -feedback
exercise_rubrics (x) -description
exercise_submissions (x) -title
forum (x) -name, intro
forum_discussions (x) -name
forum_posts (x) -subject, message, attachment (stripeed &#, no need to convert?)
forum_queue (x)
forum_ratings (x)
forum_read (x)
forum_subscriptions (x)
forum_track_prefs (x)
glossary (x) -name, intro
glossary_alias (x) -alias
glossary_categories (x) -name
glossary_comments (x) -comment
glossary_entries (x) -concept, definition
glossary_entries_categories (x)
glossary_formats (x)
glossary_ratings (x)
grade_category (x) -name
grade_exceptions (x)
grade_item (x)
grade_letter (x) -letter
grade_preferences (x)
groups -name, description, password(stripped), theme(?)
groups_members (x)
hotpot(x) -name, summary
hotpot_attempts (x)
hotpot_details (x) -details
hotpot_questions (x) -name
hotpot_responses (?)
hotpot_strings (x) -string
journal (x) -name, intro
journal_entries (x) -comment, text
label (x) name, content (with html)
lams (???)
lesson (x) -name, password is (md5(NCR))?
lesson_answers (x) -answer, response
lesson_attempts (?) -useranswer(?)
lesson_branch (x)
lesson_default (x) -password(md5(NCR))?
lesson_grades (x)
lesson_high_scores
lesson_pages (x) -title, content
lesson_timer (x)
log (x) -info (hokai as well)
log_display (x)
message (x) -message (not NCR!)
message_contacts (x)
message_read (x) -message (not NCR!)
modules (x)
quiz (x) -name, intro, password(md5)?
quiz_answers (x) -answer, feedback
quiz_attemptonlast_datasets (x)
quiz_attempts (x)
quiz_calculated (x)
quiz_categories (x) -name, info
quiz_dataset_definitions (?)
quiz_dataset_items (?)
quiz_essay (x)
quiz_essay_states (??)
quiz_grades (x)
quiz_match (x)
quiz_match_sub (x) -questiontext, answertext
quiz_multianswers (x)
quiz_multichoice (x)
quiz_newest_states (x)
quiz_numerical (x)
quiz_numerical_units (x) -unit
quiz_question_datasets (x)
quiz_question_instances (x)
quiz_question_versions (x)
quiz_questions (x) -name, questiontext
quiz_randomsamatch (x)
quiz_rqp (?)
quiz_rqp_servers (?)
quiz_rqp_states (?)
quiz_rqp_types (?)
quiz_shortanswer (x)
quiz_states (x) answer - composite!
quiz_truefalse (x)
registry (x) -sitename, adminname
registry_votes (?)
resource (x) -name, summary, alltext
scale (x) -name, description, scale
scorm (?) -name, summary
scorm_scoes (?)
scorm_scoes_track (?)
sessions (?)
stats_daily (x)
stats_monthly (x)
stats_user_daily (?) -stattype
stats_user_monthly (x)
stats_user_weekly (x)
stats_weekly(x)
survey (x) -name, intro
survey_analysis (x) -notes
survey_answers (?) -answerone, answertwo??
survey_questions (?)
timezone(x)
user (x) -firstname, lastname, department, institution, address, city, description, secret(?)
user_admins (x)
user_coursecreators (x)
user_preferences (x)
user_students (x)
user_teachers (x)
wiki (x) -name, summary, pagename, initialcontent
wiki_entries (x) -pagename
wiki_pages (x) -pagename, content, author, ref
workshop (x) -name, description
workshop_assessments (x) -generalcomment, teachercomment
workshop_comments (x) -comments
workshop_elements (x) -description
workshop_grades (x) -feedback
workshop_rubrics (x) -description
workshop_stockcomments (x) -comments
workshop_submissions (x) -title