UTF-8 initial steps

Warning: This page is no longer in use. The information contained on the page should NOT be seen as relevant or reliable.

Initial steps

Setting minimum requirements for Moodle 1.6

Since the Moodle 1.0 release, the LAMP environment has evolved continuously. Both PHP and MySQL are better products than they were a few years ago (more functionality, safer, more standards-compliant, faster...). The current Moodle 1.5 works with any PHP (>=4.1.x) and MySQL (>=3.23.x) version released in the last 5 years.

Although this has been considered a feature up to now (Moodle runs in nearly every LAMP environment), for some time reality has been telling us (shouting at us!) that it's time to raise those minimum requirements:

  • PHP has evolved positively, with better support for many features (better OOP, Unicode support, speed...). In the real world nearly every host is running at least PHP 4.3.x (released in December 2002!), so it isn't unreasonable to set PHP 4.3 as the minimum requirement. Read more about PHP details below.
  • MySQL, currently at version 5.x, introduced a lot of interesting changes in the 4.1.12 release (boolean queries, sub-queries, caching of results, Unicode support...). Today, supporting pre-4.1 versions forces us to build a lot of workarounds into Moodle code, which makes it harder to maintain and adds many alternative code paths depending on the DB being used. Finally, due to some problems found at the Unicode level, the required version will be 4.1.16.
  • PostgreSQL: 7.4. Why? Well, just ask the PostgreSQL gurus... ;-)
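
As a rough illustration of how such minimums could be checked, here is a minimal sketch built on PHP's version_compare(). The helper name meets_minimum() is hypothetical and is not part of Moodle; the database version string would have to come from the running server, so a literal is used here purely as an example.

```php
<?php
// Hypothetical helper (not Moodle code): true when $installed is at least $required.
function meets_minimum($installed, $required) {
    return version_compare($installed, $required, '>=');
}

// Check the running PHP against the 4.3 minimum quoted in the text.
$php_ok = meets_minimum(PHP_VERSION, '4.3.0');
echo 'PHP ' . PHP_VERSION . ($php_ok ? ' meets' : ' does not meet') . " the 4.3 minimum\n";

// Example MySQL version string (an assumption; a real check would ask the server).
$mysql_version = '4.1.12';
echo 'MySQL ' . $mysql_version . (meets_minimum($mysql_version, '4.1.16') ? ' meets' : ' does not meet') . " the 4.1.16 minimum\n";
```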

PHP details

A good list of mandatory/recommended configuration options for the web server and PHP is given in the Installation Documentation. Also, to make the transition to UTF-8 easier under Moodle 1.6, some libraries are highly recommended and, depending on the languages being used, mandatory. These libraries (PHP extensions) are:

  • iconv
  • mbstring
  • recode

Although there are a lot of languages whose conversion to UTF-8 doesn't require them (you will learn more about this below), these extensions will be automatically detected by Moodle and used if present, so having them installed and enabled is strongly recommended (and mandatory to upgrade some sites, don't forget it!). The next topic explains why these extensions are so important and when they are a must.
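
To see which of these extensions a given PHP installation provides, something like the following sketch could be used. It relies only on PHP's extension_loaded(); the helper name detect_extensions() is hypothetical, and this is not necessarily how Moodle performs its own detection.

```php
<?php
// Extensions named in the list above.
$wanted = array('iconv', 'mbstring', 'recode');

// Hypothetical helper (not Moodle code): map each extension name
// to true/false depending on whether PHP has it loaded.
function detect_extensions($names) {
    $found = array();
    foreach ($names as $name) {
        $found[$name] = extension_loaded($name);
    }
    return $found;
}

// Print a small report for the current installation.
foreach (detect_extensions($wanted) as $name => $loaded) {
    echo $name . ': ' . ($loaded ? 'available' : 'missing') . "\n";
}
```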


Creating a common text library to be used everywhere

After comparing some alternatives, we've selected some useful code from the Typo3 project. The main features of its charset-handling libraries are:

  • They automatically use iconv/mbstring/recode/internal code, depending on what is available.
  • They are stable enough to be used safely.
  • A lot of UTF-8 functions are available (substr, strlen, strtoupper, strtolower, strpos, strrpos...).
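
The first point, picking whichever conversion backend is available, can be sketched roughly as follows. This is an illustration only, not the Typo3 or Moodle code: the function name convert_with_fallback() is hypothetical, the order in which backends are tried is an assumption, and the "internal code" fallback is left out, so the sketch simply returns false when no extension is present.

```php
<?php
// Hypothetical sketch (not the Typo3/Moodle implementation):
// convert $text from charset $from to charset $to with the first backend found.
function convert_with_fallback($text, $from, $to) {
    if (function_exists('iconv')) {
        // iconv takes (from, to, string).
        return iconv($from, $to, $text);
    }
    if (function_exists('mb_convert_encoding')) {
        // mbstring takes (string, to, from): note the reversed order.
        return mb_convert_encoding($text, $to, $from);
    }
    if (function_exists('recode_string')) {
        // recode takes a "from..to" request string.
        return recode_string($from . '..' . $to, $text);
    }
    // No extension available; a real library would use internal tables here.
    return false;
}

// "café" in ISO-8859-1 is 4 bytes; in UTF-8 the é becomes two bytes.
$utf8 = convert_with_fallback("caf\xE9", 'ISO-8859-1', 'UTF-8');
echo ($utf8 === false) ? "no conversion backend found\n" : bin2hex($utf8) . "\n";
```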

We have wrapped all those Typo3 functions in a new class, textlib.class.php. This will allow us to add more string manipulation functions and to switch to, or add, libraries other than Typo3.

To use the class functions, all you have to do is:

1. Get one instance of the class (currently it's a singleton):

     $textlib = textlib_get_instance();

2. Execute the desired function:

     $len = $textlib->strlen($string, current_charset());

Currently it includes these functions:

  • convert($text, $fromCS, $toCS): to convert strings between charsets.
  • substr($text, $start, $len, $charset)
  • strlen($text, $charset)
  • strtolower($text, $charset)
  • strtoupper($text, $charset)
  • strpos($haystack, $needle, $offset): UTF-8 only!
  • strrpos($haystack, $needle): UTF-8 only!
  • specialtoascii($text,$charset)

($charset is optional and defaults to 'utf-8' in every function above that accepts it)
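
To show why charset-aware functions matter, here is a minimal stand-in for a charset-aware strlen. It is not the real textlib code: the name sketch_strlen() is hypothetical, and returning false for non-UTF-8 charsets when no extension is available is an assumption made to keep the sketch short.

```php
<?php
// Hypothetical stand-in (not the real textlib): count characters, not bytes.
function sketch_strlen($text, $charset = 'utf-8') {
    if (function_exists('mb_strlen')) {
        return mb_strlen($text, $charset);
    }
    if (function_exists('iconv_strlen')) {
        return iconv_strlen($text, $charset);
    }
    if (strtolower($charset) !== 'utf-8') {
        // Assumption: without an extension this sketch only handles UTF-8.
        return false;
    }
    // Internal fallback: count every byte that is not a
    // UTF-8 continuation byte (bit pattern 10xxxxxx).
    $count = 0;
    $bytes = strlen($text);
    for ($i = 0; $i < $bytes; $i++) {
        if ((ord($text[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$word = "caf\xC3\xA9"; // "café" encoded as UTF-8
echo 'bytes: ' . strlen($word) . ', characters: ' . sketch_strlen($word) . "\n";
```

The plain strlen() counts bytes, so any code that truncates or measures UTF-8 text with it can miscount or cut a character in half. Calls such as $textlib->strlen($string, current_charset()) avoid that by working on characters.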

Related links