Development:Languages:Tim's crazy proposal based on maketext

Latest revision as of 12:07, 23 November 2009

They say that the border between madness and genius is very narrow. Here goes.

The best article I know about the problems of localising software is http://search.cpan.org/~ferreira/Locale-Maketext-1.13_82/lib/Locale/Maketext/TPJ13.pod. I particularly like the narrative in the first half. Also, I must warn you that I have not read much about localisation, so my endorsement may not mean much.

OK, so the key point it makes is that a language string like "There have been $a quiz attempts" is really a function (in the mathematical sense of a mapping, not necessarily as a programming language construct). Depending on $a, we want it to output:

  • There have been no quiz attempts
  • There has been one quiz attempt
  • There have been 42 quiz attempts

So the question is: why not make it a function in the programming-language sense as well?
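As a minimal sketch of that idea, the quiz-attempts string could be written as a plain PHP function. The function name quizattempts is hypothetical and only serves this illustration; it is not part of Moodle.

```php
<?php
// Hypothetical illustration: the string 'There have been $a quiz attempts'
// written as a real function of $a, so each case gets correct grammar.
function quizattempts($a) {
    if ($a == 0) {
        return 'There have been no quiz attempts';
    } else if ($a == 1) {
        return 'There has been one quiz attempt';
    }
    return "There have been $a quiz attempts";
}

echo quizattempts(0), "\n";
echo quizattempts(1), "\n";
echo quizattempts(42), "\n";
```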


Two representations

Up to Moodle 1.9, the files Moodle used at runtime were exactly the same as the files that translators edited. That was convenient, but limited us to a human-readable and editable format. Also, it meant that Moodle had to do a lot of searching at runtime.

In Moodle 2.0 we are already proposing to split the representations, which lets us optimise the runtime format to be pretty much whatever we like, without making the format edited by translators impossible to work with.


Proposed runtime lang file syntax

In moodledata/lang/ there are subfolders like en/ (note we lose the legacy _utf8) that contain files like mod_quiz.php or core_moodle.php, that is, files named using the new component naming convention.

Suppose we have a hypothetical plugin admin/report/dylan with its current lang file admin/report/dylan/lang/en_utf8/report_dylan.php that contains:

   <?php
   $string['howmanyroads'] = 'How many roads must a man walk down?';
   $string['roadsx'] = 'Roads: $a';
   $string['xroadsfromytowns'] = '$a->numroads roads from at least $a->numcities different cities.';
   // Can't really handle pluralisation in that last one in Moodle :-(

In my new proposal, moodledata/lang/en/report_dylan.php will contain:

   <?php
   class strings_en_report_capability extends strings_base {
       // PHP does not allow a function call in a property initialiser,
       // so the singleton helper is fetched through a static method instead.
       protected static function helper() { return lang_helper_en::get(); }
       public function howmanyroads($a) { return 'How many roads must a man walk down?'; }
       public function roadsx($a) { return self::helper()->quant($a, 'road'); }
       public function xroadsfromytowns($a) {
           return self::helper()->quant($a->numroads, 'road') . ' from at least ' .
                   self::helper()->quant($a->numcities, 'different city', 'different cities') . '.';
       }
   }

(I manually line-wrapped that last example to make it readable. In reality, remember that this file is being automatically compiled from some more human-readable source.)

Also note that moodledata/lang/fr/report_dylan.php will look like:

   <?php
   include_once($CFG->langdir . '/en/report_dylan.php');
   class strings_fr_report_capability extends strings_en_report_capability {
       // Fetched through a static method, because PHP does not allow a
       // function call in a property initialiser.
       protected static function helper() { return lang_helper_fr::get(); }
       // Takes $a, like the parent method, so the signatures stay compatible.
       public function howmanyroads($a) { return 'Combien de rue ...'; }
       // etc.
   }

And moodledata/lang/fr_ca/report_dylan.php is:

   <?php
   include_once($CFG->langdir . '/fr/report_dylan.php');
   // Extends the fr class, so Canadian French falls back to French and then to English.
   class strings_fr_ca_report_capability extends strings_fr_report_capability { }
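The point of chaining the classes is that PHP's ordinary method lookup does the parent-language fallback for us. A minimal sketch, using hypothetical demo_strings_* classes that stand in for the compiled files and leave out the helper:

```php
<?php
// Hypothetical, simplified stand-ins for the compiled lang classes.
class demo_strings_en {
    public function howmanyroads($a) { return 'How many roads must a man walk down?'; }
    public function roadsx($a) { return "Roads: $a"; }
}

class demo_strings_fr extends demo_strings_en {
    // Only the translated string is overridden.
    public function howmanyroads($a) { return 'Combien de rue ...'; }
}

// Canadian French defines nothing of its own here.
class demo_strings_fr_ca extends demo_strings_fr {
}

$strings = new demo_strings_fr_ca();
echo $strings->howmanyroads(null), "\n"; // Found in the fr class.
echo $strings->roadsx(3), "\n";          // Not translated, so found in the en class.
```

No searching through language packs happens at runtime; the class hierarchy encodes the fallback order.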

Using that runtime format

Then, string_manager becomes:

   class string_manager {
       // Cache of string class instances, indexed by language and component.
       protected $stringclasses = array();
       public function get_string($identifier, $component = '', $a = null) {
           $component = $this->fix_legacy_component_names($component);
           return $this->get_string_class(current_language(), $component)->$identifier($a);
       }
       protected function get_string_class($lang, $component) {
           if (!isset($this->stringclasses[$lang][$component])) {
               $this->stringclasses[$lang][$component] =
                       $this->load_string_class($lang, $component);
           }
           return $this->stringclasses[$lang][$component];
       }
       protected function load_string_class($lang, $component) {
           global $CFG;
           $file = "$CFG->langdir/$lang/$component.php";
           $class = "strings_{$lang}_{$component}";
           if ($CFG->langediting && !$this->is_up_to_date($file)) {
               compile_lang_strings($lang, $component);
           }
           if (!is_readable($file)) {
               return new strings_base(); // See below.
           }
           include_once($file);
           if (!class_exists($class)) {
               throw new coding_exception($file . ' did not define the ' . $class .
                       ' class. There must be a bug in compile_lang_strings.');
           }
           return new $class();
       }
       // A few other methods omitted.
   }

Note that if you have a developer/translator flag on ($CFG->langediting), then is_up_to_date checks various file timestamps, so that lang files are automatically recompiled as needed for those people, without hurting runtime performance on production sites.
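The proposal does not spell out the timestamp check, so the following is only a sketch of one way it might work. The function name lang_file_is_up_to_date and its $sources parameter are assumptions made for this illustration; the real is_up_to_date would have to work out for itself which translator-edited files feed a given compiled file.

```php
<?php
// Hypothetical sketch: a compiled lang file counts as up to date when it
// exists and none of its source files has been modified more recently.
function lang_file_is_up_to_date($compiled, array $sources) {
    clearstatcache(); // Avoid stale cached modification times.
    if (!is_readable($compiled)) {
        return false; // Never compiled, or deleted.
    }
    $compiledtime = filemtime($compiled);
    foreach ($sources as $source) {
        if (is_readable($source) && filemtime($source) > $compiledtime) {
            return false; // A source was edited after the last compile.
        }
    }
    return true;
}
```

Because the check only runs when $CFG->langediting is on, production sites would never pay for these extra stat calls.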

As a final bit of magic, we have:

   class strings_base {
       public function __call($name, $arguments) {
           return "$name";
       }
   }
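To see what __call buys us, here is a small self-contained sketch. The fallback_strings_* class names are hypothetical; the __call body mirrors the one above by returning the identifier itself.

```php
<?php
// Hypothetical stand-in for strings_base: PHP invokes __call for any
// method that the object does not define.
class fallback_strings_base {
    public function __call($name, $arguments) {
        return $name;
    }
}

class fallback_strings_en extends fallback_strings_base {
    public function howmanyroads($a) { return 'How many roads must a man walk down?'; }
}

$strings = new fallback_strings_en();
$identifier = 'nosuchstring';
echo $strings->howmanyroads(null), "\n"; // A defined string.
echo $strings->$identifier(null), "\n";  // Undefined, so __call returns the identifier.
```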

Remember that strings_en_report_capability inherits from this. This gives us our classic missing-string fallback. Also:

   class lang_helper_en {
       private static $inst = null;
       // Return the singleton instance, creating it on first use.
       public static function get() {
           if (!self::$inst) {
               self::$inst = new lang_helper_en();
           }
           return self::$inst;
       }
       // Format a number with the right form of a noun, for example '1 road' or '3 roads'.
       public function quant($number, $singular, $plural = '') {
           if ($number == 1) {
               return "$number $singular";
           } else if ($plural) {
               return "$number $plural";
           } else {
               return "$number {$singular}s";
           }
       }
       // Other helper functions.
   }
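Each language gets its own helper precisely because the rules differ. As a hedged sketch of what a French counterpart to quant might look like (the class name demo_lang_helper_fr is hypothetical, and the singleton plumbing is left out), note that French uses the singular for zero as well as for one:

```php
<?php
// Hypothetical sketch of a French helper; not part of the proposal's code.
class demo_lang_helper_fr {
    public function quant($number, $singular, $plural = '') {
        if ($number < 2) {
            // French uses the singular for both 0 and 1.
            return "$number $singular";
        } else if ($plural) {
            return "$number $plural";
        }
        return "$number {$singular}s";
    }
}

$helper = new demo_lang_helper_fr();
echo $helper->quant(0, 'rue'), "\n";
echo $helper->quant(3, 'rue'), "\n";
echo $helper->quant(2, 'cheval', 'chevaux'), "\n";
```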


Problems

I think this gives us the fastest possible runtime performance (particularly when combined with a PHP accelerator). Of course, it leaves open the following problems:

  • what string format should translators edit? (I suggest we copy the maketext format.)
  • can we write the compile_lang_strings function?

