Languages:Tim's crazy proposal based on maketext


Latest revision as of 13:38, 14 July 2021

Warning: This page is no longer in use. The information contained on the page should NOT be seen as relevant or reliable.

They say that the border between madness and genius is very narrow. Here goes.

The best article I know about the problems of localising software is http://search.cpan.org/~ferreira/Locale-Maketext-1.13_82/lib/Locale/Maketext/TPJ13.pod. I particularly like the narrative in the first half. Also, I must warn you that I have not read much about localisation, so my endorsement may not mean much.

The article mentioned is also available from the authors' homepage: "Localizing Your Perl Programs" (http://interglacial.com/tpj/13/) --Frank Ralf 00:34, 21 July 2011 (WST)

OK, so the key point it makes is that a language string like "There have been $a quiz attempts" is really a function (in the mathematical sense of a mapping, not necessarily as a programming language construct). Depending on $a, we want it to output:

  • There have been no quiz attempts
  • There has been one quiz attempt
  • There have been 42 quiz attempts

So the question is: why not make it a function in the programming-language sense as well?
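
As a minimal sketch of that idea, the three outputs above could come from an ordinary PHP function. The name quiz_attempts_string is hypothetical and is not part of Moodle; it only illustrates a string behaving as a mapping from $a to text.

```php
<?php
// Hypothetical illustration: the language string written as a real function.
function quiz_attempts_string($a) {
    if ($a == 0) {
        return 'There have been no quiz attempts';
    } else if ($a == 1) {
        return 'There has been one quiz attempt';
    }
    return "There have been $a quiz attempts";
}

echo quiz_attempts_string(42), "\n";
```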


Two representations

Up to Moodle 1.9, the files Moodle used at runtime were exactly the same as the files that translators edited. That was convenient, but limited us to a human-readable and editable format. Also, it meant that Moodle had to do a lot of searching at runtime.

In Moodle 2.0 we are already proposing to split the representations, which lets us optimise the runtime format to be pretty much whatever we like, without making the format that translators edit impossible to work with.


Proposed runtime lang file syntax

In moodledata/lang/ there are subfolders like en/ (note we lose the legacy _utf8) that contain files like mod_quiz.php or core_moodle.php, that is, using the new component naming convention.

Suppose we have a hypothetical plugin admin/report/dylan with its current lang file admin/report/dylan/lang/en_utf8/report_dylan.php that contains:

<?php
$string['howmanyroads'] = 'How many roads must a man walk down?';
$string['roadsx'] = 'Roads: $a';
$string['xroadsfromytowns'] = '$a->numroads roads from at least $a->numcities different cities.';
// Can't really handle pluralisation in that last one in Moodle :-(

In my new proposal, moodledata/lang/en/report_dylan.php will contain:

<?php
class strings_en_report_capability extends strings_base {
    // PHP does not allow a method call in a static property initialiser,
    // so the singleton helper is fetched through an accessor instead.
    protected static function helper() { return lang_helper_en::get(); }
    public function howmanyroads($a) { return 'How many roads must a man walk down?'; }
    public function roadsx($a) { return static::helper()->quant($a, 'road'); }
    public function xroadsfromytowns($a) {
        return static::helper()->quant($a->numroads, 'road') . ' from at least ' .
                static::helper()->quant($a->numcities, 'different city', 'different cities') . '.';
    }
}

(I manually line-wrapped that last example to make it readable. In reality, remember that this file is being automatically compiled from some more human-readable source.)

Also note that moodledata/lang/fr/report_dylan.php will look like:

<?php
include_once($CFG->langdir . '/en/report_dylan.php');
class strings_fr_report_capability extends strings_en_report_capability {
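    // Helper accessor: PHP does not allow a method call in a static property initialiser.
    protected static function helper() { return lang_helper_fr::get(); }
    // The parameter is kept (with a default) so the signature stays compatible with the parent.
    public function howmanyroads($a = null) { return 'Combien de rue ...'; }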
    // etc.
}

And moodledata/lang/fr_ca/report_dylan.php is:

<?php
include_once($CFG->langdir . '/fr/report_dylan.php');
class strings_fr_ca_report_capability extends strings_fr_report_capability {
}
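
The point of this class chain is that a missing translation falls back to the parent language through ordinary inheritance. A cut-down sketch of that behaviour, using hypothetical demo_* classes and a placeholder French string rather than the real compiled files:

```php
<?php
// Hypothetical stand-ins for the compiled en, fr and fr_ca classes.
class demo_strings_en {
    public function howmanyroads($a = null) { return 'How many roads must a man walk down?'; }
    public function roads($a = null) { return 'Roads'; }
}
class demo_strings_fr extends demo_strings_en {
    // Placeholder translation, for illustration only.
    public function roads($a = null) { return 'Routes'; }
}
class demo_strings_fr_ca extends demo_strings_fr {
    // Nothing overridden: everything comes from fr, then en.
}

$strings = new demo_strings_fr_ca();
echo $strings->roads(), "\n";        // Found in the fr class.
echo $strings->howmanyroads(), "\n"; // Not translated, so the en method is used.
```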

Using that runtime format

Then the string_manager class becomes:

class string_manager {
    protected $stringclasses = array();
    public function get_string($identifier, $component = '', $a = null) {
        $component = $this->fix_legacy_component_names($component);
        // Call the method named after the string identifier and return its result.
        return $this->get_string_class(current_language(), $component)->$identifier($a);
    }
    protected function get_string_class($lang, $component) {
        // Instantiate each lang/component class at most once per request.
        if (!isset($this->stringclasses[$lang][$component])) {
            $this->stringclasses[$lang][$component] =
                    $this->load_string_class($lang, $component);
        }
        return $this->stringclasses[$lang][$component];
    }
    protected function load_string_class($lang, $component) {
        global $CFG;
        $file = "$CFG->langdir/$lang/$component.php";
        $class = "strings_{$lang}_{$component}";
        // Only developers and translators pay for the timestamp check.
        if (!empty($CFG->langediting) && !$this->is_up_to_date($file)) {
            compile_lang_strings($lang, $component);
        }
        if (!is_readable($file)) {
            return new strings_base(); // See below.
        }
        include_once($file);
        if (!class_exists($class)) {
            throw new coding_exception($file . ' did not define the ' . $class .
                    ' class. There must be a bug in compile_lang_strings.');
        }
        return new $class();
    }
    // A few other methods omitted.
}

Note that if you have a developer/translator flag on ($CFG->langediting) then is_up_to_date checks various file timestamps, so that lang files can automatically be recompiled as needed for those people, without hurting runtime performance for production sites.
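
The timestamp check could look something like the sketch below. The function name and the explicit list of source files are assumptions made for illustration; the proposal does not say how is_up_to_date finds the files a compiled file was built from.

```php
<?php
// Hypothetical sketch: a compiled lang file is current if it exists and is
// at least as new as every source file it was compiled from.
function lang_file_is_up_to_date($compiledfile, array $sourcefiles) {
    clearstatcache(); // Make sure we see fresh modification times.
    if (!is_readable($compiledfile)) {
        return false;
    }
    $compiledtime = filemtime($compiledfile);
    foreach ($sourcefiles as $sourcefile) {
        if (is_readable($sourcefile) && filemtime($sourcefile) > $compiledtime) {
            return false; // A source was edited after the last compile.
        }
    }
    return true;
}
```

Because this only runs when $CFG->langediting is on, production sites never pay for the extra filemtime calls.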

As a final bit of magic, we have

class strings_base {
    public function __call($name, $arguments) {
        return "[[$name]]";
    }
}
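
A small self-contained check of how that fallback behaves, using hypothetical fallback_demo_* classes in place of the real ones: a defined string is returned normally, while an unknown identifier is routed to __call and comes back in double square brackets.

```php
<?php
// Hypothetical stand-ins for strings_base and one compiled string class.
class fallback_demo_base {
    public function __call($name, $arguments) {
        return "[[$name]]";
    }
}
class fallback_demo_en extends fallback_demo_base {
    public function howmanyroads($a = null) { return 'How many roads must a man walk down?'; }
}

$strings = new fallback_demo_en();
$identifier = 'nosuchstring';
echo $strings->howmanyroads(), "\n";    // The defined string.
echo $strings->$identifier(null), "\n"; // No such method, so __call answers.
```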

Remember that strings_en_report_capability inherits from this. This gives us our classic missing-string fallback. Also:

class lang_helper_en {
    private static $inst = null;
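    public static function get() {
        // Static properties must be accessed through self::.
        if (!self::$inst) {
            self::$inst = new lang_helper_en();
        }
        return self::$inst;
    }
    public function quant($number, $singular, $plural = '') {
        if ($number == 1) { // Comparison, not assignment.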
            return "$number $singular";
        } else if ($plural) {
            return "$number $plural";
        } else {
            return "$number {$singular}s";
        }
    }
    // Other helper functions.
}
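
Each language gets its own helper because plural rules differ between languages. As a hedged sketch, a French helper might differ from the English one only in its test for the singular, since French uses the singular form for 0 as well as 1. The class name and body below are assumptions; the proposal names lang_helper_fr but does not show it.

```php
<?php
// Hypothetical sketch of a French helper; not taken from the proposal.
class lang_helper_fr_sketch {
    private static $inst = null;
    public static function get() {
        if (!self::$inst) {
            self::$inst = new self();
        }
        return self::$inst;
    }
    public function quant($number, $singular, $plural = '') {
        if (abs($number) < 2) {
            // French treats 0 and 1 as singular.
            return "$number $singular";
        } else if ($plural) {
            return "$number $plural";
        }
        return "$number {$singular}s";
    }
}

echo lang_helper_fr_sketch::get()->quant(0, 'rue'), "\n";
echo lang_helper_fr_sketch::get()->quant(3, 'rue'), "\n";
```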


Problems

I think this gives us the fastest possible runtime performance (particularly when combined with a PHP accelerator). Of course, it leaves open the following problems:

  • what string format do translators edit? (I suggest we copy the maketext format.)
  • can we write the compile_lang_strings function?
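
To give a feel for the second problem, here is a very rough sketch of compiling one maketext-style string, written in bracket notation such as [quant,_1,road], into the PHP expression that would form a method body. Everything here is an assumption: the function name, the restriction to a single argument ($a standing for _1), the $helper variable in the output, and the lack of any escaping rules. A real compile_lang_strings would need far more than this.

```php
<?php
// Hypothetical sketch: compile one maketext-style string into a PHP expression.
// Handles only [method,_1,arg,...] groups and plain text.
function compile_maketext_string($source) {
    $parts = preg_split('/(\[[^\]]*\])/', $source, -1,
            PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $compiled = array();
    foreach ($parts as $part) {
        if (preg_match('/^\[.*\]$/s', $part)) {
            $args = explode(',', substr($part, 1, -1));
            $method = array_shift($args); // For example 'quant'.
            array_shift($args);           // Drop '_1'; it is always $a in this sketch.
            $phpargs = array('$a');
            foreach ($args as $arg) {
                $phpargs[] = var_export($arg, true);
            }
            $compiled[] = '$helper->' . $method . '(' . implode(', ', $phpargs) . ')';
        } else {
            // Plain text becomes a quoted PHP string literal.
            $compiled[] = var_export($part, true);
        }
    }
    return implode(' . ', $compiled);
}

echo compile_maketext_string('[quant,_1,road] must a man walk down'), "\n";
```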


See also

  • Languages