Languages subsystem improvements 2.0

Revision as of 10:42, 23 November 2009 by Tim Hunt (Proposals)

Languages subsystem improvements
Project state: Research and planning
Tracker issue: n/a
Discussion: n/a
Assignee: David Mudrak

Moodle 2.0

This is an initial proposal of changes to the language strings processing in Moodle.

Current issues

String files are not branched 
We must keep all strings from all branches in place for backwards compatibility, so we cannot easily clean up the language packs. Some say that branching and merging would be too great a burden for our translators.
Plural forms, gender forms and other grammar 
We are unable to handle plurals at all. Handling plural forms the gettext way is a traditional, well-tested and robust approach (see MDL-4790). MDL-12433 by Sam Marshall shows an alternative approach based on logical expressions.
Strings can't be modified 
It is difficult to notify translators that a string was modified (expanded, fixed, changed). The current workaround is the policy of adding another string with the same name plus a suffix (like 'license2'). It would be nice if such strings were tagged or highlighted in the translation UI.
We do not use standard formats 
Translators can't use specialized tools for translation (PO/gettext editors, community translation portals). Also, I am not aware of any benchmark comparing the performance of our native $string[] format with, for example, the standard .po format.
More syntax checks are required 
So that translators do not break Moodle functionality (see MDL-12433).
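
As one illustration of such a syntax check, the sketch below compares the placeholders found in a translation with those in the English original. It is written in Python for brevity rather than in Moodle's PHP. The function names are hypothetical, and it assumes placeholders of the form `$a` and `$a->field`; it is not the check proposed in MDL-12433.

```python
import re

# Assumed placeholder syntax: "$a" or "$a->field".
PLACEHOLDER = re.compile(r"\$a(?:->[A-Za-z_][A-Za-z0-9_]*|\b)")


def placeholders(text):
    """Return the set of placeholders that occur in a string."""
    return set(PLACEHOLDER.findall(text))


def check_translation(original, translation):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    missing = placeholders(original) - placeholders(translation)
    unknown = placeholders(translation) - placeholders(original)
    for name in sorted(missing):
        problems.append("missing placeholder " + name)
    for name in sorted(unknown):
        problems.append("unknown placeholder " + name)
    return problems
```

A translation UI could run such a check before saving and refuse, or at least flag, any string for which the returned list is not empty.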


  1. Do not reinvent the wheel. Keep to the "do one thing and do it well" principle. Keep it simple and stupid.
  2. Make simple things easy and hard things possible.
  3. ...

Key design questions

What is the data structure for storing the master copies of the lang packs that translators work on 
At the moment it is a plain PHP array, editable via the translation UI or directly. Petr proposes keeping these strings in the database instead, syncable with some central repository. Whatever the format, we must be able to store some metadata: the timestamp of the last modification, the author name, proposed alternatives, comments, etc. (see the Rosetta translation tool at Launchpad for an example of possible metadata).
What is the UI for translators, what are the processes of contributing and how the translations are redistributed to Moodle sites 
Our translators should not be forced to use a single tool. We should consider switching to a standardized common format (like PO or XLIFF) that is supported by a variety of advanced tools (equipped with translation memory, connected with dictionaries, i18n portals etc).
What is the data structure Moodle uses at runtime 
This is just a performance optimization (an implementation detail). It should be independent of the native format that humans work with, so that it can be modified at any time in the future. For example, see the system proposed by Tim based on calling class methods (inspired by Perl's Maketext).
What is the format of a lang string, and how are placeholders substituted 
This is the most important issue we have at the moment, but because it is strongly tied to the runtime format, it can be changed at any time. On the other hand, both the UI and the storage format must support it.
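
To make the separation between these questions concrete, here is a minimal sketch in Python (not Moodle's PHP). It is not a proposed design. The `MasterString` fields, the `compile_runtime` step and the simplified `get_string` signature are all hypothetical, and the `$a` / `$a->field` placeholder syntax is an assumption.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed placeholder syntax: "$a" or "$a->field".
PLACEHOLDER = re.compile(r"\$a(?:->([A-Za-z_][A-Za-z0-9_]*)|\b)")


@dataclass
class MasterString:
    """One string in the master copy, with hypothetical metadata fields."""
    component: str
    identifier: str
    text: str
    author: str
    modified: datetime
    comments: list = field(default_factory=list)
    alternatives: list = field(default_factory=list)


def compile_runtime(master_strings):
    """Drop the metadata and keep only what is needed at runtime."""
    return {(s.component, s.identifier): s.text for s in master_strings}


def get_string(runtime, identifier, component, a=None):
    """Look up a string and substitute placeholders from a value or a dict."""
    text = runtime[(component, identifier)]
    if a is None:
        return text

    def repl(match):
        name = match.group(1)  # None for a bare "$a"
        return str(a if name is None else a[name])

    return PLACEHOLDER.sub(repl, text)


now = datetime(2009, 11, 23, tzinfo=timezone.utc)
master = [
    MasterString("admin", "langedit", "Language editing", "translator", now),
    MasterString("moodle", "welcome", "Welcome, $a->firstname!", "translator", now),
]
runtime = compile_runtime(master)
print(get_string(runtime, "welcome", "moodle", {"firstname": "Tim"}))
```

Because only `compile_runtime` knows both formats, the master storage could later move to a database, PO or XLIFF without touching the lookup code.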

Use cases

  1. Developers add new strings to the core
  2. Translators translate untranslated core strings and publish their work
  3. Admins want to locally modify the language pack
  4. Contributors add new strings to the contributed code
  5. Translators translate untranslated contrib strings and publish their work
  6. ...


This is the list of projects, resources and tools being explored

  • Great CPAN article about software localization. Plain string based lexicon is not enough. Strings can be translated by functions only. "A phrase is a function; a phrasebook is a bunch of functions."
  • XLIFF - XML Localization Interchange File Format
  • Virtaal - promising, we could have XLIFF <-> .php conversion
  • Launchpad - translation portal used by Ubuntu and many other projects. Would require BSD licensing, therefore IMO not suitable as we could not import our current GPL'ed translation. Seems to be pretty slow during the process.
  • Plural forms in gettext
  • Zend_Translate reference guide
  • MDL-12433 - Sam Marshall's proposal
  • MediaWiki approach: Grammar forms and plurals:
    {{plural:1|is|are}} {{plural:2|is|are}}
    (Example of how MediaWiki outputs the correct plural form for a given count. For languages such as Russian, the plural transformation is based on "count mod 10".)
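
The plural handling discussed in the last items can be sketched as follows. This is an illustrative Python sketch, not MediaWiki's or gettext's implementation. The function name `expand_plurals` and the tag regular expression are hypothetical, and the two rules follow the commonly published gettext plural expressions for English and Russian.

```python
import re

# Plural rules modelled on the usual gettext "Plural-Forms" expressions.
# Each rule maps a count to the index of the form to use.
PLURAL_RULES = {
    "en": lambda n: 0 if n == 1 else 1,
    "ru": lambda n: (0 if n % 10 == 1 and n % 100 != 11
                     else 1 if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14
                     else 2),
}

# Matches tags such as {{plural:2|is|are}}.
PLURAL_TAG = re.compile(r"\{\{plural:(\d+)\|([^{}]*)\}\}")


def expand_plurals(text, lang):
    """Replace every {{plural:N|form|form...}} tag with the form chosen for N."""
    rule = PLURAL_RULES[lang]

    def repl(match):
        count = int(match.group(1))
        forms = match.group(2).split("|")
        # Fall back to the last form if the translator supplied too few.
        return forms[min(rule(count), len(forms) - 1)]

    return PLURAL_TAG.sub(repl, text)


print(expand_plurals("There {{plural:1|is|are}} 1 and there {{plural:2|is|are}} 2", "en"))
```

The Russian rule shows why a plain singular/plural pair is not enough: 21 takes the same form as 1, while 11 takes the same form as 5.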


File format translators work with

Translation tool

From Martin in Dev chat:

if you want crazy ideas, how about get_string returns some special tags and those tags get converted to ajax on the GUI so that translators can translate directly in the main Moodle GUI?

What a cool idea. Could be a special mode you have to turn on in the admin screens. Perhaps even if you turned this mode on, it would still only be active for people with certain roles, or perhaps when it was turned on, it would have to apply to all roles, so that you could edit strings for not-logged-in users. Anyway, when this mode was on, it would:

  1. Add <span class="moodle-lang-string" id="lang_string|admin|langedit">Language editing</span> around each string on the page (to use one example).
  2. Use $PAGE->requires->js to include an extra JS file that adds an on-click handler to all such spans, so that clicking one pops up the language editing UI in a YUI dialogue.
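
The first step could look roughly like the sketch below. It is in Python for brevity; the real change would live in Moodle's PHP get_string. The function name `get_string_editable` and its parameters are hypothetical, and only the span markup is taken from the example above.

```python
from html import escape


def get_string_editable(identifier, component, text, editing=False):
    """Return the string, wrapped in a marker span when editing mode is on."""
    if not editing:
        return text
    # The id encodes where the string comes from, so the JS handler
    # can tell the editing UI which string was clicked.
    span_id = "lang_string|{}|{}".format(component, identifier)
    return '<span class="moodle-lang-string" id="{}">{}</span>'.format(
        escape(span_id, quote=True), text)


print(get_string_editable("langedit", "admin", "Language editing", editing=True))
```

One caveat with this approach is that strings used inside HTML attributes or page titles cannot carry a span, so the mode would probably have to skip or specially handle them.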

Runtime file format

See also