Talk:UTF-8 and BOM

Introduction

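Since Moodle transitioned to UTF-8, files may contain multi-byte characters so that multiple character sets can be used. A UTF-8 file can optionally begin with a Byte Order Mark (BOM), the three bytes EF BB BF. In UTF-16 and UTF-32 the BOM signals the endianness of the file stream; in UTF-8 the byte order is fixed, so the BOM acts only as an encoding signature, but some editors still write it, e.g. into the language PHP files.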
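
Whether a given file starts with a BOM can be checked by looking at its first three bytes. The following is only a minimal sketch: the function name has_bom is made up for illustration, and it assumes a shell with the standard od and tr utilities.

```shell
# has_bom FILE
# Succeeds (exit status 0) if FILE begins with the UTF-8 BOM, bytes EF BB BF.
# has_bom is an illustrative name, not part of Moodle or any standard tool.
has_bom() {
    # od -N3 dumps only the first three bytes as hex; tr strips the spacing.
    [ "$(od -An -tx1 -N3 < "$1" | tr -d '[:space:]')" = "efbbbf" ]
}
```

It could be used as, for example, has_bom path/to/file.php && echo "BOM found", where path/to/file.php stands for whichever file you want to check.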


The problem

Including the BOM in language files can cause stray bytes to be sent from the web server to the client's browser. These bytes can appear before the '<!DOCTYPE ...>' declaration at the top of the HTML stream. Internet Explorer may then ignore the 'DOCTYPE' and switch to 'Quirks' mode, which in turn causes the page to be rendered incorrectly.
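
To confirm that a page is affected, it can help to look at the first bytes the server actually sends. The sketch below is an assumption-laden illustration rather than an official diagnostic: leading_bytes is a made-up helper name, and it relies only on the standard od, tr and sed utilities.

```shell
# leading_bytes [N]
# Print the first N bytes (default 12) of standard input as space-separated hex.
# leading_bytes is an illustrative name, not an existing tool.
leading_bytes() {
    od -An -tx1 -v -N "${1:-12}" |
        tr -s '[:space:]' ' ' |     # join od's output onto one line
        sed 's/^ //; s/ $//'        # trim the leading and trailing space
}
```

For example, saving a page from the site and running leading_bytes < saved_page.html (saved_page.html being a placeholder name) shows ef bb bf ahead of 3c 21 (the characters <!) when a BOM precedes the DOCTYPE.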


Solution

The solution is to ensure that every file you save that uses characters beyond plain ASCII is saved as UTF-8 without the BOM. For existing files this can be achieved from the command line (the $'...' quoting needs bash, and sed -i with \x escapes needs GNU sed) with '
grep -rl $'\xEF\xBB\xBF' . | xargs sed -i '1s/^\xEF\xBB\xBF//'
' - credit to Using awk to remove the Byte-order mark - or in Notepad++:


conversion solution.png
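
Where GNU sed is not available, the same clean-up can be sketched with more portable tools. This is only one possible approach under stated assumptions: strip_bom is a made-up function name, mktemp is assumed to be installed, and the variables f, first and tmp are plain shell globals.

```shell
# strip_bom FILE...
# Remove a leading UTF-8 BOM (EF BB BF) from each named file.
# Files that do not start with a BOM are left untouched.
# strip_bom is an illustrative name, not an existing tool.
strip_bom() {
    for f in "$@"; do
        first=$(od -An -tx1 -N3 < "$f" | tr -d '[:space:]')
        if [ "$first" = "efbbbf" ]; then
            tmp=$(mktemp) || return 1
            # tail -c +4 copies the file starting at its fourth byte,
            # i.e. everything after the three BOM bytes.
            # Writing back with cat keeps the file's owner and permissions.
            tail -c +4 < "$f" > "$tmp" && cat "$tmp" > "$f"
            rm -f "$tmp"
        fi
    done
}
```

It would be called as strip_bom path/to/file.php (a placeholder path). Taking a backup first is sensible, since the files are rewritten in place.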


Reference

This issue was first raised in Moodle Tracker 31343 as a consequence of 'Easy Post Format Breaks themes in Internet Explorer'.