WikklyText fully supports multilingual content in both traditional multibyte character sets and Unicode. Source file encodings can be specified in two ways:
- Files written in UTF-8, UTF-8-SIG, UTF-16LE, and UTF-16BE formats are automatically detected from their Byte Order Mark (BOM).
- Non-UTF multibyte formats are declared by placing the following comment somewhere in the wikitext:
/% encoding: ENCODING %/
Here, ENCODING is the name of any encoding supported by Python (for example, koi8-r or big5).
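The two detection rules above can be sketched in Python as follows. This is a minimal illustration, not WikklyText's actual implementation: the names `decode_wikitext`, `_BOMS`, and `_MARKER` are hypothetical, the fallback to UTF-8 when neither rule applies is an assumption, and the marker search assumes the encoding is ASCII-compatible so the comment can be found in the raw bytes.

```python
import codecs
import re

# BOM byte sequences and the codec used for the bytes that follow them.
# (Hypothetical table; UTF-32 is not handled, matching the list above.)
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches a comment such as: /% encoding: koi8-r %/
_MARKER = re.compile(rb"/%\s*encoding:\s*([A-Za-z0-9_.\-]+)\s*%/")


def decode_wikitext(raw: bytes, default: str = "utf-8") -> str:
    """Decode raw wikitext bytes to a Unicode string."""
    # Rule 1: a BOM at the start of the file decides the encoding.
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(name)
    # Rule 2: otherwise look for an encoding comment anywhere in the text.
    match = _MARKER.search(raw)
    name = match.group(1).decode("ascii") if match else default
    # bytes.decode raises LookupError if Python does not know the encoding.
    return raw.decode(name)


if __name__ == "__main__":
    sample = "/% encoding: koi8-r %/\nЗдравствуй, мир!".encode("koi8-r")
    print(decode_wikitext(sample))
```

Checking the BOM first means a UTF file never needs an encoding comment, while a legacy file without a BOM falls through to the comment search.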
First, a sample showing multilingual content taken directly from this source file, which is encoded in UTF-8 (as are all the samples in the following table):
| Language | Sample |
|---|---|
| Bulgarian | Здравей, свят! |
| Chinese, simplified | 世界你好! |
| Chinese, traditional | 哈囉,大家好! |
| English | Hello world! |
| Estonian | Tere kõik! |
| Georgian | სალამი მსოფლიოს! |
| Greek | Γεια σου, κόσμε! |
| Hebrew | םלועה לכל םולש |
| Japanese | 世界よ、こんにちは! |
| Korean | 안녕하세요, 여러분! |
| Persian | سلام بر هم |
| Polish | Witaj świecie! |
| Russian | Здравствуй, мир! |
| Serbian | Поздрав свима! |
| Turkish | Merhaba, dünya! |
| Ukrainian | Привіт світ! |
| Vietnamese | Chào thế giới ! |
If the characters are not displayed correctly in your browser, you may need to install the appropriate font.
It is possible to mix content encodings within a document by using the <<include ...>> macro to import other files. The table below combines files in a variety of encodings, both traditional multibyte encodings and UTF-8. WikklyText handles them using a combination of BOM detection and /% encoding: ... %/ comments.
| Filename | Language (Encoding): Sample |
|---|---|
| bulgarian.txt | Bulgarian (UTF-8): Здравей, свят! |
| chinese-simp.txt | Chinese, simplified (GB2312): 世界你好! |
| chinese-trad-big5.txt | Chinese, traditional (Big5): 哈囉,大家好! |
| chinese-trad-utf8.txt | Chinese, traditional (UTF-8): 哈囉,大家好! |
| english.txt | English (ASCII): Hello world! |
| estonian.txt | Estonian (ISO-8859-15): Tere kõik! |
| georgian.txt | Georgian (UTF-8): სალამი მსოფლიოს! |
| greek.txt | Greek (ISO-8859-7): Γεια σου, κόσμε! |
| hebrew.txt | Hebrew (ISO-8859-8): !םלועה לכל םולש |
| japanese-eucjp.txt | Japanese (EUC-JP): 世界よ、こんにちは! |
| japanese-shiftjis.txt | Japanese (Shift-JIS): 世界よ、こんにちは! |
| korean.txt | Korean (EUC-KR): 안녕하세요, 여러분! |
| persian.txt | Persian (UTF-8): سلام بر هم |
| polish.txt | Polish (ISO-8859-2): Witaj świecie! |
| russian.txt | Russian (KOI8-R): Здравствуй, мир! |
| serbian.txt | Serbian (UTF-8): Поздрав свима! |
| turkish.txt | Turkish (UTF-8): Merhaba, dünya! |
| ukrainian.txt | Ukrainian (KOI8-U): Привіт світ! |
| vietnamese.txt | Vietnamese (UTF-8): Chào thế giới ! |
(The text samples were taken from GNU hello, which is Copyright (C) 2006 Free Software Foundation, Inc.)
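To illustrate why files in different encodings can share one table, here is a small Python sketch that encodes a few of the samples above in the legacy encodings listed for them, decodes each one back, and joins the results into a single Unicode document. It does not use WikklyText's API; `SAMPLES` and `build_rows` are hypothetical names, and the in-memory byte strings stand in for the included files.

```python
# (filename, Python codec name, sample text) for a few rows of the table above.
SAMPLES = [
    ("chinese-simp.txt", "gb2312", "世界你好!"),
    ("japanese-eucjp.txt", "euc-jp", "世界よ、こんにちは!"),
    ("japanese-shiftjis.txt", "shift_jis", "世界よ、こんにちは!"),
    ("korean.txt", "euc-kr", "안녕하세요, 여러분!"),
    ("russian.txt", "koi8-r", "Здравствуй, мир!"),
    ("ukrainian.txt", "koi8-u", "Привіт світ!"),
]


def build_rows(samples):
    """Decode each legacy-encoded sample and format it as a table row."""
    rows = []
    for filename, encoding, text in samples:
        raw = text.encode(encoding)     # bytes as they would be stored on disk
        decoded = raw.decode(encoding)  # Unicode text once the encoding is known
        rows.append(f"| {filename} | {decoded} |")
    return rows


if __name__ == "__main__":
    # After decoding, every row is ordinary Unicode text and can be mixed freely.
    document = "\n".join(build_rows(SAMPLES))
    print(document)
```

The two Japanese files hold different bytes on disk yet decode to the same text, which is why each file's encoding has to be known before its contents are combined with others.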