Yes, MoinWorksWithChinese.

I especially want to point out that MoinMoin should work with Chinese in utf-8 or another Unicode encoding. I insist that all ChineseWikis be encoded in utf-8 or another Unicode scheme.

I have test-driven MoinMoin for a few days in utf-8 on an iMac. The result is exciting. Basically, it works!

Based on what I have tested, I will collect the results below:

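Some notes, tips, bugs, and ideas about using Chinese in MoinMoin

This page records the difficulties encountered when using Chinese in MoinMoin and the areas that still need improvement.

  1. Chinese punctuation. Ideally, Chinese and English punctuation marks would be fully equivalent, and the system would automatically choose which form to use from the context in which the mark appears. For example, the Chinese form would be used after a Chinese character and the Western form after Western text. When exporting to other formats, the choice could be adjusted according to the system configuration. In addition, spaces could be used to separate the words in Chinese text, so that the system can segment words automatically or semi-automatically and authors can correct the machine's segmentation errors. The spaces between Chinese characters would be removed on display.

  2. Title index. In TitleIndex, all Chinese titles are listed under Others. This is not ideal: with a large number of Chinese titles it becomes very inconvenient. There are several possible solutions. They need careful planning and step-by-step implementation.

  3. Word index. Currently no word index can be generated. There are two ways to generate a Chinese word index: fully automatic and semi-automatic.

  4. xml. In the current implementation, the xml output carries an encoding attribute of 8859-1. In the future this needs to be changed to the value of encoding in moin_config.py.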
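
A minimal sketch of the xml item above, assuming the encoding is read from a configuration object. The `Config` class here is a hypothetical stand-in for moin_config, and `xml_declaration` and `render_xml` are illustrative names, not MoinMoin functions.

```python
from xml.sax.saxutils import quoteattr


class Config:
    """Hypothetical stand-in for the settings in moin_config.py."""
    encoding = "utf-8"


def xml_declaration(encoding):
    # Build the declaration from the configured encoding
    # instead of hard-coding 8859-1.
    return '<?xml version="1.0" encoding=%s?>' % quoteattr(encoding)


def render_xml(body, encoding):
    # The declared encoding and the bytes actually written must agree.
    document = xml_declaration(encoding) + "\n" + body
    return document.encode(encoding)


if __name__ == "__main__":
    print(render_xml("<title>\u4e2d\u6587</title>", Config.encoding))
```

The point of the sketch is only that the declaration and the byte encoding come from the same setting, so they cannot drift apart.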


Some notes, tips, bugs, and ideas about using Chinese in MoinMoin

This page records the difficulties of using Chinese with MoinMoin and the hopes for improvement.

  1. Symbols

    In Unicode, Chinese punctuation marks (full stop, comma, etc.) are encoded differently from their counterparts in other languages. This is STUPID and violates the principle of Unicode, but it is the reality. These symbols should be treated the same as their counterparts in other languages to avoid confusion in MoinMoin. For example, when such a symbol follows a WikiName, it is treated as part of the name, so MoinMoin fails to recognize the existing WikiName and takes it for a new one. Moreover, a symbol should render differently depending on the character (font) next to it, so that it looks good. Finally, extra spaces can be used to mark word boundaries in Chinese text so that searching works, but they should not be rendered on display.

  2. TitleIndex

    All Chinese titles are listed in the Others section of the index. This is not an optimal arrangement. There are many possible ways to improve it.

    How?
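
One possibility among many, sketched under assumptions: give each leading Chinese character its own heading instead of sending every Chinese title to Others. The functions `index_key` and `build_title_index` are hypothetical, and the character range covers only the basic CJK Unified Ideographs block. Grouping by pronunciation or stroke count would need extra data and is not shown.

```python
from collections import defaultdict


def index_key(title):
    """Pick the index heading for a non-empty page title."""
    first = title[0]
    if first.isascii() and first.isalpha():
        return first.upper()
    if "\u4e00" <= first <= "\u9fff":
        # Assumption: one heading per leading Chinese character.
        return first
    return "Others"


def build_title_index(titles):
    """Group titles by heading; headings and titles are sorted by code point."""
    groups = defaultdict(list)
    for title in titles:
        groups[index_key(title)].append(title)
    return {key: sorted(names) for key, names in sorted(groups.items())}


if __name__ == "__main__":
    for heading, names in build_title_index(
        ["FrontPage", "\u4e2d\u6587\u6a19\u9ede", "\u4e2d\u6587\u5206\u8a5e", "123Page"]
    ).items():
        print(heading, names)
```

With many distinct leading characters this produces many headings, so it is a starting point rather than a finished design.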

  3. WordIndex

    • Currently, MoinMoin generates no word index for Chinese. There are two ways to generate the word segmentation: fully automatic and semi-automatic. A button that performs the segmentation by inserting spaces at the segmentation points is the preferred way, since none of the current segmentation algorithms is optimal.

    How to do it better?
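
A rough sketch of the space-based approach, assuming the spaces are already in the page text (typed by the author or inserted by a segmentation button). Words are taken from the whitespace-separated tokens, and the spaces between Chinese characters are removed again for display. The function names are hypothetical and the character range covers only the basic CJK Unified Ideographs block.

```python
import re
from collections import defaultdict

# Assumption: only the basic CJK Unified Ideographs block counts as Chinese.
_CJK = "\u4e00-\u9fff"
_SPACE_BETWEEN_CJK = re.compile("(?<=[%s])[ ]+(?=[%s])" % (_CJK, _CJK))


def build_word_index(pages):
    """Map each whitespace-separated word to the set of page titles using it."""
    index = defaultdict(set)
    for title, text in pages.items():
        for word in text.split():
            index[word].add(title)
    return index


def strip_segmentation_spaces(text):
    """Remove spaces that sit between two Chinese characters, for display."""
    return _SPACE_BETWEEN_CJK.sub("", text)


if __name__ == "__main__":
    pages = {
        "PageA": "\u4e2d\u6587 \u5206\u8a5e",
        "PageB": "\u4e2d\u6587 \u6a19\u9ede",
    }
    print(sorted(build_word_index(pages)["\u4e2d\u6587"]))
    print(strip_segmentation_spaces(pages["PageA"]))
```

Spaces between Chinese and Western text are left alone, so mixed-language sentences keep their normal spacing.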


/!\ General comment: For everyone who does not know Chinese, code extensively commented in English would help more. ;)

I second that opinion/request.

MoinMoin: MoinWorksWithChinese (last edited 2007-10-29 19:08:11 by localhost)