A proposal for page name handling, planned for version 1.3. If you want to develop for MoinMoin, you should read this.

See also: MoinMoinBugs/CanCreateBadNames, MoinDev/Storage, QuotingWikiNames

Different name representations

As of version 1.3, moin uses Unicode internally and a page name can be any Unicode string.

Page names have 3 representations:

  1. as URL in links in the wiki or anywhere
  2. as internal Unicode string
  3. as a filename on any file system

(!) In the future pages might be saved in a database, with or without name limits.

Accepting URLs

URL -> [unquote %] -> Unicode -> [normalize] -> [quote for storage] -> Storage
  1. (./) Unquote characters using %hex quoting - done by cgi

  2. Replace "_" with " " - done today by unquoteWikiname, but it must be done in a different place: URLs should not be unquoted with the storage unquoter, and we should not support the internal storage format in URLs.
  3. (./) Decode from config.charset to Unicode - wikiutil.decodeUserInput

  4. Normalizing page names - see MoinMoinBugs/CanCreateBadNames

  5. Use Unicode for processing, like matching the list of page names
  6. (./) Quote name for storage if you want to save a page - wikiutil.quoteWikinameFS (using new quoting, patch-78)
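The first three steps can be sketched as follows. This is a minimal illustration in modern Python using only the standard library, not the MoinMoin code: `url_to_pagename` and its `charset` parameter are hypothetical names, and `charset` stands in for config.charset, assumed to be ASCII-compatible. Normalizing and storage quoting are left out.

```python
from urllib.parse import unquote_to_bytes


def url_to_pagename(url_path, charset="utf-8"):
    """Turn the page part of a URL into a Unicode page name (hypothetical helper)."""
    # Step 1: undo %hex quoting; the result is still encoded bytes.
    raw = unquote_to_bytes(url_path)
    # Step 2: "_" in a URL stands for a space (safe on bytes for ASCII-compatible charsets).
    raw = raw.replace(b"_", b" ")
    # Step 3: decode from the configured charset to Unicode.
    return raw.decode(charset)


print(url_to_pagename("Page_Name/Caf%C3%A9"))
```

Keeping this separate from the storage unquoter means a URL can never carry the internal storage format.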

Notes:

Normalizing page names

Normalizing page names and user names includes (in this order):

  1. Remove silly white space:
    Page Name / Sub    Page -> Page Name/Sub Page
  2. Remove multiple slashes
    PageName/  / /SubPage -> PageName/SubPage
  3. Remove leading and trailing "/" from page names, because it's used for sub pages.
    /PageName/ -> PageName
  4. Check length of name:
    • Limited by file name quoting, so we have to quote the name for this
    • Limited by offline wiki, need to add ".html" to each file name
    • Limited by temporary pages, which currently add "#PageName.timestamp#" around the quoted name - I think we should fix this wrong tempfile handling by using the tempfile module or a temp directory.

    (!) I think we can restrict page names more, maybe to Unicode alphanumeric characters only, but other developers don't like the idea. -- NirSoffer 2004-09-06 23:36:59
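Steps 1 to 3 above can be sketched in a few lines of Python. `normalize_pagename` is a hypothetical name, not an existing MoinMoin function, and the length check of step 4 is left out because it depends on the storage quoting.

```python
def normalize_pagename(name):
    """Apply steps 1-3 to a Unicode page name (hypothetical helper)."""
    # Step 1: strip each path component and collapse inner runs of white space.
    parts = [" ".join(part.split()) for part in name.split("/")]
    # Steps 2 and 3: dropping empty components removes repeated,
    # leading and trailing slashes in one pass.
    return "/".join(part for part in parts if part)


for raw in ["Page Name / Sub    Page", "PageName/  / /SubPage", "/PageName/"]:
    print(repr(raw), "->", repr(normalize_pagename(raw)))
```

Splitting on "/" first keeps the white space cleanup local to each component, so the three rules do not interfere with each other.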

Normalizing user and group names

User and group names will be restricted to Unicode alphanumeric characters, with one optional space character between words.

Normalizing user and group names includes (in this order):

  1. Normalize as a page name (see above), because groups are pages, and users are usually pages and might be pages in the future.
  2. Replace non-alphanumeric characters with the replacement character: "-"
    Page:Name -> Page-Name
    Page,Name -> Page-Name
    User/Name -> User-Name

(Code removed; it was not updated for the new description.)
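As a stand-in, here is a minimal sketch of the two rules as described, not the removed code. `normalize_username` is a hypothetical name, and the sketch assumes that a single space between words is the only non-alphanumeric character allowed to survive.

```python
def normalize_username(name):
    """Apply the user/group name rules to a Unicode string (hypothetical helper)."""
    # Step 1: normalize as a page name (white space and slash cleanup).
    parts = [" ".join(part.split()) for part in name.split("/")]
    name = "/".join(part for part in parts if part)
    # Step 2: replace everything that is neither alphanumeric nor a space with "-".
    # str.isalnum() is Unicode-aware, so non-ASCII letters and digits are kept.
    return "".join(ch if ch.isalnum() or ch == " " else "-" for ch in name)


for raw in ["Page:Name", "Page,Name", "User/Name"]:
    print(raw, "->", normalize_username(raw))
```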

Generating URLs

From a file name:
    
    Storage -> [unquote file name] -> Unicode -> ...

From Unicode:
    
    Unicode -> [encode to config.charset] -> [quote for URL] -> URL
  1. (./) unquote from storage - wikiutil.unquoteWikiname (using new quoting, patch-78)

  2. If needed, decode to do any processing on the Unicode name, then encode to config.charset
  3. (./) quote for URL - wikiutil.quoteWikinameURL
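The Unicode to URL direction can be sketched like this. It is an illustration in modern Python with the standard library, not the body of wikiutil.quoteWikinameURL: `pagename_to_url` is a hypothetical name, `charset` stands in for config.charset, and turning spaces into "_" is an assumption that mirrors the "_" handling when accepting URLs.

```python
from urllib.parse import quote


def pagename_to_url(name, charset="utf-8"):
    """Quote a Unicode page name for use in a URL (hypothetical helper)."""
    # Assumption: spaces are written as "_" so URLs stay readable.
    name = name.replace(" ", "_")
    # Encode to the configured charset, then %hex-quote the bytes.
    # "/" is kept as-is because it separates sub pages.
    return quote(name.encode(charset), safe="/")


print(pagename_to_url("Page Name/Caf\u00e9"))
```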

Notes:

Offline Wiki

Using moin_dump, we can save the wiki as a collection of html pages. In this version, links should use file names.

There are two options:

Notes:

If we do subpages by subdirectories (see StorageRefactoring/PagesAsBundles), we should also convert the HTML output of pages and subpages to a directory structure like this:

ParentPage
    index.html
    SubPage.html

Now both http://domain/ParentPage/ and http://domain/ParentPage/SubPage.html will work, and we don't have any problem with long wiki names like VeryLongChineseParentPage/VeryLongChineseSubPage, which could easily reach 254 characters using utf-8 encoding.
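The mapping from page names to files in such a layout can be sketched as below. Both helpers are hypothetical and ignore file name quoting; the second one shows why the layout helps with long names, since each path component is measured on its own.

```python
def html_path(pagename, has_subpages=False):
    """Relative path of the HTML file for a page (hypothetical helper)."""
    if has_subpages:
        # A page with sub pages becomes a directory holding index.html.
        return pagename + "/index.html"
    return pagename + ".html"


def longest_component(path, charset="utf-8"):
    """Length in bytes of the longest path component once encoded."""
    return max(len(part.encode(charset)) for part in path.split("/"))


print(html_path("ParentPage", has_subpages=True))
print(html_path("ParentPage/SubPage"))
print(longest_component(html_path("ParentPage/SubPage")))
```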

Obtaining page lists

The Page class now contains getPageList and getPageDict, which return a list of page names or a dict of {pagename: Page object} for user-readable pages (readable by the given user or by request.user, or all pages for user=""). /MoinEditorBackup pages are filtered out.

MoinMoin: PageNames (last edited 2007-10-29 19:10:42 by localhost)