Description

Moin (like MediaWiki) replaces a space character with _ to make URLs nicer (so the URL is Thomas_Waldmann rather than Thomas%20Waldmann).

Of course this is a problem if you really need an underscore somewhere, e.g. if you want to link to files on disk (I played with some "filesystem virtual page" plugin for doing that). The underscore must not be replaced by a blank by unquoting functions in that case.

So the first idea is of course to use %5f to encode. But that doesn't work because we first do url_unquote (makes an underscore char out of it) and then we replace all underscores by blanks... 8(
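The failure can be reproduced in a few lines. This is only a minimal sketch of the order of operations described above: `naive_url_to_pagename` is a hypothetical helper, not Moin's actual unquoting function, and it uses Python's standard `urllib.parse.unquote` as a stand-in for url_unquote.

```python
from urllib.parse import unquote

def naive_url_to_pagename(url_part):
    # Hypothetical helper, NOT MoinMoin's real code: it only mimics
    # the "unquote first, then replace underscores" order.
    name = unquote(url_part)        # "%5f" is decoded to "_" here ...
    return name.replace("_", " ")   # ... and then turned into a blank

result_plain = naive_url_to_pagename("Thomas_Waldmann")
result_escaped = naive_url_to_pagename("my%5ffile.txt")
```

Here `result_escaped` ends up with a blank where the escaped underscore was, which is exactly the problem: by the time the replace runs, an escaped underscore can no longer be told apart from a plain one.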

Usually this should be easily fixed by first doing replace("_", "%20"), then unquoting, and then NOT doing a replace("_", " "). But the unquote code in moin is a (partly even platform-dependent) mess...
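A sketch of that reordering, with the matching quoting direction added so the round trip can be checked. Both function names are hypothetical and the code relies only on Python's standard `urllib.parse`; it illustrates the idea and is not the code in moin.

```python
from urllib.parse import quote, unquote

def fixed_url_to_pagename(url_part):
    # Hypothetical helper: turn plain underscores into "%20" BEFORE
    # unquoting, and do no underscore replacement afterwards.
    return unquote(url_part.replace("_", "%20"))

def pagename_to_url(name):
    # Hypothetical inverse: quote() leaves "_" alone, so escape it by
    # hand first, then show the quoted blanks as underscores.
    quoted = quote(name)                    # " " -> "%20"
    quoted = quoted.replace("_", "%5F")     # real underscores are escaped
    return quoted.replace("%20", "_")       # blanks become "_"

name_plain = fixed_url_to_pagename("Thomas_Waldmann")
name_escaped = fixed_url_to_pagename("my%5ffile.txt")
url = pagename_to_url("a b_c")
round_trip = fixed_url_to_pagename(url)
```

With this order a plain underscore still means a blank, while %5f survives as a real underscore, so a name such as "a b_c" can be written to a URL and read back unchanged.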

Steps to reproduce

  1. compare resulting page name from those:

Details

This Wiki.

Workaround

None.

One may use a Unicode character that looks like an underscore (U+FF3F ＿, FULLWIDTH LOW LINE). However, the current version (1.5.2) doesn't handle the conversion well in some cases, though for basic uses this might suffice. The other disadvantage is that this character is not present in all fonts, so some visitors may see a square instead, and it makes entering the URL by hand complicated. -- hajma 27.2.2006

If you know about the problem, you can type "page name" instead of page_name as a search term. This seems to work perfectly under Windows/IIS; what about other platforms? This is not a real workaround, though, because it's counterintuitive and there will always be someone who does not get the memo.

Discussion

Shall we do a refactor branch for the quote / unquote mess?

/!\ This should be solved before we unify pages and attachments.

The problem is trying to be smart and making _ == ' '. This does create nicer URLs, but it causes problems for files, and for names like __foo__, which should be legal names, for example in a wiki about Python. The simplest and most robust solution is to remove the extra magic: don't replace _ with ' '. If a user wants a nice URL, he will use this_name; if he wants spaces, he will use this name, or he may use this-name or ThisName. Who cares? Let them use what they want. The page name is the URL; the page title can be anything the user likes.

Since changing this might break existing pages and links, let's make it a configuration option, disable the replacement by default in the next version, and see how users respond.

Some browsers (Safari) automatically unquote URLs when displaying them, so URLs always look nice anyway, e.g. this%20name is displayed as this name. -- NirSoffer 2006-02-11 21:36:40
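The configuration option suggested above could look roughly like this. It is only a sketch: the `Config` class and the `underscore_as_space` name are invented for illustration and are not actual MoinMoin settings, and the decoding uses Python's standard `urllib.parse.unquote`.

```python
from urllib.parse import unquote

class Config:
    # Hypothetical option name, not a real MoinMoin setting.
    # False means "no underscore magic", the proposed new default.
    underscore_as_space = False

def url_to_pagename(url_part, cfg):
    # Hypothetical helper: underscores become blanks only when the
    # wiki admin has explicitly switched the old behaviour back on.
    if cfg.underscore_as_space:
        url_part = url_part.replace("_", "%20")
    return unquote(url_part)

default_cfg = Config()
legacy_cfg = Config()
legacy_cfg.underscore_as_space = True

dunder_name = url_to_pagename("__foo__", default_cfg)
spaced_name = url_to_pagename("this%20name", default_cfg)
legacy_name = url_to_pagename("this_name", legacy_cfg)
```

With the default setting, __foo__ stays a legal page name and blanks have to be quoted as %20; wikis that depend on the old links can turn the option back on.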

I know this was marked as "fixed" in the 1.6 branch over a year ago, but we're still suffering from the problem in 1.5.x. Any chance there's a hack or a workaround we can apply until 1.6 is released? In the meantime I'll see if I can create my own fix, but I'm still pretty slow with Python/MoinMoin. Thanks. -- SteveDavison 2007-08-10 02:17:01

Personally, I think that turning spaces into underscores is just fine. Maybe it's not for everyone, but I'd prefer to have it as an option. The thing is... if you are going to equate "_" with " " in one case, you have to do it in every case. Searching for "some_word" and "some word" should be treated as exactly the same thing. -- SteveDavison 2007-08-11 05:03:28

The underscore magic is already removed in 1.6 (did more harm than good). For stuff containing blanks, we need quoting. BTW, google also does it with quoting. -- ThomasWaldmann 2007-08-11 11:04:18

Plan


CategoryMoinMoinBugFixed

MoinMoin: MoinMoinBugs/UnderScoreQuotingProblem (last edited 2007-10-29 19:09:16 by localhost)