Talk - MoinMoin Developer Introduction

Abstract

MoinMoin is a popular and powerful wiki engine in Python.

The talk will give an introduction to the MoinMoin core source code, extension concepts and extension development. We will use the current development branch (1.6) to do this, but most stuff also applies to 1.5 (production releases).

We will briefly present the flexibility of MoinMoin and what its user base (Python, Apache, Ubuntu, ...) looks like.

We will give an architecture overview as well as explain some code and show how you can write your own.

Introduction

MoinMoin is a wiki engine implemented in Python. We expect most of you know what a wiki is (or even how to use MoinMoin), so this is not a wiki usage introduction but a MoinMoin developer introduction.

It started as a one-man project by Jürgen Hermann, but is currently maintained by the MoinCoreTeamGroup.

We also often get contributions in the form of patches, translations and documentation from our users.

MoinMoin is used by all sorts of organizations and individuals. The user base grew significantly after the introduction of internal Unicode processing with UTF-8 as the standard encoding, and again after the introduction of the MoinMoin Desktop Edition (MMDE).

Some historical data

Date     Version  SLOC   Comment
2000-07  0.1      470    Jürgen Hermann's fork of Martin Pool's PikiPiki 1.62 with improvements
2001-05  0.9      6800
2002-05  1.0      15300  team development started
2003-11  1.1      23500
2004-02  1.2      14700  moved contrib stuff to wiki, removed some python libs we had included
2004-12  1.3.0    20100  Introduced MoinMoin DesktopEdition
2006-01  1.5.0    45000
2006-07  1.5.4    47000  latest released version
2006-07  1.6dev   47500  development branch: Lupy search replaced by Xapian, bigger refactorings

Data was generated by sloccount

Some users

There are various groups using the MoinMoin wiki. For example:

How development is done / Communication

On IRC (network Freenode), we have two channels:

For communication, documentation and support we use two wikis.

There is a mailing list for users.

We use Mercurial (master repository on http://hg.moinmo.in/ ) for version control. Have a look at the talk "Achieving High Performance In Mercurial" held by Bryan O'Sullivan on Tuesday morning.

We mostly use PEP8 coding style. The code still shows mixed styles, because PEP8 changed over time and different people were involved in development.

No, we don't use "hungarian notation". :)

Architecture Overview

MoinMoin distribution code is some (more or less modular) core code plus a set of (built-in) plugins.

Plugins

MoinMoin uses a lot of plugin types:

action
generates some full output page (e.g. diff, show or delete a page, editor, ...)
formatter
output functions for different output mimetypes (text/html, text/plain, docbook, ...)
parser
ParserName(arg1, arg2, ...) + some lines of text -> formatter -> part of a page
macro
MacroName(arg1, arg2, ...) -> formatter -> part of a paragraph or page
xmlrpc
XMLRPC plugins are used for easy remote procedure calls
filter
is used by the indexing search engine to understand various mimetypes (like OpenOffice documents)
theme
different skins for web UI

In addition to the built-in plugins, users can add more plugins to the data/plugin/{action,macro,parser,theme,...} directories, on a per-wiki basis.
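As a rough illustration of the per-wiki idea, the lookup can be pictured as two tables: one for the wiki's own plugins and one for the built-in ones. This is a minimal sketch, not MoinMoin's loader. The names find_plugin and BUILTIN_PLUGINS are hypothetical, and the rule that a per-wiki plugin wins over a built-in one of the same name is an assumption made for the example.

```python
# Hypothetical sketch (Python 3); names and layout are illustrative only.
BUILTIN_PLUGINS = {
    ("macro", "BR"): "built-in BR macro",
    ("action", "diff"): "built-in diff action",
}


def find_plugin(wiki_plugins, kind, name):
    """Return the per-wiki plugin if one exists, else the built-in one.

    Assumption: per-wiki plugins take precedence over built-in ones.
    """
    key = (kind, name)
    if key in wiki_plugins:
        return wiki_plugins[key]
    if key in BUILTIN_PLUGINS:
        return BUILTIN_PLUGINS[key]
    raise LookupError("no %s plugin named %r" % (kind, name))


# One wiki that ships an extra macro in its own plugin directory.
my_wiki_plugins = {("macro", "Hello"): "per-wiki Hello macro"}
```

With these tables, find_plugin(my_wiki_plugins, "macro", "Hello") finds the wiki's own macro, while a lookup for ("macro", "BR") falls through to the built-in table.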

Modular stuff

The following parts are currently not intended to be extended through the data/plugin/ mechanism. They are not plugins, but they are at least built in a modular way:

auth
user authentication modules (LDAP, HTTP basic auth, session cookie, PHP session)
script
since 1.5.3 there is a generic script plugin mechanism, so that we only need one moin shell command
converter
converts HTML from GUI editor back to e.g. wiki markup
request
request methods used by different server types we support (CGI, WSGI, FCGI, mod_python, Twisted, ...)
i18n
translations of the system texts to different languages (we use gettext to make *.mo from *.po files) and some i18n related tools
logfile
access the log files (writing entries, parsing entries, ...)
mail
some mail processing (notify as well as mail import)
security
security policies (antispam and autoadmin currently)
server
some code used by standalone, Twisted and WSGI servers
stats
making access statistics, text or graphics
support
some stuff we include for convenience, like fixed versions of Python stdlib modules, support for cgi tracebacks or Xapian wrappers

Core Code

Stuff like Page, PageEditor, wikiutil, wikiacl, user, etc. (most stuff located directly under the MoinMoin module) is currently considered core code, as well as some packages distributed within the distribution archive.

(!) Some of that stuff could be also done in a more modular way or as a plugin (we are working on that).

Architecture Details

request

This is where everything begins.

When Moin is invoked, the first thing it does is construct a request object representing the request it is currently processing (including handling of the API differences, bugs and features of the different supported servers).

For example, if you call moin via the moin.cgi CGI script, it will use the MoinMoin.request.CGI module to get all needed data from the standard CGI environment variables and store it in the request object's attributes. The request object also offers read() and write() functions to read (form) data from the user and write output data to the user. Usually, the form data is available via the request.form attribute after the request object has been initialized.
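To make the idea concrete, here is a minimal sketch of a request object built from CGI-style environment variables. It is not MoinMoin's request class: SketchRequest and its attribute names (apart from form and write(), which are mentioned above) are hypothetical, and it uses Python 3's urllib.parse for brevity.

```python
from urllib.parse import parse_qs


class SketchRequest:
    """Hypothetical stand-in for a request object (not MoinMoin's class)."""

    def __init__(self, environ):
        # Pull what we need out of the CGI-style environment mapping.
        self.path_info = environ.get("PATH_INFO", "/")
        self.query_string = environ.get("QUERY_STRING", "")
        # Parsed form data: each key maps to a list of values.
        self.form = parse_qs(self.query_string)
        self._output = []

    def write(self, data):
        """Collect output data that would be sent to the user."""
        self._output.append(data)

    def output(self):
        return "".join(self._output)


request = SketchRequest({
    "PATH_INFO": "/WikiSandBox",
    "QUERY_STRING": "action=diff&rev1=23&rev2=42",
})
request.write("<p>hello</p>")
```

The point of the design is that later code only talks to the request object and never needs to know which kind of server delivered the data.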

Other request classes include stuff for: FastCGI, Twisted, Standalone Server (based on BaseHTTPServer), WSGI, CLI

Page/PageEditor

These are the classes that currently represent pages (read-only / read-write) that are stored in the wiki and also have the filesystem storage code (this needs to be refactored into a separate mimetype item and storage classes, we will come back to that later).

If you need to read or write wiki page content, you should use either these classes or XMLRPC in your code. Or, you could use PackageInstaller to create/modify pages.

/!\ If you access the filesystem directly, it is likely to stop working when we change the storage layout.

user

Usually you access the current user via request.user. What you get is either an anonymous user object (when the user did not log in) or a logged-in wiki user.

Using this user object, you can get at this stuff:

wikiutil

As the name suggests, this is a collection of misc. utility functions, which are used all over the place:

action Plugins

An action is a rather flexible operation.

For example, the diff action can be called by requesting this URL:

http://moinmoin.wikiwikiweb.de/WikiSandBox?action=diff&rev1=23&rev2=42

After the basic request object setup, moin finds out quite early that a special action has been called, imports the action handler function from the action package and then just executes that function.

After that, it is completely up to that function what will happen.

In case of the diff handler, it will:

Other examples for actions: show (most used :), DeletePage, sitemap (emits XML for google sitemap).
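The dispatch step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the ACTIONS table, dispatch() and the handler signatures are hypothetical, and real moin imports the handler from the action package instead of using a dictionary. The fallback to show follows the default mentioned in this talk.

```python
from urllib.parse import urlsplit, parse_qs


def do_show(pagename, form):
    """Hypothetical handler: just report what would be shown."""
    return "showing %s" % pagename


def do_diff(pagename, form):
    """Hypothetical handler: read its own parameters from the form data."""
    return "diff of %s between rev %s and rev %s" % (
        pagename, form["rev1"][0], form["rev2"][0])


# Stand-in for "import the handler from the action package".
ACTIONS = {"show": do_show, "diff": do_diff}


def dispatch(url):
    parts = urlsplit(url)
    pagename = parts.path.lstrip("/")
    form = parse_qs(parts.query)
    # No action given means the default action, show.
    action = form.get("action", ["show"])[0]
    handler = ACTIONS[action]
    # From here on, everything is up to the handler.
    return handler(pagename, form)
```

Calling dispatch() with the diff URL shown earlier ends up in do_diff, while a plain page URL ends up in do_show.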

(!) So keep in mind, that actions:

parser / formatter Plugins

Wiki pages use wiki markup for entering content. MoinMoin stores to disk exactly what you entered in the wiki text editor form.

But a browser can only render html, so someone has to do the transformation from the input format "wiki markup" to the output format "HTML".

Starting with 1.6, we try to use mimetypes to modularize and generalize this, e.g.:

So when the show action is called (or no specific action, as show is the default), moin will look at the item you request (e.g. the WikiSandBox page), determine its mimetype (which will result in text/moin-wiki) and then load MoinMoin.parser.text_moin_wiki to parse the wiki markup.

If you are not requesting some specific output format, it will assume you want to have some HTML rendered in your browser (mimetype text/html), so it loads MoinMoin.formatter.text_html as the formatter for that.
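The mapping from a mimetype to a module name follows a visible pattern: text/moin-wiki becomes text_moin_wiki and text/html becomes text_html. A minimal sketch of that naming rule follows. The function names are hypothetical, and the fallback from the full name to the major type alone is an assumption for the example, not a description of moin's actual lookup.

```python
def module_candidates(mimetype):
    """Return module names to try for a mimetype, most specific first.

    Assumption: after the full name, only the major type is tried.
    """
    major = mimetype.split("/", 1)[0]
    full = mimetype.replace("/", "_").replace("-", "_")
    return [full, major] if full != major else [full]


def pick_module(mimetype, available):
    """Return the first candidate that is available, else None."""
    for name in module_candidates(mimetype):
        if name in available:
            return name
    return None


# Hypothetical set of parser modules present in an installation.
available_parsers = {"text_moin_wiki", "text_rst", "text"}
```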

The wiki parser will then go through the text line by line and throw a big ugly regex at it to find out what markup is used there. While parsing the input data, it will call the formatter to generate output data. The parser has handler functions for the different kinds of specific markup used, the formatter has some generic functions for the stuff usually needed (make something underlined, bold, make a paragraph, a headline, ...).

For example, if you have some text like "This is __underlined__." (renders as: "This is underlined.") on your page, this is the output of the text_html formatter (comments show what happened):

This is             # seen normal text: parser calls formatter.text
<span class="u">    # seen __ markup: parser's _u_repl calls formatter.underline and toggles self.is_u
underlined          # seen normal text: parser calls formatter.text
</span>             # seen __ markup: parser's _u_repl calls formatter.underline again and toggles self.is_u
.                   # seen normal text: parser calls formatter.text
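The interplay shown in the trace above can be reproduced with a toy parser and formatter. This is a sketch, not the moin code: the names formatter.text, formatter.underline, _u_repl and is_u are taken from the trace, while the classes, the on/off argument of underline() and the tiny regex are assumptions made for the example.

```python
import html
import re


class HtmlFormatter:
    """Toy formatter producing HTML fragments."""

    def text(self, s):
        return html.escape(s)

    def underline(self, on):
        # Assumption: a flag tells the formatter whether to open or close.
        return '<span class="u">' if on else "</span>"


class TinyParser:
    """Toy parser that only knows the __underline__ markup."""

    def __init__(self, formatter):
        self.formatter = formatter
        self.is_u = False

    def _u_repl(self):
        # Toggle the state, then ask the formatter for the matching tag.
        self.is_u = not self.is_u
        return self.formatter.underline(self.is_u)

    def format(self, line):
        out = []
        # Splitting with a capturing group keeps the markup tokens.
        for piece in re.split(r"(__)", line):
            if piece == "__":
                out.append(self._u_repl())
            elif piece:
                out.append(self.formatter.text(piece))
        return "".join(out)


rendered = TinyParser(HtmlFormatter()).format("This is __underlined__.")
```

Swapping in a different formatter class with the same two methods would give a different output format without touching the parser, which is the point of the split.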

Other parsers moin has include: text_rst, text (text/plain and fallback), text_docbook, text_python, text_csv, ... Other formatters moin has include: text_plain, text_docbook, text_python, ...

(!) While the text_python parser is a simple highlighting parser for Python source code, the text_python formatter is a rather special beast: its target format is Python code. The generated code contains request.write("...") calls for the static content as well as calls to the formatter for dynamic content like macros. Once we have generated all the Python code for a page, we byte-compile it and store it in the cache directory. The next time the same page is requested, we just load and execute the bytecode. That makes rendering wiki markup as fast as it can get while still being dynamic where needed.
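The compile-once, execute-many idea can be sketched with the standard compile(), marshal and exec() tools. This is an illustration only: generate_code(), render() and the ("static", ...) / ("dynamic", ...) part format are hypothetical, and the cache is just a bytes object here instead of a file in a cache directory.

```python
import marshal


def generate_code(parts):
    """Turn page parts into Python source.

    parts is a list of ("static", text) or ("dynamic", expression) pairs.
    """
    lines = []
    for kind, value in parts:
        if kind == "static":
            lines.append("write(%r)" % value)           # fixed output
        else:
            lines.append("write(str(%s))" % value)      # evaluated per request
    return "\n".join(lines)


def render(cached_bytecode, dynamic_names):
    """Load cached bytecode and execute it, collecting the output."""
    code = marshal.loads(cached_bytecode)
    out = []
    namespace = {"write": out.append}
    namespace.update(dynamic_names)
    exec(code, namespace)
    return "".join(out)


source = generate_code([
    ("static", "<p>Counter: "),
    ("dynamic", "counter()"),
    ("static", "</p>"),
])
# Byte-compile once; this blob is what would be stored in the cache.
cached = marshal.dumps(compile(source, "<page>", "exec"))
```

Each call to render(cached, {"counter": ...}) skips parsing entirely but still evaluates the dynamic part afresh.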

(!) So keep in mind:

macro Plugins

Often you do not want to hack the parsers just to get a tiny bit of additional functionality into your pages. This is why we have macros that can be embedded into wiki pages: they get a (usually small) piece of text as parameters and process it (usually calling the formatter to output stuff).

The usual syntax is [[MacroName(arg1, arg2, ...)]].

A trivial macro is [[BR]]: it just calls the formatter to format a line break (the text_html formatter returns <BR>) and then returns the result:

   # BR.py
   def execute(macro, args):
       return macro.formatter.linebreak(0)

Of course there are also more complex macros, like RecentChanges, WantedPages or MonthCalendar.
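A macro that actually uses its arguments is only slightly longer than BR. The following is a sketch of a hypothetical Hello macro. The execute(macro, args) entry point and macro.formatter come from the BR example above. That args arrives as a single string (or None when no arguments are given) is an assumption, and StubFormatter and StubMacro exist only so the example runs outside a wiki.

```python
import html


def execute(macro, args):
    """Hypothetical Hello macro: greets whatever name it is given."""
    # Assumption: args is the raw argument string, or None if absent.
    name = (args or "world").strip()
    return macro.formatter.text("Hello, %s!" % name)


class StubFormatter:
    """Stand-in for a real formatter so the macro can run on its own."""

    def text(self, s):
        return html.escape(s)


class StubMacro:
    """Stand-in for the macro object moin would pass in."""

    def __init__(self):
        self.formatter = StubFormatter()


greeting = execute(StubMacro(), "Moin")
```

Because the macro only talks to the formatter, it never has to know which output format is being produced.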

(!) So keep in mind that macros:

theme Plugins

Moin supports pluggable themes to let you customize how it looks. A theme is made from:

(!) Please note that if you want to write your own theme (doing more than just using modern with different colours and icons), this will be a lot of work (initially and also for maintenance over the years) and you have to test it with different browsers (and try to work around some browsers that really suck), check whether left-to-right as well as right-to-left languages are usable with it, check if usability is still there, even on small screens and much more.

Not every theme that looks really cool is also a theme with good usability/compatibility.

xmlrpc Plugins

Moin implements the wiki XMLRPC APIs v1 (the official stuff, but slightly sucks) and v2 (a slightly different, slightly less official interpretation of the standard).

XMLRPC sounds a bit complicated when you have never seen it working, but it is not. All the complicated stuff (XML and RPC) is done for you by MoinMoin (and Python's xmlrpclib). You just have to write some code and you can call it remotely. Keep security in mind. :)

An example for an XMLRPC application is the automatic distribution of the BadContent page that keeps the anti-spam patterns. There is also a plugin to generate group definition pages by a script. Wikisync will use XMLRPC as well.
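To show how little there is to it, here is a sketch that pushes one call through the XML-RPC wire format without a network connection. It uses Python 3's xmlrpc.client (the successor of xmlrpclib). The page_length method and the HANDLERS table are hypothetical and are not part of the wiki XMLRPC API.

```python
import xmlrpc.client


def page_length(pagename):
    """Hypothetical remote method: length of a page's text."""
    pages = {"WikiSandBox": "Play here."}
    return len(pages.get(pagename, ""))


# Stand-in for the server-side lookup of the method by name.
HANDLERS = {"page_length": page_length}

# Client side: the call is turned into XML.
wire_request = xmlrpc.client.dumps(("WikiSandBox",), methodname="page_length")

# Server side: the XML is decoded, the method is looked up and called.
params, method = xmlrpc.client.loads(wire_request)
result = HANDLERS[method](*params)

# The return value travels back as XML as well.
wire_response = xmlrpc.client.dumps((result,), methodresponse=True)
```

Everything between dumps() and loads() is plain text, which is also why the security reminder above matters: anything in that text comes from the caller.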

filter Plugins

Filters are simple plugins that help the moin search indexer to index documents. A filter simply gets a filename and has to return the file content as a unicode object (it doesn't need to be pretty, as long as the potential search terms are there).
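The contract described above (filename in, text out) can be sketched as follows. The function name extract_text and its signature are hypothetical and do not show the real filter entry point. The sketch is written for Python 3, where the returned text is a str.

```python
import os
import tempfile


def extract_text(filename):
    """Hypothetical plain-text filter: return the file content as text.

    Undecodable bytes are replaced instead of raising, since the result
    only has to contain the potential search terms.
    """
    with open(filename, "rb") as f:
        raw = f.read()
    return raw.decode("utf-8", errors="replace")


# Small demonstration with a temporary file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write("Grüße aus dem Wiki".encode("utf-8"))
indexed_text = extract_text(path)
os.remove(path)
```

A filter for a binary format would do the same thing, only with a format-specific extraction step in place of the decode() call.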

We currently have filters for:

auth Modules

You can use cfg.auth to configure a list of auth modules. Moin will use that list to call one auth method after the other.
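Calling one auth method after the other can be pictured as a simple loop over the configured list. This is a sketch under loud assumptions: the (credentials, user) input, the (user, keep_going) return value and all function names are hypothetical and only illustrate the chaining, not the real module interface.

```python
def session_auth(credentials, user):
    """Hypothetical method: accept a known session id."""
    if credentials.get("session") == "valid-session-id":
        return "ExampleUser", False   # authenticated, stop the chain
    return user, True                 # undecided, let the next one try


def form_auth(credentials, user):
    """Hypothetical method: accept a name/password pair."""
    if (credentials.get("name") == "ExampleUser"
            and credentials.get("password") == "secret"):
        return "ExampleUser", False
    return user, True


def authenticate(auth_list, credentials):
    """Call each auth method in order until one says to stop."""
    user = None  # None stands for the anonymous user here
    for method in auth_list:
        user, keep_going = method(credentials, user)
        if not keep_going:
            break
    return user


auth_list = [session_auth, form_auth]
```

The order of the list matters: earlier methods get the first chance to decide, and later ones are only consulted if nothing stopped the chain.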

An auth module gets this as input:

And returns:

converter Modules

Currently there is only one converter module: text_html_text_moin_wiki - we use this together with FCKeditor (the nice GUI editor we use).

This is how it works:

Using this method we could also process other markup, but the first goal is to improve the current converter (it still has bugs and limitations).

A walk through the code

... Show some macros and action ... Write some easy macro or action? e.g. we write some color macro to make coloured text ...

/!\ Keep this short.

Future

Moin2 / Google Summer of Code

We are currently mentoring three Google Summer of Code projects, which hopefully will result in some interesting code contributions. While doing SOC (and using code from it), we are working towards MoinMoin 2.0 - in small steps (and 1.6 will be some of those steps).

Here are some of the expected improvements:

Refactoring

We will get a more OO-like design of the core.

We will try to empower moin by exploiting inheritance and generalization and at the same time make it cleaner and simpler. We hope to do more with less code.

We will not have pages and attachments any more, but a hierarchy of mimetype item classes (one mimetype will be text/x-moin-wiki, which will replace what a page is now; a sub-item of mimetype image/jpeg is what a JPEG attachment is now). In that way, attachments will get revisioning for free and pages will get up/download for free.

Each item will have meta data (mimetype, language, etc. is stored there). Rendering (and other actions) will depend on the mimetype.
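One way to picture such a hierarchy is a small base class that owns metadata and revisions, with one subclass per mimetype. This is purely a sketch of the idea, not the planned design: every class, attribute and method name below is hypothetical.

```python
class Item:
    """Hypothetical base item: metadata, revisions and sub-items."""

    mimetype = "application/octet-stream"

    def __init__(self, name, language=None):
        self.name = name
        self.meta = {"mimetype": self.mimetype, "language": language}
        self.revisions = []   # every item is revisioned, not only pages
        self.children = {}    # sub-items, e.g. former attachments

    def save(self, data):
        self.revisions.append(data)
        return len(self.revisions)  # new revision number

    def render(self):
        # Generic fallback: anything can at least be downloaded.
        return "download link for %s" % self.name


class WikiTextItem(Item):
    mimetype = "text/x-moin-wiki"

    def render(self):
        return "<p>%s</p>" % self.revisions[-1]


class JpegItem(Item):
    mimetype = "image/jpeg"

    def render(self):
        return '<img src="%s">' % self.name


page = WikiTextItem("WikiSandBox", language="en")
page.save("Hello")
page.save("Hello again")
photo = JpegItem("photo.jpg")
photo.save(b"...jpeg bytes...")
page.children[photo.name] = photo
```

Revisioning lives in the base class, so the former attachment gets it without any extra code, and rendering is the only thing each mimetype class has to provide.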

The Xapian search engine is already integrated in the 1.6 branch. It will index all mimetype items supported by filters. We will have an improved search UI exploiting Xapian's features. It will replace the old Lupy approach. This is thanks to the work currently being done by Franz Pletz (SOC).

Storage

We will have a storage backend API with backend plugins.

One plugin will support old storage (1.3/1.5 style) - mostly for converting your existing data - and a new backend will support new MoinMoin 2 features.

Also, if somebody wishes to implement some Mercurial/SVN/SQL/etc. storage, they can do so by implementing it as a storage plugin.
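What a backend API with pluggable implementations might look like can be sketched with an abstract base class and one toy backend. None of this is the planned API: StorageBackend, MemoryBackend and the read/write signatures are hypothetical and only illustrate the plugin idea.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical backend interface that every storage plugin fulfils."""

    @abstractmethod
    def write(self, name, data):
        """Store a new revision and return its number (starting at 1)."""

    @abstractmethod
    def read(self, name, rev=None):
        """Return the given revision, or the latest one if rev is None."""


class MemoryBackend(StorageBackend):
    """Toy backend keeping everything in a dictionary."""

    def __init__(self):
        self._items = {}

    def write(self, name, data):
        revisions = self._items.setdefault(name, [])
        revisions.append(data)
        return len(revisions)

    def read(self, name, rev=None):
        revisions = self._items[name]
        return revisions[-1] if rev is None else revisions[rev - 1]


backend = MemoryBackend()
backend.write("WikiSandBox", "first version")
backend.write("WikiSandBox", "second version")
```

A Mercurial, SVN or SQL backend would implement the same two methods, and the rest of the wiki would not need to change.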

Thanks to Alex Adranghi.

Wiki Synchronisation

MoinMoin 1.6 will offer a way to synchronise different wiki sites. Changes will be distributed automatically, and concurrently changed pages will be merged. There are several use cases:

Your contribution

If you like MoinMoin, you can help us by contributing code, bug fixes, translations and documentation - sometimes even an idea with a good plan is quite valuable.

There is a long todo list and lots of ideas for MoinMoin on the wiki. When the MoinMoin 2 core starts to work, that todo list will get even longer due to the new possibilities.

To get started with MoinMoin, the easiest thing to do is to write some macro, simple parser or action. While doing that, you can dig deeper (there is lots of sample source code under the MoinMoin module directory) and have more fun. :)

For bigger work on core code, please stay in contact with the core team and use the MoinMoin site for collaboration. You can use Mercurial to clone our repository at http://hg.moinmo.in/ - regular code contributors will also get write access there.

We need your help! :)

MoinMoin: EuroPython2006/CoderTalk (last edited 2007-12-15 03:40:57 by ThomasWaldmann)