Unify parsers and macros to simplify both the user interface and the code

See also UnifyMarkup

Unify Parsers and Macros

After having finished with UnifyParsersAndProcessors, it's time to take a look at parsers and macros.

A macro is an extension plugin that can be called from wiki markup to produce some output depending on the arguments given to it. In contrast, a parser is an extension plugin that produces some output depending on some parameters and a block of lines given to it.

Conclusion: A macro is a parser that gets an empty block of lines (or _lines=None).

So if we could unify parsers and macros, we would get a nice class-based API for both types of extension plugins: Macro and Parser.

We are dealing with two different problems: the code handling plugins and the wiki markup. Unifying the code that handles macros, parsers and plugins in general is good, but it is not related to the markup. Macro and parser are not the same: a parser has content.

We can use one namespace for macros and parsers, e.g. PythonParser, CPPParser, FullSearch. This makes sense, as all of them are page extensions doing the same thing: "process some text and output some page content". OTOH, actions, formatters and themes are wiki engine extensions.

The name of the thing

The ongoing GoogleSoc2008 work of BastianBlank implements a MoinMoin.converter2 package. It is called converter2 because we already have a converter package for the GUI editor that follows a different API and goal. Nevertheless, when we speak of a converter below, we mean converter2.

Discussion

What will the converter markup look like?

Current 1.6/1.7 markup

<<ExtensionName(arguments)>>   # current 1.6/1.7 macro syntax,
                               # arguments may be processed by argument parser magic

{{{#!ExtensionName arguments   # current 1.6/1.7 parser syntax,
line                           # not much argument processing
line
line
}}}

Note: includes some magic so that more than 3 { also work (for nesting).

We could just keep that markup, parse it and map it to a converter call:

Macro:  converter(arguments, _lines=None)
Parser: converter(arguments, _lines=[line, line, ...])

We should process both argument lists with the same argument parser.
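The mapping above could be sketched roughly like this. It is a standalone illustration, not MoinMoin code: the function name to_converter_call and both regular expressions are assumptions, and the nesting magic for more than 3 { is ignored.

```python
import re

# Hypothetical patterns for the two 1.6/1.7 call styles (no nesting support).
MACRO_RE = re.compile(r'<<(\w+)(?:\((.*?)\))?>>')
PARSER_RE = re.compile(r'\{\{\{#!(\w+)[ \t]*([^\n]*)\n(.*?)\}\}\}', re.DOTALL)


def to_converter_call(markup):
    """Return (name, arguments, _lines) for a macro or parser call.

    _lines is None for a macro and a list of lines for a parser.
    """
    match = MACRO_RE.fullmatch(markup)
    if match:
        return match.group(1), match.group(2) or '', None
    match = PARSER_RE.fullmatch(markup)
    if match:
        name, arguments, body = match.groups()
        return name, arguments.strip(), body.splitlines()
    raise ValueError('not a macro or parser call: %r' % markup)
```

Both branches hand back the raw argument string untouched, so a single argument parser can process it afterwards, whichever markup it came from.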

PluginBase Class

class PluginBase:

    # file extensions this plugin may apply to...
    _extensions = None
    # mimetypes this extension may apply to...
    _mimetypes = None
    # they are either None (== no parser like extension)
    # or a list of strings (extensions or mimetype globs)

    # what dependencies (caching) does this plugin have?
    _dependencies = ["time"]

    # how to parse the arguments for this plugin?
    _parameters = None

    def __init__(self, _request, _lines, **kw):
        # underscore, so we won't collide with kwargs
        self._request = _request
        self._lines = _lines
        # just save the kwargs as self attributes
        for key in kw:
            setattr(self, key, kw[key])

    def help(self):
        ''' Describe what your plugin can do.

        The base class can have code to produce automatic help from
        available instance attributes.
        '''

Plugin Protocol

Instead of defining calls that raise, or return nothing, we can define an informal protocol. Plugins that want to answer these calls from the parser will implement them. We could do the same by putting these methods in the base class, but that makes things harder when the base class returns something that you don't want. If we define raise NotImplementedError for each such method, the poor developer who subclasses this will have to override ALL the methods she does not care about, just to remove that raise.

In the case of help(), it is useful if the base class can produce some automatic help for you, based on your methods' docstrings and names, so it's better to have this in a base class. The same goes for automatic argument parsing code etc.
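A minimal sketch of such automatic help, assuming the help text is simply built from the class docstring and the first docstring line of each public method. The class names HelpfulBase and Hello are made up for this illustration.

```python
import inspect


class HelpfulBase(object):
    """Hypothetical base class that derives help text from its subclasses."""

    def help(self):
        """Return the class docstring plus one line per public method."""
        lines = [inspect.getdoc(type(self)) or type(self).__name__]
        for name, method in inspect.getmembers(self, inspect.ismethod):
            if name.startswith('_') or name == 'help':
                continue
            doc = inspect.getdoc(method) or 'undocumented'
            lines.append('%s: %s' % (name, doc.splitlines()[0]))
        return '\n'.join(lines)


class Hello(HelpfulBase):
    """Say hello in a configurable colour."""

    def executeInParagraph(self, formatter):
        """Render the greeting inline."""
        return u'Hello'
```

A plugin author gets help() for free this way and only has to write docstrings.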

By using a protocol, any object can be a plugin. The parser and the plugins just have to use the same protocol; the parser doesn't care what the object's type is, just which protocol methods it implements.

""" Draft of an informal protocol for plugins.

When you write your plugin, copy and paste the method
prototypes from here.

This protocol might be extended in future versions. You don't have
to implement any of these methods - just what makes sense for your
plugin.
"""

def executeInParagraph(self, formatter):
    """ What the plugin should do in a paragraph context

    Example:
        This is text [[ThisMacro]] more text

    In this context, the parser will call you with
    executeInParagraph(). If you don't implement this, the
    parser will ignore you. You can return an error message
    if it does not make sense to put the macro in this
    context.

    @rtype: unicode
    @return: formatted content
    """

def executeInPage(self, formatter):
    """ What the plugin should do in a page context

    Example:
        This is a wiki paragraph

        [[ThisMacro]]

        Another paragraph...

    In this context, the parser will call you with
    executeInPage(). If you don't implement this, the
    parser will ignore you. You can return an error message
    if it does not make sense to put the macro in this
    context.

    @rtype: unicode
    @return: formatted content
    """

How can the parser use such an object?

def _repl_plugin(self, text):
    # Get plugin instance (using oliver/fabi parser and import plugin)
    plugin = self._getPlugin(text)
    # assumption: the formatter tracks whether a paragraph is open as in_p
    if self.formatter.in_p:
        execute = getattr(plugin, 'executeInParagraph', None)
    else:
        execute = getattr(plugin, 'executeInPage', None)

    if execute:
        # The parser does not have to open a paragraph or div or
        # whatever for a plugin. The plugin writer is responsible for
        # the correct html element in either a page context or a
        # paragraph context.
        output = execute(self.formatter)
        self.request.write(output)
    else:
        # Maybe error message, or plugin markup in red, or just ignore
        pass
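To show that any object can take part in the protocol, here is a small self-contained sketch that runs without MoinMoin. PlainFormatter, InlineOnly and run_plugin are invented names; the in_p flag on the formatter is an assumption about how the paragraph context would be exposed.

```python
class PlainFormatter(object):
    """Stand-in for a formatter; only the in_p flag matters here."""

    def __init__(self, in_p):
        self.in_p = in_p


class InlineOnly(object):
    """A plugin that only makes sense inside a paragraph."""

    def executeInParagraph(self, formatter):
        return u'<span>inline</span>'


def run_plugin(plugin, formatter):
    """Call the protocol method that fits the context, if the plugin has it."""
    name = 'executeInParagraph' if formatter.in_p else 'executeInPage'
    execute = getattr(plugin, name, None)
    if execute is None:
        return u''  # plugin does not take part in this context: ignore it
    return execute(formatter)
```

InlineOnly inherits from nothing plugin-specific, yet the dispatcher can use it. In a page context it is silently ignored, which is one of the options mentioned above.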

markup conversion

We should have some markup conversion rules ready for a migration script:

old syntax (parser, processor or macro)
----
new syntax (PPM)

{{{ plaintext }}}
----
`plaintext`    # hmm, does this line show we need 2 quoting methods anyway?
                     # needs backtick_meta = 1 hardcoded

{{{
... pre section ...
}}}
----
{{{       # no specification = default to Parser(text/plain, addpre=1)
... pre section ...
}}}

{{{#!python
... py src
}}}
----
{{{#!python        # or PythonParser !? - shortcut for Parser(python) / Parser(text/x-python)
... py src ...
}}}

<<BR>>
----
<<BR>>    # implicitly this is BR()

<<OtherMacro(arg1,arg2)>>
----
<<OtherMacro(arg1,arg2)>>

Syntactic sugar

Some parsers/processors take their own arguments, for example bibtex in ParserMarket and SimpleTable in ProcessorMarket.

{{{#!SimpleTable 4
... contents ...
}}}
----
{{{SimpleTable 4
... contents ...
}}}

Or we can allow another syntax for arguments.

{{{SimpleTable(4)
... contents ...
}}}

We can extend this idea to macros.

<<OtherMacro(arg1,arg2)>>
----
<<OtherMacro arg1 arg2>>

or

<<OtherMacro(arg1,arg2)>>

And for arguments, we can design an escape mechanism.

{{{Macro1(arg1,second arg with space)}}}
{{{Macro2 arg1 second,arg,with,comma}}}
{{{Macro3("arg 1", "arg 2, with comma")}}}
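One possible escape mechanism for the comma separated forms is plain double quoting, as in the Macro3 line. The following sketch leans on Python's csv module for that; split_arguments is a hypothetical helper, not an existing MoinMoin function.

```python
import csv


def split_arguments(text):
    """Split a comma separated argument string, honouring double quotes."""
    if not text.strip():
        return []
    # skipinitialspace lets a quote that follows ", " still count as a quote
    row = next(csv.reader([text], skipinitialspace=True))
    return [item.strip() for item in row]
```

Unquoted arguments may still contain spaces; only a comma inside an argument needs the quotes.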

If we do quoted argument lists, then let's use SGML-like attributes, which are easier to use than positional arguments, better known than Python conventions, and will allow us to merge CSS attributes naturally, e.g. let us set a CSS class on a wiki element.

{{{TableOfContents level="2" class="sidebar"}}}
{{{ParserName line-numbers="on" class="folding"
Text...
}}}
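Parsing such a header line is cheap. A rough sketch, assuming every value is double quoted and contains no double quote itself (parse_call and ATTR_RE are invented names):

```python
import re

# name="value" pairs; names may contain '-' as in line-numbers
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_call(header):
    """Split 'Name key="value" ...' into the name and a dict of attributes."""
    name, _, rest = header.strip().partition(' ')
    attributes = dict(ATTR_RE.findall(rest))
    return name, attributes
```

The class attribute would then not go to the plugin at all; it could be applied to the surrounding element by the parser.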

{{ }} for "include"?

While looking for new ideas on another wiki, I found that they use {{image.png}} for showing ("including") images.

I thought that {{ItemName or URL}} could in general mean "include the content from that place here", while [[ItemName or URL]] means "link to that content".

That conflicts with the usage of {{text}} for plain / pre / verbatim text (see section above), but maybe we could just make "include" processing the default and choose something more verbose for verbatim.

On first look it also isn't quite intuitive why "include" is similar to PPM (or why it is even the default of it), but on the second look one notices, that this just means "include the output of that PPM here" or "include the output we get when doing default processing on the mimetype we find there", so this is not that far-fetched.

It would further be easy to think of {{ as taking single-character "parameters", so that {{{, possibly used for extensions, simply means to the human wikizen "an included thing, where code did some of the work". Other {{ or [[ thingies are easily imagined then, e.g. [[*icon* link]] to mark the link as special somehow, {{| big box around my inclusion |}}, {{[lang] run parser on my attachment for display }} ... OK, so these are not consistent about where the "end mark" of the inner parameter is, but the human sees the usability much better than with the cumbersome {{Include(pagename)}} or {{Parser(language,linkname)}}. Why should wikizens need to know what we call these things? They should be able to concentrate on the effects.

The markup must be clear and handle all the hard cases, like page or file names with spaces. Therefore, {{:pagename label}} does not work. {{ThisName}} is also not clear; does it run ThisName macro, or include ThisName page content, or include ThisName image?

All markup can use the same Smalltalk-like syntax:

{{object:value variable:value}}

Which happens to be like our ACL syntax:

#acl: Name:rightlist AnotherName:rightlist

And like our search syntax:

title:This regex:That AnotherTerm

Here, object can be http, page, image or plugin, each of which is a renderer that renders the contents of the resource in the page. All of them can be implemented as macros, which can be overridden by private wiki macros. So if someone wants to add custom icons for http links, they can simply use their own dancing and singing http renderer.

Here is how common markup would look:

We can use one namespace as the default, maybe the pages namespace, as it's the most used namespace in a wiki. Variable names can be omitted to save typing: [[:Page Name :Different Title]]. This syntax already exists: [:Page Name :Different Title] -> Different Title.
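To check that this syntax copes with names containing spaces, here is a rough sketch of a parser for the inner part of {{...}} or [[...]]. It assumes a key is a (possibly empty) word directly followed by a colon, at the start or after whitespace; parse_pairs and the example inputs in the usage note are invented for this illustration.

```python
import re

# A key is an optional word followed by ':' at the start or after whitespace.
KEY_RE = re.compile(r'(?:^|\s)(\w*):')


def parse_pairs(inner):
    """Parse 'key:value key:value' where values may contain spaces.

    Omitted keys come back as '' so the caller can apply its defaults.
    """
    matches = list(KEY_RE.finditer(inner))
    pairs = []
    for index, match in enumerate(matches):
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(inner)
        pairs.append((match.group(1), inner[match.end():end].strip()))
    return pairs
```

For example, parse_pairs(':Page Name :Different Title') gives two pairs with empty keys, and parse_pairs('page:Front Page title:Start here') keeps both two-word values intact. A value that itself contains a word followed by a colon after whitespace would be split wrongly, so a real implementation would still need some quoting rule.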

If an attached image is stored just like a regular page, for example as SomePage/ImageName.png, then including an image is just {{SomePage/ImageName.png}}, and a link to that image is [[SomePage/ImageName.png]].

This will simplify the parser, as there will be no more magic words to parse. There will only be text, ''text style'', = heading =,  * list,  #. ordered list, [[link]] and {{include}}. Both include and link will use the same parser for the content.

I do not like it because it feels kinda bloated, e.g. {{plugin:BR}}. It looks more like a parser simplification than a syntax simplification, because the long prefixed macro/parser calls are not very intuitive.

The name PPM describes the techniques used to write the code, but normally I use the words extension, plugin or module when talking to someone who wants to use the wiki and its PPMs. If the code is unified we should, in my eyes, use a common name like Firefox and others do. Just call it extension -- ReimarBauer 2005-11-19 07:25:54


Just another idea for the syntax to invoke a parser or macro call.

{[ImageLink(arguments)]}

or

{[python(arguments)
text
]}

Then it is pretty clear that it is a new syntax, and for some time the old syntax could still be used for compatibility. In our case we may not be able to update all wiki servers at the same time. -- ReimarBauer 2006-09-05 10:16:33

How both types can handle arguments in practice

macro with arguments

Since 1.7, macros can be written using the argument parser, e.g.

# HelloMacro
def macro_HelloWorld(macro, color=(u'red', u'blue')):
    return macro.request.formatter.text(
                         'Hello World',
                         style="color:%s" % color)
Example

<<HelloWorld(color=red)>>
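The tuple default in the signature is what drives the argument parser here. As far as I understand it, a tuple lists the allowed values and the first entry is the default; treat that as an assumption. The following standalone sketch mimics only that one behaviour and is not the real wikiutil.invoke_extension_function; invoke_with_choices and hello_settings are invented names.

```python
import inspect


def invoke_with_choices(function, argument_string):
    """Call function with key=value arguments checked against its defaults.

    A tuple default lists the allowed values; the first one is the default.
    """
    given = {}
    for item in argument_string.split(','):
        if item.strip():
            key, _, value = item.partition('=')
            given[key.strip()] = value.strip()
    kwargs = {}
    for name, param in inspect.signature(function).parameters.items():
        choices = param.default
        if not isinstance(choices, tuple):
            continue  # this sketch only handles choice arguments
        value = given.pop(name, choices[0])
        if value not in choices:
            raise ValueError('%s must be one of %s' % (name, ', '.join(choices)))
        kwargs[name] = value
    if given:
        raise ValueError('unknown argument: %s' % ', '.join(sorted(given)))
    return function(**kwargs)


def hello_settings(color=('red', 'blue')):
    return {'color': color}
```

Both a wrong value and an unknown argument name end up as a ValueError, which the caller can turn into an error message on the page.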

parser with arguments

Some parsers have parameters, and they can also benefit from the argument parser. The same example as a parser:

# HelloParser example code
from MoinMoin import wikiutil

parser_name = __name__.split('.')[-1]

def macro_settings(color=(u'red', u'blue')):
    return locals()

class Parser:
    """ parser """
    extensions = '*.xyz'
    def __init__(self, raw, request, **kw):
        self.pagename = request.page.page_name
        self.raw = raw
        self.request = request
        self.form = None
        self._ = request.getText

        args = kw.get('format_args', '')
        self.init_settings = False
        # we use a macro definition to initialize the default init parameters
        # if a user enters a wrong parameter the failure is shown by the exception
        try:
            settings = wikiutil.invoke_extension_function(request, macro_settings, args)
            for key, value in settings.items():
                setattr(self, key, value)
            # saves the state of valid input
            self.init_settings = True
        except ValueError, err:
            msg = u"Parser: %s" % err.args[0]
            request.write(wikiutil.escape(msg))

    def render(self, formatter):
        """ renders """
        return self.request.formatter.text("HelloWorld",
                         style="color:%s" % self.color)

    def format(self, formatter):
        """ parser output """
        # checks if initializing of all attributes in __init__ was done
        if self.init_settings:
            self.request.write(self.render(formatter))
Example

{{{#!HelloParser color=red
}}}

The parser uses invoke_extension_function(request, macro_settings, args) to verify arguments. With the current definition of a parser, we cannot directly exit during initialization if someone enters an unknown parameter or a wrong value. Therefore we use the status variable self.init_settings in format.

How to call a parser as macro

Unlike a parser, a macro can be called in the text flow, e.g. in a table.

Parsers which are known to have a macro as wrapper too:

This is a helper macro that makes it easy to use one macro as a wrapper around parsers that sometimes need to be called in the text flow.

# -*- coding: iso-8859-1 -*-
"""
    MoinMoin - macro Parser

    This macro is used to call Parsers,
    it is just a thin wrapper around it.

    @copyright: 200X by MoinMoin:JohannesBerg
                2008 by MoinMoin:ReimarBauer
    @license: GNU GPL, see COPYING for details.
"""
from MoinMoin import wikiutil

def macro_Parser(macro, parser_name=u'', div_class=u'nonexistent', _kwargs=unicode):
    if not _kwargs:
        _kwargs = {}

    # turn the keyword arguments back into a "key=value" list for the parser
    args = ["%s=%s" % (key, value) for key, value in _kwargs.items()]

    try:
        Parser = wikiutil.importPlugin(macro.request.cfg, 'parser', parser_name, 'Parser')
    except wikiutil.PluginMissingError:
        return macro.formatter.text('Please install "%(parser_name)s"!' % {"parser_name": parser_name})

    the_wanted_parser = Parser("", macro.request, format_args=','.join(args))
    if the_wanted_parser.init_settings:
        return '<div class="%(div_class)s"> %(result)s </div>' % {"div_class": div_class,
                                              "result": the_wanted_parser.render(macro.formatter)}
Example

<<Parser(HelloParser, color=blue)>>

-- ReimarBauer 2008-06-28 20:42:12

Unifying macros, parsers, page format processing instructions

Just to document some more ideas of BastianBlank and me. -- ThomasWaldmann 2009-03-04 23:44:36

Macro:
{{{#!macroname params}}}

Parser (inline):
{{{#!parsername params|content}}}

Parser (block):
{{{#!parsername params
content
}}}

Full page:
----------------- start of page (implicitly working like {{{ does)
#!parsername params
content
----------------- end of page (implicitly working like }}} does)

As you see, it is all about the same thing!
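Since the four forms differ only in where the content sits, one small function could read all of them. A sketch under the assumption that the name is the first word after #! and that | separates params from inline content; parse_unified is an invented name, and nested braces are not handled.

```python
def parse_unified(markup, full_page=False):
    """Return (name, params, content) for any of the four forms.

    content is None for a macro call and a string otherwise.
    """
    if not full_page:
        if not (markup.startswith('{{{') and markup.endswith('}}}')):
            raise ValueError('expected {{{ ... }}}')
        markup = markup[3:-3]
    if not markup.startswith('#!'):
        raise ValueError('expected #! at the start')
    header, newline, body = markup[2:].partition('\n')
    if newline:
        # block or full page form: everything after the first line is content
        content = body[:-1] if body.endswith('\n') else body
    else:
        # macro or inline parser form: content follows an optional '|'
        header, bar, inline = header.partition('|')
        content = inline if bar else None
    name, _, params = header.strip().partition(' ')
    return name, params.strip(), content
```

The full page case is the block case without the braces, which is why a single flag is enough to cover it.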

Modifications:

Even simpler:

Because you often call a parser/macro several times on a page, you usually don't want to set the same parameters every time, e.g. columns=0,show_text=False for arnica. I think we should have per-page params which can be overridden by parser arguments. Parsers/macros should check the page for pragma definitions of their parameters and use those values as defaults, e.g.

#pragma columns 0
#pragma show_text False

This also means refactoring and migrating pragma definitions that may be duplicated or written differently.
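The intended precedence (built-in defaults, then page pragmas, then call arguments) could look like the sketch below. read_pragmas and effective_settings are invented names, values stay plain strings, and the assumption is that pragma lines sit in the block of # lines at the top of the page.

```python
def read_pragmas(page_text):
    """Collect '#pragma name value' lines from the top of a page."""
    pragmas = {}
    for line in page_text.splitlines():
        if not line.startswith('#'):
            break  # assumed: processing instructions end at the first content line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == '#pragma':
            pragmas[parts[1]] = parts[2]
    return pragmas


def effective_settings(defaults, pragmas, arguments):
    """Merge settings: built-in defaults < page pragmas < call arguments."""
    settings = dict(defaults)
    # other extensions may define pragmas too, so unknown ones are skipped
    settings.update((key, value) for key, value in pragmas.items() if key in defaults)
    settings.update(arguments)
    return settings
```

With the two pragma lines above on a page, a call that passes only columns=2 would get columns from the call and show_text from the page.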

MoinMoin: UnifyParsersAndMacros (last edited 2009-03-30 20:53:06 by ReimarBauer)