Word2Moin

About

This is a Visual Basic script that converts Microsoft Word 2000 documents to MoinMoin markup. It is based on swythan's WordToWiki macro for TikiWiki (http://tikiwiki.org/tiki-index.php?page=WordToWiki_swythan).

I (JohnWhitlock) created this code because I had a dozen Word documents, with tables, lists, headings, etc., that I wanted to convert to Moin pages for our intranet Wiki. Manual conversion took about 8 hours for a complex, 50-page document. With this script, it took about 2 hours, mostly to fix tables and lists and to extract images. Now I'm too busy doing my job to make the script any better, but I hope it is useful for someone else in its current ugly form.

Installation

  1. Download the macro (see below).
  2. Start Microsoft Word.
  3. Open the Visual Basic Editor (Tools > Macro > Visual Basic Editor).
  4. Click the Normal heading, so that the macro is installed and available for all Word documents.
  5. Import the macro into the Normal template (Visual Basic Editor: File > Import File...).

How To Use

  1. Open the document you want to convert.
  2. Within Word, from the 'Tools' menu, select Macro > Macros...
  3. Select Word2Moin, then click 'Run'.

  4. The document will be converted in place, and copied to the clipboard.
  5. Paste the results in the wiki editor window (Ctrl-V).
  6. If you have many images in your Word doc, see the section below for tips on exporting them.

Exporting images from Word Docs

If you have documentation in Word that has many images or diagrams, a faster way to re-size and convert all your images at once (to .png, .gif, or .jpg files) is to leverage Word's "Save as Web Page..." capability.

  1. Open the document you want to convert (you kept a backup, right?!).
  2. From the 'File' menu, select "Save as web page..."
    • /!\ Important: In the save dialog, set "Save as type" to "Web page (*.htm; *.html)". The default, *.mht, is a single file from which you can't pull the images.

  3. Exit Word and go to the location where you saved the file. You will see an example-document.html file and a folder named example-document_files (where example-document is the name of your document). You can delete example-document.html; the images you need are in the folder.

  4. Upload the graphic files to your wiki page, and use the attachment: tag to place them on your wiki page.
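The last step can be partly scripted. The following is a minimal Python sketch, not part of the macro, that lists the image files in the exported folder and prints one attachment line per image to paste into the wiki editor. The function name `attachment_lines` and its `moin_16_or_newer` flag are hypothetical, and the markup forms are assumptions: `{{attachment:name}}` for Moin 1.6 and newer, bare `attachment:name` for 1.5 and prior. Check them against your own wiki before relying on them.

```python
from pathlib import Path

# Image types mentioned in this section (.png, .gif, .jpg); .jpeg added as an assumption.
IMAGE_SUFFIXES = {".png", ".gif", ".jpg", ".jpeg"}


def attachment_lines(files_dir, moin_16_or_newer=True):
    """Return one attachment markup line per image in files_dir, sorted by name.

    Hypothetical helper: the markup forms below are assumptions about
    MoinMoin syntax, not something the Word2Moin macro produces.
    """
    names = sorted(
        p.name
        for p in Path(files_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if moin_16_or_newer:
        return ["{{attachment:%s}}" % name for name in names]
    return ["attachment:%s" % name for name in names]


if __name__ == "__main__":
    import sys

    # Usage: python attachment_lines.py example-document_files
    if len(sys.argv) > 1:
        print("\n".join(attachment_lines(sys.argv[1])))
```

The script only generates markup; the image files still have to be uploaded to the page as attachments.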

(!) Tip: To enable exporting to PNG:

Downloads

Download            Moin Version
Word2MoinV21.bas    1.6 and newer
Word2MoinV2.bas     1.5 and prior
WordToMoin.bas      1.5 and prior

What Works

What Doesn't Work

Issues/Workarounds

Comment Lines Between Table Rows

Version 2 generates comment lines (lines beginning with ##) between table rows to make tables visually easier to edit in text mode. However, some Wiki flavors may have issues with these comment lines. If you experience problems with macro-generated table markup like this:

## #######################################
||<v>column 1||<v>column 2||<v>column 3 ||
## #######################################
||<v>a       ||<v>d       ||<v>g        ||
## #######################################
||<v>b       ||<v>e       ||<v>h        ||
## #######################################
||<v>c       ||<v>f       ||<v>i        ||
## #######################################

simply take out the comment lines (the lines beginning with ##), like so:

||<v>column 1||<v>column 2||<v>column 3 ||
||<v>a       ||<v>d       ||<v>g        ||
||<v>b       ||<v>e       ||<v>h        ||
||<v>c       ||<v>f       ||<v>i        ||
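For long documents, removing these lines by hand gets tedious. The workaround can be sketched in a few lines of Python, assuming the converted markup is available as a string; the function name `strip_table_comments` is hypothetical and not part of the macro.

```python
def strip_table_comments(markup: str) -> str:
    """Drop every line that begins with ## and keep all other lines unchanged.

    Caveat: this removes ALL lines starting with ##, not only those between
    table rows, so processing instructions or deliberate comments elsewhere
    on the page would be removed as well.
    """
    kept = [line for line in markup.splitlines() if not line.startswith("##")]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "\n".join([
        "## #######################################",
        "||<v>column 1||<v>column 2||<v>column 3 ||",
        "## #######################################",
        "||<v>a       ||<v>d       ||<v>g        ||",
        "## #######################################",
    ])
    print(strip_table_comments(sample))
```

Run the filter on the pasted text before saving the page, and review the result if the page uses ## comments for anything other than table separators.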

To Do

There are bugs, some significant, and no error checking to speak of. The converted version is seldom ready to post directly, and requires stepping through the whole document, often with a printed copy of the Word document, to fix the differences. On the other hand, if the document is important enough to go on your Wiki, then you probably planned to read through it once anyway.

Contact

Feel free to contact me (JohnWhitlock) with comments and questions, but I don't have much time to work on any problems you might be having. However, feel free to change and play with the code if you know enough Visual Basic to improve it.

History

10/17/08

  1. Updated macro to write <<BR>> tags, not [[BR]].

6/28/07 Thu

Softintheheadware completes version 2.0, with support for multi-level/alpha/roman lists and easier-on-the-eye wiki ML for tables.

  1. Converts tables to Wiki ML that is easier to read & edit
    1. Comment lines between rows.
    2. Space-padded columns (edit these in INSERT mode using a fixed-width font).
  2. Extended support for lists
    1. Multi-level lists
    2. Alpha lowercase
    3. Alpha uppercase
    4. Roman lowercase
    5. Roman uppercase

06/16/07

This script has gone about as far as it can go without a more formal approach. If I get a week to come back to it (and it makes sense for my job), then I'll start over with a ProgrammerTest model, with generated Word documents as test cases.

A potential improvement might be to generalize it so that it could target several Wiki engines. That way, the net could be cast as far as possible for developers. However, at that point we're talking about a SourceForge project, something I don't have the time for at present.


(crosslinked here for convenience)

MoinMoin: MicrosoftWordConverter (last edited 2011-07-26 09:33:26 by NicoZanferrari)