What we need

from MoinMoin.util import pysupport
modules = pysupport.getPackageModules(__file__)

Starting Points:

First version

This patch makes it possible to search inside attachments in version 1.3. So far only PDF files are supported, converted with pdftotext (from the xpdf package), so it currently works only on Linux. The following files have changed:

moin/lib/python2.3/site-packages/MoinMoin/search.py
moin/lib/python2.3/site-packages/MoinMoin/Attachment.py
moin/lib/python2.3/site-packages/MoinMoin/formatter/text_html.py
moin/lib/python2.3/site-packages/MoinMoin/action/fullsearch.py
moin/lib/python2.3/site-packages/MoinMoin/action/AttachFile.py
moin/lib/python2.3/site-packages/MoinMoin/attach2txt/pdf2txt.py
moin/lib/python2.3/site-packages/MoinMoin/attach2txt/__init__.py

You will have to create the directory data/cache/AttachSearch manually.

It works as follows: when a text search is performed, the text versions of each page's attachments are searched in a special cache directory. Suppose the page WikiPage has an attachment att.suf. The file data/cache/AttachSearch/WikiPage/att.suf.txt is opened if it exists, and a normal search is performed on it. If it does not exist yet, the proper conversion method is looked up in attach2txt.__init__.converter_mapping, a dictionary in the attach2txt package that maps file suffixes to conversion functions ( {"pdf": pdf2txt.convert} ).
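
The lookup described above can be sketched roughly as follows. This is not the code from the patch: it is a minimal illustration in modern Python, and the function names cached_text_path and attachment_text, as well as the (source path, text path) signature of the converter, are assumptions made for the example. Only the cache layout and the idea of a suffix-to-converter dictionary come from the description.

```python
import os


def cached_text_path(cache_root, pagename, attachment):
    """Return the cached text file for an attachment, e.g. <root>/WikiPage/att.suf.txt."""
    return os.path.join(cache_root, pagename, attachment + ".txt")


def attachment_text(cache_root, pagename, attachment_path, converter_mapping):
    """Return the searchable text of one attachment, converting it on first use.

    converter_mapping maps a lowercase suffix such as "pdf" to a function
    taking (source_path, text_path); that signature is an assumption.
    Returns None when no converter is registered for the suffix.
    """
    attachment = os.path.basename(attachment_path)
    txt_path = cached_text_path(cache_root, pagename, attachment)

    if not os.path.exists(txt_path):
        # No cached text yet: pick a converter by file suffix.
        suffix = os.path.splitext(attachment)[1].lstrip(".").lower()
        convert = converter_mapping.get(suffix)
        if convert is None:
            return None
        os.makedirs(os.path.dirname(txt_path), exist_ok=True)
        convert(attachment_path, txt_path)
        if not os.path.exists(txt_path):
            # The converter wrote nothing; treat it as unsearchable for now.
            return None

    # Cached text exists (possibly empty): read it for the normal search.
    with open(txt_path, encoding="utf-8", errors="replace") as f:
        return f.read()
```

A caller would pass something like {"pdf": pdf2txt.convert} as converter_mapping and run the ordinary text search over the returned string. Because the cached file is checked first, each attachment is converted at most once.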

If pdftotext fails to convert a PDF file, an empty text file is created. On the next search, MoinMoin does not try to convert that file again.
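
A converter with this failure behaviour might look like the sketch below. It is an assumption about how pdf2txt.convert could be written, not the patch's actual code: the command parameter and the boolean return value are invented for illustration, and the empty file is written explicitly here so the behaviour does not depend on what pdftotext itself leaves behind. The invocation pdftotext <input.pdf> <output.txt> is the tool's standard command-line form.

```python
import subprocess


def convert(pdf_path, txt_path, command="pdftotext"):
    """Convert pdf_path to plain text at txt_path.

    On failure an empty file is left at txt_path, so a later search
    finds the cached file and does not retry the conversion.
    Returns True on success, False otherwise.
    """
    try:
        result = subprocess.run(
            [command, pdf_path, txt_path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        ok = result.returncode == 0
    except OSError:
        # The converter binary is missing or cannot be executed.
        ok = False

    if not ok:
        # Create (or truncate to) an empty marker file.
        open(txt_path, "w").close()
    return ok
```

One consequence of this design is that a PDF which failed once is never retried, even after pdftotext is installed or upgraded; deleting the empty file from data/cache/AttachSearch forces a new attempt.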

Ideas

TODO

MoinMoin: MoinMoinTodo/ExtendedSearch/AttachmentSearch (last edited 2007-10-29 19:21:07 by localhost)