Attachment 'replace_in_wiki.py'

r"""
    WikiRpc search-and-replace tool for MoinMoin

    Copyright 2008 Remco Boerma <r.boerma@drenthecollege.nl>

    Purpose:
        This script performs a search-and-replace across multiple wiki pages
        at once. On request, {{{ preformatted }}} blocks, `text` and
        = level 1 titles = are left untouched.

        Some parameters are set through environment variables ('username'
        and 'password', for example). Sometimes you want to change every
        occurrence of a needle, but at other times it is very important not
        to change just everything. At run time the user is asked whether
        safe replacement should be used.

    Configuration:
        Environment variables:
            username :: the username to be used for xmlrpc
            password :: the password to be used for xmlrpc <optional>
            moin_url :: the hostname of your moinmoin server

        Hardcoded:
            MOIN_URL :: set this to your site's hostname if you don't want to use
                        the environment variable
            RPC_URL  :: normally built from MOIN_URL, but this allows for
                        complex urls
            several regular expressions used in get_save_regions ::
                        regular expressions that determine which regions not to touch

    Usage:
        1. Set the username and moin_url environment variables. If you like,
           set the password variable as well (otherwise you have to type the
           password at every execution).
        2. Start replace_in_wiki.py.
        3. Enter the search expression that selects the pages to apply this
           search-and-replace to.
        4. Enter the needle to search for in the pages (it is always treated
           as a regular expression).
        5. Enter the replacement expression (it can use \1, \2 etc.).
        6. Enter yes, y or simply hit enter to enable SaveReplacer (and keep
           {{{preformatted}}}, `text` and = titles = intact), or enter anything
           else to replace every occurrence on each page.
        7. Examine the diff between the old and the new content and decide:
        8. Either accept the change (yes, y, or simply enter) and upload the new
           page to the server, or enter any other value (and enter) to skip
           uploading the new version.

    Notes:
        1. The regular expression search is performed with re.MULTILINE and re.DOTALL
           as default options: this allows for multiline matches, as well as using ^ and
           $ in your expression on a line basis.


    Verified to work with:
      * Python 2.5
      * MoinMoin 1.5.6 (using plain http authentication)

    @license: GPL v2

"""
import xmlrpclib
import getpass,os,re,difflib,sys


MOIN_URL  = os.environ.get('moin_url','your.wiki.host.here')
RPC_URL   = "http://%%s:%%s@%s/?action=xmlrpc2" % MOIN_URL

DEBUG = 0 # SaveReplacer debugging: 1 traces replacements, 3 also traces region checks
class SaveReplacer(object):
    """Replace matches of a needle, except inside protected ("save") regions."""

    def __init__(self,text):
        self.text = text

    def _is_a_save_region(self,start,end):
        # True if the match span [start, end) overlaps a protected region,
        # including a match that completely encloses one.
        if DEBUG>2: print 'Checking %d,%d in %s'%(start,end,self._save_regions)
        for save_start,save_end in self._save_regions:
            if start < save_end and end > save_start:
                return True
        return False

    def _do_replace(self,match):
        # Callback for sub(): expand the replacement, or return the matched
        # text unchanged when the match touches a protected region.
        start = match.start()
        end = match.end()
        text = match.string
        newtext = match.expand(self.replacement)
        if not self._is_a_save_region(start,end):
            if DEBUG: print 'Changing',repr(text[start:end]),'@ %d-%d' % (start,end),'to',repr(newtext)
            return newtext
        else:
            if DEBUG: print 'PREVENTED',repr(text[start:end]),'@ %d-%d' % (start,end),'to',repr(newtext)
            return text[start:end]

    def get_save_regions(self):
        regions  = [match.span() for match in re.finditer(r'^=\s.*?\s+=\s*$',self.text,re.MULTILINE)] # multiline to work per line with ^ and $
        regions += [match.span() for match in re.finditer('{{{.*?}}}',self.text,re.DOTALL)] # dotall for . being all including newlines
        regions += [match.span() for match in re.finditer('`.*?`',self.text)]
        return regions

    def run(self,needle,replacement,options=0):
        # The third argument of a compiled pattern's sub() is `count`, not
        # flags, so extra options are merged in by recompiling the needle.
        if options:
            needle = re.compile(needle.pattern,needle.flags | options)
        self.replacement = replacement
        self._save_regions = self.get_save_regions()
        self.text = needle.sub(self._do_replace,self.text)
        del self._save_regions
        del self.replacement

if __name__ == '__main__':
    assert sys.version_info[0:2]>=(2,5), "\n\nThis script requires python 2.5 or higher .. \n\n(re.finditer() method should accept options...)"
    username = os.environ.get('username','RemcoBoerma')
    print 'Using username:',username
    password = os.environ.get('password',None)
    if not password:
        password = getpass.getpass('password for '+username+' : ')

    server = xmlrpclib.ServerProxy(RPC_URL % (username,password))
    query = raw_input('Page search: ')
    print 'searching...',
    pagelist = [ pagename for pagename, junk in server.searchPages(query)]
    maxcount = len(pagelist)
    print 'Found',maxcount,'pages...'
    # Flags have to be given at compile time: the third argument of a
    # compiled pattern's sub() is `count`, not flags.
    needle = re.compile(unicode(raw_input('Needle: ')),re.MULTILINE | re.DOTALL)
    replacement = unicode(raw_input('Replace with: '))
    wants_safe_replace = raw_input('Safe replace  yes/no [yes]:').lower().strip() in ['yes','y','']
    for count,pagename in enumerate(pagelist):
        print '--[%d/%d]------[%s]----' % (count+1,maxcount,pagename)
        text = unicode(server.getPage(pagename))
        if wants_safe_replace:
            save = SaveReplacer(text)
            save.run(needle,replacement)
            patched = save.text
        else:
            patched = needle.sub(replacement,text)
        if patched == text:
            print 'NO Changes'
            continue
        # Show only the changed lines of the diff between old and new content.
        lines = list(difflib.Differ().compare(text.splitlines(True),patched.splitlines(True)))
        for line in lines:
            if line[0] in '-+?':
                print line.encode('utf-8'),
        update_answer=raw_input('Update wiki? yes/no [yes] ').lower().strip()
        if update_answer in ['yes','y','']:
            print 'Updating',
            updated =  server.putPage(pagename,patched)
            print repr(updated)
        else:
            print 'Skipped'

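A note on the re.MULTILINE and re.DOTALL options mentioned in the docstring: with a compiled pattern, flags only take effect when they are passed to re.compile(), because the third positional argument of the pattern's sub() method is `count`. The sketch below (Python 3, with a made-up sample text) illustrates the difference.

```python
import re

text = "first line\nsecond line\n"

# Flags passed to re.compile() are honoured: with re.MULTILINE,
# ^ and $ match at the start and end of every line.
needle = re.compile(r'^(\w+) line$', re.MULTILINE)
per_line = needle.sub(r'\1', text)

# Here re.MULTILINE (integer value 8) lands in the `count` parameter.
# It only limits the number of replacements to 8 and does not change
# how ^ and $ behave, so this pattern finds no match at all.
plain = re.compile(r'^(\w+) line$')
unchanged = plain.sub(r'\1', text, re.MULTILINE)

print(repr(per_line))
print(repr(unchanged))
```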
Attached Files

  • (2008-02-22 15:57:42, 6.4 KB) [[attachment:replace_in_wiki.py]]
  • (2011-03-01 18:11:30, 6.7 KB) [[attachment:replace_in_wiki_LDAP.py]]