1 2010-04-03T00:01:44  <dreimark> set([hit.page_name for hit in result.hits]) could become a test util function
   2 2010-04-03T00:05:16  *** selevt_ has quit IRC
   3 2010-04-03T00:11:21  *** franklin has quit IRC
   4 2010-04-03T00:12:23  *** franklin has joined #moin-dev
   5 2010-04-03T00:28:07  <ThomasWaldmann> it could all become a big list and generative tests
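
A minimal sketch of the two ideas above: a test helper wrapping the set expression, and one list of cases driving generated checks. `Hit`, `Result`, `fake_search`, `CASES` and the page names are hypothetical stand-ins, not MoinMoin's real classes or test data; only the `hits` and `page_name` attributes come from the discussion.

```python
from collections import namedtuple

# Hypothetical stand-ins for the search result objects; only the
# `hits` and `page_name` attributes mentioned in the log are assumed.
Hit = namedtuple("Hit", "page_name")
Result = namedtuple("Result", "hits")


def hit_page_names(result):
    """Return the set of page names found by a search result."""
    return set(hit.page_name for hit in result.hits)


# (query, expected page names) pairs: the "big list" driving the checks.
CASES = [
    ("title:FrontPage", {"FrontPage"}),
    ("title:re:Help.*", {"HelpOnXapian", "HelpContents"}),
]


def fake_search(query):
    """Illustrative backend returning canned hits (duplicates included)."""
    canned = {
        "title:FrontPage": ["FrontPage"],
        "title:re:Help.*": ["HelpOnXapian", "HelpContents", "HelpOnXapian"],
    }
    return Result([Hit(name) for name in canned[query]])


def check_search(query, expected):
    """Compare the found page names with the expected set."""
    assert hit_page_names(fake_search(query)) == expected


def generate_search_checks():
    """Yield one (check, query, expected) triple per list entry."""
    for query, expected in CASES:
        yield check_search, query, expected
```

Comparing sets of names makes duplicate hits on one page irrelevant, and a new case only needs one more entry in the list.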
   6 2010-04-03T00:30:39  *** valeuf has quit IRC
   7 2010-04-03T00:31:45  <ThomasWaldmann> http://paste.pocoo.org/show/196886/
   8 2010-04-03T00:31:56  <ThomasWaldmann> fix for search, please review
   9 2010-04-03T00:34:41  *** valeuf has joined #moin-dev
  10 2010-04-03T00:48:11  <JosefMeier> Ok guys. GN8
  11 2010-04-03T00:48:36  *** JosefMeier has quit IRC
  12 2010-04-03T02:31:43  *** valeuf_ has joined #moin-dev
  13 2010-04-03T02:33:37  *** valeuf has quit IRC
  14 2010-04-03T02:33:37  *** valeuf_ is now known as valeuf
  15 2010-04-03T04:20:29  *** valeuf_ has joined #moin-dev
  16 2010-04-03T04:21:06  *** valeuf has quit IRC
  17 2010-04-03T04:21:06  *** valeuf_ is now known as valeuf
  18 2010-04-03T05:57:00  *** valeuf_ has joined #moin-dev
  19 2010-04-03T05:59:38  *** valeuf has quit IRC
  20 2010-04-03T05:59:38  *** valeuf_ is now known as valeuf
  21 2010-04-03T07:09:32  *** valeuf has quit IRC
  22 2010-04-03T07:10:25  *** valeuf has joined #moin-dev
  23 2010-04-03T11:02:17  <dreimark> the second part is known, that was one of my first ideas. I am not sure about the first hunk. Why isn't it if self.use_re or self.case:
  24 2010-04-03T11:02:47  <dreimark> we know that xapian can't do case search, so we could also return an empty query
  25 2010-04-03T11:04:25  <dreimark> For this or the other condition I am not sure if we get the same results, because xapian does a title and attachment match.
  26 2010-04-03T11:05:37  <ThomasWaldmann> if moinsearch is doing postprocessing for xapian, it gets a page list from the results of xapian search, so it doesn't have to search all pages
  27 2010-04-03T11:06:25  <dreimark> the list was in the past empty
  28 2010-04-03T11:06:52  <ThomasWaldmann> thus, giving the results of a case-insensitive search is better than giving all pages
  29 2010-04-03T11:06:55  <dreimark> btw. moin
  30 2010-04-03T11:07:28  <ThomasWaldmann> yeah, moin :)
  31 2010-04-03T11:12:00  * dreimark builds a new index
  32 2010-04-03T11:15:16  <dreimark> I am not sure what the result is but I think case:re should have less results than re alone
  33 2010-04-03T11:20:14  <ThomasWaldmann> sure
  34 2010-04-03T11:20:32  <ThomasWaldmann> <= to be exact
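
The two-stage idea discussed above (an index lookup that yields case-insensitive candidates, followed by regex or case-sensitive postprocessing over only those pages) can be sketched roughly as follows. `PAGES`, `index_candidates` and `postprocess` are hypothetical names for illustration; this is not the code from the committed fix.

```python
import re

# Hypothetical page store: page name -> content.
PAGES = {
    "WebPages": "Webseiten sind hier",
    "LowerCase": "webseiten in lower case",
    "Other": "nothing relevant",
}


def index_candidates(word):
    """Stand-in for the index lookup: case-insensitive word match only."""
    return [name for name, text in PAGES.items()
            if word.lower() in text.lower()]


def postprocess(candidates, pattern, case=False):
    """Run the real regex over the candidate pages only, not all pages."""
    flags = 0 if case else re.IGNORECASE
    regex = re.compile(pattern, flags)
    return set(name for name in candidates if regex.search(PAGES[name]))


candidates = index_candidates("webseiten")
insensitive = postprocess(candidates, r"Webseiten.*")
sensitive = postprocess(candidates, r"Webseiten.*", case=True)

# The case-sensitive result can never be larger than the insensitive one.
assert sensitive <= insensitive
```

Because the case-sensitive pass filters the same candidate list more strictly, its result is always a subset of the case-insensitive result, which is the "<=" relation noted above.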
  35 2010-04-03T11:21:39  <dreimark> yeah, that was also broken in the past
  36 2010-04-03T11:24:48  *** selevt_ has joined #moin-dev
  37 2010-04-03T11:27:02  <dreimark> ThomasWaldmann: i think i found a new problem, i think we can't search for attachments if the page does not exist, right?
  38 2010-04-03T11:27:43  <dreimark> i added a pwgen string to a file and uploaded it to a not existing page
  39 2010-04-03T11:29:28  <ThomasWaldmann> not sure, but that is a bad usecase anyway
  40 2010-04-03T11:29:45  * dreimark verifies if by a rebuild it becomes added to the index
  41 2010-04-03T11:30:55  <dreimark> the usecase is bad, but if you first add the files by xmlrpc and then the page, they might never go into the index
  42 2010-04-03T11:32:41  <dreimark> hmm, ok, if the page is created it finds the file
  43 2010-04-03T11:40:24  <ThomasWaldmann> can we first check the problem that is addressed by the fix?
  44 2010-04-03T11:40:42  <dreimark> yes
  45 2010-04-03T11:40:57  * dreimark just found that because of a testsetup
  46 2010-04-03T11:41:13  <dreimark> it looks to me that TitleIndex has become faster
  47 2010-04-03T11:51:04  <dreimark> ThomasWaldmann: the xapian regex queries are not optimized for re patterns.
  48 2010-04-03T11:51:14  <dreimark> something like this http://moinmo.in/MoinMoinBugs/1.9.2XapianRegexNeedsCase?action=AttachFile&do=view&target=xapian_re_search_opt.patch#CA-264d623c1e3cc0a188a3e0cf91e765900716ac8f_23
  49 2010-04-03T11:51:26  <dreimark> makes the output 4 times faster
  50 2010-04-03T11:51:58  <dreimark> for a searchterm of title:re:Webseiten.*
  51 2010-04-03T11:52:03  <ThomasWaldmann> TitleIndex does not search
  52 2010-04-03T11:52:38  <dreimark> Results 1 - 25 of about 959 results from about 894 pages. (8.62 seconds)
  53 2010-04-03T11:52:43  <dreimark> without the patch
  54 2010-04-03T11:53:16  <dreimark> and with it
  55 2010-04-03T11:53:19  <dreimark> Results 1 - 25 of about 959 results from about 894 pages. (2.67 seconds)
  56 2010-04-03T11:53:59  <dreimark> factor of 3.3
  57 2010-04-03T11:54:03  <ThomasWaldmann> fs caching?
  58 2010-04-03T11:55:09  <dreimark> may be, i am currently wondering, or there was an updatedb process running
  59 2010-04-03T11:57:09  <dreimark> redoing the regex patch test, it is not influenced much, only 0.2 sec difference
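
One way to separate a real speedup from filesystem caching or a background process is to repeat each measurement and compare only warm runs. The sketch below assumes the operation can be wrapped in a plain callable; `timed_runs`, `summarize` and `speedup` are hypothetical helper names, not part of MoinMoin.

```python
import time


def timed_runs(func, repeats=5):
    """Call func repeatedly and return the wall-clock duration of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Split the first (possibly cold-cache) run from the best warm run."""
    first, warm = durations[0], durations[1:]
    return first, min(warm)


def speedup(before, after):
    """Ratio of two durations measured under the same conditions."""
    return before / after
```

For the two single timings quoted above, `speedup(8.62, 2.67)` gives about 3.2, but that figure only means something if both runs had the same cache state.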
  60 2010-04-03T11:57:23  <dreimark> retries titleindex
  61 2010-04-03T11:59:11  <ThomasWaldmann> that (unrelated) xapian pattern "optimization" you do in your patch is just covering some few special cases (maybe even incorrectly?) and does nothing for equivalent/similar other regexes
  62 2010-04-03T12:01:46  <dreimark> yes i know already it was a trial and it is not covered by tests.
  63 2010-04-03T12:02:16  <ThomasWaldmann> and it looks like maybe just using search might be better than requiring .*
  64 2010-04-03T12:02:18  <dreimark> but also it points to the problem of a few bad regexes which can run out of RAM
  65 2010-04-03T12:03:20  <ThomasWaldmann> yes, but that all is not related to the patch we look at now
  66 2010-04-03T12:03:45  <dreimark> i know but i comared it with the one i did
  67 2010-04-03T12:03:49  <dreimark> +p
  68 2010-04-03T12:04:06  <dreimark> ok titleindex is now always 1.2 secs
  69 2010-04-03T12:15:44  * dreimark thinks the patch is working
  70 2010-04-03T12:16:15  <ThomasWaldmann> at least the tests tell so
  71 2010-04-03T12:17:37  <CIA-55> Thomas Waldmann <tw AT waldmann-edv DOT de> default * 5645:028f38513d8e 1.9/MoinMoin/search/queryparser/expressions.py: fix regex content search for xapian search
  72 2010-04-03T12:20:55  <dreimark> \o/
  73 2010-04-03T12:22:43  <ThomasWaldmann> ok, for everything else please new bug reports / tests
  74 2010-04-03T12:24:04  <ThomasWaldmann> also, I would like to get rid of the len == 14 (or similar) checks in search tests
  75 2010-04-03T12:24:55  <dreimark> me would like to have search strings we usually don't have in a testwiki
  76 2010-04-03T12:25:34  <dreimark> that is quite annoying to have to remember that the test would not fail if i had used an empty wiki
  77 2010-04-03T12:26:03  <dreimark> while on the other hand a desktop wiki with different content can show a problem too
  78 2010-04-03T12:26:09  <ThomasWaldmann> the test wiki setup should be fixed
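
A rough sketch of a search test that does not depend on what else the wiki contains: it plants a unique marker string (in the spirit of the pwgen string used earlier) and asserts on the exact set of pages found rather than on a hit count such as `len == 14`. The toy `search` function and the page name `SearchTestPage` are hypothetical; a real test would go through the wiki's own search and test setup.

```python
import uuid


def make_marker():
    """Generate a search string that is very unlikely to exist in any wiki."""
    return "moinsearchtest" + uuid.uuid4().hex


def search(pages, term):
    """Toy search over {page name: content}; stands in for the real search."""
    return set(name for name, text in pages.items() if term in text)


def check_marker_search(existing_pages):
    """Plant a marker page and verify that exactly that page is found."""
    marker = make_marker()
    pages = dict(existing_pages)  # do not modify the caller's wiki content
    pages["SearchTestPage"] = "text with %s inside" % marker
    found = search(pages, marker)
    assert found == {"SearchTestPage"}
    return found
```

The same check gives the same answer on an empty wiki and on a desktop wiki with unrelated content, which a fixed count cannot do.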
  79 2010-04-03T12:41:30  * ThomasWaldmann will be away later, doing some server hardware repair
  80 2010-04-03T12:42:34  *** JosefMeier has joined #moin-dev
  81 2010-04-03T12:42:38  <JosefMeier> Moin
  82 2010-04-03T12:46:35  <dreimark> hi JosefMeier see the recent cs
  83 2010-04-03T12:47:07  <dreimark> ThomasWaldmann: s/away/back/ ?
  84 2010-04-03T12:48:32  <ThomasWaldmann> both, as I am still here :)
  85 2010-04-03T12:53:25  <JosefMeier> dreimark: I'm confused now. Can you describe in a few sentences, how the combination xapian<->moin search can regex-search in attachments also? I always thought that I need xapian for being able to search in attachments. But regex search should work also for attachments and if xapian can't do that, this means to me, that regex search in attachments won't work. Right?
  86 2010-04-03T12:55:42  <dreimark> attachments search works with xapian
  87 2010-04-03T12:56:19  <dreimark> it was not able to search in page content by regex search
  88 2010-04-03T12:56:36  <dreimark> don't forget that moin-1.9 doesn't have items
  89 2010-04-03T12:56:49  <dreimark> pages and attachments handled different
  90 2010-04-03T12:56:59  <dreimark> +are
  91 2010-04-03T12:59:24  <JosefMeier> dreimark: you mean that regex search in attachments works with xapian?
  92 2010-04-03T13:00:42  <ThomasWaldmann> you can't regex search in attachments
  93 2010-04-03T13:01:01  <ThomasWaldmann> .... contents
  94 2010-04-03T13:01:06  <dreimark> you can only do a match of a word
  95 2010-04-03T13:03:34  <JosefMeier> In my opinion it would be great to describe all this on the HelpOnXapian page. Cause at the moment it isn't very clear, what's possible with xapian/moin search and what's not.
  96 2010-04-03T13:04:28  <ThomasWaldmann> do it :)
  97 2010-04-03T13:04:43  <dreimark> ThomasWaldmann: was faster :)
  98 2010-04-03T13:06:44  <JosefMeier> Cool idea. Someone who doesn't understand a feature writes documentation for it. Maybe there's some misunderstanding: I'm not working for Microsoft :-)
  99 2010-04-03T13:09:06  <dreimark> haha, we improve it all the time, but also we don't know everything someone needs or wants to know about it
 100 2010-04-03T13:12:30  <ThomasWaldmann> bbl
 101 2010-04-03T13:33:00  <dreimark> bbl2
 102 2010-04-03T13:53:29  *** JosefMeier has quit IRC
 103 2010-04-03T14:05:32  *** JosefMeier has joined #moin-dev
 104 2010-04-03T14:53:57  *** JosefMeier has quit IRC
 105 2010-04-03T14:55:46  *** JosefMeier has joined #moin-dev
 106 2010-04-03T14:58:46  *** JosefMeier1 has joined #moin-dev
 107 2010-04-03T14:58:50  *** JosefMeier has quit IRC
 108 2010-04-03T15:01:52  *** JosefMeier1 has quit IRC
 109 2010-04-03T15:02:38  *** JosefMeier1 has joined #moin-dev
 110 2010-04-03T15:21:03  *** JosefMeier1 has quit IRC
 111 2010-04-03T15:43:52  *** cosmodad has joined #moin-dev
 112 2010-04-03T15:49:19  *** selevt has joined #moin-dev
 113 2010-04-03T15:49:45  *** selevt_ has quit IRC
 114 2010-04-03T15:57:18  *** JosefMeier has joined #moin-dev
 115 2010-04-03T16:19:40  *** cosmodad has quit IRC
 116 2010-04-03T16:41:13  <ThomasWaldmann> re
 117 2010-04-03T18:40:08  *** JosefMeier has quit IRC
 118 2010-04-03T19:01:09  *** JosefMeier has joined #moin-dev
 119 2010-04-03T19:29:14  *** JosefMeier has quit IRC
 120 2010-04-03T19:31:31  *** JosefMeier has joined #moin-dev
 121 2010-04-03T19:48:26  *** JosefMeier has quit IRC
 122 2010-04-03T19:52:01  *** JosefMeier has joined #moin-dev
 123 2010-04-03T20:03:53  *** valeuf_ has joined #moin-dev
 124 2010-04-03T20:06:01  *** valeuf has quit IRC
 125 2010-04-03T20:06:01  *** valeuf_ is now known as valeuf
 126 2010-04-03T21:14:04  <xorAxAx> is there a way to get mails on new student proposals?
 127 2010-04-03T21:20:37  *** valeuf has left #moin-dev
 128 2010-04-03T21:26:21  <ThomasWaldmann> not afaik, but after there is one, one can subscribe to updates
 129 2010-04-03T21:32:46  *** valeuf has joined #moin-dev
 130 2010-04-03T21:40:22  <ThomasWaldmann> valeuf: are you getting notified when someone posts a comment to your application?
 131 2010-04-03T22:49:49  *** selevt has quit IRC
 132 2010-04-03T23:28:28  <ThomasWaldmann> http://moinmo.in/FeatureRequests/AutoScrollingTheEditorTextArea anybody objecting including this in 1.9?

MoinMoin: MoinMoinChat/Logs/moin-dev/2010-04-03 (last edited 2010-04-02 22:15:03 by IrcLogImporter)