1 2009-11-10T00:33:59  *** grzywacz has quit IRC
   2 2009-11-10T00:47:33  *** |mmk[null]| has joined #moin-dev
   3 2009-11-10T00:51:09  *** |mmk[null]| has quit IRC
   4 2009-11-10T04:22:49  *** dimazest has quit IRC
   5 2009-11-10T04:22:49  *** TheSheep has quit IRC
   6 2009-11-10T04:24:05  *** dimazest has joined #moin-dev
   7 2009-11-10T04:28:23  *** TheSheep has joined #moin-dev
   8 2009-11-10T09:00:58  *** grzywacz has joined #moin-dev
   9 2009-11-10T09:16:10  <ThomasWaldmann> moin
  10 2009-11-10T09:36:24  *** grzywacz has quit IRC
  11 2009-11-10T09:39:55  <dreimark> moin
  12 2009-11-10T13:21:47  *** LotekThirteen has joined #moin-dev
  13 2009-11-10T13:28:32  <LotekThirteen> moinnmoin
  14 2009-11-10T13:29:05  <LotekThirteen> it feels that stemming is not working well with the latest release (1.9). For german words it's not working at all..
  15 2009-11-10T13:29:52  <LotekThirteen> Should I fill out a bug report or is this already known?
  16 2009-11-10T13:49:15  <LotekThirteen> funny, did some more tests.. some german word stemmed correctly and other not... you need some examples / testcase I guess?
  17 2009-11-10T14:06:44  <TheSheep> I think you'd report that to xapian, not sure
  18 2009-11-10T14:24:07  <dreimark> LotekThirteen: we know a bit of stemming problems but may be your bug report improves our knowledge
  19 2009-11-10T14:24:31  <dreimark> we have currently a failing test for stemming on which dimazest is working
  20 2009-11-10T14:24:36  <dreimark> bbl
  21 2009-11-10T15:40:29  <LotekThirteen> dreimark: it looks like stemming is working for some words and not for others. compared to moin 1.8 it's not that good at stemming any more... I will write a bug report with some examples, bye
  22 2009-11-10T16:56:06  <LotekThirteen> there seems to be a general problem with the search currently, all my test cases failed, even a simple title search. see here: http://www.moinmo.in/MoinMoinBugs/1.9XapianStemmingNotWorkingCorrectly/Test
  23 2009-11-10T17:13:15  <ThomasWaldmann> btw, you don't need the "www."
  24 2009-11-10T17:16:46  <LotekThirteen> ThomasWaldmann: :-)
  25 2009-11-10T17:17:09  <ThomasWaldmann> LotekThirteen: ^^. btw, moinmo.in had the locking issue, so it likely didn't update the index with your changes.
  26 2009-11-10T17:22:22  <LotekThirteen> ThomasWaldmann: well if you could rebuild the index we see, but I did some test on my home server also and stemming was not working..
  27 2009-11-10T17:22:25  <ThomasWaldmann> LotekThirteen: and it queued up quite some stuff in the update-queue...
  28 2009-11-10T17:22:48  <LotekThirteen> ThomasWaldmann: yes, I created some test pages :)
  29 2009-11-10T17:26:42  <ThomasWaldmann> ok, index is current now
  30 2009-11-10T17:26:51  <ThomasWaldmann> but it still doesn't work, not even for english
  31 2009-11-10T17:27:44  <LotekThirteen> ThomasWaldmann: keyword connect works, but others fail http://moinmo.in/MoinMoinBugs/1.9XapianStemmingNotWorkingCorrectly/Test?action=fullsearch&context=180&value=connect&titlesearch=Titles
  32 2009-11-10T17:28:04  <ThomasWaldmann> btw, for the search queries, we have a rather basic problem anyway: in which language shall we stem the query terms?
  33 2009-11-10T17:29:59  <LotekThirteen> hmm, if the wiki page contains some language info and then in the configfile
  34 2009-11-10T17:30:35  <ThomasWaldmann> yes, for page we have language info (more or less, there could be a language mix on a page)
  35 2009-11-10T17:30:59  <ThomasWaldmann> but (assuming that the wiki is multi language), we don't know the query language
  36 2009-11-10T17:31:57  <LotekThirteen> then the user still can add  "#language en", I see no other solution
  37 2009-11-10T17:32:09  <ThomasWaldmann> add where?
  38 2009-11-10T17:32:32  <LotekThirteen> into the wiki page itself, like http://moinmo.in/HelpOnConfiguration?action=raw
  39 2009-11-10T17:32:41  <LotekThirteen> if the default from config is not right...
  40 2009-11-10T17:32:58  <ThomasWaldmann> yes, that gives the page language
  41 2009-11-10T17:33:16  <ThomasWaldmann> but not the query language (== what you type into the search box)
  42 2009-11-10T17:34:43  <LotekThirteen> maybe wiki user account settings, default config (language) and user agent string
  43 2009-11-10T17:35:00  <ThomasWaldmann> much guessing...
  44 2009-11-10T17:35:25  <LotekThirteen> yes and bad, because if the user normally uses the english interface and wants to search for german words...
  45 2009-11-10T17:38:41  <LotekThirteen> i guess for a search it's also not possible to implement a logic where, if the stemmer can not reduce a word, it tries again with another language (it's maybe even too slow)
  46 2009-11-10T17:41:39  <LotekThirteen> in my eyes use some config language stuff and optionally make the user account language the master (if the admin of a wiki wants it). multilanguage wikis are rare, I guess
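The fallback order suggested above (user account language, then the browser's preference, then the wiki default) could be sketched roughly as follows. This is only an illustration, not MoinMoin code: `guess_query_language`, `parse_accept_language` and all parameter names are hypothetical, and the browser preference is assumed to arrive as an Accept-Language header rather than in the user agent string.

```python
def parse_accept_language(header):
    """Return primary language codes from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Keep only the primary subtag ("en-US" -> "en").
        code = lang.strip().lower().split("-")[0]
        weighted.append((-quality, position, code))
    return [code for _, _, code in sorted(weighted)]


def guess_query_language(user_language, accept_language, wiki_default,
                         supported, user_is_master=True):
    """Guess the stemming language for a query (hypothetical helper).

    Order: user account language (if the admin lets it be the master),
    then browser preferences, then the wiki-wide default.
    """
    candidates = []
    if user_is_master and user_language:
        candidates.append(user_language.lower())
    candidates.extend(parse_accept_language(accept_language or ""))
    for code in candidates:
        if code in supported:
            return code
    return wiki_default


SUPPORTED = {"en", "de"}
from_account = guess_query_language("de", "en-US,en;q=0.8", "en", SUPPORTED)
from_browser = guess_query_language(None, "fr;q=0.9, de;q=0.5", "en", SUPPORTED)
from_default = guess_query_language(None, "", "en", SUPPORTED)
```

The weakness raised in the discussion remains: a user with an English account and browser who types a German word is still guessed as "en", so this only narrows the guessing, it does not remove it.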
  47 2009-11-10T17:45:29  <ThomasWaldmann> i don't think we can check for "if a stemmer can't reduce a word"
  48 2009-11-10T17:45:43  <ThomasWaldmann> (or that we should use another language then)
  49 2009-11-10T17:45:46  <LotekThirteen> maybe look for multilingual stemming: http://en.wikipedia.org/wiki/Stemming#Multilingual_Stemming
  50 2009-11-10T17:48:02  <LotekThirteen> on sphinx search somebody did it with snow ball (english and russian) http://www.sphinxsearch.com/forum/view.html?id=11
  51 2009-11-10T17:57:45  <LotekThirteen> ThomasWaldmann: Maybe it's possible to merge an english and a german stemming algorithm together... but doing this for all the available languages seems illusory, as long as xapian does not support this.
  52 2009-11-10T18:00:37  <LotekThirteen> not only xapian but also snowball...
  53 2009-11-10T18:17:24  <LotekThirteen> bye
  54 2009-11-10T18:22:45  *** LotekThirteen has quit IRC
  55 2009-11-10T18:33:05  <dimazest> moin
  56 2009-11-10T18:52:44  <dreimark> moin
  57 2009-11-10T18:52:52  <dreimark> welcome back dimazest
  58 2009-11-10T18:53:07  <dreimark> ThomasWaldmann: ping
  59 2009-11-10T18:53:39  <dreimark> dimazest: please have a look at ThomasWaldmann latest cs on 1.9
  60 2009-11-10T18:54:43  <dreimark> the most interesting question currently is if we can remove the moin's write-locking and use only xapian locking
  61 2009-11-10T18:58:29  <dreimark> dimazest: and also there is a bugreport http://moinmo.in/MoinMoinBugs/1.9XapianStemmingNotWorkingCorrectly
  62 2009-11-10T19:19:43  <dimazest> ok i will investigate it
  63 2009-11-10T19:27:14  <dreimark> dimazest: thx
  64 2009-11-10T19:27:30  <dreimark> ThomasWaldmann: has more details, after he is back.
  65 2009-11-10T19:27:45  <dimazest> good
  66 2009-11-10T19:46:10  <ThomasWaldmann> hi dimazest
  67 2009-11-10T19:46:57  <ThomasWaldmann> the locking issue happens rather often on moinmo.in. The xapian index won't get updates, because it can't acquire the lock.
  68 2009-11-10T19:47:10  <ThomasWaldmann> That stays until one manually removes the write-lock. :|
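To illustrate the failure mode described here, the following toy sketch shows a write-lock that is left behind by a crashed writer and blocks every later update, and how an expiry time would let a later writer recover. It is not moin's or xapian's locking code: the `DirLock` class, the lock path and the `stale_after` parameter are all assumptions made for the example.

```python
import os
import tempfile
import time


class DirLock:
    """Toy write-lock: the lock is held while a directory exists."""

    def __init__(self, path, stale_after=None):
        self.path = path
        self.stale_after = stale_after  # seconds; None means "never expires"

    def _try_create(self):
        try:
            os.mkdir(self.path)  # atomic: fails if the lock already exists
            return True
        except FileExistsError:
            return False

    def acquire(self):
        if self._try_create():
            return True
        if self.stale_after is None:
            return False  # blocked until someone removes the lock by hand
        try:
            age = time.time() - os.path.getmtime(self.path)
        except FileNotFoundError:
            return self._try_create()  # lock vanished in the meantime
        if age > self.stale_after:
            try:
                os.rmdir(self.path)  # break the stale lock
            except FileNotFoundError:
                pass
            return self._try_create()
        return False

    def release(self):
        os.rmdir(self.path)


lock_path = os.path.join(tempfile.mkdtemp(), "write-lock")
DirLock(lock_path).acquire()            # writer "crashes" without release()
blocked = DirLock(lock_path).acquire()  # no expiry: updates stay blocked

# Pretend the leftover lock is an hour old, then retry with a 10 minute expiry.
hour_ago = time.time() - 3600
os.utime(lock_path, (hour_ago, hour_ago))
recovered = DirLock(lock_path, stale_after=600).acquire()
```

Breaking a lock by age is itself a guess (a slow but live writer could lose its lock), which is one reason relying on a single locking layer is attractive.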
  69 2009-11-10T19:47:30  <ThomasWaldmann> gtg / brb
  70 2009-11-10T19:54:13  <dreimark> hmm what smiley tells a "no"
  71 2009-11-10T20:08:52  <dreimark> bbl
  72 2009-11-10T21:29:13  *** LotekThirteen has joined #moin-dev
  73 2009-11-10T21:50:40  <dimazest> strange i can't clone moin 1.9 repo
  74 2009-11-10T21:51:18  <dimazest> in the beginning the speed is fast, but later it falls to ~1kb/s
  75 2009-11-10T21:52:15  <dimazest> may be it is my unstable internet connection
  76 2009-11-10T21:53:28  <dreimark> dimazest: last time it took me more than 1 h on my slow connection
  77 2009-11-10T21:54:17  <dreimark> applying changes was eating much time
  78 2009-11-10T21:58:40  <dimazest> it works.... i made a bundle on my server
  79 2009-11-10T21:59:06  <dimazest> and copying it with rsync, so i can continue after connection went too slow
  80 2009-11-10T21:59:38  <dreimark> dimazest: http://moinmo.in/ReimarBauer?action=AttachFile&do=get&target=1.9.tgz
  81 2009-11-10T21:59:51  <dreimark> ah ok, had the same idea
  82 2009-11-10T22:00:40  <dreimark> that was created some minutes ago, if you want to fetch it
  83 2009-11-10T22:00:52  <dreimark> please tell
  84 2009-11-10T22:01:37  <dimazest> i've got almost half, thanks
  85 2009-11-10T22:31:24  <LotekThirteen> moinmoin
  86 2009-11-10T22:31:47  <dreimark> dimazest: LotekThirteen has reported the latest bugreport
  87 2009-11-10T22:31:48  <LotekThirteen> about the xapian bug, I added some stuff / touches about multilingual stemming on this bug report http://moinmo.in/MoinMoinBugs/1.9XapianStemmingNotWorkingCorrectly
  88 2009-11-10T22:33:30  <LotekThirteen> it's maybe also something to think about or can lead to a problem if you try to find german pages in an english wiki :-)
  89 2009-11-10T22:34:08  <LotekThirteen> I don't even know how the stemming stuff currently works...
  90 2009-11-10T22:38:29  <dimazest> LotekThirteen: thanks for examples
  91 2009-11-10T22:39:02  <dimazest> i will stem them (so we will see what is output for stemmer)
  92 2009-11-10T22:39:48  <LotekThirteen> yes ThomasWaldmann helped, and maybe some basic discussion about stemming is needed. because in my eyes multilingual stemming is not possible with the current xapian / snowball stuff.. but I'm not a pro in this case :-)
  93 2009-11-10T22:40:27  <dimazest> with mixed languages we can stem a word in all available languages :)
  94 2009-11-10T22:40:44  <dimazest> but i do not know will it help or not
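The idea of stemming a query word in every available language could look roughly like the sketch below. The two suffix-stripping functions are deliberately crude stand-ins so the example runs on its own; they are not the Snowball algorithms. In a real setup the per-language stemmers would presumably come from Snowball (for example through Xapian's `Stem` class). All function names here are hypothetical.

```python
def toy_stem_en(word):
    """Crude English-like suffix stripper (stand-in, not Snowball)."""
    for suffix in ("ions", "ion", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def toy_stem_de(word):
    """Crude German-like suffix stripper (stand-in, not Snowball)."""
    for suffix in ("ungen", "ung", "en", "er", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


STEMMERS = {"en": toy_stem_en, "de": toy_stem_de}


def stem_in_all_languages(word, stemmers=STEMMERS):
    """Return the distinct stems produced by every stemmer, sorted."""
    return sorted({stem(word.lower()) for stem in stemmers.values()})


def expand_query(words, stemmers=STEMMERS):
    """Turn each word into an OR group of its stems; groups are ANDed."""
    groups = []
    for word in words:
        stems = stem_in_all_languages(word, stemmers)
        groups.append("(" + " OR ".join(stems) + ")")
    return " AND ".join(groups)


english_stems = stem_in_all_languages("connections")
german_stems = stem_in_all_languages("Verbindungen")
query = expand_query(["connections", "Verbindungen"])
```

Each word expands to one term per distinct stem, so the query grows with the number of languages, and a stem from the wrong language can match unrelated words. Whether that trade-off helps is exactly the open question in the discussion.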
  95 2009-11-10T22:41:35  <dimazest> the main problem is in snowbal (to make search work both with moin search and xapian)
  96 2009-11-10T22:42:00  <dimazest> and stemming problem may be not in the stemming but because of this duality
  97 2009-11-10T22:42:22  <dimazest> there is a failing testcase which i do not know yet how to fix
  98 2009-11-10T22:43:22  <LotekThirteen> even if you can stem words "multilingual", how should the queryparser know in which language it should stem its input... as far as I saw snowball can not do this, for now...
  99 2009-11-10T22:45:00  <dimazest> http://hg.moinmo.in/moin/1.9/file/a728d059c78e/MoinMoin/search/_tests/test_search.py#l431 here is the test
 100 2009-11-10T22:45:46  <dimazest> what is wnowball?
 101 2009-11-10T22:47:24  <LotekThirteen> wnowball? don't know I wrote snowball.. http://snowball.tartarus.org/
 102 2009-11-10T22:48:57  <dimazest> sorry, that's my typing
 103 2009-11-10T22:53:16  <dimazest> ok, i have working environment with moin and xapian
 104 2009-11-10T22:56:23  <dimazest> good night
 105 2009-11-10T22:57:16  <LotekThirteen> bye dimazest
 106 2009-11-10T22:58:09  <dreimark> good night dimazest
 107 2009-11-10T23:07:41  <dimazest> oops, with xapian-core-1.0.16 i have failing tests
 108 2009-11-10T23:16:29  <dreimark> dimazest: may be try with a fresh test wiki
 109 2009-11-10T23:19:28  <dreimark> http://trac.xapian.org/ticket/185 this is marked as fixed in 13
 110 2009-11-10T23:22:13  <dreimark> good night
 111 2009-11-10T23:31:47  *** LotekThirteen has left #moin-dev
 112 

MoinMoin: MoinMoinChat/Logs/moin-dev/2009-11-10 (last edited 2009-11-09 23:45:02 by IrcLogImporter)