Description

The standalone server in moin 1.7.1 leaks a file descriptor on every request: it looks like it opens data/edit-log for reading, but then never closes it. As a result, running it for our intranet wiki (~3000 pages, ~100 users) fails: it runs for 10-15 minutes and then crashes.

Incidentally, I'm not positive that the crash is due to the file descriptor leak, since nothing is logged! I would have at least expected a stack trace with "Too many open files" (EMFILE) somewhere. But even if something else is crashing moin, leaking a file handle on every request is pretty bad.
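
For reference, here is a minimal sketch of what descriptor exhaustion looks like from inside a Python process. It is not moin code: exhaust_descriptors and its soft_limit parameter are made up for illustration, it is written for a current Python 3 rather than the 2.x versions listed below, and it assumes a Unix system where the resource module is available. It temporarily lowers the soft RLIMIT_NOFILE limit, leaks descriptors until open fails, and then restores the limit.

```python
import errno
import os
import resource


def exhaust_descriptors(soft_limit=64):
    """Leak descriptors under a lowered limit; return (errno, number leaked)."""
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never try to raise the soft limit here, only lower it.
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, old_hard))
    leaked = []
    try:
        while True:
            # Each call consumes one descriptor and nothing closes it.
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno, len(leaked)
    finally:
        # Clean up so the calling process is left as it was.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    code, count = exhaust_descriptors()
    print("failed after %d opens: %s" % (count, os.strerror(code)))
```

If the server really did run out of descriptors, an error of this kind is what one would expect to find in a traceback, assuming the exception is not swallowed somewhere.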

Steps to reproduce

  1. run "moin ... server standalone"
  2. run "lsof -a -c moin -d 0-1023" to list open files (this works on Linux, and should work on any other Unix with lsof). Note that a brand-new moin server process has only stdin/stderr/stdout, a TCP socket for listening, and any log files you have configured
  3. load a page, wait a second for things to settle, and re-run lsof: moin has one file descriptor open to data/edit-log in read mode

[...repeat until bored...]

If you repeat the last step enough times, presumably moin will eventually run out of file descriptors and crash.
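
The lsof check can also be scripted. The following is a rough sketch, assuming Linux with /proc mounted and a current Python 3; the helper names open_descriptor_targets and count_matching are invented for this example and are not part of moin. It lists what each descriptor of a process points at and counts those whose path ends in a given suffix, such as "/edit-log".

```python
import os


def open_descriptor_targets(pid="self"):
    """Return {fd_number: target} for a process, read from /proc (Linux only)."""
    fd_dir = "/proc/%s/fd" % pid
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def count_matching(pid, suffix):
    """Count descriptors of the process whose target path ends with suffix."""
    targets = open_descriptor_targets(pid)
    return sum(1 for target in targets.values() if target.endswith(suffix))


if __name__ == "__main__":
    import sys

    # Usage: python thisscript.py <pid of the moin server>
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    print(count_matching(pid, "/edit-log"))
```

Running it against the server's pid before and after loading a page should show the count going up by one per request if the leak is present. Reading another user's /proc entry needs the appropriate permissions.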

Incidentally, this affects more than just loading pages: requesting an edit, cancelling the edit, preview, save, ... pretty much any HTTP request to the standalone server leaks a file descriptor.

Also, the number of open files occasionally drops after a while. I assume this is GC kicking in long after the fact. (Weird: you expect that from Java with its asynchronous GC, but not from Python. Not sure what is going on there.)

Example

See reproduction steps.

Component selection

Details

MoinMoin Version

1.7.1 (likely also other, older moin versions)

OS and Version

Ubuntu 7.04 (feisty fawn); RHEL 4

Python Version

2.5.1; 2.3.?

Server Setup

standalone (likely also other persistent servers)

Server Details

Language you are using the wiki in

en

Workaround

Do not use the standalone server: we went back to the CGI script.

(!) Of course this was a bug in the code that needed fixing.

But maybe consider using mod_wsgi with maximum-requests=N (with N >> 1) instead of CGI for such cases: the server process is then restarted after N requests, which releases any leaked descriptors.

Discussion

Note: The file is closed by Python's garbage collector once the last reference to the EditLog object has been removed and the object gets collected. The bug was likely some circular reference that kept the gc from collecting it.
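
The effect of a circular reference on file closing can be illustrated with a small sketch. This is not the moin code: LogReader is a made-up stand-in for an object that holds an open file, the behaviour shown is specific to CPython's reference counting, and the example targets a current Python 3. Automatic collection is switched off during the demo so that the only collection is the explicit gc.collect() call.

```python
import gc
import os
import tempfile


class LogReader:
    """Hypothetical stand-in for an object that holds an open file."""

    def __init__(self, path, make_cycle):
        self.file = open(path, "rb")
        if make_cycle:
            # The object refers to itself, so its reference count
            # never drops to zero on its own.
            self.myself = self


def is_fd_open(fd):
    """Return True if the descriptor number is still valid in this process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def demo(make_cycle):
    """Return (open after del, open after gc.collect()) for the reader's file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    gc_was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection out of the picture
    try:
        reader = LogReader(path, make_cycle)
        fd = reader.file.fileno()
        del reader  # drop the only outside reference
        open_after_del = is_fd_open(fd)
        gc.collect()  # the cycle collector frees the cycle, if any
        open_after_collect = is_fd_open(fd)
    finally:
        if gc_was_enabled:
            gc.enable()
        os.remove(path)
    return open_after_del, open_after_collect


if __name__ == "__main__":
    print("no cycle:", demo(False))
    print("cycle:   ", demo(True))
```

Without the cycle, the file is closed as soon as the last reference goes away. With the cycle, the descriptor stays open until the cycle collector runs, which would match the observation that the number of open files only drops occasionally. Closing the file explicitly (for example with try/finally) avoids depending on either mechanism.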

The long-term solution will be to get rid of the EditLog code; some work on that has been done in the storage refactoring project in Summer of Code 2008.

Plan


CategoryMoinMoinBugFixed

MoinMoin: MoinMoinBugs/StandaloneServerLeaksFiles (last edited 2008-08-30 14:46:00 by ThomasWaldmann)