Description

When accessing any page with a spider user agent, you get an HTTP/1.1 403 FORBIDDEN response AND the page content, instead of <h1>Forbidden</h1> or a similar error.

Example

FrontPage:

curl --head --user-agent google http://nirs.dyndns.org/main/FrontPage
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:05:56 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1


curl --user-agent google http://nirs.dyndns.org/main/FrontPage
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,follow">

Why is this forbidden?

HelpContents:

curl --head --user-agent google http://nirs.dyndns.org/main/HelpContents
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:10:00 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1


curl --user-agent google http://nirs.dyndns.org/main/HelpContents
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">

Why is this forbidden?

MissingPage:

curl --head --user-agent google http://nirs.dyndns.org/main/NoSuchPageHere!
HTTP/1.1 403 FORBIDDEN
Date: Thu, 09 Jun 2005 19:19:06 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Content-Type: text/plain; charset=ISO-8859-1


curl --user-agent google http://nirs.dyndns.org/main/NoSuchPageHere!
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">

It makes sense that this page is forbidden, but then why do we return content?

Details

Release 1.3.4

Workaround

Discussion

This is not a bug in moin. It is caused by curl --head sending a HEAD request, which we do not support, or only partially support. When trying with wget -s, which performs a normal GET request and includes the response headers in its output, we get correct results:

wget --user-agent=google -s http://nirs.dyndns.org/main/FrontPage

HTTP/1.1 200 OK
Date: Thu, 09 Jun 2005 19:51:51 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Connection: close
Content-Type: text/html;charset=utf-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,follow">


wget --user-agent=google -s http://nirs.dyndns.org/main/HelpContents

HTTP/1.1 200 OK
Date: Thu, 09 Jun 2005 19:54:40 GMT
Server: Apache/2.0.52 (Unix) DAV/2 mod_fastcgi/2.4.2 mod_python/3.1.3 Python/2.4
Connection: close
Content-Type: text/html;charset=utf-8

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="robots" content="index,nofollow">


wget --user-agent=google -s http://nirs.dyndns.org/main/NoSuchPageHere 
--22:55:52--  http://nirs.dyndns.org/main/NoSuchPageHere
           => `NoSuchPageHere'
Resolving nirs.dyndns.org... 192.115.134.51
Connecting to nirs.dyndns.org[192.115.134.51]:80... connected.
HTTP request sent, awaiting response... 404 NOTFOUND
22:55:53 ERROR 404: NOTFOUND.
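
The HEAD versus GET difference can also be checked without curl or wget. Here is a minimal Python sketch that sends both methods with the same user agent and collects the status of each. The function name status_by_method and its parameters are made up for this illustration; it relies only on the standard http.client module.

```python
import http.client


def status_by_method(host, path, user_agent="google", port=80, timeout=10):
    """Return {method: (status, reason)} for HEAD and GET on the same URL.

    Hypothetical helper for illustration; not part of moin.
    """
    results = {}
    for method in ("HEAD", "GET"):
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request(method, path, headers={"User-Agent": user_agent})
            response = conn.getresponse()
            response.read()  # drain any body so the connection can close cleanly
            results[method] = (response.status, response.reason)
        finally:
            conn.close()
    return results
```

Calling status_by_method("nirs.dyndns.org", "/main/FrontPage") would show whether the 403 appears only for HEAD while GET succeeds, which is what the curl and wget runs above suggest.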

Pain

Urf. After a couple of hours of pointless debugging why one monitoring framework gets 403 for moin, I stumbled upon this one. How about returning 501 Not Implemented for HEAD requests instead of 403 Forbidden? Thanks!
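
As a rough idea of what the 501 suggestion could look like, here is a small WSGI-style wrapper in Python. This is only a sketch under assumptions: moin 1.3.4 does not necessarily handle requests through WSGI, and the names reject_unsupported_methods and supported are invented for this example.

```python
def reject_unsupported_methods(app, supported=("GET", "POST")):
    """Wrap a WSGI app so unsupported methods get 501 instead of reaching it.

    Hypothetical helper for illustration; not moin's actual request handling.
    """
    def wrapper(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        if method not in supported:
            # A response to HEAD must not carry a body.
            body = b"" if method == "HEAD" else b"Not Implemented\n"
            start_response(
                "501 Not Implemented",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        return app(environ, start_response)

    return wrapper
```

With this in place, a monitoring tool that sends HEAD would see a clear 501 rather than a misleading 403, while GET and POST pass through to the wrapped application unchanged.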

Plan

Priority:
Assigned to:
Status:

CategoryMoinMoinNoBug

MoinMoin: MoinMoinBugs/403ErrorWithPageContent (last edited 2008-08-03 16:24:48 by 85-23-18-40-Korvensuora-TR1)