A HoneypotPage is a page that is intentionally left open for editing to attract WikiSpam bots, so they can be easily detected and banned. It's meant to be a spam prevention feature that's simple and automagical.

A HoneypotPage can be any page on the wiki. A good example of such a page is FrontPage, but many other pages can be used.

Most admins lock their wiki's FrontPage to prevent WikiSpam and WikiVandalism. However, by intentionally leaving the FrontPage open for editing to attract WikiSpam bots, it's possible to detect spam bots as they arrive and then ban their IPs from editing any other pages for a limited time.

Possibilities

Basic Use Case

Here's a basic use case for this feature, using FrontPage as an example:

  1. The admin makes the FrontPage editable by anyone

  2. When the FrontPage is edited by a standard user, a warning message is displayed:

    • /!\ Do not edit this page or you will be banned for 30 minutes and lose all of the changes you made to the entire wiki in the past 5 minutes

  3. A WikiSpam bot arrives on the FrontPage and tries to add spam links to it

  4. As soon as that happens, the spam bot's IP address is recorded and banned for 30 minutes

  5. The FrontPage edit is not saved

  6. The user gets a message: "This page is protected, you will not be able to edit any other page in this wiki for the next 30 minutes."

  7. The WikiEngine finds all pages edited in the last 5 minutes whose last edit came from the same IP as the spam bot, and reverts those edits.
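The ban-and-revert steps above can be sketched in Python as follows. This is a minimal illustration, not MoinMoin's actual API: the edit log structure, the ban table, and all function names here are hypothetical.

```python
import time

BAN_SECONDS = 30 * 60     # step 4: ban duration (30 minutes)
REVERT_WINDOW = 5 * 60    # step 7: revert edits from the last 5 minutes

banned_ips = {}           # ip -> ban expiry timestamp
edit_log = []             # hypothetical log of (timestamp, ip, page_name)

def on_honeypot_edit(ip, now=None):
    """Called when anyone tries to save the honeypot page.
    Bans the IP and returns the pages whose recent edits should be reverted."""
    now = now if now is not None else time.time()
    banned_ips[ip] = now + BAN_SECONDS
    cutoff = now - REVERT_WINDOW
    return [page for ts, edit_ip, page in edit_log
            if edit_ip == ip and ts >= cutoff]

def is_banned(ip, now=None):
    """True while the IP's ban has not yet expired."""
    now = now if now is not None else time.time()
    return banned_ips.get(ip, 0) > now
```

The honeypot page itself is never saved; the save handler simply calls on_honeypot_edit() and rejects the edit.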

Additional Cleanup

Trigger this from the SecurityPolicy save method:

  1. Make a diff between the unsaved HoneypotPage and the original revision

  2. Scan the diff for URLs, e.g. http://somespamsite.com/

  3. Search the entire wiki for occurrences of each URL

  4. Alert the admin about all pages containing the spam links

  5. Possibly revert all spammed pages, if the links were added recently and from the same IP
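The diff-scanning steps could be sketched like this. It's a minimal illustration under simplifying assumptions: the URL regex is approximate, and the pages-as-a-dictionary model and function names are hypothetical, not MoinMoin code.

```python
import re

# Rough pattern for absolute http(s) URLs; good enough for spam detection.
URL_RE = re.compile(r'https?://[^\s\'"<>]+')

def urls_added(old_text, new_text):
    """Step 2: return URLs present in the attempted save but not in the original."""
    return sorted(set(URL_RE.findall(new_text)) - set(URL_RE.findall(old_text)))

def pages_containing(pages, urls):
    """Step 3: search all pages (name -> text) for any of the spam URLs.
    Returns {page_name: [urls found]} for the admin alert in step 4."""
    hits = {}
    for name, text in pages.items():
        found = [u for u in urls if u in text]
        if found:
            hits[name] = found
    return hits
```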

This may be a different feature, useful any time you revert a spam edit.

Advantages

Problems

Automatic reverting is risky: several users may share one IP (e.g. behind a proxy), and a careless user might trip the honeypot in good faith. For these reasons, only very recent edits (say, in the last 5 minutes) from the same IP should be reverted automatically.

Alternatively, the WikiEngine could be more aggressive and scan all edits in, say, the last 30 minutes, provided it limits reverts to only those pages where a spam link was added.

One way to provide a safety net is to show the careless user a list of the pages that were reverted. A spam bot will disregard the list (or try to spam every page on it, which will be rejected). A careless user acting in good faith can try to un-revert the pages, even if some of the changes belong to someone else (other people connected through the same proxy server). MoinMoin has separate permissions for revert and write, so it's entirely possible for a user banned from editing to revert pages. On WikiEngines with no separate revert right, the careless user will have to wait no more than 5-10 minutes until the ban expires, which is a pretty reasonable wait.

If this method becomes popular, spammers may rewrite their spam bots to skip the FrontPage entirely and post spam only to other pages, rendering this feature useless. However, nothing stops the admin from setting up other honeypot pages on their wiki as a workaround.

(should we rename this feature simply HoneypotPage ?)

See Also

A commonly suggested antispam method is URL blacklisting (see BlackList, BadContent).

Also see AntiSpamFeatures.

The original WikiFeatures proposal can be found here. There may be discussions there you want to read. - GoofRider

Contributors

Discussions

Does not work for this wiki

This concept does not work for 95% of the spam that goes into this wiki, simply because that spam is not submitted by bots. The antispam/BadContent system seems to defeat most bot attacks (or the bots don't attack MoinMoin at all).

Safer way to detect behavior

To make it work, we need a safe way to detect a bot edit. One option is a page that can only be reached by "clicking" a hidden link, for example an HTML link like this:

<a href="/SomePage?ban_flag=1" style="display: none !important;">HoneyPot</a>

Or give the link class="hidden" instead of the inline style, with a CSS rule like this: .hidden { display: none; }

Even this kind of hiding might be ignored by some user agents, for example a screen reader, which would expose the trap link to legitimate users.

The ban_flag should be dynamic so bots can't detect it easily.

Only if we have a safe way to create such a link can we say that we can detect behavior, and then reject edits from bots or scan them.
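One way to make the ban_flag dynamic is to derive it from a per-wiki secret and the current date, so the trap URL changes daily and can't be hardcoded into a bot. A hypothetical sketch (the secret, the token length, and the helper names are all assumptions, not part of any wiki engine):

```python
import hashlib
import hmac
import time

SECRET = b"per-wiki-secret-key"   # hypothetical; would come from wiki config

def ban_flag(day=None):
    """Daily-rotating token: an HMAC of the day number under the wiki secret."""
    day = day if day is not None else int(time.time() // 86400)
    return hmac.new(SECRET, str(day).encode(), hashlib.sha256).hexdigest()[:16]

def hidden_link(page="SomePage"):
    """Render the hidden trap link with today's token."""
    return '<a href="/%s?ban_flag=%s" class="hidden">HoneyPot</a>' % (
        page, ban_flag())

def is_trap_hit(flag, day=None):
    """Check a submitted ban_flag; accept yesterday's token too,
    so a page cached just before midnight doesn't misfire."""
    day = day if day is not None else int(time.time() // 86400)
    return flag in (ban_flag(day), ban_flag(day - 1))
```

A bot that scraped the link yesterday and replays it today is still caught, while a bot author cannot precompute future tokens without the secret.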


CategorySpam

MoinMoin: HoneypotPage (last edited 2007-10-29 19:10:02 by localhost)