Fighting spam in MediaWiki
DESCRIPTION
SMWCon Fall 2012 conference tutorial on fighting spam. The video is available here: http://www.youtube.com/watch?v=rhC1DFeblik&list=PLwtfwT1GnUQRaLki-YcF-_n8ndayi--W5&index=3&feature=plpp_video
TRANSCRIPT
Fighting spam in MediaWiki
(C) WikiVote! 2012
Yury Katkov
How effective is spam?
Effective enough to be profitable
• 5.6% click-through rate for porn spam
• 0.02% click-through rate for pharma spam
• 0.0075% click-through rate for Rolex watches spam
Types of wiki-spam
By user
• Anonymous spam
• Spam from registered users
By page action
• Spamming on a user page
• Spamming by creating a new page
• Spamming on existing pages
By the kind of spam itself
• Posting links to websites
• Posting text with non-spam links, for example links to URL-shortener services
• Posting text without links
The best anti-spam techniques
1. Active community
2. Bulk-editing cyborgs
3. Blacklists
4. Honeypots
5. Captcha
6. Reasonable delays and confirmations
7. Behaviour analysis (in the worst cases)
Active community
When people need your wiki, they will get rid of the spam
The best anti-spam techniques: Active community, cyborgs
If you have a healthy community, the spam will be deleted by the participants themselves.
• Search for heroes
• Turn the superheroes into cyborgs
– AutoWikiBrowser
– Secretaribot
– User:ClueBot_NG – just amazing
– Nuke
– Delete Batch
• Allow and encourage the use of bots by your heroes:
– http://en.wikipedia.org/wiki/User:Emijrp/Anti-vandalism_bot_census
Honeypots
The best anti-spam techniques: Honeypots
Extension:SimpleAntiSpam
Principle: Adding hidden fields that only bots will fill in
Advantages: Plug-and-play
Disadvantages: Works only for the dumbest bots
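The hidden-field principle above can be sketched in a few lines of PHP. This is an illustration only, not the code of Extension:SimpleAntiSpam: the field name `contact_url` and both function names are hypothetical, and a real extension would hook into the MediaWiki edit form instead of handling raw form data.

```php
<?php
// Hypothetical name of the trap field; chosen to look tempting to a bot.
const HONEYPOT_FIELD = 'contact_url';

/**
 * Build the HTML for the trap field. It is hidden with CSS, so a person
 * using a normal browser never sees it and leaves it empty.
 */
function renderHoneypotField(): string {
    return '<div style="display:none;">'
        . '<input type="text" name="' . HONEYPOT_FIELD . '" value="" />'
        . '</div>';
}

/**
 * Return true when the submitted form data has a non-empty trap field,
 * which suggests a naive bot that fills in every input it finds.
 */
function looksLikeBot( array $post ): bool {
    return isset( $post[HONEYPOT_FIELD] )
        && trim( (string)$post[HONEYPOT_FIELD] ) !== '';
}
```

A submission is rejected when `looksLikeBot( $_POST )` is true. As the slide says, this only stops bots that fill in fields blindly; a bot that skips hidden inputs passes straight through.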
Blacklisting
The best anti-spam techniques: Blacklisting
What can be blacklisted:
• Spam text patterns
– Extension:SpamBlacklist
• Spammers' IP addresses, by IP or by DNS:
– Extension:SpamBlacklist
– DNSBL
// Check editors' IP addresses against DNS blacklists
$wgEnableDnsBlacklist = true;
$wgDnsBlacklistUrls = array( 'xbl.spamhaus.org', 'opm.tornevall.org' );
// Blacklist sources for Extension:SpamBlacklist
$wgSpamBlacklistFiles = array(
    "[[m:Spam blacklist]]",
    "http://en.wikipedia.org/wiki/MediaWiki:Spam-blacklist"
);
Death to URL shorteners!
CAPTCHA
The best anti-spam techniques: CAPTCHA
Many extensions for CAPTCHA exist, but you won’t go wrong if you choose ConfirmEdit:
• Used by all Wikimedia sites
• Has several types of CAPTCHA included
• Easily configurable and flexible
The best anti-spam techniques: Which CAPTCHA to choose?
Questy captcha
Advantages
• Great against autonomous bots that target MediaWiki (it asks meaningful questions)
• It’s as smart as you
• Can be adapted to your wiki!
Disadvantages
• No good against guided spambots
Usage
• For small and medium-sized wikis
ReCaptcha
Advantages
• Unlimited set of captchas
• Works most of the time
Disadvantages
• May be tricky for users
Usage
• For big public wikis, or if you know that someone is hunting you
You may also like: Asirra
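As a sketch of how a Questy captcha can be adapted to a wiki, the fragment below loads ConfirmEdit with its QuestyCaptcha module and defines two questions. It uses the require_once loading style seen elsewhere in these slides; newer MediaWiki releases load extensions differently, so check the documentation for your version. The questions and answers are made-up examples.

```php
// LocalSettings.php fragment (assumes ConfirmEdit is in $IP/extensions)
require_once "$IP/extensions/ConfirmEdit/ConfirmEdit.php";
require_once "$IP/extensions/ConfirmEdit/QuestyCaptcha.php";
$wgCaptchaClass = 'QuestyCaptcha';

// Example questions only: pick ones that your own community can answer
// easily but that a generic bot has never seen.
$wgCaptchaQuestions[] = array(
    'question' => 'Which software runs this wiki?',
    'answer' => 'MediaWiki'
);
$wgCaptchaQuestions[] = array(
    'question' => 'What is the first word of the title of this tutorial?',
    'answer' => 'Fighting'
);
```

Changing the questions from time to time helps, because a guided spambot only needs a human to answer each question once.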
The best anti-spam techniques: What should be CAPTCHA-protected?
CAPTCHA is usually needed when:
• An anonymous user is trying to register
• An anonymous user is adding a link
• There are too many attempts to log in
– BTW, it’s good to turn on $wgPasswordAttemptThrottle
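The three cases above map roughly onto ConfirmEdit's trigger settings. The fragment below is a minimal sketch, assuming ConfirmEdit is already loaded; the throttle numbers are example values, not a recommendation from the talk.

```php
// LocalSettings.php fragment (assumes ConfirmEdit is already loaded)
$wgCaptchaTriggers['createaccount'] = true; // registering a new account
$wgCaptchaTriggers['addurl']        = true; // an edit that adds a new external link
$wgCaptchaTriggers['badlogin']      = true; // after repeated failed logins
$wgCaptchaTriggers['edit']          = false; // do not challenge every edit

// Let logged-in users skip the CAPTCHA, so that mostly anonymous users see it.
$wgGroupPermissions['user']['skipcaptcha'] = true;

// Core MediaWiki login throttle: at most 5 failed attempts per 300 seconds.
$wgPasswordAttemptThrottle = array( 'count' => 5, 'seconds' => 300 );
```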
The best anti-spam techniques: Express yourself in a CAPTCHA
Tricks and tips
The best anti-spam techniques: Configuration tricks (permissions)
Depending on how desperate you feel, you can do one of the following:
• Force people to register before they are allowed to edit:
$wgGroupPermissions['*']['edit'] = false;
$wgShowIPinHeader = false;
• Add a timeout interval after signing up:
$wgAutoConfirmAge = 3600*24; // one day, in seconds
$wgGroupPermissions['*']['createpage'] = false;
$wgGroupPermissions['user']['createpage'] = false;
$wgGroupPermissions['autoconfirmed']['createpage'] = true;
• Require e-mail confirmation to edit:
$wgEmailConfirmToEdit = true;
The best anti-spam techniques: Configuration tricks (permissions)
Depending on how desperate you feel, you can do one of the following:
• Require the approval of new accounts by a bureaucrat:
require_once( "$IP/extensions/ConfirmAccount/SpecialConfirmAccount.php" );
• Turn off the registration for everyone:
$wgGroupPermissions['*']['createaccount'] = false;
• Turn off the server
The best anti-spam techniques: Email configuration
The wiki allows people to send e-mails to each other. The following is always useful:
• Require email authentication for using any email function (except password reminder):
$wgEnableEmail = true;
$wgEmailAuthentication = true;
Behavior analysis
The best anti-spam techniques: Behavior analysis
In the worst cases you can install AbuseFilter:
• Define heuristics of suspicious behavior
• Can also handle vandalism
• Use with great care!
• Tip: you can copy the bad-behavior filters from Wikipedia: http://en.wikipedia.org/wiki/Special:AbuseFilter
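One way to follow the "use with great care" advice is to let only administrators change filters while everyone can still read the filter list and log. The fragment below is a sketch under that assumption, written in the require_once loading style used in these slides; the choice of groups is an example, and the extension also needs its database tables created with MediaWiki's usual update script.

```php
// LocalSettings.php fragment (assumes AbuseFilter is in $IP/extensions)
require_once "$IP/extensions/AbuseFilter/AbuseFilter.php";

// Only administrators may create or change filters.
$wgGroupPermissions['sysop']['abusefilter-modify'] = true;

// Everyone may view the filters and the log of filter hits.
$wgGroupPermissions['*']['abusefilter-view'] = true;
$wgGroupPermissions['*']['abusefilter-log'] = true;
```

A new filter is safer when it starts without a blocking action, so that it only logs matches and its false positives can be reviewed before it begins to disallow edits.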