Codeable info

WordPress Plugins which help reduce the damage done by RSS Blog Scrapers


The more popular your blog becomes, the more likely it is that your content will be stolen by blog scrapers. This can be incredibly frustrating, particularly when you have spent a lot of time researching and writing your articles. Any blog which has over a thousand subscribers is likely having its content scraped; it’s just something which popular WordPress sites need to deal with.

You may have read some articles on how you can stop blog scrapers. In reality, the only way to stop people from stealing content from your blog is to not offer a full feed (i.e. change from full text to summary in the wp-admin/options-reading.php page). Many large websites do just that, but for most people that isn’t an option as RSS subscribers would drop considerably. After all, the reason most people subscribe to a blog’s RSS feed is so they can read the website through their newsreader.

So if you do offer a full feed (which I believe you should), you will inevitably have your content scraped at some point. I can speak from experience when I say that there is little you can do when you find a website which is stealing your content. You can report the website to any company which they are advertising through (e.g. Google AdSense) and you can contact the website host too. However, you will rarely get a response, and even if you do and the site is removed, you will find that several more sites are scraping your content, and the whole process starts again.

Put simply, it’s better to spend your time working on your website rather than chasing these content thieves 24/7 (though admittedly, if I find someone scraping my blog I usually complain to Google AdSense using their form).

Get your own back!

Do not despair! Content scrapers are, by their very nature, incredibly lazy. So whilst there might be no great way of stopping RSS scrapers, there are lots of ways to get your own back.

Blog scrapers automatically pull data from your RSS feed and rarely change any of the content, which means that it’s in your best interests to link back to your own blog heavily within your posts. It’s worth adding a copyright notice too. You would think that these measures would discourage them from scraping your content in the first place, but in practice you’ll find that most scrapers don’t even bother to check such things.

Unfortunately, you can’t stop them from hotlinking your images. On most websites you can use .htaccess rules to stop external sites from hotlinking your article images. However, doing this on a blog would stop many feed readers from displaying your images too, so it isn’t a practical option.

I will now show you some great WordPress plugins which will help you reduce the damage done by blog scrapers.

WordPress Plugins to help you fight Blog Scraping

Copyright Plugins

Adding a copyright notice to your RSS content is a quick and easy way to inform the world that the content they are reading is from your blog. It’s also useful for linking back to the original article and/or your website.

A number of copyright plugins are available for WordPress. For WP Mods I’m currently using Yoast’s RSS Footer plugin to add a link back to the article and the website at the top of every post, and Simple Feed Copyright for a standard copyright notice at the end of the post.
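To make the idea concrete, here is a minimal sketch of what a feed copyright plugin does to each item before it goes out in the feed. It is written in Python purely for illustration (real WordPress plugins are PHP and hook into WordPress itself), and the function name, parameters and wording of the notice are my own assumptions, not taken from RSS Footer or Simple Feed Copyright.

```python
from html import escape


def add_feed_footer(content_html, post_title, post_url,
                    site_name, site_url, position="bottom"):
    """Return a feed item's HTML with an attribution notice attached.

    Hypothetical helper: the notice text and arguments are illustrative only.
    """
    notice = (
        '<p>This article, <a href="{post_url}">{post_title}</a>, '
        'was first published on <a href="{site_url}">{site_name}</a>.</p>'
    ).format(
        # Escape everything so titles containing & or quotes stay valid HTML.
        post_url=escape(post_url, quote=True),
        post_title=escape(post_title),
        site_url=escape(site_url, quote=True),
        site_name=escape(site_name),
    )
    if position == "top":
        return notice + "\n" + content_html
    return content_html + "\n" + notice


if __name__ == "__main__":
    print(add_feed_footer(
        "<p>Body of the post.</p>",
        "Tips & Tricks",
        "https://example.com/tips",
        "Example Blog",
        "https://example.com/",
    ))
```

Because scrapers rarely alter the feed content, a notice like this usually ends up published on the scraper's site along with the article, links included.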

Anti Scraper Bot Plugins

There are plugins available which try to stop known agents which scrape content. These do work well, though since bots and unfriendly agents are constantly changing the IP addresses they use, they are not foolproof. However, it is another step which you can take to reduce RSS scraping.
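The basic check behind this kind of plugin can be sketched roughly as follows. This is a simplified Python illustration under my own assumptions, not the code of any particular plugin: the agent names are made up, and the IP range is one reserved for documentation examples.

```python
import ipaddress

# Hypothetical blocklists; a real plugin would ship and update its own.
BLOCKED_AGENT_SUBSTRINGS = ("examplescraper", "feedthief")
BLOCKED_NETWORKS = (ipaddress.ip_network("203.0.113.0/24"),)


def is_blocked(user_agent, remote_ip):
    """Return True if the request matches a known bad agent or IP range."""
    agent = user_agent.lower()
    if any(name in agent for name in BLOCKED_AGENT_SUBSTRINGS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(is_blocked("ExampleScraper/1.0", "198.51.100.7"))
    print(is_blocked("Mozilla/5.0", "198.51.100.7"))
```

The sketch also shows the weakness mentioned above: a scraper only has to change its user agent string or move to a new IP address to slip past both lists.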

More Information in your RSS Feed

It’s possible to add more information to your RSS feed. For example, the Yet Another Related Posts Plugin adds a related posts list to your website and your RSS feed. This encourages readers to read more articles on your site and adds more links back to your site within your RSS feed.

Many social media plugins also let you add voting buttons to your feed, though I prefer to use the FeedFlare service which FeedBurner offers.

You can also expand on this even further. Why not replace a text link copyright notice with a banner image linking back to your blog, to increase the exposure you get on content-stealing websites?

Taking it Further

If copyright infringement and plagiarism are becoming a really big problem with one or two sites in particular, you may wish to take things further and go down a legal route. You can sometimes catch scrapers as they send a pingback to your blog. This doesn’t always happen, though, so it’s worth checking your site with a plagiarism checker such as CopyGator or Copyscape.
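If you already have the text of a suspect page and want a rough idea of how much of your article it contains, a comparison along these lines can help. This is only a sketch of one common approach (comparing overlapping runs of five words); I am not suggesting it is how CopyGator or Copyscape work, and the function names and the run length are my own choices.

```python
import re


def word_shingles(text, size=5):
    """Return the set of overlapping runs of `size` words found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Return the share of the original's word runs that appear in the suspect."""
    original_shingles = word_shingles(original, size)
    if not original_shingles:
        return 0.0  # original is too short to compare
    shared = original_shingles & word_shingles(suspect, size)
    return len(shared) / len(original_shingles)


if __name__ == "__main__":
    article = "Content scrapers are by their very nature incredibly lazy people"
    page = "Welcome to my site. " + article + " Read more here."
    print(copied_fraction(article, page))
```

A value close to 1 suggests the page reproduces your article more or less in full, while a low value may just mean the page quotes a sentence or two.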

Jonathan Bailey, who used to work with me closely on one of my other blogs, knows a lot about copyright issues and discusses them regularly on his blog Plagiarism Today. I encourage you to search through his website to find out more about what you can do to stop content theft. In particular you should find the Cease and Desist letters helpful :)


Comments (5)

Comment by Link To Your Own Articles The Easy Way says:

[…] Blog scraping is inevitable once your site reaches a certain level so it’s good to link to older articles within your posts. This will increase incoming links to your site and make it more obvious to readers of the spam site that the content was stolen from your site. […]

Comment by Bruce Simmons (BruSimm) says:

Great insight. Me, I use specific catch phrases and google for them intermittently. Helps catch the sources that tend to block pingbacks! Thanks again for the info!

Comment by Kevin Muldoon says:

No need to thank me for the link :)

This article was published a month or so after WordPress 2.9.2, so WP 3.0 compatibility wasn’t an issue.

I’ve found that most major plugins have been upgraded already to work with 3.0. Most unsupported plugins have not been updated – some work, some don’t – therefore you need to test each one to make sure it works :)

Comment by Jonathan Bailey says:

Thanks for the link to the site. You’ve got some great plugins in this list, including a few I didn’t know. However, though some are very tempting, I’m seeing that there isn’t a lot of info about WordPress 3.0 compatibility and that has me nervous. Still, some awesome ideas here!

Thanks for the post and the link!

Comment by The Best Ad Management Plugins for WordPress - WP Mods says:

[…] A good option for those who want to add banners or text links to their RSS feed. Could also be used to add a copyright notice to your feed. […]
