How to Detect and Stop Referrer Spam

Do you know how to detect and stop referrer spam on your website? Do you even know what referrer spam is?

Whatever you call it (referrer spam, referral spam, crawler spam, Google Analytics spam, or ghost spam), it is fake and it is a nuisance!

It is time-consuming to deal with, consumes bandwidth, and ruins your website analytics. And what’s worse is that many website owners don’t realize they’ve been affected.

What is Referrer Spam?

Referrer spam is a sneaky tactic spammers use to increase their website traffic.

Spammers often impersonate valid domains in order to increase lead generation and/or improve their own site rankings.

They send fake referrals from these domains, which show up in your visitor logs and in analytics tools such as Google Analytics and Jetpack.

These links get indexed by Google and clicked on by unsuspecting website owners, so the spammers get exactly what they are after: more website visitors and higher rankings.

Technically, the fake Google Analytics hits are sent through Google’s Measurement Protocol, which lets developers send data directly to Google Analytics servers.

However, spammers have found ways to exploit this protocol to their advantage. And it’s getting worse.

Google is reportedly working on this, but in the meantime, website owners need to be proactive in order to ensure the health of their analytics.

Referrer spam can consume your bandwidth and inflate your analytics with meaningless data. It can also distort your site’s bounce rate: the spam bots never truly visit a page, so each fake session is recorded as a 100% bounce.
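To get a feel for how quickly fake sessions skew the numbers, here is a small worked example in Python. The session counts and the 40% "real" bounce rate are made up for illustration; the only assumption carried over from above is that every spam session counts as a bounce.

```python
def blended_bounce_rate(real_sessions, real_bounce_rate, spam_sessions):
    """Bounce rate a report would show when every spam session is a bounce."""
    total = real_sessions + spam_sessions
    if total == 0:
        raise ValueError("no sessions to measure")
    # Real visitors bounce at their normal rate; spam sessions bounce 100% of the time
    bounced = real_sessions * real_bounce_rate + spam_sessions
    return bounced / total


# Hypothetical month: 1,000 real sessions at a 40% bounce rate, plus 250 spam sessions
rate = blended_bounce_rate(1000, 0.40, 250)
print(f"Reported bounce rate: {rate:.0%}")  # Reported bounce rate: 52%
```

In this made-up case, spam that accounts for a fifth of all sessions pushes the reported bounce rate from 40% to 52%.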

Within the past year the incidence of referrer spam has skyrocketed, and yet I was unaware that our website was being targeted, too.

Depending on the type of spam (crawler or ghost) you can block it in .htaccess or with a Google Analytics filter.

A filter is a way to limit or modify raw data included in a view. For example, you may have a filter set up to exclude your IP from visits to your own website. You can set up other filters as well.

There are plenty of articles online that address setting up filters for crawler and referrer spam. I’ve listed these helpful resources at the end of this post.

The rest of this article will explain how I detected referrer spam on my website and what I did about it.

Symptoms of Referrer Spam

Four red flags over the past year warranted investigation on our website. They are listed here in the order I discovered them:

  1. Crawler Spam: Suspect referral URLs in Jetpack Site Stats
  2. Slow Website Performance despite caching and CDN
  3. Referrer Spam: Suspect referral URLs in System Logs
  4. Referrer Spam: Suspect referral URLs in Google Analytics

What started out as a simple investigation of suspect referral traffic from Semalt and buttons-for-websites ended up with a much-needed website audit and investigation of the website back-end.

I’ll share what I discovered here in the hope that it gives you a starting point for understanding referrer spam, how to detect it, and how to stop it.

How to Detect Crawler Spam in Jetpack

Early in 2015 I started to see traffic from Semalt and buttons-for-websites. At first I ignored it, but then I realized it didn’t seem to be a valid referral source. I did a quick Google search and realized this was bogus traffic.

Since these bots were actually visiting my site, this wasn’t ghost spam, so I was able to block them in .htaccess using the code shown below. You can even redirect the spammers back to themselves, as I did with Semalt.

To detect crawler referrers in Jetpack:

Click on Jetpack > Site Stats > Referrals.

Any of the crawler spam bots can be stopped through the .htaccess file by adding this code:

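# Enable the rewrite engine (harmless if it is already enabled earlier in the file)
RewriteEngine On

# Block visits referred by buttons-for-website: respond with 403 Forbidden
RewriteCond %{HTTP_REFERER} buttons-for-website\.com [NC]
RewriteRule .* - [F]

# Redirect visits referred by semalt.com (http or https, any subdomain) back to Semalt
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC]
RewriteRule .* http://www.semalt.com [R=301,L]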
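
Because a typo in .htaccess can take a site down, it may help to try the referrer patterns against a few sample URLs before touching the server. The sketch below does that in Python; it assumes Python's `re` module behaves like Apache's regex engine for patterns this simple, and the Semalt pattern is written here to accept both http and https. The sample URLs are invented.

```python
import re

# Patterns modeled on the .htaccess conditions; IGNORECASE plays the role of [NC]
BLOCKED_REFERRERS = [
    re.compile(r"buttons-for-website\.com", re.IGNORECASE),
    re.compile(r"^https?://([^.]+\.)*semalt\.com", re.IGNORECASE),
]


def is_blocked(referrer):
    """Return True if any blocked-referrer pattern matches the Referer value."""
    return any(pattern.search(referrer) for pattern in BLOCKED_REFERRERS)


# Invented sample Referer values
samples = [
    "http://semalt.com/crawler.php",
    "http://abc.semalt.com/page",
    "http://buttons-for-website.com",
    "https://www.google.com/search?q=semalt",
]
for url in samples:
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

The last sample is the useful one: a legitimate search that merely mentions the spammer's name should stay allowed, which is why the Semalt pattern is anchored to the start of the URL.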

S-L-O-W Website Performance

After a recent website redesign, I spent a substantial amount of time working to optimize and improve our site performance.

As mentioned in a previous post about optimizing WordPress, it’s a good idea to use a variety of tools to help you tweak your site.

It’s also wise to test your site at various times throughout the day to get an average of your page load time.

While testing site performance, I was bothered by slow load times throughout the day, despite the fact that we were using a caching plugin and had put our site on CloudFlare months earlier.

The sluggish load times prompted me to log into cPanel and check the system log and latest visitors log.

And what I found puzzled me even more.

There was a referring URL which was repeated continuously in the log (every few minutes), which indicated to me that I was getting an influx of bogus traffic.

Not only that, the frequency of visits was consuming bandwidth and impacting my website.

At first I was convinced that my site was hacked. I continued to investigate and later learned that my site was a victim of referrer spam.

I contacted my hosting company and they suggested modifying .htaccess to block the referrer spam. But that didn’t work one bit, so I kept on researching.

The problem is that this type of spam is more correctly known as “ghost spam” or “ghost referral spam.” This type of referral traffic never hits your website (it’s a “ghost,” right?), so it’s pointless to try to block it in .htaccess.

I’ve learned that spammers will impersonate valid domains and send out lots of fake referral traffic from them. In fact, the referrer spam in my logs appeared to be coming from a valid domain.

How to Detect Referrer Spam in System Logs

Most web hosting companies keep system and visitor logs on the server.

Log into cPanel or FTP into your server and look for those logs.

Open them up and see who is visiting your site:

cPanel Latest Visitors Log

Here is where I found the major culprit:

Latest Visitors Log

The spammers had apparently targeted my domain with fake referrer traffic.
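
If the log is too long to scan by eye, a short script can tally the referrers for you. This is a minimal sketch, assuming the log uses Apache's combined log format (where the referrer is the second-to-last quoted field); the sample lines, IP addresses, and domains below are invented.

```python
import re
from collections import Counter

# Combined log format ends with: "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')


def count_referrers(lines):
    """Tally referrer values, skipping lines with no referrer ("-" or empty)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match and match.group("referrer") not in ("", "-"):
            counts[match.group("referrer")] += 1
    return counts


# Invented sample lines; in practice, read them from the downloaded log file
sample = [
    '203.0.113.5 - - [10/Oct/2015:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 "http://spam-referrer.example/" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2015:13:58:02 -0700] "GET / HTTP/1.1" 200 2326 "http://spam-referrer.example/" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2015:14:01:11 -0700] "GET /about/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [10/Oct/2015:14:03:45 -0700] "GET /blog/ HTTP/1.1" 200 8044 "https://www.google.com/" "Mozilla/5.0"',
]

for referrer, hits in count_referrers(sample).most_common():
    print(hits, referrer)
```

A referrer you don't recognize that sits at the top of this list with evenly spaced timestamps is worth a closer look.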

How to Detect Referrer Spam in Google Analytics

Log into your Google Analytics account and select your property.

Click on Acquisition > All Traffic > Referrals:

Here I found more fake referrers. Notice how the average session duration is zero and the bounce rate is 100%. That is a HUGE clue!

I went through each of the URLs in the source column, and continued to make a list of all the suspect referrers.

I didn’t want to link out to them (why should they get my website visit?) so I did a Google search and sure enough, was able to verify all of the fake ones.

I ended up with nine fake referrers; here is a partial report showing three of them:

Google Analytics Ghost Spam
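
The two clues mentioned above (zero average session duration and a 100% bounce rate) are easy to check in bulk if you export the referral report. The sketch below is only an illustration: the row layout, field names, and domains are hypothetical and would need to be adapted to whatever your export looks like.

```python
def suspect_referrers(rows, min_sessions=5):
    """Return sources with zero session duration and a 100% bounce rate.

    min_sessions skips sources with too few sessions to judge fairly.
    """
    return [
        row["source"]
        for row in rows
        if row["sessions"] >= min_sessions
        and row["avg_session_duration"] == 0
        and row["bounce_rate"] >= 1.0
    ]


# Hypothetical export: durations in seconds, bounce rate as a fraction
rows = [
    {"source": "spam-one.example", "sessions": 42, "avg_session_duration": 0.0, "bounce_rate": 1.0},
    {"source": "google.com", "sessions": 310, "avg_session_duration": 95.4, "bounce_rate": 0.48},
    {"source": "spam-two.example", "sessions": 17, "avg_session_duration": 0.0, "bounce_rate": 1.0},
    {"source": "friend-blog.example", "sessions": 3, "avg_session_duration": 0.0, "bounce_rate": 1.0},
]

print(suspect_referrers(rows))  # ['spam-one.example', 'spam-two.example']
```

Treat the result as a list of candidates to verify with a search, not as proof; a real visitor who leaves immediately produces the same numbers.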

How to Stop Referrer Spam

  1. As you saw earlier, you can stop crawler spam like Semalt and buttons-for-websites using directives in your .htaccess file. But remember to back up your .htaccess file first and be careful when editing it. One simple typo can crash your website.
  2. To stop ghost spam, it’s recommended that you set up filters in Google Analytics. It was fairly quick for me to set the filters up for the offending referrers. A simple Google search provided lots of resources for blocking specific types of referrer spam.
  3. However, a better approach is to set up one custom filter that includes only hits from real hostnames. Otherwise, you’ll spend all your time trying to identify and stop referrer spam on a case-by-case basis.
  4. If you are behind Sucuri’s firewall, you are in good shape. They have systems in place to stop the traffic before it even gets to your website.
  5. Additionally, there are a few WordPress plugins that help deal with referrer spam. Search the plugin repository and choose one that is well-supported and updated for the most current version of WordPress.
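
The idea behind the valid-hostname filter in step 3 can be sketched in a few lines of Python. This is not how Google Analytics implements filters; it only illustrates the logic of keeping hits whose hostname is one you actually serve pages from. The hostnames, the hit layout, and the `keep_hit` helper are all hypothetical.

```python
import re

# Hypothetical: replace example.com with the hostnames that legitimately serve your pages
VALID_HOSTNAME = re.compile(r"^(www\.|shop\.)?example\.com$", re.IGNORECASE)


def keep_hit(hit):
    """Keep a hit only when its hostname is on the list of real hostnames."""
    return bool(VALID_HOSTNAME.match(hit.get("hostname", "")))


# Invented hits: ghost spam often arrives with a missing or unrelated hostname
hits = [
    {"hostname": "www.example.com", "referrer": "google.com"},
    {"hostname": "(not set)", "referrer": "spam-one.example"},
    {"hostname": "spam-two.example", "referrer": "spam-two.example"},
    {"hostname": "example.com", "referrer": "friend-blog.example"},
]

kept = [hit for hit in hits if keep_hit(hit)]
print([hit["hostname"] for hit in kept])  # ['www.example.com', 'example.com']
```

The appeal of this design is that the list of your own hostnames is short and rarely changes, while the list of spam referrers grows constantly. It will not catch spam that happens to send a valid hostname, so an occasional referrer-specific filter may still be needed.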

In Summary

It’s important to keep an eye on your analytics tools to detect referrer spam: Jetpack, Google Analytics, system logs, and whatever other tools you are using.

If something doesn’t look right to you, explore it further and fix it before it’s too late.

Acquaint yourself with Google Analytics and Filters and make sure you understand how they work.

It’s important to know that filters can be destructive; that is, your data is permanently affected once a filter is applied to a view.

That’s why it’s wise to keep an unfiltered view alongside your filtered Master View, so you always have a web property view that contains all of your raw historical data.

Below are some excellent articles with step-by-step instructions that helped me work through the process of dealing with referrer spam.

Resources:

Stop Ghost Spam in Google Analytics with One Filter

Definitive Guide to Removing Google Analytics Spam

Keep Calm and Stop Google Analytics Spam

Google Analytics Help – About View Filters

Google Analytics Help – Create and Manage View Filters

Google Analytics Help – Example Account Structures

I hope this article was helpful to you. Do you have any other tips for detecting and stopping referrer spam?

Image Credit: Fotolia
