Removing Spam from Google Analytics

December 30, 2016 | Tools and Tips

Chances are if you’re running a website, you’re using Google Analytics to study your traffic. (Here’s a quick guide to getting started.) However, one of the main hindrances to understanding your performance is the prevalence of spam clogging up the results. If removing spam once and for all is top of your to-do list, follow these easy tips.

Types of spam

There are three main ways spam can infest your Google Analytics results: referrer, crawler, and language spam.

Referrer spam is where a website receives multiple hits that appear to come from another website. The purpose of this kind of spamming is to generate backlinks to the referrer site, so it targets blogs that publish their referrers. More inbound links result in higher search engine rankings for the referrer site.

Because the referrer doesn’t always visit the affected site, this method is also known as “ghost spam.” Identify referrer spam in your visitor data by looking for multiple hits that log zero time on your site and/or have a 100% bounce rate.
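As a rough illustration, the check described above can be sketched in Python against referral data you have exported yourself. The row layout, the field names (source, sessions, avg_session_duration, bounce_rate) and the min_sessions threshold are all assumptions made for this example; they are not a Google Analytics export format.

```python
def flag_suspect_referrers(rows, min_sessions=2):
    """Return referrer sources with multiple hits that log zero time
    on site and/or a 100% bounce rate."""
    suspects = []
    for row in rows:
        repeated = row["sessions"] >= min_sessions
        zero_time = row["avg_session_duration"] == 0
        full_bounce = row["bounce_rate"] >= 100.0
        if repeated and (zero_time or full_bounce):
            suspects.append(row["source"])
    return suspects


# Made-up sample rows, for illustration only
sample_rows = [
    {"source": "news.example.org", "sessions": 12,
     "avg_session_duration": 95.0, "bounce_rate": 41.7},
    {"source": "spam-referrer.example", "sessions": 30,
     "avg_session_duration": 0, "bounce_rate": 100.0},
    {"source": "friend-blog.example", "sessions": 1,
     "avg_session_duration": 0, "bounce_rate": 100.0},
]

print(flag_suspect_referrers(sample_rows))
```

A single zero-second visit is left alone here because a genuine visitor can bounce too; treat anything the sketch flags as a candidate to inspect, not as proof of spam.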

Crawler spam is a less sophisticated version of referrer spam. It uses a web crawler to log time on your site. The purpose is to generate hits for the site the crawler links to. It relies on human error to get those hits, hoping blog owners click to see who spent four hours reading their old blog posts.

Because crawler spam is an older technique, it’s often filtered out of Google Analytics results before you ever see it, but occasionally some bots slip through.

Language spam is the newest type of spam infiltrating Analytics results. In your visitor data, Analytics shows the language of the browser your visitor was using: en-us, es, fr, etc. Language spam is when the language code is replaced by anything else, usually political slogans, websites, or advertising.

Most of this spam isn’t harmful. It doesn’t mean your site’s security has been compromised or that your data is at risk. It is still important not to reward the spammers, though: never click a link to a site you don’t recognise in your visitor history. Search engines are aware of these scams and are constantly working to eliminate them from their results. As such, your site’s SEO performance is unlikely to be affected.

Removing spam

Although not usually harmful, spam is incredibly annoying. For a site like Amazon a few thousand rogue hits might not be a big deal, but for small websites they wreak havoc. The good news is, removing spam results from your Google Analytics account can be done in a few simple steps. It might not stop the spam at source, but it will prevent it from affecting your site’s visitor data.

Firstly, ensure Google’s spam blockers are turned on for your Analytics account. Simply log in, and go to the Admin tab. Select your Account, Property, and View, then select View Settings from the top of the View section. Near the bottom you’ll see a checkbox for Bot Filtering. Enabling it will keep the View updated with all known bots and spiders filtered out of the results.

The next step is to create filters to weed out referrer and language spam. Firstly, in the View menu, click on View Settings, and then the Copy View button on the top right. This will create a duplicate View that you can make changes to without losing all your data if something goes wrong. Give the second View a name such as “Test” so you’ll know it’s the one you can experiment with.

From the Admin tab, select your test View and click the Filters option below it. In the new screen, select the red Add Filter button. The first filter will be for removing spam languages, so give it a name such as “Language” for quick identification. You want to create a custom filter, and select Exclude as the type from the bullet points. In the options below, choose “Language Settings” as the Filter Field, and type the following as the Filter Pattern:

\s[^\s]*\s|.{15,}|\.|,

At the bottom you’ll see an option to verify the filter before applying it. Upon verification, you should see the spam entries that the filter will remove. Once it’s been applied, the filter removes the spam languages from future Analytics results.
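If you would like to see what the Filter Pattern catches before trusting it, a minimal Python sketch such as the one below can help. The pattern has four alternatives: two whitespace characters with only non-whitespace between them, any run of 15 or more characters, a literal dot, or a comma. The sample values are invented, and the use of re.search assumes the filter matches anywhere in the field; Python's regex engine may also differ from the one Analytics uses in edge cases.

```python
import re

# The Filter Pattern from the article, unchanged
LANGUAGE_SPAM = re.compile(r"\s[^\s]*\s|.{15,}|\.|,")


def is_language_spam(value):
    """True if the pattern matches anywhere in the language value
    (assumes partial matching, as re.search does)."""
    return LANGUAGE_SPAM.search(value) is not None


# Invented sample values: real language codes plus spam-like strings
samples = ["en-us", "es", "fr", "zh-cn",
           "Vote for Example", "spam-site.example", "buy,now"]

for value in samples:
    verdict = "excluded" if is_language_spam(value) else "kept"
    print(f"{value!r}: {verdict}")
```

Genuine language codes are short and contain no spaces, dots, or commas, which is why they pass through while slogans and website addresses are excluded.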

Removing other types of spam

The same technique can help remove referrer and ghost spam. Firstly, you’ll need the information to create the filter. Ghost spam works by pinging your website without actually visiting it. Because these scams involve thousands of sites, the spammers don’t bother to check the name of the site they’re pinging. Instead they’ll enter different websites or random strings, which show in Analytics as the Hostname. To check whether you’re experiencing this kind of spam, go to the Network Report and select Hostname as the Primary Dimension. The Hostname should be the name of your site; anything you don’t recognise is likely evidence of referrer spam.

From the Hostname list, make a note of all the URLs that legitimately belong to your site. These will include yoursite.com but don’t forget subdomains (if you have any) such as shop.yoursite.com.

Once you’ve got a list of your legitimate Hostnames, return to the Admin panel and create another custom filter (again, it is recommended to use a test View, as filters are permanent and you don’t want to lose any data). This time select Include as the type. Use “Hostname” as the Filter Field, and type your website (and variations such as subdomains) as the Filter Pattern. Separate each Hostname with a pipe character (|).

Check none of your legitimate Hostnames are excluded by verifying the filter. Then apply the filter to remove non-authentic results from your Analytics data.
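As a sketch of how the Filter Pattern for the Hostname filter could be assembled and sanity-checked offline, the Python below joins a list of hostnames with |. The helper names and sample hostnames are hypothetical, and escaping the dots with re.escape is an addition of this example (an unescaped dot in a regular expression matches any character). The check again assumes the filter matches anywhere in the field.

```python
import re


def build_hostname_pattern(hostnames):
    """Join hostnames with | and escape each one so that '.'
    only matches a literal dot."""
    return "|".join(re.escape(host) for host in hostnames)


def is_legitimate(hostname, pattern):
    """True if the Include pattern matches anywhere in the hostname."""
    return re.search(pattern, hostname) is not None


# Placeholder hostnames from the article; substitute your own
legit = ["yoursite.com", "shop.yoursite.com"]
pattern = build_hostname_pattern(legit)
print(pattern)

# Invented Hostname values, as they might appear in the Network Report
for host in ["yoursite.com", "shop.yoursite.com",
             "random-spam.example", "(not set)"]:
    verdict = "kept" if is_legitimate(host, pattern) else "filtered out"
    print(f"{host}: {verdict}")
```

Because the match is partial, listing yoursite.com alone would already cover shop.yoursite.com in this sketch; listing subdomains explicitly simply makes the intent easier to read.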

Conclusion

Removing spam is a constant battle for any website owner, but with these simple tips, you can eliminate it from your Google Analytics data and prevent it from affecting your results. If you’re not sure how to start, pick a plan and let TechTe.am’s friendly experts help.
