April 16th 2020

How to Exclude Bot Traffic in Google Analytics

This article assumes you have a bit of background knowledge on bots and how they can affect your site. If you need to brush up, start with ‘What is a Bot’.

Bot traffic on your site is often harmless, but because bots behave similarly to humans, it can affect your traffic data in Google Analytics if it isn’t filtered out correctly. Clean data is crucial in helping us to make effective marketing decisions, and while filtering out bot traffic won’t ensure your data is completely clean, it will be a very good start.

Note: Before you do anything in Google Analytics, always make sure you have a Raw View, a Preferred View and a Testing View. Once data is lost, you can never recover it, so keep a Raw View as a backup, and test your changes in a Testing View before you transfer them to your Preferred View.

Filter Out Known Bots

Luckily, Google is aware of most of the bots that your site is likely to encounter, which means you can filter this traffic out of your GA account with the click of a button. All you need to do is go to Google Analytics and click on the ‘Admin’ cog in the bottom left hand corner.

Go to ‘View Settings’ and tick the checkbox that says ‘Exclude all hits from known bots and spiders’.


It’s worth noting that ticking this box will only affect traffic after that point, and will not filter bot traffic out retrospectively, so make sure you do this as soon as possible.

Identifying and Filtering Unknown Bots

However, Google Analytics’ known bot filter isn’t foolproof, and bot traffic can still find its way into your account even with this option selected. Bot traffic is often relatively easy to spot when you start digging: you’ll typically see a group of users (from the same city, with the same device, the same network provider etc.) whose behaviour differs markedly from the rest of your users. Bot traffic often shows up as ‘Direct’. Sudden spikes in traffic for certain dimensions can also alert you to bot hits, although this is not always the case. The best way to hunt down bot traffic is to try different combinations of dimensions and secondary dimensions and see if anything looks out of the ordinary. Here are a few tell-tale signs of bot traffic you can look out for:

  • Large proportion of users with the location (not set)
  • Groups of users (e.g. from the same city, network, service provider etc.) with particularly high bounce rates or very short session durations (a few seconds)
  • Groups of users with an abnormally large proportion of new users
  • Sudden traffic spikes for particular dimensions
  • Strange or suspect referral sources
  • Hostnames that aren’t your website (Hostname is a secondary dimension)
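
If you export a report from GA, some of the checks above can be roughed out in code. The snippet below is a minimal sketch, not a definitive bot detector: the `Segment` fields, the sample figures and every threshold are assumptions made up for illustration, and you would need to tune them against your own site's normal behaviour.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One row of a hypothetical GA export, grouped by a dimension such as city."""
    name: str
    sessions: int
    new_sessions: int        # sessions from first-time users
    bounces: int
    total_duration_s: float  # sum of all session durations, in seconds


def suspicious_reasons(seg, max_bounce_rate=0.95, min_avg_duration_s=3.0,
                       max_new_share=0.95, min_sessions=50):
    """Return the tell-tale signs a segment shows (empty list if none).

    All thresholds are illustrative assumptions, not GA defaults.
    """
    if seg.sessions < min_sessions:
        return []  # too little data to judge either way
    reasons = []
    if seg.bounces / seg.sessions >= max_bounce_rate:
        reasons.append("high bounce rate")
    if seg.total_duration_s / seg.sessions <= min_avg_duration_s:
        reasons.append("very short sessions")
    if seg.new_sessions / seg.sessions >= max_new_share:
        reasons.append("mostly new users")
    if seg.name == "(not set)":
        reasons.append("location not set")
    return reasons


# Made-up sample rows standing in for an exported report.
segments = [
    Segment("London", 1200, 600, 540, 96000.0),
    Segment("Ashburn", 400, 398, 396, 400.0),
    Segment("(not set)", 90, 89, 88, 45.0),
    Segment("Leeds", 10, 10, 10, 5.0),
]

flagged = {s.name: suspicious_reasons(s) for s in segments if suspicious_reasons(s)}
for name, reasons in flagged.items():
    print(f"{name}: {', '.join(reasons)}")
```

A flagged segment is only a prompt to investigate further in GA, since genuine users can also bounce quickly.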

Once you’ve identified a bot, you can set up a filter to exclude it from future GA data.

Setting Up A Filter for Unknown Bots

1. Create a New GA View – This is probably the most important step, because once you filter data out of GA, you can never get it back. You can create a new view to test your changes on and check everything works as you’d expect it to first. Always keep a ‘Raw’ or unfiltered view in your GA account. This acts as a backup in case anything goes wrong.

2. Review Your Bot – What does your bot traffic have in common? (e.g. a hostname, city or IP address)

3. Go to the Admin panel in Google Analytics for your new view and click ‘Filters’ in the ‘View’ column on the right.

4. Click ‘Add Filter’

5. Give your filter a name that will help you recognise it in future.


6. Select your filter type. Depending on the factor you are going to use to filter out your bot (e.g. hostname) you may need to use a Predefined or Custom filter. Look in both to see which suits your example.

7. Make sure you have ‘Exclude’ selected and use the ‘Filter Field’ drop down to select the type of dimension you want to filter out, and enter the text you’re using to identify bot traffic in the ‘Filter Pattern’ box. (For example, if you are getting bot hits from the hostname nastybots.com, you would select ‘Hostname’ from the ‘Filter Field’ dropdown and enter ‘nastybots.com’ into the ‘Filter Pattern’ field.)
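
One thing to watch with custom filters is that the ‘Filter Pattern’ is read as a regular expression, so an unescaped dot matches any character. The sketch below uses Python's `re` module as a stand-in to show the difference between a loose and a stricter pattern. It is an approximation under stated assumptions: GA's regex engine is not Python's, the `would_exclude` helper and the hostnames are hypothetical, and case-insensitive matching assumes the filter's ‘Case Sensitive’ box is left unticked.

```python
import re


def would_exclude(pattern, hostname):
    """Approximate an Exclude filter: drop the hit if the pattern matches
    anywhere in the field. Case-insensitive by assumption (see lead-in)."""
    return re.search(pattern, hostname, flags=re.IGNORECASE) is not None


# Hypothetical hostnames seen in a report.
hostnames = [
    "nastybots.com",
    "www.nastybots.com",
    "nastybotsxcom.example",
    "www.example.com",
]

loose = "nastybots.com"                 # the dot matches ANY character
strict = r"^(www\.)?nastybots\.com$"    # escaped dots, anchored at both ends

for label, pattern in (("loose", loose), ("strict", strict)):
    excluded = [h for h in hostnames if would_exclude(pattern, h)]
    print(f"{label}: {excluded}")
```

The loose pattern also catches the unrelated `nastybotsxcom.example`, which is why it is worth escaping dots and anchoring the pattern when you only want to exclude one exact hostname.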

8. Hit ‘Save’!

9. Now it’s time to check your filter works as you expect. It normally takes effect straight away, but it can take up to 24 hours, so check back in a day or so. To confirm it is filtering out the correct traffic, compare data from your new testing view against your preferred view over the same date range. You should still be able to see the bot traffic in your preferred view, but it shouldn’t be visible in the testing view. (You should also check that the bot traffic is the only thing you’ve filtered out! Make sure none of your genuine user data has been filtered out with it.) It’s a good idea to check back a few times over the course of a few days to be sure nothing’s slipped through the net.
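
If you export session counts from both views, the comparison in step 9 can be sketched as below. This is only an illustration under assumptions: the `compare_views` helper, the dictionaries of sessions per hostname and all the figures are made up, and both exports are assumed to cover the same date range after the filter went live.

```python
def compare_views(preferred, testing, expected_bot_keys):
    """Compare per-key session counts from two views.

    Returns (bots_still_present, genuine_sessions_lost):
    - bot keys that still have sessions in the testing view
    - non-bot keys with fewer sessions in testing than in preferred
    """
    bots_still_present = {
        key: testing[key]
        for key in expected_bot_keys
        if testing.get(key, 0) > 0
    }
    genuine_sessions_lost = {}
    for key, sessions in preferred.items():
        if key in expected_bot_keys:
            continue  # this traffic is supposed to disappear
        missing = sessions - testing.get(key, 0)
        if missing > 0:
            genuine_sessions_lost[key] = missing
    return bots_still_present, genuine_sessions_lost


# Hypothetical sessions per hostname, exported from each view.
preferred = {"www.example.com": 950, "nastybots.com": 120, "shop.example.com": 80}
testing = {"www.example.com": 950, "shop.example.com": 80}

still_present, lost = compare_views(preferred, testing, {"nastybots.com"})
print("Bot traffic still in testing view:", still_present)
print("Genuine sessions missing from testing view:", lost)
```

Two empty results suggest the filter removed the bot traffic and nothing else. Anything in the second result means the pattern is probably too broad.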

10. If your testing view is behaving as you’d expect it to, it’s time to add your new filter to your preferred view. Simply repeat steps 3 to 8 in your preferred view, making sure every setting matches your testing view exactly.

And there you have it! Bot traffic has now been identified and filtered out of your precious GA data.

If your filter hasn’t worked as you’d expected it to, or if you suspect your GA data could use a bit of TLC, feel free to get in touch! We’d be glad to take a look.