If you’re anything like me, you can’t get enough of data.

Data is, and should be, the backbone of decision making when it comes to your digital marketing strategy. That’s why it’s so important to have data that you can trust; bad data leads to bad decisions. End of story.

Google Analytics has more data than you can shake a stick at and is an invaluable tool for insight and analysis. However, how much of this data is good data?

Below are five things to look out for that may be affecting the quality and integrity of the data in your Google Analytics account:

1. Ghost Traffic

Before you go bragging to your manager about your overnight spike in website traffic, you might just want to hang fire and investigate things a little more thoroughly. A sudden, unexpected increase in website traffic is often a sign of spam or bot activity.

Google Analytics graph showing ghost traffic spike

Although this increase might look nice at first, bot traffic skews your data completely, and keeping it in your account does more harm than good. Not only are your traffic numbers fake and untrustworthy, but your conversion rate and engagement stats will suffer knock-on effects too.

Spam bots are annoying, but luckily there are ways to catch these in the act; read our quick and easy tips on how to identify and remove spam traffic from your GA account.

2. (not set) as a dimension

I hate this and so should you. Seeing the ‘(not set)’ value in your Google Analytics reports can be very irritating and limits data insight and analysis. All that potentially amazing data just sitting there going to waste.

There are several dimensions where the ‘(not set)’ value can occur in Google Analytics, but the more significant ones are Landing Pages, Hostname and Page Title.

Screen shot of not set dimension on google analytics

This can happen for a variety of reasons, but these are the most common:

  • Coding Issues: If you’re running Google Tag Manager and Google Analytics code on your site there may be a conflict between the two that’s causing this to happen, particularly if there are active pageview triggers in both the GTM container and the GA snippet. There may also be an issue with where your Google Analytics code snippet is placed on the site – double check your setup to ensure code has been installed in line with best practice.
  • Spam Bots: If you see the ‘(not set)’ value in your Hostname or Browser reports, there’s a good chance this is ghost spam that’s left false data in your account.
  • Session Timeouts: A session ends after 30 minutes of inactivity by default. If a visitor stays on a page on your site for longer than 30 minutes and then takes an action on that page which triggers an event (for example, downloading a PDF or clicking on a video), this will trigger a brand-new session without a landing page attributed to it, hence ‘(not set)’ appearing as a landing page in your reports. Consider tweaking the default session timeout in your account if this scenario is applicable to your site.

This list is by no means exhaustive, but the above pointers should help kick-start your investigation and hopefully reduce the sheer rage that occurs when ‘(not set)’ pops up in your reports.
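To make the session timeout scenario a little more concrete, here is a minimal Python sketch. It is a simplified model rather than Google Analytics' actual processing logic: the hit format, the function names and the sample timestamps are all assumptions made for illustration. It splits a visitor's hits into sessions using a 30-minute inactivity window and reports the first pageview of each session as its landing page.

```python
from datetime import datetime, timedelta

# Default inactivity window described above (configurable in GA)
SESSION_TIMEOUT = timedelta(minutes=30)


def split_into_sessions(hits, timeout=SESSION_TIMEOUT):
    """Group (timestamp, hit_type, page) tuples into sessions by inactivity gap."""
    sessions = []
    for hit in sorted(hits, key=lambda h: h[0]):
        # Continue the current session only if the gap since the last hit is within the timeout
        if sessions and hit[0] - sessions[-1][-1][0] <= timeout:
            sessions[-1].append(hit)
        else:
            sessions.append([hit])
    return sessions


def landing_page(session):
    """Return the first pageview in the session, or '(not set)' if there is none."""
    for _, hit_type, page in session:
        if hit_type == "pageview":
            return page
    return "(not set)"


# Hypothetical visitor: loads a page, then downloads a PDF 45 minutes later
hits = [
    (datetime(2020, 1, 1, 9, 0), "pageview", "/guide/"),
    (datetime(2020, 1, 1, 9, 45), "event", "/guide/"),
]

sessions = split_into_sessions(hits)
landing_pages = [landing_page(s) for s in sessions]
print(landing_pages)  # ['/guide/', '(not set)']
```

The second session contains only an event, so there is no pageview to use as its landing page. Raising the timeout above 45 minutes in this toy example would keep both hits in one session.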

3. The ‘Other’ Channel

The ‘Other’ channel in Google Analytics is like the island of misfit toys. Or rather, the island of misfit Source/Mediums…

If Google Analytics does not recognise the Source or Medium being pulled in via campaign tagging (likely due to spelling/casing errors or having the source/medium the wrong way around) this traffic is probably going to be hanging out in the ‘Other’ channel.

You should avoid data going into here at all costs. Check this channel regularly and, if traffic does come through here, reallocate it as soon as possible to ensure that the marketing channels that should be getting the credit for this traffic are not being undervalued.

If you see a sudden reduction in data for a particular channel, it would be worth looking in your ‘Other’ channel for a sudden spike in traffic – there’s bound to be a correlation here.

Google Analytics graph showing spike in traffic due to 'other' channel

From here, you can investigate the ‘Other’ channel with a primary dimension of ‘Source’ and secondary dimension of ‘Medium’ – this should highlight quickly what the tagging errors are (for example, Facebook listed as the Medium rather than the Source).
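If you export that Source / Medium view, a quick script can flag the usual suspects for you. The Python sketch below is only an illustration: the list of recognised mediums is a hypothetical shortlist, not Google Analytics' actual channel rules, and the sample rows are made up. It flags mixed-case values, mediums that are not on the list, and pairs where the source and medium appear to be the wrong way around.

```python
# Hypothetical shortlist of mediums your own tagging convention allows.
# This is NOT Google Analytics' rule set; adjust it to match your channel definitions.
KNOWN_MEDIUMS = {"cpc", "email", "social", "referral", "organic", "display", "affiliate"}


def audit_tag(source, medium):
    """Return a list of likely tagging problems for one source/medium pair."""
    problems = []
    if source != source.lower() or medium != medium.lower():
        problems.append("mixed case")
    if source.lower() in KNOWN_MEDIUMS and medium.lower() not in KNOWN_MEDIUMS:
        problems.append("source and medium look swapped")
    elif medium.lower() not in KNOWN_MEDIUMS:
        problems.append("unrecognised medium")
    return problems


# Made-up rows as they might appear in an exported 'Other' channel report
rows = [
    ("facebook", "social"),
    ("social", "facebook"),
    ("Newsletter", "Email"),
    ("newsletter", "e-mail"),
]

report = {f"{source} / {medium}": audit_tag(source, medium) for source, medium in rows}
for pair, problems in report.items():
    print(pair, "->", problems or "ok")
```

Anything the script flags still needs a human decision: either fix the campaign tags at the source or adjust your channel rules so the traffic lands where it belongs.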

It wouldn’t be alarming to see small amounts of data in the ‘Other’ channel. However, any obvious anomalies should be rectified as soon as possible, either in your campaign tagging setup or by amending the rules for Default Channel Groupings (Admin > View > Channel Settings > Channel Grouping).

If there is a significant amount of traffic for a source or medium that doesn’t really fit into any of the predefined Channel Groupings, or you’d rather have a more granular view of specific source/medium combinations or campaigns, you can always create a brand-new channel to house this data going forward (for Paid Social campaigns, for example).

4. Payment Gateways as Referrers

If your site uses a third-party payment gateway like PayPal or Sage Pay and you see a high number of transactions or a large amount of revenue attributed to Referrals, there’s every chance the conversion value of your other marketing channels is being greatly undervalued.

Commonly, a conversion in Google Analytics is triggered when the user navigates to a page that indicates the transaction is complete (a Thank You page, for example).

A user navigates from your site to checkout on a payment gateway, and then upon completion of payment is redirected to the Thank You page on your site which completes the transaction and triggers the conversion data in Google Analytics. This scenario is technically treated as a new session and the original source/medium that brought the user to your site is lost. It will instead be attributed to a referral from that payment gateway website as this was the last known interaction.

This is a pretty important issue to rectify – having all of your conversion data attributed to referrals will completely skew the data across all of your marketing channels and valuable insight into how your different mediums convert will be lost.

Look at the list of sources in your Referral channel report to see if any of the payment gateways you use are listed here as a source. If so, adding the domains of your payment gateways to the Referral Exclusion List (Admin > Property > Tracking Info > Referral Exclusion List) will signal to Google Analytics to ignore whatever domain you list as a referral and should keep the integrity of the original session intact.

Google analytics referral exclusion list screen shot
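To size the problem before you touch the exclusion list, you could run a quick check over an export of your Referral report. The Python sketch below is illustrative only: the gateway domains, the sample sources and the transaction counts are assumptions, so swap in the domains your checkout actually uses. It matches each referral source against the gateway list, including subdomains, and totals the transactions that are probably misattributed.

```python
# Assumed gateway domains for illustration; replace with the ones your checkout uses
PAYMENT_GATEWAYS = ("paypal.com", "sagepay.com")


def is_gateway_referral(source, gateways=PAYMENT_GATEWAYS):
    """True if the referral source is a gateway domain or one of its subdomains."""
    source = source.lower()
    return any(source == g or source.endswith("." + g) for g in gateways)


# Made-up export: referral source -> transactions attributed to it
referrals = {
    "paypal.com": 120,
    "live.sagepay.com": 45,
    "partner-blog.example": 8,
}

misattributed = {s: n for s, n in referrals.items() if is_gateway_referral(s)}
total_misattributed = sum(misattributed.values())
print(misattributed, total_misattributed)
```

Matching on the full domain or a dot-prefixed suffix avoids false positives from unrelated sites whose names merely end in the same letters.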

5. URL Query Parameters in Content Reports

Query Parameters in URLs can cause quite a headache in the content reports in Google Analytics. They can create a lot of duplication in these reports and render high-level data inaccurate.

For example, website.com/tshirts/ may have hundreds or thousands of possible URL variations due to on-page manipulations such as sorting by price or filtering by colour. Using this example, you would get additional URLs in your landing page report like this:

website.com/tshirts/?colour=pink OR website.com/tshirts/?sort=price

Not only is this very messy, but it splinters the data too. We recently saw a website that appeared to have around 500 sessions per month for one of its main landing pages. However, there were hundreds of different permutations of this URL due to its parameters, all of which appeared separately in the landing page report. After stripping these parameters in Excel and pivoting the data for these pages together, we saw that this page was driving closer to 10,000 sessions per month, rather than the 500 we saw at first glance in the landing page report. Traffic isn’t the only metric splintered here: engagement stats and conversion data are affected too.
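If Excel isn't your thing, the same strip-and-pivot exercise can be sketched in a few lines of Python using the standard library. The parameter names come from the example above; the rows and session counts are made up for illustration and are not the client figures mentioned.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters to drop, taken from the tshirts example above
EXCLUDED_PARAMS = {"colour", "sort"}


def strip_params(url, excluded=EXCLUDED_PARAMS):
    """Remove the excluded query parameters from a URL, keeping any others."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in excluded
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Made-up landing page export: (URL, sessions)
rows = [
    ("/tshirts/", 500),
    ("/tshirts/?colour=pink", 300),
    ("/tshirts/?sort=price", 200),
    ("/tshirts/?colour=blue&page=2", 50),
]

# Pivot: sum sessions per cleaned URL
totals = Counter()
for url, sessions in rows:
    totals[strip_params(url)] += sessions

print(dict(totals))  # {'/tshirts/': 1000, '/tshirts/?page=2': 50}
```

Parameters that are not on the exclusion list, such as page in this example, are left alone, so pages that genuinely differ stay separate.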

Fortunately, Google Analytics has a setting that lets you exclude query parameters from content reports, giving you a much cleaner and more accurate view of landing page data (Admin > View > View Settings > Exclude URL Query Parameters). If the parameters you want to exclude are consistent, like in the example above, you can enter them here and they will be stripped and aggregated in the content reports going forward.

URL query parameters in google analytics

Still need advice on your Analytics strategy? Book a chat with one of our Analytics specialists or set us a challenge.