
To my family
Chapters 1–5 are in ebook 1, ISBN 978-1-910591-01-7, available from your ebook retailer.
Foreword
Preface
Introduction
1 Preparing to Measure Success
2 Choosing the Right Tool
3 Setting Expectations and Building the Process
4 Assessing Your Data Quality
5 Jumpstart Guide to Key Features
Appendix: Terminology Explained
Index
About the Author
This is ebook 2, ISBN 978-1-910591-02-4, containing chapters 6–10. Chapters 1–5 are in ebook 1, ISBN 978-1-910591-01-7, available from your ebook retailer.
Foreword
Preface
Introduction
6 Jumpstart Guide to Key Tracking Methods
7 Data Responsibilities
8 Building Your Insights Team
9 Using Key Performance Indicators and Dashboards
10 Insights and Success Stories
Appendix: Terminology Explained
Index
If you have $100 to invest in magnificent, glorious success from your analytics efforts, invest $10 in tools and implementation and invest $90 in big brains (people!).
I humbly postulated that ground truth as the 10/90 rule on May 19, 2006. With every passing year, I’ve come to believe in that rule more and more (and more and more). The reason is quite simple. Every facet of the business world is throwing off ever more data, and every facet of our personal existence (and insistence on sharing) is throwing off ever more data. Data, it turns out, is free; identifying specific actions business leaders can take based on rigorous analysis is not free.
This is why I’m so excited about Brian’s book. It dispenses with the normal “omg, omg, look at how much data is there and is that not amazing, let us spend 18 months on implementation,” and gets to what it really takes to shift from data puking to recommending business actions based on data.
Here’s one of my personal examples of the difference in emphasis, and what ultimately drives success. In every company, every leader wants a dashboard. “Get me a summary of the business performance. Decisions shall be made!” Analysts scurry around and an intense burst of data, manifested as tables and charts, is presented on a vanilla-scented piece of paper. Happiness? Job promotions?
Sadly, no.
It turns out that the higher you go up the chain of command, the more analytical skills go down, and the context required to make sense of the numbers on the dashboard is also dramatically reduced. Few decisions are made, and if there is a meeting to discuss this it devolves into a discussion of the data quality, missing data, colors in charts, and everything except making a business decision.
The answer? Words in English. More specifically: insights, actions, business impact.
Every dashboard in the world should include as few tables and charts as possible. It should include insights written in English (or your native language) by the analyst, followed by the recommended actions and—the most important critical must-have bit—the impact on the business if the actions are taken.
That vanilla-scented piece of paper will no longer drive one more awful discussion about the data itself; it will drive a discussion of which actions to take first. Hallelujah!!
It is incredible to realize that in the end, data by itself does nothing. It is just data. It is the $90 part—the big brains—that identifies insights, actions, and business impact that will push your company’s profitability and customer delight to new, incredible, heights.
Next time you receive a dashboard, look for the balance between tables, charts, and English text, and you’ll know if it will add value or waste time.
That’s my little appetizer for you as you dive into Brian’s wonderful book.
The entire book is awesome. It is beautifully structured, and you should go from Chapter 1 to Chapter 10 on your “we will make the most of data” voyage. But if you wanted to be a little naughty and jump around, my favorites are Chapter 8 (you can read it anytime, and you can’t work on the recommendations soon enough!) and Chapter 10 (every time you find a task daunting, find hope in the success of others in the case studies).
I wish you all the very best. Carpe diem!
Avinash Kaushik
Digital Marketing Evangelist: Google
Author: Web Analytics 2.0, Web Analytics: An Hour a Day
This is my fourth book on Google Analytics, but this one is different. Rather than making it a tool-specific practitioner’s bible (as my Advanced Web Metrics series endeavored to be), I approached this book as I do my work: helping ambitious organizations make a success of their business by using data intelligently.
As I have come to realize over the years, success does not depend on tool expertise alone. The bigger issue is the organization. It needs to trust the data and have confidence in the process, structure, and people behind it—things not directly related to the tools being used. So I approached this book very much from the business point of view first, then worked backward toward the nontechnical aspects of the tool—Google Analytics. My intention is that senior managers, stakeholders, and practitioners all speak a shared language and set a common path to building a credible data-driven environment. I hope the method has worked.
As for all authors, my writing of this book was not a solo exercise. It required love, support, help, guidance, advice, friendship—even random and unrelated conversations. (You would be surprised at what can spark an idea connected with data!) The people I list here are those who have directly contributed to the book or to my thinking about applying successful analytics.
Sara Clifton’s never-ending love, support, and guidance keep me on the right track and always help me to see the bigger picture of measurement, digital, and life in general.
Shelby Thayer has sanity-checked every word of my last three books, including this one. She is a great analyst, with a ton of experience at driving web measurement acceptance within a large organization, and her feedback and experiences have helped me significantly in writing a tightly focused book.
Brad Townsend is my valued technical editor. He is a smart (and modest) Googler who, as a software engineer, knows the technicalities and back end of Google Analytics like no other. David Vallejo, an expert Google Analytics implementer, developer, and all-round smart guy, helped me enormously with his technical problem-solving skills. Nothing can’t be done with this guy at hand! Dave Evans expertly reviewed Chapter 7 (“Data Responsibilities”) and provided insightful discussions about data privacy law. Dick Margulis is my trusted editor, who has now helped me write and structure three books and is my go-to man for navigating the tricky waters of the publishing world. His guidance and advice have been invaluable.
Avinash Kaushik has honored me (again) by writing the foreword to this book and setting the scene so enthusiastically and logically for the reader—in a way that only he can. I am lucky to count him as a friend and former colleague. He inspires me (and many others) with his advocacy and excitement for all things that can be measured.
John Wedderburn, Tobias Johansson, and the team at Search Integration (where I work) have engaged in many “quality time” meetings and open-ended discussions that have broadened and deepened my knowledge.
And last but not least, the vibrant and smart GACP community pushes back the boundaries of what can be done with Google Analytics, and importantly, what can be simplified with it.
I hope I have remembered everyone.
Brian Clifton
January 2015

“We think we want information when really we want knowledge.”
—Nate Silver, from The Signal and the Noise

According to a recent survey of IT professionals,[1] “55% of big data analytics projects are abandoned.” Most of the respondents said that the top two reasons the projects fail are that managers lack the right expertise in-house to connect the dots around data to form appropriate insights, and that projects lack business context around data. Similarly, the “Online Measurement and Strategy Report 2013” from Econsultancy[2] asked companies, “Do you have a company-wide strategy that ties data collection and analysis to business objectives?” Only 19% said yes, a figure that had hardly changed during the previous five years.
I wrote this book for those managers struggling to make headway—to empower you to make informed decisions and overcome the obstacles.
My goal with this book is to get you to think in terms of insights—not Google Analytics data. An insight is knowledge that you can relate to. It’s a story that puts you in the shoes of your visitors, so that you can understand their requirements when they come to and view your website, app, or other digital content.
A company’s ability to satisfy the needs of a website visitor depends on two important factors:
• Visitor expectations, discerned from how they got to your content—what search engine, campaign ads, or social conversation drove their decision to seek you out
• User experience—how easy it was to use your content, to navigate around, find information, engage with you (contact you, purchase, subscribe, give feedback)
It is your organization’s ability to manage, analyze, and improve these two factors that determines your digital success (or not). In this book I describe how insights are used to pull all of the relevant data points together to build a story of your visitors’ journey and their experiences. With that knowledge you can improve both factors. As I show in Chapter 10, the improvement can mean dramatic gains in online visibility, revenue, or efficiency savings.
Yet Google Analytics doesn’t provide insights by itself—no tool can. Producing insights requires an understanding of your business and its products, your value proposition, your website content, its engagement points and processes, and of course its marketing plan. Google Analytics provides the data (and lots of it) that enables you to assess these. However, people—not machines—build insights. This is the role of your analytics team. They sift through the noise to find the useful data, translate it into information to explain what is happening, then build stories of useful knowledge for the organization—the insights.
This book is about showing you how to do that. This book is about knowing what to focus on, what you can expect in return, the talent you need to hire, the processes you need to put in place, the pitfalls to avoid, and how much investment is required in order to make it all happen.
This is a detailed book by necessity. Building an environment where you can trust your data, understand it, and make important decisions based on it requires a deep level of immersion, not an executive summary. However, my approach throughout this book is to focus on the insight gained for the business, not the minutiae.
This book is for you if you are a manager who needs an overview of the key principles of website measurement, the capabilities of Google Analytics, and how to grow and give direction to your organization when it comes to its digital strategy. Your ultimate interest is in insights and knowledge, not more data!
In short, I aim to put you in control and provide a perspective on the entire process of building a data-driven environment using Google Analytics.
[1] http://visual.ly/cios-big-data
[2] https://econsultancy.com/reports/online-measurement-and-strategy-report

By understanding the key tracking methods available within Google Analytics, you will be able to direct your teams to cater for current data needs, and you will be better able to contribute to the discussion and planning of your organization’s future data needs. That is, you will become proactive in proposing innovative ways to measure success—helping the business make smart decisions based on data.
To be innovative, you need a nontechnical understanding of how Google Analytics works, its key tracking methods, the principles of attribution, and what it’s possible to do with your data outside of Google Analytics.
Although in the early days (circa 2005), the Google Analytics pricing of free was a strong motivator for adoption by many a small business, it rapidly became clear to many analytical experts that its real allure is its ability to be both a broad brush and a scalpel for helping you understand your website. In addition, because of its implementation simplicity, it is also incredibly flexible.
Even with its groundbreaking setup simplicity and its intuitive user interface, Google Analytics is a complex product. It aims to simplify complicated processes and visitor journeys by hiding the technicalities behind them. That of course is a good thing and I deliberately avoid any code in this book—you don’t need it! However, if this subject matter is new to you, take your time understanding the concepts described in this chapter. Your knowledge gained will stand you in good stead for your career—being comfortable with data in the digital world is becoming an expected skill, rather than an optional extra.

As with this entire book, my focus is on the tracking of website visitors. This is what Google Analytics is mainly used for today. However, there are also Google Analytics tools available to track how your mobile apps are used, on both Android[1] and iOS.[2] In addition, Google Analytics is now moving into an area referred to as Universal Analytics: a central platform to track anything that can communicate with the Internet, from barcode scanners to turnstiles.

Google Analytics is known in the measurement industry as a page-tag solution. That means it uses a snippet of code (the tag) placed on your pages to collect and transmit visitor information to Google servers. For Google Analytics, the snippet of code is approximately a dozen lines of JavaScript and is called the Google Analytics Tracking Code (GATC). By this method, all data collection, processing, maintenance, and program upgrades are managed by Google as a hosted service, that is, software as a service (SaaS). The process and data flow are illustrated in Figure 6.1.
The operational process for the latest tracking code, released in 2013,[3] is described as follows (note that previous versions of the tracking code work in a similar, though not identical, way):

1 Nothing happens until a visitor arrives at your website. This can be via many different routes, including search engines, social networks, email marketing, and referral links. Whatever the route, when the visitor views one of your pages containing the GATC, an automatic request is made for the file at www.google-analytics.com/analytics.js. This is the Google Analytics master script that is downloaded only once during a visitor session. It contains all the code required for tracking. Further requests for it will be retrieved from the visitor’s browser cache.
The JavaScript download occurs asynchronously. That means it doesn’t interfere with the rest of the page completing its download. If there is a delay for analytics.js, or the file does not download, the browser continues as normal—there is no impact on user experience, though data collection may be affected.
With analytics.js in place, anonymous visit and visitor data are collected, and a first-party cookie is created. The cookie contains a unique visitor ID plus some meta information regarding the version number and the domain where the GATC is placed.
2 For each pageview, the GATC sends this information to Google data collection servers via a formatted URL of the form https://www.google-analytics.com/collect?v=1&tid=UA-XXXX-Y&cid=555&t=pageview&dp=%2Fhome-page. The transmission of data takes a fraction of a second.
Although a pageview is the default hit type captured by the GATC, other hit types can be sent by communicating with the analytics.js file via JavaScript commands, such as tracking in-page events, transactions, or labeling visitors.
3 At regular intervals, Google processes the collected data and updates your Google Analytics reports. Because of the huge quantity of data involved (Google Analytics is used on more than 20 million websites worldwide), reports are typically displayed 3–4 hours in arrears. This may be longer if you have a high-traffic website, though it should not be more than 24 hours.
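For readers who would like to see what the formatted URL in step 2 looks like when it is assembled, the following is a minimal Python sketch rather than Google's own code. It uses only the parameters that appear in the example URL above (v, tid, cid, t, dp). The function name build_pageview_url and the constant COLLECT_ENDPOINT are hypothetical names chosen for illustration, and the sketch only builds the URL; it does not send anything.

```python
from urllib.parse import urlencode

# Collection endpoint as shown in the example URL in step 2.
COLLECT_ENDPOINT = "https://www.google-analytics.com/collect"


def build_pageview_url(tracking_id, client_id, page_path):
    """Return the formatted URL for a single pageview hit (illustrative only)."""
    params = {
        "v": 1,              # protocol version
        "tid": tracking_id,  # the property the hit belongs to
        "cid": client_id,    # anonymous visitor ID stored in the first-party cookie
        "t": "pageview",     # hit type
        "dp": page_path,     # path of the page being viewed
    }
    # urlencode percent-encodes the leading slash of the path as %2F.
    return COLLECT_ENDPOINT + "?" + urlencode(params)


url = build_pageview_url("UA-XXXX-Y", "555", "/home-page")
print(url)
```

With the placeholder values used in step 2, the function reproduces the example URL exactly, which is a quick way to check that each parameter is understood.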

As described in Chapter 5 (in ebook 1), Google Analytics has a set of real-time reports where current activity is displayed with a delay of typically 4 seconds. This is an impressive subset of your data, though not every report within Google Analytics can be updated so quickly.
To compare your activity for different date ranges or different visitor segments, you will review the standard Google Analytics reports. How fresh this data is—in other words, how up-to-date your reports are—depends on a number of factors. The most relevant is the volume of data you send to the Google Analytics collection servers.
For most websites, your data is likely to be 3–4 hours in arrears. This may be significantly less if your site receives fewer than 10,000 visits per day. If your site receives more than 200,000 visits per day and you use the free version of Google Analytics, your reports are processed once every 24 hours.[4] For Premium users, freshness is guaranteed at under 4 hours, regardless of data volume. Chapter 2 (in ebook 1) details the difference between free and Premium Google Analytics.


A page tag (for example, the GATC) is an object embedded in a web page that is invisible to the user but allows checking that a user has viewed the page. Alternative names are web beacon, tracking bug, and tag.
The page-tag methodology that Google Analytics uses is not unique in the industry. In fact, the technique has been around since the late 1990s, and the majority of web analytics vendors employ the same technique for data collection (though each vendor has its own particular tweaks and patents for their tool).

By design, Google Analytics uses the same analytics.js tracking code for all visitors and for all website owners. That means it is cached by a very large proportion of web users, a major advantage of having an adoption base of millions of websites. If a visitor to your website has previously visited another website that also runs analytics.js, the file does not need to be downloaded at all because it will already be cached. Even if analytics.js is not cached, the typical file size is 33 KB. On a 10 Mbps Internet connection the raw transfer takes under 30 milliseconds (33 KB is about 264 kilobits); allowing for connection setup and network latency, the download is typically complete in around 200–300 milliseconds (less than one-third of a second). The result is that Google Analytics has a minimal impact on your page loading times.
As you have probably realized from the description of Figure 6.1, Google Analytics will not function and no data will be collected if a visitor blocks the execution of JavaScript or the setting of first-party cookies, if you forgot to add the GATC to your page, or if your web server does not allow the GATC to execute (for example, it is behind a firewall). Once data is lost, you cannot go back and reprocess it, so regular audits of your GATC deployment should be part of your implementation plan, as described in Chapter 4 (in ebook 1).
In 2013 Google launched their latest update to the GATC, called Universal Analytics. Their stated goal is to take web analytics to another level, moving away from being simply a website measurement tool, to becoming the central data platform for providing insights for an entire organization’s activities. This could include, for example, online marketing, offline marketing, traditional e-commerce tracking, in-store sales (physical in-person sales), telephone orders, event participation, and many other possibilities that smart people around the world are currently experimenting with.
Universal Analytics is not a separate product or tracking method. It is the same GATC code described for tracking websites, and this is likely to remain its most common use. I separate the different uses here only for the purpose of clarity.
Because analytics.js uses a standardized low-level protocol to send data hits to Google Analytics data collection servers, developers can program any Internet-connected device to send raw user interaction data directly to Google Analytics servers. The beauty of this method is its simplicity—broadcasting a formatted URL is straightforward for a developer.
As a simple example, suppose a visitor can print out a coupon for use in your Main Street store (or have it emailed to them). If the barcode for the coupon includes the Google Analytics visitor ID, then the offline conversion can be tied back to the website visitor.
Now imagine you run an event such as a conference, concert, or sports game. Visitors to your event use their barcoded ticket to access your venue. If you connect your barcode scanners to the Internet, each user can be tracked by Google Analytics as they enter and exit the venue (anonymously, of course). This is powerful stuff. You can take advantage of Google Analytics real-time reports to quickly and efficiently assess the flow of visitors to your venue.
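To make the venue example concrete, here is a hedged Python sketch of what a developer might program an Internet-connected barcode scanner to prepare for each scan. It is an assumption-laden illustration, not a documented recipe: the category name "venue", the gate labels, and the scheme for deriving an anonymous visitor ID from the ticket barcode are all hypothetical. The parameter names (t, ec, ea, el) are the event fields of the same formatted-URL protocol used for pageviews. The sketch builds the payload only and does not transmit it.

```python
import uuid
from urllib.parse import urlencode


def anonymous_client_id(ticket_barcode):
    """Derive a stable ID from the barcode so the raw ticket number is never sent.

    Hypothetical scheme: the same ticket always maps to the same ID, which lets
    an entry and a later exit be tied to one (unnamed) visitor.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, "ticket:" + ticket_barcode))


def build_scan_hit(tracking_id, ticket_barcode, gate, direction):
    """Return the URL-encoded payload for one scan, recorded as an event hit."""
    params = {
        "v": 1,
        "tid": tracking_id,
        "cid": anonymous_client_id(ticket_barcode),
        "t": "event",     # hit type: an event rather than a pageview
        "ec": "venue",    # event category (hypothetical naming)
        "ea": direction,  # event action: "entry" or "exit"
        "el": gate,       # event label: which gate was used
    }
    return urlencode(params)


payload = build_scan_hit("UA-XXXX-Y", "TICKET-0001", "gate-3", "entry")
print(payload)
```

Hashing the barcode keeps the ticket number out of the data that is sent, but it does not by itself guarantee anonymity. The privacy responsibilities discussed in Chapter 7 still apply.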
Perhaps your event visitors also use their ticket at local merchandise stores for a special discount while the event is running. Using the same barcode scanning technique, local stores can send data to Google Analytics and provide detailed reports of the purchases made and the revenue transacted. If you manage a stadium, concert hall, or conference venue, Google Analytics can provide you with a host of performance information in an easy-to-visualize, standardized format. Tax-free shopping at airports is another example, as cashiers already check boarding cards for purchases.
You can apply this same technique to garage ticketing, premise turnstiles, electronic key fobs (your car), remote controls (your TV or music system), and security tags; in fact, to anything that can be programmed to send a formatted URL over the Internet. The Olympics, World Cup, and Super Bowl are prime examples for using this type of technique to capture and display big data. However, any event, large or small, is suitable. For example, I have used this technique at a conference and displayed the real-time data of visitors entering the event to the audience while I was speaking. Others have tracked home appliance usage in this way, just for fun.[5] Essentially, any user-centric data is a potential good fit for Google Analytics.
Imagine the power of Universal Analytics if you run a shopping mall, supermarket, or retail store. In a relatively straightforward way (this is not rocket science), you can tie online and offline marketing to store visits, store purchases, emails, telephone calls, even your garage usage! You can integrate this data with your CRM system (via an anonymous visitor ID in Google Analytics). That way, all your online and offline data can be tied to a specific customer. This is potential nirvana for a digital marketer—and also any senior manager who needs data to make informed decisions.
Offline, online, and specific customer data (CRM) can all be combined into a unified data collection and reporting platform—namely, Google Analytics. Figure 6.2 shows components of offline data, with online data simplified to a single node, labeled Google Analytics (the constituents for online data are shown in Figure 1.3 of Chapter 1, in ebook 1). A customer or potential customer may touch any number of these. My point is that they can and should be tracked if they are of value to your business.


The potential for making strategic business decisions using Universal Analytics is enormous and very exciting. My only niggling concern is privacy. With so much “anonymous” data being collected around the individual, the visitor becomes increasingly more identifiable. That is, a process of triangulating anonymous data points can lead to identifying who a person is.
A now famous example of this is the 2006 AOL search query release. AOL intentionally made the data public for research purposes. However, users could be identified indirectly by the query terms they entered into the search engine. It is a classic example of how the power of data can be underestimated by its owners.[6]
I discuss your responsibilities with respect to data collection in Chapter 7.

When it comes to tracking a visitor’s interaction with your website, or whatever you use Google Analytics data collection for, there are three main techniques available to you:
• Track the interaction as a pageview—the default tracking method for websites.
• Track the interaction as a virtual pageview—Google Analytics considers the data hit sent as the equivalent of a pageview.
• Track the interaction as an event—any interaction that you would not consider a pageview, such as a visitor clicking on a file download link.
Pageview, virtual pageview, and event tracking cover the vast majority of tracking requirements. However, there are other tracking options available and special tracking considerations to be aware of. See the section “Significant Others.”
As shown in Figure 6.1, the default behavior when the standard GATC is deployed on your web pages is for Google Analytics to capture a pageview—the path, page name, and page title—that is loaded in your visitor’s web browser. This, coupled with the anonymous visitor information—such as what site or campaign brought the visitor to you, the date and time of the visit, the location of the visitor, whether they have come before or not—is a powerful data set that forms the basis of your initial analysis.
Apart from deploying the GATC on your pages, the exact same tracking code on every page, nothing else is required for this to work. The ease of obtaining data by this method is a key advantage of using it. Therefore, tracking pageviews should be your default approach unless there is a specific reason for doing something else.

Because collecting pageview data is so easy, you may find yourself overwhelmed with information, and you may exceed the data volume limits of your account (for the free version of Google Analytics this is 10 million data hits per month). There are two ways you can reduce the volume of pageview data:
• Place your GATC only on pages that you wish to track.
• Set a sampling rate to statistically sample your visitors and their pageviews.
The first method is perfectly valid, though it does raise a question: If you don’t wish to track it, what is the point of having the content in the first place? In other words, if the content is present on your website and you wish to keep it, surely you will want to measure its performance? Therefore, I recommend the sampling technique as the better way to reduce data volume. It is a simple online setting added to your GATC.[7]
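The idea behind sampling visitors, rather than individual pageviews, can be sketched in a few lines of Python. This is an illustration of the principle only and is not how Google implements it; in analytics.js the setting is, as far as I am aware, a single sampleRate field. The function name in_sample and the hashing scheme are assumptions made for the example. The point it shows is that the decision is made once per visitor, so a sampled visitor's whole journey is kept intact.

```python
import hashlib


def in_sample(client_id, sample_rate_percent):
    """Return True if this visitor falls inside the sample.

    The visitor ID is hashed into one of 10,000 buckets, so the same visitor
    always receives the same decision on every page and every visit.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10000  # 0 to 9999
    return bucket < sample_rate_percent * 100


# Hypothetical visitor IDs, used only to show the effect of a 10% sample.
visitors = ["visitor-{}".format(n) for n in range(10000)]
kept = [v for v in visitors if in_sample(v, 10)]
share = len(kept) / len(visitors)
print("Share of visitors tracked: {:.1%}".format(share))
```

Because roughly one visitor in ten is tracked, the data volume falls by about 90%, and reported totals need to be scaled up accordingly when they are read.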

As the name suggests, a virtual pageview is analogous to a standard pageview, except that you define the path, filename, and page title sent to Google Analytics. This can be useful in two ways:
1 Renaming existing pages—if your website has meaningless page names that make no sense to anyone reading your reports, you can rename these at the point of data collection.
2 Faking new pages—if a standard pageview is not tracked by default for the visitor’s action, you can create one.
Consider the following standard pageviews captured by Google Analytics from the visitor’s browser for a fictional leather store:
/section-1/sectionA/productID54?sku=123
/section-2/sectionB/productID86?sku=456
To anyone reading your reports, these pageviews are meaningless. They convey no information to the reader as to what content is being viewed by visitors; you would need a lookup table to translate what the section, productID, and sku values refer to. With some simple changes to the GATC,[8] these can be renamed into something much more readable in your reports.
Consider these alternative virtual pageviews for the same content:
/shoes/female/high heeled boot?size=38&color=black
/jackets/male/sports?size=large&color=brown
A report containing these virtual (renamed) pageviews is much more informative than the ones captured by default. The same method can also be applied to modify the page title. A smart web developer will be able to automate this process—renaming of default paths and filenames into more readable virtual paths and virtual filenames.
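The automation mentioned above might look something like the following Python sketch. It assumes the developer has lookup tables that translate each section, product ID, and SKU into a readable name; the tables shown here are hypothetical and contain only the two leather-store examples. Pages that do not match the expected pattern are passed through unchanged.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical lookup tables for the fictional leather store.
SECTIONS = {"section-1": "shoes", "section-2": "jackets"}
SUBSECTIONS = {"sectionA": "female", "sectionB": "male"}
PRODUCTS = {"productID54": "high heeled boot", "productID86": "sports"}
SKUS = {"123": "size=38&color=black", "456": "size=large&color=brown"}


def virtual_path(raw_path):
    """Translate a default page path into a readable virtual path."""
    parts = urlsplit(raw_path)
    try:
        section, subsection, product = parts.path.strip("/").split("/")
        sku = parse_qs(parts.query)["sku"][0]
        return "/{}/{}/{}?{}".format(
            SECTIONS[section], SUBSECTIONS[subsection], PRODUCTS[product], SKUS[sku]
        )
    except (KeyError, ValueError):
        # Unknown pattern or missing lookup entry: keep the original path.
        return raw_path


for path in ("/section-1/sectionA/productID54?sku=123",
             "/section-2/sectionB/productID86?sku=456"):
    print(path, "->", virtual_path(path))
```

The renamed path is what would be sent to Google Analytics in place of the default one, so the translation happens at the point of data collection and reports are readable without a lookup table.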
To illustrate faking new pages, consider an information request form on your website, with a pageview of
/contact/form-1.php
I often discover that content management systems do not modify the pageview when visitors submit the form. That is, the URL does not change in their browser when they have completed their action. This is obviously a problem for Google Analytics—it cannot differentiate between a form view and a form submission. However, such a submission is an important engagement that has a great deal of value to your organization—it is how anonymous visitors identify themselves (gold dust!). To overcome this problem, use a virtual (fake) pageview. For example, when the form is submitted, send one of the following virtual pageviews to Google Analytics:
/contact/form-1.php?submit=success
/contact/form-1.php?submit=fail
Having these additional pageviews allows Google Analytics to calculate your form conversion and failure rate. This tells you how good your website is at soliciting a contact from potential new customers.
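As a worked illustration of that calculation, the Python sketch below counts the three pageviews and derives the two rates. The numbers are invented for the example, and the definition used (successes or failures divided by views of the form) is one reasonable assumption; Google Analytics reports may define the rate per visit rather than per pageview.

```python
from collections import Counter

FORM_VIEW = "/contact/form-1.php"
FORM_SUCCESS = "/contact/form-1.php?submit=success"
FORM_FAIL = "/contact/form-1.php?submit=fail"


def form_rates(pageviews):
    """Return (conversion_rate, failure_rate) as fractions of form views."""
    counts = Counter(pageviews)
    views = counts[FORM_VIEW]
    if views == 0:
        return 0.0, 0.0  # the form was never viewed, so no rate can be computed
    return counts[FORM_SUCCESS] / views, counts[FORM_FAIL] / views


# Invented sample: 10 form views, 3 successful submissions, 1 failed submission.
sample = [FORM_VIEW] * 10 + [FORM_SUCCESS] * 3 + [FORM_FAIL] * 1
conversion, failure = form_rates(sample)
print("Conversion rate: {:.0%}, failure rate: {:.0%}".format(conversion, failure))
```

Without the two virtual pageviews, only the first count would exist and neither rate could be calculated.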
More examples of using virtual pageviews to improve and enhance your Google Analytics reporting are discussed in Chapter 4 (in ebook 1).
Pageviews (both real and virtual) are for recording how a visitor navigates your content. Events are best for recording how users interact with that content.
An event is an interaction on your website that is not considered a pageview. For example, links to file downloads, outbound links, and interactions with embedded video are not tracked by default because they do not load a webpage. Hence, it makes sense to categorize such interactions differently. This is what event tracking is for. Other examples of event tracking are discussed in Chapter 4 (in ebook 1). As events are not tracked by default in Google Analytics, they require a modification to the GATC on your pages in order to take effect.[9]

Given the choice of two different tracking methods, what circumstances make it appropriate to use a virtual pageview, and what circumstances make it more appropriate to use an event?
Using virtual pageviews to capture actions such as form submissions or outbound links obviously inflates your pageview count. However, if the action you are tracking can be considered analogous to viewing a page of content, then a virtual pageview is valid. In my opinion, this is the case for readable content such as the thank you page of a form submission and the click on an outbound link (a page on a different website). That is, from a visitor’s perspective the file format is irrelevant—it is simply content to them, in the same way your other HTML pages are content.
Use event tracking when the action being tracked is not related to a pageview, for example, the downloading of a ZIP or EXE file, the playing of an embedded video, adding an item to a shopping cart, or the interaction with an in-page widget such as a loan calculator.
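The rule of thumb in the two paragraphs above can be written down as a small decision function. The Python sketch below is one possible encoding of it, not a prescribed implementation: the action names, the "/virtual/" path prefix, and the event action value "triggered" are all hypothetical choices made for illustration. It returns the hit parameters that would be sent, using the pageview fields (t, dp) or the event fields (t, ec, ea, el) as appropriate.

```python
# Actions treated as analogous to viewing a page of content (hypothetical names).
PAGEVIEW_ACTIONS = {"form_thank_you", "outbound_link"}
# Actions that are interactions rather than page views (hypothetical names).
EVENT_ACTIONS = {"file_download", "video_play", "add_to_cart", "widget_use"}


def build_hit(action, target):
    """Return hit parameters: a virtual pageview or an event, by action type."""
    if action in PAGEVIEW_ACTIONS:
        return {"t": "pageview", "dp": "/virtual/{}/{}".format(action, target)}
    if action in EVENT_ACTIONS:
        return {"t": "event", "ec": action, "ea": "triggered", "el": target}
    raise ValueError("No tracking rule defined for action: {}".format(action))


print(build_hit("outbound_link", "example.com"))
print(build_hit("file_download", "setup.zip"))
```

Keeping the rule in one place means every page applies the same policy, which keeps pageview counts and event counts consistent across the site.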
What About Readable File Downloads—PDF, DOC,…?
For this scenario there is an overlap in potential tracking methods—a PDF file is both a file download and readable content. Therefore, both virtual pageview and event tracking methods are valid. If all PDF files on your website are of a similar value to you and