Tracking UI Level Links: An Open Source Script

One challenge with today's complex site designs is that multiple links to the same destination tend to appear on the same page.

This makes it hard to understand how your page real estate is being used without resorting to really fancy analytics packages.

To solve this problem I developed a script that, upon every click, walks up the document object model (DOM) looking for a ui attribute on an HTML tag.

If it finds one before it hits the BODY tag, then it adds a parameter to the link called ui with the value of that attribute, allowing you to understand which links on a page are being used. Shown below is a report from Google Analytics for this site showing how people arrive at my bio page:


If you visit my bio from the About menu at the top of my blog, it adds ui=nav.

I've licensed the script under the MPL and you're welcome to use it on your site, providing you share any enhancements. Grab it at http://alwaysbetesting.com/abtest/includes/logger.js.

All you have to do is include the script on your page and add ui=element_name attributes to key HTML elements that make up your site structure.
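
For the curious, here is a minimal sketch of the approach described above. It is not the actual logger.js; the function names findUiValue, findLink, and addUiParam are made up for illustration, and the skip-if-already-tagged behavior is my assumption about what you would want on repeat clicks.

```javascript
// Walk up from a node looking for a `ui` attribute,
// stopping when we reach BODY or run out of parents.
function findUiValue(node) {
  while (node && node.tagName !== "BODY") {
    if (typeof node.getAttribute === "function") {
      const value = node.getAttribute("ui");
      if (value) return value;
    }
    node = node.parentNode;
  }
  return null;
}

// Find the nearest enclosing A element for a clicked node, if any.
function findLink(node) {
  while (node && node.tagName !== "A") node = node.parentNode;
  return node || null;
}

// Append ui=<value> to an href, preserving any query string and fragment.
// Assumption: if a ui parameter is already present, leave the href alone.
function addUiParam(href, uiValue) {
  if (/[?&]ui=/.test(href)) return href;
  const hashIndex = href.indexOf("#");
  const base = hashIndex === -1 ? href : href.slice(0, hashIndex);
  const hash = hashIndex === -1 ? "" : href.slice(hashIndex);
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "ui=" + encodeURIComponent(uiValue) + hash;
}

// Browser wiring: rewrite the link only at click time.
// Skipped when no document is available (e.g. when run outside a browser).
if (typeof document !== "undefined") {
  document.addEventListener("click", function (event) {
    const link = findLink(event.target);
    if (!link || !link.getAttribute("href")) return;
    const uiValue = findUiValue(link);
    if (uiValue) {
      link.setAttribute("href", addUiParam(link.getAttribute("href"), uiValue));
    }
  });
}
```

With markup like a div carrying ui=nav around the About menu, a click on the bio link inside it would turn /bio into /bio?ui=nav, while the same link elsewhere on the page keeps its own ui value or none at all.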

Because this happens in JavaScript, and only when the user clicks, there's no danger of creating duplicate content for the search engine spiders. Alas, that also means it doesn't fix the typical analytics overlay views for duplicate links, but you can typically see the source element in your path reports.

Google Analytic Gems #2: Quantifying Deliberative Conversions

Another little-known gem in Google Analytics is the pair of reports Time to Purchase and Visits to Purchase.

For MyWeddingFavors, where the purchase is an exceptionally meaningful one for our customers, some 40% of our sales happen on subsequent visits:

Careful though! Looking into Days to Purchase, we see only 25% of sales happen on a different day than the customer's first visit.



So about 15% of our sales (the 40% on subsequent visits minus the 25% on a different day) come from shopping experiences stretched across several visits within a single day, while only 25% happen on a subsequent-day visit. Understanding this pattern has some serious implications for design, business strategy, and e-commerce feature set.

Google Analytic Gems #1: Split Test Evaluation with Only New Users

A question on LinkedIn, now closed, asked "what are some hard to find but useful reports in GA?" While clicks to task completion is far from the ultimate metric, Google Analytics does bury some specific data points a few clicks deeper than is comfortable.

We do a lot of split testing with Google Analytics by defining custom segments and serving each segment a different UI. One issue with understanding the impact of new features, particularly for sites with lots of repeat visitors (e.g. content / blog sites vs ecommerce), is the novelty effect. New features, or even simple changes in layout, can have a short term halo as users notice and engage with the changed content.

Google Analytics does allow you to look at your user defined segments for new and repeat visitors, but it does require a few clicks. Follow along with the picture:


Starting in the visitors submenu (1), the New & Returning report allows you to drill into New users (2). The segment drop down has lots of useful pivots, including "user defined" (3).

Picture #4 shows the results for new users of a split test that moved a mailing list subscribe box from left to right. The magnitude of the effect diminished over time as we tested this. However, by drilling into only new users, we see the original effect size. Looking at all users, the effect is smaller, and looking at returning users, the effect is smaller still.
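
If you export the segment counts, the same comparison can be sketched in a few lines. The counts below are hypothetical, invented purely to illustrate the pattern, and relativeLift and liftBySegment are names I made up for this example.

```javascript
// Relative lift of the test variant over control:
// (test conversion rate - control conversion rate) / control conversion rate.
function relativeLift(control, test) {
  const controlRate = control.conversions / control.visitors;
  const testRate = test.conversions / test.visitors;
  return (testRate - controlRate) / controlRate;
}

// Compute lift for each segment, plus a pooled "all" figure across segments.
function liftBySegment(segments) {
  const result = {};
  const pooled = {
    control: { visitors: 0, conversions: 0 },
    test: { visitors: 0, conversions: 0 },
  };
  for (const [name, arms] of Object.entries(segments)) {
    result[name] = relativeLift(arms.control, arms.test);
    for (const arm of ["control", "test"]) {
      pooled[arm].visitors += arms[arm].visitors;
      pooled[arm].conversions += arms[arm].conversions;
    }
  }
  result.all = relativeLift(pooled.control, pooled.test);
  return result;
}

// Hypothetical subscribe-box counts, NOT real data from the test described above.
const exampleSegments = {
  newVisitors: {
    control: { visitors: 1000, conversions: 20 },
    test: { visitors: 1000, conversions: 30 },
  },
  returningVisitors: {
    control: { visitors: 1000, conversions: 40 },
    test: { visitors: 1000, conversions: 42 },
  },
};

const exampleLifts = liftBySegment(exampleSegments);
console.log(exampleLifts);
```

In this made-up data the lift is largest for new visitors, smaller for all users pooled, and smallest for returning visitors, which is the shape you would expect when returning visitors have already adapted to the change.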

Dealing with the halo effect is one of the reservations that was expressed during "AB Testing: Designer Friend or Foe" at SXSW. Splitting users into new and returning is one of the easiest strategies for seeing through this confound.

Getting Serious About Testing: Learn from the Pros

Last week's SXSW panel on AB Testing: Designer Friend or Foe left me wishing for a more robust treatment of the experimental design issues around online testing. It was a great panel, and I appreciated the real world experience of the panelists, but aside from Micah, the approach was very much from the design world. This is fine, but issues came up that statistics exists to solve, and the distinction between multivariate and AB testing was glossed over.

In particular, designed well, multivariate testing can be used to test hypotheses about user models, not just as a way to play roulette with font colors and sizes.

There is a robust body of knowledge that lives at the intersection of statistics, traditional experimental psychology, and cognitive modeling, and that rests on the shoulders of giants in practical business success through experimentation.

The Exp Platform team at MSFT, led by Ronny Kohavi, publishes from this position of strength. Their latest, 7 pitfalls to controlled experiments on the web, is a solid read for those aspiring to live in this space.

AB testing might indeed be a foe to the designer when done without appropriate expert support -- at least for more aggressive evaluations.

Here's a recap of the Seven Pitfalls:

  1. Avoiding experiments because computing the success metric is hard.
  2. Attempting to run experiments without the pre-requisites: representative & sufficient traffic, appropriate instrumentation, agreed upon metrics.
  3. Hubris: Over-optimization without collecting data along the way.
  4. Bad math: inappropriately deployed confidence intervals, % change, and interaction effects.
  5. Use of composite metrics when power is insufficient. An example, not in the paper, is the use of checkout completion for a product page change, when add-to-cart % would be more sensitive.
  6. Uneven sampling: bad balance between control and test distributions.
  7. Lack of robot detection.
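
To make pitfall 4 a little more concrete, here is a sketch of one common textbook approach: a normal-approximation 95% confidence interval for the difference between two conversion rates. This is not taken from the paper, the function name conversionDifferenceInterval is made up, and the approximation assumes reasonably large samples in both arms.

```javascript
// Normal-approximation (Wald) interval for the difference in conversion rates
// between a test arm and a control arm. z = 1.96 corresponds to roughly 95%.
function conversionDifferenceInterval(control, test, z = 1.96) {
  const pControl = control.conversions / control.visitors;
  const pTest = test.conversions / test.visitors;
  // Standard error of the difference between two independent proportions.
  const standardError = Math.sqrt(
    (pControl * (1 - pControl)) / control.visitors +
      (pTest * (1 - pTest)) / test.visitors
  );
  const difference = pTest - pControl;
  return {
    difference: difference,
    low: difference - z * standardError,
    high: difference + z * standardError,
  };
}

// Hypothetical counts for illustration only.
const interval = conversionDifferenceInterval(
  { visitors: 1000, conversions: 200 },
  { visitors: 1000, conversions: 250 }
);
console.log(interval);
```

If the interval excludes zero, the observed difference is unlikely to be noise at that confidence level. Reporting the interval on the absolute difference, rather than a bare % change, avoids some of the misreadings the pitfall describes.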

I've blogged the guide to practical web experiments and it's also highly recommended. It provides an overview of the key issues to deal with in setting things up including sampling, failure versus success evaluation, and common pitfalls like day of the week effects.

For a historical design perspective from SXSW '05, see How to Inform Design: How to Set Your Pants on Fire, presented March 14th, 2005 by Nick Finck, Kit Seeborg, and Jeffrey Veen.

Design Metrics Wrapup

What fun! The SXSW conversation format is quite cool, though it really needs a dedicated space as our group size was limited by how far voices carried.

Drop a line in the comments if we promised to follow up on something I haven't posted. This is a work in progress, so check back if there are some empty items when you visit.

References during the chat

Blogs

Resource Lists

Analytics

Usability Training

Books

...

Thanks to everyone who participated, and to Micah for bringing me along.

SXSW: Driving Design From User Data

I wrote about the crucial conversation at SXSW with Micah Alpern a few weeks ago. The time has come!

In talking through this with Micah, we came back to the crucial insight that the availability of artifacts of the usage of internet software creates an opportunity and challenge for designers. What follows is a reference for our conversation, which will include a short intro and mostly conversation. Subject to conversational flow, we'll be asking the participants to share stories:

  1. What's your favorite HIPPO story? For those of you who haven't encountered the hippo meme, it's about decision making based upon something more than the highest paid individual's personal opinion.
  2. What business or user goal would you like to be informed by metrics?

The talk precis:

Design Metrics: Better Than 'Because I Said So':

Too often designers are put in a position of defending design decisions based on personal preference or an unarticulated sense of expertise. We'll discuss how to use metrics to understand user and business goals. Then how these metrics can be used to evaluate design decisions, make tradeoffs, and shape strategies.

Our goal is to better enable productive conversations with key stakeholders, using the tools of metrics to understand and advocate a position.

In the most productive cases, this means designing with measurement toward end goals in mind. In less developed scenarios, there may be some foundations in need of construction.

There are a lot of reasons to test designs with live users. The most pedestrian is business acceptance testing. We'll be more focused on using metrics to resolve internal debate and on multivariate testing to learn more about the motivations, mental models, and personas of users, as well as value estimation.

We believe the "Role of Designer" is to drive hypotheses about the user, to internalize results, and to use them to inform future design.

Of course, testing is not the only tool in the toolbox. You have to choose the right tool for the job. Key dimensions:

  • Quantitative vs. qualitative
  • Small vs. large scale
  • Advanced techniques: sequence modeling, learnability metrics
  • Repeat vs. non-repeat visitor

That said, creative techniques with Greasemonkey or limited-scale prototypes can make testing available in situations where you might not think it's possible.

We're scheduled for 11:30 AM in Ballroom E on Monday. Hope to see you there. If you can't make it, stay tuned for a follow-up.

SXSW Coming Up! Design Metrics: Better Than 'Because I Said So'

I'm greatly looking forward to SXSW 08 in a couple of weeks. I'll be doing a "core conversation" with Micah Alpern:

Core Conversation: Design Metrics: Better Than 'Because I Said So': Too often designers are put in a position of defending design decisions based on personal preference or an unarticulated sense of expertise. We'll discuss how to use metrics to understand user and business goals. Then how these metrics can be used to evaluate design decisions, make tradeoffs, and shape strategies.

While design efficacy can be treated as a contributor to overall site success, there are some more subtle metrics which can reveal specific strengths and weaknesses of design. I'll post a recap following the gig.

Online Video Metrics: How to Deal with Scrubbing?

Over at webmetricsguru.com, Marshall quotes the following key video metrics from Dennis @ Visual Revenue:

9 Essential Online Video Metrics

  • Online video started
  • Online video Pre-roll advertisement started*
  • Online video core content started
  • Online video Post-roll advertisement started*

  • Online video positive consumption action
  • Online video negative consumption action

  • Online video ended
  • Online video played, percentage of total
  • Online video played, seconds

As another blogger points out, things get really interesting when you start to consider embedded videos.

There is a challenge that neither of these authors mentions -- what about user timeline scrubbing? Video complete doesn't mean the same thing if the user fast forwarded through most of it. Logging total time, % viewed, and complete gives you a bit of insight into this. Consider this range of user behavior:

Description | Total Time Played | % viewed | Complete
Full view | 12:00 | 100% | Yes
Fast forward to watch a 2 minute segment | 2:18 | 18% | No
Screencast how-to view with pause, play actions while following instructions | 16:00 | 110% | Yes
Quick scan, fast forward, watch, etc. | 5:00 | 24% | Yes

There are a lot of subtleties here: Do you double credit re-watching to allow > 100% viewed? If so, you confuse the real meaning of %. It's a good justification for logging % in addition to time, as otherwise, you could simply compute % as a normalization of user behavior across different video lengths.
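
One way to sidestep the double-credit question is to log both numbers. The sketch below is not the FreeIQ player's logging; it assumes a hypothetical log of playback segments as [startSeconds, endSeconds] pairs, and summarizePlayback is a name made up for this example. It reports total time played (re-watching counts twice), the % of the timeline actually covered (re-watching counts once), and whether playback reached the end.

```javascript
// Summarize a viewing session from continuous playback segments.
// segments: array of [startSeconds, endSeconds]; durationSeconds: video length.
function summarizePlayback(segments, durationSeconds) {
  // Total time played: every second of playback counts, including re-watching.
  const totalPlayedSeconds = segments.reduce(
    (sum, [start, end]) => sum + (end - start),
    0
  );

  // Unique coverage: merge overlapping segments so each second counts once.
  const sorted = segments.slice().sort((a, b) => a[0] - b[0]);
  let uniqueSeconds = 0;
  let currentStart = null;
  let currentEnd = null;
  for (const [start, end] of sorted) {
    if (currentEnd === null || start > currentEnd) {
      if (currentEnd !== null) uniqueSeconds += currentEnd - currentStart;
      currentStart = start;
      currentEnd = end;
    } else if (end > currentEnd) {
      currentEnd = end;
    }
  }
  if (currentEnd !== null) uniqueSeconds += currentEnd - currentStart;

  return {
    totalPlayedSeconds: totalPlayedSeconds,
    percentPlayed: (100 * totalPlayedSeconds) / durationSeconds, // may exceed 100
    percentUnique: (100 * uniqueSeconds) / durationSeconds, // capped by coverage
    complete: segments.some(([, end]) => end >= durationSeconds),
  };
}

// Hypothetical session on a 12 minute video: skim the intro, then jump to the last two minutes.
console.log(summarizePlayback([[0, 30], [600, 720]], 720));
```

A scrubbing viewer like the one in the usage line shows up as complete with a low unique percentage, while a how-to viewer who replays steps shows percentPlayed above 100 with percentUnique at or below 100.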

We've created custom logging in the FreeIQ video player, both for the video embedder (who uses Google Analytics) and the management of the FreeIQ site. We simply log complete, but are working on an efficient way to capture some of the subtleties here.

From this logging, we computed an average 25% video completion for our Going Natural 2 series videos -- not bad given that these are greater than 20 minutes in length.

Eye Miles: Measuring Mental Effort and Design Quality with Eye-Tracking

While eye-tracking is incredibly useful for understanding how humans interact with computers, and websites in particular, it's something of a holy grail to be able to instantly interpret from eye tracking data whether a design is good or bad.

There's a solid amount of prior art here, but no widely practiced & easily obtainable top level metrics. Check out Table 1 from a HCI2007 paper by Ehmke & Wilson titled "Identifying Web Usability Problems from Eye-Tracking Data" for a complete review of metrics considered.

While analyzing the data for the Scrutinizer Click Fu video for a study we did on our Tobii eye-tracker, we computed eye miles for a couple of participants. Here's what we came up with:

This is the distance the eye traveled during a Question/Answer study in which users were asked to choose a search result that had or led to an answer to the question provided.

This "eye miles" metric, in concept if not in name, seems to have made its first appearance in 2002: Goldberg, J. H., Stimson, M. J., Lewenstein, M., Scott, N., and Wichansky, A. M. Eye tracking in web search tasks: Design implications. In Proceedings of the Eye tracking research and applications symposium (ETRA 2002). ACM Press, New York, NY, 2002, 51-58. Personally, I released an analytics system for measuring mousemiles in 2001. A more robust metric would, to carry the analogy further, distinguish between highway and city miles. For eye tracking, where major and minor saccades (i.e. long and short) indicate very different underlying cognitive functions, this might be especially informative.
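
As a rough sketch of what the highway versus city split could look like, the function below sums the distance between consecutive fixation points and buckets each jump by length. The fixation format ({x, y} in pixels), the threshold value, and the name eyeMiles are all assumptions for illustration; this is not the analysis we ran on the Tobii data.

```javascript
// Sum the gaze path length between consecutive fixations, split into
// short ("city") and long ("highway") saccades by a distance threshold.
// fixations: array of {x, y} points in pixels, in time order.
function eyeMiles(fixations, longSaccadeThreshold) {
  let total = 0;
  let city = 0;
  let highway = 0;
  for (let i = 1; i < fixations.length; i++) {
    const distance = Math.hypot(
      fixations[i].x - fixations[i - 1].x,
      fixations[i].y - fixations[i - 1].y
    );
    total += distance;
    if (distance >= longSaccadeThreshold) {
      highway += distance;
    } else {
      city += distance;
    }
  }
  return { total: total, city: city, highway: highway };
}

// Hypothetical fixations: two short hops followed by one long jump.
const gazePath = [
  { x: 0, y: 0 },
  { x: 3, y: 4 },
  { x: 3, y: 5 },
  { x: 33, y: 45 },
];
console.log(eyeMiles(gazePath, 10));
```

The threshold is the judgment call here. A real analysis would want it expressed in degrees of visual angle rather than pixels, since pixel distance depends on screen size and viewing distance.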

We'll keep on searching for the best diagnostic metrics from eye-tracking. Perhaps some of them will apply to user activity with our gaze simulating web browser as well?

Google Analytics Site Search: Usability & Business Goals

So Google Analytics site search features have been out for a week. Here's a unique insight that can be garnered by studying search queries from a key location, the shopping cart page, in an e-commerce site.

This collage shows a drilldown view from Content -> Site Search -> Start Pages to the search terms that were issued from our add to cart page in the last week. We see "invitations" and "camera", where it seems the wedding participant-to-be is trying to knock out a few more items from the to-do list.

While upsells can be done poorly, as described in Top SURL's 10 Shopping Cart Design Errors, good web site design is not just about usability. It's about maximizing the intersection of the user goals and the business goals. If some users are slightly distracted by an upsell that a smaller number find appealing, and that contributes to the bottom line (i.e. it's not obnoxious enough to cause users to bail on the process), that's solid user experience design. SURL cautions against forced upsell "interstitials" during the add to cart process. Best practice is to display these upsells on the cart result page, not as a forced interruption during the process.

The new site search features have already contributed to the bottom line in our e-commerce business. My full analysis, training, and case studies are available to members of StomperNet but I'll be sharing some of the more unique insights here in the future.


Built with BlogCFC, version 5.9. Contact Andy Edmonds or read more at Free IQ or SurfMind. © 2007.