With the announcement of updates to the Google referral string, it's time to update the script, and there are some serious new features.
For those unfamiliar with the SEO Position script, in the picture above/right, the query "analytics motion charts" generated a click from Page 1 of the Google results while "motion charts" generated a click from Page 2.
SEO Position Plus tracks more types of referrals than the original script. All the categories logged to Google Analytics events are prefixed with SEO.
Mark at MivaMerchant wrote up a great tutorial (Stomper Members: See my video in the portal).
So that's all cool and useful, but the last item, "SEO Google Position", captures the exact rank of the Google result that was clicked to generate the visit. The announcement says the referral change is being rolled out, and we're currently seeing it on only 1/10 to 1/40 of traffic.
I expect the data volume to increase and for this to provide much better data on ranking, ranking changes, and the ROI on ranking changes.
Here's what you'll get when a referral comes through with the new cd parameter:
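You'll see an event logged under the "SEO Google Position" category carrying the rank of the click. Under the hood, the rank comes straight out of the referrer; here's a minimal sketch of that extraction (not the full SEOpositionPlus source; it assumes the legacy ga.js pageTracker object is already on the page):

```javascript
// A minimal sketch of the cd extraction -- not the full SEOpositionPlus
// source. Assumes the legacy ga.js pageTracker object is already on the page.
var ref = document.referrer;
if (/^https?:\/\/www\.google\./.test(ref)) {
  var cd = ref.match(/[?&]cd=(\d+)/);   // rank of the result that was clicked
  var q  = ref.match(/[?&]q=([^&]*)/);  // the query itself
  if (cd && q) {
    var keyword = decodeURIComponent(q[1].replace(/\+/g, ' '));
    // category, action, label, value -- the rank goes in the value slot
    pageTracker._trackEvent('SEO Google Position', keyword,
                            document.location.pathname, parseInt(cd[1], 10));
  }
}
```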
The script is available at /abtest/includes/seopositionplus.js and is released as open source under the Mozilla Public License. Use it for free for whatever you like, but if you make it better, you have to share!
Place the script following your Google Analytics code: copy the SEOpositionPlus file to your server and add the following line beneath your call to pageTracker:
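The placement looks roughly like this; the entry-point name below is a hypothetical placeholder, so use the exact line from the post:

```javascript
// Placement sketch only. "seoPositionPlus" is a hypothetical placeholder for
// the call given in the original post -- substitute the real line.
var pageTracker = _gat._getTracker("UA-XXXXXX-1");  // your existing GA setup
pageTracker._trackPageview();
// ...beneath the pageTracker call, after including seopositionplus.js:
seoPositionPlus(pageTracker);  // hypothetical entry point
```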
The most tantalizing aspect of this is the notion of a "fast index", perhaps in-memory on many servers, dedicated to indexing (and computing authority or PageRank) for rapidly moving content like Digg and YouTube video honors. In general, with Twitter bubbling up, the notion of real-time search is focusing the industry on one of Google's key relevance metrics: freshness.
As I've been compiling my thoughts on this, I created a nifty Prezi with some observations on Jeff Dean's content.
Some 10 years ago, Google had to flip indexes to accomplish updates. This is described as happening on a per-machine basis. We've seen increases in the speed of updates, but recently the degree to which Google is paying attention to fast-moving social media suggests a revolutionary speedup.
Jeff's talks hint at some of the mechanisms.
Take Away
I'm still pretty early in assessing the impact of a new understanding of the underlying mechanisms, but I'll offer one hypothesis: Google is now capable of detecting the duration a link lives on a "hot list" like Digg's upcoming page. This means a successful social media promotion can have a much greater effect than simple social media participation.
We see the number of Diggs affect how long it takes for a Digg permalink page to fall out of the top rankings. It's likely that the anchor text, or title of the Digg, is added to the index record for the page -- so pick your social media link text very carefully. It's also likely the thing that has the biggest impact on the long-term effect of social media promotion.
A word from our Sponsor
Need more search engine success? We give you the basics and the hard-hitting science in the Stomping the Search Engines 2 DVD course. I teach the "understanding search engines" segment and try to walk a fine line between the basics and deep, long-term insights.
Get it for just $1 when you try the Net Effect magazine from StomperNet. It covers traffic, conversion, social media, business building and operations, and more.
The SEO Position script now supports Yahoo, MSN & Live, in addition to Google. The name of the event has changed from "Google SEO" to "Google". Thanks to Jim M. from Bunk Beds Now for the help expanding the script.
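For a sense of what the multi-engine support involves, the referrer-to-keyword mapping looks roughly like this (an illustration only; the real parsing lives in the script itself):

```javascript
// Illustration of the referrer-to-keyword mapping -- the real parsing lives
// in the script itself. Each engine keeps the query in a different parameter.
var engines = {
  'google.':       'q',   // google.com, google.co.uk, ...
  'search.yahoo.': 'p',
  'search.msn.':   'q',
  'search.live.':  'q'
};
function searchReferrer(referrer) {
  for (var host in engines) {
    if (referrer.indexOf(host) !== -1) {
      var m = referrer.match(new RegExp('[?&]' + engines[host] + '=([^&]*)'));
      if (m) {
        return { engine: host,
                 keyword: decodeURIComponent(m[1].replace(/\+/g, ' ')) };
      }
    }
  }
  return null;  // not a recognized search referral
}
```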
Alas, it was not originally apparent that logging a source event will flip the bit on "user bounce" and deflate your bounce rate for a page. Rapid detection of near-page-1 rankings may be worth the trade-off.
I've also updated the UI Region Logger to use eventing instead of synthetic page views. Just flip the ui_useEventing boolean to true in the script source.
Read more in the original post, but to recap, this script requires that you tag key areas of your interface with a UI attribute. For example, you'd add ui="sidebar" to the div that contains your sidebar. With every click, the script walks up the DOM and checks whether there's a UI label above it. I'm looking forward to providing a tool to do a heatmap-style visualization of click regions once I build up a good data set with this one. The data is much easier to isolate than in the prior exit link mode.
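For the curious, the DOM walk is roughly this shape (a minimal sketch, not the actual UI Region Logger source; it assumes the legacy ga.js pageTracker object and regions tagged with a ui attribute, as described above):

```javascript
// A minimal sketch of the DOM-walking approach described above -- not the
// actual UI Region Logger source. Assumes the legacy ga.js pageTracker object
// and that regions are tagged with a ui="..." attribute.
document.addEventListener('click', function (e) {
  var node = e.target;
  // Walk up from the clicked element until we find a tagged ancestor.
  while (node && node !== document.body) {
    var region = node.getAttribute && node.getAttribute('ui');
    if (region) {
      // Log the click as an event rather than a synthetic page view.
      pageTracker._trackEvent('UI Region', region, document.location.pathname);
      break;
    }
    node = node.parentNode;
  }
}, false);
```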
The image shows the first data from this site, with one click on the tag cloud in the right sidebar and two clicks in the tools menu.
TheRarestWords.com is an impressive little web hack from an enthusiast artisan. The application attempts to identify the rarest words on the page.
Why would you care? First, understanding where you veer from the mainstream is quite interesting. It's a great way to find misspellings, people's names, and other less pedestrian rarities.
An Aside: Language is thought to be infinitely generative. Perhaps every human utters a huge number of completely unique statements, ignoring person and location names to give the computation a fighting chance. Hard to say... likely a typical long tail situation, maybe with a little more tail.
The buzzword in IR is TF-IDF, or term frequency-inverse document frequency. This is a method for giving more importance to the less common words in a document that match the query. Mid-range frequency words get discounted, even though, if the page is truly relevant, they're likely key terms and often repeated.
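To make the mechanics concrete, here's a toy TF-IDF scorer (an illustration of the general idea only, not how any particular engine or TheRarestWords implements it):

```javascript
// A toy TF-IDF scorer to make the idea concrete (illustration only,
// not how any particular engine implements it).
function tfidf(term, doc, corpus) {
  var tf = doc.filter(function (w) { return w === term; }).length / doc.length;
  var docsWithTerm = corpus.filter(function (d) {
    return d.indexOf(term) !== -1;
  }).length;
  var idf = Math.log(corpus.length / (1 + docsWithTerm));
  return tf * idf;   // terms that are rare across the corpus score higher
}

// Hypothetical three-document corpus: "jade" is rarer than "apple",
// so it scores higher within the first document.
var corpus = [
  ['jade', 'apple', 'tree', 'care'],
  ['apple', 'tree', 'pruning'],
  ['apple', 'pie', 'recipe']
];
tfidf('jade',  corpus[0], corpus);  // higher
tfidf('apple', corpus[0], corpus);  // lower
```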
Rarest Words at AlwaysBeTesting
Moving beyond term frequencies gets you to n-grams and the requirement to recognize frequencies of multi-word segments. Here, basic part-of-speech tagging and related tech can really help reduce the problem set -- or you can go the hard way and implicitly capture what part-of-speech tagging gives you by crunching huge quantities of a language's text. Google has published an n-gram database built from a corpus of 1,024,908,267,229 word tokens, spanning 13,588,391 unique words with frequencies over 200. They don't report how big a web crawl generated this database.
Think about "jade apple tree". Jade is going to be truly rare. If you do n-grams, you can detect that "apple tree" is a common two-word pattern and give credit for the infrequency of the full co-occurrence. I'll return to the impact of the degree of common use of a word in search at the end.
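A toy bigram counter shows how that detection works (again, an illustration, not Google's n-gram pipeline):

```javascript
// A toy bigram counter to illustrate the idea (not Google's n-gram pipeline).
function bigramCounts(tokens) {
  var counts = {};
  for (var i = 0; i < tokens.length - 1; i++) {
    var gram = tokens[i] + ' ' + tokens[i + 1];
    counts[gram] = (counts[gram] || 0) + 1;
  }
  return counts;
}

// Across a large corpus, "apple tree" shows up often while "jade apple"
// stays rare, so a scorer can credit "jade" rather than the whole phrase.
bigramCounts('the jade apple tree by the old apple tree'.split(' '));
// => { 'the jade': 1, 'jade apple': 1, 'apple tree': 2, ... }
```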
The tool exposes your most unique word uses and your uses of very common words that don't quite approach the stop word list level.
TechCrunch brought TheRarestWords.com to my attention. I ran it on my analytics and e-commerce focused blog, AlwaysBeTesting.com. It returned an interesting set of related blog suggestions based upon rarest words. The author is quite humble about the feature, but it's an interesting hack!
- LukeW's Functioning Form shares a focus on eye-tracking and user experience in online transactions, while ROI Revolution focuses on Google Analytics and transaction modeling.
- Things get interesting with BillRempel.com -- a numbers junkie in stock trading.
- A social software site and a beyond-blogging site both make sense and are perhaps something of a random sample, given TheRarestWords' partial database of word frequencies.
- Blog conversation is exposed in the link to my fellow StomperNet Faculty Dan Thies' blog.
Some of the other suggestions were a bit more off-kilter. Perhaps due to a bit of a word fetish, I turned up a few oddball matches. Rare words on my site include deliberative, sxsw, subjective, onerous, and quantifying. Some of the common words are lack, cool, level, solution, opinion, and quick.
A categorization feature produces things that don't quite look like categories but make sense nonetheless: Use Cases, Web Designing, Understanding, Designer, Internet Business, Toolbox, Marketing Strategies, Tasks, Evaluate, Recommend, Tool Box, and Requirements.
Hats off to the crafty Russian coder on a hobby project!
Is TheRarestWords an SEO Tool?
If you're truly advanced in targeting content to user needs and variations in expression in a way that maximizes your coverage of the query tail, then this type of analysis is quite productive for SEO. For most folks aiming at SEO, the fact that less frequent words are less frequent means that you don't really care. You'll likely be amazed at the mid-frequency queries you match just by occupying your niche and doing the basic practices well.
I've even considered building a similar app, but the ROI for most site owners is in good, accessible markup and solid off-site promotion strategies. As it happens, we at StomperNet just released an SEO evaluation tool along the lines of its predecessors, but free (with email subscription) and including numerous instructional videos on corrective actions. Check out Stomper Site Seer if you're really aiming for traffic.
In a regular afternoon check of techmeme.com, I found a linkbait post of a most intriguing kind from Search Engine Land:
Get a Free Link from Wired
Sure enough, Wired has a wiki. I quickly hacked up a how-to on conducting usability testing, plugging my own Scrutinizer Browser, which is an interesting way to empower a novice with an expert-level ability to observe the thoughts of a test participant.
I didn't pay much attention to what Danny had posted to Wired, but it turns out he spammed them with non-how-to promotional content. I did manage to check back, after calling attention to the opportunity to my fellow Stomper faculty member, Don Crowther. We're in the midst of launching a course on how to do social marketing. But unlike Danny's post, mine served Wired's presumed goal of accumulating a large number of how-to articles.
Danny's post has been updated a couple of times, so it's a bit late to do a blow-by-blow. Lots of folks followed suit in spamming Wired, consuming Wired's server and brand capital for purposes not advantageous to Wired. Thankfully, editorial is keeping up with the low flow of crud. Search Engine Land has apologized for inviting spam.
There was an early update that I found quite humorous:
Postscript: Seems like Wired is now calling our test entry spam and deleting it. Plus, Ross Mayfield, Wired's Wiki Editor is incorrectly saying that nofollow doesn't "work" on wikis.
Ross Mayfield, a true technologist, countered that no-follow is inappropriate for wikis. Danny at SEL strongly rebutted this statement, and others followed suit. When all the engines jumped on the no-follow bandwagon in January '05, I wrote a post titled "Settling for Just Good Enough":
When Ross says that no-follow doesn't work for wikis, he's speaking from a semantic POV, not an SEO/business/untrusted-content one. For the wiki to work, a wide range of potentially contrasting viewpoints must be possible.
While the recent move across multiple industry players to support rel="no-follow" on links is a positive step, it falls rather short. Vote links, with -1, 0, or 1 values, would have been a much more interesting solution to this problem and left room for the community to engender evolution, instead of simply eliminating a threat to the already plotted growth.
I'm confident that blogging tools will soon support this for comments and referrer links, but I regret that the effort will be spent on the most impoverished conception of link typing...
I find it sad that a linkbait stunt forced Wired's hand into going the no-follow route when they might have found an editorial or user-feedback route to keep spam under control. Just because Wikipedia made what seems a smart decision to no-follow doesn't mean that's the right solution for Wired How-To -- a smaller and more restrained content space.
I look forward to the day when a major search engine moves from a single boolean about the level of endorsement of a link to supporting a range of levels of support, or even relationships.
It's long overdue that analytics packages diagnose the quality of site search, and at EMetrics Google announced several additions coming soon to Google Analytics. While GA is late to the game with on-site search query term analysis, they are adding some diagnostics for user success with search.
As a partial solution, I have historically set up a funnel mapping search results to clickthrough success by adding custom urchinTracker calls to all search results pages and pages referred by search results (a sketch of the tagging follows below). Here's what the various views look like:
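The tagging behind those views is simple: a synthetic page view per funnel step. A minimal sketch, assuming the legacy urchin.js tracker (the virtual path names are hypothetical; pick ones that fit your funnel definition):

```javascript
// A minimal sketch of the synthetic page view tagging described above,
// assuming the legacy urchin.js tracker. The virtual paths are hypothetical;
// use whatever naming fits your funnel setup.

// On every internal search results page:
urchinTracker('/funnel/site-search/results');

// On any page reached from a search results link (clickthrough success):
urchinTracker('/funnel/site-search/clickthrough');
```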
Search success is a tough thing to measure. Search, then click is a clear success pattern, but what about search, refine, and then click? What if users click several times?
Check out Avi's post for screenshots of the new reports and my post in the webanalytics forum for some of the wider issues.
Additional features are coming for more detailed event logging, including default outbound link tracking, but the details are not yet available.
I'm just back from SIGIR 2007, where I interviewed Jim Jansen on his long-term research on query strings as well as more recent work on the nature (quality, perception) of search engine sponsored ads. Finally, we talked about new work Jim presented at SIGIR studying the search process as learning (not decision making).
Check it out at Free IQ.
About Search @ MSFT
- About MSFT Search Technology: SearchEngineWatch covers the neural net ranker.
- NYTimes interviews Steve Berkowitz: "Sometimes the connections to the engine room aren't there."
Cool features at MSN/Live Search:
- Super cool ajax image search
- 2005 Operator release: filetype, contains, etc.
- 2006 Operator release: Macros and LinkFromDomain. See also my post on "diggRank".
- Using linkfromdomain and inUrl: dugg YouTube videos in RSS format.
Listen here... Segment 1 - Looking at MSN's Live Search engine