More info on the Caffeine Update
by Matt Cutts on August 10, 2009
in Google/SEO
Google recently opened up a preview of our new Caffeine update, and I wanted to give a little more background on this change. At the Real-Time CrunchUp a few weeks ago, I joked that the half-life of code at Google is about six months. That means that you can write some code and when you circle back around in six months, about half of that code has been replaced with better abstractions or cleaner infrastructure. Six months is an exaggeration, but Google is quite serious about scrutinizing our codebase regularly and rewriting the parts that don’t scale well to make them more robust, more elegant, or faster.
Here are some questions and answers:
Q: How do I check out the Caffeine update?
A: If you search on www2.sandbox.google.com you can get a preview of how the search results will change over the next few weeks and months.
Q: It doesn’t look any different to me?
A: The Caffeine update isn’t about making some UI changes here or there. Currently, even power users won’t notice much of a difference at all. This update is primarily under the hood: we’re rewriting the foundation of some of our infrastructure. But some of the search results do change, so we wanted to open up a preview so that power searchers and web developers could give us feedback.
Q: Is this Caffeine update because Company X or Y is doing Z?
A: Nope. I love competition in search and want lots of it, but this change has been in the works for months. I think the best way for Google to do well in search is to continue what we’ve done for the last decade or so: focus relentlessly on pushing our search quality forward. Nobody cares more about search than Google, and I don’t think we’ll ever stop trying to improve.
Q: The url www2.sandbox.google.com doesn’t seem to work for mobile phones? I can only test on google.com, not google.co.uk?
A: That’s right. For now this is only a preview, so we didn’t hook up a mobile version or an international version at this point. You’ll have to search on google.com to see the results right now.
Q: How do I give Google feedback?
A: If you want to give us feedback on how the search results are different, look on the search results page for a link at the bottom of the page that says “Dissatisfied? Help us improve.” Click on that link and type your feedback in the text box. Make sure to include the word caffeine somewhere in the feedback.
Q: Is there a way to give feedback in person?
A: Yes! If you want to give me feedback in person, I’ll be at Search Engine Strategies San Jose this week. I’m doing a site review panel on Thursday, or just walk up and say hello!
You can also read more about this change on Techmeme if you’re interested.
Update, August 11, 2009: I did a video interview about the Caffeine update with Mike McDonald. The Caffeine info begins about 1:15 into the video. You can also enjoy seeing my very-short summer haircut in the video.
Update, August 12, 2009: Embedding the video interview directly:
187 comments
It appears to filter the results by using &gl=uk, but is it giving true results of the new engine?
By the way, if you want to learn a lot more about Google updates (data refresh vs. algorithm change vs. infrastructure change), here are a few resources:
www.mattcutts.com/blog/explaining-algorithm-updates-and-data-refreshes/
www.mattcutts.com/blog/whats-an-update/
www.mattcutts.com/blog/more-info-on-updates/
and I also made a video back in 2006 that’s pretty relevant:
Whatever change you made, I like it. All of my personal sites, and the sites of the company I work for rank higher on the new results.
Good job!
Seriously though, for a few selected queries, I’m seeing more authoritative sites rank higher. So that’s good.
Hi Matt,
Been sussing this out all morning. From the Google blog and your post, it seems there are algo changes / updates on the current sandbox. I’ve noticed some definite changes in how the SERP differs in the sandbox in my play this morning.
Could you confirm this?
Matt,
Can you (or another engineer at Google) speak more about the specific infrastructure changes? I ask as a fellow curious software developer.
For example (just speculation), “We reworked the indexing backend to automatically push continuous real-time updates to all datacenters,” or “The ranking system automatically reweights individual algorithms based on real-time clickstream data from google.com.”
Just looking for some cool thing to ogle over at the water-cooler tomorrow.
– Dan
It’s always frightening when search engines start screwing with their algorithms, but Google seem to get it consistently right. Looking forward to it!
Hi Matt, thanks for the post. I always find them very informative. I’m all for the best search results; after all, I use Google to find what I want.
I have a site that has been de-indexed from Google. I requested reconsideration, but it is still not indexed. I have posted a question about it on the Sitepoint forum, if you can help. Cheers, Scott
Nice write up Matt…
My website shows my free host url instead of my purchased domain name?
Caffeine!
Search results are very different IMO. Perhaps better? Not sure what to make of it and how it affects website owners who get traffic for specific keywords (I am not that good at SEO). I searched for the keywords “linux” and “chess”; both gave me different results in the sandboxed version and the regular version. Looking forward to it – I have faith in Google to do a good job.
The “power googler” side of me likes it but the “website owner” side of me is a bit concerned.
Like what I see, though I have to admit it isn’t very different from the regular Google currently. Does it also give a preview into the future of currently penalized sites?
I find the timeline and related searches quite interesting, great for reviewing relevant keywords.
For the reviews option, are the results primarily from blogs and review websites?
One major change I am seeing does appear to involve Twitter. A search for BOB BIGELLOW on the standard Google search shows Bob’s Twitter profile as the first result, followed by some other Bob Bigelow’s personal site (one L) as the second link. However, a search for BOB BIGELLOW on Caffeine shows Bob’s Twitter profile as the first result, followed by an additional indented result from Twitter as the second result. Furthermore, there is a plus-link “Show more results from twitter.com” that appears below the second result. Clicking this shows five additional links (instantly?) all from Twitter and all associated with Bob’s Twitter content.
john chen and Daniel Sterling, most of the changes are in things like our core indexing, so there are fewer changes for things like rankings. Lots of users won’t notice a big difference.
pavs and McMohan, we’re not looking to make huge changes in ranking with this new infrastructure. Some rankings will change, but that’s not the main thrust of the infrastructure.
Steven, if you don’t want to mention your domain name/free host url here, feel free to use the “caffeine” feedback mechanism from the post to pass on the specifics.
I have one personal blog and some sites I am working on. While searching on the sandbox version, I found a big ranking difference for my blog, but for my other sites it is quite the same.
OK Matt, my question is, “Will the results be exactly what we are seeing in the sandbox version, or could they be different?”
Search is the soul of Google. It is the bedrock of Google’s entire operations. I trust that whatever tweaks and updates Google makes to its search capability are for the benefit of the world.
My initial comparisons running the new caffeine search engine result pages vs. the current / not so current crawl data did show significant improvements for 2 distinct types of results (1) a broader interpretation of context across synonyms and stemmed keywords and (2) a Vince update remix which favored authority-rich branded sites such as Technorati and Facebook.
It will be interesting to see just how far, deep or frequent the crawls will be for content bordering on supplemental. The notion of pages “getting a second chance” in droves of billions for everything that falls in between should make for some interesting rankings through the fusion of the old and the new.
Is the swap of engines imminent at this point, or just a juxtaposition at present to see which wins the stand off?
Hi Matt,
I work at Turkey’s largest online bookseller. Our pages rank well on the first page of Google.com.tr, but none of our pages rank well in Caffeine. And after a couple of searches, Caffeine gave a 403 Forbidden error. The message reads: “We’re sorry, … but your query looks similar to automated requests from a computer virus or spyware application”…
I think there’s something wrong with the Turkish version of Caffeine. Am I right?
Hi Matt, I hope to run into you @ SES … but if I don’t, can you cover this in a future video, post, etc.? How can we “teach” Google NOT to do “Did you mean: xyz” when there is a site that matches what the user was looking for?
There are a few domains that are important to me, and Caffeine thinks people are searching for .com instead of .somethingelse.
Try this one, search for Molly.FR or Molly.co.uk, Google has
Did you mean: molly.com
and below that, the site the user was looking for.
But if you search for molly.se or molly.dk, you do not get the “did you mean.” Why?
(I’m not associated with the above Molly sites)
Thanks, @SocialJulio
As always, looks interesting. The ‘new’ database must be crawling far deeper, as I see a jump of x10 in the number of results returned for some keywords.
Do we know the timescale for the move to the new engine? Are we talking this year, or does it still have a long way to go?
Going through changes often brings some hardship, but we need to cope with it in order not to be left behind.
Hello,
You mentioned there is no UK preview yet, but the same upgrade will still be happening on the international sites, I guess. Will the rollout be at the same time?
Fascinating stuff!
Julie
Hi Matt
The URLs are using # to store parameters rather than a standard query string. Is this just for the sandbox or will this be the case when the change is rolled out?
www2.sandbox.google.com/#hl=en&q=car&aq=f&oq=&aqi=g10&fp=Zf6F-1XmmwQ
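To make the distinction in the question above concrete, here is a minimal Python sketch using only the standard library. It shows that the parameters in the example URL sit in the fragment (the part after the #) rather than in the query string. The http:// scheme and the helper name fragment_params are assumptions added for illustration; nothing here reflects how Google itself processes these URLs.

```python
from urllib.parse import urlsplit, parse_qs

# Example URL from the comment above. The "http://" scheme is an assumption
# added so that urlsplit recognizes the host name.
URL = "http://www2.sandbox.google.com/#hl=en&q=car&aq=f&oq=&aqi=g10&fp=Zf6F-1XmmwQ"


def fragment_params(url):
    """Hypothetical helper: parse key=value pairs stored after the '#'."""
    fragment = urlsplit(url).fragment
    # keep_blank_values=True preserves empty parameters such as "oq=".
    return parse_qs(fragment, keep_blank_values=True)


parts = urlsplit(URL)
params = fragment_params(URL)
print(repr(parts.query))  # the standard query string component
print(params["q"])        # value(s) of the q parameter taken from the fragment
```

Browsers do not send the fragment to the server as part of an HTTP request, so parameters stored this way are normally read by script running in the page rather than by the server.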
Matt, Google is broke again, all my site pages are no longer #1 for all my chosen search terms. Please fix ASAP
BTW, you do realise you are causing about 1 billion EXTRA Google searches per day by posting this info? Of course you do
The best thing about www2.sandbox.google.com is that there are absolutely no sponsored links in the results. This is my new home page!
Some very interesting changes in rankings here. Only minor, but some of my sites have gone up and some down.
Now I have to try and figure out why?
Is the blue cross next to a result denoting Google Business/Maps listing new or is this only peculiar to the .com SERPS? I can’t recall seeing it in .co.uk results.
The results seem to be the same for first page, although I’m seeing some pretty low rankings further down for pages of mine that were on page 3 / 4…
Funny thing though is people are spamming the G Webmaster Blog link you gave us Matt – maybe you should change their usernames to deathwish1 and deathwish2!
Thumbs up on this update — I hope it is released soon — any idea when it will actually be moved to production??
For my site, which is a quality content site, we are regularly having other ‘marketed’ websites getting higher results in google and this update seems to filter out a lot of the pages which got to the top by marketing over content. Kudos to google!
It is no different for my web sites than the real search on google.com.
Maybe it’s better =)
It was a useful video to watch.
Thanks Matt, thanks for the post and for letting us know about the Google Caffeine updates.
Bing is still producing better results…….
As far as feedback goes, once Google starts sending me a check for my time, I’ll provide feedback but since I don’t have my lips firmly planted on Matts rear, I don’t work for free.
All those SEOs who give feedback about their sites are providing Google a nice target to shoot at.
Google is the evil corporate enemy and must be treated as such.
I’ve been keeping an eye on quite a few searches for several months now. It sure does seem like a lot of irrelevant content is gone. Nice work, Google!
Back 5 or 10 years ago there was a lot of talk about the importance of pages that actually had the information you were looking for versus pages that had lots of links related to the topic you were searching for. Since then, Google seems to have been focusing on returning pages that have the information. The sandbox version seems to focus on pages with links a bit more on some of the searches I did (City, ST and “free books online”). The difference is slight (maybe one or two extra “link pages” in the top 10) but noticeable.
Interesting! What I’m most curious about at this moment is whether there’s a specific reason why some terms deliver fewer results than before. In the Sandbox most search terms provide many more results than the normal results, but I’ve found some terms which provide fewer results. We’ll see. Anyhow, my personal opinion is that the results look better and join up better, even though the differences are not that big.
Results that return a stock ticker aren’t showing the chart – though this may well be expected.
I also noticed that page modification dates don’t seem to be appearing whereas they do on a normal search – not sure if that’s deliberate or another side effect of the testing setup.
I made a page to compare the two; it’s here: Google vs Google Caffeine. It helps to see the two side by side.
My first impression, after just a quick comparison:
Lots of changes in the snippets. Is that because of working with a separate index or a deliberate change?
Except for some snippet changes, for most queries the first 4 or 5 results seem to be unchanged. The next 5 or 6 change drastically.
Hi Matt, I’d like to bring to your attention a serious YouTube bug that’s been around for more than a month (subscriptions basically aren’t working anymore for a lot of people). Here’s just one of the forum threads:
www.google.com/support/forum/p/youtube/thread?tid=5e4bc580dae2eb68&hl=en
Maybe you can forward this to the YouTube guys… Sorry for being off-topic.
Thanks!
just did some spe