The Freshest Linkscape Data Ever
Since the launch of Open Site Explorer and our API update, Chas, Ben and I have invested a lot of time and energy into improving the freshness and completeness of Linkscape's data. I'm pleased to announce that we've updated the Linkscape index with crawl data that's between two and five weeks old, the freshest it's ever been. We've also changed how we select pages, in order to get deeper coverage on important domains and waste less time on prolific but unimportant domains.
You may recall Rand's recent post about prioritizing the best pages to crawl, and mine about churn in the web. We've applied some of the principles from these posts to our own crawling and indexing. Rand discussed how crawlers might discover good content on a domain by selecting well-linked-to entry points:
In the past, we've selected pages to crawl based purely on mozRank. That turned out to favor some unsavory elements (you know who you are :P). Now we look at each domain and determine how authoritative it is. From there, we select pages using the principle from Rand's post: highly linked-to pages, like the homepage, category pages, and important pieces of deep content, link onward to other important pages we should crawl. From intuition and experience, we believe this approach crawls much the way a search engine would.
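To make that concrete, here's a minimal Python sketch of budget-by-authority page selection. Everything here, the field names, the scoring, and the budget math, is made up for illustration; Linkscape's real pipeline is considerably more involved:

```python
# Hypothetical sketch of entry-point-based crawl selection.
# Field names (root_domain, domain_authority, inlink_count) are
# illustrative, not Linkscape's actual internals.
from collections import defaultdict

def select_crawl_targets(pages, per_domain_budget):
    """Group candidate URLs by root domain, scale each domain's
    crawl budget by its authority, then pick that domain's most
    linked-to pages (homepage, categories, deep content) first."""
    by_domain = defaultdict(list)
    for page in pages:
        by_domain[page["root_domain"]].append(page)

    selected = []
    for domain, domain_pages in by_domain.items():
        # More authoritative domains earn a deeper crawl; prolific
        # but unimportant domains get a shallow one.
        authority = max(p["domain_authority"] for p in domain_pages)
        budget = max(1, int(per_domain_budget * authority))
        # Highly linked-to pages are the entry points most likely
        # to link onward to other pages worth crawling.
        domain_pages.sort(key=lambda p: p["inlink_count"], reverse=True)
        selected.extend(domain_pages[:budget])
    return selected

if __name__ == "__main__":
    pages = [
        {"root_domain": "example.com", "domain_authority": 0.9,
         "inlink_count": 1200, "url": "http://example.com/"},
        {"root_domain": "example.com", "domain_authority": 0.9,
         "inlink_count": 40, "url": "http://example.com/widgets/"},
        {"root_domain": "linkfarm.example", "domain_authority": 0.05,
         "inlink_count": 2, "url": "http://linkfarm.example/p1"},
    ]
    for p in select_crawl_targets(pages, per_domain_budget=10):
        print(p["url"])
```

With these toy numbers, the authoritative domain gets both of its pages crawled while the low-authority one is capped at a single page, which is exactly the deeper-coverage-where-it-counts behavior described above.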
In a past post, I discussed the importance of fresh data. After all, if 25% of pages on the web disappear after one month, data collected two or more months ago just isn't actionable. From now on, we're focusing on keeping our data inside that first month. By the time most of it approaches two months old (meaning much of it is out of date), we should have an index update for you. If and when we show you historical data, we'll mark it as such.
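To put rough numbers on that churn: assuming the 25%-per-month disappearance rate compounds smoothly (a simplifying assumption on my part, not a measurement), a quick back-of-the-envelope calculation shows how fast an index goes stale:

```python
# Back-of-the-envelope churn math: if roughly 25% of pages
# disappear within a month, how much of an index is still live
# after a given number of weeks? Assumes churn compounds
# smoothly, which is a simplification.
def fraction_still_live(weeks, monthly_churn=0.25):
    months = weeks / 4.345  # average weeks per month
    return (1 - monthly_churn) ** months

for weeks in (2, 5, 9):
    print(f"{weeks} weeks old: ~{fraction_still_live(weeks):.0%} still live")
# 2 weeks old: ~88% still live
# 5 weeks old: ~72% still live
# 9 weeks old: ~55% still live
```

That's why the two-to-five-week window matters: it keeps the overwhelming majority of the index pointing at pages that still exist.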
What this means for you is that all of our tools powered by Linkscape, including Open Site Explorer, will provide fresher, more relevant data with better coverage than ever. The same goes for products and tools developed outside SEOmoz using either the free or paid API. There are plenty of them out there, and in fact, you could build one too!
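If you'd like to kick the tires, a signed call to the Linkscape URL-metrics endpoint looks roughly like the sketch below. The endpoint path, parameter names, and HMAC-SHA1 signing scheme are rendered here as best I can from the API docs, and the credentials are placeholders, so verify the details against the current documentation before building on them:

```python
# Rough sketch of a signed request to the Linkscape URL-metrics
# endpoint. Treat the endpoint path, parameters, and signing
# details as illustrative; check the API docs for specifics
# (e.g., the Cols flags for selecting metrics are omitted here).
import base64, hashlib, hmac, time
import urllib.parse, urllib.request

ACCESS_ID = "member-xxxxxxxx"   # placeholder credential
SECRET_KEY = "your-secret-key"  # placeholder credential

def url_metrics(target_url):
    expires = int(time.time()) + 300  # signature valid briefly
    to_sign = f"{ACCESS_ID}\n{expires}"
    signature = base64.b64encode(
        hmac.new(SECRET_KEY.encode(), to_sign.encode(), hashlib.sha1).digest()
    ).decode()
    params = urllib.parse.urlencode(
        {"AccessID": ACCESS_ID, "Expires": expires, "Signature": signature}
    )
    endpoint = (
        "http://lsapi.seomoz.com/linkscape/url-metrics/"
        + urllib.parse.quote(target_url, safe="")
        + "?" + params
    )
    with urllib.request.urlopen(endpoint) as response:
        return response.read()

print(url_metrics("www.seomoz.org"))
```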
Because I know how much everyone likes numbers, here are some stats from our latest index:
- URLs: 43,813,674,337
- Subdomains: 251,428,688
- Root Domains: 69,881,887
- Links: 9,204,328,536,611
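For fun, a little division over those figures (simple averages, nothing more rigorous) gives a feel for the index's shape:

```python
# Quick arithmetic on the index stats above: average links per
# URL, URLs per root domain, and subdomains per root domain.
urls = 43_813_674_337
subdomains = 251_428_688
root_domains = 69_881_887
links = 9_204_328_536_611

print(f"links per URL:              ~{links / urls:.0f}")
print(f"URLs per root domain:       ~{urls / root_domains:.0f}")
print(f"subdomains per root domain: ~{subdomains / root_domains:.1f}")
# links per URL:              ~210
# URLs per root domain:       ~627
# subdomains per root domain: ~3.6
```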
Our next update is scheduled for March 11. But we'll update the index before then if the data is ready early :)
As always, keep the feedback coming. With our own toolset relying on this data, and dozens of partners using our API to develop their own applications, it's critical that we hear what you guys think.
NOTE: we're still updating the top 500 list at the moment. We'll tweet when that's ready.