How Google Knows What Sites You Control and Why It Matters
The author's views are entirely their own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
Google obviously looks at a great many factors in determining site rankings, but what do they know about us as administrators of our sites, and how do they use that information? In today's Whiteboard Friday, Cyrus talks about some of the actions that Google takes based on information it can see, offering advice on how you can see the inherent benefits and avoid the pitfalls.
For reference, here's a still of this week's whiteboard!
Howdy Moz fans. Welcome to another edition of Whiteboard Friday. My name is Cyrus. Today we're going to be talking about how Google knows what sites you control and why it matters. How much does Google know about the sites you own? Can that be used to your advantage? Can it be used to hurt you? These are important questions that webmasters often ask.
Now technically, when you own or control a lot of websites, the relationship between them is traditionally known as an administrative relationship, meaning that you are the administrator. You control the links on the sites. You control the content of the sites. Maybe these are sub-domains that you own. Maybe these are multiple properties within your business.
But Google, over the past decade or so, has spent an incredible amount of energy trying to figure out administrative relationships between websites, sometimes to help you, and sometimes to discount links between those sites or potentially apply some negative consequences. The reason this is so important to Google is that, if you think about the link graph and relationships between sites, links between sites that are controlled by the same person probably shouldn't count as much as editorial links controlled by other people, because when you get links, you want them to be natural and not something that you control.
The Good Side of Related Websites
But other times, Google wants to reward you for sites that are related to one another. There are some definite advantages to establishing those relationships between sites that you own, and sometimes you want to tell Google that you own multiple sites. One example is to distribute the authority between those sites.
Now a perfect example is something like eBay. eBay has a site in the United States, and they open a brand new site in Ecuador. They want that Ecuador site to rank well, but they don't want to start over. They've already built up their American site so much. They want to transfer some of that link equity. So they want to let Google know that, "Hey, this is eBay. This is us. This should be an authoritative website."
This also works on a much smaller scale sometimes, often on sub-domains. You see a lot of blogs being started on sub-domains of websites because it's easier from a development point of view or for whatever reason. You want that sub-domain, that blog, to have the same authority as your main site. Now it's oftentimes up to Google whether or not they give that authority to your blog or your sub-domain. But if you can give them signals to tell them, "Yes, this is associated with my main domain," that often goes a long way in helping that sub-domain to rank.
Same with alternate languages. You have French content. You have Spanish content. You have English content. They're all on your site. Maybe they're on a different sub-domain or a different top level domain, but you want Google to know that they have the same authority as your main site that you worked so long to build up.
Also, we're starting to see identity play a role in administrative relationships, more at a page level, with things like Google Authorship. Identity is becoming a big issue, and Google is working to figure out those identities on the web.
Negative Side Effects of Related Websites
Then there's the flip side, the bad side of administrative relationships. That's traditionally what SEOs and webmasters have been dealing with when they think about these things. The biggest problem is diminished link equity. Again, it's that same problem: if Google sees that you control these sites, why should links between them pass as much link equity as links from sites that you don't control? So a lot of black-hat SEOs and gray-hat SEOs go to great lengths to hide the relationships between their sites, because they don't want Google to discount that link equity.
Also, there's the idea of link schemes and bad neighborhoods. If there are 12 sites, and they're all interlinking to each other, that might be a pretty good signal to Google that it's sort of a link scheme and those links shouldn't count, or the sites could be penalized.
Finally, we're seeing a new phenomenon in Google: penalties following people around the web. These are instances where people are penalized. They burn their site to the ground. They're so frustrated. They decide to just start over on a completely new domain. But when they do so, ironically, amazingly, they find the penalty transferring to that new domain, even though they've cut all the backlinks. They've changed the URL, everything. How does Google know that that's the same site?
So these are important questions to ask yourself and help determine: Can you be helped by establishing these relationships between sites, or can you be hurt? If you understand some of the signals Google is using, you can take advantage of this.
Potential Signals of Related Sites
Now one thing I want to emphasize is we don't know all the signals. We have a few clues. Traditionally, Google has been looking at things like ownership: WHOIS records, which are freely available on the Internet, where your site is hosted, the IP address, things like that. Elijah, what's the name of that website that we go to, to check who owns what?
Elijah: SpyOnWeb?
Cyrus: That's right. SpyOnWeb. Here's a simple experiment you can do. Go to SpyOnWeb.com. Type in a very common domain, like Moz.com. You can see all the relationships that Moz has with sites that we either own, or that are hosted on the same IP, or that use the same Google Analytics code or the same AdSense code. All this information is publicly available on the web. You don't need access to your Google Analytics account or your AdSense account. It's all there in the source code of the websites.
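To make that idea concrete, here is a minimal Python sketch of how shared tracking codes in page source could be used to group sites. It is an illustration only, not how SpyOnWeb or Google actually does it. The function names, the sample hostnames, and the sample HTML are all made up, and the regular expressions assume the classic `UA-XXXXXXX-Y` Analytics format and a 16-digit `pub-` AdSense publisher ID.

```python
import re
from collections import defaultdict

# Assumed ID formats: classic Google Analytics property IDs (UA-<account>-<n>)
# and AdSense publisher IDs (pub-<16 digits>). Real pages may use other formats.
GA_ID = re.compile(r"\bUA-(\d{4,10})-\d+\b")
ADSENSE_ID = re.compile(r"\bpub-\d{16}\b")

def extract_ids(html):
    """Return the set of account-level tracking IDs found in a page's source."""
    ids = {"ga:" + account for account in GA_ID.findall(html)}
    ids |= {"adsense:" + pub for pub in ADSENSE_ID.findall(html)}
    return ids

def group_by_shared_id(pages):
    """pages maps hostname -> HTML. Returns only IDs that appear on 2+ hosts."""
    owners = defaultdict(set)
    for host, html in pages.items():
        for ident in extract_ids(html):
            owners[ident].add(host)
    return {ident: sorted(hosts) for ident, hosts in owners.items() if len(hosts) > 1}

# Hypothetical sample data for illustration.
pages = {
    "example-a.test": '<script>ga("create", "UA-1234567-1", "auto");</script>',
    "example-b.test": '<script>ga("create", "UA-1234567-2", "auto");</script>'
                      '<ins data-ad-client="ca-pub-1234567890123456"></ins>',
    "example-c.test": "<p>No tracking here.</p>",
}
shared = group_by_shared_id(pages)
print(shared)  # {'ga:1234567': ['example-a.test', 'example-b.test']}
```

The two sample sites use different property numbers but the same Analytics account number, which is why they end up grouped together.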
By scraping the web and gathering all this information together, you can create a web of ownership that's pretty easy to dissect. Traditionally, C-blocks, meaning IP addresses that share the same first three numbers, have been an indication of relationships on the web. It's something we report here at Moz in Open Site Explorer: the number of unique linking C-blocks.
Right now though, we are in a transition with this, where the web is moving to a new Internet Protocol version, Version 6 (IPv6). The old C-block was based on Version 4. So C-blocks are actually going away, and the engineers here at Moz, working with some very smart people in the consulting world, such as Distilled, are figuring out some new standards to report instead of C-blocks because we're losing these very soon.
Also, there are link patterns: again, when you have a lot of sites linking to each other. Google has a complete catalog of links, or the most complete catalog of links, on the web. When you take all this together, using various statistical analysis methods, you can determine pretty closely who's associated with what and who has control over what. These are all things that people are looking at, all that publicly identifiable information.
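As a rough illustration of what counting unique linking C-blocks means, the Python sketch below treats a C-block as the /24 network an IPv4 address belongs to. The function names and the sample addresses (taken from the reserved documentation ranges) are made up for this example, and this is not a description of how Open Site Explorer computes its metric.

```python
import ipaddress

def c_block(ip):
    """Return the /24 network (the 'C-block') that an IPv4 address falls in."""
    if ipaddress.ip_address(ip).version != 4:
        raise ValueError("C-blocks are only defined here for IPv4 addresses")
    # strict=False lets us pass a host address and get its enclosing network.
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def unique_linking_c_blocks(ips):
    """Collapse a list of linking IPs into the set of distinct C-blocks."""
    return {c_block(ip) for ip in ips}

# Hypothetical IPs of sites linking to a page.
linking_ips = ["192.0.2.10", "192.0.2.200", "198.51.100.7", "203.0.113.5"]
blocks = unique_linking_c_blocks(linking_ips)
print(c_block("192.0.2.10"))  # 192.0.2.0/24
print(len(blocks))            # 3
```

Four linking IPs collapse to three C-blocks here because the first two addresses sit in the same /24, which is the kind of overlap that can hint at shared hosting or ownership.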
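One very simple way to picture a link-pattern signal is to ask how densely a group of sites links among itself. The Python sketch below is a toy heuristic under that assumption, not one of the statistical methods Google uses, which we don't know. The function name, the site names, and the link graph are hypothetical.

```python
from itertools import permutations

def internal_link_density(links, sites):
    """Fraction of all possible site-to-site links within `sites` that exist.

    `links` maps each site to the set of sites it links out to.
    """
    sites = list(sites)
    possible = len(sites) * (len(sites) - 1)
    if possible == 0:
        return 0.0
    present = sum(1 for a, b in permutations(sites, 2) if b in links.get(a, set()))
    return present / possible

# Hypothetical link graph: a, b and c all link to each other; d only links to a.
links = {
    "a.test": {"b.test", "c.test"},
    "b.test": {"a.test", "c.test"},
    "c.test": {"a.test", "b.test"},
    "d.test": {"a.test"},
}
ring = internal_link_density(links, ["a.test", "b.test", "c.test"])
wider = internal_link_density(links, ["a.test", "b.test", "c.test", "d.test"])
print(ring, wider)
```

A group where every site links to every other site scores 1.0, which is far denser than what you would expect from independent, editorially given links.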
Some signals that people don't often consider are what I would call soft signals or content signals. These are more advanced signals, things that Google could look at and that we've seen them talk about in patent papers. One is when two sites have identical or similar content, meaning content on Site A is the same as content on Site B. This would be a strong clue to Google that it may be the same site. But because a lot of sites get scraped, it's not a very clear signal by itself, so they would probably look for a few of these other things as well, such as WHOIS registration or analytics code or something like that.
But if you're simply moving your site from one domain to another to escape a penalty, that may not be enough if you're using the exact same content and some of those other things, such as similar images. Two images hosted on different sites with the same content could be an indication that the sites are owned by the same entity.
Then there's formatting and CSS. You'll often see sites that are owned by the same individual use a lot of the same WordPress templates, for example, or a lot of the same CSS files or JavaScript files. Again, by themselves, these are not a definitive clue because there are a lot of templates out there, a lot of free stuff floating around the Internet. But when combined with the other signals, they can create a very, very clear indication of those relationships.
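For a feel of how "similar content" can be measured, here is a small Python sketch using word shingles and Jaccard similarity, a common textbook technique for near-duplicate detection. Whether Google uses anything like this is an assumption on my part; the function names and the sample text are invented for the example.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(text_a, text_b, k=3):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical page copy from three sites.
site_a = "We sell handmade oak furniture built to last a lifetime"
site_b = "We sell handmade oak furniture built to last for generations"
site_c = "Daily weather forecasts and radar maps for your area"

print(jaccard(site_a, site_b) > jaccard(site_a, site_c))  # True
```

Lightly edited copies of the same text still share most of their shingles, so they score much higher than unrelated pages.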
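The simplest version of the image signal is an exact match: the same file bytes served from two different sites. The Python sketch below fingerprints image bytes with SHA-256 and groups sites that share a fingerprint. It only catches byte-identical files, not resized or re-encoded copies, and the function names and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict

def image_fingerprint(data):
    """Hash the raw image bytes; identical files give identical digests."""
    return hashlib.sha256(data).hexdigest()

def sites_sharing_images(images):
    """images is an iterable of (site, image_bytes) pairs.

    Returns fingerprint -> sorted sites, for images found on 2+ sites.
    """
    seen = defaultdict(set)
    for site, data in images:
        seen[image_fingerprint(data)].add(site)
    return {fp: sorted(sites) for fp, sites in seen.items() if len(sites) > 1}

# Stand-in byte strings; real code would read the actual image files.
logo = b"\x89PNG fake logo bytes"
images = [
    ("old-site.test", logo),
    ("new-site.test", logo),
    ("other-site.test", b"\x89PNG a different image"),
]
matches = sites_sharing_images(images)
print(list(matches.values()))  # [['new-site.test', 'old-site.test']]
```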
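To show how several weak clues can add up to a strong one, here is a toy Python sketch that combines signals with a noisy-OR rule: each observed signal independently has some chance of indicating common ownership, and the combined score is the chance that at least one of them does. The signal names and every weight below are invented for illustration. Nobody outside Google knows the real signals or how they are weighted.

```python
# Hypothetical weights: how strongly each signal alone suggests common ownership.
SIGNAL_WEIGHTS = {
    "same_analytics_id": 0.6,
    "same_whois": 0.5,
    "same_contact_details": 0.5,
    "same_c_block": 0.2,
    "same_template": 0.1,
}

def relatedness(observed_signals):
    """Noisy-OR combination: 1 minus the chance that every signal is a coincidence."""
    chance_all_coincidence = 1.0
    for signal in observed_signals:
        chance_all_coincidence *= 1.0 - SIGNAL_WEIGHTS[signal]
    return 1.0 - chance_all_coincidence

template_only = relatedness(["same_template"])
several = relatedness(["same_template", "same_c_block", "same_analytics_id"])
print(round(template_only, 3), round(several, 3))  # 0.1 0.712
```

A shared template on its own barely moves the score, but the same template alongside a shared C-block and a shared analytics ID pushes it much higher, which mirrors the point about signals becoming clear only in combination.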
Even something as simple as the contact details on your About Us page, if those are the same from site to site, it can be very clear that these sites are related.
Then on the page level, we have things like authorship. I've seen this work really well with in-depth articles, certain authors. This isn't a domain level signal, but more of a page level signal that can help individual pages to rank.
For content and language signals, there's the hreflang attribute. Again, when you have sites in different countries and different languages, using this attribute can help establish those relationships and help you to rank.
So in general, it's very hard to hide these relationships from Google, because they have so much data available, and it's really not worth trying. But oftentimes it is worth going the other way, in cases such as sub-domains, alternate languages, and authorship, where you want to help boost these signals. Understanding how these all work can give you clues as to why you're ranking, why you're not, and sometimes what you can do to help.
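As a sketch of what that looks like in practice, the Python helper below builds the `<link rel="alternate" hreflang="...">` tags that go in a page's head, one per language version, plus an optional `x-default` fallback. The helper name and the example.com URLs are hypothetical. Each language version would normally carry the same full set of tags so that the references point both ways.

```python
from html import escape

def hreflang_tags(alternates, default_url=None):
    """Build hreflang link tags from a mapping of language code -> URL."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in alternates.items()
    ]
    if default_url is not None:
        # x-default marks the page to show when no listed language matches.
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(default_url)}" />'
        )
    return "\n".join(tags)

# Hypothetical language versions hosted on separate sub-domains.
alternates = {
    "en": "https://www.example.com/",
    "fr": "https://fr.example.com/",
    "es": "https://es.example.com/",
}
markup = hreflang_tags(alternates, default_url="https://www.example.com/")
print(markup)
```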
That's all for today. Thanks, everybody. Bye-bye.