
An Idiot's Attempt to Explain Term Vector Theory, Term Weight, and a Slight Argument for Keyword Density

Adam Henige

This YouMoz entry was submitted by one of our community members. The author’s views are entirely their own (excluding an unlikely case of hypnosis) and may not reflect the views of Moz.


I know that term vector theory, term weight, and keyword density have been discussed a bit on here, but I've never seen anyone put numbers to them, which helps some people get their heads around things. Hopefully I didn't make any egregious mistakes here... Anyhow, on with the blog.

Among the SEO crème de la crème, keyword density is a phrase not even worth rehashing, but for many it remains a mainstay of their SEO lexicon. I'd be lying if I said I hadn't run my own experiments and seen my own successes in the past with upping my keyword density, despite seemingly "knowing better." I've always had some knowledge of term vector theory, but only at the highest of levels; I had never actually sat down and walked my math-retired mind through it to truly get it.

I'm not even certain I'm explaining this properly, and I do not claim to be an expert, so please correct or add where necessary. This is my attempt to explain term vector theory and why it makes keyword density irrelevant (in most instances).

As a refresher, basic elements of the term weight model include:

  • Term frequency: the number of times the target phrase appears in a document
  • Maximum term frequency in a document: the number of times the most frequent phrase (of the same length as the target phrase) appears in a document
  • Document frequency: the number of documents in the database that contain the target phrase
  • Total documents: the number of documents in the database

The equations of interest are:

  • Term weight: (term frequency / maximum term frequency in a document) * log(total documents / document frequency)
  • Inverse document frequency, which is just the second half of the term weight equation: log(total documents / document frequency)
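To make the two equations above concrete, here is a minimal Python sketch. The function and parameter names are made up for illustration, and the base-10 logarithm is an assumption, since the formula above does not say which base is used. A different base would scale every weight by the same constant, so comparisons between pages would not change.

```python
import math


def inverse_document_frequency(total_documents, document_frequency):
    """log(total documents / document frequency); base 10 is an assumption."""
    return math.log10(total_documents / document_frequency)


def term_weight(term_frequency, max_term_frequency,
                total_documents, document_frequency):
    """(term frequency / maximum term frequency) * inverse document frequency."""
    normalized_frequency = term_frequency / max_term_frequency
    return normalized_frequency * inverse_document_frequency(
        total_documents, document_frequency
    )


# Hypothetical numbers: the phrase appears 3 times, the most frequent
# two-word phrase appears 6 times, and 10 of 1,000 documents contain it.
example_weight = term_weight(3, 6, 1000, 10)
print(example_weight)
```

With these invented inputs the normalized frequency is 0.5 and the inverse document frequency is log10(100) = 2, so the weight works out to about 1.0.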

The term weight is the item of interest from an SEO standpoint. If we assume it is part of the search algorithm, we would like a high term weight (but, much like the wheel on The Price Is Right, without going over, i.e., keyword stuffing). Keyword density is discounted in this formula because it is only a ratio computed within one document, while this model takes into account the larger database (search engine) environment.
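For contrast, here is a rough sketch of a keyword density calculation. Keyword density has no single agreed definition; this version assumes "words belonging to occurrences of the phrase, divided by total words on the page," and the function name and sample text are invented. The point to notice is that nothing about the rest of the database enters the calculation.

```python
def keyword_density(text, phrase):
    """Share of the page's words that belong to occurrences of the phrase.

    This is one assumed definition; other tools count differently.
    """
    words = text.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    if not words or n == 0:
        return 0.0
    # Count every position where the phrase starts.
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    return hits * n / len(words)


# Invented sample page text.
page_text = ("netvantage marketing offers seo and "
             "netvantage marketing offers ppc")
density = keyword_density(page_text, "Netvantage Marketing")
print(f"{density:.2%}")
```

The only inputs are the page and the phrase, which is why the same density can correspond to very different term weights depending on the collection.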

Keyword density makes us feel warm and fuzzy because we can calculate it. But in the grand scheme of things, that really means nothing. Why? Because in the term vector model, we're dealing with unknowns. We do not know the exact document frequency, and we do not know the exact number of documents in the database (unless you willingly believe the ever-changing number of results offered). Further, if you look at the equation, the logarithm in the inverse document frequency compresses large differences in document counts, which helps to normalize results. Let me explain this with a tangible example.

Let's say we have two pages optimizing for the phrase "Netvantage Marketing." The first page has a frequency of three for this phrase, while the other has a frequency of six. The maximum frequency for a two-word phrase on each page is six. In theory, one would think that the document with the higher frequency of "Netvantage Marketing" would have the higher term weight, but that depends on the other variables. The inverse document frequency is largely driven by the document frequency, i.e., the number of documents containing the term.

[Figure: tv-4 (image not available)]

Since the true document frequency and the total number of documents are both unknown (unless you're willing to believe Google's ever-changing number of results), you're really just shooting at a moving target. Here, the page with the term frequency of 3 actually has a higher term weight. So, to some extent, one cannot confidently say that a higher "keyword density" or term frequency will actually lead to better results.
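The exact figures from the original example are not reproduced here, so the numbers in the sketch below are invented. Since both pages have the same maximum term frequency, the page with a frequency of 3 can only come out ahead if the two weights are computed against different collection statistics. This sketch assumes two different document frequency estimates and a base-10 logarithm, neither of which is confirmed above.

```python
import math


def term_weight(tf, max_tf, total_docs, doc_freq):
    """(tf / max_tf) * log10(total_docs / doc_freq); base 10 is an assumption."""
    return (tf / max_tf) * math.log10(total_docs / doc_freq)


# Hypothetical statistics: same total, very different document frequency
# estimates for the phrase at the moment each page is scored.
low_frequency_page = term_weight(tf=3, max_tf=6,
                                 total_docs=1_000_000, doc_freq=100)
high_frequency_page = term_weight(tf=6, max_tf=6,
                                  total_docs=1_000_000, doc_freq=100_000)

print(f"page with frequency 3: {low_frequency_page:.2f}")
print(f"page with frequency 6: {high_frequency_page:.2f}")
```

Under these made-up inputs the page with half the term frequency ends up with twice the term weight, purely because of the document frequency it was scored against.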

As you can see, the more documents a term appears in, the more it's marginalized, so changing its frequency has less impact.
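A quick way to see this marginalizing effect is to hold everything fixed except the document frequency and measure how much the term weight gains when the term frequency doubles from 3 to 6. The collection size and document frequencies below are hypothetical, and base-10 log is again an assumption.

```python
import math

TOTAL_DOCS = 1_000_000  # hypothetical collection size


def term_weight(tf, max_tf, total_docs, doc_freq):
    """(tf / max_tf) * log10(total_docs / doc_freq); base 10 is an assumption."""
    return (tf / max_tf) * math.log10(total_docs / doc_freq)


def gain_from_doubling(doc_freq):
    """Term weight gained by going from frequency 3 to 6 (max frequency 6)."""
    return (term_weight(6, 6, TOTAL_DOCS, doc_freq)
            - term_weight(3, 6, TOTAL_DOCS, doc_freq))


doc_freqs = [10, 1_000, 100_000]  # rare term -> common term
gains = [gain_from_doubling(df) for df in doc_freqs]

for df, gain in zip(doc_freqs, gains):
    print(f"document frequency {df:>7}: gain {gain:.2f}")
```

The gain shrinks as the term shows up in more documents, which matches the claim that frequency changes matter less for widely used terms.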

[Figure: tv-5-copy (image not available)]

So this is why people, particularly in competitive markets, pay little or no attention to "keyword density" and instead focus on writing good content.  The way to win in the SERPs for those markets will be through links.

However, the more I dug into this, the more I realized that term vector theory and term weights actually validate the positive results that people (me being one of them) claim to get from increasing keyword density/frequency. In less competitive markets, and when going after long tail terms, the math actually makes sense. Consider that in these markets the document frequency doesn't fluctuate or grow at the rate seen in hyper-competitive markets. As a result, changing the term frequency can have a more dramatic impact, especially considering that competitor sites probably aren't well optimized on page and probably don't have great link strength. Check out this simplified example of this scenario:

[Figure: tv-6 (image not available)]

If the environment doesn't change much, the number of total documents has little effect; it's the document frequency's stagnation that allows for a (seemingly) sustained advantage in this non-competitive market. So, if you're in such a market, by all means fiddle with your keyword density/frequency. You might see some serious results through improving your term weight.
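Here is a small sketch of that non-competitive scenario with invented numbers: the document frequency stays put at 50, the total collection grows by 10%, and separately the page raises its term frequency from 2 to 6. The base-10 log and every figure here are assumptions for illustration only.

```python
import math


def term_weight(tf, max_tf, total_docs, doc_freq):
    """(tf / max_tf) * log10(total_docs / doc_freq); base 10 is an assumption."""
    return (tf / max_tf) * math.log10(total_docs / doc_freq)


DOC_FREQ = 50  # hypothetical long tail phrase: few documents, and stagnant

before = term_weight(tf=2, max_tf=6, total_docs=1_000_000, doc_freq=DOC_FREQ)

# The collection grows by 10% while the page is left untouched.
after_growth = term_weight(tf=2, max_tf=6,
                           total_docs=1_100_000, doc_freq=DOC_FREQ)

# The page raises its term frequency while the collection stays the same.
after_more_tf = term_weight(tf=6, max_tf=6,
                            total_docs=1_000_000, doc_freq=DOC_FREQ)

shift_from_growth = after_growth - before
shift_from_frequency = after_more_tf - before

print(f"shift from 10% more documents: {shift_from_growth:.3f}")
print(f"shift from frequency 2 -> 6:   {shift_from_frequency:.3f}")
```

With a stagnant document frequency, growth in the total collection barely moves the weight because of the logarithm, while the on-page frequency change moves it a lot.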

Now whether or not this is how search engines use term vector theory/term weight, I can't say for sure, but hopefully this either sheds some light or starts a good conversation on the subject, as I'm still evolving my understanding.  And if I've got this entirely wrong, it gives someone a lot smarter than I a chance to clarify my stupidity! 
Adam Henige
Netvantage Marketing provides PPC, SEO, Link Building and Social Media services.
