Using Multiple Regression to Predict Organic Traffic
This YouMoz entry was submitted by one of our community members. The author’s views are entirely their own (excluding an unlikely case of hypnosis) and may not reflect the views of Moz.
As we wade through endless rows and columns of ranks, rates, and revenue, it is important to never lose sight of the metrics that drive our most actionable findings. If conversion is king, visits are vassals, the substantive backbone upon which a website’s success depends. Given that the maximization of traffic is key to our success as SEOs, it is vital to identify and control the factors that determine this metric. Statistical analysis suggests, however, that we may have far less control over the ebb and flow of organic traffic than we think.
Two obvious factors that contribute to variation in organic traffic are changes in rank and search volume. By the power of multiple regression, we can assess how strongly changes in rank and search volume affect changes in organic traffic for a website. Multiple regression is a statistical technique used to model the relationship between a dependent variable (here, organic visits) and the multiple independent variables that are thought to determine it (here, rank and search volume).
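To make the idea concrete, here is a minimal sketch of a two-predictor regression fitted by ordinary least squares in plain Python. The observations, the helper names (`solve_linear_system`, `fit_ols`), and the coefficients used to build the visit counts are all assumptions made up for illustration. They are not the data or the tool behind this post.

```python
# Minimal multiple regression (ordinary least squares) in pure Python.
# All numbers are made-up illustration data, not the article's dataset.

def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] that minimize the squared error."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical weekly observations: (rank, monthly search volume).
observations = [(1, 1000), (2, 1500), (3, 900), (5, 2000), (8, 1200), (10, 3000)]
# Visits built as exactly 50 - 4 * rank + 0.1 * volume, so the fit is checkable.
visits = [50 - 4 * rank + 0.1 * volume for rank, volume in observations]

coefficients = fit_ols(observations, visits)
intercept, rank_effect, volume_effect = coefficients
print(intercept, rank_effect, volume_effect)
```

Because the made-up visits follow the formula exactly, the fit should recover the intercept and both slopes almost perfectly. Real traffic data will not be this tidy, which is the point of the analysis that follows.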
The data analyzed below was collected over a nine-month period from one of our online stores. Weekly rank, search volume, and organic visits were recorded for 44 of the site’s keywords. Keywords from the head, middle, and tail of the search demand curve are equally represented in this sample.
After running a regression of traffic on rank and search volume, the following data is generated:
The R-Square statistic tells us how much of the variance in traffic is explained by changes in rank and search volume. An R-Square of .55 tells us that 55% of the variation in traffic is explained by rank and total monthly searches. Stated differently, as the organic traffic driven to the site by a particular keyword rises and falls over time, 55% of that rise and fall is explained by changes in rank and search volume.
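For readers who want to see where that number comes from, R-Square is one minus the ratio of the unexplained (residual) variation to the total variation around the mean. The sketch below uses four invented data points and a hypothetical function name, `r_squared`, purely to show the arithmetic.

```python
# R-Square computed from observed and model-predicted values.
# The numbers are invented for illustration, not taken from the article's data.

def r_squared(actual, predicted):
    """Return 1 - SS_residual / SS_total for paired observations."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the observations around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation the model failed to explain.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


actual_visits = [10, 20, 30, 40]     # hypothetical observed weekly visits
predicted_visits = [12, 18, 33, 37]  # hypothetical model predictions

r2 = r_squared(actual_visits, predicted_visits)
print(r2)
```

Here the total sum of squares is 500 and the residual sum of squares is 26, so the result is 1 - 26/500, or about 0.948. An R-Square of .55 means the residual portion is much larger: 45% of the variation is left unexplained.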
The implications of this finding are immediately evident. Rank and search volume, the two factors that we would intuitively expect to account for all of the variation in organic traffic, together account for only about half of that variation. Individual regressions of traffic on each variable further reveal that rank alone accounts for only 1% of the variation in organic traffic, while search volume alone accounts for 54%.
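An individual regression of this kind can be sketched as follows. With a single predictor, R-Square equals the squared correlation between the predictor and the outcome, so each variable can be checked on its own. The five observations and the name `simple_r_squared` are assumptions for illustration only. Note also that individual R-Square values add up to the combined R-Square only when the predictors are uncorrelated with each other, so the 1% and 54% figures are best read as rough shares.

```python
# Single-predictor R-Square, computed as the squared Pearson correlation.
# The observations are made up for illustration, not the article's data.

def simple_r_squared(x, y):
    """R-Square of a one-variable linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical weekly observations for one keyword.
rank = [3, 5, 1, 4, 2]
volume = [100, 200, 300, 400, 500]
visits = [11, 19, 31, 40, 49]

r2_rank = simple_r_squared(rank, visits)
r2_volume = simple_r_squared(volume, visits)
print(r2_rank, r2_volume)
```

In this invented sample, visits track search volume closely and rank only loosely, so `r2_volume` comes out far larger than `r2_rank`, mirroring the pattern described above.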
In summary, only half of the variation in organic traffic is due to factors we can identify, and only about 1% of variation in organic traffic is due to factors we can control. Is it time to panic? I’m not sure.
Rather than inciting panic, these findings should highlight the limitations of analyzing ranking data. As fine-tuned algorithms tailor each SERP to fit a specific user in a specific location at a specific time, the generalized ranking data collected by my rank tracking software becomes more and more arbitrary. The disparity between the SERP delivered to my target market and the SERP delivered to the software shows up as the static of unexplained variance observed above, and it calls into question whether ranking changes can effectively measure organic standing or progress.
The finding that rank is not an effective predictor of visits complicates the practice of reporting progress to clients in terms of ranking changes. If variation in rank cannot statistically explain a corresponding variation in the desired outcome (visits), a host of new questions are raised. What variables can we use to explain the remaining 45% of the variation in organic visits? Are more variables at play here or is this just statistical static? Can we control these variables? If it were true that we could only control 1% of the variation in organic traffic, what would that mean for SEO?
Those still reading may note that I am deeply troubled by the above findings. I welcome each and every consoling comment below. Offer your theories, explanations, and conjectures. Run your own regression, and if you get results similar to mine, it may be time to measure progress with a new metric.