The Ultimate Guide to Internal Linking (Part 2)
Posted by Pete - 11/08/08 at 10:08:12 am
In part one of our ultimate guide to internal linking, we looked at how to structure a site’s links based on the original PageRank formula. However, as has been said many times, the original formula has evolved since it was first used more than ten years ago, and is now something altogether more complex.
Patrick Altoft recently posted a great article on how the modern-day PR equation probably looks more like a Schrödinger equation. Whilst I can see what he’s getting at, I think it misses the mark slightly.
Now, I have a slight advantage here, in that over the past year or so, I’ve managed to make theoretical physics into a sort of hobby. I know, I need to get a life. That said, it does mean that I have a reasonable understanding of quantum theory. So, to help the layman, I thought I’d translate the paper in question into basic terms, and then take a look at what I believe the current PageRank formula is (which bears more than a passing resemblance to what the paper states).
- Quantum Theory and the PageRank Formula
- Application to Potential Modern PageRank Formulae
- How It Helps Us
- Navbars and Sitemaps
Quantum Theory and the PageRank Formula
The first thing to note about the paper’s formula is that it’s not what Google actually uses; it’s just expounding on an idea as to what they could use at some point. The second is that whilst this is a different formula, it still riffs off most of the same principles as the original PR formula (both formulae are eigenvector problems at their root).
For example, in a website with a single page, the ‘wave function’ of the page provides a complete description of how the user will interact with it. The function itself can be broken down into a series of potential actions that the user might take, each with a different probability of completion, which form a basis for the possible wave functions. For websites with more than one page, the value of the wave function is the sum of the probabilities of all the options on all pages. Thus the wave function describes the probabilities of the potential configurations of a user navigating a site through its links.
For the masochists amongst you, the formula given in the paper is:
w = (k^o - aA^T)^-1 * F’ = (I - aB)^-1 * (k^o)^-1 * F’
where w is the wave function, F’ = aF, k^o is a matrix whose elements are all zero except on the diagonal, where they are given by the outdegree of the vertices, and B is equal to (k^o)^-1 * A^T.
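To make the algebra a little less abstract, here is a minimal Python sketch that evaluates both forms of the formula on a made-up three-page site and checks that they give the same answer. The link matrix A, the value a = 0.85 and the vector F of ones are all assumptions chosen purely for illustration, as is the convention that A[i][j] = 1 means page i links to page j. The small solver is just ordinary Gaussian elimination, used so the example needs nothing beyond the standard library.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


# Hypothetical site (assumption): A[i][j] = 1 when page i links to page j.
A = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
]
a = 0.85                 # assumed value, borrowed from the usual damping factor
F = [1.0, 1.0, 1.0]      # assumed uniform source vector
n = len(A)

out_degree = [sum(row) for row in A]   # diagonal entries of k^o
F_prime = [a * f for f in F]           # F' = aF

# First form: (k^o - a A^T) w = F'
left = [
    [(out_degree[i] if i == j else 0.0) - a * A[j][i] for j in range(n)]
    for i in range(n)
]
w_first = solve(left, F_prime)

# Second form: (I - aB) w = (k^o)^-1 F', with B = (k^o)^-1 A^T
B = [[A[j][i] / out_degree[i] for j in range(n)] for i in range(n)]
right = [
    [(1.0 if i == j else 0.0) - a * B[i][j] for j in range(n)]
    for i in range(n)
]
w_second = solve(right, [F_prime[i] / out_degree[i] for i in range(n)])

print(w_first)
print(w_second)
```

Both forms only work if every page has at least one outgoing link, since otherwise k^o has a zero on its diagonal and cannot be inverted.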
Application to Potential Modern PageRank Formulae
The important thing that the paper’s authors have recognised is the shortcoming in giving every link on every page the same damping factor. The factor was originally introduced to account for the chance that a user stops clicking links altogether, and it treats every link on a page identically. However, not every link on a page is as valuable to a user as every other. As such, I tend to introduce three more factors into the PR equation, to help when structuring internal links:
- R, the semantic relevance of a link’s target page’s content to the content of the current page, multiplied by 0.15
- Q, the likelihood that the page the user is on contains the information they were expecting after clicking a previous link (and thus that they are more likely to click a link than hit the back button)
- u, the ratio of incoming links from similar pages to the total number of incoming links from the entire site
In the formula, pi1 are the pages with high relevance to the page in question and pi2 are the pages with low relevance.
When you put these into the original formula, you end up with:
PR(pi) = ( (1-R/N) + R * (Σ pj ∈ ((u * M(pi1)) + ((u*0.85)) * M(pi2))) * (PR(pj)/L(pj)) ) / Q
What this means is that the PageRank of any page is equal to the sum of the PR from inbound links from relevant pages, plus the sum of the PR from inbound links from non-relevant pages, divided by the probability of the page in question being the page the user was expecting, based on the semantic link between the pages in question, and the relative PR of each page.
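As a rough illustration, the sketch below evaluates one possible reading of that formula for a single page. It is only my interpretation for the purposes of an example: each inbound link contributes PR(pj)/L(pj), weighted by u when the linking page is relevant and by u * 0.85 when it is not, and the result is divided by Q. The function name, the tuple layout and every number in the example are hypothetical.

```python
def modified_pagerank(inbound, R, Q, u, N):
    """Evaluate one reading of the modified formula for a single page.

    inbound: list of (pr, outlinks, relevant) tuples, one per linking page,
             where pr is PR(pj), outlinks is L(pj), and relevant says whether
             the linking page counts as highly relevant (assumed layout).
    """
    total = 0.0
    for pr, outlinks, relevant in inbound:
        # Relevant pages are weighted by u, the rest by u * 0.85.
        weight = u if relevant else u * 0.85
        total += weight * pr / outlinks
    return ((1 - R / N) + R * total) / Q


# Hypothetical inputs: one relevant and one non-relevant inbound link.
inbound_links = [
    (1.0, 2, True),   # relevant page with PR 1.0 and 2 outgoing links
    (1.0, 4, False),  # non-relevant page with PR 1.0 and 4 outgoing links
]
score = modified_pagerank(inbound_links, R=0.15, Q=0.8, u=0.5, N=10)
print(round(score, 4))
```

Under this reading, a link from a relevant page is always worth more than the same link from a non-relevant one, and a lower Q pushes the final score up, which is worth bearing in mind when playing with the numbers.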
How It Helps Us
Essentially, it means you don’t have to use nofollow quite so much. The last post worked off the original PageRank formula; this new one (which I believe more closely models the current one) takes into account the idea that you might want to offer a link to a user merely as a point of interest, rather than because it is something they need to visit. This allows the search engine to estimate (through semantic analysis and comparative PR value speculation) the odds of any particular link being clicked, relative to any other link on that page.
This then allows PR not to be assigned to a page as a whole, but divided up unequally across the links on a page. We then look at PR not as something doled out to sites or pages, but instead to each individual link. As a result, PR becomes more like XML; abstracted from the content of pages, and merely assigned to where it needs to be: the links themselves.
When applying this to your own internal linking, all you then need to do is decide whether a link is contextually valuable or not. Things like Accessibility Policies would still be worth nofollowing: whilst the engine will reduce the PR value assigned to those links, they nevertheless still get some, and this could be better used elsewhere. On the other hand, links that are more ‘points of interest’ or navbars, rather than things you really want the user to visit, can be left alone, as the engine will assign them a lower value automatically. Finally, the really good links, such as ones from a highly targeted piece of copy on a service you offer to another page on a particular part of that service, will naturally get the highest amount of PR, as the engine can analyse the content of the two pages, the relevance of that link at that time to the user, and determine that it’s a valuable link, worthy of a higher-than-normal percentage of linkjuice from that page.
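To picture what dividing PR unequally across links might look like, here is a small Python sketch that shares out the PR a page passes on in proportion to a per-link weight. The weights, the link names and the function itself are invented for illustration, and how a real engine would weight links is not something I can claim to know. The 0.85 is simply the damping factor from the original formula.

```python
def split_link_value(page_pr, link_weights, damping=0.85):
    """Share the PR a page passes on across its links in proportion to weight.

    link_weights maps a link target to a hypothetical click-likelihood weight.
    """
    total_weight = sum(link_weights.values())
    passed_on = page_pr * damping
    return {
        target: passed_on * weight / total_weight
        for target, weight in link_weights.items()
    }


# Hypothetical weights for the three kinds of link discussed above.
weights = {
    "service-detail": 1.0,        # link from targeted copy to a related page
    "navbar-about": 0.4,          # point-of-interest or navbar link
    "accessibility-policy": 0.1,  # still receives a little unless nofollowed
}
shares = split_link_value(1.0, weights)
for target, value in shares.items():
    print(target, round(value, 3))
```

Note that the low-weight link still receives a small share in this model, which is the reasoning behind nofollowing pages like the Accessibility Policy.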