I had an argument once with a colleague about the quality of writing and communication versus the popularity of those communications. In short, my colleague argued that even if another colleague (a “coac” or “colleague of a colleague”) wrote a blog that only 10 other people read, it was still a good blog. My counter was that, unless the subject matter of the blog was so esoteric that only 10 other people in the world were interested in and knowledgeable on the subject, then no, it was not a good blog if only 10 people read it. We argued about it quite heatedly, but my basic position was that if someone was writing on subjects of relatively broad impact (say, healthcare or the economy) and was publishing publicly (through a blog that has adequate tags, etc.), THEN, if only 10 people are reading it, it’s not a good blog. Wow, those words have come back to haunt me.
In truth, this appears to be EXACTLY how the PageRank system works. There’s even a great quote from the paper on this very subject: “The average web page quality experienced by a user is higher than the quality of the average web page. This is because the simplicity of creating and publishing web pages results in a large fraction of low quality web pages that users are unlikely to read.” In short, if it’s read, it’s better than if it’s not read. And for much of the web, this is true. The whole “wisdom of crowds” thing. The whole “efficiency of the market” thing. Good, valuable, clear, dense information will win out over bad information. That’s the concept.
And now, I wonder. This may be true if you wish to get information on the good restaurants in New York City. Or the best places to go bird watching in Central Park. People will search. They will find and link to and refer others to the best places. And those will rise up with the PageRank algorithm.
That’s true for those things that can, by their nature, have validity and usefulness determined by consensus. Which is fine. But there are many many many things that consensus is very very bad at. It’s the reason why we have the Supreme Court in the United States and decisions like Brown v. Board of Ed. Because lots of times, consensus is wrong, or just wrong-headed.
And that means that relying on PageRank to find information on cancer treatments rests on a very, very questionable assumption: that the pages most people link to are the most accurate ones. Most people are not experts on cancer treatment. And yet, many people are affected by cancer and by worry for their loved ones and themselves, and so look for the best information they can find. Unfortunately, for something like cancer treatment, the information that everyone finds is not necessarily the best information. And so PageRank fails.
PageRank is also, I suspect, extremely susceptible to manipulation by individuals or organizations adept at generating interest in a subject. People talented in PR and grassroots initiatives frequently have a natural talent for drawing attention to something that, perhaps, was not always of interest to people. But if enough “buzz” is generated, then, again, the PageRank algorithm will bubble up the pages.
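To make that worry concrete, here is a minimal sketch of the idea. It is a toy, normalized power-iteration version of PageRank, not Google's actual system. The page names, the tiny link graph, the 0.85 damping factor, and the way pages with no outbound links are handled are all my own assumptions for illustration. The only thing that changes between the two runs is that five made-up “buzz” pages start linking to one page; the content of that page is never touched.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy PageRank: links maps each page to the list of pages it links to."""
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outbound links spreads its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: nobody links to "hype" yet.
organic = {
    "expert": ["reference"],
    "reference": ["expert"],
    "blog_a": ["expert"],
    "blog_b": ["expert"],
    "hype": ["reference"],
}

# Same web after a buzz campaign: five new pages all link to "hype".
buzzed = dict(organic)
for i in range(5):
    buzzed[f"buzz_{i}"] = ["hype"]

before = pagerank(organic)
after = pagerank(buzzed)

print("hype vs blog_a before:", before["hype"], before["blog_a"])
print("hype vs blog_a after: ", after["hype"], after["blog_a"])
```

In this toy graph, “hype” starts out level with the unlinked blogs and pulls ahead of them once the buzz pages point at it. Nothing about its quality changed, only the attention it received.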
I know there are literally an endless number of ways to approach searching. From purely keyword, to purely popularity, to heavily analytical, using linguistic formulas. At the end of the day though, I suspect that expertise is the only true factor in really determining value, at least where there are issues that are not simply consensus issues (e.g., “great movie” is probably a consensus issue, while the atomic weight of cesium is most definitely not). So, do you insert a “person in the middle” much the way Wikipedia does? I’m not sure. I suspect that, in fact, the very reason why Google has not eliminated Wikipedia and the like is because editorialized expertise has as much of a place as consensus and therefore cannot be eliminated.
Lastly, WOW. How about this: “Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. … In general, it could be argued from the consumer point of view that the better the search engine is, the fewer advertisements will be needed for the consumer to find what they want. This of course erodes the advertising supported business model of the existing search engines. However, there will always be money from advertisers who want a customer to switch products, or have something that is genuinely new. But we believe the issue of advertising causes enough mixed incentives that it is crucial to have a competitive search engine that is transparent and in the academic realm.”
WOW. Can’t really add to that. Wow.