When we announced our Blog Impact reports earlier this week – where we report on blog traffic and spikes for our users – I talked about how we use a set of algorithms we call FirstRain MarketScore to sort out which blogs to include in the statistics at any point in time. This is important so that we continuously present only the statistics that matter to the market – to sort the wheat from the chaff.
A cynic might say – how do you do that with any credibility? There are so many blogs, and the number is growing all the time. So I thought I’d share some insight on how we do it.
Our goal is to identify the most influential blogs within a specific domain of interest to the user so we can spot emerging trends and anomalies – and we do this by comparing blogs based on three characteristics: prominence, reach and authority.
– Prominence is the reputation of a given blog within the target domain, along with assessment of its reputation beyond that domain.
– Reach is assessed based on indicators of viewership within the target domain along with some weight for readership beyond the target domain.
– Authority is assessed based on factors that measure trust in the blog’s views on questions of judgment and projection.
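To make the three characteristics concrete, here is a minimal sketch of how they might be folded into one score. It is an illustration only, not the MarketScore algorithm itself: the names BlogSignals and composite_score, the equal weighting, and the 25% weight for signals from beyond the target domain are all assumptions, and every input is assumed to be pre-scaled to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class BlogSignals:
    """Hypothetical inputs, each assumed to be pre-scaled to 0..1."""
    prominence_in: float   # reputation within the target domain
    prominence_out: float  # reputation beyond the target domain
    reach_in: float        # viewership within the target domain
    reach_out: float       # readership beyond the target domain
    authority: float       # trust on questions of judgment and projection


def composite_score(s, out_weight=0.25, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Blend in-domain and out-of-domain signals, then average the three traits.

    out_weight and weights are illustrative values, not published ones.
    """
    prominence = (1 - out_weight) * s.prominence_in + out_weight * s.prominence_out
    reach = (1 - out_weight) * s.reach_in + out_weight * s.reach_out
    w_prominence, w_reach, w_authority = weights
    return w_prominence * prominence + w_reach * reach + w_authority * s.authority


blog = BlogSignals(prominence_in=0.8, prominence_out=0.4,
                   reach_in=0.6, reach_out=0.2, authority=0.9)
print(round(composite_score(blog), 3))
```

Giving the in-domain signals most of the weight mirrors the definitions above, where activity beyond the domain counts for something but less.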
It’s complex to measure these factors, but we can do it because we run measurements on the content of the blog over time – it would be impossible if we weren’t processing and analyzing the content. By running algorithms on the content we can see factors like:
– domain relevance / focus
– originality (not copied from other places)
– syndication (not copied to other blogs)
– productivity (how often is the content relevant to an investment topic – of which we have many thousands)
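As one example of a content-based factor, originality can be approximated by checking how much of a post overlaps with text published earlier elsewhere. The sketch below uses word shingles and Jaccard similarity, a common near-duplicate technique chosen here for illustration; it is not necessarily what we run in production, and the function names and the shingle size of 3 are assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle; empty text gives an empty set.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def originality(post, earlier_posts, k=3):
    """1.0 means no overlap with any earlier post, 0.0 means an exact copy."""
    if not earlier_posts:
        return 1.0
    post_shingles = shingles(post, k)
    closest = max(jaccard(post_shingles, shingles(e, k)) for e in earlier_posts)
    return 1.0 - closest


earlier = ["pc shipments fell sharply in the third quarter"]
print(originality("pc shipments fell sharply in the third quarter", earlier))
print(originality("cloud software spending keeps rising this year", earlier))
```

The same overlap measure, run in the other direction against later posts on other blogs, would give a rough signal of how widely a post is being copied.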
Of course we also look at the traditional market factors like traffic, backlinks and third-party rankings to sanity check our results. And importantly, the model is constructed to yield a highly stratified set with a ‘high bar.’ So a blog that is widely read within the domain, that gets cited by other (especially high authority) blogs on questions of interpretation, and that is even occasionally read, cited or quoted beyond the domain would score high.
For example, a Gartner analyst blog that covers the software space might score fairly well on prominence and reach, and rate highly on authority within the technology blog domain. But it would pale in comparison to an IDC analyst blog that covers the worldwide PC market, which scores more highly on prominence and reach because it gets consistently quoted on global PC shipments in financial industry blogs and the Wall Street Journal.
Combine this scoring approach with our ability to normalize results by domain and by company and you get the ability to pull out interesting changes in volume and subject matter – which fill in the research mosaic and can, sometimes, move the market.
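Normalizing by domain matters because a jump of ten posts means something very different in a quiet domain than in a busy one. A minimal sketch of the idea, assuming daily post counts per domain and a simple z-score against each domain's own history (the function names, the sample data and the threshold of 3.0 are illustrative assumptions, not our production settings):

```python
from statistics import mean, pstdev


def volume_zscore(history, today):
    """How many standard deviations today's count sits from the domain's own mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A flat history gives no scale to measure against; report no signal.
        return 0.0
    return (today - mu) / sigma


def flag_spikes(history_by_domain, today_by_domain, threshold=3.0):
    """Return the domains whose volume today exceeds the threshold z-score."""
    return [
        domain
        for domain, history in history_by_domain.items()
        if volume_zscore(history, today_by_domain[domain]) > threshold
    ]


history = {"pc market": [10, 12, 8, 10], "software": [5, 5, 5, 5]}
today = {"pc market": 20, "software": 5}
print(flag_spikes(history, today))
```

The same calculation can be keyed by company instead of by domain, so that a spike is always judged against the usual level of chatter for that company.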